From ncoghlan at gmail.com Sat Oct 1 04:08:58 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 01 Oct 2005 12:08:58 +1000 Subject: [Python-Dev] PEP 350: Codetags In-Reply-To: References: <20050926223521.GE10940@kitchen.client.attbi.com> <20050928161039.GF10940@kitchen.client.attbi.com> <20050929153237.97E1.JCARLSON@uci.edu> <433CFA1F.4010804@gmail.com> Message-ID: <433DEFBA.9050401@gmail.com> Guido van Rossum wrote: > On 9/30/05, Nick Coghlan wrote: > >>An approach to this area that would make sense to me is: >> >>1. Defer PEP 350 >>2. Publish a simple Python module for finding and processing code tags in a >>configurable fashion >>3. Include a default configuration in the module that provides the behaviour >>described in PEP 350 >>4. After this hypothetical code tag processing module has been out in the wild >>for a while, re-open PEP 350 with an eye to including the module in the >>standard library >> >>The idea is that it should be possible to tailor the processing module in >>order to textually scan a codebase (possibly C or C++ rather than Python) in >>accordance with a project-specific system of code tagging, rather than >>requiring that the project necessarily use the default style included in the >>processing module (Although using a system other than the default one may >>result in reduced functionality, naturally). > > > Maybe I'm just an old fart, but this all seems way over-engineered. > > Even for projects the size of Python, a simple grep+find is sufficient. I expect many people would agree with you, but Micah was interested enough in the area to write a PEP about it. The above was just a suggestion for a different way of looking at the problem, so that writing a PEP would actually make sense. At the moment, if the tags used are project-specific, and the method used to find them is a simple grep+find, then I don't see a reason for the idea to be a *Python* Enhancement Proposal. 
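The configurable processing module suggested in step 2 could start out very small. A rough sketch of what its core might look like (the tag list, function names and behaviour here are all hypothetical illustrations, not anything specified by PEP 350):

```python
import re

# Default tag set; a project could pass its own list instead.
DEFAULT_TAGS = ("TODO", "FIXME", "XXX")

def make_pattern(tags=DEFAULT_TAGS):
    # Match "# TAG: some text" or "# TAG some text" in a comment.
    return re.compile(r"#\s*(%s)\b[:\s]*(.*)" % "|".join(tags))

def scan(lines, tags=DEFAULT_TAGS):
    """Yield (lineno, tag, text) for each code tag found in `lines`."""
    pattern = make_pattern(tags)
    for lineno, line in enumerate(lines, 1):
        match = pattern.search(line)
        if match:
            yield lineno, match.group(1), match.group(2).strip()

source = [
    "x = 1  # TODO: handle negative values",
    "y = 2",
    "# FIXME broken on windows",
]
assert list(scan(source)) == [
    (1, "TODO", "handle negative values"),
    (3, "FIXME", "broken on windows"),
]
```

A project-specific tagging convention would then just be a different `tags` argument or regex, which is the sort of tailoring described above.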
Further, I see some interesting possibilities for automation if such a library exists. For example, a cron job that scans the checked in sources, and automatically converts new TODO's to RFE's in the project tracker, and adds a tracker cross-link into the source code comment. The job could similarly create bug reports for FIXME's. If the project tracker was one that supported URL links, and the project had a URL view of the source tree, then the cross-links between the code tag and the tracker could be actual URL references to each other. However, the starting point for exploring any such ideas would be a library that made it easier to work with code tags. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.blogspot.com From guido at python.org Sat Oct 1 04:37:12 2005 From: guido at python.org (Guido van Rossum) Date: Fri, 30 Sep 2005 19:37:12 -0700 Subject: [Python-Dev] PEP 350: Codetags In-Reply-To: <433DEFBA.9050401@gmail.com> References: <20050926223521.GE10940@kitchen.client.attbi.com> <20050928161039.GF10940@kitchen.client.attbi.com> <20050929153237.97E1.JCARLSON@uci.edu> <433CFA1F.4010804@gmail.com> <433DEFBA.9050401@gmail.com> Message-ID: On 9/30/05, Nick Coghlan wrote: > Further, I see some interesting possibilities for automation if such a library > exists. For example, a cron job that scans the checked in sources, and > automatically converts new TODO's to RFE's in the project tracker, and adds a > tracker cross-link into the source code comment. The job could similarly > create bug reports for FIXME's. If the project tracker was one that supported > URL links, and the project had a URL view of the source tree, then the > cross-links between the code tag and the tracker could be actual URL > references to each other. With all respect for the OP, that's exactly the kind of enthusiastic over-engineering that I'm afraid the PEP will encourage. 
I seriously doubt that any of that work will contribute towards a project's success (compared to simply having a convention of putting XXX in the code). -- --Guido van Rossum (home page: http://www.python.org/~guido/) From ms at cerenity.org Sat Oct 1 16:21:32 2005 From: ms at cerenity.org (Michael Sparks) Date: Sat, 1 Oct 2005 15:21:32 +0100 Subject: [Python-Dev] Active Objects in Python In-Reply-To: References: <397621172.20050927111836@MailBlocks.com> Message-ID: <200510011521.33481.ms@cerenity.org> On Friday 30 September 2005 22:13, Michael Sparks (home address) wrote: > I wrote a white paper based on my Python UK talk, which is here: > ? ? * http://www.bbc.co.uk/rd/pubs/whp/whp11.shtml Oops that URL isn't right. It should be: * http://www.bbc.co.uk/rd/pubs/whp/whp113.shtml Sorry! (Thanks to LD 'Gus' Landis for pointing that out!) Regards, Michael. -- "Though we are not now that which in days of old moved heaven and earth, that which we are, we are: one equal temper of heroic hearts made weak by time and fate but strong in will to strive, to seek, to find and not to yield" -- "Ulysses", Tennyson From reinhold-birkenfeld-nospam at wolke7.net Sat Oct 1 19:28:54 2005 From: reinhold-birkenfeld-nospam at wolke7.net (Reinhold Birkenfeld) Date: Sat, 01 Oct 2005 19:28:54 +0200 Subject: [Python-Dev] Tests and unicode Message-ID: Hi, I looked whether I could make the test suite pass again when compiled with --disable-unicode. One problem is that no Unicode escapes can be used since compiling the file raises ValueErrors for them. Such strings would have to be produced using unichr(). Is this the right way? Or is disabling Unicode not supported any more? Reinhold -- Mail address is perfectly valid! 
From blais at furius.ca Sat Oct 1 23:50:25 2005 From: blais at furius.ca (Martin Blais) Date: Sat, 1 Oct 2005 17:50:25 -0400 Subject: [Python-Dev] Pythonic concurrency - cooperative MT In-Reply-To: <1128073517.25688.10.camel@p-dvsi-418-1.rd.francetelecom.fr> References: <1128073517.25688.10.camel@p-dvsi-418-1.rd.francetelecom.fr> Message-ID: <8393fff0510011450l2e9dd608hce91ed00516d0935@mail.gmail.com> Hi. I hear a confusion that is annoying me a bit in some of the discussions on concurrency, and I thought I'd flush my thoughts here to help me clarify some of that stuff, because some people on the list appear to discuss generators as a concurrency scheme, and as far as I know (and please correct me if I'm wrong) they really are not addressing that at all (full explanation below). Before I go on, I must say that I am not in any way an authority on concurrent programming, I'm just a guy who happens to have done a fair amount of threaded programming, so if any of the smart people on the list notice something completely stupid and off the mark that I might be saying here, please feel free to bang on it with the thousand-pound hammer of your hacker-fu and put me to shame (I love to learn). As far as I understand, generators are just a convenient way to program apparently "independent" control flows (which are not the same as "concurrent" control flows) in a constrained, structured way, a way that is more powerful than what is allowed by using a stack. By giving up using the stack concept as a fast way to allocate local function variables, it becomes possible to exit and enter chunks of code multiple times, at specific points, within an automatically restored local context (i.e. the local variables, stored on the heap). Generators make it more convenient to do just that: enter and re-enter some code that is expressed as if it would be running in a single execution flow (with explicit points of exit/re-entry, "yields"). 
The full monty version of that is what you get when you write assembly code (*memories of adolescent assembly programming on the C=64 abound here now*): you can JMP anywhere anytime, and a chunk of code (a function) can be reentered anywhere anytime as well, maybe even reentered somewhere else than where it left off. The price to pay for this is one of complexity: in assembly you have to manage restoring the local context yourself (i.e. in assembly code this just means restoring the values of some registers which are assumed set and used by the code, like the local variables of a function), and there is no clear grouping of the local scope that is saved. Generators give you that for free: they automatically organize all that local context as belonging to the generator object, and they express clear points of exit/re-entry with the yield calls. They are really just a fancy goto, with some convenient assumptions about how control should flow. This happens to be good enough for simplifying a whole class of problems and I suppose the Python and Ruby communities are all learning to love them and use them more and more. (I think the more fundamental consequence of generators is to raise questions about the definition of what "a function" is: if I have a single chunk of code in which different parts use two disjoint sets of variables, and it can be entered via a few entry/exit points, is it really one or two or multiple "functions"? What if different parts share some of the local scope only? Where does the function begin and end? And more importantly, is there a more complex yet still manageable abstraction that would allow even more flexible control flow than generators allow, straddling the boundaries of what a function is?) 
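A minimal demonstration of that automatically restored local context (spelled here with the generator send() of PEP 342, which was still in the pipeline for Python 2.5 at the time of this thread):

```python
def running_total():
    """The local `total` survives every exit and re-entry."""
    total = 0
    while True:
        value = yield total   # exit point; locals are kept on the heap
        total += value

g = running_total()
assert next(g) == 0        # run up to the first yield
assert g.send(10) == 10    # re-enter exactly where we left off
assert g.send(5) == 15     # local state was preserved in between
```

No stack frame needs to survive between calls; the generator object itself holds the suspended locals, which is the point being made above.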
You could easily implement something very similar to generators by encapsulating the local scope explicitly in the form of a class, with instance attributes, and having a normal method "step()" that would be careful about saving state in the object's attributes every time it returns and restoring state from those attributes every time it gets called. This is what iterators do. Whenever you want to "schedule" your object to be running, you call the step() method. So just in that sense generators really aren't all that exciting or "new". The main problem that generators solve is that they make this save/restore mechanism automatic, thus allowing you to write a single flow of execution as a normal function with explicit exit points (yield). It's much nicer having that in the language than having to write code that can be restored (especially when you have to write a loop with complex conditions/flow which must run and return only one iteration every time they become runnable). Therefore, as far as I understand it, generators themselves DO NOT implement any form of concurrency. I feel that where generators and concurrency come together is often met with confusion in the discussions I see about them, but maybe that's just me. I see two aspects that allow generators to participate in the elaboration of a concurrency scheme: 1. The convenience of expression of a single execution flow (with explicit interruption points) makes it easy to implement pseudo-concurrency IF AND ONLY IF you consider a generator as an independent unit of control flow (i.e. a task). Whether those generators can run asynchronously is yet undefined and depends on who calls them. 2. In a more advanced framework/language, perhaps some generators could be considered to always be possible to run asynchronously, ruled by a system of true concurrency with some kind of scheduling algorithm that oversees that process. 
Whether this has been implemented by many is still a mystery to me, but I can see how a low-level library that provides asynchronously running execution vehicles for each CPU could be used to manage and run a pool of shared generator objects in a way that is better (for a specific application) than the relatively uninformed scheduling provided by the threads abstraction (more at the end). Pseudo or cooperative concurrency is not the same as true asynchronous concurrency. You can ONLY avoid having to deal with issues of mutual exclusion if you DO NOT have true asynchronous concurrency (i.e. two real CPUs running at the same time)--unless you have some special scheduling system that implements very specific assumptions about data access vs the code that it schedules, in which case that scheduling algorithm itself will have to deal with potential mutual exclusion problems: YOU DON'T GET OUT OF IT, if you have two real, concurrent processing units making calculations, you have to deal with the way that they might access some same piece of data at the same time. I suppose that the essence of what I want to say with this diatribe is that everyone who talks about generators as a way to avoid the hard problems of concurrent programming should really explicitly frame the discussion in the context of a single process cooperative scheme that runs on a single processor (at any one time). It does not hold outside of that context; outside of that context you HAVE to deal with mutex issues, and that's always where it gets messy (even with generators). Now, IMO where it gets interesting is when you consider that what you're doing when you are executing multiple asynchronous control flows with explicit code in your process, is that you're essentially bringing "up" the scheduler from the kernel layer into your own code. This is very cool. This may allow you to specialize that scheduler with assumptions which may ultimately simplify the implementation of your independent control flows. 
For example, if you have two sets of generators that access disjoint sets of data, and two processing units, your scheduler could make sure that no two generators from the same set get scheduled at the same time. If you do that then you might not have to lock access to your data structures at all. You can imagine more complex variants on this theme. One of the problems that you have with using generators like this, is that automatic "yield" on resource access does not occur automatically, like it does in threading. With threads, the kernel is invoked when access to a low-level resource is requested, and may decide to put your process in the wait queue when it judges necessary. I don't know how you would do that with generators. To implement that explicitly, you would need an asynchronous version of all the functions that may block on resources (e.g. file open, socket write, etc.), in order to be able to insert a yield statement at that point, after the async call, and there should be a way for the scheduler to check if the resource is "ready" to be able to put your generator back in the runnable queue. (A question comes to mind here: Twisted must be doing something like this with their "deferred objects", no? I figure they would need to do something like this too. I will have to check.) Any comment welcome. 
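For what it's worth, the single-process cooperative scheme described above can be written down in a few lines. A toy round-robin scheduler over plain generators (a sketch only; it does nothing about blocking resources, which is exactly the hard part pointed out above):

```python
from collections import deque

def task(name, steps, log):
    """A task is just a generator; each bare yield hands control back."""
    for i in range(steps):
        log.append((name, i))
        yield

def run(tasks):
    """Round-robin over the generators until all are exhausted."""
    queue = deque(tasks)
    while queue:
        current = queue.popleft()
        try:
            next(current)
        except StopIteration:
            continue            # task finished; drop it
        queue.append(current)   # otherwise, back of the queue

log = []
run([task("a", 2, log), task("b", 3, log)])
# the two tasks interleave on a single thread, one step per turn
assert log == [("a", 0), ("b", 0), ("a", 1), ("b", 1), ("b", 2)]
```

The scheduler here is "up" in user code, as the paragraph above puts it, so it could be specialized, e.g. to keep generators that share data from being interleaved dangerously.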
cheers, From solipsis at pitrou.net Sun Oct 2 00:46:21 2005 From: solipsis at pitrou.net (Antoine) Date: Sun, 2 Oct 2005 00:46:21 +0200 (CEST) Subject: [Python-Dev] Pythonic concurrency - cooperative MT In-Reply-To: <8393fff0510011450l2e9dd608hce91ed00516d0935@mail.gmail.com> References: <1128073517.25688.10.camel@p-dvsi-418-1.rd.francetelecom.fr> <8393fff0510011450l2e9dd608hce91ed00516d0935@mail.gmail.com> Message-ID: <1860.::ffff:213.41.177.172.1128206781.squirrel@webmail.nerim.net> Hi Martin, [snip] The "confusion" stems from the fact that two issues are mixed up in this discussion thread: - improving concurrency schemes to make it easier to write well-behaving applications with independent parallel flows - improving concurrency schemes to improve performance when there are several hardware threads available The respective solutions to these problems do not necessarily go hand in hand. > To implement that explicitly, you would need an > asynchronous version of all the functions that may block on > resources (e.g. file open, socket write, etc.), in order to be > able to insert a yield statement at that point, after the async > call, and there should be a way for the scheduler to check if the > resource is "ready" to be able to put your generator back in the > runnable queue. You can also use a helper thread and signal the scheduling loop when some action in the helper thread has finished. It is an elegant solution because it helps you keep a small generic scheduling loop instead of putting select()-like calls in it. (this is how I've implemented timers in my little cooperative multi-threading system, for example) > (A question comes to mind here: Twisted must be doing something > like this with their "deferred objects", no? I figure they would > need to do something like this too. I will have to check.) A Deferred object is just the abstraction of a callback - or, rather, two callbacks: one for success and one for failure. 
Twisted is architected around an event loop, which calls your code back when a registered event happens (for example when an operation is finished, or when some data arrives on the wire). Compared to generators, it is a different way of expressing cooperative multi-threading. Regards Antoine. From ms at cerenity.org Sun Oct 2 01:13:15 2005 From: ms at cerenity.org (Michael Sparks) Date: Sun, 2 Oct 2005 00:13:15 +0100 Subject: [Python-Dev] Pythonic concurrency - cooperative MT In-Reply-To: <8393fff0510011450l2e9dd608hce91ed00516d0935@mail.gmail.com> References: <1128073517.25688.10.camel@p-dvsi-418-1.rd.francetelecom.fr> <8393fff0510011450l2e9dd608hce91ed00516d0935@mail.gmail.com> Message-ID: <200510020013.16626.ms@cerenity.org> On Saturday 01 October 2005 22:50, Martin Blais wrote: ... > because some people on the list appear to discuss generators as > a concurrency scheme, and as far as I know they really are not > addressing that at all. Our project started in the context of dealing with the task of a naturally concurrent environment. Specifically the task is that of dealing with large numbers of concurrent connections to a server. As a result, when I've mentioned concurrency it's been due to coming from that viewpoint. In the past I have worked with systems essentially structured in a similar way to Twisted for this kind of problem, but decided against that style for our current project. (Note some people misunderstand my opinions here due to a badly phrased lightning talk ~16 months ago at Europython 2004 - I think twisted is very much best of breed in python for what it does. I just think there //might// be a nicer way. (might :) ) Since I now work in an R&D dept I wondered what would happen if, instead of the basic approach that underlies systems like twisted, you took a much more CSP-like approach to building such systems, but using generators rather than threads or explicit state machines. 
A specific goal was to try and make code simpler for people to work with - with the aim actually of simplifying maintenance as the main by-product. I hadn't heard of anyone trying this approach then and hypothesised it *might* achieve that goal. As a result from day 1 it became clear that where an event based system would normally use a reactor/proactor based approach, you replace that with a scheduler that repeatedly calls a next method on objects given to it to schedule. In terms of concurrency that is clearly a co-operative multitasking system in the same way as a simplistic event based system is. (Both get more complex in reality when you can't avoid blocking, forcing the use of threads for some tasks) So when you say this: > explicitly frame the discussion in the context of a single > process cooperative scheme that runs on a single processor (at > any one time). This is spot on. However, any generator can be farmed off and run in a thread. Any communications you did with the generator can be wrapped via Queues then - forming a controlled bridge between the threads. Similarly we're currently looking at using non-blocking pipes and pickling to communicate with generators running in a forked environment. As a result if you write your code as generators it can migrate to a threaded or process based environment, and scale across multiple processes (and hence processors) if tools to perform this migration are put in place. We're a little way off doing that, but this looks to be highly reasonable. > As far as I understand, generators are just a convenient way to They give you code objects that can do a return and continue later. This isn't really the same as the ability to just do a goto into random points in a function. You can only go back to the point the generator yielded at (unless someone has a perverse trick :-). 
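A deliberately naive sketch of that shape - a scheduler repeatedly calling next() on generator-backed components that pass data through simple in/out buffers (the names here are made up for illustration, not Kamaelia's actual API):

```python
class Doubler:
    """A component owns an item only between taking it from its inbox
    and appending the result to its outbox - then it lets go of it."""
    def __init__(self):
        self.inbox = []
        self.outbox = []
    def main(self):
        while True:
            if self.inbox:
                item = self.inbox.pop(0)       # take ownership
                self.outbox.append(item * 2)   # hand off downstream
            yield                              # back to the scheduler

def schedule(components, steps):
    """The scheduler knows nothing about the components' internals;
    it just calls next() on each one in turn."""
    generators = [c.main() for c in components]
    for _ in range(steps):
        for g in generators:
            next(g)

d = Doubler()
d.inbox.extend([1, 2, 3])
schedule([d], steps=4)
assert d.outbox == [2, 4, 6]
```

Because each component only ever touches data it currently owns, a component written this way could in principle be moved behind a Queue into a thread, as described above, without changing its body.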
> You could easily implement something very similar to generators > by encapsulating the local scope explicitly in the form of a > class, with instance attributes, and having a normal > method "step()" that would be careful about saving state in the > object's attributes every time it returns and restoring state from > those attributes every time it gets called. For a more explicit version of this we have a (deliberately naive) C++ version of generators & our core concurrency system. Mechanism is here: http://tinyurl.com/7gaol , example use here: http://tinyurl.com/bgwro That does precisely that. (except we use a next() method there :-) > Therefore, as far as I understand it, generators themselves DO > NOT implement any form of concurrency. By themselves, they don't. They can be used to deal with concurrency though. > 2. In a more advanced framework/language, perhaps some generators > could be considered to always be possible to run > asynchronously, ruled by a system of true concurrency with > some kind of scheduling algorithm that oversees that process. > Whether this has been implemented by many is still a mystery > to me, This is what we do. The trainees we've given our tutorial to (one of whom had very little prior programming experience) were able to pick up our approach quickly. The tutorial requires them to implement a mini-version of the framework, which might actually aid the discussion here since it very clearly shows the core of our system. (nb it is however a simplified version) I previously posted a link to it, which is here: http://kamaelia.sourceforge.net/MiniAxon/ > but I can see how a low-level library that provides > asynchronously running execution vehicles for each CPU could > be used to manage and run a pool of shared generator objects > in a way that is better (for a specific application) than the > relatively uninformed scheduling provided by the threads > abstraction (more at the end). 
> > Pseudo or cooperative concurrency is not the same as true > asynchronous concurrency. Correct. I've had discussions with a colleague at work who wants to work on the underlying formal semantics of our system for verification purposes, and he pointed out that the core assumption with a pure generator approach effectively serialises the application, which may hide problems in the true parallel approach (eg only using processes for a CSP-like system). However that statement had an underlying assumption: that the system would be a pure generator system. As soon as you involve multiple systems using network connections, and threads (since we have threaded components as well), and processes (which has always been on the cards, all our desktop machines are dual processor and it just seems a waste to use just one) then the system goes truly asynchronous. As a result we (at least :-) have thought about these problems along the way. > you have to deal with the way that they > might access some same piece of data at the same time. We do. We have both an underlying approach to deal with this and a metaphor that encourages correct usage. The underlying mechanism is based on explicit hand off of data between asynchronous activities. Once you have handed off a piece of data, you no longer own it and can no longer change it. If you are handed a piece of data you can do anything you like with it, for as long as you like until you hand it off or throw it away. The metaphor of old-fashioned paper based inboxes (or in-trays) and outboxes (or out-trays) conceptually reinforces this idea - naturally encouraging safer programming styles. This means that we only ever have a single reader and single writer for any item of data, which eliminates whole swathes of concurrency issues - whether you're pseudo-concurrent (ie threads[*], generators) or truly concurrent (processes on multiple processors). [*] Still only 1 CPU really. Effectively there is no global data. 
If there is any global data (since we do have a global address space we tend to think of as similar to a linda tuple space), then it has a single owner. Others may read it, but only one may write to it. Because this is python, this is enforced by convention. (But the use is discouraged and rarely needed). The use of generators effectively also hides the local variables from accidental external modification. Which is a secondary layer of protection. > If you do that then you might not have to lock access to your data > structures at all. We don't have to lock data structures at all - this is because we have explicit hand off of data. If we hand off between processes, we do this via Queues that handle the locking issues for us. > To implement that explicitly, you would need an > asynchronous version of all the functions that may block on > resources (e.g. file open, socket write, etc.) Or you can create a generator that handles reading from a file and hands off the data on to the next component explicitly. The file reader is given CPU time by the scheduler. This can seem odd unless you've done any shell programming in which case the idea should be obvious: echo `ls *py |while read i; do wc -l $i |cut -d \ -f1; done` | sed -e 's/ /+/g' | bc (yes I know there's better ways of doing this :) So all in all, I'd say "yes" generators aren't really concurrent, but they *are* a very good way (IMHO) of dealing with concurrency in a single thread and map naturally if you're careful in designing your approach early on to map to a thread/process based approach cleanly. If you think I'm talking a load of sphericals (for all I know it's possible I am, though I hope I'm not :-) , please look at our tutorial first, then at our howto for building components [*] and tell me what we're doing wrong. I'd really like to know so we can make the system better, easier for newbies (and hence everyone else), and more trustable. 
[*] http://tinyurl.com/dp8n7 (This really feels more like a comp.lang.python discussion though, because AFAICT, python already has everything we need for this. I might revisit that thought when we've looked at shared memory issues though. IMHO though that would be largely stuff for the standard library.) Best Regards, Michael. -- Michael Sparks, Senior R&D Engineer, Digital Media Group Michael.Sparks at rd.bbc.co.uk, http://kamaelia.sourceforge.net/ British Broadcasting Corporation, Research and Development Kingswood Warren, Surrey KT20 6NP This e-mail may contain personal views which are not the views of the BBC. From oliphant at ee.byu.edu Sun Oct 2 01:39:24 2005 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Sat, 01 Oct 2005 17:39:24 -0600 Subject: [Python-Dev] Why does __getitem__ slot of builtin call sequence methods first? Message-ID: <433F1E2C.4050501@ee.byu.edu> The new ndarray object of scipy core (successor to Numeric Python) is a C extension type that has a getitem defined in both the as_mapping and the as_sequence structure. The as_sequence mapping is just so PySequence_GetItem will work correctly. As exposed to Python the ndarray object has a .__getitem__ wrapper method. Why does this wrapper call the sequence getitem instead of the mapping getitem method? Is there any way to get at a mapping-style __getitem__ method from Python? This looks like a bug to me (which is why I'm posting here...) Thanks for any help or insight. -Travis Oliphant From guido at python.org Sun Oct 2 02:41:32 2005 From: guido at python.org (Guido van Rossum) Date: Sat, 1 Oct 2005 17:41:32 -0700 Subject: [Python-Dev] Why does __getitem__ slot of builtin call sequence methods first? 
In-Reply-To: <433F1E2C.4050501@ee.byu.edu> References: <433F1E2C.4050501@ee.byu.edu> Message-ID: On 10/1/05, Travis Oliphant wrote: > > The new ndarray object of scipy core (successor to Numeric Python) is a > C extension type that has a getitem defined in both the as_mapping and > the as_sequence structure. > > The as_sequence mapping is just so PySequence_GetItem will work correctly. > > As exposed to Python the ndarray object has a .__getitem__ wrapper method. > > Why does this wrapper call the sequence getitem instead of the mapping > getitem method? > > Is there any way to get at a mapping-style __getitem__ method from Python? Hmm... I'm sure the answer is in typeobject.c, but that is one of the more obfuscated parts of Python's guts. I wrote it four years ago and since then I've apparently lost enough brain cells (or migrated them from language implementation to language design service :) that I don't understand it inside out any more like I did while I was in the midst of it. However, I wonder if the logic isn't such that if you define both sq_item and mp_subscript, __getitem__ calls sq_item; I wonder if by removing sq_item it might call mp_subscript? Worth a try, anyway. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From radeex at gmail.com Sun Oct 2 04:00:03 2005 From: radeex at gmail.com (Christopher Armstrong) Date: Sun, 2 Oct 2005 12:00:03 +1000 Subject: [Python-Dev] Pythonic concurrency - cooperative MT In-Reply-To: <8393fff0510011450l2e9dd608hce91ed00516d0935@mail.gmail.com> References: <1128073517.25688.10.camel@p-dvsi-418-1.rd.francetelecom.fr> <8393fff0510011450l2e9dd608hce91ed00516d0935@mail.gmail.com> Message-ID: <60ed19d40510011900g7ef5c86jc047affbdd06fd59@mail.gmail.com> On 10/2/05, Martin Blais wrote: > One of the problems that you have with using generators like > this, is that automatic "yield" on resource access does not occur > automatically, like it does in threading. 
With threads, the > kernel is invoked when access to a low-level resource is > requested, and may decide to put your process in the wait queue > when it judges necessary. I don't know how you would do that > with generators. To implement that explicitly, you would need an > asynchronous version of all the functions that may block on > resources (e.g. file open, socket write, etc.), in order to be > able to insert a yield statement at that point, after the async > call, and there should be a way for the scheduler to check if the > resource is "ready" to be able to put your generator back in the > runnable queue. > > (A question comes to mind here: Twisted must be doing something > like this with their "deferred objects", no? I figure they would > need to do something like this too. I will have to check.) As I mentioned in the predecessor of this thread (I think), I've written a thing called "Defgen" or "Deferred Generators" which allows you to write a generator to yield control when waiting for a Deferred to fire. So this is basically "yield on resource access". In the Twisted universe, every asynchronous resource-retrieval is done by returning a Deferred and later firing that Deferred. Generally, you add callbacks to get the value, but if you use defgen you can say stuff like (in Python 2.5 syntax) try: x = yield getPage('http://python.org/') except PageNotFound: print "Where did Python go!" else: assert "object-oriented" in x Many in the Twisted community get itchy about over-use of defgen, since it makes it easier to assume too much consistency in state, but it's still light-years beyond pre-emptive shared-memory threading when it comes to that. 
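The control flow behind such a generator-driving trampoline can be sketched in a few lines (the Deferred below is a toy stand-in holding one result and one callback, not Twisted's class, and this is not defgen's actual implementation):

```python
_UNFIRED = object()

class Deferred:
    """Toy stand-in: holds one result and at most one callback."""
    def __init__(self):
        self._callback = None
        self._result = _UNFIRED
    def add_callback(self, fn):
        if self._result is not _UNFIRED:
            fn(self._result)        # already fired: run immediately
        else:
            self._callback = fn
    def callback(self, result):
        self._result = result
        if self._callback is not None:
            self._callback(result)

def drive(gen):
    """Trampoline: resume `gen` each time the Deferred it yielded fires."""
    def step(value):
        try:
            d = gen.send(value)     # run until the next yield
        except StopIteration:
            return                  # generator finished
        d.add_callback(step)        # wake up again when it fires
    step(None)

# Usage: the generator "waits" for a page fetch without a thread.
page_fetch = Deferred()
results = []

def handler():
    page = yield page_fetch       # suspended until the Deferred fires
    results.append(page.upper())

drive(handler())
assert results == []              # still waiting
page_fetch.callback("hello")      # event arrives; generator resumes
assert results == ["HELLO"]
```

The generator reads like straight-line blocking code, yet between the yield and the resume the event loop is free to run anything else, which is the appeal described above.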
-- Twisted | Christopher Armstrong: International Man of Twistery Radix | -- http://radix.twistedmatrix.com | Release Manager, Twisted Project \\\V/// | -- http://twistedmatrix.com |o O| | w----v----w-+ From oliphant at ee.byu.edu Sun Oct 2 05:17:20 2005 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Sat, 01 Oct 2005 21:17:20 -0600 Subject: [Python-Dev] Why does __getitem__ slot of builtin call sequence methods first? In-Reply-To: References: <433F1E2C.4050501@ee.byu.edu> Message-ID: <433F5140.8050705@ee.byu.edu> Guido van Rossum wrote: >On 10/1/05, Travis Oliphant wrote: > > >>The new ndarray object of scipy core (successor to Numeric Python) is a >>C extension type that has a getitem defined in both the as_mapping and >>the as_sequence structure. >> >>The as_sequence mapping is just so PySequence_GetItem will work correctly. >> >>As exposed to Python the ndarray object has a .__getitem__ wrapper method. >> >>Why does this wrapper call the sequence getitem instead of the mapping >>getitem method? >> >>Is there anyway to get at a mapping-style __getitem__ method from Python? >> >> > >Hmm... I'm sure the answer is in typeobject.c, but that is one of the >more obfuscated parts of Python's guts. I wrote it four years ago and >since then I've apparently lost enough brain cells (or migrated them >from language implementation to to language design service :) that I >don't understand it inside out any more like I did while I was in the >midst of it. > >However, I wonder if the logic isn't such that if you define both >sq_item and mp_subscript, __getitem__ calls sq_item; I wonder if by >removing sq_item it might call mp_subscript? Worth a try, anyway. > > > Thanks for the tip. I think I figured out the problem, and it was my misunderstanding of how types inherit in C that was the source of my problem. Basically, Python is doing what you would expect, the mp_item is used for __getitem__ if both mp_item and sq_item are present. 
However, the addition of these descriptors (and therefore the resolution of any competition for __getitem__ calls) is done *before* the inheritance of any slots takes place.

The new ndarray object inherits from a "big" array object that doesn't define the sequence and buffer protocols (which have the size-limiting int dependencies in their interfaces). The ndarray object has standard tp_as_sequence and tp_as_buffer slots filled. Figuring the array object would inherit its tp_as_mapping protocol from "big" array (which it does just fine), I did not explicitly set that slot in its Type object. Thus, when PyType_Ready was called on the ndarray object, the tp_as_mapping was NULL and so __getitem__ mapped to the sequence-defined version. Later the tp_as_mapping slots were inherited, but too late for __getitem__ to be what I expected.

The easy fix was to initialize the tp_as_mapping slot before calling PyType_Ready. Hopefully, somebody else searching in the future for an answer to their problem will find this discussion useful.

Thanks for your help,

-Travis

From ncoghlan at gmail.com Sun Oct 2 05:23:19 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 02 Oct 2005 13:23:19 +1000 Subject: [Python-Dev] Why does __getitem__ slot of builtin call sequence methods first? In-Reply-To: References: <433F1E2C.4050501@ee.byu.edu> Message-ID: <433F52A7.3010700@gmail.com>

Guido van Rossum wrote:
> Hmm... I'm sure the answer is in typeobject.c, but that is one of the
> more obfuscated parts of Python's guts. I wrote it four years ago and
> since then I've apparently lost enough brain cells (or migrated them
> from language implementation to language design service :) that I
> don't understand it inside out any more like I did while I was in the
> midst of it.
>
> However, I wonder if the logic isn't such that if you define both
> sq_item and mp_subscript, __getitem__ calls sq_item; I wonder if by
> removing sq_item it might call mp_subscript? Worth a try, anyway.
As near as I can tell, the C API documentation is silent on how slots are populated when multiple methods mapping to the same slot are defined by a C object, but this is a quote from the comment describing add_operators() in typeobject.c:

> In the latter case, the first slotdef entry encountered wins. Since
> slotdef entries are sorted by the offset of the slot in the
> PyHeapTypeObject, this gives us some control over disambiguating
> between competing slots: the members of PyHeapTypeObject are listed
> from most general to least general, so the most general slot is
> preferred. In particular, because as_mapping comes before as_sequence,
> for a type that defines both mp_subscript and sq_item, mp_subscript
> wins.

Further, in PyObject_GetItem (in abstract.c), tp_as_mapping->mp_subscript is checked first, with tp_as_sequence->sq_item only being checked if mp_subscript isn't found. Importantly, this is the function invoked by the BINARY_SUBSCR opcode.

So, the *intent* certainly appears to be that mp_subscript should be preferred both by the C abstract object API and from normal Python code.

*However*, the precedence applied by add_operators() is governed by the slotdefs structure in typeobject.c, which, according to the above comment, is meant to match the order the slots appear in memory in the _typeobject structure in object.h, and favour the mapping methods over the sequence methods.

There are actually two serious problems with the description in this comment:

Firstly, the two orders don't actually match. In the object layout, the ordering of the abstract object methods is as follows:

    PyNumberMethods *tp_as_number;
    PySequenceMethods *tp_as_sequence;
    PyMappingMethods *tp_as_mapping;

But in the slotdefs table, the PySequence and PyMapping slots are listed first, followed by the PyNumber methods.
Secondly, in both the object layout and the slotdefs table, the PySequence methods appear *before* the PyMapping methods, which means that tp_as_sequence->sq_item appears as "__getitem__" even though a subscript operation will actually invoke "tp_as_mapping->mp_subscript".

In short, I think Travis is right in calling this behaviour a bug.

There's a similar problem with the methods that exist in both tp_as_number and tp_as_sequence - the abstract C API and the Python interpreter will favour the tp_as_number methods, but the slot definitions will favour tp_as_sequence.

The fix is actually fairly simple: reorder the slotdefs table so that the sequence of slots is "Number, Mapping, Sequence" rather than adhering strictly to the sequence of methods given in the definition of _typeobject.

The only objects affected by this change would be C extension objects which define two C-level methods which map to the same Python-level slot name. The observed behavioural change is that the methods accessible via the Python-level slot names would change (either from the Sequence method to the Mapping method, or from the Sequence method to the Number method).

Given that the only documentation I can find of the behaviour in that scenario is a comment in typeobject.c, that the implementation doesn't currently match the comment, and that the current implementation means that the methods accessed via the slot names don't match the methods normal Python syntax actually invokes, I find it hard to see how fixing it could cause any significant problems.

Cheers, Nick.

-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.blogspot.com

From ncoghlan at gmail.com Sun Oct 2 05:27:54 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 02 Oct 2005 13:27:54 +1000 Subject: [Python-Dev] Why does __getitem__ slot of builtin call sequence methods first?
In-Reply-To: <433F52A7.3010700@gmail.com> References: <433F1E2C.4050501@ee.byu.edu> <433F52A7.3010700@gmail.com> Message-ID: <433F53BA.7030203@gmail.com> Nick Coghlan wrote: [A load of baloney] Scratch everything I said in my last message - init_slotdefs() sorts the slotdefs table correctly, so that the order it is written in the source is irrelevant. Travis found the real answer to his problem. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.blogspot.com From kbk at shore.net Sun Oct 2 05:44:44 2005 From: kbk at shore.net (Kurt B. Kaiser) Date: Sat, 01 Oct 2005 23:44:44 -0400 Subject: [Python-Dev] IDLE development In-Reply-To: (Noam Raphael's message of "Sun, 11 Sep 2005 02:54:08 +0300") References: Message-ID: <87u0g0h9ar.fsf@hydra.bayview.thirdcreek.com> Noam Raphael writes: > More than a year and a half ago, I posted a big patch to IDLE which > adds support for completion and much better calltips, along with some > other improvements. I have responded on idle-dev. -- KBK From martin at v.loewis.de Sun Oct 2 09:57:31 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 02 Oct 2005 09:57:31 +0200 Subject: [Python-Dev] Help needed with MSI permissions Message-ID: <433F92EB.2040509@v.loewis.de> I have various reports that the Python 2.4 installer does not work if you are trying to install in a non-standard location as a non-privileged user, e.g. #1298962, #1234328, #1232947, #1199808. Despite many attempts, I haven't been able to reproduce any such problem, and the submitters weren't really able to experiment much, either. So, if anybody is able to reproduce any of these reports, and give me instructions on how to reproduce it myself, that would be very much appreciated. 
Regards, Martin

From martin at v.loewis.de Sun Oct 2 21:07:02 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 02 Oct 2005 21:07:02 +0200 Subject: [Python-Dev] Tests and unicode In-Reply-To: References: Message-ID: <43402FD6.7030905@v.loewis.de>

Reinhold Birkenfeld wrote:
> One problem is that no Unicode escapes can be used since compiling
> the file raises ValueErrors for them. Such strings would have to
> be produced using unichr().

You mean, in Unicode literals? There are various approaches, depending on context:

- you could encode the literals as UTF-8, and decode it when the module/test case is imported. See test_support.TESTFN_UNICODE for an example.
- you could use unichr
- you could use eval, see test_re for an example

> Is this the right way? Or is disabling Unicode not supported any more?

There are certainly tests that cannot be executed when Unicode is not available. It would be good if such tests get skipped instead of failing, and it would be good if all tests that do not require Unicode support run even when Unicode support is missing.

Whether "it is supported" is a tricky question: your message indicates that, right now, it is *not* supported (or else you wouldn't have noticed a problem). Whether we think it should be supported depends on who "we" is, as with all these minor features: some think it is a waste of time, some think it should be supported if reasonably possible, and some think this a conditio sine qua non. It certainly isn't a release-critical feature.

Regards, Martin

From martin at v.loewis.de Sun Oct 2 21:52:14 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 02 Oct 2005 21:52:14 +0200 Subject: [Python-Dev] C API doc fix In-Reply-To: References: Message-ID: <43403A6E.5080101@v.loewis.de>

Jim Jewett wrote:
> ========
> Py_UNICODE
> Python uses this type to store Unicode ordinals.
> It is typically a typedef alias, but the underlying type -- and
> the size of that type -- varies across different systems.
> ========

I think I objected to such a formulation, requesting that the precise procedure be documented for choosing the alias. I then went on saying what the precise procedure is, and then an argument about that procedure arose.

I still believe that the precise procedure should be documented (in addition to saying that its outcome may vary across installations).

Regards, Martin

From mwh at python.net Sun Oct 2 22:32:38 2005 From: mwh at python.net (Michael Hudson) Date: Sun, 02 Oct 2005 21:32:38 +0100 Subject: [Python-Dev] Tests and unicode In-Reply-To: (Reinhold Birkenfeld's message of "Sat, 01 Oct 2005 19:28:54 +0200") References: Message-ID: <2mk6gvfymx.fsf@starship.python.net>

Reinhold Birkenfeld writes:

> Hi,
>
> I looked whether I could make the test suite pass again
> when compiled with --disable-unicode.
>
> One problem is that no Unicode escapes can be used since compiling
> the file raises ValueErrors for them. Such strings would have to
> be produced using unichr().

Yeah, I've bumped into this.

> Is this the right way? Or is disabling Unicode not supported any more?

I don't know. More particularly, I don't know if anyone actually uses a unicode-disabled build. If no one does, we might as well rip the code out.

Cheers, mwh

-- Sufficiently advanced political correctness is indistinguishable from irony. -- Erik Naggum

From mwh at python.net Sun Oct 2 22:36:01 2005 From: mwh at python.net (Michael Hudson) Date: Sun, 02 Oct 2005 21:36:01 +0100 Subject: [Python-Dev] Why does __getitem__ slot of builtin call sequence methods first? In-Reply-To: <433F5140.8050705@ee.byu.edu> (Travis Oliphant's message of "Sat, 01 Oct 2005 21:17:20 -0600") References: <433F1E2C.4050501@ee.byu.edu> <433F5140.8050705@ee.byu.edu> Message-ID: <2mfyrjfyha.fsf@starship.python.net>

Travis Oliphant writes:

> Thanks for the tip.
> I think I figured out the problem, and it was my
> misunderstanding of how types inherit in C that was the source of my
> problem.
>
> Basically, Python is doing what you would expect: the mp_subscript slot
> is used for __getitem__ if both mp_subscript and sq_item are present.
> However, the addition of these descriptors (and therefore the resolution
> of any competition for __getitem__ calls) is done *before* the
> inheritance of any slots takes place.

Oof. That'd do it.

> The new ndarray object inherits from a "big" array object that doesn't
> define the sequence and buffer protocols (which have the size-limiting
> int dependencies in their interfaces). The ndarray object has standard
> tp_as_sequence and tp_as_buffer slots filled.

I guess the reason this hasn't come up before is that non-trivial C inheritance is still pretty rare.

> The easy fix was to initialize the tp_as_mapping slot before calling
> PyType_Ready. Hopefully, somebody else searching in the future for an
> answer to their problem will find this discussion useful.

Well, it sounds like a bug that should be easy to fix. I can't think of a reason to do slot wrapper generation before slot inheritance, though I wouldn't like to bet more than a beer on not having missed something...

Cheers, mwh

-- There are two kinds of large software systems: those that evolved from small systems and those that don't work. -- Seen on slashdot.org, then quoted by amk

From blais at furius.ca Sun Oct 2 23:49:51 2005 From: blais at furius.ca (Martin Blais) Date: Sun, 2 Oct 2005 17:49:51 -0400 Subject: [Python-Dev] Pythonic concurrency - cooperative MT In-Reply-To: <1766050860214964952@unknownmsgid> References: <1128073517.25688.10.camel@p-dvsi-418-1.rd.francetelecom.fr> <8393fff0510011450l2e9dd608hce91ed00516d0935@mail.gmail.com> <1766050860214964952@unknownmsgid> Message-ID: <8393fff0510021449p3607898cp6dcecc615450b46b@mail.gmail.com>

On 10/1/05, Antoine wrote:
> > > like this with their "deferred objects", no?
I figure they would > > need to do something like this too. I will have to check.) > > A Deferred object is just the abstraction of a callback - or, rather, two > callbacks: one for success and one for failure. Twisted is architected > around an event loop, which calls your code back when a registered event > happens (for example when an operation is finished, or when some data > arrives on the wire). Compared to generators, it is a different way of > expressing cooperative multi-threading. So, the question is, in Twisted, if I want to defer on an operation that is going to block, say I'm making a call to run a database query that I'm expecting will take much time, and want to yield ("defer") for other events to be processed while the query is executed, how do I do that? As far as I remember the Twisted docs I read a long time ago did not provide a solution for that. From radeex at gmail.com Mon Oct 3 01:19:32 2005 From: radeex at gmail.com (Christopher Armstrong) Date: Mon, 3 Oct 2005 09:19:32 +1000 Subject: [Python-Dev] Pythonic concurrency - cooperative MT In-Reply-To: <8393fff0510021449p3607898cp6dcecc615450b46b@mail.gmail.com> References: <1128073517.25688.10.camel@p-dvsi-418-1.rd.francetelecom.fr> <8393fff0510011450l2e9dd608hce91ed00516d0935@mail.gmail.com> <1766050860214964952@unknownmsgid> <8393fff0510021449p3607898cp6dcecc615450b46b@mail.gmail.com> Message-ID: <60ed19d40510021619p7f6e2641udf172a0d0d19283e@mail.gmail.com> On 10/3/05, Martin Blais wrote: > On 10/1/05, Antoine wrote: > > > > > like this with their "deferred objects", no? I figure they would > > > need to do something like this too. I will have to check.) > > > > A Deferred object is just the abstraction of a callback - or, rather, two > > callbacks: one for success and one for failure. Twisted is architected > > around an event loop, which calls your code back when a registered event > > happens (for example when an operation is finished, or when some data > > arrives on the wire). 
> > Compared to generators, it is a different way of
> > expressing cooperative multi-threading.
>
> So, the question is, in Twisted, if I want to defer on an operation
> that is going to block, say I'm making a call to run a database query
> that I'm expecting will take much time, and want to yield ("defer")
> for other events to be processed while the query is executed, how do I
> do that? As far as I remember the Twisted docs I read a long time ago
> did not provide a solution for that.

Deferreds don't make blocking code non-blocking; they're just a way to make it nicer to write non-blocking code. There are utilities in Twisted for wrapping a blocking function call in a thread and having the result returned in a Deferred, though (see deferToThread). There is also a lightweight and complete wrapper for DB-API2 database modules in twisted.enterprise.adbapi, which does the threading interaction for you.

So, since this then exposes a non-blocking API, you can do stuff like

    d = pool.runQuery('SELECT User_ID FROM Users')
    d.addCallback(gotDBData)
    d2 = ldapfoo.getUser('bob')
    d2.addCallback(gotLDAPData)

And both the database call and the ldap request will be worked on concurrently.
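The deferToThread idea can be sketched in a few lines -- this is a hypothetical stand-in built on a plain worker thread and a queue, not Twisted's actual implementation, and defer_to_thread and slow_query are invented names:

```python
# Rough sketch of the deferToThread idea: run a blocking call in a worker
# thread so the caller's loop stays free, and hand the result back later.
import threading
import queue

def defer_to_thread(fn, *args):
    """Run fn(*args) in a worker thread; return a queue that will
    eventually receive the result (a crude stand-in for a Deferred)."""
    result_q = queue.Queue()
    def worker():
        result_q.put(fn(*args))
    threading.Thread(target=worker).start()
    return result_q

# Stand-in for a slow blocking database query.
def slow_query(sql):
    return [('bob',), ('alice',)]

q = defer_to_thread(slow_query, 'SELECT User_ID FROM Users')
# ... the event loop is free to service other events here ...
rows = q.get()  # collect the result once the worker has finished
```

In real Twisted the result is instead delivered by firing the Deferred's callback chain from the reactor, rather than being pulled from a queue.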
-- Twisted | Christopher Armstrong: International Man of Twistery Radix | -- http://radix.twistedmatrix.com | Release Manager, Twisted Project \\\V/// | -- http://twistedmatrix.com |o O| | w----v----w-+ From blais at furius.ca Mon Oct 3 07:53:52 2005 From: blais at furius.ca (Martin Blais) Date: Mon, 3 Oct 2005 01:53:52 -0400 Subject: [Python-Dev] Pythonic concurrency - cooperative MT In-Reply-To: <60ed19d40510021619p7f6e2641udf172a0d0d19283e@mail.gmail.com> References: <1128073517.25688.10.camel@p-dvsi-418-1.rd.francetelecom.fr> <8393fff0510011450l2e9dd608hce91ed00516d0935@mail.gmail.com> <1766050860214964952@unknownmsgid> <8393fff0510021449p3607898cp6dcecc615450b46b@mail.gmail.com> <60ed19d40510021619p7f6e2641udf172a0d0d19283e@mail.gmail.com> Message-ID: <8393fff0510022253l6c37f3f8x6f5254018d6f348e@mail.gmail.com> On 10/2/05, Christopher Armstrong wrote: > On 10/3/05, Martin Blais wrote: > > On 10/1/05, Antoine wrote: > > > > > > > like this with their "deferred objects", no? I figure they would > > > > need to do something like this too. I will have to check.) > > > > > > A Deferred object is just the abstraction of a callback - or, rather, two > > > callbacks: one for success and one for failure. Twisted is architected > > > around an event loop, which calls your code back when a registered event > > > happens (for example when an operation is finished, or when some data > > > arrives on the wire). Compared to generators, it is a different way of > > > expressing cooperative multi-threading. > > > > So, the question is, in Twisted, if I want to defer on an operation > > that is going to block, say I'm making a call to run a database query > > that I'm expecting will take much time, and want to yield ("defer") > > for other events to be processed while the query is executed, how do I > > do that? As far as I remember the Twisted docs I read a long time ago > > did not provide a solution for that. 
> > Deferreds don't make blocking code non-blocking; they're just a way to
> > make it nicer to write non-blocking code. There are utilities in
> > Twisted for wrapping a blocking function call in a thread and having
> > the result returned in a Deferred, though (see deferToThread). There
> > is also a lightweight and complete wrapper for DB-API2 database
> > modules in twisted.enterprise.adbapi, which does the threading
> > interaction for you.
> >
> > So, since this then exposes a non-blocking API, you can do stuff like
> >
> > d = pool.runQuery('SELECT User_ID FROM Users')
> > d.addCallback(gotDBData)
> > d2 = ldapfoo.getUser('bob')
> > d2.addCallback(gotLDAPData)
> >
> > And both the database call and the ldap request will be worked on concurrently.

Very nice! However, if you're using a thread to do just that, it's just using a part of what threads were designed for: it's really just using the low-level kernel knowledge about resource access, and about when resources become ready, to wait on the resource, since you're not going to run much actual code in the thread itself (apart from setting up to do the blocking call and returning its value).

Now, if we had something in the language that allows us to do something like that--make the most important potentially blocking calls asynchronously--we could implement a more complete scheduler that could really leverage generators to create a more interesting concurrency solution with less overhead.

For example, imagine that some class of generators are used as tasks, like we were discussing before. When you would call the special yield_read() call (a variation on e.g. the os.read() call), there is an implicit yield that allows other generators which are ready to run until the data is available, without the overhead of

1. context switching to the helper threads and back;

2. synchronization for communication with the helper threads (I assume threads would not be created dynamically, for efficiency.
I imagine there is a pool of helpers waiting to do the async call jobs, and communication with them to dispatch the call jobs does not come for free (i.e. locking)). We really don't need threads at all to do that (at least for the common blocking calls), just some low-level support for building a scheduler. Using threads to do that has a cost, it is more or less a kludge, in that context (but we have nothing better for now). cheers, From blais at furius.ca Mon Oct 3 08:09:13 2005 From: blais at furius.ca (Martin Blais) Date: Mon, 3 Oct 2005 02:09:13 -0400 Subject: [Python-Dev] Divorcing str and unicode (no more implicit conversions). Message-ID: <8393fff0510022309l5977e899led262cb3c1a7a805@mail.gmail.com> Hi. Like a lot of people (or so I hear in the blogosphere...), I've been experiencing some friction in my code with unicode conversion problems. Even when being super extra careful with the types of str's or unicode objects that my variables can contain, there is always some case or oversight where something unexpected happens which results in a conversion which triggers a decode error. str.join() of a list of strs, where one unicode object appears unexpectedly, and voila! exception galore. Sometimes the problem shows up late because your test code doesn't always contain accented characters. I'm sure many of you experienced that or some variant at some point. I came to realize recently that this problem shares strong similarity with the problem of implicit type conversions in C++, or at least it feels the same: Stuff just happens implicitly, and it's hard to track down where and when it happens by just looking at the code. Part of the problem is that the unicode object acts a lot like a str, which is convenient, but... What if we could completely disable the implicit conversions between unicode and str? In other words, if you would ALWAYS be forced to call either .encode() or .decode() to convert between one and the other... 
wouldn't that help a lot in dealing with that issue? How hard would that be to implement? Would it break a lot of code? Would some people want that? (I know I would, at least for some of my code.) It seems to me that this would make the code more explicit and force the programmer to become more aware of those conversions.

Any opinions welcome.

cheers,

From reinhold-birkenfeld-nospam at wolke7.net Mon Oct 3 10:15:49 2005 From: reinhold-birkenfeld-nospam at wolke7.net (Reinhold Birkenfeld) Date: Mon, 03 Oct 2005 10:15:49 +0200 Subject: [Python-Dev] Tests and unicode In-Reply-To: <43402FD6.7030905@v.loewis.de> References: <43402FD6.7030905@v.loewis.de> Message-ID:

Martin v. Löwis wrote:
> Reinhold Birkenfeld wrote:
>> One problem is that no Unicode escapes can be used since compiling
>> the file raises ValueErrors for them. Such strings would have to
>> be produced using unichr().
>
> You mean, in Unicode literals? There are various approaches, depending
> on context:
> - you could encode the literals as UTF-8, and decode it when the
>   module/test case is imported. See test_support.TESTFN_UNICODE
>   for an example.
> - you could use unichr
> - you could use eval, see test_re for an example

Okay. I can fix this, but several library modules must be fixed too (mostly simple fixes), e.g. pickletools, gettext, doctest or encodings.

>> Is this the right way? Or is disabling Unicode not supported any more?
>
> There are certainly tests that cannot be executed when Unicode is not
> available. It would be good if such tests get skipped instead of
> failing, and it would be good if all tests that do not require Unicode
> support run even when Unicode support is missing.

That's my approach too.

> Whether "it is supported" is a tricky question: your message indicates
> that, right now, it is *not* supported (or else you wouldn't have
> noticed a problem).

Well, the core builds without Unicode, and any code that doesn't use unicode should run fine too.
But the tests fail at the moment.

> Whether we think it should be supported depends
> on who "we" is, as with all these minor features: some think it is
> a waste of time, some think it should be supported if reasonably
> possible, and some think this a conditio sine qua non. It certainly
> isn't a release-critical feature.

Correct. I'll see if I have the time.

Reinhold

-- Mail address is perfectly valid!

From mwh at python.net Mon Oct 3 10:43:06 2005 From: mwh at python.net (Michael Hudson) Date: Mon, 03 Oct 2005 09:43:06 +0100 Subject: [Python-Dev] Divorcing str and unicode (no more implicit conversions). In-Reply-To: <8393fff0510022309l5977e899led262cb3c1a7a805@mail.gmail.com> (Martin Blais's message of "Mon, 3 Oct 2005 02:09:13 -0400") References: <8393fff0510022309l5977e899led262cb3c1a7a805@mail.gmail.com> Message-ID: <2mbr27f0th.fsf@starship.python.net>

Martin Blais writes:

> What if we could completely disable the implicit conversions between
> unicode and str? In other words, if you would ALWAYS be forced to
> call either .encode() or .decode() to convert between one and the
> other... wouldn't that help a lot in dealing with that issue?

I don't know. I've made one or two apps safe against this and it's mostly just annoying.

> How hard would that be to implement?

    import sys
    reload(sys)
    sys.setdefaultencoding('undefined')

> Would it break a lot of code? Would some people want that? (I know
> I would, at least for some of my code.) It seems to me that this
> would make the code more explicit and force the programmer to become
> more aware of those conversions. Any opinions welcome.

I'm not sure it's a sensible default.

Cheers, mwh

-- It is never worth a first class man's time to express a majority opinion. By definition, there are plenty of others to do that. -- G. H. Hardy

From mal at egenix.com Mon Oct 3 11:43:13 2005 From: mal at egenix.com (M.-A.
Lemburg) Date: Mon, 03 Oct 2005 11:43:13 +0200 Subject: [Python-Dev] Divorcing str and unicode (no more implicit conversions). In-Reply-To: <2mbr27f0th.fsf@starship.python.net> References: <8393fff0510022309l5977e899led262cb3c1a7a805@mail.gmail.com> <2mbr27f0th.fsf@starship.python.net> Message-ID: <4340FD31.20202@egenix.com>

Michael Hudson wrote:
> Martin Blais writes:
>
>>What if we could completely disable the implicit conversions between
>>unicode and str? In other words, if you would ALWAYS be forced to
>>call either .encode() or .decode() to convert between one and the
>>other... wouldn't that help a lot in dealing with that issue?
>
> I don't know. I've made one or two apps safe against this and it's
> mostly just annoying.
>
>>How hard would that be to implement?
>
> import sys
> reload(sys)
> sys.setdefaultencoding('undefined')

You shouldn't post tricks like these :-)

The correct way to change the default encoding is by providing a sitecustomize.py module which then calls sys.setdefaultencoding("undefined"). Note that the codec "undefined" was added for just this reason.

>>Would it break a lot of code? Would some people want that? (I know
>>I would, at least for some of my code.) It seems to me that this
>>would make the code more explicit and force the programmer to become
>>more aware of those conversions. Any opinions welcome.
>
> I'm not sure it's a sensible default.

Me neither, especially since this would make it impossible to write polymorphic code - e.g. ', '.join(list) wouldn't work anymore if list contains Unicode; ditto for u', '.join(list) with list containing a string.

-- Marc-Andre Lemburg eGenix.com

Professional Python Services directly from the Source (#1, Sep 30 2005)
>>> Python/Zope Consulting and Support ... http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...
http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From mal at egenix.com Mon Oct 3 13:49:20 2005 From: mal at egenix.com (M.-A. Lemburg) Date: Mon, 03 Oct 2005 13:49:20 +0200 Subject: [Python-Dev] --disable-unicode (Tests and unicode) In-Reply-To: References: <43402FD6.7030905@v.loewis.de> Message-ID: <43411AC0.6040000@egenix.com> Reinhold Birkenfeld wrote: > Martin v. L?wis wrote: >>>Whether we think it should be supported depends >>on who "we" is, as with all these minor features: some think it is >>a waste of time, some think it should be supported if reasonably >>possible, and some think this a conditio sine qua non. It certainly >>isn't a release-critical feature. > > Correct. I'll see if I have the time. Is the added complexity needed to support not having Unicode support compiled into Python really worth it ? I know that Martin introduced this feature a long time ago, so he will have had a reason for it. Today, I think the situation has changed: computers have more memory, are faster and the need to integrate (e.g. via XML) is stronger than ever - and maybe we should consider removing the option to get a cleaner code base with fewer #ifdefs and SyntaxErrors from the standard lib. What do you think ? -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Sep 30 2005) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! 
::::

From solipsis at pitrou.net Mon Oct 3 14:32:48 2005 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 03 Oct 2005 14:32:48 +0200 Subject: [Python-Dev] Divorcing str and unicode (no more implicit conversions). In-Reply-To: <8393fff0510022309l5977e899led262cb3c1a7a805@mail.gmail.com> References: <8393fff0510022309l5977e899led262cb3c1a7a805@mail.gmail.com> Message-ID: <1128342768.6138.114.camel@fsol>

On Monday, 03 October 2005 at 02:09 -0400, Martin Blais wrote:
>
> What if we could completely disable the implicit conversions between
> unicode and str?

This would be very annoying when dealing with some modules or libraries where the type (str / unicode) returned by a function depends on the context, build, or platform.

A good rule of thumb is to convert to unicode everything that is semantically textual, and to only use str for what is to be semantically treated as a string of bytes (network packets, identifiers...). This is also, AFAIU, the semantic model which is favoured for a hypothetical future version of Python.

This is what I'm using to do safe conversion to a given type without worrying about the type of the argument:

DEFAULT_CHARSET = 'utf-8'

def safe_unicode(s, charset=None):
    """
    Forced conversion of a string to unicode; does nothing if the
    argument is already a unicode object. This function is useful
    because the .decode method on a unicode object, instead of being
    a no-op, tries to do a double conversion back and forth (which
    often fails because 'ascii' is the default codec).
    """
    if isinstance(s, str):
        return s.decode(charset or DEFAULT_CHARSET)
    else:
        return s

def safe_str(s, charset=None):
    """
    Forced conversion of a unicode to string; does nothing if the
    argument is already a plain str object. This function is useful
    because the .encode method on a str object, instead of being a
    no-op, tries to do a double conversion back and forth (which
    often fails because 'ascii' is the default codec).
""" if isinstance(s, unicode): return s.encode(charset or DEFAULT_CHARSET) else: return s Good luck Antoine. From fredrik at pythonware.com Mon Oct 3 14:59:44 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Mon, 3 Oct 2005 14:59:44 +0200 Subject: [Python-Dev] Divorcing str and unicode (no more implicit conversions). References: <8393fff0510022309l5977e899led262cb3c1a7a805@mail.gmail.com> <1128342768.6138.114.camel@fsol> Message-ID: Antoine Pitrou wrote: > A good rule of thumb is to convert to unicode everything that is > semantically textual and isn't pure ASCII. (anyone who are tempted to argue otherwise should benchmark their applications, both speed- and memorywise, and be prepared to come up with very strong arguments for why python programs shouldn't be allowed to be fast and memory-efficient whenever they can...) From solipsis at pitrou.net Mon Oct 3 15:26:55 2005 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 03 Oct 2005 15:26:55 +0200 Subject: [Python-Dev] Divorcing str and unicode (no more implicit conversions). In-Reply-To: References: <8393fff0510022309l5977e899led262cb3c1a7a805@mail.gmail.com> <1128342768.6138.114.camel@fsol> Message-ID: <1128346015.6138.149.camel@fsol> Le lundi 03 octobre 2005 ? 14:59 +0200, Fredrik Lundh a ?crit : > Antoine Pitrou wrote: > > > A good rule of thumb is to convert to unicode everything that is > > semantically textual > > and isn't pure ASCII. How can you be sure that something that is /semantically textual/ will always remain "pure ASCII" ? That's contradictory, unless your software never goes out of the anglo-saxon world (and even...). > (anyone who are tempted to argue otherwise should benchmark their > applications, both speed- and memorywise, and be prepared to come > up with very strong arguments for why python programs shouldn't be > allowed to be fast and memory-efficient whenever they can...) I think most applications don't critically depend on text processing performance. 
OTOH, international adaptability is the kind of thing that /will/ bite you one day if you don't prepare for it at the beginning. Also, if necessary, the distinction could be an implementation detail and the conversion be transparent (like int vs. long): the text would be coded in an 8-bit charset as long as possible and converted to a wide encoding only when necessary. The important thing is that these optimisations, if they are necessary, should be transparently handled by the Python runtime. (it seems to me - I may be mistaken - that modern Windows versions treat every string as 16-bit unicode internally. Why are they doing it if it is that inefficient?) Regards Antoine. From blais at furius.ca Mon Oct 3 16:34:09 2005 From: blais at furius.ca (Martin Blais) Date: Mon, 3 Oct 2005 10:34:09 -0400 Subject: [Python-Dev] Divorcing str and unicode (no more implicit conversions). In-Reply-To: <4340FD31.20202@egenix.com> References: <8393fff0510022309l5977e899led262cb3c1a7a805@mail.gmail.com> <2mbr27f0th.fsf@starship.python.net> <4340FD31.20202@egenix.com> Message-ID: <8393fff0510030734k12a9e032pf935979fe3579389@mail.gmail.com> On 10/3/05, M.-A. Lemburg wrote: > > > > I'm not sure it's a sensible default. > > Me neither, especially since this would make it impossible > to write polymorphic code - e.g. ', '.join(list) wouldn't > work anymore if list contains Unicode; ditto for u', '.join(list) > with list containing a string. Sounds like what you want is exactly what I want to avoid (for those two types anyway). cheers, From fredrik at pythonware.com Mon Oct 3 16:39:54 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Mon, 3 Oct 2005 16:39:54 +0200 Subject: [Python-Dev] Divorcing str and unicode (no more implicit conversions).
References: <8393fff0510022309l5977e899led262cb3c1a7a805@mail.gmail.com><1128342768.6138.114.camel@fsol> <1128346015.6138.149.camel@fsol> Message-ID: Antoine Pitrou wrote: > > > A good rule of thumb is to convert to unicode everything that is > > > semantically textual > > > > and isn't pure ASCII. > > How can you be sure that something that is /semantically textual/ will > always remain "pure ASCII" ? "is" != "will always remain" From jim at zope.com Mon Oct 3 16:49:44 2005 From: jim at zope.com (Jim Fulton) Date: Mon, 03 Oct 2005 10:49:44 -0400 Subject: [Python-Dev] Divorcing str and unicode (no more implicit conversions). In-Reply-To: <8393fff0510022309l5977e899led262cb3c1a7a805@mail.gmail.com> References: <8393fff0510022309l5977e899led262cb3c1a7a805@mail.gmail.com> Message-ID: <43414508.9060807@zope.com> Martin Blais wrote: > Hi. > > Like a lot of people (or so I hear in the blogosphere...), I've been > experiencing some friction in my code with unicode conversion > problems. Even when being super extra careful with the types of str's > or unicode objects that my variables can contain, there is always some > case or oversight where something unexpected happens which results in > a conversion which triggers a decode error. str.join() of a list of > strs, where one unicode object appears unexpectedly, and voila! > exception galore. Sometimes the problem shows up late because your > test code doesn't always contain accented characters. I'm sure many > of you experienced that or some variant at some point. > > I came to realize recently that this problem shares strong similarity > with the problem of implicit type conversions in C++, or at least it > feels the same: Stuff just happens implicitly, and it's hard to track > down where and when it happens by just looking at the code. Part of > the problem is that the unicode object acts a lot like a str, which is > convenient, but... I agree. 
I think it was a mistake to implicitly convert mixed string expressions to unicode. > What if we could completely disable the implicit conversions between > unicode and str? In other words, if you would ALWAYS be forced to > call either .encode() or .decode() to convert between one and the > other... wouldn't that help a lot deal with that issue? Perhaps. > How hard would that be to implement? Not hard. We considered doing it for Zope 3, but ... > Would it break a lot of code? Yes. > Would some people want that? No, I wouldn't want lots of code to break. ;) > (I know I would, at least for some of my > code.) It seems to me that this would make the code more explicit and > force the programmer to become more aware of those conversions. Any > opinions welcome. I think it's too late to change this. I wish it had been done differently. (OTOH, I'm very happy we have Unicode support, so I'm not really complaining. :) I'll note that this hasn't been that much of a problem for us in Zope. We follow the strategy: Antoine Pitrou wrote: ... > A good rule of thumb is to convert to unicode everything that is > semantically textual, and to only use str for what is to be semantically > treated as a string of bytes (network packets, identifiers...). This is > also, AFAIU, the semantic model which is favoured for a hypothetical > future version of Python. This approach has worked pretty well for us. Still, when there is a problem, it's a real pain to debug because the error occurs too late, as you point out. Jim -- Jim Fulton mailto:jim at zope.com Python Powered! CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From jim at zope.com Mon Oct 3 16:51:38 2005 From: jim at zope.com (Jim Fulton) Date: Mon, 03 Oct 2005 10:51:38 -0400 Subject: [Python-Dev] Divorcing str and unicode (no more implicit conversions). 
In-Reply-To: <4340FD31.20202@egenix.com> References: <8393fff0510022309l5977e899led262cb3c1a7a805@mail.gmail.com> <2mbr27f0th.fsf@starship.python.net> <4340FD31.20202@egenix.com> Message-ID: <4341457A.8090805@zope.com> M.-A. Lemburg wrote: > Michael Hudson wrote: > >>Martin Blais writes: >> >>>What if we could completely disable the implicit conversions between >>>unicode and str? In other words, if you would ALWAYS be forced to >>>call either .encode() or .decode() to convert between one and the >>>other... wouldn't that help a lot deal with that issue? >> >>I don't know. I've made one or two apps safe against this and it's >>mostly just annoying. >> >>>How hard would that be to implement?

>>import sys
>>reload(sys)
>>sys.setdefaultencoding('undefined')

> > You shouldn't post tricks like these :-) > > The correct way to change the default encoding is by > providing a sitecustomize.py module which then calls > sys.setdefaultencoding("undefined"). This is a much more evil trick IMO, as it affects all Python code, rather than a single program. I would argue that it's evil to change the default encoding in the first place, except in this case to disable implicit encoding or decoding. Jim -- Jim Fulton mailto:jim at zope.com Python Powered! CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From fredrik at pythonware.com Mon Oct 3 17:12:04 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Mon, 3 Oct 2005 17:12:04 +0200 Subject: [Python-Dev] Divorcing str and unicode (no more implicit conversions). References: <8393fff0510022309l5977e899led262cb3c1a7a805@mail.gmail.com> <2mbr27f0th.fsf@starship.python.net> <4340FD31.20202@egenix.com> <4341457A.8090805@zope.com> Message-ID: Jim Fulton wrote: > I would argue that it's evil to change the default encoding > in the first place, except in this case to disable implicit > encoding or decoding. absolutely.
unfortunately, all attempts to add such information to the sys module documentation seem to have failed... (last time I tried, I seem to remember that someone argued that "it's there, so it should be documented in a neutral fashion") From pjd at satori.za.net Mon Oct 3 18:13:05 2005 From: pjd at satori.za.net (Piet Delport) Date: Mon, 03 Oct 2005 18:13:05 +0200 Subject: [Python-Dev] Proposal for 2.5: Returning values from PEP 342 enhanced generators Message-ID: <43415891.1040804@satori.za.net> PEP 255 ("Simple Generators") closes with: > Q. Then why not allow an expression on "return" too? > > A. Perhaps we will someday. In Icon, "return expr" means both "I'm > done", and "but I have one final useful value to return too, and > this is it". At the start, and in the absence of compelling uses > for "return expr", it's simply cleaner to use "yield" exclusively > for delivering values. Now that Python 2.5 has gained enhanced generators (multitudes rejoice!), I think there is a compelling use for valued return statements in cooperative multitasking code, of the kind:

def foo():
    Data = yield Client.read()
    [...]
    MoreData = yield Client.read()
    [...]
    return FinalResult

def bar():
    Result = yield foo()

For generators written in this style, "yield" means "suspend execution of the current call until the requested result/resource can be provided", and "return" regains its full conventional meaning of "terminate the current call with a given result". The simplest / most straightforward implementation would be for "return Foo" to translate to "raise StopIteration, Foo". This is consistent with "return" translating to "raise StopIteration", and does not break any existing generator code. (Another way to think about this change is that if a plain StopIteration means "the iterator terminated", then a valued StopIteration, by extension, means "the iterator terminated with the given value".)
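Piet's proposed semantics can be exercised with a small driver that treats a valued StopIteration as the call's result. A sketch under modern Python, where "return value" in a generator does raise StopIteration with that value attached; the generator and trampoline below (read_twice, drive) are illustrative stand-ins for the foo()/Client.read() style above, not code from the proposal:

```python
def read_twice(read):
    # Generator in the style of foo(): suspend on each yield, then
    # finish with a valued return (which raises StopIteration(value)).
    first = yield read
    second = yield read
    return first + second

def drive(gen, supply):
    # Minimal trampoline: answer each yielded request, and catch the
    # valued StopIteration to recover the generator's "real" result.
    try:
        request = next(gen)
        while True:
            request = gen.send(supply(request))
    except StopIteration as stop:
        return stop.value

result = drive(read_twice("client.read"), lambda req: "data ")
print(result)  # "data data "
```

This is exactly the distinction Piet wants: intermediate yields are requests to the scheduler, while the valued return is the terminating result seen only by the caller's trampoline.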
Motivation by real-world example: One system that could benefit from this change is Christopher Armstrong's defgen.py[1] for Twisted, which he recently reincarnated (as newdefgen.py) to use enhanced generators. The resulting code is much cleaner than before, and closer to the conventional synchronous style of writing. [1] the saga of which is summarized here: http://radix.twistedmatrix.com/archives/000114.html However, because enhanced generators have no way to differentiate their intermediate results from their "real" result, the current solution is a somewhat confusing compromise: the last value yielded by the generator implicitly becomes the result returned by the call. Thus, to return something, in general, requires the idiom "yield Foo; return". If valued returns are allowed, this would become "return Foo" (and the code implementing defgen itself would probably end up simpler, as well). From jcarlson at uci.edu Mon Oct 3 18:35:34 2005 From: jcarlson at uci.edu (Josiah Carlson) Date: Mon, 03 Oct 2005 09:35:34 -0700 Subject: [Python-Dev] Divorcing str and unicode (no more implicit conversions). In-Reply-To: <1128346015.6138.149.camel@fsol> References: <1128346015.6138.149.camel@fsol> Message-ID: <20051003091416.9817.JCARLSON@uci.edu> Antoine Pitrou wrote: > > Le lundi 03 octobre 2005 à 14:59 +0200, Fredrik Lundh a écrit : > > Antoine Pitrou wrote: > > > > > A good rule of thumb is to convert to unicode everything that is > > > semantically textual > > > > and isn't pure ASCII. > > How can you be sure that something that is /semantically textual/ will > always remain "pure ASCII" ? That's contradictory, unless your software > never goes out of the anglo-saxon world (and even...). Non-unicode text input widgets. Works great. Can be had with the ANSI wxPython installation. > (it seems to me - I may be mistaken - that modern Windows versions treat > every string as 16-bit unicode internally. Why are they doing it if it > is that inefficient?)
Because modern Windows supports all sorts of symbols which are necessary for certain special English uses (greek symbols for math, etc.), and trying to support all of them without simply using the unicode backend that is already used for all of the international "builds" (isn't it just a language definition?) would be a waste of time/effort. - Josiah From jason.orendorff at gmail.com Mon Oct 3 18:37:22 2005 From: jason.orendorff at gmail.com (Jason Orendorff) Date: Mon, 3 Oct 2005 12:37:22 -0400 Subject: [Python-Dev] PEP 343 and __with__ Message-ID: I'm -1 on PEP 343. It seems ...complex. And even with all the complexity, I *still* won't be able to type

    with self.lock:
        ...

which I submit is perfectly reasonable, clean, and clear. Instead I have to type

    with locking(self.lock):
        ...

where locking() is apparently either a new builtin, a standard library function, or some 6-line contextmanager I have to write myself. So I have two suggestions. 1. I didn't find any suggestion of a __with__() method in the archives. So I feel I should suggest it. It would work just like __iter__().

class RLock:
    @contextmanager
    def __with__(self):
        self.acquire()
        try:
            yield
        finally:
            self.release()

__with__() always returns a new context manager object. Just as with iterators, a context manager object has "cm.__with__() is cm". The 'with' statement would call __with__(), of course. Optionally, the type constructor could magically apply @contextmanager to __with__() if it's a generator, which is the usual case. It looks like it already does similar magic with __new__(). Perhaps this is too cute though. 2. More radical: Let's get rid of __enter__() and __exit__(). The only example in PEP 343 that uses them is Example 4, which exists only to show that "there's more than one way to do it". It all seems fishy to me. Why not get rid of them and use only __with__()? In this scenario, Python would expect __with__() to return a coroutine (not to say "iterator") that yields exactly once.
Then the "@contextmanager" decorator wouldn't be needed on __with__(), and neither would any type constructor magic. The only drawback I see is that context manager methods implemented in C will work differently from those implemented in Python. Since C doesn't have coroutines, I imagine there would have to be enter() and exit() slots. Maybe this is a major design concern; I don't know. My apologies if this is redundant or unwelcome at this date. -j From fredrik at pythonware.com Mon Oct 3 18:42:07 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Mon, 3 Oct 2005 18:42:07 +0200 Subject: [Python-Dev] Divorcing str and unicode (no more implicit conversions). References: <1128346015.6138.149.camel@fsol> <20051003091416.9817.JCARLSON@uci.edu> Message-ID: Josiah Carlson wrote: > > > and isn't pure ASCII. > > > > How can you be sure that something that is /semantically textual/ will > > always remain "pure ASCII" ? That's contradictory, unless your software > > never goes out of the anglo-saxon world (and even...). > > Non-unicode text input widgets. Works great. Can be had with the ANSI > wxPython installation. You're both missing that Python is dynamically typed. A single string source doesn't have to return the same type of strings, as long as the objects it returns are compatible with Python's string model and with each other. Under the default encoding (and quite a few other encodings), that's true for plain ascii strings and Unicode strings. This is a good thing. From pje at telecommunity.com Mon Oct 3 19:02:40 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Mon, 03 Oct 2005 13:02:40 -0400 Subject: [Python-Dev] PEP 343 and __with__ In-Reply-To: Message-ID: <5.1.1.6.0.20051003125254.01f95e50@mail.telecommunity.com> At 12:37 PM 10/3/2005 -0400, Jason Orendorff wrote: >I'm -1 on PEP 343. It seems ...complex. And even with all the >complexity, I *still* won't be able to type > > with self.lock: ... 
> >which I submit is perfectly reasonable, clean, and clear. Which is why it's proposed to add __enter__/__exit__ to locks, and somewhat more controversially, file objects. (Guido objected on the basis that people might reuse the file object, but reusing a closed file object results in a sensible error message and so doesn't seem like a problem to me.) >[snip] >__with__() always returns a new context manager object. Just as with >iterators, a context manager object has "cm.__with__() is cm". > >The 'with' statement would call __with__(), of course. You didn't offer any reasons why this would be useful and/or good. >2. More radical: Let's get rid of __enter__() and __exit__(). The >only example in PEP 343 that uses them is Example 4, which exists only >to show that "there's more than one way to do it". It all seems fishy >to me. Why not get rid of them and use only __with__()? In this >scenario, Python would expect __with__() to return a coroutine (not to >say "iterator") that yields exactly once. Because this multiplies the difficulty of implementing context managers in C. It's easy to define a pair of C methods for __enter__ and __exit__, but an iterator requires creating another class in C. The yield-based syntax is just syntax sugar, not the essence of the proposal. >The only drawback I see is that context manager methods implemented in >C will work differently from those implemented in Python. Since C >doesn't have coroutines, I imagine there would have to be enter() and >exit() slots. Maybe this is a major design concern; I don't know. Considering your argument that locks should be contextmanagers, it would seem like a good idea for C implementations to be easy. :) >My apologies if this is redundant or unwelcome at this date. Since the PEP is accepted and has patches for both its implementation and a good part of its documentation, a major change like this would certainly need a better rationale. 
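The method pair Phillip refers to is small in Python as well as in C. A sketch of a lock exposing the PEP 343 __enter__/__exit__ protocol directly — the class name DemoLock and its internals are illustrative, not the stdlib implementation:

```python
import threading

class DemoLock:
    # Sketch: the PEP 343 protocol is just this pair of methods, which
    # is what "adding __enter__/__exit__ to locks" amounts to.
    def __init__(self):
        self._lock = threading.Lock()

    def __enter__(self):
        self._lock.acquire()
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self._lock.release()
        return False  # never swallow exceptions

lock = DemoLock()
with lock:
    # while the with-block holds the lock, a nonblocking acquire fails
    assert not lock._lock.acquire(False)
assert lock._lock.acquire(False)  # released again on exit
lock._lock.release()
```

No generator, no @contextmanager, and no __with__ indirection is needed for this common case; the same two slots map straightforwardly onto a C implementation.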
If your idea was that __with__ would somehow make it easier for locks to be context managers, it's based on a flawed premise. All that's required now is to have __enter__ and __exit__ call acquire() and release(). At this point, it's simply an open issue as to which stdlib objects will be context managers, and which will have helper functions or classes to serve as context managers. The actual API used to implement them has little or no bearing on that issue. From solipsis at pitrou.net Mon Oct 3 19:39:57 2005 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 03 Oct 2005 19:39:57 +0200 Subject: [Python-Dev] unifying str and unicode In-Reply-To: <20051003091416.9817.JCARLSON@uci.edu> References: <1128346015.6138.149.camel@fsol> <20051003091416.9817.JCARLSON@uci.edu> Message-ID: <1128361197.6138.212.camel@fsol> Hi, Josiah: > > How can you be sure that something that is /semantically textual/ will > > always remain "pure ASCII" ? That's contradictory, unless your software > > never goes out of the anglo-saxon world (and even...). > > Non-unicode text input widgets. You didn't understand my statement. I didn't mean:

- how can you /technically enforce/ no unicode text at all

but:

- how can you be sure that your users will never /want/ to enter some text that can't be represented with the current 8-bit charset?

Of course the answer to the latter is: you can't. Fredrik: > Under the default encoding (and quite a few other encodings), that's true for > plain ascii strings and Unicode strings. If I have a unicode string containing legal characters greater than 0x7F, and I pass it to a function which converts it to str, the conversion fails. If I have an 8-bit string containing legal non-ascii characters in it (for example the name of a file as returned by the filesystem, which I of course have no prior control on), and I give it to a function which does an implicit conversion to unicode, the conversion fails. Here is an example so that you really understand.
I am under a French locale (iso-8859-15), let's just try to enter a French word and see what happens when converting to unicode:

-> As a string constant:

>>> s = "été"
>>> s
'\xe9t\xe9'
>>> u = unicode(s)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe9 in position 0: ordinal not in range(128)

-> By asking for input:

>>> s = raw_input()
été
>>> s
'\xe9t\xe9'
>>> unicode(s)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe9 in position 0: ordinal not in range(128)

It should work, but it fails miserably. In the current situation, if the programmer doesn't carefully plan for these cases by manually managing conversions (which of course he can do - but it's boring and bothersome - not to mention that many programmers do not even understand the issue!), some users will see the program die with a nasty exception, just because they happen to need a bit more than the plain latin alphabet without diacritics. (even the standard Python library is bitten: witness the weird getcwd() / getcwdu() pair...) I find it surprising that you claim there is no difficulty when everything points to the contrary. See for example how often confused developers ask for help on mailing-lists... Regards Antoine. From mwh at python.net Mon Oct 3 20:02:13 2005 From: mwh at python.net (Michael Hudson) Date: Mon, 03 Oct 2005 19:02:13 +0100 Subject: [Python-Dev] PEP 343 and __with__ In-Reply-To: <5.1.1.6.0.20051003125254.01f95e50@mail.telecommunity.com> (Phillip J. Eby's message of "Mon, 03 Oct 2005 13:02:40 -0400") References: <5.1.1.6.0.20051003125254.01f95e50@mail.telecommunity.com> Message-ID: <2my85aeaxm.fsf@starship.python.net> "Phillip J. Eby" writes: > Since the PEP is accepted and has patches for both its implementation and a > good part of its documentation, a major change like this would certainly > need a better rationale.
Though given the amount of interest said patch has attracted (none at all) perhaps no one cares very much and the proposal should be dropped. Which would be a shame given the time I spent on it and all the hot air here on python-dev... Cheers, mwh (who still likes PEP 343 and doesn't particularly like Jason's suggested changes). -- Gevalia is undrinkable low-octane see-through only slightly roasted bilge water. Compared to .us coffee it is quite drinkable. -- Måns Nilsson, asr From guido at python.org Mon Oct 3 20:07:07 2005 From: guido at python.org (Guido van Rossum) Date: Mon, 3 Oct 2005 11:07:07 -0700 Subject: [Python-Dev] PEP 343 and __with__ In-Reply-To: <2my85aeaxm.fsf@starship.python.net> References: <5.1.1.6.0.20051003125254.01f95e50@mail.telecommunity.com> <2my85aeaxm.fsf@starship.python.net> Message-ID: For the record, I very much want PEPs 342 and 343 implemented. I haven't had the time to look at the patch and don't expect to find the time any time soon, but it's not for lack of desire to see this feature implemented. I don't like Jason's __with__ proposal and even less like his idea to drop __enter__ and __exit__ (I think this would just make it harder to provide efficient implementations in C). I'm all for adding __enter__ and __exit__ to locks. I'm even considering that it might be a good idea to add them to files. For the record, here at Elemental we write a lot of Java code that uses database connections in a pattern that would have greatly benefited from a similar construct in Java. :) --Guido On 10/3/05, Michael Hudson wrote: > "Phillip J. Eby" writes: > > > Since the PEP is accepted and has patches for both its implementation and a > > good part of its documentation, a major change like this would certainly > > need a better rationale. > > Though given the amount of interest said patch has attracted (none at > all) perhaps no one cares very much and the proposal should be dropped.
> Which would be a shame given the time I spent on it and all the hot > air here on python-dev... > > Cheers, > mwh > (who still likes PEP 343 and doesn't particularly like Jason's > suggested changes). > > -- > Gevalia is undrinkable low-octane see-through only slightly > roasted bilge water. Compared to .us coffee it is quite > drinkable. -- Måns Nilsson, asr > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/guido%40python.org > -- --Guido van Rossum (home page: http://www.python.org/~guido/) From fredrik at pythonware.com Mon Oct 3 20:37:55 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Mon, 3 Oct 2005 20:37:55 +0200 Subject: [Python-Dev] unifying str and unicode References: <1128346015.6138.149.camel@fsol> <20051003091416.9817.JCARLSON@uci.edu> <1128361197.6138.212.camel@fsol> Message-ID: Antoine Pitrou wrote: > > Under the default encoding (and quite a few other encodings), that's true for > > plain ascii strings and Unicode strings. > > If I have a unicode string containing legal characters greater than > 0x7F, and I pass it to a function which converts it to str, the > conversion fails. so? if it does that, it's not unicode safe. what does that have to do with my argument (which is that you can safely mix ascii strings and unicode strings, because that's how things were designed). > Here is an example so that you really understand. I wrote the unicode type. I do understand how it works. From pje at telecommunity.com Mon Oct 3 21:20:34 2005 From: pje at telecommunity.com (Phillip J.
Eby) Date: Mon, 03 Oct 2005 15:20:34 -0400 Subject: [Python-Dev] PEP 343 and __with__ In-Reply-To: <2my85aeaxm.fsf@starship.python.net> References: <5.1.1.6.0.20051003125254.01f95e50@mail.telecommunity.com> <5.1.1.6.0.20051003125254.01f95e50@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20051003151012.01f95ca0@mail.telecommunity.com> At 07:02 PM 10/3/2005 +0100, Michael Hudson wrote: >"Phillip J. Eby" writes: > > > Since the PEP is accepted and has patches for both its implementation > and a > > good part of its documentation, a major change like this would certainly > > need a better rationale. > >Though given the amount of interest said patch has attracted (none at >all) Actually, I have been reading the patch and meant to comment on it. I was perplexed by the odd stack behavior of the new opcode until I realized that it's try/finally that's weird. :) I was planning to look into whether that could be cleaned up as well, when I got distracted and didn't go back to it. > perhaps no one cares very much and the proposal should be dropped. I care an awful lot, as 'with' is another framework-dissolving tool that makes it possible to do more things in library form, without needing to resort to template methods. It also enables more context-sensitive programming, in that "global" states can be set and restored in a structured fashion. It may take a while to feel the effects, but it's going to be a big improvement to Python, maybe as big as new-style classes, and certainly bigger than decorators. From solipsis at pitrou.net Mon Oct 3 21:37:22 2005 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 03 Oct 2005 21:37:22 +0200 Subject: [Python-Dev] unifying str and unicode In-Reply-To: References: <1128346015.6138.149.camel@fsol> <20051003091416.9817.JCARLSON@uci.edu> <1128361197.6138.212.camel@fsol> Message-ID: <1128368242.6138.258.camel@fsol> Hi, Le lundi 03 octobre 2005 à
20:37 +0200, Fredrik Lundh a écrit : > > If I have a unicode string containing legal characters greater than > > 0x7F, and I pass it to a function which converts it to str, the > > conversion fails. > > so? if it does that, it's not unicode safe. [...] > what does that have to do with > my argument (which is that you can safely mix ascii strings and unicode > strings, because that's how things were designed). If that's how things were designed, then Python's entire standard library (not to mention third-party libraries) is not "unicode safe" - to quote your own words - since many functions may return 8-bit strings containing non-ascii characters. There lies the problem for many people, until the stdlib is fixed - or until the string types are changed. That's why you very regularly see people complaining about how conversions sometimes break their code in various ways. Anyway, I don't think we will reach an agreement here. We have different expectations w.r.t. how the programming language may/should handle general text. I propose we end the discussion. Regards Antoine. From fredrik at pythonware.com Mon Oct 3 21:47:19 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Mon, 3 Oct 2005 21:47:19 +0200 Subject: [Python-Dev] unifying str and unicode References: <1128346015.6138.149.camel@fsol> <20051003091416.9817.JCARLSON@uci.edu> <1128361197.6138.212.camel@fsol> <1128368242.6138.258.camel@fsol> Message-ID: Antoine Pitrou wrote: > > > If I have a unicode string containing legal characters greater than > > > 0x7F, and I pass it to a function which converts it to str, the > > > conversion fails. > > > > so? if it does that, it's not unicode safe. > [...] > > what does that have to do with > > my argument (which is that you can safely mix ascii strings and unicode > > strings, because that's how things were designed).
> > If that's how things were designed, then Python's entire standard > library (not to mention third-party libraries) is not "unicode safe" - > to quote your own words - since many functions may return 8-bit strings > containing non-ascii characters. huh? first you talk about functions that convert unicode strings to 8-bit strings, now you talk about functions that return raw 8-bit strings? and all this in response to a post that argues that it's in fact a good idea to use plain strings to hold textual data that happens to contain ASCII only, because 1) it works, by design, and 2) it's almost always more efficient. if you don't know what your own argument is, you cannot expect anyone to understand it. From martin at v.loewis.de Mon Oct 3 22:32:11 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 03 Oct 2005 22:32:11 +0200 Subject: [Python-Dev] --disable-unicode (Tests and unicode) In-Reply-To: <43411AC0.6040000@egenix.com> References: <43402FD6.7030905@v.loewis.de> <43411AC0.6040000@egenix.com> Message-ID: <4341954B.6050300@v.loewis.de> M.-A. Lemburg wrote: > Is the added complexity needed to support not having Unicode support > compiled into Python really worth it ? If there are volunteers willing to maintain it, and the other volunteers are not affected: certainly. > I know that Martin introduced this feature a long time ago, > so he will have had a reason for it. I added it because users requested it. I personally never use it. > Today, I think the situation has changed: computers have more > memory, are faster and the need to integrate (e.g. via XML) > is stronger than ever - and maybe we should consider removing > the option to get a cleaner code base with fewer #ifdefs > and SyntaxErrors from the standard lib. > > What do you think ? -0 for just ripping it out. +0 if PEP 5 is followed, at least in spirit (i.e. give users advance warning to let them protest).
I guess users in embedded builds (either in embedded systems, or embedding Python into some other application) might still be interested in the feature. Of course, these users could either recreate the feature if we remove it, or just stay with Python 2.4. Regards, Martin From solipsis at pitrou.net Mon Oct 3 22:38:19 2005 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 03 Oct 2005 22:38:19 +0200 Subject: [Python-Dev] unifying str and unicode In-Reply-To: References: <1128346015.6138.149.camel@fsol> <20051003091416.9817.JCARLSON@uci.edu> <1128361197.6138.212.camel@fsol> <1128368242.6138.258.camel@fsol> Message-ID: <1128371900.6138.299.camel@fsol> > > If that's how things were designed, then Python's entire standard > > library (not to mention third-party libraries) is not "unicode safe" - > > to quote your own words - since many functions may return 8-bit strings > > containing non-ascii characters. > > huh? first you talk about functions that convert unicode strings to 8-bit > strings, now you talk about functions that return raw 8-bit strings? Are you deliberately missing the argument? And can't you understand that conversions are problematic in both directions (str -> unicode /and/ unicode -> str)? If an stdlib function returns an 8-bit string containing non-ascii data, then this string used in unicode context incurs an implicit conversion, which fails. How's that for "unicode safety" of stdlib functions? Will you argue that this gives no difficulties to anyone ? > all this in response to a post that argues that it's in fact a good idea to > use plain strings to hold textual data that happens to contain ASCII only, To which you apparently didn't read my answer, that is: you can never be sure that a variable containing something which is /semantically/ textual (*) will never contain anything other than ASCII text. For example raw_input() won't tell you that its 8-bit string result contains some chars > 0x7F. Same for many other library functions.
How do you cope with (more or less occasional) non-ascii data coming in as 8-bit strings? (*) that is, contains some natural language Either you carefully plan for non-ascii text coming in your application (including workarounds against Python's ascii-by-default conversion policy), or you deliberately cripple your application by deciding that non-ASCII text is forbidden in (some or all) places. Choose the latter and you'll be hostile to users. And this thread began with a poster who found difficult the way implicit conversions happen in Python. So it's very funny that you deny the existence of a problem for certain developers. Antoine. From mal at egenix.com Mon Oct 3 22:52:04 2005 From: mal at egenix.com (M.-A. Lemburg) Date: Mon, 03 Oct 2005 22:52:04 +0200 Subject: [Python-Dev] --disable-unicode (Tests and unicode) In-Reply-To: <4341954B.6050300@v.loewis.de> References: <43402FD6.7030905@v.loewis.de> <43411AC0.6040000@egenix.com> <4341954B.6050300@v.loewis.de> Message-ID: <434199F4.3010905@egenix.com> Martin v. Löwis wrote: > M.-A. Lemburg wrote: > >>Is the added complexity needed to support not having Unicode support >>compiled into Python really worth it ? > > If there are volunteers willing to maintain it, and the other volunteers > are not affected: certainly. No objections there. I only see that --disable-unicode has already been broken a couple of times in the past and no-one (except those running test suites regularly) really noticed - at least not AFAIK. >>I know that Martin introduced this feature a long time ago, >>so he will have had a reason for it. > > I added it because users requested it. I personally never use it. > >>Today, I think the situation has changed: computers have more >>memory, are faster and the need to integrate (e.g. via XML) >>is stronger than ever - and maybe we should consider removing >>the option to get a cleaner code base with fewer #ifdefs >>and SyntaxErrors from the standard lib. >> >>What do you think ?
> > -0 for just ripping it out. +0 if PEP 5 is followed, at least > in spirit (i.e. give users advance warning to let them protest). > > I guess users in embedded builds (either in embedded systems, > or embedding Python into some other application) might still > be interested in the feature. Of course, these users could either > recreate the feature if we remove it, or just stay with > Python 2.4. If embedded build users rely on it, I'd suggest that these users take over maintenance of the patch set. Let's add a note to the configure switch that the feature will be removed in 2.6 and see what happens. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Sep 30 2005) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From pje at telecommunity.com Mon Oct 3 22:56:34 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Mon, 03 Oct 2005 16:56:34 -0400 Subject: [Python-Dev] unifying str and unicode In-Reply-To: <1128371900.6138.299.camel@fsol> References: <1128346015.6138.149.camel@fsol> <20051003091416.9817.JCARLSON@uci.edu> <1128361197.6138.212.camel@fsol> <1128368242.6138.258.camel@fsol> Message-ID: <5.1.1.6.0.20051003165105.03d84008@mail.telecommunity.com> At 10:38 PM 10/3/2005 +0200, Antoine Pitrou wrote: >To which you apparently didn't read my answer, that is: >you can never be sure that a variable containing something which >is /semantically/ textual (*) will never contain anything other than >ASCII text. For example raw_input() won't tell you that its 8-bit string >result contains some chars > 0x7F. Same for many other library >functions. How do you cope with (more or less occasional) non-ascii data >coming in as 8-bit strings?
Presumably in Python 3.0, opening a file in "text" mode will require an encoding to be specified, and opening it in "binary" mode will cause it to produce or consume byte arrays, not strings. This should apply to sockets too, and really any I/O facility, including GUI frameworks, DBAPI objects, os.listdir(), etc. Of course, to get there we really need to add a convenient bytes type, perhaps by enhancing the current 'array' module. It'd be nice to have a way to get this in 2.x versions so people can start fixing stuff to work the right way. With no 8-bit strings coming in, there should be no unicode/str problems except those you create yourself. From solipsis at pitrou.net Mon Oct 3 22:59:02 2005 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 03 Oct 2005 22:59:02 +0200 Subject: [Python-Dev] bytes type In-Reply-To: <5.1.1.6.0.20051003165105.03d84008@mail.telecommunity.com> References: <1128346015.6138.149.camel@fsol> <20051003091416.9817.JCARLSON@uci.edu> <1128361197.6138.212.camel@fsol> <1128368242.6138.258.camel@fsol> <5.1.1.6.0.20051003165105.03d84008@mail.telecommunity.com> Message-ID: <1128373142.6138.301.camel@fsol> > Presumably in Python 3.0, opening a file in "text" mode will require an > encoding to be specified, and opening it in "binary" mode will cause it to > produce or consume byte arrays, not strings. This should apply to sockets > too, and really any I/O facility, including GUI frameworks, DBAPI objects, > os.listdir(), etc. Great :) > Of course, to get there we really need to add a convenient bytes type, > perhaps by enhancing the current 'array' module. It'd be nice to have a > way to get this in 2.x versions so people can start fixing stuff to work > the right way. Could the "bytes" type be just the same as the current "str" type but without the implicit unicode conversion ? Or am I missing some desired functionality ? 
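[Editorial aside, not part of the archived thread: the text/bytes split sketched above is essentially what later shipped. A minimal illustration in modern Python 3 terms, where bytes and str never convert implicitly and every conversion names an encoding:]

```python
# bytes and str are distinct types; conversion is always explicit.
raw = "caf\u00e9".encode("utf-8")   # str -> bytes, encoding named
assert isinstance(raw, bytes)
assert raw == b"caf\xc3\xa9"        # U+00E9 encodes as two UTF-8 bytes

# Decoding with the wrong codec fails loudly instead of silently
# producing mojibake -- the implicit-conversion hazard debated in this
# thread simply cannot occur.
assert raw.decode("utf-8") == "caf\u00e9"
try:
    raw.decode("ascii")
except UnicodeDecodeError:
    pass  # non-ASCII bytes cannot be decoded as ASCII
```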
From guido at python.org Mon Oct 3 23:02:37 2005 From: guido at python.org (Guido van Rossum) Date: Mon, 3 Oct 2005 14:02:37 -0700 Subject: [Python-Dev] bytes type In-Reply-To: <1128373142.6138.301.camel@fsol> References: <1128346015.6138.149.camel@fsol> <20051003091416.9817.JCARLSON@uci.edu> <1128361197.6138.212.camel@fsol> <1128368242.6138.258.camel@fsol> <5.1.1.6.0.20051003165105.03d84008@mail.telecommunity.com> <1128373142.6138.301.camel@fsol> Message-ID: On 10/3/05, Antoine Pitrou wrote: > Could the "bytes" type be just the same as the current "str" type but > without the implicit unicode conversion ? Or am I missing some desired > functionality ? No. It will be a mutable array of bytes. It will intentionally resemble strings as little as possible. There won't be a literal for it. But you will be able to convert between bytes and strings quite easily by specifying an encoding. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From jason.orendorff at gmail.com Mon Oct 3 23:15:26 2005 From: jason.orendorff at gmail.com (Jason Orendorff) Date: Mon, 3 Oct 2005 17:15:26 -0400 Subject: [Python-Dev] PEP 343 and __with__ In-Reply-To: <5.1.1.6.0.20051003125254.01f95e50@mail.telecommunity.com> References: <5.1.1.6.0.20051003125254.01f95e50@mail.telecommunity.com> Message-ID: Phillip J. Eby writes: > You didn't offer any reasons why this would be useful and/or good. It makes it dramatically easier to write Python classes that correctly support 'with'. I don't see any simple way to do this under PEP 343; the only sane thing to do is write a separate @contextmanager generator, as all of the examples do. Consider:

    # decimal.py
    class Context:
        ...
        def __enter__(self):
            ???
        def __exit__(self, t, v, tb):
            ???

    DefaultContext = Context(...)

Kindly implement __enter__() and __exit__(). Make sure your implementation is thread-safe (not easy, even though decimal.getcontext/.setcontext are thread-safe!).
Also make sure it supports nested 'with DefaultContext:' blocks (I don't mean lexically nested, of course; I mean nested at runtime.) The answer requires thread-local storage and a separate stack of saved context objects per thread. It seems a little ridiculous to me. Whereas:

    class Context:
        ...
        def __with__(self):
            old = decimal.getcontext()
            decimal.setcontext(self)
            try:
                yield
            finally:
                decimal.setcontext(old)

As for the second proposal, I was thinking we'd have one mental model for context managers (block template generators), rather than two (generators vs. enter/exit methods). Enter/exit seemed superfluous, given the examples in the PEP. > [T]his multiplies the difficulty of implementing context managers in C. Nonsense.

    static PyObject *
    lock_with()
    {
        return PyContextManager_FromCFunctions(self, lock_acquire,
                                               lock_release);
    }

There probably ought to be such an API even if my suggestion is in fact garbage (as, admittedly, still seems the most likely thing). Cheers, -j From martin at v.loewis.de Mon Oct 3 23:28:46 2005 From: martin at v.loewis.de (Martin v. Löwis) Date: Mon, 03 Oct 2005 23:28:46 +0200 Subject: [Python-Dev] unifying str and unicode In-Reply-To: <1128371900.6138.299.camel@fsol> References: <1128346015.6138.149.camel@fsol> <20051003091416.9817.JCARLSON@uci.edu> <1128361197.6138.212.camel@fsol> <1128368242.6138.258.camel@fsol> <1128371900.6138.299.camel@fsol> Message-ID: <4341A28E.7060506@v.loewis.de> Antoine Pitrou wrote: > To which you apparently didn't read my answer, that is: > you can never be sure that a variable containing something which > is /semantically/ textual (*) will never contain anything other than > ASCII text. That is simply not true. There are variables that are semantically textual, yet I can be sure that this is a byte string only if it consists just of ASCII. For example, if you invoke a Tkinter function, it will return a byte string if the result is purely ASCII, else return a Unicode string.
This is an interface guarantee, hence I can be sure. Regards, Martin From blais at furius.ca Mon Oct 3 23:35:40 2005 From: blais at furius.ca (Martin Blais) Date: Mon, 3 Oct 2005 17:35:40 -0400 Subject: [Python-Dev] unifying str and unicode In-Reply-To: <1128371900.6138.299.camel@fsol> References: <1128346015.6138.149.camel@fsol> <20051003091416.9817.JCARLSON@uci.edu> <1128361197.6138.212.camel@fsol> <1128368242.6138.258.camel@fsol> <1128371900.6138.299.camel@fsol> Message-ID: <8393fff0510031435n7ef19cbcg297b8881d75d0a08@mail.gmail.com> On 10/3/05, Antoine Pitrou wrote: > > > > If that's how things were designed, then Python's entire standard > > > library (not to mention third-party libraries) is not "unicode safe" - > > > to quote your own words - since many functions may return 8-bit strings > > > containing non-ascii characters. > > > > huh? first you talk about functions that convert unicode strings to 8-bit > > strings, now you talk about functions that return raw 8-bit strings? > > Are you deliberately missing the argument? > And can't you understand that conversions are problematic in both > directions (str -> unicode /and/ unicode -> str)? Both directions are a problem. Just a note: it's not so much the conversions that I find problematic, but rather the implicit nature of the conversions (combined with the fact that they may fail). In addition to being difficult to track down, these implicit conversions may be costing processing time as well. cheers, From pje at telecommunity.com Tue Oct 4 01:26:39 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Mon, 03 Oct 2005 19:26:39 -0400 Subject: [Python-Dev] PEP 343 and __with__ In-Reply-To: References: <5.1.1.6.0.20051003125254.01f95e50@mail.telecommunity.com> <5.1.1.6.0.20051003125254.01f95e50@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20051003191825.01f8be98@mail.telecommunity.com> At 05:15 PM 10/3/2005 -0400, Jason Orendorff wrote: >Phillip J.
Eby writes: > > You didn't offer any reasons why this would be useful and/or good. > >It makes it dramatically easier to write Python classes that correctly >support 'with'. I don't see any simple way to do this under PEP 343; >the only sane thing to do is write a separate @contextmanager >generator, as all of the examples do. Wha? For locks (the example you originally gave), this is trivial.

>Consider:
>
>    # decimal.py
>    class Context:
>        ...
>        def __enter__(self):
>            ???
>        def __exit__(self, t, v, tb):
>            ???
>
>    DefaultContext = Context(...)

>Kindly implement __enter__() and __exit__(). Make sure your >implementation is thread-safe (not easy, even though >decimal.getcontext/.setcontext are thread-safe!). Also make sure it >supports nested 'with DefaultContext:' blocks (I don't mean lexically >nested, of course; I mean nested at runtime.) > >The answer requires thread-local storage and a separate stack of saved >context objects per thread. It seems a little ridiculous to me. Okay, it was completely non-obvious from your post that this was the problem you're trying to solve.

>Whereas:
>
>    class Context:
>        ...
>        def __with__(self):
>            old = decimal.getcontext()
>            decimal.setcontext(self)
>            try:
>                yield
>            finally:
>                decimal.setcontext(old)

This could also be done with a Context.replace() @contextmanager method. On the whole, I'm torn. I definitely like the additional flexibility this gives. On the other hand, it seems to me that __with__ and the additional C baggage violates the "if the implementation is hard to explain" rule. Also, people have already put a lot of effort into implementation and documentation patches based on an accepted PEP. That's not enough to override "the right thing to do", especially if it comes with a volunteer willing to update the work, but in this case the amount of additional goodness seems small, and it's not immediately apparent that you're volunteering to help change this even if Guido blessed it.
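[Editorial aside: the "separate @contextmanager generator" both posters mention can be sketched concretely with contextlib as it shipped with PEP 343 (shown here in modern Python). Because decimal.getcontext()/setcontext() operate on per-thread state, this version needs no explicit stack of saved contexts to be thread-safe; decimal later grew localcontext(), which does essentially this.]

```python
import decimal
from contextlib import contextmanager

@contextmanager
def use_context(ctx):
    # Swap in ctx for the calling thread; restore the old context on
    # exit, even if the block raises.
    old = decimal.getcontext()
    decimal.setcontext(ctx)
    try:
        yield ctx
    finally:
        decimal.setcontext(old)

before = decimal.getcontext().prec
with use_context(decimal.Context(prec=4)):
    # Arithmetic inside the block uses the 4-digit context.
    assert decimal.Decimal(1) / decimal.Decimal(3) == decimal.Decimal("0.3333")
assert decimal.getcontext().prec == before  # previous context restored
```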
From mal at egenix.com Tue Oct 4 01:35:57 2005 From: mal at egenix.com (M.-A. Lemburg) Date: Tue, 04 Oct 2005 01:35:57 +0200 Subject: [Python-Dev] unifying str and unicode In-Reply-To: <8393fff0510031435n7ef19cbcg297b8881d75d0a08@mail.gmail.com> References: <1128346015.6138.149.camel@fsol> <20051003091416.9817.JCARLSON@uci.edu> <1128361197.6138.212.camel@fsol> <1128368242.6138.258.camel@fsol> <1128371900.6138.299.camel@fsol> <8393fff0510031435n7ef19cbcg297b8881d75d0a08@mail.gmail.com> Message-ID: <4341C05D.5000706@egenix.com> Martin Blais wrote: > On 10/3/05, Antoine Pitrou wrote: > >>>>If that's how things were designed, then Python's entire standard >>>>library (not to mention third-party libraries) is not "unicode safe" - >>>>to quote your own words - since many functions may return 8-bit strings >>>>containing non-ascii characters. >>> >>>huh? first you talk about functions that convert unicode strings to 8-bit >>>strings, now you talk about functions that return raw 8-bit strings? >> >>Are you deliberately missing the argument? >>And can't you understand that conversions are problematic in both >>directions (str -> unicode /and/ unicode -> str)? > > > Both directions are a problem. > > Just a note: it's not so much the conversions that I find problematic, > but rather the implicit nature of the conversions (combined with the > fact that they may fail). In addition to being difficult to track > down, these implicit conversions may be costing processing time as > well. We've already pointed you to a solution which you might want to use. Why don't you just try it ? BTW, if you want to read up on all the reasons why Unicode was done the way it was, have a look at: http://www.python.org/peps/pep-0100.html and read up in the python-dev archives: http://mail.python.org/pipermail/python-dev/2000-March/thread.html and the next months after the initial checkin.
From what I've read on the web about the Python Unicode implementation we have one of the better ones compared to other languages' implementations and their choices and design decisions. None of them is perfect, but that seems to be an inherent problem with Unicode no matter how you try to approach it - even more so, if you are trying to add it to a language that has used ordinary C strings for text from day 1. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Sep 30 2005) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From solipsis at pitrou.net Tue Oct 4 02:37:42 2005 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 04 Oct 2005 02:37:42 +0200 Subject: [Python-Dev] bytes type In-Reply-To: References: <1128346015.6138.149.camel@fsol> <20051003091416.9817.JCARLSON@uci.edu> <1128361197.6138.212.camel@fsol> <1128368242.6138.258.camel@fsol> <5.1.1.6.0.20051003165105.03d84008@mail.telecommunity.com> <1128373142.6138.301.camel@fsol> Message-ID: <1128386262.6138.342.camel@fsol> Le lundi 03 octobre 2005 à 14:02 -0700, Guido van Rossum a écrit : > On 10/3/05, Antoine Pitrou wrote: > > Could the "bytes" type be just the same as the current "str" type but > > without the implicit unicode conversion ? Or am I missing some desired > > functionality ? > > No. It will be a mutable array of bytes. It will intentionally > resemble strings as little as possible. There won't be a literal for > it. Thinking about it, it may have to offer the search and replace facilities offered by strings (including regular expressions). Here is a use case : say I'm reading an HTML file (or receiving it over the network).
Since the character encoding can be specified in the HTML file itself (in the ...), I must first receive it as a bytes object. But then I must fetch the encoding information from the HTML header: therefore I must use some string ops on the bytes object to parse this information. Only after I have discovered the encoding, can I finally convert the bytes object to a text string. Or would there be another way to do it? From guido at python.org Tue Oct 4 02:42:49 2005 From: guido at python.org (Guido van Rossum) Date: Mon, 3 Oct 2005 17:42:49 -0700 Subject: [Python-Dev] bytes type In-Reply-To: <1128386262.6138.342.camel@fsol> References: <20051003091416.9817.JCARLSON@uci.edu> <1128361197.6138.212.camel@fsol> <1128368242.6138.258.camel@fsol> <5.1.1.6.0.20051003165105.03d84008@mail.telecommunity.com> <1128373142.6138.301.camel@fsol> <1128386262.6138.342.camel@fsol> Message-ID: This would presumably support the (read-only part of the) buffer API so search would be covered. I don't see a use case for replace. Alternatively, you could always specify Latin-1 as the encoding and convert it that way -- I don't think there's any input that can cause Latin-1 decoding to fail. On 10/3/05, Antoine Pitrou wrote: > Le lundi 03 octobre 2005 à 14:02 -0700, Guido van Rossum a écrit : > > On 10/3/05, Antoine Pitrou wrote: > > > Could the "bytes" type be just the same as the current "str" type but > > > without the implicit unicode conversion ? Or am I missing some desired > > > functionality ? > > > > No. It will be a mutable array of bytes. It will intentionally > > resemble strings as little as possible. There won't be a literal for > > it. > > Thinking about it, it may have to offer the search and replace > facilities offered by strings (including regular expressions). > > Here is a use case : say I'm reading an HTML file (or receiving it over > the network).
Since the character encoding can be specified in the HTML > file itself (in the ...), I must first receive it as a > bytes object. But then I must fetch the encoding information from the > HTML header: therefore I must use some string ops on the bytes object to > parse this information. Only after I have discovered the encoding, can I > finally convert the bytes object to a text string. > > Or would there be another way to do it? > > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/guido%40python.org > -- --Guido van Rossum (home page: http://www.python.org/~guido/) From solipsis at pitrou.net Tue Oct 4 02:50:34 2005 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 04 Oct 2005 02:50:34 +0200 Subject: [Python-Dev] bytes type In-Reply-To: References: <20051003091416.9817.JCARLSON@uci.edu> <1128361197.6138.212.camel@fsol> <1128368242.6138.258.camel@fsol> <5.1.1.6.0.20051003165105.03d84008@mail.telecommunity.com> <1128373142.6138.301.camel@fsol> <1128386262.6138.342.camel@fsol> Message-ID: <1128387034.6138.355.camel@fsol> Le lundi 03 octobre 2005 à 17:42 -0700, Guido van Rossum a écrit : > I don't see a use case for replace. Agreed. > Alternatively, you could always specify Latin-1 as the encoding and > convert it that way -- I don't think there's any input that can cause > Latin-1 decoding to fail. You seem to be right. « In 1992, the IANA registered the character map ISO-8859-1 (note the extra hyphen), a superset of ISO/IEC 8859-1, for use on the Internet. This map assigns control characters to the code values 00-1F, 7F, and 80-9F. It thus provides for 256 characters via every possible 8-bit value. » http://en.wikipedia.org/wiki/ISO_8859-1#ISO-8859-1 Regards Antoine.
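[Editorial aside: the claim is easy to verify in a few lines — ISO-8859-1 maps every byte value 0x00-0xFF to the Unicode code point with the same number, so decoding arbitrary bytes as Latin-1 can never fail and round-trips exactly:]

```python
# Every possible 8-bit value, decoded and re-encoded without loss.
data = bytes(range(256))
text = data.decode("latin-1")      # cannot raise: byte n -> U+00n
assert [ord(c) for c in text] == list(range(256))
assert text.encode("latin-1") == data
```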
From pjd at satori.za.net Mon Oct 3 07:53:50 2005 From: pjd at satori.za.net (Piet Delport) Date: Mon, 03 Oct 2005 07:53:50 +0200 Subject: [Python-Dev] Proposal for 2.5: Returning values from PEP 342 enhanced generators Message-ID: <4340C76E.8020502@satori.za.net> PEP 255 ("Simple Generators") closes with:

> Q. Then why not allow an expression on "return" too?
>
> A. Perhaps we will someday. In Icon, "return expr" means both "I'm
>    done", and "but I have one final useful value to return too, and
>    this is it". At the start, and in the absence of compelling uses
>    for "return expr", it's simply cleaner to use "yield" exclusively
>    for delivering values.

Now that Python 2.5 gained enhanced generators (multitudes rejoice!), I think there is a compelling use for valued return statements in cooperative multitasking code, of the kind:

    def foo():
        Data = yield Client.read()
        [...]
        MoreData = yield Client.read()
        [...]
        return FinalResult

    def bar():
        Result = yield foo()

For generators written in this style, "yield" means "suspend execution of the current call until the requested result/resource can be provided", and "return" regains its full conventional meaning of "terminate the current call with a given result". The simplest / most straightforward implementation would be for "return Foo" to translate to "raise StopIteration, Foo". This is consistent with "return" translating to "raise StopIteration", and does not break any existing generator code. (Another way to think about this change is that if a plain StopIteration means "the iterator terminated", then a valued StopIteration, by extension, means "the iterator terminated with the given value".) Motivation by real-world example: One system that could benefit from this change is Christopher Armstrong's defgen.py[1] for Twisted, which he recently reincarnated (as newdefgen.py) to use enhanced generators. The resulting code is much cleaner than before, and closer to the conventional synchronous style of writing.
[1] the saga of which is summarized here: http://radix.twistedmatrix.com/archives/000114.html However, because enhanced generators have no way to differentiate their intermediate results from their "real" result, the current solution is a somewhat confusing compromise: the last value yielded by the generator implicitly becomes the result returned by the call. Thus, to return something, in general, requires the idiom "yield Foo; return". If valued returns are allowed, this would become "return Foo" (and the code implementing defgen itself would probably end up simpler, as well). From tonynelson at georgeanelson.com Tue Oct 4 03:11:29 2005 From: tonynelson at georgeanelson.com (Tony Nelson) Date: Mon, 3 Oct 2005 21:11:29 -0400 Subject: [Python-Dev] Unicode charmap decoders slow Message-ID: Is there a faster way to transcode from 8-bit chars (charmaps) to utf-8 than going through unicode()? I'm writing a small card-file program. As a test, I use a 53 MB MBox file, in mac-roman encoding. My program reads and parses the file into messages in about 3 to 5 seconds (Wow! Go Python!), but takes about 14 seconds to iterate over the cards and convert them to utf-8:

    for i in xrange(len(cards)):
        u = unicode(cards[i], encoding)
        cards[i] = u.encode('utf-8')

The time is nearly all in the unicode() call. It's not so much how much time it takes, but that it takes 4 times as long as the real work, just to do table lookups. Looking at the source (which, if I have it right, is PyUnicode_DecodeCharmap() in unicodeobject.c), I think it is doing a dictionary lookup for each character. I would have thought that it would make and cache a LUT the size of the charmap (and hook the relevant dictionary stuff to delete the cached LUT if the dictionary is changed). (You may consider this a request for enhancement.
;) I thought of using U"".translate(), but the unicode version is defined to be slow, and anyway I can't find any way to just shove my 8-bit data into a unicode string without translation. Is there some similar approach? I'm almost (but not quite) ready to try it in Pyrex. I'm new to Python. I didn't google anything relevant on python.org or in groups. I posted this in comp.lang.python yesterday, got a couple of responses, but I think this may be too sophisticated a question for that group. I'm not a member of this list, so please copy me on replies so I don't have to hunt them down in the archive. ____________________________________________________________________ TonyN.:' ' From radeex at gmail.com Tue Oct 4 04:03:41 2005 From: radeex at gmail.com (Christopher Armstrong) Date: Tue, 4 Oct 2005 13:03:41 +1100 Subject: [Python-Dev] Proposal for 2.5: Returning values from PEP 342 enhanced generators In-Reply-To: <43415891.1040804@satori.za.net> References: <43415891.1040804@satori.za.net> Message-ID: <60ed19d40510031903h3b33c62cifbadcfd7aa83cdd2@mail.gmail.com> On 10/4/05, Piet Delport wrote: > One system that could benefit from this change is Christopher Armstrong's > defgen.py[1] for Twisted, which he recently reincarnated (as newdefgen.py) to > use enhanced generators. The resulting code is much cleaner than before, and > closer to the conventional synchronous style of writing. > > [1] the saga of which is summarized here: > http://radix.twistedmatrix.com/archives/000114.html > > However, because enhanced generators have no way to differentiate their > intermediate results from their "real" result, the current solution is a > somewhat confusing compromise: the last value yielded by the generator > implicitly becomes the result returned by the call. Thus, to return > something, in general, requires the idiom "yield Foo; return".
If valued > returns are allowed, this would become "return Foo" (and the code implementing > defgen itself would probably end up simpler, as well). Hey, that would be nice. I've found people confused by the way defgen handles return values before, getting seemingly meaningless values out of their defgens (if the defgen didn't specifically yield some meaningful value at the end). At first I thought "return foo" in a generator ought to be equivalent to "yield foo; return", but at least for defgen, it turns out raising StopIteration(foo) would be better, as I would have a very explicit way to specify and find the return value of the generator. -- Twisted | Christopher Armstrong: International Man of Twistery Radix | -- http://radix.twistedmatrix.com | Release Manager, Twisted Project \\\V/// | -- http://twistedmatrix.com |o O| | w----v----w-+ From jepler at unpythonic.net Tue Oct 4 04:25:48 2005 From: jepler at unpythonic.net (jepler@unpythonic.net) Date: Mon, 3 Oct 2005 21:25:48 -0500 Subject: [Python-Dev] Unicode charmap decoders slow In-Reply-To: References: Message-ID: <20051004022548.GC7081@unpythonic.net> As the OP suggests, decoding with a codec like mac-roman or iso8859-1 is very slow compared to encoding or decoding with utf-8. Here I'm working with 53k of data instead of 53 megs. 
(Note: this is a laptop, so it's possible that thermal or battery management features affected these numbers a bit, but by a factor of 3 at most)

    $ timeit.py -s "s='a'*53*1024; u=unicode(s)" "u.encode('utf-8')"
    1000 loops, best of 3: 591 usec per loop
    $ timeit.py -s "s='a'*53*1024; u=unicode(s)" "s.decode('utf-8')"
    1000 loops, best of 3: 1.25 msec per loop
    $ timeit.py -s "s='a'*53*1024; u=unicode(s)" "s.decode('mac-roman')"
    100 loops, best of 3: 13.5 msec per loop
    $ timeit.py -s "s='a'*53*1024; u=unicode(s)" "s.decode('iso8859-1')"
    100 loops, best of 3: 13.6 msec per loop

With utf-8 encoding as the baseline, we have

    decode('utf-8')        2.1x as long
    decode('mac-roman')   22.8x as long
    decode('iso8859-1')   23.0x as long

Perhaps this is an area that is ripe for optimization. Jeff From skip at pobox.com Mon Oct 3 23:45:44 2005 From: skip at pobox.com (skip@pobox.com) Date: Mon, 3 Oct 2005 16:45:44 -0500 Subject: [Python-Dev] unifying str and unicode In-Reply-To: <1128371900.6138.299.camel@fsol> References: <1128346015.6138.149.camel@fsol> <20051003091416.9817.JCARLSON@uci.edu> <1128361197.6138.212.camel@fsol> <1128368242.6138.258.camel@fsol> <1128371900.6138.299.camel@fsol> Message-ID: <17217.42632.726476.530117@montanaro.dyndns.org> Antoine> If an stdlib function returns an 8-bit string containing Antoine> non-ascii data, then this string used in unicode context incurs Antoine> an implicit conversion, which fails. Such strings should be converted to Unicode at the point where they enter the application. That's likely the only place where you have a good chance of knowing the data encoding. Files generally have no encoding information associated with them. Some databases don't handle Unicode transparently.
If you hang onto the input from such devices as plain strings until you need them as Unicode, you will almost certainly not know how the string was encoded. The state of the outside Unicode world being as miserable as it is (think web input forms), you often don't know the encoding at the interface and have to guess anyway. Even so, isolating that guesswork to the interface is better than recovering somewhere further downstream. Skip From foom at fuhm.net Tue Oct 4 05:44:13 2005 From: foom at fuhm.net (James Y Knight) Date: Mon, 3 Oct 2005 23:44:13 -0400 Subject: [Python-Dev] unifying str and unicode In-Reply-To: References: <1128346015.6138.149.camel@fsol><20051003091416.9817.JCARLSON@uci.edu> <1128361197.6138.212.camel@fsol> <1128368242.6138.258.camel@fsol> Message-ID: <0F192CD5-00EE-466B-A1D6-528F394754D1@fuhm.net> On Oct 3, 2005, at 3:47 PM, Fredrik Lundh wrote: > Antoine Pitrou wrote: > > >>>> If I have an unicode string containing legal characters greater >>>> than >>>> 0x7F, and I pass it to a function which converts it to str, the >>>> conversion fails. >>>> >>> >>> so? if it does that, it's not unicode safe. >>> >> [...] >> >>> what's that has to do with >>> my argument (which is that you can safely mix ascii strings and >>> unicode >>> strings, because that's how things were designed). >>> >> >> If that's how things were designed, then Python's entire standard >> library (not to mention third-party libraries) is not "unicode safe" - >> to quote your own words - since many functions may return 8-bit >> strings >> containing non-ascii characters. >> > > huh? first you talk about functions that convert unicode strings > to 8-bit > strings, now you talk about functions that return raw 8-bit > strings? and > all this in response to a post that argues that it's in fact a good > idea to > use plain strings to hold textual data that happens to contain > ASCII only, > because 1) it works, by design, and 2) it's almost always more > efficient.
> if you don't know what your own argument is, you cannot expect anyone
> to understand it.

Your point would be much easier to stomach if the "str" type could *only* hold 7-bit ASCII. Perhaps that can be done when Python gets an actual bytes type in 3.0. There indeed are a multitude of uses for the efficient storage/processing of ASCII-only data. However, currently, there are problems because it's so easy to screw yourself without noticing when mixing unicode and str objects. If, on the other hand, you have a 7-bit ascii string type, and a 16/32-bit unicode string type, both can be used interchangeably and there is no possibility for any en/de-coding issues. And asciiOnlyStringType.encode('utf-8') can become _ultra_ efficient, as a bonus. :) Seems win-win to me.

James

From walter at livinglogic.de Tue Oct 4 09:37:29 2005
From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=)
Date: Tue, 4 Oct 2005 09:37:29 +0200
Subject: [Python-Dev] Unicode charmap decoders slow
In-Reply-To: <20051004022548.GC7081@unpythonic.net>
References: <20051004022548.GC7081@unpythonic.net>
Message-ID: <167C9F31-31A9-4679-94A3-097191786241@livinglogic.de>

On 04.10.2005, at 04:25, jepler at unpythonic.net wrote:
> As the OP suggests, decoding with a codec like mac-roman or iso8859-1
> is very slow compared to encoding or decoding with utf-8. Here I'm
> working with 53k of data instead of 53 megs.
> (Note: this is a laptop, so it's possible that thermal or battery
> management features affected these numbers a bit, but by a factor of 3
> at most)
>
> $ timeit.py -s "s='a'*53*1024; u=unicode(s)" "u.encode('utf-8')"
> 1000 loops, best of 3: 591 usec per loop
> $ timeit.py -s "s='a'*53*1024; u=unicode(s)" "s.decode('utf-8')"
> 1000 loops, best of 3: 1.25 msec per loop
> $ timeit.py -s "s='a'*53*1024; u=unicode(s)" "s.decode('mac-roman')"
> 100 loops, best of 3: 13.5 msec per loop
> $ timeit.py -s "s='a'*53*1024; u=unicode(s)" "s.decode('iso8859-1')"
> 100 loops, best of 3: 13.6 msec per loop
>
> With utf-8 encoding as the baseline, we have
>     decode('utf-8')      2.1x as long
>     decode('mac-roman') 22.8x as long
>     decode('iso8859-1') 23.0x as long
>
> Perhaps this is an area that is ripe for optimization.

For charmap decoding we might be able to use an array (e.g. a tuple (or an array.array?) of codepoints) instead of a dictionary.

Or we could implement this array as a C array (i.e. gencodec.py would generate C code).

Bye,
Walter Dörwald

From fredrik at pythonware.com Tue Oct 4 10:33:15 2005
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Tue, 4 Oct 2005 10:33:15 +0200
Subject: [Python-Dev] unifying str and unicode
References: <1128346015.6138.149.camel@fsol> <20051003091416.9817.JCARLSON@uci.edu> <1128361197.6138.212.camel@fsol> <1128368242.6138.258.camel@fsol> <0F192CD5-00EE-466B-A1D6-528F394754D1@fuhm.net>
Message-ID:

James Y Knight wrote:
> Your point would be much easier to stomach if the "str" type could
> *only* hold 7-bit ASCII.

why? strings are not mutable, so it's not like an ASCII string will suddenly sprout non-ASCII characters. what ends up in a string is defined by the string source. if you cannot trust the source, your programs will never work.
after all, there's nothing in Python that keeps things like:

    s = file.readline().decode("iso-8859-1")
    s = elem.findtext("node")
    s = device.read_encoded_data()

from returning integers instead of strings, or returning socket objects on odd fridays. but if the interface spec says that they always return strings that adhere to python's text model (=unicode or things that can be mixed with unicode), you can trust them as much as you can trust anything else in Python.

(this is of course also why we talk about file-like objects in Python, and sequences, and iterators and iterables, and stuff like that. it's not type(obj) that's important, it's what you can do with obj and how it behaves when you do it)

From mwh at python.net Tue Oct 4 10:50:16 2005
From: mwh at python.net (Michael Hudson)
Date: Tue, 04 Oct 2005 09:50:16 +0100
Subject: [Python-Dev] PEP 343 and __with__
In-Reply-To: <5.1.1.6.0.20051003151012.01f95ca0@mail.telecommunity.com> (Phillip J. Eby's message of "Mon, 03 Oct 2005 15:20:34 -0400")
References: <5.1.1.6.0.20051003125254.01f95e50@mail.telecommunity.com> <5.1.1.6.0.20051003125254.01f95e50@mail.telecommunity.com> <5.1.1.6.0.20051003151012.01f95ca0@mail.telecommunity.com>
Message-ID: <2mu0fxekdz.fsf@starship.python.net>

"Phillip J. Eby" writes:
> At 07:02 PM 10/3/2005 +0100, Michael Hudson wrote:
>> "Phillip J. Eby" writes:
>>> Since the PEP is accepted and has patches for both its implementation
>>> and a good part of its documentation, a major change like this would
>>> certainly need a better rationale.
>>
>> Though given the amount of interest said patch has attracted (none at
>> all)
>
> Actually, I have been reading the patch and meant to comment on it.

Oh, good.

> I was perplexed by the odd stack behavior of the new opcode until I
> realized that it's try/finally that's weird. :)

:)

> I was planning to look into whether that could be cleaned up as well,
> when I got distracted and didn't go back to it.

I see.
I don't know whether trying to clean up the stack protocol around exceptions is worth the amount of pain it causes in the head (anyone still thinking about removing the block stack?).

>> perhaps no one cares very much and the proposal should be dropped.
>
> I care an awful lot, as 'with' is another framework-dissolving tool that
> makes it possible to do more things in library form, without needing to
> resort to template methods. It also enables more context-sensitive
> programming, in that "global" states can be set and restored in a
> structured fashion. It may take a while to feel the effects, but it's
> going to be a big improvement to Python, maybe as big as new-style
> classes, and certainly bigger than decorators.

I think 'as big as new-style classes' is probably an exaggeration, but I'm glad my troll caught a few people :)

Cheers,
mwh

--
Those who have deviant punctuation desires should take care of their own perverted needs. -- Erik Naggum, comp.lang.lisp

From ncoghlan at gmail.com Tue Oct 4 10:59:44 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 04 Oct 2005 18:59:44 +1000
Subject: [Python-Dev] PEP 343 and __with__
In-Reply-To: <2mu0fxekdz.fsf@starship.python.net>
References: <5.1.1.6.0.20051003125254.01f95e50@mail.telecommunity.com> <5.1.1.6.0.20051003125254.01f95e50@mail.telecommunity.com> <5.1.1.6.0.20051003151012.01f95ca0@mail.telecommunity.com> <2mu0fxekdz.fsf@starship.python.net>
Message-ID: <43424480.9080900@gmail.com>

Michael Hudson wrote:
> I think 'as big as new-style classes' is probably an exaggeration, but
> I'm glad my troll caught a few people :)

I was planning on looking at your patch too, but I was waiting for an answer from Guido about the fate of the ast-branch for Python 2.5. Given that we have patches for PEP 342 and PEP 343 against the trunk, but ast-branch still isn't even passing the Python 2.4 test suite, I'm wondering if it should be bumped from the feature list again.

Cheers,
Nick.
--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia
---------------------------------------------------------------
http://boredomandlaziness.blogspot.com

From ncoghlan at gmail.com Tue Oct 4 12:21:43 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 04 Oct 2005 20:21:43 +1000
Subject: [Python-Dev] PEP 343 and __with__
In-Reply-To:
References: <5.1.1.6.0.20051003125254.01f95e50@mail.telecommunity.com>
Message-ID: <434257B7.9000909@gmail.com>

Jason Orendorff wrote:
> Phillip J. Eby writes:
>> You didn't offer any reasons why this would be useful and/or good.
>
> It makes it dramatically easier to write Python classes that correctly
> support 'with'. I don't see any simple way to do this under PEP 343;
> the only sane thing to do is write a separate @contextmanager
> generator, as all of the examples do.

Hmm, it's kind of like the iterable/iterator distinction. Being able to do:

    class Whatever(object):
        def __iter__(self):
            for item in self.stuff:
                yield item

is a very handy way of defining "this is how you iterate over this class". The only cost is that actual iterators then need to define an __iter__ method that returns 'self' (which isn't much of a cost, and is trivial to do even for iterators written in C).

If there was a __with__ slot, then we could consider that as identifying a "manageable context", with three methods to identify an actual context manager:

    __with__ that returns self
    __enter__
    __exit__

Then the explanation of what a with statement does would simply look like:

    abc = EXPR.__with__()  # This is the only change
    exc = (None, None, None)
    VAR = abc.__enter__()
    try:
        try:
            BLOCK
        except:
            exc = sys.exc_info()
            raise
    finally:
        abc.__exit__(*exc)

And the context management for decimal.Context would look like:

    class Context:
        ...
        @contextmanager
        def __with__(self):
            old = decimal.getcontext()
            new = self.copy()  # Make this nesting and thread safe
            decimal.setcontext(new)
            try:
                yield new
            finally:
                decimal.setcontext(old)

And for threading.Lock would look like:

    class Lock:
        ...
        def __with__(self):
            return self
        def __enter__(self):
            self.acquire()
            return self
        def __exit__(self):
            self.release()

Also, any class could make an existing independent context manager (such as 'closing') its native context manager as follows:

    class SomethingCloseable:
        ...
        def __with__(self):
            return closing(self)

> As for the second proposal, I was thinking we'd have one mental model
> for context managers (block template generators), rather than two
> (generators vs. enter/exit methods). Enter/exit seemed superfluous,
> given the examples in the PEP.

Try to explain the semantics of the with statement without referring to the __enter__ and __exit__ methods, and then see if you still think they're superfluous ;)

The @contextmanager generator decorator is just syntactic sugar for writing duck-typed context managers - the semantics of the with statement itself can only be explained in terms of the __enter__ and __exit__ methods. Indeed, explaining how the @contextmanager decorator itself works requires recourse to the __enter__ and __exit__ methods of the actual context manager object the decorator produces.

However, I think the idea of having a distinction between manageable contexts and context managers similar to the distinction between iterables and iterators is one well worth considering.

Cheers,
Nick.
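[The __enter__/__exit__ protocol and the @contextmanager decorator discussed above both shipped with PEP 343 and can be exercised in any modern Python. A minimal sketch - the Tracked class and tracked_gen generator are illustrative names, not code from this thread:]

```python
from contextlib import contextmanager

class Tracked:
    """A tiny context manager using the __enter__/__exit__ protocol."""
    def __init__(self):
        self.events = []

    def __enter__(self):
        self.events.append("enter")
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        self.events.append("exit")
        return False  # do not suppress exceptions

@contextmanager
def tracked_gen(log):
    # The generator form: code before the yield plays the role of
    # __enter__, code after it the role of __exit__.
    log.append("enter")
    try:
        yield log
    finally:
        log.append("exit")

t = Tracked()
with t:
    t.events.append("body")

log = []
with tracked_gen(log):
    log.append("body")
```

Both spellings produce the same enter/body/exit sequence, which is why the thread treats the decorator as sugar over the two methods.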
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.blogspot.com From guido at python.org Tue Oct 4 16:31:55 2005 From: guido at python.org (Guido van Rossum) Date: Tue, 4 Oct 2005 07:31:55 -0700 Subject: [Python-Dev] PEP 343 and __with__ In-Reply-To: <43424480.9080900@gmail.com> References: <5.1.1.6.0.20051003125254.01f95e50@mail.telecommunity.com> <5.1.1.6.0.20051003151012.01f95ca0@mail.telecommunity.com> <2mu0fxekdz.fsf@starship.python.net> <43424480.9080900@gmail.com> Message-ID: On 10/4/05, Nick Coghlan wrote: > I was planning on looking at your patch too, but I was waiting for an answer > from Guido about the fate of the ast-branch for Python 2.5. Given that we have > patches for PEP 342 and PEP 343 against the trunk, but ast-branch still isn't > even passing the Python 2.4 test suite, I'm wondering if it should be bumped > from the feature list again. What do you want me to say about the AST branch? It's not my branch, I haven't even checked it out, I'm just patiently waiting for the folks who started it to finally finish it. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From jason.orendorff at gmail.com Tue Oct 4 16:38:49 2005 From: jason.orendorff at gmail.com (Jason Orendorff) Date: Tue, 4 Oct 2005 10:38:49 -0400 Subject: [Python-Dev] PEP 343 and __with__ In-Reply-To: <434257B7.9000909@gmail.com> References: <5.1.1.6.0.20051003125254.01f95e50@mail.telecommunity.com> <434257B7.9000909@gmail.com> Message-ID: The argument I am going to try to make is that Python coroutines need a more usable API. > Try to explain the semantics of the with statement without referring to the > __enter__ and __exit__ methods, and then see if you still think they're > superfluous ;) > > The @contextmanager generator decorator is just syntactic sugar [...] 
> [T]he semantics of the with statement itself can only be explained in
> terms of the __enter__ and __exit__ methods.

That's not true. It can certainly use the coroutine API instead. Now... as specified in PEP 342, the coroutine API can be used to implement 'with', but it's ugly. I think this is a problem with the coroutine API, not the idea of using coroutines per se. Actually I think 'with' is a pretty tame use case for coroutines. Other Python objects (dicts, lists, strings) have convenience methods that are strictly redundant but make them much easier to use. Coroutines should, too.

This:

    with EXPR as VAR:
        BLOCK

expands to this under PEP 342:

    _cm = contextmanager(EXPR)
    VAR = _cm.next()
    try:
        BLOCK
    except:
        try:
            _cm.throw(*sys.exc_info())
        except:
            pass
        raise
    finally:
        try:
            _cm.next()
        except StopIteration:
            pass
        except:
            raise
        else:
            raise RuntimeError

Blah. But it could look like this:

    _cm = (EXPR).__with__()
    VAR = _cm.start()
    try:
        BLOCK
    except:
        _cm.throw(*excinfo)
    else:
        _cm.finish()

I think that looks quite nice. Here is the proposed specification for start() and finish():

    class coroutine:  # pseudocode
        ...
        def start(self):
            """Convenience method -- exactly like next(), but assert
            that this coroutine hasn't already been started.
            """
            if self.__started:
                raise ValueError  # or whatever
            return self.next()

        def finish(self):
            """Convenience method -- like next(), but expect the
            coroutine to complete without yielding again.
            """
            try:
                self.next()
            except (StopIteration, GeneratorExit):
                pass
            else:
                raise RuntimeError("coroutine didn't finish")

Why is this good?

- Makes coroutines more usable for everyone, not just for implementing 'with'.
- For example, if you want to feed values to a coroutine, call start() first and then send() repeatedly. Quite sensible.
- Single mental model for 'with' (always uses a coroutine or lookalike object).
- No need for "contextmanager" wrapper.
- Harder to implement a context manager object incorrectly (it's quite easy to screw up with __begin__ and __end__).

-j

From jason.orendorff at gmail.com Tue Oct 4 16:51:11 2005
From: jason.orendorff at gmail.com (Jason Orendorff)
Date: Tue, 4 Oct 2005 10:51:11 -0400
Subject: [Python-Dev] PEP 343 and __with__
In-Reply-To:
References: <5.1.1.6.0.20051003125254.01f95e50@mail.telecommunity.com> <434257B7.9000909@gmail.com>
Message-ID:

Right after I sent the preceding message I got a funny feeling I'm wasting everybody's time here. I apologize. Guido's original concern about speedy C implementation for locks stands. I don't see a good way around it.

By the way, my expansion of 'with' using coroutines (in previous message) was incorrect. The corrected version is shorter; see below.

-j

This:

    with EXPR as VAR:
        BLOCK

would expand to this under PEP 342 and my proposal:

    _cm = (EXPR).__with__()
    VAR = _cm.next()
    try:
        BLOCK
    except:
        _cm.throw(*sys.exc_info())
    finally:
        try:
            _cm.next()
        except (StopIteration, GeneratorExit):
            pass
        else:
            raise RuntimeError("coroutine didn't finish")

From guido at python.org Tue Oct 4 16:54:15 2005
From: guido at python.org (Guido van Rossum)
Date: Tue, 4 Oct 2005 07:54:15 -0700
Subject: [Python-Dev] PEP 343 and __with__
In-Reply-To:
References: <5.1.1.6.0.20051003125254.01f95e50@mail.telecommunity.com> <434257B7.9000909@gmail.com>
Message-ID:

On 10/4/05, Jason Orendorff wrote:
> This:
>
>     with EXPR as VAR:
>         BLOCK
>
> expands to this under PEP 342:
>
>     _cm = contextmanager(EXPR)
>     VAR = _cm.next()
>     try:
>         BLOCK
>     except:
>         try:
>             _cm.throw(*sys.exc_info())
>         except:
>             pass
>         raise
>     finally:
>         try:
>             _cm.next()
>         except StopIteration:
>             pass
>         except:
>             raise
>         else:
>             raise RuntimeError

Where in the world do you get this idea?
The translation is as follows, according to PEP 343:

    abc = EXPR
    exc = (None, None, None)
    VAR = abc.__enter__()
    try:
        try:
            BLOCK
        except:
            exc = sys.exc_info()
            raise
    finally:
        abc.__exit__(*exc)

PEP 342 doesn't touch on the expansion of with-statements at all.

I think I know where you're coming from, but please do us a favor and don't misrepresent the PEPs. If anything, your proposal is more complicated; it requires four new APIs instead of two, and requires an extra call to set up (__with__() followed by start()).

Proposals like yours (and every other permutation) were brought up during the initial discussion. We picked one. Don't create more churn by arguing for a different variant. Spend your efforts on implementing it so you can actually use it and see how bad it is (I predict it won't be bad at all).

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org Tue Oct 4 16:56:32 2005
From: guido at python.org (Guido van Rossum)
Date: Tue, 4 Oct 2005 07:56:32 -0700
Subject: [Python-Dev] PEP 343 and __with__
In-Reply-To:
References: <5.1.1.6.0.20051003125254.01f95e50@mail.telecommunity.com> <434257B7.9000909@gmail.com>
Message-ID:

On 10/4/05, Jason Orendorff wrote:
> Right after I sent the preceding message I got a funny feeling I'm
> wasting everybody's time here. I apologize. Guido's original concern
> about speedy C implementation for locks stands. I don't see a good
> way around it.

OK. Our messages crossed, so you can ignore my response. Let's spend our time implementing the PEPs as they stand, then see what else we can do with the new APIs.
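[The PEP 343 translation quoted in this exchange can be checked mechanically against the real with statement in any modern Python. A sketch - the Recorder class is an illustrative name, not from the thread:]

```python
import sys

class Recorder:
    """Context manager that records protocol calls, so the 'with'
    statement can be compared against the manual PEP 343 expansion."""
    def __init__(self):
        self.calls = []

    def __enter__(self):
        self.calls.append("enter")
        return "value"

    def __exit__(self, *exc):
        self.calls.append(("exit", exc[0]))
        return False

# The real statement:
with_mgr = Recorder()
with with_mgr as v:
    pass

# The manual expansion from the message above:
abc = Recorder()
exc = (None, None, None)
VAR = abc.__enter__()
try:
    try:
        pass  # BLOCK
    except:
        exc = sys.exc_info()
        raise
finally:
    abc.__exit__(*exc)
```

Both paths record the same ("enter", then ("exit", None)) call sequence and bind the same value.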
--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From martin at v.loewis.de Tue Oct 4 21:50:04 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 04 Oct 2005 21:50:04 +0200
Subject: [Python-Dev] Unicode charmap decoders slow
In-Reply-To: <167C9F31-31A9-4679-94A3-097191786241@livinglogic.de>
References: <20051004022548.GC7081@unpythonic.net> <167C9F31-31A9-4679-94A3-097191786241@livinglogic.de>
Message-ID: <4342DCEC.5020204@v.loewis.de>

Walter Dörwald wrote:
> For charmap decoding we might be able to use an array (e.g. a tuple
> (or an array.array?) of codepoints instead of dictionary.

This array would have to be sparse, of course. Using an array.array would be more efficient, I guess - but we would need a C API for arrays (to validate the type code, and to get ob_item).

> Or we could implement this array as a C array (i.e. gencodec.py would
> generate C code).

For decoding, we would not get any better than array.array, except for startup cost. For encoding, having a C trie might give considerable speedup. _codecs could offer an API to convert the current dictionaries into lookup-efficient structures, and the conversion would be done when importing the codec.

For the trie, two levels (higher and lower byte) would probably be sufficient: I believe most encodings only use 2 "rows" (256 code point blocks), very few more than three.

Regards,
Martin

From mal at egenix.com Tue Oct 4 22:29:36 2005
From: mal at egenix.com (M.-A.
Lemburg)
Date: Tue, 04 Oct 2005 22:29:36 +0200
Subject: [Python-Dev] Unicode charmap decoders slow
In-Reply-To: <167C9F31-31A9-4679-94A3-097191786241@livinglogic.de>
References: <20051004022548.GC7081@unpythonic.net> <167C9F31-31A9-4679-94A3-097191786241@livinglogic.de>
Message-ID: <4342E630.5060801@egenix.com>

Walter Dörwald wrote:
> On 04.10.2005, at 04:25, jepler at unpythonic.net wrote:
>> As the OP suggests, decoding with a codec like mac-roman or iso8859-1
>> is very slow compared to encoding or decoding with utf-8. Here I'm
>> working with 53k of data instead of 53 megs. (Note: this is a laptop,
>> so it's possible that thermal or battery management features affected
>> these numbers a bit, but by a factor of 3 at most)
>>
>> $ timeit.py -s "s='a'*53*1024; u=unicode(s)" "u.encode('utf-8')"
>> 1000 loops, best of 3: 591 usec per loop
>> $ timeit.py -s "s='a'*53*1024; u=unicode(s)" "s.decode('utf-8')"
>> 1000 loops, best of 3: 1.25 msec per loop
>> $ timeit.py -s "s='a'*53*1024; u=unicode(s)" "s.decode('mac-roman')"
>> 100 loops, best of 3: 13.5 msec per loop
>> $ timeit.py -s "s='a'*53*1024; u=unicode(s)" "s.decode('iso8859-1')"
>> 100 loops, best of 3: 13.6 msec per loop
>>
>> With utf-8 encoding as the baseline, we have
>>     decode('utf-8')      2.1x as long
>>     decode('mac-roman') 22.8x as long
>>     decode('iso8859-1') 23.0x as long
>>
>> Perhaps this is an area that is ripe for optimization.
>
> For charmap decoding we might be able to use an array (e.g. a tuple
> (or an array.array?) of codepoints instead of dictionary.
>
> Or we could implement this array as a C array (i.e. gencodec.py would
> generate C code).

That would be a possibility, yes. Note that the charmap codec was meant as faster replacement for the old string transpose function. Dictionaries are used for the mapping to avoid having to store huge (largely empty) mapping tables - it's a memory-speed tradeoff.
Of course, a C version could use the same approach as the unicodedatabase module: that of compressed lookup tables... http://aggregate.org/TechPub/lcpc2002.pdf

genccodec.py anyone ?

--
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source (#1, Oct 04 2005)
>>> Python/Zope Consulting and Support ... http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/
________________________________________________________________________
::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! ::::

From walter at livinglogic.de Tue Oct 4 23:48:08 2005
From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=)
Date: Tue, 4 Oct 2005 23:48:08 +0200
Subject: [Python-Dev] Unicode charmap decoders slow
In-Reply-To: <4342DCEC.5020204@v.loewis.de>
References: <20051004022548.GC7081@unpythonic.net> <167C9F31-31A9-4679-94A3-097191786241@livinglogic.de> <4342DCEC.5020204@v.loewis.de>
Message-ID: <6773C3BF-F909-404A-AEB8-EF5023D86513@livinglogic.de>

On 04.10.2005, at 21:50, Martin v. Löwis wrote:
> Walter Dörwald wrote:
>> For charmap decoding we might be able to use an array (e.g. a tuple
>> (or an array.array?) of codepoints instead of dictionary.
>
> This array would have to be sparse, of course.

For encoding yes, for decoding no.

> Using an array.array would be more efficient, I guess - but we would
> need a C API for arrays (to validate the type code, and to get
> ob_item).

For decoding it should be sufficient to use a unicode string of length 256. u"\ufffd" could be used for "maps to undefined". Or the string might be shorter and byte values greater than the length of the string are treated as "maps to undefined" too.

>> Or we could implement this array as a C array (i.e. gencodec.py would
>> generate C code).
>
> For decoding, we would not get any better than array.array, except for
> startup cost.

Yes.
> For encoding, having a C trie might give considerable speedup. _codecs
> could offer an API to convert the current dictionaries into
> lookup-efficient structures, and the conversion would be done when
> importing the codec.
>
> For the trie, two levels (higher and lower byte) would probably be
> sufficient: I believe most encodings only use 2 "rows" (256 code point
> blocks), very few more than three.

This might work, although nobody has complained about charmap encoding yet. Another option would be to generate a big switch statement in C and let the compiler decide about the best data structure.

Bye,
Walter Dörwald

From marvinpublic at comcast.net Wed Oct 5 00:05:20 2005
From: marvinpublic at comcast.net (Marvin)
Date: Tue, 04 Oct 2005 18:05:20 -0400
Subject: [Python-Dev] Static builds on Windows (continued)
Message-ID: <4342FCA0.8020409@comcast.net>

Earlier references: http://mail.python.org/pipermail/python-dev/2004-July/046499.html

I want to be able to create a version of python24.lib that is a static library, suitable for creating a python.exe or other .exe using python's api. So I did as the earlier poster suggested, using 2.4.1 sources. I modified the PCBuild/pythoncore and python .vcproj files as follows:

    General/ ConfigurationType/ Static library (was dynamic in pythoncore)
    c/C++ Code Generation RT Library /MT (was /MTD for mt DLL)
    c/c++/Precompiled/ Not Using Precompiled headers (based on some MSDN hints)
    librarian OutputFile .//python24.lib
    Preprocessor: added Py_NO_ENABLED_SHARED. Removed USE_DL_IMPORT

I built pythoncore and python. The resulting python.exe worked fine, but did indeed fail when I tried to dynamically load anything (Dialog said: the application terminated abnormally)

Now I am not very clueful about the dllimport/dllexport business. But it seems that I should be able to link MY program against a .lib somehow (a real lib), and let the .EXE export the symbols somehow.
My first guess is to try to use /MTD, use Py_NO_ENABLE_SHARED when building python24.lib, but then use PY_ENABLE_SHARED when compiling the python.c. I'll try that later, but anyone have more insight into the right way to do this?

marvin

From martin at v.loewis.de Wed Oct 5 00:08:45 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 05 Oct 2005 00:08:45 +0200
Subject: [Python-Dev] Unicode charmap decoders slow
In-Reply-To: <6773C3BF-F909-404A-AEB8-EF5023D86513@livinglogic.de>
References: <20051004022548.GC7081@unpythonic.net> <167C9F31-31A9-4679-94A3-097191786241@livinglogic.de> <4342DCEC.5020204@v.loewis.de> <6773C3BF-F909-404A-AEB8-EF5023D86513@livinglogic.de>
Message-ID: <4342FD6D.7000308@v.loewis.de>

Walter Dörwald wrote:
>> This array would have to be sparse, of course.
>
> For encoding yes, for decoding no.
[...]
> For decoding it should be sufficient to use a unicode string of length
> 256. u"\ufffd" could be used for "maps to undefined". Or the string
> might be shorter and byte values greater than the length of the string
> are treated as "maps to undefined" too.

Right. That's what I meant with "sparse": you somehow need to represent "no value".

> This might work, although nobody has complained about charmap encoding
> yet. Another option would be to generate a big switch statement in C
> and let the compiler decide about the best data structure.

I would try to avoid generating C code at all costs. Maintaining the build processes will just be a nightmare.

Regards,
Martin

From martin at v.loewis.de Wed Oct 5 00:21:20 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 05 Oct 2005 00:21:20 +0200
Subject: [Python-Dev] Static builds on Windows (continued)
In-Reply-To: <4342FCA0.8020409@comcast.net>
References: <4342FCA0.8020409@comcast.net>
Message-ID: <43430060.6070909@v.loewis.de>

Marvin wrote:
> I built pythoncore and python.
> The resulting python.exe worked fine, but did indeed fail when I tried
> to dynamically load anything (Dialog said: the application terminated
> abnormally)

Not sure what you are trying to do here. In your case, dynamic loading simply cannot work. The extension modules all link with python24.dll, which you don't have. It may find some python24.dll, which then gives conflicts with the Python interpreter that is already running.

So what you really should do is disable dynamic loading entirely. To do so, remove dynload_win from your project, and #undef HAVE_DYNAMIC_LOADING in PC/pyconfig.h. Not sure if anybody has recently tested whether this configuration actually works - if you find that it doesn't, please post your patches to sf.net/projects/python.

If you really want to provide dynamic loading of some kind, you should arrange the extension modules to import the symbols from your .exe. Linking the exe should generate an import library, and you should link the extensions against that.

HTH,
Martin

From tonynelson at georgeanelson.com Tue Oct 4 18:44:16 2005
From: tonynelson at georgeanelson.com (Tony Nelson)
Date: Tue, 4 Oct 2005 12:44:16 -0400
Subject: [Python-Dev] Unicode charmap decoders slow
In-Reply-To: <167C9F31-31A9-4679-94A3-097191786241@livinglogic.de>
References: <20051004022548.GC7081@unpythonic.net> <20051004022548.GC7081@unpythonic.net>
Message-ID:

At 9:37 AM +0200 10/4/05, Walter Dörwald wrote:
>On 04.10.2005, at 04:25, jepler at unpythonic.net wrote:
>
>>As the OP suggests, decoding with a codec like mac-roman or iso8859-1
>>is very slow compared to encoding or decoding with utf-8. Here I'm
>>working with 53k of data instead of 53 megs.
>>(Note: this is a laptop, so it's possible that thermal or battery
>>management features affected these numbers a bit, but by a factor of 3
>>at most)
>>
>> $ timeit.py -s "s='a'*53*1024; u=unicode(s)" "u.encode('utf-8')"
>> 1000 loops, best of 3: 591 usec per loop
>> $ timeit.py -s "s='a'*53*1024; u=unicode(s)" "s.decode('utf-8')"
>> 1000 loops, best of 3: 1.25 msec per loop
>> $ timeit.py -s "s='a'*53*1024; u=unicode(s)" "s.decode('mac-roman')"
>> 100 loops, best of 3: 13.5 msec per loop
>> $ timeit.py -s "s='a'*53*1024; u=unicode(s)" "s.decode('iso8859-1')"
>> 100 loops, best of 3: 13.6 msec per loop
>>
>> With utf-8 encoding as the baseline, we have
>>     decode('utf-8')      2.1x as long
>>     decode('mac-roman') 22.8x as long
>>     decode('iso8859-1') 23.0x as long
>>
>> Perhaps this is an area that is ripe for optimization.
>
>For charmap decoding we might be able to use an array (e.g. a tuple
>(or an array.array?) of codepoints instead of dictionary.
>
>Or we could implement this array as a C array (i.e. gencodec.py would
>generate C code).

Fine -- as long as it still allows changing code points. I add the missing "Apple logo" code point to mac-roman in order to permit round-tripping (0xF0 <=> 0xF8FF, per Apple docs). (New bug #1313051.)

If an all-C implementation wouldn't permit changing codepoints, I suggest instead just /caching/ the translation in C arrays stored with the codec object. The cache would be invalidated on any write to the codec's mapping dictionary, and rebuilt the next time anything was translated. This would maintain the present semantics, work with current codecs, and still provide the desired speed improvement.

But is there really no way to say this fast in pure Python? The way a one-to-one byte mapping can be done with "".translate()?
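[One pure-Python route to the table-driven decoding discussed in this thread is codecs.charmap_decode, which accepts a 256-character string as the mapping; a sketch in modern Python - the identity table here is illustrative, equivalent to latin-1:]

```python
import codecs

# A 256-character decoding table: table[byte] is the decoded character.
table = ''.join(chr(i) for i in range(256))

# charmap_decode returns (decoded_string, bytes_consumed).
decoded, consumed = codecs.charmap_decode(b'abc\xe9', 'strict', table)
```

The generated single-byte codecs in CPython's encodings package ended up using exactly this kind of string table for decoding.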
____________________________________________________________________
TonyN.:' '

From tonynelson at georgeanelson.com Wed Oct 5 04:52:22 2005
From: tonynelson at georgeanelson.com (Tony Nelson)
Date: Tue, 4 Oct 2005 22:52:22 -0400
Subject: [Python-Dev] Unicode charmap decoders slow
In-Reply-To: <6773C3BF-F909-404A-AEB8-EF5023D86513@livinglogic.de>
References: <4342DCEC.5020204@v.loewis.de> <20051004022548.GC7081@unpythonic.net> <167C9F31-31A9-4679-94A3-097191786241@livinglogic.de> <4342DCEC.5020204@v.loewis.de>
Message-ID:

[Recipient list not trimmed, as my replies must be vetted by a moderator, which seems to delay them. :]

At 11:48 PM +0200 10/4/05, Walter Dörwald wrote:
>On 04.10.2005, at 21:50, Martin v. Löwis wrote:
>
>> Walter Dörwald wrote:
>>
>>> For charmap decoding we might be able to use an array (e.g. a tuple
>>> (or an array.array?) of codepoints instead of dictionary.
>>
>> This array would have to be sparse, of course.
>
>For encoding yes, for decoding no.
>
>> Using an array.array would be more efficient, I guess - but we would
>> need a C API for arrays (to validate the type code, and to get
>> ob_item).
>
>For decoding it should be sufficient to use a unicode string of length
>256. u"\ufffd" could be used for "maps to undefined". Or the string
>might be shorter and byte values greater than the length of the string
>are treated as "maps to undefined" too.

With Unicode using more than 64K codepoints now, it might be more forward looking to use a table of 256 32-bit values, with no need for tricky values. There is no need to add any C code to the codecs; just add some more code to the existing C function (which, if I have it right, is PyUnicode_DecodeCharmap() in unicodeobject.c).

...

>> For encoding, having a C trie might give considerable speedup. _codecs
>> could offer an API to convert the current dictionaries into
>> lookup-efficient structures, and the conversion would be done when
>> importing the codec.
>> >> For the trie, two levels (higher and lower byte) would probably be >> sufficient: I believe most encodings only use 2 "rows" (256 code >> point blocks), very few more than three. > >This might work, although nobody has complained about charmap >encoding yet. Another option would be to generate a big switch >statement in C and let the compiler decide about the best data >structure. I'm willing to complain. :) I might allow saving of my (53 MB) MBox file. (Not that editing received mail makes as much sense as searching it.) Encoding can be made fast using a simple hash table with external chaining. There are max 256 codepoints to encode, and they will normally be well distributed in their lower 8 bits. Hash on the low 8 bits (just mask), and chain to an area with 256 entries. Modest storage, normally short chains, therefore fast encoding. At 12:08 AM +0200 10/5/05, Martin v. Löwis wrote: >I would try to avoid generating C code at all costs. Maintaining the >build processes will just be a nightmare. I agree; also I don't think the generated codecs need to be changed at all. All the changes can be made to the existing C functions, by adding caching per a reply of mine that hasn't made it to the list yet. Well, OK, something needs to hook writes to the codec's dictionary, but I /think/ that only needs Python code. I say: >...I suggest instead just /caching/ the translation in C arrays stored >with the codec object. The cache would be invalidated on any write to the >codec's mapping dictionary, and rebuilt the next time anything was >translated. This would maintain the present semantics, work with current >codecs, and still provide the desired speed improvement. Note that this caching is done by new code added to the existing C functions (which, if I have it right, are in unicodeobject.c).
No architectural changes are made; no existing codecs need to be changed; everything will just work, and usually work faster, with very modest memory requirements of one 256 entry array of 32-bit Unicode values and a hash table with 256 1-byte slots and 256 chain entries, each having a 4 byte Unicode value, a byte output value, a byte chain index, and probably 2 bytes of filler, for a hash table size of 2304 bytes per codec. ____________________________________________________________________ TonyN.:' ' From martin at v.loewis.de Wed Oct 5 08:36:58 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 05 Oct 2005 08:36:58 +0200 Subject: [Python-Dev] Unicode charmap decoders slow In-Reply-To: References: <4342DCEC.5020204@v.loewis.de> <20051004022548.GC7081@unpythonic.net> <167C9F31-31A9-4679-94A3-097191786241@livinglogic.de> <4342DCEC.5020204@v.loewis.de> Message-ID: <4343748A.9050105@v.loewis.de> Tony Nelson wrote: >>For decoding it should be sufficient to use a unicode string of >>length 256. u"\ufffd" could be used for "maps to undefined". Or the >>string might be shorter and byte values greater than the length of >>the string are treated as "maps to undefined" too. > > > With Unicode using more than 64K codepoints now, it might be more forward > looking to use a table of 256 32-bit values, with no need for tricky > values. You might be missing the point. \ufffd is REPLACEMENT CHARACTER, which would indicate that the byte with that index is really unused in that encoding. > Encoding can be made fast using a simple hash table with external chaining. > There are max 256 codepoints to encode, and they will normally be well > distributed in their lower 8 bits. Hash on the low 8 bits (just mask), and > chain to an area with 256 entries. Modest storage, normally short chains, > therefore fast encoding. This is what is currently done: a hash map with 256 keys. You are complaining about the performance of that algorithm. 
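Martin's "this is what is currently done" can be made concrete: the charmap encoder already performs one dictionary (hash table) probe per character. A sketch in modern Python 3 syntax, with a hypothetical two-entry `encoding_map` standing in for a real codec's table:

```python
# The charmap encoder is essentially one dict probe per character.
encoding_map = {0x0041: 0x41, 0x00E4: 0x8A}  # hypothetical toy table

def charmap_encode(text: str) -> bytes:
    out = bytearray()
    for i, ch in enumerate(text):
        byte = encoding_map.get(ord(ch))
        if byte is None:
            raise UnicodeEncodeError("toy-charmap", text, i, i + 1,
                                     "character maps to <undefined>")
        out.append(byte)
    return bytes(out)

assert charmap_encode("A\u00e4") == b"A\x8a"
```

Python's dict is itself an open-addressed hash table, which is why Martin argues the chained scheme would buy little.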
The issue of external chaining is likely irrelevant: there likely are no collisions, even though Python uses open addressing. >>...I suggest instead just /caching/ the translation in C arrays stored >>with the codec object. The cache would be invalidated on any write to the >>codec's mapping dictionary, and rebuilt the next time anything was >>translated. This would maintain the present semantics, work with current >>codecs, and still provide the desired speed improvement. That is not implementable. You cannot catch writes to the dictionary. > Note that this caching is done by new code added to the existing C > functions (which, if I have it right, are in unicodeobject.c). No > architectural changes are made; no existing codecs need to be changed; > everything will just work Please try to implement it. You will find that you cannot. I don't see how regenerating/editing the codecs could be avoided. Regards, Martin From martin at v.loewis.de Wed Oct 5 08:47:54 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 05 Oct 2005 08:47:54 +0200 Subject: [Python-Dev] Unicode charmap decoders slow In-Reply-To: References: <20051004022548.GC7081@unpythonic.net> <20051004022548.GC7081@unpythonic.net> Message-ID: <4343771A.5040203@v.loewis.de> Tony Nelson wrote: > But is there really no way to say this fast in pure Python? The way a > one-to-one byte mapping can be done with "".translate()? Well, .translate isn't exactly pure Python. One-to-one between bytes and Unicode code points simply can't work. Just try all alternatives yourself and see if you can get any better than charmap_decode. Some would argue that charmap_decode *is* fast. 
Regards, Martin From walter at livinglogic.de Wed Oct 5 10:21:06 2005 From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=) Date: Wed, 5 Oct 2005 10:21:06 +0200 Subject: [Python-Dev] Unicode charmap decoders slow In-Reply-To: <4342FD6D.7000308@v.loewis.de> References: <20051004022548.GC7081@unpythonic.net> <167C9F31-31A9-4679-94A3-097191786241@livinglogic.de> <4342DCEC.5020204@v.loewis.de> <6773C3BF-F909-404A-AEB8-EF5023D86513@livinglogic.de> <4342FD6D.7000308@v.loewis.de> Message-ID: On 05.10.2005 at 00:08, Martin v. Löwis wrote: > Walter Dörwald wrote: > >>> This array would have to be sparse, of course. >>> >> For encoding yes, for decoding no. >> > [...] > >> For decoding it should be sufficient to use a unicode string of >> length 256. u"\ufffd" could be used for "maps to undefined". Or >> the string might be shorter and byte values greater than the >> length of the string are treated as "maps to undefined" too. > > Right. That's what I meant with "sparse": you somehow need to > represent > "no value". OK, but I don't think that we really need a sparse data structure for that. I used the following script to check that:

-----
import sys, os.path, glob, encodings

has = 0
hasnt = 0
for enc in glob.glob("%s/*.py" % os.path.dirname(encodings.__file__)):
    enc = enc.rsplit(".")[-2].rsplit("/")[-1]
    try:
        __import__("encodings.%s" % enc)
        codec = sys.modules["encodings.%s" % enc]
    except:
        pass
    else:
        if hasattr(codec, "decoding_map"):
            print codec
            for i in xrange(0, 256):
                if codec.decoding_map.get(i, None) is not None:
                    has += 1
                else:
                    hasnt += 1
print "assigned values:", has, "unassigned values:", hasnt
----

It reports that in all the charmap codecs there are 15292 assigned byte values and only 324 unassigned ones. I.e. only about 2% of the byte values map to "undefined". Storing those codepoints in the array as U+FFFD would only need 648 (or 1296 for wide builds) additional bytes. I don't think a sparse data structure could beat that.
> >> This might work, although nobody has complained about charmap >> encoding yet. Another option would be to generate a big switch >> statement in C and let the compiler decide about the best data >> structure. > I would try to avoid generating C code at all costs. Maintaining > the build processes will just be a nightmare. Sounds reasonable. Bye, Walter Dörwald From mal at egenix.com Wed Oct 5 10:39:25 2005 From: mal at egenix.com (M.-A. Lemburg) Date: Wed, 05 Oct 2005 10:39:25 +0200 Subject: [Python-Dev] Unicode charmap decoders slow In-Reply-To: <4342FD6D.7000308@v.loewis.de> References: <20051004022548.GC7081@unpythonic.net> <167C9F31-31A9-4679-94A3-097191786241@livinglogic.de> <4342DCEC.5020204@v.loewis.de> <6773C3BF-F909-404A-AEB8-EF5023D86513@livinglogic.de> <4342FD6D.7000308@v.loewis.de> Message-ID: <4343913D.3080609@egenix.com> Martin v. Löwis wrote: >>Another option would be to generate a big switch statement in C >>and let the compiler decide about the best data structure. > > I would try to avoid generating C code at all costs. Maintaining the > build processes will just be a nightmare. We could automate this using distutils; however I'm not sure whether this would then also work on Windows. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Oct 05 2005) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free !
:::: From jepler at unpythonic.net Wed Oct 5 14:54:05 2005 From: jepler at unpythonic.net (jepler@unpythonic.net) Date: Wed, 5 Oct 2005 07:54:05 -0500 Subject: [Python-Dev] Unicode charmap decoders slow In-Reply-To: <20051004022548.GC7081@unpythonic.net> References: <20051004022548.GC7081@unpythonic.net> Message-ID: <20051005125405.GA13147@unpythonic.net> The function in the module below, xlate.xlate, doesn't quite do what "".decode does. (mostly that characters that don't exist are mapped to u+fffd always, instead of having the various behaviors available to "".decode) It builds the fast decoding structure once per call, but when decoding 53kb of data that overhead is small enough to make it much faster than s.decode('mac-roman'). For smaller buffers (I tried 53 characters), s.decode is two times faster. (43us vs 21us) $ timeit.py -s "s='a'*53*1024; import xlate" "s.decode('mac-roman')" 100 loops, best of 3: 12.8 msec per loop $ timeit.py -s "s='a'*53*1024; import xlate, encodings.mac_roman" \ "xlate.xlate(s, encodings.mac_roman.decoding_map)" 1000 loops, best of 3: 573 usec per loop Jeff -------------- next part --------------

#include <Python.h>

PyObject *xlate(PyObject *s, PyObject *o)
{
    unsigned char *inbuf;
    int i, length, pos=0;
    PyObject *map, *key, *value, *ret;
    Py_UNICODE *u, *ru;

    if(!PyArg_ParseTuple(o, "s#O", (char*)&inbuf, &length, &map))
        return NULL;
    if(!PyDict_Check(map)) {
        PyErr_SetString(PyExc_TypeError, "Argument 2 must be a dictionary");
        return NULL;
    }
    u = PyMem_Malloc(sizeof(Py_UNICODE) * 256);
    if(!u) {
        return NULL;
    }
    for(i=0; i<256; i++) {
        u[i] = 0xfffd;
    }
    while(PyDict_Next(map, &pos, &key, &value)) {
        int ki, vi;
        if(!PyInt_Check(key)) {
            PyErr_SetString(PyExc_TypeError, "Dictionary keys must be ints");
            return NULL;
        }
        ki = PyInt_AsLong(key);
        if(ki < 0 || ki > 255) {
            PyErr_Format(PyExc_TypeError,
                "Dictionary keys must be in the range 0..255 (saw %d)", ki);
            return NULL;
        }
        if(value == Py_None)
            continue;
        if(!PyInt_Check(value)) {
            PyErr_SetString(PyExc_TypeError,
                "Dictionary values must be ints or None");
            return NULL;
        }
        vi = PyInt_AsLong(value);
        u[ki] = vi;
    }
    ret = PyUnicode_FromUnicode(NULL, length);
    if(!ret) {
        PyMem_Free(u);
        return NULL;
    }
    ru = PyUnicode_AsUnicode(ret);
    for(i=0; i<length; i++)
        ru[i] = u[inbuf[i]];
    PyMem_Free(u);
    return ret;
}

static PyMethodDef xlate_methods[] = {
    {"xlate", (PyCFunction)xlate, METH_VARARGS},
    {NULL, NULL}
};

void initxlate(void)
{
    Py_InitModule("xlate", xlate_methods);
}

From: ncoghlan at gmail.com (Nick Coghlan) Subject: [Python-Dev] Python 2.5 and ast-branch In-Reply-To: References: <5.1.1.6.0.20051003125254.01f95e50@mail.telecommunity.com> <5.1.1.6.0.20051003151012.01f95ca0@mail.telecommunity.com> <2mu0fxekdz.fsf@starship.python.net> <43424480.9080900@gmail.com> Message-ID: <4343D532.2030202@gmail.com> Guido van Rossum wrote: > On 10/4/05, Nick Coghlan wrote: > >>I was planning on looking at your patch too, but I was waiting for an answer >>from Guido about the fate of the ast-branch for Python 2.5. Given that we have >>patches for PEP 342 and PEP 343 against the trunk, but ast-branch still isn't >>even passing the Python 2.4 test suite, I'm wondering if it should be bumped >>from the feature list again. > > > What do you want me to say about the AST branch? It's not my branch, I > haven't even checked it out, I'm just patiently waiting for the folks > who started it to finally finish it. It was a question I asked a few weeks back [1] that didn't get any response (even from Brett!), to do with the fact that for Python 2.4 there was a deadline for landing the ast-branch that was a month or two in advance of the deadline for 2.4a1. I thought you'd set that deadline, but now that I look for it, I can't actually find any evidence of that. The only thing I can find is Jeremy's email saying it wasn't ready in time [2] (Jeremy's concern about reference leaks in ast-branch when it encounters compile errors is one I share, btw). Anyway, the question is: What do we want to do with ast-branch? Finish bringing it up to Python 2.4 equivalence, make it the HEAD, and only then implement the approved PEP's (308, 342, 343) that affect the compiler? Or implement the approved PEP's on the HEAD, and move the goalposts for ast-branch to include those features as well?
I believe the latter is the safe option in terms of making sure 2.5 is a solid release, but doing it that way suggests to me that the ast compiler would need to be held over until 2.6, which would be somewhat unfortunate. Given that I don't particularly like that answer, I'd love for someone to convince me I'm wrong ;) Cheers, Nick. [1] http://mail.python.org/pipermail/python-dev/2005-September/056449.html [2] http://mail.python.org/pipermail/python-dev/2004-June/045121.html -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.blogspot.com From guido at python.org Wed Oct 5 16:52:44 2005 From: guido at python.org (Guido van Rossum) Date: Wed, 5 Oct 2005 07:52:44 -0700 Subject: [Python-Dev] Python 2.5 and ast-branch In-Reply-To: <4343D532.2030202@gmail.com> References: <5.1.1.6.0.20051003125254.01f95e50@mail.telecommunity.com> <5.1.1.6.0.20051003151012.01f95ca0@mail.telecommunity.com> <2mu0fxekdz.fsf@starship.python.net> <43424480.9080900@gmail.com> <4343D532.2030202@gmail.com> Message-ID: On 10/5/05, Nick Coghlan wrote: > Anyway, the question is: What do we want to do with ast-branch? Finish > bringing it up to Python 2.4 equivalence, make it the HEAD, and only then > implement the approved PEP's (308, 342, 343) that affect the compiler? Or > implement the approved PEP's on the HEAD, and move the goalposts for > ast-branch to include those features as well? > > I believe the latter is the safe option in terms of making sure 2.5 is a solid > release, but doing it that way suggests to me that the ast compiler would need > to be held over until 2.6, which would be somewhat unfortunate. > > Given that I don't particularly like that answer, I'd love for someone to > convince me I'm wrong ;) Given the total lack of response, I have a different suggestion. Let's *abandon* the AST-branch. 
We're fooling ourselves believing that we can ever switch to that branch, no matter how theoretically better it is. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From hyeshik at gmail.com Wed Oct 5 17:06:06 2005 From: hyeshik at gmail.com (Hye-Shik Chang) Date: Thu, 6 Oct 2005 00:06:06 +0900 Subject: [Python-Dev] Unicode charmap decoders slow In-Reply-To: <4342E630.5060801@egenix.com> References: <20051004022548.GC7081@unpythonic.net> <167C9F31-31A9-4679-94A3-097191786241@livinglogic.de> <4342E630.5060801@egenix.com> Message-ID: <4f0b69dc0510050806h478a1b9fr6953999e8d9d312b@mail.gmail.com> On 10/5/05, M.-A. Lemburg wrote: > Of course, a C version could use the same approach as > the unicodedatabase module: that of compressed lookup > tables... > > http://aggregate.org/TechPub/lcpc2002.pdf > > genccodec.py anyone ? > I had written a test codec for single byte character sets to evaluate algorithms to use in CJKCodecs once before (it's not a direct implementation of what you've mentioned, though). I just ported it to unicodeobject (as attached). It showed noticeably better results than the charmap codecs: % python ./Lib/timeit.py -s "s='a'*1024*1024; u=unicode(s)" "s.decode('iso8859-1')" 10 loops, best of 3: 96.7 msec per loop % ./python ./Lib/timeit.py -s "s='a'*1024*1024; u=unicode(s)" "s.decode('iso8859_10_fc')" 10 loops, best of 3: 22.7 msec per loop % ./python ./Lib/timeit.py -s "s='a'*1024*1024; u=unicode(s)" "s.decode('utf-8')" 100 loops, best of 3: 18.9 msec per loop (Note that it doesn't contain any documentation nor good error handling yet. :-) Hye-Shik -------------- next part -------------- A non-text attachment was scrubbed...
Name: fastmapcodec.diff Type: application/octet-stream Size: 18814 bytes Desc: not available Url : http://mail.python.org/pipermail/python-dev/attachments/20051006/2106c236/fastmapcodec-0001.obj From walter at livinglogic.de Wed Oct 5 17:08:04 2005 From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=) Date: Wed, 05 Oct 2005 17:08:04 +0200 Subject: [Python-Dev] Unicode charmap decoders slow In-Reply-To: <4343748A.9050105@v.loewis.de> References: <4342DCEC.5020204@v.loewis.de> <20051004022548.GC7081@unpythonic.net> <167C9F31-31A9-4679-94A3-097191786241@livinglogic.de> <4342DCEC.5020204@v.loewis.de> <4343748A.9050105@v.loewis.de> Message-ID: <4343EC54.7090201@livinglogic.de> Martin v. Löwis wrote: > Tony Nelson wrote: > >>> For decoding it should be sufficient to use a unicode string of >>> length 256. u"\ufffd" could be used for "maps to undefined". Or the >>> string might be shorter and byte values greater than the length of >>> the string are treated as "maps to undefined" too. >> >> With Unicode using more than 64K codepoints now, it might be more forward >> looking to use a table of 256 32-bit values, with no need for tricky >> values. > > You might be missing the point. \ufffd is REPLACEMENT CHARACTER, > which would indicate that the byte with that index is really unused > in that encoding. OK, here's a patch that implements this enhancement to PyUnicode_DecodeCharmap(): http://www.python.org/sf/1313939 The mapping argument to PyUnicode_DecodeCharmap() can be a unicode string and is used as a decoding table.
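This string-as-mapping form is the one that ultimately shipped: in current CPython, `codecs.charmap_decode` accepts a 256-character string where U+FFFD marks an undefined position, and generated codecs carry exactly such a `decoding_table` string. A small sketch:

```python
import codecs

# Toy 256-entry table as a str: ASCII maps to itself, everything
# above 0x7F is marked undefined via U+FFFD.
table = "".join(chr(i) if i < 0x80 else "\ufffd" for i in range(256))

text, consumed = codecs.charmap_decode(b"abc", "strict", table)
assert (text, consumed) == ("abc", 3)

# A byte whose table slot is U+FFFD is undefined and raises under
# the strict error handler.
try:
    codecs.charmap_decode(b"\x9f", "strict", table)
except UnicodeDecodeError:
    pass
```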
Speed looks like this: python2.4 -mtimeit "s='a'*53*1024; u=unicode(s)" "s.decode('utf-8')" 1000 loops, best of 3: 538 usec per loop python2.4 -mtimeit "s='a'*53*1024; u=unicode(s)" "s.decode('mac-roman')" 100 loops, best of 3: 3.85 msec per loop ./python-cvs -mtimeit "s='a'*53*1024; u=unicode(s)" "s.decode('utf-8')" 1000 loops, best of 3: 539 usec per loop ./python-cvs -mtimeit "s='a'*53*1024; u=unicode(s)" "s.decode('mac-roman')" 1000 loops, best of 3: 623 usec per loop Creating the decoding_map as a string should probably be done by gencodec.py directly. This way the first import of the codec would be faster too. Bye, Walter Dörwald From mal at egenix.com Wed Oct 5 17:52:54 2005 From: mal at egenix.com (M.-A. Lemburg) Date: Wed, 05 Oct 2005 17:52:54 +0200 Subject: [Python-Dev] Unicode charmap decoders slow In-Reply-To: <4f0b69dc0510050806h478a1b9fr6953999e8d9d312b@mail.gmail.com> References: <20051004022548.GC7081@unpythonic.net> <167C9F31-31A9-4679-94A3-097191786241@livinglogic.de> <4342E630.5060801@egenix.com> <4f0b69dc0510050806h478a1b9fr6953999e8d9d312b@mail.gmail.com> Message-ID: <4343F6D6.3030305@egenix.com> Hye-Shik Chang wrote: > On 10/5/05, M.-A. Lemburg wrote: > >>Of course, a C version could use the same approach as >>the unicodedatabase module: that of compressed lookup >>tables... >> >> http://aggregate.org/TechPub/lcpc2002.pdf >> >>genccodec.py anyone ? >> > > > I had written a test codec for single byte character sets to evaluate > algorithms to use in CJKCodecs once before (it's not a direct > implemention of you've mentioned, tough) I just ported it > to unicodeobject (as attached). Thanks. Please upload the patch to SF. Looks like we now have two competing patches: yours and the one written by Walter. So far you've only compared decoding strings into Unicode and they seem to be similar in performance. Do they differ in encoding performance ?
> It showed relatively fine result > than charmap codecs: > > % python ./Lib/timeit.py -s "s='a'*1024*1024; u=unicode(s)" > "s.decode('iso8859-1')" > 10 loops, best of 3: 96.7 msec per loop > % ./python ./Lib/timeit.py -s "s='a'*1024*1024; u=unicode(s)" > "s.decode('iso8859_10_fc')" > 10 loops, best of 3: 22.7 msec per loop > % ./python ./Lib/timeit.py -s "s='a'*1024*1024; u=unicode(s)" > "s.decode('utf-8')" > 100 loops, best of 3: 18.9 msec per loop > > (Note that it doesn't contain any documentation nor good error > handling yet. :-) > > > Hye-Shik -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Oct 05 2005) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From martin at v.loewis.de Wed Oct 5 20:21:41 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 05 Oct 2005 20:21:41 +0200 Subject: [Python-Dev] Unicode charmap decoders slow In-Reply-To: <4343913D.3080609@egenix.com> References: <20051004022548.GC7081@unpythonic.net> <167C9F31-31A9-4679-94A3-097191786241@livinglogic.de> <4342DCEC.5020204@v.loewis.de> <6773C3BF-F909-404A-AEB8-EF5023D86513@livinglogic.de> <4342FD6D.7000308@v.loewis.de> <4343913D.3080609@egenix.com> Message-ID: <434419B5.7030803@v.loewis.de> M.-A. Lemburg wrote: >>I would try to avoid generating C code at all costs. Maintaining the >>build processes will just be a nightmare. > > > We could automate this using distutils; however I'm not sure > whether this would then also work on Windows. It wouldn't. 
Regards, Martin From martin at v.loewis.de Wed Oct 5 20:40:04 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 05 Oct 2005 20:40:04 +0200 Subject: [Python-Dev] Unicode charmap decoders slow In-Reply-To: <4343EC54.7090201@livinglogic.de> References: <4342DCEC.5020204@v.loewis.de> <20051004022548.GC7081@unpythonic.net> <167C9F31-31A9-4679-94A3-097191786241@livinglogic.de> <4342DCEC.5020204@v.loewis.de> <4343748A.9050105@v.loewis.de> <4343EC54.7090201@livinglogic.de> Message-ID: <43441E04.3060307@v.loewis.de> Walter Dörwald wrote: > OK, here's a patch that implements this enhancement to > PyUnicode_DecodeCharmap(): http://www.python.org/sf/1313939 Looks nice! > Creating the decoding_map as a string should probably be done by > gencodec.py directly. This way the first import of the codec would be > faster too. Hmm. How would you represent the string in source code? As a Unicode literal? With \u escapes, or in a UTF-8 source file? Or as a UTF-8 string, with an explicit decode call? I like the current dictionary style for being readable, as it also adds the Unicode character names into comments. Regards, Martin From mal at egenix.com Wed Oct 5 22:34:16 2005 From: mal at egenix.com (M.-A. Lemburg) Date: Wed, 05 Oct 2005 22:34:16 +0200 Subject: [Python-Dev] Unicode charmap decoders slow In-Reply-To: <434419B5.7030803@v.loewis.de> References: <20051004022548.GC7081@unpythonic.net> <167C9F31-31A9-4679-94A3-097191786241@livinglogic.de> <4342DCEC.5020204@v.loewis.de> <6773C3BF-F909-404A-AEB8-EF5023D86513@livinglogic.de> <4342FD6D.7000308@v.loewis.de> <4343913D.3080609@egenix.com> <434419B5.7030803@v.loewis.de> Message-ID: <434438C8.1030001@egenix.com> Martin v. Löwis wrote: > M.-A. Lemburg wrote: > >>> I would try to avoid generating C code at all costs. Maintaining the >>> build processes will just be a nightmare. >> >> >> >> We could automate this using distutils; however I'm not sure >> whether this would then also work on Windows.
> > > It wouldn't. Could you elaborate why not ? Using distutils on Windows is really easy... -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Oct 05 2005) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From mal at egenix.com Wed Oct 5 22:45:18 2005 From: mal at egenix.com (M.-A. Lemburg) Date: Wed, 05 Oct 2005 22:45:18 +0200 Subject: [Python-Dev] Unicode charmap decoders slow In-Reply-To: <43441E04.3060307@v.loewis.de> References: <4342DCEC.5020204@v.loewis.de> <20051004022548.GC7081@unpythonic.net> <167C9F31-31A9-4679-94A3-097191786241@livinglogic.de> <4342DCEC.5020204@v.loewis.de> <4343748A.9050105@v.loewis.de> <4343EC54.7090201@livinglogic.de> <43441E04.3060307@v.loewis.de> Message-ID: <43443B5E.5010606@egenix.com> Martin v. Löwis wrote: > Walter Dörwald wrote: > >>OK, here's a patch that implements this enhancement to >>PyUnicode_DecodeCharmap(): http://www.python.org/sf/1313939 > > Looks nice! Indeed (except for the choice of the "map this character to undefined" code point). Hye-Shik, could you please provide some timeit figures for the fastmap encoding ? >>Creating the decoding_map as a string should probably be done by >>gencodec.py directly. This way the first import of the codec would be >>faster too. > > > Hmm. How would you represent the string in source code? As a Unicode > literal? With \u escapes, or in a UTF-8 source file? Or as a UTF-8 > string, with an explicit decode call? > > I like the current dictionary style for being readable, as it also > adds the Unicode character names into comments.
Not only that: it also allows 1-n and 1-0 mappings which was part of the idea to use a mapping object (such as a dictionary) as basis for the codec. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Oct 05 2005) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From martin at v.loewis.de Wed Oct 5 22:57:21 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 05 Oct 2005 22:57:21 +0200 Subject: [Python-Dev] Unicode charmap decoders slow In-Reply-To: <434438C8.1030001@egenix.com> References: <20051004022548.GC7081@unpythonic.net> <167C9F31-31A9-4679-94A3-097191786241@livinglogic.de> <4342DCEC.5020204@v.loewis.de> <6773C3BF-F909-404A-AEB8-EF5023D86513@livinglogic.de> <4342FD6D.7000308@v.loewis.de> <4343913D.3080609@egenix.com> <434419B5.7030803@v.loewis.de> <434438C8.1030001@egenix.com> Message-ID: <43443E31.1070606@v.loewis.de> M.-A. Lemburg wrote: >>It wouldn't. > > > Could you elaborate why not ? Using distutils on Windows is really > easy... The current build process for Windows simply doesn't provide it. You expect to select "Build/All" from the menu (or some such), and expect all code to be compiled. The VC build process only considers VC project files. Maybe it is possible to hack up a project file to invoke distutils as the build process, but no such project file is currently available, nor is it known whether it is possible to create one. Whatever the build process: it should work properly with debug and release builds, with alternative compilers (such as the Itanium compiler), and place the files so that debugging from the VStudio environment is possible.
All of this is not the case today, and nobody has worked on making it possible. I very much doubt distutils in its current form could handle it. Regards, Martin From bcannon at gmail.com Wed Oct 5 23:00:35 2005 From: bcannon at gmail.com (Brett Cannon) Date: Wed, 5 Oct 2005 14:00:35 -0700 Subject: [Python-Dev] Python 2.5 and ast-branch In-Reply-To: References: <5.1.1.6.0.20051003125254.01f95e50@mail.telecommunity.com> <5.1.1.6.0.20051003151012.01f95ca0@mail.telecommunity.com> <2mu0fxekdz.fsf@starship.python.net> <43424480.9080900@gmail.com> <4343D532.2030202@gmail.com> Message-ID: To answer Nick's email here, I didn't respond to that initial email because it seemed specifically directed at Guido and not me. On 10/5/05, Guido van Rossum wrote: > On 10/5/05, Nick Coghlan wrote: > > Anyway, the question is: What do we want to do with ast-branch? Finish > > bringing it up to Python 2.4 equivalence, make it the HEAD, and only then > > implement the approved PEP's (308, 342, 343) that affect the compiler? Or > > implement the approved PEP's on the HEAD, and move the goalposts for > > ast-branch to include those features as well? > > > > I believe the latter is the safe option in terms of making sure 2.5 is a solid > > release, but doing it that way suggests to me that the ast compiler would need > > to be held over until 2.6, which would be somewhat unfortunate. > > > > Given that I don't particularly like that answer, I'd love for someone to > > convince me I'm wrong ;) > > Given the total lack of response, I have a different suggestion. Let's > *abandon* the AST-branch. We're fooling ourselves believing that we > can ever switch to that branch, no matter how theoretically better it > is. > Since the original people who have done the majority of the work (Jeremy, Tim, Neal, Nick, logistix, and myself) have fallen so far behind, this probably is not a bad decision.
Obviously I would like to see the work pan out, but since I personally just have not found the time to shuttle the branch the rest of the way I really am in no position to say much in terms of objecting to its demise. Maybe I can come up with a new design and get my dissertation out of it. =) -Brett From mal at egenix.com Wed Oct 5 23:15:56 2005 From: mal at egenix.com (M.-A. Lemburg) Date: Wed, 05 Oct 2005 23:15:56 +0200 Subject: [Python-Dev] Unicode charmap decoders slow In-Reply-To: <43443E31.1070606@v.loewis.de> References: <20051004022548.GC7081@unpythonic.net> <167C9F31-31A9-4679-94A3-097191786241@livinglogic.de> <4342DCEC.5020204@v.loewis.de> <6773C3BF-F909-404A-AEB8-EF5023D86513@livinglogic.de> <4342FD6D.7000308@v.loewis.de> <4343913D.3080609@egenix.com> <434419B5.7030803@v.loewis.de> <434438C8.1030001@egenix.com> <43443E31.1070606@v.loewis.de> Message-ID: <4344428C.4020309@egenix.com> Martin v. Löwis wrote: > M.-A. Lemburg wrote: > >>> It wouldn't. >> >> >> >> Could you elaborate why not ? Using distutils on Windows is really >> easy... > > > The current build process for Windows simply doesn't provide it. > You expect to select "Build/All" from the menu (or some such), > and expect all code to be compiled. The VC build process only > considers VC project files. > > Maybe it is possible to hack up a project file to invoke distutils > as the build process, but no such project file is currently available, > nor is it known whether it is possible to create one. Whatever the > build process: it should properly with debug and release build, > with alternative compilers (such as Itanium compiler), and place > the files so that debugging from the VStudio environment is possible. > All of this is not the case of today, and nobody has worked on > making it possible. I very much doubt distutils in its current form > could handle it. I see, so you have to create a VC project file for each codec - that would be hard to maintain indeed.
For Unix platforms this would be no problem at all since there all extension modules are built using distutils anyway. Thanks for the explanation. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Oct 05 2005) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From trentm at ActiveState.com Wed Oct 5 23:18:31 2005 From: trentm at ActiveState.com (Trent Mick) Date: Wed, 5 Oct 2005 14:18:31 -0700 Subject: [Python-Dev] Unicode charmap decoders slow In-Reply-To: <43443E31.1070606@v.loewis.de> References: <20051004022548.GC7081@unpythonic.net> <167C9F31-31A9-4679-94A3-097191786241@livinglogic.de> <4342DCEC.5020204@v.loewis.de> <6773C3BF-F909-404A-AEB8-EF5023D86513@livinglogic.de> <4342FD6D.7000308@v.loewis.de> <4343913D.3080609@egenix.com> <434419B5.7030803@v.loewis.de> <434438C8.1030001@egenix.com> <43443E31.1070606@v.loewis.de> Message-ID: <20051005211831.GB5220@ActiveState.com> [Martin v. Loewis wrote] > Maybe it is possible to hack up a project file to invoke distutils > as the build process, but no such project file is currently available, > nor is it known whether it is possible to create one. This is essentially what the "_ssl" project does, no? It defers to "build_ssl.py" to do the build work. I didn't see what the full build requirements were earlier in this thread though, so I may be missing something. 
Trent -- Trent Mick TrentM at ActiveState.com From martin at v.loewis.de Thu Oct 6 00:00:52 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Thu, 06 Oct 2005 00:00:52 +0200 Subject: [Python-Dev] Unicode charmap decoders slow In-Reply-To: <20051005211831.GB5220@ActiveState.com> References: <20051004022548.GC7081@unpythonic.net> <167C9F31-31A9-4679-94A3-097191786241@livinglogic.de> <4342DCEC.5020204@v.loewis.de> <6773C3BF-F909-404A-AEB8-EF5023D86513@livinglogic.de> <4342FD6D.7000308@v.loewis.de> <4343913D.3080609@egenix.com> <434419B5.7030803@v.loewis.de> <434438C8.1030001@egenix.com> <43443E31.1070606@v.loewis.de> <20051005211831.GB5220@ActiveState.com> Message-ID: <43444D14.2070802@v.loewis.de> Trent Mick wrote: > [Martin v. Loewis wrote] > >>Maybe it is possible to hack up a project file to invoke distutils >>as the build process, but no such project file is currently available, >>nor is it known whether it is possible to create one. > > > This is essentially what the "_ssl" project does, no? More or less, yes. It does support both debug and release builds. It does not support Itanium builds (at least not the way the other projects do); as a result, the Itanium build currently just doesn't offer SSL. More importantly, build_ssl.py is not based on distutils. Instead, it is manually hacked up - a VBScript file would have worked as well. So if you were to create many custom build scripts (one per codec), you might just as well generate the VS project files directly.
Regards, Martin From hyeshik at gmail.com Thu Oct 6 05:11:06 2005 From: hyeshik at gmail.com (Hye-Shik Chang) Date: Thu, 6 Oct 2005 12:11:06 +0900 Subject: [Python-Dev] Unicode charmap decoders slow In-Reply-To: <43443B5E.5010606@egenix.com> References: <20051004022548.GC7081@unpythonic.net> <167C9F31-31A9-4679-94A3-097191786241@livinglogic.de> <4342DCEC.5020204@v.loewis.de> <4343748A.9050105@v.loewis.de> <4343EC54.7090201@livinglogic.de> <43441E04.3060307@v.loewis.de> <43443B5E.5010606@egenix.com> Message-ID: <4f0b69dc0510052011u31847835q1abacbb35afd0a15@mail.gmail.com> On 10/6/05, M.-A. Lemburg wrote: > Hye-Shik, could you please provide some timeit figures for > the fastmap encoding ? (before applying Walter's patch, charmap decoder) % ./python Lib/timeit.py -s "s='a'*53*1024; e='iso8859_10'; u=unicode(s, e)" "s.decode(e)" 100 loops, best of 3: 3.35 msec per loop (applied the patch, improved charmap decoder) % ./python Lib/timeit.py -s "s='a'*53*1024; e='iso8859_10'; u=unicode(s, e)" "s.decode(e)" 1000 loops, best of 3: 1.11 msec per loop (the fastmap decoder) % ./python Lib/timeit.py -s "s='a'*53*1024; e='iso8859_10_fc'; u=unicode(s, e)" "s.decode(e)" 1000 loops, best of 3: 1.04 msec per loop (utf-8 decoder) % ./python Lib/timeit.py -s "s='a'*53*1024; e='utf_8'; u=unicode(s, e)" "s.decode(e)" 1000 loops, best of 3: 851 usec per loop Walter's decoder and the fastmap decoder run in mostly the same way, so the performance difference is quite minor. Perhaps the minor difference comes from the existence of a wrapper function on each codec; the fastmap codec provides functions usable as Codecs.{en,de}code directly.
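The command-line runs above can be reproduced programmatically with the timeit module. A rough modern-Python sketch (bytes/str instead of str/unicode, and a plain stdlib codec standing in for the experimental iso8859_10_fc, which never shipped; absolute numbers will of course differ by machine and interpreter):

```python
import timeit

# Same 53 KB ASCII payload as in the benchmarks above.
data = b'a' * 53 * 1024
encoding = 'iso8859_10'   # stand-in codec; the fastmap variant is not available

# Timer.autorange() picks a loop count automatically, replacing the
# manual "-n" juggling of the command-line interface.
t = timeit.Timer(lambda: data.decode(encoding))
loops, total = t.autorange()
print('%.2f usec per loop' % (total / loops * 1e6))
```

The `-s` setup string of the CLI corresponds to the code outside the lambda here.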
(encoding, charmap codec) % ./python Lib/timeit.py -s "s='a'*53*1024; e='iso8859_10'; u=unicode(s, e)" "u.encode(e)" 100 loops, best of 3: 3.51 msec per loop (encoding, fastmap codec) % ./python Lib/timeit.py -s "s='a'*53*1024; e='iso8859_10_fc'; u=unicode(s, e)" "u.encode(e)" 1000 loops, best of 3: 536 usec per loop (encoding, utf-8 codec) % ./python Lib/timeit.py -s "s='a'*53*1024; e='utf_8'; u=unicode(s, e)" "u.encode(e)" 1000 loops, best of 3: 1.5 msec per loop If the encoding optimization can be easily done in Walter's approach, the fastmap codec would be too expensive a way for the objective because we must maintain not only fastmap but also charmap for backward compatibility. Hye-Shik From pje at telecommunity.com Thu Oct 6 06:47:40 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Thu, 06 Oct 2005 00:47:40 -0400 Subject: [Python-Dev] Removing the block stack (was Re: PEP 343 and __with__) In-Reply-To: <2mu0fxekdz.fsf@starship.python.net> References: <5.1.1.6.0.20051003151012.01f95ca0@mail.telecommunity.com> <5.1.1.6.0.20051003125254.01f95e50@mail.telecommunity.com> <5.1.1.6.0.20051003125254.01f95e50@mail.telecommunity.com> <5.1.1.6.0.20051003151012.01f95ca0@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20051005191114.01f6f018@mail.telecommunity.com> At 09:50 AM 10/4/2005 +0100, Michael Hudson wrote: >(anyone still thinking about removing the block stack?). I'm not any more. My thought was that it would be good for performance, by reducing the memory allocation overhead for frames enough to allow pymalloc to be used instead of the platform malloc. After more investigation, however, I realized that was a dumb idea, because for a typical application the amortized allocation cost of frames approaches zero as the program runs and allocates as many frames as it will ever use, as large as it will ever use them, and just recycles them on the free list.
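The free-list recycling described here follows a common allocator pattern. A toy pure-Python sketch of the idea (illustrative only; CPython's actual frame free list lives in C):

```python
class Frame:
    """Stand-in for an interpreter frame; the real one is a C struct."""
    def __init__(self):
        self.stack = []

_free_frames = []   # the free list: retired frames awaiting reuse

def alloc_frame():
    # After warm-up, frames come off the free list, so the amortized
    # allocation cost approaches zero, as described above.
    if _free_frames:
        return _free_frames.pop()
    return Frame()

def release_frame(frame):
    frame.stack.clear()        # reset state before recycling
    _free_frames.append(frame)
```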
And all of the ways I came up with for removing the block stack were a lot more complex than leaving it as-is. Clearly, the cost of function calls in Python lies somewhere else, and I'd probably look next at parameter tuple allocation, and other frame initialization activities. I seem to recall that Armin Rigo once supplied a patch that sped up calls at the cost of slowing down recursive or re-entrant ones, and I seem to recall that it was based on preinitializing frames, not just preallocating them: http://mail.python.org/pipermail/python-dev/2004-March/042871.html However, the patch was never applied because of its increased memory usage as well as the slowdown for recursion. Every so often, in blue-sky thinking about alternative Python VM designs, I think about making frames virtual, in the sense of not even having "real" frame objects except for generators, sys._getframe(), and tracebacks. I suspect, however, that doing this in a way that doesn't mess with the current C API is non-trivial. And for many "obvious" ways to simplify the various stacks, locals, etc., the downside could be more complexity for generators, and probably less speed as well. For example, we could use a single "stack" arena in the heap for parameters, locals, cells, and blocks, rather than doing all the various sub-allocations within the frame. But then creating a frame would involve copying data off the top of this pseudo-stack, and doing all the offset computations and perhaps some other trickery as well. And resuming a generator would have to either copy it back, or have some sane way to make calls out to a new stack arena when calling other functions - thus making those operations slower. The real problem, of course, with any of these ideas is that we are at best shaving a few percentage points here, a few points there, so it's comparatively speaking rather expensive to do the experiments to see if they help anything. 
From nnorwitz at gmail.com Thu Oct 6 07:09:21 2005 From: nnorwitz at gmail.com (Neal Norwitz) Date: Wed, 5 Oct 2005 22:09:21 -0700 Subject: [Python-Dev] Removing the block stack (was Re: PEP 343 and __with__) In-Reply-To: <5.1.1.6.0.20051005191114.01f6f018@mail.telecommunity.com> References: <5.1.1.6.0.20051003125254.01f95e50@mail.telecommunity.com> <5.1.1.6.0.20051003151012.01f95ca0@mail.telecommunity.com> <2mu0fxekdz.fsf@starship.python.net> <5.1.1.6.0.20051005191114.01f6f018@mail.telecommunity.com> Message-ID: On 10/5/05, Phillip J. Eby wrote: > At 09:50 AM 10/4/2005 +0100, Michael Hudson wrote: > >(anyone still thinking about removing the block stack?). > > I'm not any more. My thought was that it would be good for performance, by > reducing the memory allocation overhead for frames enough to allow pymalloc > to be used instead of the platform malloc. I did something similar to reduce the frame size to under 256 bytes (don't recall if I made a patch or not) and it had no overall effect on perf. > Clearly, the cost of function calls in Python lies somewhere else, and I'd > probably look next at parameter tuple allocation, and other frame > initialization activities. I think that's a big part of it. This patch shows C calls getting sped up primarily by avoiding tuple creation: http://python.org/sf/1107887 I hope to work on that and get it into 2.5. I've also been thinking about avoiding tuple creation when calling python functions. The change I have in mind would probably have to wait until p3k, but could yield some speed ups. Warning: half baked idea follows. My thoughts are to dynamically allocate the Python stack memory (e.g., void *stack = malloc(128MB)). Then all calls within each thread use that thread's own stack. So things would be pushed onto the stack like they are currently, but we wouldn't need to create a tuple to pass to a method, they could just be used directly. Basically more closely simulate the way it currently works in hardware.
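The "one big hunk" idea above can be modeled in a few lines of Python — purely illustrative, with list slots standing in for raw memory and an integer offset standing in for the stack pointer:

```python
class StackArena:
    """Toy model of a preallocated call stack: calling and returning
    only move an offset; no per-call allocation, no argument tuple."""
    def __init__(self, size=1024):
        self.slots = [None] * size   # the "malloc(128MB) hunk", scaled down
        self.top = 0                 # the stack pointer

    def push_frame(self, args):
        base = self.top
        for a in args:               # arguments written in place
            self.slots[self.top] = a
            self.top += 1
        return base                  # caller remembers the frame's base

    def pop_frame(self, base):
        self.top = base              # returning is just a pointer decrement
```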
This would mean all the PyArg_ParseTuple()s would have to change. It may be possible to fake it out, but I'm not sure it's worth it, which is why it would be easier to do this for p3k. The general idea is to allocate the stack in one big hunk and just walk up/down it as functions are called/returned. This only means incrementing or decrementing pointers. This should allow us to avoid a bunch of copying and tuple creation/destruction. Frames would hopefully be the same size, which would help. Note that even though there is a free list for frames, there could still be PyObject_GC_Resize()s often (or unused memory). With my idea, hopefully there would be better memory locality, which could speed things up. n From martin at v.loewis.de Thu Oct 6 09:04:14 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Thu, 06 Oct 2005 09:04:14 +0200 Subject: [Python-Dev] Unicode charmap decoders slow In-Reply-To: <4f0b69dc0510052011u31847835q1abacbb35afd0a15@mail.gmail.com> References: <20051004022548.GC7081@unpythonic.net> <167C9F31-31A9-4679-94A3-097191786241@livinglogic.de> <4342DCEC.5020204@v.loewis.de> <4343748A.9050105@v.loewis.de> <4343EC54.7090201@livinglogic.de> <43441E04.3060307@v.loewis.de> <43443B5E.5010606@egenix.com> <4f0b69dc0510052011u31847835q1abacbb35afd0a15@mail.gmail.com> Message-ID: <4344CC6E.40206@v.loewis.de> Hye-Shik Chang wrote: > If the encoding optimization can be easily done in Walter's approach, > the fastmap codec would be too expensive a way for the objective because > we must maintain not only fastmap but also charmap for backward > compatibility. IMO, whether a new function is added or whether the existing function becomes polymorphic (depending on the type of table being passed) is a minor issue. Clearly, the charmap API needs to stay for backwards compatibility; in terms of code size or maintenance, I would actually prefer separate functions.
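The two table styles under discussion differ like this in pure-Python terms — a sketch only, since the real work happens in C inside PyUnicode_DecodeCharmap():

```python
def charmap_decode_dict(data, table):
    # Classic generated-codec style: one dictionary lookup per byte,
    # mapping byte value -> code point.
    return ''.join(chr(table[b]) for b in data)

def charmap_decode_string(data, table):
    # Walter's approach: the table is a 256-character unicode string,
    # so each byte decodes with a plain index operation.
    return ''.join(table[b] for b in data)

# Identity tables for latin-1-like behaviour (an assumption for the demo).
dict_table = {i: i for i in range(256)}
str_table = ''.join(chr(i) for i in range(256))
```

The string form is both faster to index and cheaper to import, which is what the patch exploits.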
One issue apparently is people tweaking the existing dictionaries, with additional entries they think belong there. I don't think we need to preserve compatibility with that approach in 2.5, but I also think that breakage should be obvious: the dictionary should either go away completely at run-time, or be stored under a different name, so that any attempt of modifying the dictionary gives an exception instead of having no interesting effect. I envision a layout of the codec files like this: decoding_dict = ... decoding_map, encoding_map = codecs.make_lookup_tables(decoding_dict) I think it should be possible to build efficient tables in a single pass over the dictionary, so startup time should be fairly small (given that the dictionaries are currently built incrementally, anyway, due to the way dictionary literals work). Regards, Martin From pje at telecommunity.com Thu Oct 6 09:06:39 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Thu, 06 Oct 2005 03:06:39 -0400 Subject: [Python-Dev] Removing the block stack (was Re: PEP 343 and __with__) In-Reply-To: References: <5.1.1.6.0.20051005191114.01f6f018@mail.telecommunity.com> <5.1.1.6.0.20051003125254.01f95e50@mail.telecommunity.com> <5.1.1.6.0.20051003151012.01f95ca0@mail.telecommunity.com> <2mu0fxekdz.fsf@starship.python.net> <5.1.1.6.0.20051005191114.01f6f018@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20051006024517.01f6f0d0@mail.telecommunity.com> At 10:09 PM 10/5/2005 -0700, Neal Norwitz wrote: >I've also been thinking about avoiding tuple creation when calling >python functions. The change I have in mind would probably have to >wait until p3k, but could yield some speed ups. > >Warning: half baked idea follows. Yeah, I've been baking that idea for a long time, and it's a bit more complex than you've suggested, due to generators, sys._getframe(), and tracebacks. >My thoughts are to dynamically allocate the Python stack memory (e.g., >void *stack = malloc(128MB)). 
Then all calls within each thread use >its own stack. So things would be pushed onto the stack like they are >currently, but we wouldn't need to create a tuple to pass to a >method, they could just be used directly. Basically more closely >simulate the way it currently works in hardware. Actually, Python/ceval.c already skips creating a tuple when calling Python functions with a fixed number of arguments (caller and callee) and no cell vars (i.e., not a closure). It copies them straight from the calling frame stack to the callee frame's stack. >This would mean all the PyArg_ParseTuple()s would have to change. It >may be possible to fake it out, but I'm not sure it's worth it which >is why it would be easier to do this for p3k. Actually, I've been thinking that replacing the arg tuple with a PyObject* array would allow us to skip tuple creation when calling C functions, since you could just give the C functions a pointer to the arguments on the caller's stack. That would let us get rid of most remaining tuple allocations. I suppose we'd also need an argcount parameter. The old APIs taking tuples for calls could trivially convert the tuples to an array pointer and size, then call the new APIs. Actually, we'd probably have to have a tp_arraycall slot or something, with the existing tp_call forwarding to tp_arraycall in most cases, but occasionally the reverse. The tricky part is making sure you don't end up with cases where you call a tuple API that converts to an array that then turns it back into a tuple! >The general idea is to allocate the stack in one big hunk and just >walk up/down it as functions are called/returned. This only means >incrementing or decrementing pointers. This should allow us to avoid >a bunch of copying and tuple creation/destruction. Frames would >hopefully be the same size which would help. Note that even though >there is a free list for frames, there could still be >PyObject_GC_Resize()s often (or unused memory).
With my idea, >hopefully there would be better memory locality, which could speed >things up. Yeah, unfortunately for your idea, generators would have to copy off bits of the stack and then copy them back in, making generators slower. If it weren't for that part, the idea would probably be a good one, as arguments, locals, cells, and the block and value stacks could all be handled that way, with the compiler treating all operations as base-pointer offsets, thereby eliminating lots of more-complex pointer management in ceval.c and frameobject.c. Another possible fix for generators would be of course to give them their own stack arena, but then you have the problem of needing to copy overflows from one such stack to another - at which point you're basically back to having frames. On the other hand, maybe the good part of this idea is just eliminating all the pointer fudging and having the compiler determine stack offsets. Then, the frame object layout would just consist of a big hunk of stack space, laid out as a PyObject* array. The main problem with this concept is that it would change the meaning of certain opcodes, since right now the offsets of free variables in opcodes start over the numbering, but this approach would add the number of locals to those offsets. From martin at v.loewis.de Thu Oct 6 09:15:06 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Thu, 06 Oct 2005 09:15:06 +0200 Subject: [Python-Dev] Removing the block stack (was Re: PEP 343 and __with__) In-Reply-To: References: <5.1.1.6.0.20051003125254.01f95e50@mail.telecommunity.com> <5.1.1.6.0.20051003151012.01f95ca0@mail.telecommunity.com> <2mu0fxekdz.fsf@starship.python.net> <5.1.1.6.0.20051005191114.01f6f018@mail.telecommunity.com> Message-ID: <4344CEFA.1040708@v.loewis.de> Neal Norwitz wrote: > My thoughts are to dynamically allocate the Python stack memory (e.g., > void *stack = malloc(128MB)). Then all calls within each thread use > its own stack.
So things would be pushed onto the stack like they are > currently, but we wouldn't need to create a tuple to pass to a > method, they could just be used directly. Basically more closely > simulate the way it currently works in hardware. One issue with argument tuples on the stack (or some sort of stack) is that functions may hold onto argument tuples longer:

def foo(*args):
    global last_args
    last_args = args

I considered making true tuple objects (i.e. with ob_type etc.) on the stack, but this possibility breaks it. Regards, Martin From walter at livinglogic.de Thu Oct 6 09:28:05 2005 From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=) Date: Thu, 06 Oct 2005 09:28:05 +0200 Subject: [Python-Dev] Unicode charmap decoders slow In-Reply-To: <43441E04.3060307@v.loewis.de> References: <4342DCEC.5020204@v.loewis.de> <20051004022548.GC7081@unpythonic.net> <167C9F31-31A9-4679-94A3-097191786241@livinglogic.de> <4342DCEC.5020204@v.loewis.de> <4343748A.9050105@v.loewis.de> <4343EC54.7090201@livinglogic.de> <43441E04.3060307@v.loewis.de> Message-ID: <4344D205.6090800@livinglogic.de> Martin v. Löwis wrote: > Walter Dörwald wrote: > >> OK, here's a patch that implements this enhancement to >> PyUnicode_DecodeCharmap(): http://www.python.org/sf/1313939 > > Looks nice! > >> Creating the decoding_map as a string should probably be done by >> gencodec.py directly. This way the first import of the codec would be >> faster too. > > Hmm. How would you represent the string in source code? As a Unicode > literal? With \u escapes, Yes, simply by outputting repr(decoding_string). > or in a UTF-8 source file? This might get unreadable if your editor can't detect the coding header. > Or as a UTF-8 > string, with an explicit decode call? This is another possibility, but is unreadable too. But we might add the real codepoints as comments. > I like the current dictionary style for being readable, as it also > adds the Unicode character names into comments.
We could use

decoding_string = (
    u"\u009c"  # 0x0004 -> U+009C: CONTROL
    u"\u0009"  # 0x0005 -> U+0009: HORIZONTAL TABULATION
    ...
)

However the current approach has the advantage that only those byte values that differ from the identity mapping have to be specified. Bye, Walter Dörwald From stephen at xemacs.org Thu Oct 6 10:41:33 2005 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Thu, 06 Oct 2005 17:41:33 +0900 Subject: [Python-Dev] unifying str and unicode In-Reply-To: <4341C05D.5000706@egenix.com> (M.'s message of "Tue, 04 Oct 2005 01:35:57 +0200") References: <1128346015.6138.149.camel@fsol> <20051003091416.9817.JCARLSON@uci.edu> <1128361197.6138.212.camel@fsol> <1128368242.6138.258.camel@fsol> <1128371900.6138.299.camel@fsol> <8393fff0510031435n7ef19cbcg297b8881d75d0a08@mail.gmail.com> <4341C05D.5000706@egenix.com> Message-ID: <87fyrfox4y.fsf@tleepslib.sk.tsukuba.ac.jp> >>>>> "M" == "M.-A. Lemburg" writes: M> From what I've read on the web about the Python Unicode M> implementation we have one of the better ones compared to other M> languages implementations and their choices and design M> decisions. Yes, indeed! Speaking-as-a-card-carrying-member-of-the-loyal-opposition-ly y'rs, -- School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp University of Tsukuba Tennodai 1-1-1 Tsukuba 305-8573 JAPAN Ask not how you can "do" free software business; ask what your business can "do for" free software.
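Returning to the codec-table subthread: the make_lookup_tables() helper Martin envisions could be sketched as below. This is hypothetical — no such function exists in the codecs module — and it assumes an identity mapping for bytes the dictionary does not mention, which matches the convention Walter describes:

```python
def make_lookup_tables(decoding_dict):
    """Build a 256-entry decoding string and its inverse encoding dict
    in a single pass over the byte -> code point dictionary."""
    chars = []
    encoding_map = {}
    for byte in range(256):
        # Bytes absent from the dict decode to themselves (an assumption
        # of this sketch; real codecs may map them to 'undefined').
        codepoint = decoding_dict.get(byte, byte)
        chars.append(chr(codepoint))
        encoding_map[codepoint] = byte
    return ''.join(chars), encoding_map
```

A single pass suffices because every byte value is visited exactly once, so startup cost stays proportional to the table size.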
From mwh at python.net Thu Oct 6 10:44:49 2005 From: mwh at python.net (Michael Hudson) Date: Thu, 06 Oct 2005 09:44:49 +0100 Subject: [Python-Dev] Removing the block stack In-Reply-To: (Neal Norwitz's message of "Wed, 5 Oct 2005 22:09:21 -0700") References: <5.1.1.6.0.20051003125254.01f95e50@mail.telecommunity.com> <5.1.1.6.0.20051003151012.01f95ca0@mail.telecommunity.com> <2mu0fxekdz.fsf@starship.python.net> <5.1.1.6.0.20051005191114.01f6f018@mail.telecommunity.com> Message-ID: <2mpsqjdofy.fsf@starship.python.net> Neal Norwitz writes: > On 10/5/05, Phillip J. Eby wrote: >> At 09:50 AM 10/4/2005 +0100, Michael Hudson wrote: >> >(anyone still thinking about removing the block stack?). >> >> I'm not any more. My thought was that it would be good for performance, by >> reducing the memory allocation overhead for frames enough to allow pymalloc >> to be used instead of the platform malloc. > > I did something similar to reduce the frame size to under 256 bytes > (don't recall if I made a patch or not) and it had no overall effect > on perf. Hey, me too! I also came to the same conclusion. Cheers, mwh -- The ultimate laziness is not using Perl. That saves you so much work you wouldn't believe it if you had never tried it. -- Erik Naggum, comp.lang.lisp From walter at livinglogic.de Thu Oct 6 10:51:47 2005 From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=) Date: Thu, 06 Oct 2005 10:51:47 +0200 Subject: [Python-Dev] Unicode charmap decoders slow In-Reply-To: <4344CC6E.40206@v.loewis.de> References: <20051004022548.GC7081@unpythonic.net> <167C9F31-31A9-4679-94A3-097191786241@livinglogic.de> <4342DCEC.5020204@v.loewis.de> <4343748A.9050105@v.loewis.de> <4343EC54.7090201@livinglogic.de> <43441E04.3060307@v.loewis.de> <43443B5E.5010606@egenix.com> <4f0b69dc0510052011u31847835q1abacbb35afd0a15@mail.gmail.com> <4344CC6E.40206@v.loewis.de> Message-ID: <4344E5A3.7060401@livinglogic.de> Martin v. 
Löwis wrote: > Hye-Shik Chang wrote: > >> If the encoding optimization can be easily done in Walter's approach, >> the fastmap codec would be too expensive a way for the objective because >> we must maintain not only fastmap but also charmap for backward >> compatibility. > > IMO, whether a new function is added or whether the existing function > becomes polymorphic (depending on the type of table being passed) is > a minor issue. Clearly, the charmap API needs to stay for backwards > compatibility; in terms of code size or maintenance, I would actually > prefer separate functions. OK, I can update the patch accordingly. Any suggestions for the name? PyUnicode_DecodeCharmapString? > One issue apparently is people tweaking the existing dictionaries, > with additional entries they think belong there. I don't think we > need to preserve compatibility with that approach in 2.5, but I > also think that breakage should be obvious: the dictionary should > either go away completely at run-time, or be stored under a > different name, so that any attempt of modifying the dictionary > gives an exception instead of having no interesting effect. IMHO it should be stored under a different name, because there are codecs (c037, koi8_r, iso8859_11) that reuse existing dictionaries. Or we could have a function that recreates the dictionary from the string. > I envision a layout of the codec files like this: > > decoding_dict = ... > decoding_map, encoding_map = codecs.make_lookup_tables(decoding_dict) Apart from the names (and the fact that encoding_map is still a dictionary), that's what my patch does. > I think it should be possible to build efficient tables in a single > pass over the dictionary, so startup time should be fairly small > (given that the dictionaries are currently built incrementally, anyway, > due to the way dictionary literals work). Bye, Walter Dörwald From mal at egenix.com Thu Oct 6 11:09:51 2005 From: mal at egenix.com (M.-A.
Lemburg) Date: Thu, 06 Oct 2005 11:09:51 +0200 Subject: [Python-Dev] Unicode charmap decoders slow In-Reply-To: <4344E5A3.7060401@livinglogic.de> References: <20051004022548.GC7081@unpythonic.net> <167C9F31-31A9-4679-94A3-097191786241@livinglogic.de> <4342DCEC.5020204@v.loewis.de> <4343748A.9050105@v.loewis.de> <4343EC54.7090201@livinglogic.de> <43441E04.3060307@v.loewis.de> <43443B5E.5010606@egenix.com> <4f0b69dc0510052011u31847835q1abacbb35afd0a15@mail.gmail.com> <4344CC6E.40206@v.loewis.de> <4344E5A3.7060401@livinglogic.de> Message-ID: <4344E9DF.4060706@egenix.com> Walter Dörwald wrote: > Martin v. Löwis wrote: > >> Hye-Shik Chang wrote: >> >>> If the encoding optimization can be easily done in Walter's approach, >>> the fastmap codec would be too expensive a way for the objective because >>> we must maintain not only fastmap but also charmap for backward >>> compatibility. >> >> >> IMO, whether a new function is added or whether the existing function >> becomes polymorphic (depending on the type of table being passed) is >> a minor issue. Clearly, the charmap API needs to stay for backwards >> compatibility; in terms of code size or maintenance, I would actually >> prefer separate functions. > > > OK, I can update the patch accordingly. Any suggestions for the name? > PyUnicode_DecodeCharmapString? No, you can factor this part out into a separate C function - there's no need to add a completely new entry point just for this optimization. Later on we can then also add support for compressed tables to the codec in the same way. >> One issue apparently is people tweaking the existing dictionaries, >> with additional entries they think belong there.
I don't think we >> need to preserve compatibility with that approach in 2.5, but I >> also think that breakage should be obvious: the dictionary should >> either go away completely at run-time, or be stored under a >> different name, so that any attempt of modifying the dictionary >> gives an exception instead of having no interesting effect. > > IMHO it should be stored under a different name, because there are > codecs (c037, koi8_r, iso8859_11) that reuse existing dictionaries. Only koi8_u reuses the dictionary from koi8_r - and it's easy to recreate the codec from a standard mapping file. > Or we could have a function that recreates the dictionary from the string. Actually, I'd prefer that these operations be done by the codec generator script, so that we don't have additional startup time. The dictionaries should then no longer be generated at run-time; the generator would write out the tables directly instead. I'd like the comments to stay, though. This can be done like this (using string concatenation applied by the compiler):

decoding_charmap = (
    u'x'  # 0x0000 -> 0x0078 LATIN SMALL LETTER X
    u'y'  # 0x0001 -> 0x0079 LATIN SMALL LETTER Y
    ...
)

Either way, monkey patching the codec won't work anymore. Doesn't really matter, though, as this was never officially supported. We've always told people to write their own codecs if they need to modify an existing one and then hook it into the system using either a new codec search function or by adding an appropriate alias. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Oct 06 2005) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From mal at egenix.com Thu Oct 6 11:13:50 2005 From: mal at egenix.com (M.-A.
Lemburg) Date: Thu, 06 Oct 2005 11:13:50 +0200 Subject: [Python-Dev] Unicode charmap decoders slow In-Reply-To: <4f0b69dc0510052011u31847835q1abacbb35afd0a15@mail.gmail.com> References: <20051004022548.GC7081@unpythonic.net> <167C9F31-31A9-4679-94A3-097191786241@livinglogic.de> <4343748A.9050105@v.loewis.de> <4343EC54.7090201@livinglogic.de> <43441E04.3060307@v.loewis.de> <43443B5E.5010606@egenix.com> <4f0b69dc0510052011u31847835q1abacbb35afd0a15@mail.gmail.com> Message-ID: <4344EACE.5070102@egenix.com> Hye-Shik Chang wrote: > On 10/6/05, M.-A. Lemburg wrote: > >>Hye-Shik, could you please provide some timeit figures for >>the fastmap encoding ? >> Thanks for the timings. > (before applying Walter's patch, charmap decoder) > > % ./python Lib/timeit.py -s "s='a'*53*1024; e='iso8859_10'; > u=unicode(s, e)" "s.decode(e)" > 100 loops, best of 3: 3.35 msec per loop > > (applied the patch, improved charmap decoder) > > % ./python Lib/timeit.py -s "s='a'*53*1024; e='iso8859_10'; > u=unicode(s, e)" "s.decode(e)" > 1000 loops, best of 3: 1.11 msec per loop > > (the fastmap decoder) > > % ./python Lib/timeit.py -s "s='a'*53*1024; e='iso8859_10_fc'; > u=unicode(s, e)" "s.decode(e)" > 1000 loops, best of 3: 1.04 msec per loop > > (utf-8 decoder) > > % ./python Lib/timeit.py -s "s='a'*53*1024; e='utf_8'; u=unicode(s, > e)" "s.decode(e)" > 1000 loops, best of 3: 851 usec per loop > > Walter's decoder and the fastmap decoder run in mostly same way. > So the performance difference is quite minor. Perhaps, the minor > difference came from the existence of wrapper function on each codecs; > the fastmap codec provides functions usable as Codecs.{en,de}code > directly. 
> > (encoding, charmap codec) > > % ./python Lib/timeit.py -s "s='a'*53*1024; e='iso8859_10'; > u=unicode(s, e)" "u.encode(e)" > 100 loops, best of 3: 3.51 msec per loop > > (encoding, fastmap codec) > > % ./python Lib/timeit.py -s "s='a'*53*1024; e='iso8859_10_fc'; > u=unicode(s, e)" "u.encode(e)" > 1000 loops, best of 3: 536 usec per loop > > (encoding, utf-8 codec) > > % ./python Lib/timeit.py -s "s='a'*53*1024; e='utf_8'; u=unicode(s, > e)" "u.encode(e)" > 1000 loops, best of 3: 1.5 msec per loop I wonder why the UTF-8 codec is slower than the fastmap codec in this case. > If the encoding optimization can be easily done in Walter's approach, > the fastmap codec would be too expensive way for the objective because > we must maintain not only fastmap but also charmap for backward > compatibility. Indeed. Let's go with a patched charmap codec then. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Oct 06 2005) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From ncoghlan at gmail.com Thu Oct 6 12:06:22 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 06 Oct 2005 20:06:22 +1000 Subject: [Python-Dev] Python 2.5 and ast-branch In-Reply-To: References: <5.1.1.6.0.20051003125254.01f95e50@mail.telecommunity.com> <5.1.1.6.0.20051003151012.01f95ca0@mail.telecommunity.com> <2mu0fxekdz.fsf@starship.python.net> <43424480.9080900@gmail.com> <4343D532.2030202@gmail.com> Message-ID: <4344F71E.9000708@gmail.com> [Brett] > To answer Nick's email here, I didn't respond to that initial email > because it seemed specifically directed at Guido and not me. Fair enough. 
I think I was actually misremembering the sequence of events leading up to 2.4a1, so the question was less appropriate for Guido than I thought :) [Guido] > On 10/5/05, Guido van Rossum wrote: >>Given the total lack of response, I have a different suggestion. Let's >>*abandon* the AST-branch. We're fooling ourselves believing that we >>can ever switch to that branch, no matter how theoretically better it >>is. [Brett] > Since the original people who have done the majority of the work > (Jeremy, Tim, Neal, Nick, logistix, and myself) have fallen so far > behind this probably is not a bad decision. Obviously I would like to > see the work pan out, but since I personally just have not found the > time to shuttle the branch the rest of the way I really am in no > position to say much in terms of objecting to its demise. If we kill the branch for now, then anyone that wants to bring up the idea again can write a PEP first, not only to articulate the benefits of switching to an AST compiler (Jeremy has a few notes scattered around the web on that front), but also to propose a solid migration strategy. We tried the "develop in parallel, switch when done" approach; it doesn't seem to have worked due to the way it split developer effort between the branches, and both the HEAD and ast-branch ended up losing out. > Maybe I can come up with a new design and get my dissertation out of it. =) A strategy that may work out better is to develop something independent of the Python core that can:

1. Produce an ASDL based AST structure from:
   - Python source code
   - CPython 'AST'
   - CPython bytecode
2. Parse an ASDL based AST structure and produce:
   - Python source code
   - CPython 'AST'
   - CPython bytecode

That is, initially develop an enhanced replacement for the compiler package, rather than aiming directly to replace the actual CPython compiler. Then the folks who want to do serious bytecode hacking can reverse compile the bytecode on the fly ;) Cheers, Nick.
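Part of the round trip described here — source to AST to source or bytecode — eventually became possible with the standard library's ast module (ast.unparse needs Python 3.9+). As a latter-day illustration, not what was available in 2005:

```python
import ast

source = "def double(x):\n    return x * 2\n"

tree = ast.parse(source)                 # source -> AST
regenerated = ast.unparse(tree)          # AST -> source
code = compile(tree, "<demo>", "exec")   # AST -> bytecode

ns = {}
exec(code, ns)                           # run the compiled bytecode
```

The bytecode-to-AST direction, by contrast, never landed in the standard library.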
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.blogspot.com From hyeshik at gmail.com Thu Oct 6 13:33:11 2005 From: hyeshik at gmail.com (Hye-Shik Chang) Date: Thu, 6 Oct 2005 20:33:11 +0900 Subject: [Python-Dev] Unicode charmap decoders slow In-Reply-To: <4344EACE.5070102@egenix.com> References: <20051004022548.GC7081@unpythonic.net> <167C9F31-31A9-4679-94A3-097191786241@livinglogic.de> <4343748A.9050105@v.loewis.de> <4343EC54.7090201@livinglogic.de> <43441E04.3060307@v.loewis.de> <43443B5E.5010606@egenix.com> <4f0b69dc0510052011u31847835q1abacbb35afd0a15@mail.gmail.com> <4344EACE.5070102@egenix.com> Message-ID: <4f0b69dc0510060433l1ce316coe769eaf65ae34b1a@mail.gmail.com> On 10/6/05, M.-A. Lemburg wrote: > Hye-Shik Chang wrote: > > (encoding, fastmap codec) > > > > % ./python Lib/timeit.py -s "s='a'*53*1024; e='iso8859_10_fc'; > > u=unicode(s, e)" "u.encode(e)" > > 1000 loops, best of 3: 536 usec per loop > > > > (encoding, utf-8 codec) > > > > % ./python Lib/timeit.py -s "s='a'*53*1024; e='utf_8'; u=unicode(s, > > e)" "u.encode(e)" > > 1000 loops, best of 3: 1.5 msec per loop > > I wonder why the UTF-8 codec is slower than the fastmap > codec in this case. I guess that resizing made the difference. fastmap encoder doesn't resize the output buffer at all in the test case while UTF-8 encoder allocates 4*53*1024 bytes and resizes it to 53*1024 bytes in the end. Hye-Shik From mfb at lotusland.dyndns.org Thu Oct 6 14:36:51 2005 From: mfb at lotusland.dyndns.org (Matthew F. Barnes) Date: Thu, 6 Oct 2005 07:36:51 -0500 (CDT) Subject: [Python-Dev] Lexical analysis and NEWLINE tokens Message-ID: <23766.64.141.129.62.1128602211.squirrel@localhost> I posted this question to python-help, but I think I have a better chance of getting the answer here. I'm looking for clarification on when NEWLINE tokens are generated during lexical analysis of Python source code. 
In particular, I'm confused about some of the top-level components in Python's grammar (file_input, interactive_input, and eval_input). Section 2.1.7 of the reference manual states that blank lines (lines consisting only of whitespace and possibly a comment) do not generate NEWLINE tokens. This is supported by the definition of a suite, which does not allow for standalone or consecutive NEWLINE tokens. suite ::= stmt_list NEWLINE | NEWLINE INDENT statement+ DEDENT Yet the grammar for top-level components seems to suggest that a parsable input may consist entirely of a single NEWLINE token, or include consecutive NEWLINE tokens. file_input ::= (NEWLINE | statement)* interactive_input ::= [stmt_list] NEWLINE | compound_stmt NEWLINE eval_input ::= expression_list NEWLINE* To me this seems to contradict section 2.1.7 in so far as I don't see how it's possible to generate such a sequence of tokens. What kind of input would generate NEWLINE tokens in the top-level components of the grammar? Matthew Barnes matthew at barnes.net From walter at livinglogic.de Thu Oct 6 14:40:24 2005 From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=) Date: Thu, 06 Oct 2005 14:40:24 +0200 Subject: [Python-Dev] Unicode charmap decoders slow In-Reply-To: <4344E9DF.4060706@egenix.com> References: <20051004022548.GC7081@unpythonic.net> <167C9F31-31A9-4679-94A3-097191786241@livinglogic.de> <4342DCEC.5020204@v.loewis.de> <4343748A.9050105@v.loewis.de> <4343EC54.7090201@livinglogic.de> <43441E04.3060307@v.loewis.de> <43443B5E.5010606@egenix.com> <4f0b69dc0510052011u31847835q1abacbb35afd0a15@mail.gmail.com> <4344CC6E.40206@v.loewis.de> <4344E5A3.7060401@livinglogic.de> <4344E9DF.4060706@egenix.com> Message-ID: <43451B38.5030203@livinglogic.de> M.-A. Lemburg wrote: > [...] >>Or we could have a function that recreates the dictionary from the string. > > Actually, I'd prefer that these operations be done by the > codec generator script, so that we don't have additional > startup time. 
The dictionaries should then no longer be > generated; instead, only the string should be generated. I'd like the comments to stay, though. > This can be done like this (using string concatenation > applied by the compiler): > > decoding_charmap = ( > u'x' # 0x0000 -> 0x0078 LATIN SMALL LETTER X > u'y' # 0x0001 -> 0x0079 LATIN SMALL LETTER Y > ... > ) I'd prefer that too. > Either way, monkey patching the codec won't work anymore. > Doesn't really matter, though, as this was never officially > supported. > > We've always told people to write their own codecs > if they need to modify an existing one and then hook it into > the system using either a new codec search function or by > adding an appropriate alias. OK, so can someone update gencodec.py and recreate the charmap codecs? BTW, is codecs.make_encoding_map part of the official API, or can we change it to expect a string instead of a dictionary? Bye, Walter Dörwald From tonynelson at georgeanelson.com Wed Oct 5 20:03:03 2005 From: tonynelson at georgeanelson.com (Tony Nelson) Date: Wed, 5 Oct 2005 14:03:03 -0400 Subject: [Python-Dev] Unicode charmap decoders slow In-Reply-To: <4343748A.9050105@v.loewis.de> References: <4342DCEC.5020204@v.loewis.de> <20051004022548.GC7081@unpythonic.net> <167C9F31-31A9-4679-94A3-097191786241@livinglogic.de> <4342DCEC.5020204@v.loewis.de> Message-ID: At 8:36 AM +0200 10/5/05, Martin v. Löwis wrote: >Tony Nelson wrote: ... >> Encoding can be made fast using a simple hash table with external chaining. >> There are max 256 codepoints to encode, and they will normally be well >> distributed in their lower 8 bits. Hash on the low 8 bits (just mask), and >> chain to an area with 256 entries. Modest storage, normally short chains, >> therefore fast encoding. > >This is what is currently done: a hash map with 256 keys. You are >complaining about the performance of that algorithm. The issue of >external chaining is likely irrelevant: there likely are no collisions, >even though Python uses open addressing.
I think I'm complaining about the implementation, though on decode, not encode. In any case, there are likely to be collisions in my scheme. Over the next few days I will try to do it myself, but I will need to learn Pyrex, some of the Python C API, and more about Python to do it. >>>...I suggest instead just /caching/ the translation in C arrays stored >>>with the codec object. The cache would be invalidated on any write to the >>>codec's mapping dictionary, and rebuilt the next time anything was >>>translated. This would maintain the present semantics, work with current >>>codecs, and still provide the desired speed improvement. > >That is not implementable. You cannot catch writes to the dictionary. I should have been more clear. I am thinking about using a proxy object in the codec's 'encoding_map' and 'decoding_map' slots, that will forward all the dictionary stuff. The proxy will delete the cache on any call which changes the dictionary contents. There are proxy classed and dictproxy (don't know how its implemented yet) so it seems doable, at least as far as I've gotten so far. >> Note that this caching is done by new code added to the existing C >> functions (which, if I have it right, are in unicodeobject.c). No >> architectural changes are made; no existing codecs need to be changed; >> everything will just work > >Please try to implement it. You will find that you cannot. I don't >see how regenerating/editing the codecs could be avoided. Will do! ____________________________________________________________________ TonyN.:' ' From mwh at python.net Thu Oct 6 17:07:47 2005 From: mwh at python.net (Michael Hudson) Date: Thu, 06 Oct 2005 16:07:47 +0100 Subject: [Python-Dev] Lexical analysis and NEWLINE tokens In-Reply-To: <23766.64.141.129.62.1128602211.squirrel@localhost> (Matthew F. 
Barnes's message of "Thu, 6 Oct 2005 07:36:51 -0500 (CDT)") References: <23766.64.141.129.62.1128602211.squirrel@localhost> Message-ID: <2mhdbuela4.fsf@starship.python.net> "Matthew F. Barnes" writes: > I posted this question to python-help, but I think I have a better chance > of getting the answer here. > > I'm looking for clarification on when NEWLINE tokens are generated during > lexical analysis of Python source code. In particular, I'm confused about > some of the top-level components in Python's grammar (file_input, > interactive_input, and eval_input). > > Section 2.1.7 of the reference manual states that blank lines (lines > consisting only of whitespace and possibly a comment) do not generate > NEWLINE tokens. This is supported by the definition of a suite, which > does not allow for standalone or consecutive NEWLINE tokens. > > suite ::= stmt_list NEWLINE | NEWLINE INDENT statement+ DEDENT I don't have the spare brain cells to think about your real problem (sorry) but something to be aware of is that the pseudo EBNF of the reference manual is purely descriptive -- it is not actually used in the parsing of Python code at all. Among other things this means it could well just be wrong :/ The real grammar is Grammar/Grammar in the source distribution. Cheers, mwh -- The Internet is full. Go away. -- http://www.disobey.com/devilshat/ds011101.htm From guido at python.org Thu Oct 6 17:27:13 2005 From: guido at python.org (Guido van Rossum) Date: Thu, 6 Oct 2005 08:27:13 -0700 Subject: [Python-Dev] PEP 343 and __with__ In-Reply-To: References: <5.1.1.6.0.20051003125254.01f95e50@mail.telecommunity.com> <434257B7.9000909@gmail.com> Message-ID: Just a quick note. Nick convinced me that adding __with__ (without losing __enter__ and __exit__!) is a good thing, especially for the decimal context manager. He's got a complete proposal for PEP changes which he'll post here. After a brief feedback period I'll approve his changes and he'll check them into the PEP. 
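[Editorial aside: for readers following the PEP 343 thread, the __enter__/__exit__ protocol being extended here can be sketched in a few lines. The class and attribute names below are made up for illustration; they are not from the PEP text.]

```python
class Resource:
    """Minimal illustration of the __enter__/__exit__ protocol."""
    def __init__(self):
        self.state = "new"

    def __enter__(self):
        self.state = "entered"
        return self              # becomes the target of "with ... as ..."

    def __exit__(self, exc_type, exc_value, traceback):
        self.state = "exited"
        return False             # False: do not suppress exceptions

with Resource() as r:
    assert r.state == "entered"
print(r.state)  # __exit__ has run by the time the block is left
```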
My apologies to Jason for missing the point he was making; thanks to Nick for getting it and turning it into a productive change proposal. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Thu Oct 6 17:30:45 2005 From: guido at python.org (Guido van Rossum) Date: Thu, 6 Oct 2005 08:30:45 -0700 Subject: [Python-Dev] Lexical analysis and NEWLINE tokens In-Reply-To: <2mhdbuela4.fsf@starship.python.net> References: <23766.64.141.129.62.1128602211.squirrel@localhost> <2mhdbuela4.fsf@starship.python.net> Message-ID: I think it is a relic from the distant past, when the lexer did generate NEWLINE for every blank line. I think the only case where you can still get a NEWLINE by itself is in interactive mode. This code is extremely convoluted and may be buggy in end cases; this could explain why you get a continuation prompt after entering a comment in interactive mode... --Guido On 10/6/05, Michael Hudson wrote: > "Matthew F. Barnes" writes: > > > I posted this question to python-help, but I think I have a better chance > > of getting the answer here. > > > > I'm looking for clarification on when NEWLINE tokens are generated during > > lexical analysis of Python source code. In particular, I'm confused about > > some of the top-level components in Python's grammar (file_input, > > interactive_input, and eval_input). > > > > Section 2.1.7 of the reference manual states that blank lines (lines > > consisting only of whitespace and possibly a comment) do not generate > > NEWLINE tokens. This is supported by the definition of a suite, which > > does not allow for standalone or consecutive NEWLINE tokens. > > > > suite ::= stmt_list NEWLINE | NEWLINE INDENT statement+ DEDENT > > I don't have the spare brain cells to think about your real problem > (sorry) but something to be aware of is that the pseudo EBNF of the > reference manual is purely descriptive -- it is not actually used in > the parsing of Python code at all. 
Among other things this means it > could well just be wrong :/ > > The real grammar is Grammar/Grammar in the source distribution. > > Cheers, > mwh > > -- > The Internet is full. Go away. > -- http://www.disobey.com/devilshat/ds011101.htm > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/guido%40python.org > -- --Guido van Rossum (home page: http://www.python.org/~guido/) From pje at telecommunity.com Thu Oct 6 18:11:17 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Thu, 06 Oct 2005 12:11:17 -0400 Subject: [Python-Dev] Lexical analysis and NEWLINE tokens In-Reply-To: <23766.64.141.129.62.1128602211.squirrel@localhost> Message-ID: <5.1.1.6.0.20051006120937.01f6f2a0@mail.telecommunity.com> At 07:36 AM 10/6/2005 -0500, Matthew F. Barnes wrote: >I posted this question to python-help, but I think I have a better chance >of getting the answer here. > >I'm looking for clarification on when NEWLINE tokens are generated during >lexical analysis of Python source code. If you're talking about the "tokenize" module, NEWLINE is only generated following a logical line, which is one that contains code. From nas at arctrix.com Thu Oct 6 18:22:53 2005 From: nas at arctrix.com (Neil Schemenauer) Date: Thu, 6 Oct 2005 16:22:53 +0000 (UTC) Subject: [Python-Dev] Python 2.5 and ast-branch References: <5.1.1.6.0.20051003125254.01f95e50@mail.telecommunity.com> <5.1.1.6.0.20051003151012.01f95ca0@mail.telecommunity.com> <2mu0fxekdz.fsf@starship.python.net> <43424480.9080900@gmail.com> <4343D532.2030202@gmail.com> <4344F71E.9000708@gmail.com> Message-ID: Nick Coghlan wrote: > If we kill the branch for now, then anyone that wants to bring up the idea > again can write a PEP first I still have some (very) small hope that it can be finished. If we don't get it done soon then I fear that it will never happen. 
I had hoped that a SoC student would pick up the task or someone would ask for a grant from the PSF. Oh well. > A strategy that may work out better is [...] Another thought I've had recently is that most of the complexity seems to be in the CST to AST translator. Perhaps having a parser that provided a nicer CST might help. Neil From guido at python.org Thu Oct 6 19:03:00 2005 From: guido at python.org (Guido van Rossum) Date: Thu, 6 Oct 2005 10:03:00 -0700 Subject: [Python-Dev] Python 2.5 and ast-branch In-Reply-To: References: <5.1.1.6.0.20051003125254.01f95e50@mail.telecommunity.com> <5.1.1.6.0.20051003151012.01f95ca0@mail.telecommunity.com> <2mu0fxekdz.fsf@starship.python.net> <43424480.9080900@gmail.com> <4343D532.2030202@gmail.com> <4344F71E.9000708@gmail.com> Message-ID: On 10/6/05, Neil Schemenauer wrote: > Nick Coghlan wrote: > > If we kill the branch for now, then anyone that wants to bring up the idea > > again can write a PEP first > > I still have some (very) small hope that it can be finished. If we > don't get it done soon then I fear that it will never happen. I had > hoped that a SoC student would pick up the task or someone would ask > for a grant from the PSF. Oh well. > > > A strategy that may work out better is [...] > > Another thought I've had recently is that most of the complexity > seems to be in the CST to AST translator. Perhaps having a parser > that provided a nicer CST might help. Dream on, Neil... Adding more work won't make it more likely to happen. The only alternative to abandoning it that I see is to merge it back into main NOW, using the time that remains us until the 2.5 release to make it robust. That way, everybody can help out (and it may motivate more people). Even if this is a temporary regression (e.g. PEP 342), it might be worth it -- but only if there are at least two people committed to help out quickly when there are problems. 
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From BruceEckel-Python3234 at mailblocks.com Thu Oct 6 19:12:05 2005 From: BruceEckel-Python3234 at mailblocks.com (Bruce Eckel) Date: Thu, 6 Oct 2005 11:12:05 -0600 Subject: [Python-Dev] Pythonic concurrency In-Reply-To: <8393fff0510022253l6c37f3f8x6f5254018d6f348e@mail.gmail.com> References: <1128073517.25688.10.camel@p-dvsi-418-1.rd.francetelecom.fr> <8393fff0510011450l2e9dd608hce91ed00516d0935@mail.gmail.com> <1766050860214964952@unknownmsgid> <8393fff0510021449p3607898cp6dcecc615450b46b@mail.gmail.com> <60ed19d40510021619p7f6e2641udf172a0d0d19283e@mail.gmail.com> <8393fff0510022253l6c37f3f8x6f5254018d6f348e@mail.gmail.com> Message-ID: <9410082351.20051006111205@MailBlocks.com> Jeremy Jones published a blog discussing some of the ideas we've talked about here: http://www.oreillynet.com/pub/wlg/8002 Although I hope our conversation isn't done, as he suggests! At some point when more ideas have been thrown about (and TIJ4 is done) I hope to summarize what we've talked about in an article. 
Bruce Eckel http://www.BruceEckel.com mailto:BruceEckel-Python3234 at mailblocks.com Contains electronic books: "Thinking in Java 3e" & "Thinking in C++ 2e" Web log: http://www.artima.com/weblogs/index.jsp?blogger=beckel Subscribe to my newsletter: http://www.mindview.net/Newsletter My schedule can be found at: http://www.mindview.net/Calendar From arathorn at fastwebnet.it Thu Oct 6 19:26:26 2005 From: arathorn at fastwebnet.it (Paolo Invernizzi) Date: Thu, 06 Oct 2005 19:26:26 +0200 Subject: [Python-Dev] Pythonic concurrency In-Reply-To: <9410082351.20051006111205@MailBlocks.com> References: <1128073517.25688.10.camel@p-dvsi-418-1.rd.francetelecom.fr> <8393fff0510011450l2e9dd608hce91ed00516d0935@mail.gmail.com> <1766050860214964952@unknownmsgid> <8393fff0510021449p3607898cp6dcecc615450b46b@mail.gmail.com> <60ed19d40510021619p7f6e2641udf172a0d0d19283e@mail.gmail.com> <8393fff0510022253l6c37f3f8x6f5254018d6f348e@mail.gmail.com> <9410082351.20051006111205@MailBlocks.com> Message-ID: Just to add another 2 cents.... http://www.erights.org/talks/promises/paper/tgc05.pdf --- Paolo Invernizzi Bruce Eckel wrote: > Jeremy Jones published a blog discussing some of the ideas we've > talked about here: > http://www.oreillynet.com/pub/wlg/8002 > Although I hope our conversation isn't done, as he suggests! > > At some point when more ideas have been thrown about (and TIJ4 is > done) I hope to summarize what we've talked about in an article. 
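[Editorial aside: the co-operative scheduling of generators that this concurrency thread keeps returning to can be shown with a toy round-robin scheduler. This is purely illustrative; it is not Kamaelia's API nor any other project's.]

```python
from collections import deque

def scheduler(tasks):
    """Run generator-based tasks round-robin until all are finished."""
    queue = deque(tasks)
    while queue:
        task = queue.popleft()
        try:
            next(task)          # run the task up to its next yield
        except StopIteration:
            continue            # task finished; drop it
        queue.append(task)      # otherwise re-queue it

def worker(name, steps, log):
    for i in range(steps):
        log.append((name, i))
        yield                   # co-operatively give up control

log = []
scheduler([worker("a", 2, log), worker("b", 2, log)])
print(log)  # -> [('a', 0), ('b', 0), ('a', 1), ('b', 1)]
```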
From jeremy at alum.mit.edu Thu Oct 6 21:42:47 2005 From: jeremy at alum.mit.edu (Jeremy Hylton) Date: Thu, 6 Oct 2005 15:42:47 -0400 Subject: [Python-Dev] Python 2.5 and ast-branch In-Reply-To: References: <5.1.1.6.0.20051003125254.01f95e50@mail.telecommunity.com> <2mu0fxekdz.fsf@starship.python.net> <43424480.9080900@gmail.com> <4343D532.2030202@gmail.com> <4344F71E.9000708@gmail.com> Message-ID: On 10/6/05, Guido van Rossum wrote: > On 10/6/05, Neil Schemenauer wrote: > > Nick Coghlan wrote: > > > If we kill the branch for now, then anyone that wants to bring up the idea > > > again can write a PEP first > > > > I still have some (very) small hope that it can be finished. If we > > don't get it done soon then I fear that it will never happen. I had > > hoped that a SoC student would pick up the task or someone would ask > > for a grant from the PSF. Oh well. > > > > > A strategy that may work out better is [...] > > > > Another thought I've had recently is that most of the complexity > > seems to be in the CST to AST translator. Perhaps having a parser > > that provided a nicer CST might help. > > Dream on, Neil... Adding more work won't make it more likely to happen. You're both right. The CST-to-AST translator is fairly complex; it would be better to parse directly to an AST. On the other hand, the AST translator seems fairly complete and not particularly hard to write. I'd love to see a new parser in 2.6. > The only alternative to abandoning it that I see is to merge it back > into main NOW, using the time that remains us until the 2.5 release to > make it robust. That way, everybody can help out (and it may motivate > more people). > > Even if this is a temporary regression (e.g. PEP 342), it might be > worth it -- but only if there are at least two people committed to > help out quickly when there are problems. I'm sorry I didn't respond earlier. I've been home with a new baby for the last six weeks and haven't been keeping a close eye on my email. 
(I didn't see Nick's earlier email until his most recent post.) It would take a few days of work to get the branch ready to merge to the head. There are basic issues like renaming newcompile.c to compile.c and the like. I could work on that tomorrow and Monday. I did do a little work on the ast branch earlier this week. The remaining issues feel pretty manageable, so you can certainly count me as one of the two people committed to help out. I'll make a point of keeping a closer eye on python-dev email, in addition to writing some code. Jeremy From ms at cerenity.org Thu Oct 6 21:54:56 2005 From: ms at cerenity.org (Michael Sparks) Date: Thu, 6 Oct 2005 20:54:56 +0100 Subject: [Python-Dev] Pythonic concurrency In-Reply-To: <9410082351.20051006111205@MailBlocks.com> References: <8393fff0510022253l6c37f3f8x6f5254018d6f348e@mail.gmail.com> <9410082351.20051006111205@MailBlocks.com> Message-ID: <200510062054.56985.ms@cerenity.org> Hi Bruce, On Thursday 06 October 2005 18:12, Bruce Eckel wrote: > Although I hope our conversation isn't done, as he suggests! ... > At some point when more ideas have been thrown about (and TIJ4 is > done) I hope to summarize what we've talked about in an article. I don't know if you saw my previous post[1] to python-dev on this topic, but Kamaelia is specifically aimed at making concurrency simple and easy to use. Initially we were focussed on using scheduled generators for co-operative CSP-style (but with buffers) concurrency. [1] http://tinyurl.com/dfnah, http://tinyurl.com/e4jfq We've tested the system so far on 2 relatively inexperienced programmers (as well as experienced, but the more interesting group is novices). The one who hadn't done much programming at all (a little bit of VB, pre-university) actually fared better IMO. This is probably because concurrency became part of his standard toolbox of approaches. 
I've placed the slides I've produced for Euro OSCON on Kamaelia here: * http://cerenity.org/KamaeliaEuroOSCON2005.pdf The corrected URL for the whitepaper based on work now 6 months old (we've come quite a way since then!) is here: * http://www.bbc.co.uk/rd/pubs/whp/whp113.shtml Consider a simple server for sending text (generated by a user typing into the server) to multiple clients connecting to a server. This is a naturally concurrent problem in various ways (user interaction, splitting, listening for connections, serving connections, etc). Why is that interesting to us? It's effectively a microcosm of how subtitling works. (I work at the BBC) In Kamaelia this looks like this: === start === class ConsoleReader(threadedcomponent): def run(self): while 1: line = raw_input(">>> ") line = line + "\n" self.outqueues["outbox"].put(line) Backplane("subtitles").activate() pipeline( ConsoleReader(), publishTo("subtitles"), ).activate() def subtitles_protocol(): return subscribeTo("subtitles") SimpleServer(subtitles_protocol, 5000).run() === end === The ConsoleReader is threaded to allow the use of the naive way of reading from the input, whereas the server, backplane (a named splitter component in practice), pipelines, publishing, subscribing, splitting, etc are all single threaded co-operative concurrency. A possible client for this text service might be: pipeline( TCPClient("subtitles.rd.bbc.co.uk", 5000), Ticker(), ).run() (Though that would be a bit bare, even if it does use pygame :) The entire system is based around communicating generators, but we also have threads for blocking operations. (Though the entire network subsystem is non-blocking) What I'd be interested in, is hearing how our system doesn't match with the goals of the hypothetical concurrency system you'd like to see (if it doesn't). The main reason I'm interested in hearing this, is because the goals you listed are ones we want to achieve. 
If you don't think our system matches it (we don't have process migration as yet, so that's one area) I'd be interested in hearing what areas you think are deficient. However, the way we're beginning to refer to the project is to refer to just the component aspect rather than concurrency - for one simple reason - we're getting to stage where we can ignore /most/ concurrency issues(not all). If you have any time for feedback, it'd be appreciated. If you don't I hope it's useful food for thought! Best Regards, Michael -- Michael Sparks, Senior R&D Engineer, Digital Media Group Michael.Sparks at rd.bbc.co.uk, http://kamaelia.sourceforge.net/ British Broadcasting Corporation, Research and Development Kingswood Warren, Surrey KT20 6NP This e-mail may contain personal views which are not the views of the BBC. From BruceEckel-Python3234 at mailblocks.com Thu Oct 6 22:06:37 2005 From: BruceEckel-Python3234 at mailblocks.com (Bruce Eckel) Date: Thu, 6 Oct 2005 14:06:37 -0600 Subject: [Python-Dev] Pythonic concurrency In-Reply-To: <200510062054.56985.ms@cerenity.org> References: <8393fff0510022253l6c37f3f8x6f5254018d6f348e@mail.gmail.com> <9410082351.20051006111205@MailBlocks.com> <200510062054.56985.ms@cerenity.org> Message-ID: <1093762964.20051006140637@MailBlocks.com> This does look quite fascinating, and I know there's a lot of really interesting work going on at the BBC now -- looks like some really pioneering stuff going on with respect to TV show distribution over the internet, new compression formats, etc. So yes indeed, this is quite high on my list to research. Looks like people there have been doing some interesting work. Right now I'm just trying to cast a net, so that people can put in ideas, for when the Java book is done and I can spend more time on it. Thursday, October 6, 2005, 1:54:56 PM, Michael Sparks wrote: > Hi Bruce, > On Thursday 06 October 2005 18:12, Bruce Eckel wrote: >> Although I hope our conversation isn't done, as he suggests! > ... 
>> At some point when more ideas have been thrown about (and TIJ4 is >> done) I hope to summarize what we've talked about in an article. > I don't know if you saw my previous post[1] to python-dev on this topic, but > Kamaelia is specifically aimed at making concurrency simple and easy to use. > Initially we were focussed on using scheduled generators for co-operative > CSP-style (but with buffers) concurrency. > [1] http://tinyurl.com/dfnah, http://tinyurl.com/e4jfq > We've tested the system so far on 2 relatively inexperienced programmers > (as well as experienced, but the more interesting group is novices). The one > who hadn't done much programming at all (a little bit of VB, pre-university) > actually fared better IMO. This is probably because concurrency became > part of his standard toolbox of approaches. > I've placed the slides I've produced for Euro OSCON on Kamaelia here: > * http://cerenity.org/KamaeliaEuroOSCON2005.pdf > The corrected URL for the whitepaper based on work now 6 months old (we've > come quite a way since then!) is here: > * http://www.bbc.co.uk/rd/pubs/whp/whp113.shtml > Consider a simple server for sending text (generated by a user typing into the > server) to multiple clients connecting to a server. This is a naturally > concurrent problem in various ways (user interaction, splitting, listening > for connections, serving connections, etc). Why is that interesting to us? > It's effectively a microcosm of how subtitling works. 
(I work at the BBC) > In Kamaelia this looks like this: > === start === > class ConsoleReader(threadedcomponent): > def run(self): > while 1: > line = raw_input(">>> ") > line = line + "\n" > self.outqueues["outbox"].put(line) > Backplane("subtitles").activate() > pipeline( > ConsoleReader(), > publishTo("subtitles"), > ).activate() > def subtitles_protocol(): > return subscribeTo("subtitles") > SimpleServer(subtitles_protocol, 5000).run() > === end === > The ConsoleReader is threaded to allow the use of the naive way of > reading from the input, whereas the server, backplane (a named splitter > component in practice), pipelines, publishing, subscribing, splitting, > etc are all single threaded co-operative concurrency. > A possible client for this text service might be: > pipeline( > TCPClient("subtitles.rd.bbc.co.uk", 5000), > Ticker(), > ).run() > (Though that would be a bit bare, even if it does use pygame :) > The entire system is based around communicating generators, but we also > have threads for blocking operations. (Though the entire network subsystem > is non-blocking) > What I'd be interested in, is hearing how our system doesn't match with > the goals of the hypothetical concurrency system you'd like to see (if it > doesn't). The main reason I'm interested in hearing this, is because the > goals you listed are ones we want to achieve. If you don't think our system > matches it (we don't have process migration as yet, so that's one area) > I'd be interested in hearing what areas you think are deficient. > However, the way we're beginning to refer to the project is to refer to > just the component aspect rather than concurrency - for one simple > reason - we're getting to stage where we can ignore /most/ concurrency > issues(not all). > If you have any time for feedback, it'd be appreciated. If you don't I hope > it's useful food for thought! 
> Best Regards, > Michael Bruce Eckel http://www.BruceEckel.com mailto:BruceEckel-Python3234 at mailblocks.com Contains electronic books: "Thinking in Java 3e" & "Thinking in C++ 2e" Web log: http://www.artima.com/weblogs/index.jsp?blogger=beckel Subscribe to my newsletter: http://www.mindview.net/Newsletter My schedule can be found at: http://www.mindview.net/Calendar From marvinpublic at comcast.net Thu Oct 6 23:22:09 2005 From: marvinpublic at comcast.net (Marvin) Date: Thu, 06 Oct 2005 17:22:09 -0400 Subject: [Python-Dev] Static builds on Windows (continued) In-Reply-To: References: Message-ID: <43459581.4060509@comcast.net> > Date: Wed, 05 Oct 2005 00:21:20 +0200 > From: "Martin v. Löwis" > Subject: Re: [Python-Dev] Static builds on Windows (continued) > Cc: python-dev at python.org > Message-ID: <43430060.6070909 at v.loewis.de> > Content-Type: text/plain; charset=ISO-8859-1; format=flowed > > Marvin wrote: > >>I built pythoncore and python. The resulting python.exe worked fine, but did >>indeed fail when I tried to dynamically load anything (Dialog said: the >>application terminated abnormally) > > > Not sure what you are trying to do here. In your case, dynamic loading > simply cannot work. The extension modules all link with python24.dll, > which you don't have. It may find some python24.dll, which then gives > conflicts with the Python interpreter that is already running. > > So what you really should do is disable dynamic loading entirely. To do > so, remove dynload_win from your project, and #undef > HAVE_DYNAMIC_LOADING in PC/pyconfig.h. > > Not sure if anybody has recently tested whether this configuration > actually works - if you find that it doesn't, please post your patches > to sf.net/projects/python. > > If you really want to provide dynamic loading of some kind, you should > arrange the extension modules to import the symbols from your .exe. > Linking the exe should generate an import library, and you should link > the extensions against that.
> > HTH, > Martin > I'll try that when I get back to this and feed back my results. I figured out that I can avoid the need for dynamic loading. I wanted to use some existing extension modules, but the whole point was to use the existing ones, which, as you point out, are linked against a DLL. So even if I created an .EXE that exported the symbols, I'd still have to rebuild the extensions. From jcarlson at uci.edu Fri Oct 7 00:15:07 2005 From: jcarlson at uci.edu (Josiah Carlson) Date: Thu, 06 Oct 2005 15:15:07 -0700 Subject: [Python-Dev] Pythonic concurrency In-Reply-To: <200510062054.56985.ms@cerenity.org> References: <9410082351.20051006111205@MailBlocks.com> <200510062054.56985.ms@cerenity.org> Message-ID: <20051006143740.287E.JCARLSON@uci.edu> Michael Sparks wrote: > What I'd be interested in, is hearing how our system doesn't match with > the goals of the hypothetical concurrency system you'd like to see (if it > doesn't). The main reason I'm interested in hearing this, is because the > goals you listed are ones we want to achieve. If you don't think our system > matches it (we don't have process migration as yet, so that's one area) > I'd be interested in hearing what areas you think are deficient. I've not used the system you have worked on, so perhaps this is easy, but the vast majority of concurrency issues can be described as fitting into one or more of the following task distribution categories:

1. one to many (one producer, many consumers) without duplication (no consumer has the same data, essentially a distributed queue)
2. one to many (one producer, many consumers) with duplication (the producer broadcasts to all consumers)
3. many to one (many producers, one consumer)
4. many to many (many producers, many consumers) without duplication (no consumer has the same data, essentially a distributed queue)
5. many to many (many producers, many consumers) with duplication (all producers broadcast to all consumers)
6. one to one without duplication

MPI, for example, handles all the above cases with minor work, and tuple space systems such as Linda can support all of the above with a bit of work in cases 2 and 5. If Kamaelia is able to handle all of the above mechanisms in both a blocking and non-blocking fashion, then I would guess it has the basic requirements for most concurrent applications. If, however, it is not able to easily handle all of the above mechanisms, or has issues with blocking and/or non-blocking semantics on the producer and/or consumer end, then it is likely that it will have difficulty gaining traction in certain applications where the unsupported mechanism is common and/or necessary. One nice thing about the message queue style (which it seems as though Kamaelia implements) is that it guarantees that a listener won't receive the same message twice when broadcasting a message to multiple listeners (cases 2 and 5 above) - something that is a bit more difficult to guarantee in a tuple space scenario, but which is still possible (which spurs me to add it to my tuple space implementation before it is released). Another nice thing is that subscriptions to a queue seem to be persistent in Kamaelia, which I should also implement. - Josiah From kbk at shore.net Fri Oct 7 02:00:29 2005 From: kbk at shore.net (Kurt B. Kaiser) Date: Thu, 06 Oct 2005 20:00:29 -0400 Subject: [Python-Dev] Python 2.5 and ast-branch In-Reply-To: (Jeremy Hylton's message of "Thu, 6 Oct 2005 15:42:47 -0400") References: <5.1.1.6.0.20051003125254.01f95e50@mail.telecommunity.com> <2mu0fxekdz.fsf@starship.python.net> <43424480.9080900@gmail.com> <4343D532.2030202@gmail.com> <4344F71E.9000708@gmail.com> Message-ID: <87irwadwma.fsf@hydra.bayview.thirdcreek.com> Jeremy Hylton writes: > On 10/6/05, Guido van Rossum wrote: >> The only alternative to abandoning it that I see is to merge it back >> into main NOW, using the time that remains us until the 2.5 release to >> make it robust.
That way, everybody can help out (and it may motivate >> more people). >> >> Even if this is a temporary regression (e.g. PEP 342), it might be >> worth it -- but only if there are at least two people committed to >> help out quickly when there are problems. > > I'm sorry I didn't respond earlier. I've been home with a new baby > for the last six weeks and haven't been keeping a close eye on my > email. (I didn't see Nick's earlier email until his most recent > post.) > > It would take a few days of work to get the branch ready to merge to > the head. There are basic issues like renaming newcompile.c to > compile.c and the like. I could work on that tomorrow and Monday. Unless I'm missing something, we would need to merge HEAD to the AST branch once more to pick up the changes in MAIN since the last merge, and then make sure everything in the AST branch is passing the test suite. Otherwise we risk having MAIN broken for a while following a merge. Finally, we can then merge the diff of HEAD to AST back into MAIN. If we try to merge the entire AST branch since its inception, we will re-apply to MAIN those changes made in MAIN which have already been merged to the AST branch, and it will be difficult to sort out all the conflicts. If we try to merge the AST branch from its last merge tag to its head, we will miss the work done on AST prior to that merge. Let me know at kbk at shore.net if you want to do this. -- KBK From raymond.hettinger at verizon.net Fri Oct 7 02:26:07 2005 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Thu, 06 Oct 2005 20:26:07 -0400 Subject: [Python-Dev] Python 2.5 and ast-branch In-Reply-To: <87irwadwma.fsf@hydra.bayview.thirdcreek.com> Message-ID: <000001c5cad5$b6b83ee0$a105a044@oemcomputer> > Unless I'm missing something, we would need to merge HEAD to the AST > branch once more to pick up the changes in MAIN since the last merge, > and then make sure everything in the AST branch is passing the test > suite.
Otherwise we risk having MAIN broken for a while following a > merge. IMO, merging to the head is a somewhat dangerous strategy that doesn't have any benefits. Whether done on the head or in the branch, the same amount of work needs to be done. If the stability of the head is disrupted, it may impede other maintenance efforts because it is harder to test bug fixes when the test suites are not passing. From ms at cerenity.org Fri Oct 7 02:45:16 2005 From: ms at cerenity.org (Michael Sparks) Date: Fri, 7 Oct 2005 01:45:16 +0100 Subject: [Python-Dev] Pythonic concurrency In-Reply-To: <20051006143740.287E.JCARLSON@uci.edu> References: <9410082351.20051006111205@MailBlocks.com> <200510062054.56985.ms@cerenity.org> <20051006143740.287E.JCARLSON@uci.edu> Message-ID: <200510070145.17284.ms@cerenity.org> On Thursday 06 October 2005 23:15, Josiah Carlson wrote: [... 6 specific use cases ...] > If Kamaelia is able to handle all of the above mechanisms in both a > blocking and non-blocking fashion, then I would guess it has the basic > requirements for most concurrent applications. It can. I can easily knock up examples for each if required :-) That said, a more interesting example implemented this week (as part of a rapid prototyping project to look at collaborative community radio) implements a networked audio mixer matrix. That allows multiple sources of audio to be mixed and sent on to multiple destinations, which may be duplicate mixes of each other, but may also select different mixes. The same system also includes point to point communications for network control of the mix. That application covers ( I /think/ ) 1, 2, 3, 4, and 6 on your list of things as I understand what you mean. 5 is fairly trivial though. (The largest bottleneck in writing it was my personal misunderstanding of how to actually mix 16bit signed audio :-) Regarding blocking & non-blocking, links can be marked as synchronous, which forces blocking style behaviour.
Since generally we're using generators, we can't block for real, which is why we throw an exception there. However, threaded components can & do block. The reason for this was due to the architecture being inspired by noting the similarities between asynchronous hardware systems/languages and network systems. > into my tuple space implementation before it is released. I'd be interested in hearing more about that BTW. One thing we've found is that just as organic systems have a neural system for communications between things (hence Axon :), you also need the equivalent of a hormonal system. In the unix shell world, IMO the environment acts as that for pipelines, and similarly that's why we have an assistant system. (Which has key/value lookup facilities) It's a less obvious requirement, but is a useful one nonetheless, so I don't really see a message passing style as excluding a Linda approach - since they're orthogonal approaches. Best Regards, Michael. -- Michael Sparks, Senior R&D Engineer, Digital Media Group Michael.Sparks at rd.bbc.co.uk, http://kamaelia.sourceforge.net/ British Broadcasting Corporation, Research and Development Kingswood Warren, Surrey KT20 6NP This e-mail may contain personal views which are not the views of the BBC. From guido at python.org Fri Oct 7 04:34:11 2005 From: guido at python.org (Guido van Rossum) Date: Thu, 6 Oct 2005 19:34:11 -0700 Subject: [Python-Dev] Python 2.5 and ast-branch In-Reply-To: <000001c5cad5$b6b83ee0$a105a044@oemcomputer> References: <87irwadwma.fsf@hydra.bayview.thirdcreek.com> <000001c5cad5$b6b83ee0$a105a044@oemcomputer> Message-ID: [Kurt] > > Unless I'm missing something, we would need to merge HEAD to the AST > > branch once more to pick up the changes in MAIN since the last merge, > > and then make sure everything in the AST branch is passing the test > > suite. Otherwise we risk having MAIN broken for a while following a > > merge.
[Raymond] > IMO, merging to the head is a somewhat dangerous strategy that doesn't > have any benefits. Whether done on the head or in the branch, the same > amount of work needs to be done. > > If the stability of the head is disrupted, it may impede other > maintenance efforts because it is harder to test bug fixes when the test > suites are not passing. Well, at some point it will HAVE to be merged into the head. The longer we wait the more painful it will be. If we suffer a week of instability now, I think that's acceptable, as long as all developers are suitably alerted, and as long as the AST team works towards resolving the issues ASAP. I happen to agree with Kurt that we should first merge the head into the branch; then the AST team can work on making sure the entire test suite passes; then they can merge back into the head. BUT this should only be done with a serious commitment from the AST team (I think Neil and Jeremy are offering this -- I just don't know how much time they will have available, realistically). My main point is, we should EITHER abandon the AST branch, OR force a quick resolution. I'm willing to suffer a week of instability in head now, or in a week or two -- but I'm not willing to wait again. Let's draw a line in the sand. The AST team (which includes whoever will help) has up to three weeks to get the AST branch into a position where it passes all the current unit tests merged in from the head. Then they merge it into the head, after which we can accept at most a week of instability in the head. After that the AST team must remain available to resolve remaining issues quickly. How does this sound to the non-AST-branch developers who have to suffer the inevitable post-merge instability? I think it's now or never -- waiting longer isn't going to make this thing easier (not with several more language changes approved: with-statement, extended import, what else...) What does the AST team think?
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From pje at telecommunity.com Fri Oct 7 04:42:16 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Thu, 06 Oct 2005 22:42:16 -0400 Subject: [Python-Dev] Python 2.5 and ast-branch In-Reply-To: References: <000001c5cad5$b6b83ee0$a105a044@oemcomputer> <87irwadwma.fsf@hydra.bayview.thirdcreek.com> <000001c5cad5$b6b83ee0$a105a044@oemcomputer> Message-ID: <5.1.1.6.0.20051006224020.01f76148@mail.telecommunity.com> At 07:34 PM 10/6/2005 -0700, Guido van Rossum wrote: >How does this sound to the non-AST-branch developers who have to >suffer the inevitable post-merge instability? I think it's now or >never -- waiting longer isn't going to make this thing easier (not >with several more language changes approved: with-statement, extended >import, what else...) Do the AST branch changes affect the interface of the "parser" module? Or do they just add new functionality? From bcannon at gmail.com Fri Oct 7 05:48:40 2005 From: bcannon at gmail.com (Brett Cannon) Date: Thu, 6 Oct 2005 20:48:40 -0700 Subject: [Python-Dev] Python 2.5 and ast-branch In-Reply-To: References: <87irwadwma.fsf@hydra.bayview.thirdcreek.com> <000001c5cad5$b6b83ee0$a105a044@oemcomputer> Message-ID: On 10/6/05, Guido van Rossum wrote: > [Kurt] > > > Unless I'm missing something, we would need to merge HEAD to the AST > > > branch once more to pick up the changes in MAIN since the last merge, > > > and then make sure everything in the AST branch is passing the test > > > suite. Otherwise we risk having MAIN broken for awhile following a > > > merge. > > [Raymond] > > IMO, merging to the head is a somewhat dangerous strategy that doesn't > > have any benefits. Whether done on the head or in the branch, the same > > amount of work needs to be done. > > > > If the stability of the head is disrupted, it may impede other > > maintenance efforts because it is harder to test bug fixes when the test > > suites are not passing. 
> > Well, at some point it will HAVE to be merged into the head. The > longer we wait the more painful it will be. If we suffer a week of > instability now, I think that's acceptable, as long as all developers > are suitably alerted, and as long as the AST team works towards > resolving the issues ASAP. > > I happen to agree with Kurt that we should first merge the head into > the branch; then the AST team can work on making sure the entire test > suite passes; then they can merge back into the head. > > BUT this should only be done with a serious commitment from the AST > team (I think Neil and Jeremy are offering this -- I just don't know > how much time they will have available, realistically). > > My main point is, we should EITHER abandon the AST branch, OR force a > quick resolution. I'm willing to suffer a week of instability in head > now, or in a week or two -- but I'm not willing to wait again. > > Let's draw a line in the sand. The AST team (which includes whoever > will help) has up to three weeks to get the AST branch into a position > where it passes all the current unit tests merged in from the head. > Then they merge it into the head after which we can accept at most a > week of instability in the head. After that the AST team must remain > available to resolve remaining issues quickly. > So basically we have until November 1 to get all tests passing? For anyone who wants a snapshot of where things stand, http://www.python.org/sf/1191458 lists the tests that are currently failing (read the comments to get the current list; count is at 14). All AST-related tracker items are under the AST group so filtering to just AST stuff is easy. I am willing to guess a couple of those tests will start passing as soon as http://www.python.org/sf/1246473 is dealt with (this is just based on looking at some of the failure output seeming to be off by one). As of right now the lnotab only has statement granularity when it really needs expression granularity.
That requires tweaking all instances where an expression node is created to also take in the line number of where the expression exists. This fix is one of the main reasons I have not touched the AST branch; it is not difficult, but it is not exactly fun or small either. =) > How does this sound to the non-AST-branch developers who have to > suffer the inevitable post-merge instability? I think it's now or > never -- waiting longer isn't going to make this thing easier (not > with several more language changes approved: with-statement, extended > import, what else...) > > What does the AST team think? > Well, I have homework this weekend, a midterm two weeks from tomorrow (so the preceding weekend will be studying), and October 23 is my birthday so I will be busy that entire weekend visiting family. In other words Python time is a premium this month. But I will try to squeeze in what time I can. But I think the three week time frame is reasonable to light the fire under our asses to get this thing done (especially if it inspires people to jump in and help out; as always, people interested in joining in, check out the branch and read Python/compile.txt ). -Brett From kbk at shore.net Fri Oct 7 07:01:15 2005 From: kbk at shore.net (Kurt B. Kaiser) Date: Fri, 07 Oct 2005 01:01:15 -0400 Subject: [Python-Dev] Python 2.5 and ast-branch In-Reply-To: (Guido van Rossum's message of "Thu, 6 Oct 2005 19:34:11 -0700") References: <87irwadwma.fsf@hydra.bayview.thirdcreek.com> <000001c5cad5$b6b83ee0$a105a044@oemcomputer> Message-ID: <873bndex9g.fsf@hydra.bayview.thirdcreek.com> Guido van Rossum writes: > I happen to agree with Kurt that we should first merge the head into > the branch; then the AST team can work on making sure the entire > test suite passes; then they can merge back into the head. I can be available to do this again. It would involve freezing the AST branch for a day. 
Once the AST branch is stable, we would need to freeze everything, merge MAIN to AST one more time to pick up the last few changes in MAIN, and then merge the AST head back to MAIN. By doing these merges from MAIN to AST we would have effectively moved the AST branch point along MAIN to HEAD. So the final join is HEAD to AST, conducted from MAIN. I'll run a local experiment to verify this concept is workable. -- KBK From jcarlson at uci.edu Fri Oct 7 08:25:37 2005 From: jcarlson at uci.edu (Josiah Carlson) Date: Thu, 06 Oct 2005 23:25:37 -0700 Subject: [Python-Dev] Pythonic concurrency In-Reply-To: <200510070145.17284.ms@cerenity.org> References: <20051006143740.287E.JCARLSON@uci.edu> <200510070145.17284.ms@cerenity.org> Message-ID: <20051006221436.2892.JCARLSON@uci.edu> Michael Sparks wrote: > > On Thursday 06 October 2005 23:15, Josiah Carlson wrote: > [... 6 specific use cases ...] > > If Kamaelia is able to handle all of the above mechanisms in both a > > blocking and non-blocking fashion, then I would guess it has the basic > > requirements for most concurrent applications. > > It can. I can easily knock up examples for each if required :-) That's cool, I trust you. One thing I notice as absent from the Kamaelia page is benchmarks. On the one hand, benchmarks are technically useless, as one can tend to benchmark those things that a system does well, and ignore those things that it does poorly (take, for example, how PyLinda's speed test only ever inserts and removes one tuple at a time... try inserting 100k and using wildcards to extract those 100k, and you'll note how poorly it performs; or database benchmarks, etc.). However, if one's benchmarks provide examples from real use, then they show that at least someone has gotten some X performance from the system. I'm personally interested in latency and throughput for varying sizes of data being passed through the system.
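Latency and throughput numbers of the kind being asked about are cheap to gather. Here is a minimal sketch in modern Python, with a plain queue.Queue and two threads standing in for a component linkage (this is illustrative only, not Kamaelia's API):

```python
import queue
import threading
import time

def measure(n_messages, payload_size):
    """Push n_messages of payload_size bytes through a queue.Queue
    between a producer and a consumer thread; return (seconds elapsed,
    messages per second, messages received)."""
    q = queue.Queue()
    payload = b"x" * payload_size
    received = []

    def producer():
        for _ in range(n_messages):
            q.put(payload)
        q.put(None)  # sentinel: tells the consumer to stop

    def consumer():
        while True:
            item = q.get()
            if item is None:
                break
            received.append(len(item))

    start = time.perf_counter()
    threads = [threading.Thread(target=producer),
               threading.Thread(target=consumer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    elapsed = time.perf_counter() - start
    return elapsed, n_messages / elapsed, len(received)

elapsed, rate, count = measure(10_000, 1024)
print(f"{count} messages of 1 KiB in {elapsed:.3f}s ({rate:.0f} msg/s)")
```

Varying `payload_size` then gives the "varying sizes of data" axis; the queue overhead itself is what this measures, so real component frameworks would be expected to come in somewhat slower.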
> That said, a more interesting example implemented this week (as part of > a rapid prototyping project to look at collaborative community radio) > implements an networked audio mixer matrix. That allows mutiple sources of > audio to be mixed, sent on to multiple destinations, may be duplicate mixes > of each other, but also may select different mixes. The same system also > includes point to point communications for network control of the mix. Very neat. How much data? What kind of throughput? What kinds of latencies? > That application covers ( I /think/ ) 1, 2, 3, 4, and 6 on your list of > things as I understand what you mean. 5 is fairly trivial though. Cool. > Regarding blocking & non-blocking, links can be marked to synchronous, which > forces blocking style behaviour. Since generally we're using generators, we > can't block for real which is why we throw an exception there. However, > threaded components can & do block. The reason for this was due to the > architecture being inspired by noting the similarities between asynchronous > hardware systems/langages and network systems. On the client side, I was lazy and used synchronous/blocking sockets to block on read/write (every client thread gets its own connection, meaning that tuple puts are never sitting in a queue). I've also got server-side timeouts for when you don't want to wait too long for data. rslt = tplspace.get(PATTERN, timeout=None) > > into my tuple space implementation before it is released. > > I'd be interested in hearing more about that BTW. One thing we've found is > that much organic systems have a neural system for communications between > things, (hence Axon :), that you also need to equivalent of a hormonal system. > In the unix shell world, IMO the environment acts as that for pipelines, and > similarly that's why we have an assistant system. 
(Which has key/value lookup > facilities) I have two recent posts about the performance and features of a (hacked together) tuple space system I worked on (for two afternoons) in my blog. "Feel Lucky" for "Josiah Carlson" in google and you will find it. > It's a less obvious requirement, but is a useful one nonetheless, so I don't > really see a message passing style as excluding a Linda approach - since > they're orthogonal approaches. Indeed. For me, the idea of being able to toss a tuple into memory somewhere and being able to find it later maps into my mind as: ('name', arg1, ...) -> name(arg1, ...), which is, quite literally, an RPC semantic (which seems a bit more natural to me than subscribing to the 'name' queue). With the ability to send to either single or multiple listeners, you get message passing, broadcast messages, and a standard job/result queueing semantic. The only thing that it is missing is a prioritization mechanism (fifo, numeric priority, etc.), which would get us a job scheduling kernel. Not bad for a "message passing"/"tuple space"/"IPC" library. (All of the above have direct algorithms for implementation.) - Josiah From mwh at python.net Fri Oct 7 08:51:24 2005 From: mwh at python.net (Michael Hudson) Date: Fri, 07 Oct 2005 07:51:24 +0100 Subject: [Python-Dev] Python 2.5 and ast-branch In-Reply-To: (Guido van Rossum's message of "Thu, 6 Oct 2005 19:34:11 -0700") References: <87irwadwma.fsf@hydra.bayview.thirdcreek.com> <000001c5cad5$b6b83ee0$a105a044@oemcomputer> Message-ID: <2mwtkpddlf.fsf@starship.python.net> Guido van Rossum writes: > How does this sound to the non-AST-branch developers who have to > suffer the inevitable post-merge instability? I think it's now or > never -- waiting longer isn't going to make this thing easier (not > with several more language changes approved: with-statement, extended > import, what else...) It sounds OK to me. Cheers, mwh -- To summarise the summary of the summary:- people are a problem.
-- The Hitch-Hikers Guide to the Galaxy, Episode 12 From martin at v.loewis.de Fri Oct 7 09:59:50 2005 From: martin at v.loewis.de (Martin v. Löwis) Date: Fri, 07 Oct 2005 09:59:50 +0200 Subject: [Python-Dev] PyObject_Init documentation Message-ID: <43462AF6.3080103@v.loewis.de> The documentation for PyObject_Init says

    If type indicates that the object participates in the cyclic
    garbage detector, it is added to the detector's set of observed
    objects.

Is this really correct? I thought you need to invoke PyObject_GC_TRACK explicitly? Regards, Martin From ncoghlan at iinet.net.au Fri Oct 7 11:57:17 2005 From: ncoghlan at iinet.net.au (Nick Coghlan) Date: Fri, 07 Oct 2005 19:57:17 +1000 Subject: [Python-Dev] Proposed changes to PEP 343 Message-ID: <4346467D.5010005@iinet.net.au> Based on Jason's comments regarding decimal.Context, and to explicitly cover the terminology agreed on during the documentation discussion back in July, I'm proposing a number of changes to PEP 343. I'll be updating the checked in PEP assuming there aren't any objections in the next week or so (and assuming I get CVS access sorted out ;). The idea of dropping __enter__/__exit__ and defining the with statement solely in terms of coroutines is *not* included in the suggested changes, but I added a new item under "Resolved Open Issues" to cover some of the reasons why. Cheers, Nick.

1. Amend the statement specification such that:

       with EXPR as VAR:
           BLOCK

   is translated as:

       abc = (EXPR).__with__()
       exc = (None, None, None)
       VAR = abc.__enter__()
       try:
           try:
               BLOCK
           except:
               exc = sys.exc_info()
               raise
       finally:
           abc.__exit__(*exc)

2. Add the following to the subsequent explanation: The call to the __with__ method serves a similar purpose to the __iter__ method for iterables and iterators. An object such as threading.Lock may provide its own __enter__ and __exit__ methods, and simply return 'self' from its __with__ method.
A more complex object such as decimal.Context may return a distinct context manager which takes care of setting and restoring the appropriate decimal context in the thread.

3. Update ContextWrapper in the "Generator Decorator" section to include:

       def __with__(self):
           return self

4. Add a paragraph to the end of the "Generator Decorator" section: By applying the @contextmanager decorator to a context's __with__ method, it is as easy to write a generator-based context manager for the context as it is to write a generator-based iterator for an iterable (see the decimal.Context example below).

5. Add three items under "Resolved Open Issues":

   2. After this PEP was originally approved, a subsequent discussion on python-dev [4] settled on the term "context manager" for objects which provide __enter__ and __exit__ methods, and "context management protocol" for the protocol itself. With the addition of the __with__ method to the protocol, a natural extension is to call objects which provide only a __with__ method "contexts" (or "manageable contexts" in situations where the general term "context" would be ambiguous). The distinction between a context and a context manager is very similar to the distinction between an iterable and an iterator.

   3. The originally approved version of this PEP did not include a __with__ method - the method was only added to the PEP after Jason Orendorff pointed out the difficulty of writing appropriate __enter__ and __exit__ methods for decimal.Context [5]. This approach allows a class to use the @contextmanager decorator to define a native context manager using generator syntax. It also allows a class to use an existing independent context manager as its native context manager by applying the independent context manager to 'self' in its __with__ method. It even allows a class written in C to use a coroutine based context manager written in Python. The __with__ method parallels the __iter__ method which forms part of the iterator protocol.

   4. The suggestion was made by Jason Orendorff that the __enter__ and __exit__ methods could be removed from the context management protocol, and the protocol instead defined directly in terms of the coroutine interface described in PEP 342 (or a cleaner version of that interface with start() and finish() convenience methods) [6]. Guido rejected this idea [7]. The following are some of the benefits of keeping the __enter__ and __exit__ methods:

      - it makes it easy to implement a simple context manager in C without having to rely on a separate coroutine builder
      - it makes it easy to provide a low-overhead implementation for context managers which don't need to maintain any special state between the __enter__ and __exit__ methods (having to use a coroutine for these would impose unnecessary overhead without any compensating benefit)
      - it makes it possible to understand how the with statement works without having to first understand the concept of a coroutine

6. Add new references:

       [4] http://mail.python.org/pipermail/python-dev/2005-July/054658.html
       [5] http://mail.python.org/pipermail/python-dev/2005-October/056947.html
       [6] http://mail.python.org/pipermail/python-dev/2005-October/056969.html
       [7] http://mail.python.org/pipermail/python-dev/2005-October/057018.html

7. Update Example 4 to include a __with__ method:

       def __with__(self):
           return self

8. Replace Example 9 with the following example:

9. Here's a proposed native context manager for decimal.Context:

       # This would be a new decimal.Context method
       @contextmanager
       def __with__(self):
           # We set the thread context to a copy of this context
           # to ensure that changes within the block are kept
           # local to the block. This also gives us thread safety
           # and supports nested usage of a given context.
           newctx = self.copy()
           oldctx = decimal.getcontext()
           decimal.setcontext(newctx)
           try:
               yield newctx
           finally:
               decimal.setcontext(oldctx)

   Sample usage:

       def sin(x):
           with decimal.getcontext() as ctx:
               ctx.prec += 2
               # Rest of sin calculation algorithm
               # uses a precision 2 greater than normal
           return +s # Convert result to normal precision

       def sin(x):
           with decimal.ExtendedContext:
               # Rest of sin calculation algorithm
               # uses the Extended Context from the
               # General Decimal Arithmetic Specification
           return +s # Convert result to normal context

-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.blogspot.com From fredrik at pythonware.com Fri Oct 7 12:26:34 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Fri, 7 Oct 2005 12:26:34 +0200 Subject: [Python-Dev] Proposed changes to PEP 343 References: <4346467D.5010005@iinet.net.au> Message-ID: Nick Coghlan wrote:

> 9. Here's a proposed native context manager for decimal.Context:
>
>        # This would be a new decimal.Context method
>        @contextmanager
>        def __with__(self):

wouldn't it be better if the ContextWrapper class (or some variation thereof) could be used as a base class for the decimal.Context class? using decorators on methods to provide "is a" behaviour for the class doesn't really feel pythonic... From ncoghlan at iinet.net.au Fri Oct 7 13:50:42 2005 From: ncoghlan at iinet.net.au (Nick Coghlan) Date: Fri, 07 Oct 2005 21:50:42 +1000 Subject: [Python-Dev] PEP 342 suggestion: start(), __call__() and unwind_call() methods Message-ID: <43466112.5050608@iinet.net.au> I'm lifting Jason's PEP 342 suggestions out of the recent PEP 343 thread, in case some of the folks interested in coroutines stopped following that discussion. Jason suggested two convenience methods, .start() and .finish().
start() simply asserted that the generator hadn't been started yet, and I find the parallel with "Thread.start()" appealing:

    def start(self):
        """ Convenience method -- exactly like next(), but
        assert that this coroutine hasn't already been started.
        """
        if self.__started:
            raise RuntimeError("Coroutine already started")
        return self.next()

I've embellished Jason's suggested finish() method quite a bit though.

1. Use send() rather than next()
2. Call it __call__() rather than finish()
3. Add an unwind_call() variant that gives similar semantics for throw()
4. Support getting a return value from the coroutine using the syntax "raise StopIteration(val)"
5. Add an exception "ContinueIteration" that is used to indicate the generator hasn't finished yet, rather than expecting the generator to finish and raising RuntimeError if it doesn't

It ends up looking like this:

    def __call__(self, value=None):
        """ Call a generator as a coroutine

        Returns the first argument supplied to StopIteration or
        None if no argument was supplied.

        Raises ContinueIteration with the value yielded as the
        argument if the generator yields a value
        """
        if not self.__started:
            raise RuntimeError("Coroutine not started")
        try:
            yield_val = self.send(value)
        except (StopIteration), ex:
            if ex.args:
                return ex.args[0]
        else:
            raise ContinueIteration(yield_val)

    def unwind_call(self, *exc):
        """Raise an exception in a generator used as a coroutine.

        Returns the first argument supplied to StopIteration or
        None if no argument was supplied.

        Raises ContinueIteration with the value yielded as the
        argument if the generator yields a value
        """
        try:
            yield_val = self.throw(*exc)
        except (StopIteration), ex:
            if ex.args:
                return ex.args[0]
        else:
            raise ContinueIteration(yield_val)

Now here's the trampoline scheduler from PEP 342 using this idea:

    import collections

    class Trampoline:
        """Manage communications between coroutines"""

        running = False

        def __init__(self):
            self.queue = collections.deque()

        def add(self, coroutine):
            """Request that a coroutine be executed"""
            self.schedule(coroutine)

        def run(self):
            result = None
            self.running = True
            try:
                while self.running and self.queue:
                    func = self.queue.popleft()
                    result = func()
                return result
            finally:
                self.running = False

        def stop(self):
            self.running = False

        def schedule(self, coroutine, stack=(), call_result=None, *exc):
            # Define the new pseudothread
            def pseudothread():
                try:
                    if exc:
                        result = coroutine.unwind_call(call_result, *exc)
                    else:
                        result = coroutine(call_result)
                except (ContinueIteration), ex:
                    # Called another coroutine
                    callee = ex.args[0]
                    self.schedule(callee, (coroutine, stack))
                except:
                    if stack:
                        # send the error back to the caller
                        caller = stack[0]
                        prev_stack = stack[1]
                        self.schedule(
                            caller, prev_stack, *sys.exc_info()
                        )
                    else:
                        # Nothing left in this pseudothread to
                        # handle it, let it propagate to the
                        # run loop
                        raise
                else:
                    if stack:
                        # Finished, so pop the stack and send the
                        # result to the caller
                        caller = stack[0]
                        prev_stack = stack[1]
                        self.schedule(caller, prev_stack, result)

            # Add the new pseudothread to the execution queue
            self.queue.append(pseudothread)

Notice how a non-coroutine callable can be yielded, and it will still work happily with the scheduler, because the desire to continue execution is indicated by the ContinueIteration exception, rather than by the type of the returned value.
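The control flow of the scheduler above can be followed in a rough modern-Python analogue. Plain generators stand in for the proposed protocol: yielding a sub-generator "calls" it, and a return value (delivered via StopIteration.value) is sent back to the caller, where the draft above uses __call__() and a ContinueIteration exception. This is a sketch of the same idea, not the PEP 342 code itself:

```python
import collections
import types

def trampoline(root):
    """Single-queue sketch of the Trampoline scheduler.

    Yielding a generator schedules it as a sub-coroutine and pushes
    the caller on a stack; a generator's return value is sent back
    to whoever called it."""
    queue = collections.deque()
    final = {}

    def schedule(coroutine, stack=(), value=None):
        def pseudothread():
            try:
                yielded = coroutine.send(value)
            except StopIteration as ex:
                if stack:
                    # Finished: pop the stack, send result to the caller
                    caller, prev_stack = stack
                    schedule(caller, prev_stack, ex.value)
                else:
                    final['value'] = ex.value
            else:
                if isinstance(yielded, types.GeneratorType):
                    # "Called" another coroutine: remember the caller
                    schedule(yielded, (coroutine, stack))
                else:
                    # Plain value yielded: send it straight back
                    schedule(coroutine, stack, yielded)
        queue.append(pseudothread)

    schedule(root)
    while queue:
        queue.popleft()()
    return final.get('value')

def double(x):
    return 2 * x
    yield  # unreachable; only makes this function a generator

def main():
    a = yield double(5)   # a = 10
    b = yield double(a)   # b = 20
    return a + b

print(trampoline(main()))  # → 30
```

The `final` dict plays the role of Trampoline.run()'s return value; error propagation (the unwind_call path) is omitted to keep the sketch short.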
With this relatively dumb scheduler, that doesn't provide any particular benefit - the specific pseudothread doesn't block, but eventually the scheduler itself blocks when it executes the non-coroutine call. However, it wouldn't take too much to make the scheduler smarter and give it a physical thread pool that it used whenever it encountered a non-coroutine call. And that's the real trick here: with these additions to PEP 342, the decision of how to deal with blocking calls could be made in the scheduler, without affecting the individual coroutines. All that coroutine writers would need to remember to do is to write any potentially blocking operations as yielded lambda expressions or functools.partial invocations, rather than as direct function calls. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.blogspot.com From ncoghlan at gmail.com Fri Oct 7 14:38:19 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 07 Oct 2005 22:38:19 +1000 Subject: [Python-Dev] Proposed changes to PEP 343 In-Reply-To: References: <4346467D.5010005@iinet.net.au> Message-ID: <43466C3B.50704@gmail.com> Fredrik Lundh wrote: > Nick Coghlan wrote: > > >> 9. Here's a proposed native context manager for decimal.Context: >> >> # This would be a new decimal.Context method >> @contextmanager >> def __with__(self): > > > wouldn't it be better if the ContextWrapper class (or some variation thereof) could > be used as a base class for the decimal.Context class? using decorators on methods > to provide "is a" behaviour for the class doesn't really feel pythonic... That's not what the decorator is for - it's there to turn the generator used to implement the __with__ method into a context manager, rather than saying anything about decimal.Context as a whole. However, requiring a decorator to get a slot to work right looks pretty ugly to me, too. 
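[The generator-as-context-manager pattern being debated here is what eventually shipped as contextlib.contextmanager in Python 2.5; the __with__ slot never materialised. An editor's sketch of the decimal example in the shipped style, as a standalone function rather than a method:]

```python
from contextlib import contextmanager
import decimal

@contextmanager
def managed_context(ctx=None):
    # Save the thread's current decimal context, install the supplied one
    # (or a copy of the current one), and guarantee restoration on exit --
    # even if the body of the with statement raises.
    saved = decimal.getcontext()
    decimal.setcontext(ctx if ctx is not None else saved.copy())
    try:
        yield decimal.getcontext()
    finally:
        decimal.setcontext(saved)
```

The standard library's decimal.localcontext() provides exactly this behaviour today.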
What if we simply special-cased the __with__ slot in type(), such that if it is populated with a generator object, that object is automatically wrapped using the @contextmanager decorator? (Jason actually suggested this idea previously) I initially didn't like the idea because of EIBTI, but I've realised that "def __with__(self):" is pretty darn explicit in its own right. I've also realised that defining __with__ using a generator, but forgetting to add the @contextmanager to the front would be a lovely source of bugs, particularly if generators are given a default __exit__() method that simply invokes self.close(). On the other hand, if __with__ is special-cased, then the slot definition wouldn't look ugly, and we'd still be free to define a generator's normal with statement semantics as: def __exit__(self, *exc): self.close() Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.blogspot.com From ncoghlan at gmail.com Fri Oct 7 14:43:07 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 07 Oct 2005 22:43:07 +1000 Subject: [Python-Dev] PEP 342 suggestion: start(), __call__() and unwind_call() methods In-Reply-To: <43466112.5050608@iinet.net.au> References: <43466112.5050608@iinet.net.au> Message-ID: <43466D5B.7050103@gmail.com> Nick Coghlan wrote: > It ends up looking like this: > > def __call__(self, value=None): > """ Call a generator as a coroutine > > Returns the first argument supplied to StopIteration or > None if no argument was supplied. 
> Raises ContinueIteration with the value yielded as the > argument if the generator yields a value > """ > if not self.__started: > raise RuntimeError("Coroutine not started") > try: > if exc: > yield_val = self.throw(value, *exc) > else: > yield_val = self.send(value) > except (StopIteration), ex: > if ex.args: > return ex.args[0] > else: > raise ContinueIteration(yield_val) Oops, I didn't finish fixing this after I added unwind_call(). Try this version instead: def __call__(self, value=None): """ Call a generator as a coroutine Returns the first argument supplied to StopIteration or None if no argument was supplied. Raises ContinueIteration with the value yielded as the argument if the generator yields a value """ try: yield_val = self.send(value) except (StopIteration), ex: if ex.args: return ex.args[0] else: raise ContinueIteration(yield_val) Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.blogspot.com From mwh at python.net Fri Oct 7 15:02:12 2005 From: mwh at python.net (Michael Hudson) Date: Fri, 07 Oct 2005 14:02:12 +0100 Subject: [Python-Dev] Proposed changes to PEP 343 In-Reply-To: <43466C3B.50704@gmail.com> (Nick Coghlan's message of "Fri, 07 Oct 2005 22:38:19 +1000") References: <4346467D.5010005@iinet.net.au> <43466C3B.50704@gmail.com> Message-ID: <2mfyrdcwff.fsf@starship.python.net> Nick Coghlan writes: > What if we simply special-cased the __with__ slot in type(), such that if it > is populated with a generator object, that object is automatically wrapped > using the @contextmanager decorator? (Jason actually suggested this idea > previously) You don't want to check if it's a generator, you want to check if it's a function whose func_code has the relevant bit set. Seems a bit magical to me, but haven't thought about it hard. 
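[The code-object flag check Michael refers to can be made concrete; in current Python, func_code is spelled __code__, and the same test is what inspect.isgeneratorfunction performs. An editor's sketch:]

```python
import inspect

def looks_like_generator_function(func):
    # The compiler sets CO_GENERATOR on the code object of any function
    # body containing `yield`, so a generator *function* can be detected
    # before it is ever called.
    return bool(func.__code__.co_flags & inspect.CO_GENERATOR)

def make_counter():   # a generator function: the flag is set
    yield 0

def make_list():      # an ordinary function: the flag is clear
    return [0]
```

This is why the check can be made at class-creation time, when the __with__ slot is being populated, rather than at call time.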
Cheers, mwh -- I think my standards have lowered enough that now I think ``good design'' is when the page doesn't irritate the living fuck out of me. -- http://www.jwz.org/gruntle/design.html From ajm at flonidan.dk Fri Oct 7 15:23:47 2005 From: ajm at flonidan.dk (Anders J. Munch) Date: Fri, 7 Oct 2005 15:23:47 +0200 Subject: [Python-Dev] Proposed changes to PEP 343 Message-ID: <9B1795C95533CA46A83BA1EAD4B01030031F0B@flonidanmail.flonidan.net> Nick Coghlan did a +1 job to write: > 1. Amend the statement specification such that: > > with EXPR as VAR: > BLOCK > > is translated as: > > abc = (EXPR).__with__() > exc = (None, None, None) > VAR = abc.__enter__() > try: > try: > BLOCK > except: > exc = sys.exc_info() > raise > finally: > abc.__exit__(*exc) Note that __with__ and __enter__ could be combined into one with no loss of functionality: abc,VAR = (EXPR).__with__() exc = (None, None, None) try: try: BLOCK except: exc = sys.exc_info() raise finally: abc.__exit__(*exc) - Anders From eric.nieuwland at xs4all.nl Fri Oct 7 15:38:20 2005 From: eric.nieuwland at xs4all.nl (Eric Nieuwland) Date: Fri, 7 Oct 2005 15:38:20 +0200 Subject: [Python-Dev] Proposed changes to PEP 343 In-Reply-To: <4346467D.5010005@iinet.net.au> References: <4346467D.5010005@iinet.net.au> Message-ID: <3479d2bd94a07f4dd6e06a6874ca74f0@xs4all.nl> Nick Coghlan wrote: > 1. Amend the statement specification such that: > > with EXPR as VAR: > BLOCK > > is translated as: > > abc = (EXPR).__with__() > exc = (None, None, None) > VAR = abc.__enter__() > try: > try: > BLOCK > except: > exc = sys.exc_info() > raise > finally: > abc.__exit__(*exc) Is this correct? What happens to with 40*13+2 as X: print X ? 
--eric From ncoghlan at gmail.com Fri Oct 7 16:09:48 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 08 Oct 2005 00:09:48 +1000 Subject: [Python-Dev] Proposed changes to PEP 343 In-Reply-To: <3479d2bd94a07f4dd6e06a6874ca74f0@xs4all.nl> References: <4346467D.5010005@iinet.net.au> <3479d2bd94a07f4dd6e06a6874ca74f0@xs4all.nl> Message-ID: <434681AC.2060503@gmail.com> Eric Nieuwland wrote: > What happens to > > with 40*13+2 as X: > print X It would fail with a TypeError because the relevant slot in the type object was NULL - the TypeError checks aren't shown for simplicity's sake. This behaviour isn't really any different from the existing PEP 343 - the only difference is that the statement looks for a __with__ slot on the original EXPR, rather than looking directly for an __enter__ slot. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.blogspot.com From ncoghlan at gmail.com Fri Oct 7 16:12:39 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 08 Oct 2005 00:12:39 +1000 Subject: [Python-Dev] Proposed changes to PEP 343 In-Reply-To: <9B1795C95533CA46A83BA1EAD4B01030031F0B@flonidanmail.flonidan.net> References: <9B1795C95533CA46A83BA1EAD4B01030031F0B@flonidanmail.flonidan.net> Message-ID: <43468257.9030008@gmail.com> Anders J. Munch wrote: > Note that __with__ and __enter__ could be combined into one with no > loss of functionality: > > abc,VAR = (EXPR).__with__() They can't be combined, because they're invoked on different objects. It would be like trying to combine __iter__() and next() into the same method for iterators. . . Cheers, Nick. 
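[The iterator analogy is easy to make concrete: the factory method and the stepping method necessarily live on different objects, just as __with__ would live on the original object and __enter__/__exit__ on the manager it returns. An editor's sketch with hypothetical names:]

```python
class Evens:
    """The iterable: its __iter__ is a factory returning a fresh iterator."""
    def __init__(self, limit):
        self.limit = limit
    def __iter__(self):
        return EvensIterator(self.limit)   # a *different* object each time

class EvensIterator:
    """The stepper: it, not the factory, holds the iteration state."""
    def __init__(self, limit):
        self.current = 0
        self.limit = limit
    def __iter__(self):
        return self
    def __next__(self):
        if self.current >= self.limit:
            raise StopIteration
        value = self.current
        self.current += 2
        return value
```

Because the factory and the stepper are separate objects, the same Evens instance can be iterated repeatedly; merging the two methods onto one object would lose that, which is the point being made about __with__ and __enter__.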
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.blogspot.com From ncoghlan at gmail.com Fri Oct 7 16:16:20 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 08 Oct 2005 00:16:20 +1000 Subject: [Python-Dev] Proposed changes to PEP 343 In-Reply-To: <2mfyrdcwff.fsf@starship.python.net> References: <4346467D.5010005@iinet.net.au> <43466C3B.50704@gmail.com> <2mfyrdcwff.fsf@starship.python.net> Message-ID: <43468334.2000307@gmail.com> Michael Hudson wrote: > > You don't want to check if it's a generator, you want to check if it's > a function whose func_code has the relevant bit set. > Fair point :) > Seems a bit magical to me, but haven't thought about it hard. Same here - I'm just starting to think that the alternative is worse, because it leaves open the nonsensical possibility of writing a __with__ method as a generator *without* applying the contextmanager decorator, and that would just be bizarre - if you want to get an iterable, why aren't you writing an __iter__ method instead? Cheers, Nick. 
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.blogspot.com From eric.nieuwland at xs4all.nl Fri Oct 7 16:21:37 2005 From: eric.nieuwland at xs4all.nl (Eric Nieuwland) Date: Fri, 7 Oct 2005 16:21:37 +0200 Subject: [Python-Dev] Proposed changes to PEP 343 In-Reply-To: <434681AC.2060503@gmail.com> References: <4346467D.5010005@iinet.net.au> <3479d2bd94a07f4dd6e06a6874ca74f0@xs4all.nl> <434681AC.2060503@gmail.com> Message-ID: Nick Coghlan wrote: > Eric Nieuwland wrote: >> What happens to >> >> with 40*13+2 as X: >> print X > > It would fail with a TypeError because the relevant slot in the type > object > was NULL - the TypeError checks aren't shown for simplicity's sake. > > This behaviour isn't really any different from the existing PEP 343 - > the only > difference is that the statement looks for a __with__ slot on the > original > EXPR, rather than looking directly for an __enter__ slot. Hmmm I hadn't noticed that. In my memory a partial implementation of the protocol was possible. Thus, __enter__/__exit__ would only be called if they exist. Oh well, I'll just add some empty methods. --eric From guido at python.org Fri Oct 7 18:14:12 2005 From: guido at python.org (Guido van Rossum) Date: Fri, 7 Oct 2005 09:14:12 -0700 Subject: [Python-Dev] Sourceforge CVS access In-Reply-To: <43468417.4000701@iinet.net.au> References: <43468417.4000701@iinet.net.au> Message-ID: I will, if you tell me your sourceforge username. On 10/7/05, Nick Coghlan wrote: > Could one of the Sourceforge powers-that-be grant me check in access so I can > update PEP 343 directly? 
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From fredrik at pythonware.com Fri Oct 7 18:12:31 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Fri, 7 Oct 2005 18:12:31 +0200 Subject: [Python-Dev] Proposed changes to PEP 343 References: <4346467D.5010005@iinet.net.au> <43466C3B.50704@gmail.com> Message-ID: Nick Coghlan wrote: > That's not what the decorator is for - it's there to turn the generator used > to implement the __with__ method into a context manager, rather than saying > anything about decimal.Context as a whole. possibly, but using a decorated __with__ method doesn't make much sense if the purpose isn't to turn the class into something that can be used with the "with" statement. > However, requiring a decorator to get a slot to work right looks pretty ugly > to me, too. the whole concept might be perfectly fine on the "this construct corresponds to this code" level, but if you immediately end up with things that are not what they seem, and names that don't mean what they say, either the design or the description of it needs work. ("yes, I know you can use this class to manage the context, but it's not really a context manager, because it's that method that's a manager, not the class itself. yes, all the information that belongs to the context are managed by the class, but that doesn't make... 
oh, shut up and read the PEP") From BruceEckel-Python3234 at mailblocks.com Fri Oct 7 18:47:51 2005 From: BruceEckel-Python3234 at mailblocks.com (Bruce Eckel) Date: Fri, 7 Oct 2005 10:47:51 -0600 Subject: [Python-Dev] Pythonic concurrency In-Reply-To: <20051006221436.2892.JCARLSON@uci.edu> References: <20051006143740.287E.JCARLSON@uci.edu> <200510070145.17284.ms@cerenity.org> <20051006221436.2892.JCARLSON@uci.edu> Message-ID: <415220344.20051007104751@MailBlocks.com> Early in this thread there was a comment to the effect that "if you don't know how to use threads, don't use them," which I pointedly avoided responding to because it seemed to me to simply be inflammatory. But Ian Bicking just posted a weblog entry: http://blog.ianbicking.org/concurrency-and-processes.html where he says "threads aren't as hard as they imply" and "An especially poor argument is one that tells me that I'm currently being beaten with a stick, but apparently don't know it." I always have a problem with this. After many years of studying concurrency on-and-off, I continue to believe that threading is very difficult (indeed, the more I study it, the more difficult I understand it to be). And I admit this. The comments I sometimes get back are to the effect that "threading really isn't that hard." Thus, I am just too dense to get it. It's hard to know how to answer. I've met enough brilliant people to know that it's just possible that the person posting really does easily grok concurrency issues and thus I must seem irreconcilably thick. This may actually be one of those people for whom threading is obvious (and Ian has always seemed like a smart guy, for example). But. I do happen to have contact with a lot of people who are at the forefront of the threading world, and *none* of them (many of whom have written the concurrency libraries for Java 5, for example) ever imply that threading is easy. In fact, they generally go out of their way to say that it's insanely difficult. 
And Java has taken until version 5 to (apparently) get it right, partly by defining a new memory model in order to accurately describe what goes on with threading issues. This same model is being adapted for the next version of C++. This is not stuff that was already out there, that everyone knew about -- this is new stuff. Also, look at the work that Scott Meyers, Andrei Alexandrescu, et al did on the "Double Checked Locking" idiom, showing that it was broken under threading. That was by no means "trivial and obvious" during all the years that people thought that it worked. My own experience in discussions with folks who think that threading is transparent usually uncovers, after a few appropriate questions, that said person doesn't actually understand the depth of the issues involved. A common story is someone who has written a few programs and convinced themselves that these programs work (the "it works for me" proof of correctness). Thus, concurrency must be easy. I know about this because I have learned the hard way throughout many years, over and over again. Every time I've thought that I understood concurrency, something new has popped up and shown me a whole new aspect of things that I have heretofore missed. Then I start thinking "OK, now I finally understand concurrency." One example: when I was rewriting the threading chapter for the 3rd (previous) edition of Thinking in Java, I decided to get a dual-processor machine so I could really test things. This way, I discovered that the behavior of a program on a single-processor machine could be dramatically different than the same program on a multiprocessor machine. That seems obvious, now, but at the time I thought I was writing pretty reasonable code. In addition, it turns out that some things in Java concurrency were broken (even the people who were creating thread support in the language weren't getting it right) so that threw in extra monkey wrenches. 
And when you start studying the new memory model, which takes into account instruction reordering and cache coherency issues, you realize that it's mind-numbingly far from trivial. Or maybe not, for those who think it's easy. But my experience is that the people who really do understand concurrency never suggest that it's easy. Bruce Eckel http://www.BruceEckel.com mailto:BruceEckel-Python3234 at mailblocks.com Contains electronic books: "Thinking in Java 3e" & "Thinking in C++ 2e" Web log: http://www.artima.com/weblogs/index.jsp?blogger=beckel Subscribe to my newsletter: http://www.mindview.net/Newsletter My schedule can be found at: http://www.mindview.net/Calendar From gustavo at niemeyer.net Fri Oct 7 19:22:37 2005 From: gustavo at niemeyer.net (Gustavo Niemeyer) Date: Fri, 7 Oct 2005 14:22:37 -0300 Subject: [Python-Dev] Extending tuple unpacking Message-ID: <20051007172237.GA13288@localhost.localdomain> Not sure if this has been proposed before, but one thing I occasionally miss regarding tuple unpack is being able to do: first, second, *rest = something Also in for loops: for first, second, *rest in iterator: pass This seems to match the current meaning for starred variables in other contexts. What do you think? -- Gustavo Niemeyer http://niemeyer.net From jeremy at alum.mit.edu Fri Oct 7 19:32:29 2005 From: jeremy at alum.mit.edu (Jeremy Hylton) Date: Fri, 7 Oct 2005 13:32:29 -0400 Subject: [Python-Dev] Python 2.5 and ast-branch In-Reply-To: <5.1.1.6.0.20051006224020.01f76148@mail.telecommunity.com> References: <87irwadwma.fsf@hydra.bayview.thirdcreek.com> <000001c5cad5$b6b83ee0$a105a044@oemcomputer> <5.1.1.6.0.20051006224020.01f76148@mail.telecommunity.com> Message-ID: On 10/6/05, Phillip J. Eby wrote: > At 07:34 PM 10/6/2005 -0700, Guido van Rossum wrote: > >How does this sound to the non-AST-branch developers who have to > >suffer the inevitable post-merge instability? 
I think it's now or > >never -- waiting longer isn't going to make this thing easier (not > >with several more language changes approved: with-statement, extended > >import, what else...) > > Do the AST branch changes affect the interface of the "parser" module? Or > do they just add new functionality? It doesn't affect the parser module. For now, the same parser is used, so the parser module can still work the way it does. If we changed the parser in the future, well, the parser module would change, too. I'd also like to add an analogous ast module that exposed the abstract syntax tree for manipulation, along the lines of the parser module. Not sure if we'll actually get to it for this release. Jeremy From guido at python.org Fri Oct 7 19:34:05 2005 From: guido at python.org (Guido van Rossum) Date: Fri, 7 Oct 2005 10:34:05 -0700 Subject: [Python-Dev] Extending tuple unpacking In-Reply-To: <20051007172237.GA13288@localhost.localdomain> References: <20051007172237.GA13288@localhost.localdomain> Message-ID: On 10/7/05, Gustavo Niemeyer wrote: > Not sure if this has been proposed before, but one thing > I occasionally miss regarding tuple unpack is being able > to do: > > first, second, *rest = something > > Also in for loops: > > for first, second, *rest in iterator: > pass > > This seems to match the current meaning for starred > variables in other contexts. Someone should really write up a PEP -- this was just discussed a week or two ago. I personally think this is adequately handled by writing: (first, second), rest = something[:2], something[2:] I believe that this wish is an example of "hypergeneralization" -- an incorrect generalization based on a misunderstanding of the underlying principle. Argument lists are not tuples [*] and features of argument lists should not be confused with features of tuple unpackings. [*] Proof: f(1) is equivalent to f(1,) even though (1) is an int but (1,) is a tuple. 
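[Both spellings can be put side by side. The starred form did not exist when this was written; Python 3.0 later adopted it as PEP 3132, with the wrinkle that the starred target binds a list rather than a tuple. An editor's sketch:]

```python
something = (1, 2, 3, 4)

# Guido's workaround: slice explicitly.
(first, second), rest = something[:2], something[2:]

# The proposed syntax (valid in Python 3 per PEP 3132);
# note the starred target binds a list, not a tuple.
first2, second2, *rest2 = something
```

The list-vs-tuple difference in the starred form is one visible way in which argument-list semantics and sequence unpacking diverged even once the syntax was adopted.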
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From aahz at pythoncraft.com Fri Oct 7 19:45:45 2005 From: aahz at pythoncraft.com (Aahz) Date: Fri, 7 Oct 2005 10:45:45 -0700 Subject: [Python-Dev] Pythonic concurrency In-Reply-To: <415220344.20051007104751@MailBlocks.com> References: <20051006143740.287E.JCARLSON@uci.edu> <200510070145.17284.ms@cerenity.org> <20051006221436.2892.JCARLSON@uci.edu> <415220344.20051007104751@MailBlocks.com> Message-ID: <20051007174545.GA8369@panix.com> On Fri, Oct 07, 2005, Bruce Eckel wrote: > > I always have a problem with this. After many years of studying > concurrency on-and-off, I continue to believe that threading is very > difficult (indeed, the more I study it, the more difficult I > understand it to be). And I admit this. The comments I sometimes get > back are to the effect that "threading really isn't that hard." Thus, > I am just too dense to get it. What I generally say is that threading isn't too hard if you stick with some fairly simple idioms and tools -- and make absolutely certain to follow some rules about sharing data. But it's certainly true that threading (and concurrency) in general is mind-numbingly complex. -- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ "If you think it's expensive to hire a professional to do the job, wait until you hire an amateur." --Red Adair From pje at telecommunity.com Fri Oct 7 19:51:15 2005 From: pje at telecommunity.com (Phillip J. 
Eby) Date: Fri, 07 Oct 2005 13:51:15 -0400 Subject: [Python-Dev] PEP 342 suggestion: start(), __call__() and unwind_call() methods In-Reply-To: <43466112.5050608@iinet.net.au> Message-ID: <5.1.1.6.0.20051007134543.03b2f728@mail.telecommunity.com> At 09:50 PM 10/7/2005 +1000, Nick Coghlan wrote: >Notice how a non-coroutine callable can be yielded, and it will still work >happily with the scheduler, because the desire to continue execution is >indicated by the ContinueIteration exception, rather than by the type of the >returned value. Whaaaa? You raise an exception to indicate the *normal* case? That seems, um... well, a Very Bad Idea. I also don't see any point to start(), or understand what finish() does or why you'd want it. Last, but far from least, as far as I can tell you can implement all of these semantics using PEP 342 as it sits. That is, it's very simple to make decorators or classes that add those semantics. I don't see anything that requires them to be part of Python. From gustavo at niemeyer.net Fri Oct 7 19:56:10 2005 From: gustavo at niemeyer.net (Gustavo Niemeyer) Date: Fri, 7 Oct 2005 14:56:10 -0300 Subject: [Python-Dev] Extending tuple unpacking In-Reply-To: References: <20051007172237.GA13288@localhost.localdomain> Message-ID: <20051007175610.GA13795@localhost.localdomain> > Someone should really write up a PEP -- this was just discussed a week > or two ago. Heh.. I should follow the list more closely. > I personally think this is adequately handled by writing: > > (first, second), rest = something[:2], something[2:] That's an alternative indeed. But the proposed way does look better: for item in iterator: (first, second), rest = item[:2], item[2:] ... vs. for first, second, *rest in iterator: ... > I believe that this wish is an example of "hypergeneralization" -- an > incorrect generalization based on a misunderstanding of the underlying > principle. Thanks for trying so hard to say in a nice way that this is not a good idea. 
:-) > Argument lists are not tuples [*] and features of argument lists > should not be confused with features of tuple unpackings. Do you agree that the concepts are related? For instance: >>> def f(first, second, *rest): ... print first, second, rest ... >>> f(1,2,3,4) 1 2 (3, 4) >>> first, second, *rest = (1,2,3,4) >>> print first, second, rest 1 2 (3, 4) > [*] Proof: f(1) is equivalent to f(1,) even though (1) is an int but > (1,) is a tuple. "Extended *tuple* unpacking" was a wrong subject indeed. This is general unpacking, since it's supposed to work with any sequence. -- Gustavo Niemeyer http://niemeyer.net From pje at telecommunity.com Fri Oct 7 20:07:23 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Fri, 07 Oct 2005 14:07:23 -0400 Subject: [Python-Dev] Pythonic concurrency In-Reply-To: <415220344.20051007104751@MailBlocks.com> References: <20051006221436.2892.JCARLSON@uci.edu> <20051006143740.287E.JCARLSON@uci.edu> <200510070145.17284.ms@cerenity.org> <20051006221436.2892.JCARLSON@uci.edu> Message-ID: <5.1.1.6.0.20051007140112.03b2fc80@mail.telecommunity.com> At 10:47 AM 10/7/2005 -0600, Bruce Eckel wrote: >Also, look at the work that Scott Meyers, Andrei Alexandrescu, et al >did on the "Double Checked Locking" idiom, showing that it was broken >under threading. That was by no means "trivial and obvious" during all >the years that people thought that it worked. One of the nice things about the GIL is that it means double-checked locking *does* work in Python. :) >My own experience in discussions with folks who think that threading >is transparent usually uncovers, after a few appropriate questions, >that said person doesn't actually understand the depth of the issues >involved. A common story is someone who has written a few programs and >convinced themselves that these programs work (the "it works for me" >proof of correctness). Thus, concurrency must be easy. 
> >I know about this because I have learned the hard way throughout many >years, over and over again. Every time I've thought that I understood >concurrency, something new has popped up and shown me a whole new >aspect of things that I have heretofore missed. Then I start thinking >"OK, now I finally understand concurrency." The day when I knew, beyond all shadow of a doubt, that the people who say threading is easy are full of it, is when I wrote an event-driven co-operative multitasking system in Python and managed to create a race condition in *single-threaded code*. Of course, due to its nature, a race condition in an event-driven system is at least reproducible given the same sequence of events, and it's fixable using "turns" (as described in a paper posted here yesterday). With threads, it's not anything like reproducible, because pre-emptive threading is non-deterministic. What the GIL-ranters don't get is that the GIL actually gives you just enough determinism to be able to write threaded programs that don't crash, and that maybe will even work if you treat every point of interaction between threads as a minefield and program with appropriate care. So, if threads are "easy" in Python compared to other languages, it's *because of* the GIL, not in spite of it. From shane at hathawaymix.org Fri Oct 7 20:42:02 2005 From: shane at hathawaymix.org (Shane Hathaway) Date: Fri, 07 Oct 2005 12:42:02 -0600 Subject: [Python-Dev] Pythonic concurrency In-Reply-To: <415220344.20051007104751@MailBlocks.com> References: <20051006143740.287E.JCARLSON@uci.edu> <200510070145.17284.ms@cerenity.org> <20051006221436.2892.JCARLSON@uci.edu> <415220344.20051007104751@MailBlocks.com> Message-ID: <4346C17A.2090204@hathawaymix.org> Bruce Eckel wrote: > But. 
I do happen to have contact with a lot of people who are at the > forefront of the threading world, and *none* of them (many of whom > have written the concurrency libraries for Java 5, for example) ever > imply that threading is easy. In fact, they generally go out of their > way to say that it's insanely difficult. What's insanely difficult is really locking, and locking is driven by concurrency in general, not just threads. It's hard to reason about locks. There are only general rules about how to apply locking correctly, efficiently, and without deadlocks. Personally, to be absolutely certain I've applied locks correctly, I have to think for hours. Even then, it's hard to express my conclusions, so it's hard to be sure future maintainers will keep the locking correct. Java uses locks very liberally, which is to be expected of a language that provides locking using a keyword. This forces Java programmers to deal with the burden of locking everywhere. It also forces the developers of the language and its core libraries to make locking extremely fast yet safe. Java threads would be easy if there wasn't so much locking going on. Zope, OTOH, is far more conservative with locks. There is some code that dispatches HTTP requests to a worker thread, and other code that reads and writes an object database, but most Zope code isn't aware of concurrency. Thus locking is hardly an issue in Zope, and as a result, threading is quite easy in Zope. Recently, I've been simulating high concurrency on a PostgreSQL database, and I've discovered that the way you reason about row and table locks is very similar to the way you reason about locking among threads. The big difference is the consequence of incorrect locking: in PostgreSQL, using the serializable mode, incorrect locking generally only leads to aborted transactions; while in Python and most programming languages, incorrect locking instantly causes corruption and chaos. That's what hurts developers. 
I want a concurrency model in Python that acknowledges the need for locking while punishing incorrect locking with an exception rather than corruption. *That* would be cool, IMHO. Shane From barry at python.org Fri Oct 7 20:58:25 2005 From: barry at python.org (Barry Warsaw) Date: Fri, 07 Oct 2005 14:58:25 -0400 Subject: [Python-Dev] Pythonic concurrency In-Reply-To: <4346C17A.2090204@hathawaymix.org> References: <20051006143740.287E.JCARLSON@uci.edu> <200510070145.17284.ms@cerenity.org> <20051006221436.2892.JCARLSON@uci.edu> <415220344.20051007104751@MailBlocks.com> <4346C17A.2090204@hathawaymix.org> Message-ID: <1128711505.9875.19.camel@geddy.wooz.org> On Fri, 2005-10-07 at 14:42, Shane Hathaway wrote: > What's insanely difficult is really locking, and locking is driven by > concurrency in general, not just threads. It's hard to reason about > locks. I think that's a very interesting observation! I have not built a tremendous number of concurrent apps, but even the dumb locking that Mailman does (which is not a great model of granularity ;) has burned many bch's (brain cell hours) to get right. Where I have used more concurrency, I generally try to structure my apps into the one-producer-many-independent-consumers architecture that was outlined in a previous message. In that case, if you can narrow your touch points to the Queue module for example, then yeah, threading is easy. A gaggle of independent workers isn't that hard to get right in Python. -Barry -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/python-dev/attachments/20051007/09266eeb/attachment.pgp From solipsis at pitrou.net Fri Oct 7 21:13:08 2005 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 07 Oct 2005 21:13:08 +0200 Subject: [Python-Dev] Pythonic concurrency In-Reply-To: <4346C17A.2090204@hathawaymix.org> References: <20051006143740.287E.JCARLSON@uci.edu> <200510070145.17284.ms@cerenity.org> <20051006221436.2892.JCARLSON@uci.edu> <415220344.20051007104751@MailBlocks.com> <4346C17A.2090204@hathawaymix.org> Message-ID: <1128712388.6251.21.camel@fsol> Hi, (my 2 cents, probably not very constructive) > Recently, I've been simulating high concurrency on a PostgreSQL > database, and I've discovered that the way you reason about row and > table locks is very similar to the way you reason about locking among > threads. The big difference is the consequence of incorrect locking: in > PostgreSQL, using the serializable mode, incorrect locking generally > only leads to aborted transactions; while in Python and most programming > languages, incorrect locking instantly causes corruption and chaos. > That's what hurts developers. I want a concurrency model in Python that > acknowledges the need for locking while punishing incorrect locking with > an exception rather than corruption. *That* would be cool, IMHO. A relational database has a very strict and regular data model. Also, it has transactions. This makes it easy to precisely define concurrency at the engine level. To apply the same thing to Python you would at least need : 1. a way to define a subset of the current bag of reachable objects which has to stay consistent w.r.t. transactions that are applied to it (of course, you would have several such subsets in any non-trivial application) 2. a way to start and end a transaction on a bag of objects (begin / commit / rollback) 3. 
a precise definition of the semantics of "consistency" here : for example, only one thread could modify a bag of objects at any given time, and other threads would continue to see the frozen, stable version of that bag until the next version is committed by the writing thread For 1), a helpful paradigm would be to define an object as being the "root" of a bag, and all its properties would automatically and recursively (or not ?) belong to this bag. One has to be careful that no property "leaks" and makes the bag become the set of all reachable Python objects (one could provide a means to say that a specific property must not be transitively put in the bag). Then, use my_object.begin_transaction() and my_object.commit_transaction(). The implementation of 3) does not look very obvious ;-S Regards Antoine. From Martin.Maly at microsoft.com Fri Oct 7 21:15:04 2005 From: Martin.Maly at microsoft.com (Martin Maly) Date: Fri, 7 Oct 2005 12:15:04 -0700 Subject: [Python-Dev] __doc__ behavior in class definitions Message-ID: <1DFB396200705E46B5338CA4B2E25BDE9E6FF5@DF-BANDIT-MSG.exchange.corp.microsoft.com> Hello Python-Dev, My name is Martin Maly and I am a developer at Microsoft, working on the IronPython project with Jim Hugunin. I am spending lot of time making IronPython compatible with Python to the extent possible. I came across a case which I am not sure if by design or a bug in Python (Python 2.4.1 (#65, Mar 30 2005, 09:13:57)). Consider following Python module: # module begin "module doc" class c: print __doc__ __doc__ = "class doc" (1) print __doc__ print c.__doc__ # module end When ran, it prints: module doc class doc class doc Based on the binding rules described in the Python documentation, I would expect the code to throw because binding created on the line (1) is local to the class block and all the other __doc__ uses should reference that binding. Apparently, it is not the case. 
Is this bug in Python or are __doc__ strings in classes subject to some additional rules? Thanks Martin From fredrik at pythonware.com Fri Oct 7 22:18:14 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Fri, 7 Oct 2005 22:18:14 +0200 Subject: [Python-Dev] __doc__ behavior in class definitions References: <1DFB396200705E46B5338CA4B2E25BDE9E6FF5@DF-BANDIT-MSG.exchange.corp.microsoft.com> Message-ID: Martin Maly wrote: > I came across a case which I am not sure if by design or a bug in Python > (Python 2.4.1 (#65, Mar 30 2005, 09:13:57)). Consider following Python > module: > > # module begin > "module doc" > > class c: > print __doc__ > __doc__ = "class doc" (1) > print __doc__ > > print c.__doc__ > # module end > > When ran, it prints: > > module doc > class doc > class doc > > Based on the binding rules described in the Python documentation, I > would expect the code to throw because binding created on the line (1) > is local to the class block and all the other __doc__ uses should > reference that binding. Apparently, it is not the case. > > Is this bug in Python or are __doc__ strings in classes subject to some > additional rules? it's not limited to __doc__ strings, or, for that matter, to attributes: spam = "spam" class c: print spam spam = "bacon" print spam print len(spam) def len(self): return 10 print c.spam the language reference uses the term "local scope" for both class and def-statements, but it's not really the same thing. the former is more like a temporary extra global scope with a (class, global) search path, names are resolved when they are found (just as in the global scope); there's no preprocessing step. for additional class issues, see the "Discussion" in the nested scopes PEP: http://www.python.org/peps/pep-0227.html hope this helps! From pje at telecommunity.com Fri Oct 7 22:28:33 2005 From: pje at telecommunity.com (Phillip J. 
Eby) Date: Fri, 07 Oct 2005 16:28:33 -0400 Subject: [Python-Dev] __doc__ behavior in class definitions Message-ID: <5.1.1.6.0.20051007162832.01f7c080@mail.telecommunity.com> At 12:15 PM 10/7/2005 -0700, Martin Maly wrote: >Based on the binding rules described in the Python documentation, I >would expect the code to throw because binding created on the line (1) >is local to the class block and all the other __doc__ uses should >reference that binding. Apparently, it is not the case. Correct - the scoping rules about local bindings causing a symbol to be local only apply to *function* scopes. Class scopes are able to refer to module-level names until the name is shadowed in the class scope. >Is this bug in Python or are __doc__ strings in classes subject to some >additional rules? Neither; the behavior you're seeing doesn't have anything to do with docstrings per se, it's just normal Python binding behavior, coupled with the fact that the class' docstring isn't set until the class suite is completed. It's currently acceptable (if questionable style) to do things like this in today's Python: X = 1 class X: X = X + 1 print X.X # this will print "2" More commonly, and less questionably, this would manifest as something like: def function_taking_foo(foo, bar): ... class Foo(blah): function_taking_foo = function_taking_foo This makes it possible to call 'function_taking_foo(aFooInstance, someBar)' or 'aFooInstance.function_taking_foo(someBar)'. I've used this pattern a couple times myself, and I believe there may actually be cases in the standard library that do something like this, although maybe not binding the method under the same name as the function. 
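[Editorial note: the scope distinction Phillip describes above is easy to see directly. A small self-contained illustration — class bodies resolve names as they go, falling back to the module scope, while function bodies treat any assigned name as local for the whole body:]

```python
# Class bodies look names up dynamically (falling back to globals),
# so a read before the class-local assignment finds the module-level
# binding. Function bodies precompute their locals, so the same read
# raises UnboundLocalError instead.

x = "module"

class C:
    before = x      # not yet shadowed: finds the module-level x
    x = "class"     # now shadowed within the class body
    after = x       # finds the class-local binding

def f():
    print(x)        # UnboundLocalError: x is assigned below, so it is
    x = "function"  # local to the entire function body

print(C.before)     # module
print(C.after)      # class

try:
    f()
except UnboundLocalError:
    print("UnboundLocalError, as expected")
```

The same rule explains Martin's original `__doc__` example: each `print __doc__` inside the class body simply reports whichever binding is visible at that moment.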
From steve at holdenweb.com Fri Oct 7 22:33:57 2005 From: steve at holdenweb.com (Steve Holden) Date: Fri, 07 Oct 2005 21:33:57 +0100 Subject: [Python-Dev] __doc__ behavior in class definitions In-Reply-To: <1DFB396200705E46B5338CA4B2E25BDE9E6FF5@DF-BANDIT-MSG.exchange.corp.microsoft.com> References: <1DFB396200705E46B5338CA4B2E25BDE9E6FF5@DF-BANDIT-MSG.exchange.corp.microsoft.com> Message-ID: Martin Maly wrote: > Hello Python-Dev, > > My name is Martin Maly and I am a developer at Microsoft, working on the > IronPython project with Jim Hugunin. I am spending lot of time making > IronPython compatible with Python to the extent possible. > > I came across a case which I am not sure if by design or a bug in Python > (Python 2.4.1 (#65, Mar 30 2005, 09:13:57)). Consider following Python > module: > > # module begin > "module doc" > > class c: > print __doc__ > __doc__ = "class doc" (1) > print __doc__ > > print c.__doc__ > # module end > > When ran, it prints: > > module doc > class doc > class doc > > Based on the binding rules described in the Python documentation, I > would expect the code to throw because binding created on the line (1) > is local to the class block and all the other __doc__ uses should > reference that binding. Apparently, it is not the case. > > Is this bug in Python or are __doc__ strings in classes subject to some > additional rules? > Well, it's nothing to do with __doc__, as the following example shows: crud = "module crud" class c: print crud crud = "class crud" print crud print c.crud As you might by now expect, this outputs module crud class crud class crud Clearly the rules for class scopes aren't quite the same as those for function scopes, as the module crud = "module crud" def f(): print crud crud = "function crud" print crud f() does indeed raise an UnboundLocalError exception. I'm not enough of a language lawyer to determine exactly why this is, but it's clear that class variables aren't scoped in the same way as function locals. 
regards Steve -- Steve Holden +44 150 684 7255 +1 800 494 3119 Holden Web LLC www.holdenweb.com PyCon TX 2006 www.python.org/pycon/ From jack at performancedrivers.com Fri Oct 7 22:52:37 2005 From: jack at performancedrivers.com (Jack Diederich) Date: Fri, 7 Oct 2005 16:52:37 -0400 Subject: [Python-Dev] __doc__ behavior in class definitions In-Reply-To: <1DFB396200705E46B5338CA4B2E25BDE9E6FF5@DF-BANDIT-MSG.exchange.corp.microsoft.com> References: <1DFB396200705E46B5338CA4B2E25BDE9E6FF5@DF-BANDIT-MSG.exchange.corp.microsoft.com> Message-ID: <20051007205237.GL6255@performancedrivers.com> On Fri, Oct 07, 2005 at 12:15:04PM -0700, Martin Maly wrote: > Hello Python-Dev, > > My name is Martin Maly and I am a developer at Microsoft, working on the > IronPython project with Jim Hugunin. I am spending lot of time making > IronPython compatible with Python to the extent possible. > > I came across a case which I am not sure if by design or a bug in Python > (Python 2.4.1 (#65, Mar 30 2005, 09:13:57)). Consider following Python > module: > > # module begin > "module doc" > > class c: > print __doc__ > __doc__ = "class doc" (1) > print __doc__ > [snip] > > Based on the binding rules described in the Python documentation, I > would expect the code to throw because binding created on the line (1) > is local to the class block and all the other __doc__ uses should > reference that binding. Apparently, it is not the case. > > Is this bug in Python or are __doc__ strings in classes subject to some > additional rules? Classes behave just like you would expect them to, for proper variations of what to expect *wink*. The class body is evaluated first with the same local/global name lookups as would happen inside another scope (e.g. a function). The results of that evaluation are then passed to the class constructor as a dict. The __new__ method of metaclasses and the less used 'new' module highlight the final step that turns a bucket of stuff in a namespace into a class. 
>>> import new >>> A = new.classobj('w00t', (object,), {'__doc__':"no help at all", 'myself':lambda x:x}) >>> a = A() >>> a.myself() <__main__.w00t object at 0xb7bc32cc> >>> a <__main__.w00t object at 0xb7bc32cc> >>> help(a) Help on w00t in module __main__ object: class w00t(__builtin__.object) | no help at all | | Methods defined here: | | lambdax | >>> Hope that helps, -jackdied From shane at hathawaymix.org Fri Oct 7 22:55:58 2005 From: shane at hathawaymix.org (Shane Hathaway) Date: Fri, 07 Oct 2005 14:55:58 -0600 Subject: [Python-Dev] Pythonic concurrency In-Reply-To: <1128712388.6251.21.camel@fsol> References: <20051006143740.287E.JCARLSON@uci.edu> <200510070145.17284.ms@cerenity.org> <20051006221436.2892.JCARLSON@uci.edu> <415220344.20051007104751@MailBlocks.com> <4346C17A.2090204@hathawaymix.org> <1128712388.6251.21.camel@fsol> Message-ID: <4346E0DE.70502@hathawaymix.org> Antoine Pitrou wrote: > A relational database has a very strict and regular data model. Also, it > has transactions. This makes it easy to precisely define concurrency at > the engine level. > > To apply the same thing to Python you would at least need : > 1. a way to define a subset of the current bag of reachable objects > which has to stay consistent w.r.t. transactions that are applied to it > (of course, you would have several such subsets in any non-trivial > application) > 2. a way to start and end a transaction on a bag of objects (begin / > commit / rollback) > 3. a precise definition of the semantics of "consistency" here : for > example, only one thread could modify a bag of objects at any given > time, and other threads would continue to see the frozen, stable version > of that bag until the next version is committed by the writing thread > > For 1), a helpful paradigm would be to define an object as being the > "root" of a bag, and all its properties would automatically and > recursively (or not ?) belong to this bag. 
One has to be careful that no > property "leaks" and makes the bag become the set of all reachable > Python objects (one could provide a means to say that a specific > property must not be transitively put in the bag). Then, use > my_object.begin_transaction() and my_object.commit_transaction(). > > The implementation of 3) does not look very obvious ;-S Well, I think you just described ZODB. ;-) I'd be happy to explain how ZODB solves those problems, if you're interested. However, ZODB doesn't provide locking, and that bothers me somewhat. If two threads try to modify an object at the same time, one of the threads will be forced to abort, unless a method has been defined for resolving the conflict. If there are too many writers, ZODB crawls. ZODB's strategy works fine when there aren't many conflicting, concurrent changes, but the complex locking done by relational databases seems to be required for handling a lot of concurrent writers. Shane From ms at cerenity.org Fri Oct 7 23:02:38 2005 From: ms at cerenity.org (Michael Sparks) Date: Fri, 7 Oct 2005 22:02:38 +0100 Subject: [Python-Dev] Pythonic concurrency In-Reply-To: <20051006221436.2892.JCARLSON@uci.edu> References: <20051006143740.287E.JCARLSON@uci.edu> <200510070145.17284.ms@cerenity.org> <20051006221436.2892.JCARLSON@uci.edu> Message-ID: <200510072202.39129.ms@cerenity.org> [ Possibly overlengthy reply. However given a multiple sets of cans of worms... ] On Friday 07 October 2005 07:25, Josiah Carlson wrote: > One thing I notice is absent from the Kamaelia page is benchmarks. That's largely for one simple reason: we haven't done any yet. At least not anything I'd call a benchmark. "There's lies, damn lies, statistics and then there's benchmarks." //Theoretically// I suspect that the system /could/ perform as well as traditional approaches to dealing with concurrent problems single threaded (and multi-thread/process). 
This is based on the recognition of two things: * Event systems (often implementing state machine type behaviour, not always though), often have intermediate buffers between states & operations. Some systems divide a problem into multiple reactors and stages and have communication between them, though this can sometimes be hidden. All we've done is make this much more explicit. * Event systems (and state machine based approaches) can often be used to effectively say "I want to stop and wait here, come back to me later" or simply "I'm doing something processor intensive, but I'm being nice and letting something else have a go". The use of generators here simply makes that particular behaviour more explicit. This is a nice bonus of Python. [neither is a negative really, just different. The first bullet has implicit buffers in the system, the latter has a more implicit state machine in the system. ICBVW here of course.] However, COULD is not is, and whilst I say "in theory", I am painfully aware that theory and practice often have a big gulf between them. Also, I'm certain that at present our performance is nowhere near optimal. We've focussed on trying to find what works from a few perspectives rather than performance (one possible definition of correctness here, but certainly not the only one). Along the way we've made compromises in favour of clarity as to what's going on, rather than performance. For example, one area we know we can optimise is the handling of message delivery. The mini-axon tutorial represents delivery between active components as being performed by an independent party - a postman. This is precisely what happens in the current system. That can be optimised for example by collapsing outboxes into inboxes (ie removing one of the lists when a linkage is made and changing the reference), and at that point you have a single intermediate buffer (much like an event/state system communicating between subsystems).
We haven't done this yet. Whilst it would partly simplify things, it makes other areas more complex, and seems like premature optimisation. However I have performed an //informal comparison// between the use of a Kamaelia type approach and a traditional approach not using any framework at all for implementing a trivial game. (Cats bouncing around the screen scaling, rotating, etc, controlled by a user) The reason I say Kamaelia-type approach is because it was a mini-axon based experiment using collapsed outboxes to inboxes (as above). The measure I used was simply framerate. This is a fair real value and has a real use - if it drops too low, the system is simply unusable. I measured the framerate before transforming the simplistic game to work well in the framework, and after transforming it. The differences were: * 5% drop in performance/framerate * The ability to reuse much of the code in other systems and environments. From that perspective it seems acceptable (for now). This *isn't* as you would probably say a rigorous or trustable benchmark, but was a useful "smoke test" if you like of the approach. From a *pragmatic* perspective, currently the system is fast enough for simple games (say a hundred, 2 hundred, maybe more, sprites active at once), for interactive applications, video players, realtime audio mixing and a variety of other things, so currently we're leaving that aside. Also from an even more pragmatic perspective, I would say if you're after performance and throughput then I'd say use Twisted, since it's a proven technology. **If** our stuff turns out to be useful, we'd like to find a way of making our stuff available inside twisted -- if they'd like it (*) -- since we're not the least bit interested in competing with anyone :-) So far *we're* finding it useful, which is all I'd personally claim, and hope that it's useful to others.
(*) The all too brief conversation I had with Tommi Virtanen at Europython suggested that he at least thought the pipeline/graphline idea was worth taking - so I'd like to do that at some point, even if it sidelines our work to date. Once we've validated the model though (which I expect to take some time, you only learn if it's validated by building things IMO), then we'll look at optimisation. (if the model is validated :-) All that said, I'm open to suggestion as to what sort of benchmark you'd like to see. I'm more interested in benchmarks that actually mean something rather than say X is better than Y though. Summarising them, no benchmarks, yet. If you're after speed, I'm certain you can find that elsewhere. If you're after an easy way of dealing with a concurrent problem, that's where we're starting from, and then optimising. We're very open to suggestions for improvement on both usability/learnability and on keeping doors open/open doors to performance though. I'd hate to have to rewrite everything in another language later simply due to poor design decisions. [ Network controlled Networked Audio Mixing Matrix ] > Very neat. How much data? What kind of throughput? What kinds of > latencies? For the test system we tested with 3 raw PCM audio data streams. That's 3 x 44.1kHz, 16 bit stereo - which is around 4.2Mbit/s of data from the network being processed realtime and output back to the network at 1.4Mbit/s. So, not huge numbers, but not insignificant amounts of data either. I suppose one thing I can take more time with now is to look at the specific latency of the mixer. It didn't *appear* to be large however. (there appeared to be similar latency in the system with or without the mixer) [[The aim of the rapid prototyping session was to see what could be done rather than to measure the results. The total time taken for coding the mixing matrix was 2.5 days.
About 1/2 day spent on finding an issue we had with network resends regarding non-blocking sockets. A day with me totally misunderstanding how mixing raw audio byte streams works. The backplane was written during that 3 day time period. The control protocol for switching on/off mixes and querying the system though was ~1.5 hours from start to finish, including testing. To experiment with what dataflow architecture might work, I knocked up a command line controlled dynamic graph viewer (add nodes, link nodes, delete nodes) in about 5 minutes and then experimented with what the system would look like if done naively. The backplane idea became clear as useful here because we wanted to allow multiple mixers. ]] A more interesting effect we found was dealing with mouse movement in pygame where we found that *huge* numbers of messages being sent one at a time and processed one at a time (with yields after each) became a huge bottleneck. It became more sense to batch the events and pass them to client surfaces. (If that makes no sense we allow pygame components to act as if they have control of the display by giving them a surface from a pygame display service. This acts essentially as a simplistic window manager. That means pygame events need to be passed through quickly and cleanly.) The reason I like using pygame for these things is because a) it's relatively raw and fast b) games are another often /naturally/ concurrent system. Also it normally allows other senses beyond reading numbers/graphs to kick in when evaluating changes "that looks better/worse", "Theres's something wrong there". > I have two recent posts about the performance and features of a (hacked > together) tuple space system Great :-) I'll have a dig around. > The only thing that it is missing is a prioritization mechanism (fifo, > numeric priority, etc.), which would get us a job scheduling kernel. Not > bad for a "message passing"/"tuple space"/"IPC" library. Sounds interesting. 
I'll try and find some time to have a look and have a play. FWIW, we're also missing a prioritisation mechanism right now. Though currently I have Simon Wittber's latest release of Nanothreads on my stack to look at. I do have a soft spot for Linda type approaches though :-) Best Regards, Michael. -- Michael Sparks, Senior R&D Engineer, Digital Media Group Michael.Sparks at rd.bbc.co.uk, http://kamaelia.sourceforge.net/ British Broadcasting Corporation, Research and Development Kingswood Warren, Surrey KT20 6NP This e-mail may contain personal views which are not the views of the BBC. From jim at zope.com Fri Oct 7 23:07:58 2005 From: jim at zope.com (Jim Fulton) Date: Fri, 07 Oct 2005 17:07:58 -0400 Subject: [Python-Dev] Pythonic concurrency In-Reply-To: <4346E0DE.70502@hathawaymix.org> References: <20051006143740.287E.JCARLSON@uci.edu> <200510070145.17284.ms@cerenity.org> <20051006221436.2892.JCARLSON@uci.edu> <415220344.20051007104751@MailBlocks.com> <4346C17A.2090204@hathawaymix.org> <1128712388.6251.21.camel@fsol> <4346E0DE.70502@hathawaymix.org> Message-ID: <4346E3AE.6020506@zope.com> Shane Hathaway wrote: > Antoine Pitrou wrote: > >>A relational database has a very strict and regular data model. Also, it >>has transactions. This makes it easy to precisely define concurrency at >>the engine level. >> >>To apply the same thing to Python you would at least need : >> 1. a way to define a subset of the current bag of reachable objects >>which has to stay consistent w.r.t. transactions that are applied to it >>(of course, you would have several such subsets in any non-trivial >>application) >> 2. a way to start and end a transaction on a bag of objects (begin / >>commit / rollback) >> 3.
a precise definition of the semantics of "consistency" here : for >>example, only one thread could modify a bag of objects at any given >>time, and other threads would continue to see the frozen, stable version >>of that bag until the next version is committed by the writing thread >> >>For 1), a helpful paradigm would be to define an object as being the >>"root" of a bag, and all its properties would automatically and >>recursively (or not ?) belong to this bag. One has to be careful that no >>property "leaks" and makes the bag become the set of all reachable >>Python objects (one could provide a means to say that a specific >>property must not be transitively put in the bag). Then, use >>my_object.begin_transaction() and my_object.commit_transaction(). >> >>The implementation of 3) does not look very obvious ;-S > > > Well, I think you just described ZODB. ;-) I'd be happy to explain how > ZODB solves those problems, if you're interested. > > However, ZODB doesn't provide locking, and that bothers me somewhat. If > two threads try to modify an object at the same time, one of the threads > will be forced to abort, unless a method has been defined for resolving > the conflict. If there are too many writers, ZODB crawls. ZODB's > strategy works fine when there aren't many conflicting, concurrent > changes, but the complex locking done by relational databases seems to > be required for handling a lot of concurrent writers. I don't think it would be all that hard to use a locking (rather than a time-stamp) strategy for ZODB, although ZEO would make this extra challenging. In any case, the important thing to agree on here is that transactions provide a useful approach to concurrency control in the case where - separate control flows are independent, and - we need to mediate access to shared resources. Someone else pointed out essentially the same thing at the beginning of this thread. Jim -- Jim Fulton mailto:jim at zope.com Python Powered! 
CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From solipsis at pitrou.net Fri Oct 7 23:19:25 2005 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 07 Oct 2005 23:19:25 +0200 Subject: [Python-Dev] Pythonic concurrency In-Reply-To: <4346E0DE.70502@hathawaymix.org> References: <20051006143740.287E.JCARLSON@uci.edu> <200510070145.17284.ms@cerenity.org> <20051006221436.2892.JCARLSON@uci.edu> <415220344.20051007104751@MailBlocks.com> <4346C17A.2090204@hathawaymix.org> <1128712388.6251.21.camel@fsol> <4346E0DE.70502@hathawaymix.org> Message-ID: <1128719965.6251.46.camel@fsol> > Well, I think you just described ZODB. ;-) *gasp* > I'd be happy to explain how > ZODB solves those problems, if you're interested. Well, yes, I'm interested :) (I don't anything about Zope internals though, and I've never even used it) From shane at hathawaymix.org Sat Oct 8 00:12:13 2005 From: shane at hathawaymix.org (Shane Hathaway) Date: Fri, 07 Oct 2005 16:12:13 -0600 Subject: [Python-Dev] Pythonic concurrency In-Reply-To: <1128719965.6251.46.camel@fsol> References: <20051006143740.287E.JCARLSON@uci.edu> <200510070145.17284.ms@cerenity.org> <20051006221436.2892.JCARLSON@uci.edu> <415220344.20051007104751@MailBlocks.com> <4346C17A.2090204@hathawaymix.org> <1128712388.6251.21.camel@fsol> <4346E0DE.70502@hathawaymix.org> <1128719965.6251.46.camel@fsol> Message-ID: <4346F2BD.8060301@hathawaymix.org> Antoine Pitrou wrote: >>I'd be happy to explain how >>ZODB solves those problems, if you're interested. > > > Well, yes, I'm interested :) > (I don't anything about Zope internals though, and I've never even used > it) Ok. Quoting your list: > To apply the same thing to Python you would at least need : > 1. a way to define a subset of the current bag of reachable objects > which has to stay consistent w.r.t. 
transactions that are applied > to it (of course, you would have several such subsets in any > non-trivial application) ZODB holds a tree of objects. When you add an attribute to an object managed by ZODB, you're expanding the tree. Consistency comes from several features: - Each thread has its own lazy copy of the object tree. - The application doesn't see changes to the object tree except at transaction boundaries. - The ZODB store keeps old revisions, and the new MVCC feature lets the application see the object system as it was at the beginning of the transaction. - If you make a change to the object tree that conflicts with a concurrent change, all changes to that copy of the object tree are aborted. > 2. a way to start and end a transaction on a bag of objects (begin / > commit / rollback) ZODB includes a transaction module that does just that. In fact, the module is so useful that I think it belongs in the standard library. > 3. a precise definition of the semantics of "consistency" here : for > example, only one thread could modify a bag of objects at any given > time, and other threads would continue to see the frozen, > stable version of that bag until the next version is committed by the > writing thread As mentioned above, the key is that ZODB maintains a copy of the objects per thread. A fair amount of RAM is lost that way, but the benefit in simplicity is tremendous. You also talked about the risk that applications would accidentally pull a lot of objects into the tree just by setting an attribute. That can and does happen, but the most common case is already solved by the pickle machinery: if you pickle something global like a class, the pickle stores the name and location of the class instead of the class itself. 
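[Editorial note: the abort-on-conflict strategy Shane describes can be sketched in deliberately simplified form. All names here are hypothetical — this is not ZODB's actual API — but it shows the optimistic idea: each transaction works on a private copy and records the revision it started from, and a commit is refused if another writer got there first:]

```python
# Toy optimistic concurrency control: no locks, just conflict
# detection at commit time. Hypothetical names, not the ZODB API.
import copy

class ConflictError(Exception):
    pass

class ToyStore:
    def __init__(self, state):
        self.state = state      # the committed object tree
        self.revision = 0       # bumped on every successful commit

    def begin(self):
        # Each transaction gets its own copy of the tree.
        return Transaction(self, copy.deepcopy(self.state), self.revision)

class Transaction:
    def __init__(self, store, state, base_revision):
        self.store = store
        self.state = state                  # private working copy
        self.base_revision = base_revision  # revision we started from

    def commit(self):
        if self.store.revision != self.base_revision:
            # Someone else committed first: abort rather than corrupt.
            raise ConflictError("concurrent modification")
        self.store.state = self.state
        self.store.revision += 1

store = ToyStore({"balance": 100})
t1 = store.begin()
t2 = store.begin()
t1.state["balance"] += 10
t1.commit()                      # fine: first writer wins
t2.state["balance"] -= 5
try:
    t2.commit()                  # conflict: t2 started from revision 0
except ConflictError:
    print("transaction aborted, retry needed")
print(store.state["balance"])    # 110
```

As Shane notes, this trades locking complexity for retries: it works well when conflicts are rare, and degrades when there are many concurrent writers.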
Shane From BruceEckel-Python3234 at mailblocks.com Sat Oct 8 00:26:47 2005 From: BruceEckel-Python3234 at mailblocks.com (Bruce Eckel) Date: Fri, 7 Oct 2005 16:26:47 -0600 Subject: [Python-Dev] Pythonic concurrency In-Reply-To: <200510072202.39129.ms@cerenity.org> References: <20051006143740.287E.JCARLSON@uci.edu> <200510070145.17284.ms@cerenity.org> <20051006221436.2892.JCARLSON@uci.edu> <200510072202.39129.ms@cerenity.org> Message-ID: <1245045308.20051007162647@MailBlocks.com> > //Theoretically// I suspect that the system /could/ perform as well as > traditional approaches to dealing with concurrent problems single threaded > (and multi-thread/process). I also think it's important to factor in the possibility of multiprocessors. If Kamaelia (for example) has a very safe and straightforward programming model so that more people are easily able to use it, but it has some performance impact over more complex systems, I think the ease of use issue opens up far greater possibilities if you include multiprocessing -- because if you can easily write concurrent programs in Python, then Python could gain a significant advantage over less agile languages when multiprocessors become common. That is, with multiprocessors, it could be way easier to write a program in Python that also runs way faster than the competition. Yes, of course given enough time they might theoretically be able to write a program that is as fast or faster using their threading mechanism, but it would be so hard by comparison that they'll either never get it done or never be sure if it's reliable. That's what I'm looking for. 
Bruce Eckel http://www.BruceEckel.com mailto:BruceEckel-Python3234 at mailblocks.com Contains electronic books: "Thinking in Java 3e" & "Thinking in C++ 2e" Web log: http://www.artima.com/weblogs/index.jsp?blogger=beckel Subscribe to my newsletter: http://www.mindview.net/Newsletter My schedule can be found at: http://www.mindview.net/Calendar From ms at cerenity.org Sat Oct 8 00:47:13 2005 From: ms at cerenity.org (Michael Sparks) Date: Fri, 7 Oct 2005 23:47:13 +0100 Subject: [Python-Dev] Pythonic concurrency In-Reply-To: <1245045308.20051007162647@MailBlocks.com> References: <20051006143740.287E.JCARLSON@uci.edu> <200510072202.39129.ms@cerenity.org> <1245045308.20051007162647@MailBlocks.com> Message-ID: <200510072347.14006.ms@cerenity.org> On Friday 07 October 2005 23:26, Bruce Eckel wrote: > I think the ease of use issue opens up far greater possibilities if you > include multiprocessing ... > That's what I'm looking for. In which case that's an area we need to push our work into sooner rather than later. After all, the PS3 and CELL arrive next year. Sun already has some interesting stuff shipping. I'd like to use that kit effectively, and more importantly make using that kit effectively available to collegues sooner rather than later. That really means multiprocess "now" not later. BTW, I hope it's clear that I'm not saying concurrency is easy per se (noting your previous post ;-) but rather than it /should/ be made as simple as is humanly possible. Thanks! Michael. -- Michael Sparks, Senior R&D Engineer, Digital Media Group Michael.Sparks at rd.bbc.co.uk, http://kamaelia.sourceforge.net/ British Broadcasting Corporation, Research and Development Kingswood Warren, Surrey KT20 6NP This e-mail may contain personal views which are not the views of the BBC. 
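[Editorial note: for readers unfamiliar with the style Michael describes earlier in the thread, the inbox/outbox generator-component idea can be caricatured in a few lines. The names and the scheduler here are hypothetical — this is not the real Kamaelia/mini-axon API — but it shows components as generators that communicate only through shared message lists and yield control cooperatively:]

```python
# Caricature of the generator-component style: each component is a
# generator, message delivery is just a shared list ("collapsed
# outbox-into-inbox"), and a trivial round-robin scheduler drives them.

def producer(outbox):
    for i in range(3):
        outbox.append(i)    # post a message to the linked inbox
        yield               # hand control back to the scheduler

def doubler(inbox, outbox):
    while True:
        while inbox:
            outbox.append(inbox.pop(0) * 2)
        yield

def run(components, steps=20):
    # Round-robin: give each component one slice per pass.
    for _ in range(steps):
        for g in components:
            try:
                next(g)
            except StopIteration:
                pass        # finished components just get skipped

link = []       # producer's outbox wired directly to doubler's inbox
results = []
run([producer(link), doubler(link, results)])
print(results)  # [0, 2, 4]
```

The real system adds linkages, a postman, services, and so on, but the cooperative-yield-plus-buffers core is the part being benchmarked in the discussion above.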
From ms at cerenity.org Sat Oct 8 00:49:42 2005 From: ms at cerenity.org (Michael Sparks) Date: Fri, 7 Oct 2005 23:49:42 +0100 Subject: [Python-Dev] Pythonic concurrency In-Reply-To: <1093762964.20051006140637@MailBlocks.com> References: <200510062054.56985.ms@cerenity.org> <1093762964.20051006140637@MailBlocks.com> Message-ID: <200510072349.42943.ms@cerenity.org> On Thursday 06 October 2005 21:06, Bruce Eckel wrote: > So yes indeed, this is quite high on my list to research. Looks like > people there have been doing some interesting work. > > Right now I'm just trying to cast a net, so that people can put in > ideas, for when the Java book is done and I can spend more time on it. Thanks for your kind words. Hopefully it's of use! :-) Michael. From jason.orendorff at gmail.com Sat Oct 8 00:51:21 2005 From: jason.orendorff at gmail.com (Jason Orendorff) Date: Fri, 7 Oct 2005 18:51:21 -0400 Subject: [Python-Dev] __doc__ behavior in class definitions In-Reply-To: <1DFB396200705E46B5338CA4B2E25BDE9E6FF5@DF-BANDIT-MSG.exchange.corp.microsoft.com> References: <1DFB396200705E46B5338CA4B2E25BDE9E6FF5@DF-BANDIT-MSG.exchange.corp.microsoft.com> Message-ID: Martin, These two cases generate different bytecode. def foo(): # foo.func_code.co_flags == 0x43 print x # LOAD_FAST 0 x = 3 class Foo: # .co_flags == 0x40 print x # LOAD_NAME 'x' x = 3 In functions, local variables are just numbered slots. (co_flags bits 1 and 2 indicate this.) The LOAD_FAST opcode is used. If the slot is empty, LOAD_FAST throws. In other code, the local variables are actually stored in a dictionary. LOAD_NAME is used. This does a locals dictionary lookup; failing that, it falls back on the globals dictionary; and failing that, it falls back on builtins. Why the discrepancy? Beats me. I would definitely implement what CPython does up to this point, if that's your question. 
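[Editorial note: the split Jason describes can be confirmed with the dis module. A sketch — opcode names vary a little across Python versions (newer interpreters may emit LOAD_FAST_CHECK for a possibly-unbound local), and a compiled module-level snippet is used here as a stand-in for a class body, since both use LOAD_NAME:]

```python
# Function bodies compile local variables to numbered slots
# (LOAD_FAST and friends); class/module bodies go through dictionary
# name lookup (LOAD_NAME).
import dis

def foo():
    print(x)
    x = 3

# Module-level code compiles name reads the same way a class body
# does for this purpose: via LOAD_NAME.
class_like = compile("print(x)\nx = 3", "<class body>", "exec")

def opnames(code):
    return [ins.opname for ins in dis.get_instructions(code)]

print(any(op.startswith("LOAD_FAST") for op in opnames(foo.__code__)))  # True
print("LOAD_NAME" in opnames(class_like))                               # True
```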
Btw, functions that use 'exec' are in their own category way out there:

def foo2():       # foo2.func_code.co_flags == 0x42
    print x       # LOAD_NAME 'x'
    exec "x=3"    # don't ever do this, it screws everything up
    print x

Pretty weird. Jython seems to implement this. -j From ncoghlan at gmail.com Sat Oct 8 00:54:16 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 08 Oct 2005 08:54:16 +1000 Subject: [Python-Dev] Pythonic concurrency In-Reply-To: <415220344.20051007104751@MailBlocks.com> References: <20051006143740.287E.JCARLSON@uci.edu> <200510070145.17284.ms@cerenity.org> <20051006221436.2892.JCARLSON@uci.edu> <415220344.20051007104751@MailBlocks.com> Message-ID: <4346FC98.5050504@gmail.com> Bruce Eckel wrote: > I always have a problem with this. After many years of studying > concurrency on-and-off, I continue to believe that threading is very > difficult (indeed, the more I study it, the more difficult I > understand it to be). And I admit this. The comments I sometimes get > back are to the effect that "threading really isn't that hard." Thus, > I am just too dense to get it. The few times I have encountered anyone saying anything resembling "threading is easy", it was because the full sentence went something like "threading is easy if you use message passing and copy-on-send or release-reference-on-send to communicate between threads, and limit the shared data structures to those required to support the messaging infrastructure". And most of the time there was an implied "compared to using semaphores and locks directly, " at the start. Which is obviously a far cry from simply saying "threading is easy". If I encountered anyone who thought it was easy *in general*, then I would fear any threaded code they wrote, because they clearly weren't thinking about the problem hard enough ;) Cheers, Nick.
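[The message-passing style Nick describes can be sketched with nothing but the standard library's queue module. The worker function and the None sentinel below are invented for illustration: all mutable state lives in one thread, and other threads only send it messages.]

```python
import queue
import threading

def worker(inbox, outbox):
    # All mutable state (here, just `total`) is owned by this thread;
    # other threads never touch it directly, they only send messages.
    total = 0
    while True:
        msg = inbox.get()
        if msg is None:      # sentinel: shut down and report the result
            outbox.put(total)
            return
        total += msg         # ints are immutable, so "copy-on-send" is free

inbox, outbox = queue.Queue(), queue.Queue()
t = threading.Thread(target=worker, args=(inbox, outbox))
t.start()
for i in range(10):
    inbox.put(i)
inbox.put(None)
t.join()
result = outbox.get()
print(result)  # 45
```

No locks appear anywhere, because the only shared structures are the two queues, which handle their own locking internally.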
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.blogspot.com From ncoghlan at gmail.com Sat Oct 8 01:31:48 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 08 Oct 2005 09:31:48 +1000 Subject: [Python-Dev] Proposed changes to PEP 343 In-Reply-To: References: <4346467D.5010005@iinet.net.au> <43466C3B.50704@gmail.com> Message-ID: <43470564.6040903@gmail.com> Fredrik Lundh wrote: > Nick Coghlan wrote: >>However, requiring a decorator to get a slot to work right looks pretty ugly >>to me, too. > > > the whole concept might be perfectly fine on the "this construct > corresponds to this code" level, but if you immediately end up with things that > are not what they seem, and names that don't mean what they say, either > the design or the description of it needs work. > > ("yes, I know you can use this class to manage the context, but it's not > really a context manager, because it's that method that's a manager, not > the class itself. yes, all the information that belongs to the context are > managed by the class, but that doesn't make... oh, shut up and read the > PEP") Heh. OK, my current inclination is to make the new paragraph at the end of the "Generator Decorator" section read like this: 4. Add a paragraph to the end of the "Generator Decorator" section: If a generator is used to write a context's __with__ method, then Python's type machinery will automatically take care of applying this decorator. This means that it is just as easy to write a generator-based context manager for a context as it is to write a generator-based iterator for an iterable (see the decimal.Context example below). And then update the decimal.Context example to remove the @contextmanager decorator. Cheers, Nick.
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.blogspot.com From rhamph at gmail.com Sat Oct 8 02:12:31 2005 From: rhamph at gmail.com (Adam Olsen) Date: Fri, 7 Oct 2005 18:12:31 -0600 Subject: [Python-Dev] Sandboxed Threads in Python Message-ID: Okay, basic principle first. You start with a sandboxed thread that has access to nothing. No modules, no builtins, *nothing*. This means it can run without the GIL but it can't do any work. To make it do something useful we need to give it two things: first, immutable types that can be safely accessed without locks, and second a thread-safe queue to coordinate. With those you can bring modules and builtins back into the picture, either by making them immutable or using a proxy that handles all the methods in a single thread. Unfortunately python has a problem with immutable types. For the most part it uses an honor system, trusting programmers not to make a class that claims to be immutable yet changes state anyway. We need more than that, and "freezing" a dict would work well enough, so it's not the problem. The problem is the reference counting, and even if we do it "safely" all the memory writes just kill performance so we need to avoid it completely. Turns out it's quite easy and it doesn't harm performance of existing code or require modification (but a recompile is necessary). The idea is to only use a cyclic garbage collector for cleaning them up, which means we need to disable the reference counting. That requires we modify Py_INCREF and Py_DECREF to be a no-op if ob_refcnt is set to a magic constant (probably a negative value). That's all it takes. Modify Py_INCREF and Py_DECREF to check for a magic constant. Ahh, but the performance? See for yourself.
Normal Py_INCREF/Py_DECREF

rhamph at factor:~/src/Python-2.4.1$ ./python Lib/test/pystone.py 500000
Pystone(1.1) time for 500000 passes = 13.34
This machine benchmarks at 37481.3 pystones/second

Modified Py_INCREF/Py_DECREF with magic constant

rhamph at factor:~/src/Python-2.4.1-sandbox$ ./python Lib/test/pystone.py 500000
Pystone(1.1) time for 500000 passes = 13.38
This machine benchmarks at 37369.2 pystones/second

The numbers aren't significantly different. In fact the second one is often slightly faster, which shows the difference is smaller than the statistical noise. So to sum up, by prohibiting mutable objects from being transferred between sandboxes we can achieve scalability on multiple CPUs, making threaded programming easier and more reliable, as a bonus get secure sandboxes[1], and do that all while maintaining single-threaded performance and requiring minimal changes to existing C modules (recompiling). A "proof of concept" patch to Py_INCREF/Py_DECREF (only demonstrates performance effects, does not create or utilize any new functionality) can be found here: https://sourceforge.net/tracker/index.php?func=detail&aid=1316653&group_id=5470&atid=305470 [1] We need to remove any backdoor methods of getting to mutable objects outside of your sandbox, which gets us most of the way towards a restricted execution environment. -- Adam Olsen, aka Rhamphoryncus From pje at telecommunity.com Sat Oct 8 02:51:41 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Fri, 07 Oct 2005 20:51:41 -0400 Subject: [Python-Dev] Sandboxed Threads in Python In-Reply-To: Message-ID: <5.1.1.6.0.20051007203427.02005e30@mail.telecommunity.com> At 06:12 PM 10/7/2005 -0600, Adam Olsen wrote: >Okay, basic principle first. You start with a sandboxed thread that >has access to nothing. No modules, no builtins, *nothing*. This >means it can run without the GIL but it can't do any work. It sure can't. You need at least the threadstate and a builtins dictionary to do any work.
> To make it >do something useful we need to give it two things: first, immutable >types that can be safely accessed without locks, This is harder than it sounds. Integers, for example, have a custom allocator and a free list, not to mention a small-integer cache. You would somehow need to duplicate all that for each sandbox, or else you have to make those integers immortal using your "magic constant". >Turns out it's quite easy and it doesn't harm performance of existing >code or require modification (but a recompile is necessary). The idea >is to only use a cyclic garbage collector for cleaning them up, Um, no, actually. You need a mark-and-sweep GC or something of that ilk. Python's GC only works with objects that *have refcounts*, and it works by clearing objects that are in cycles. The clearing causes DECREF-ing, which then causes objects to be freed. If you have objects without refcounts, they would be immortal and utterly unrecoverable. >which >means we need to disable the reference counting. That requires we >modify Py_INCREF and Py_DECREF to be a no-op if ob_refcnt is set to a >magic constant (probably a negative value). And any object with the magic refcount will live *forever*, unless you manually deallocate it. >That's all it takes. Modify Py_INCREF and Py_DECREFs to check for a >magic constant. Ahh, but the performance? See for yourself. First, you need to implement a garbage collection scheme that can deal with not having refcounts. Otherwise you're not comparing apples to apples here, and your programs will leak like crazy. Note that implementing a root-based GC for Python is non-trivial, since extension modules can store pointers to PyObjects anywhere they like. Further, many Python objects don't even support being tracked by the current cycle collector. So, changing this would probably require a lot of C extensions to be rewritten to support the needed API changes for the new garbage collection strategy. 
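[Phillip's point about the small-integer cache is observable from pure Python. The cached range is a CPython implementation detail (historically -5 through 256), not a language guarantee, which is exactly why the `is` results below hold on CPython but shouldn't be relied on in real code.]

```python
# Create the integers at runtime so the compiler can't merge constants.
small_a, small_b = int("7"), int("7")      # inside the cache range
big_a, big_b = int("500"), int("500")      # outside the cache range

print(small_a is small_b)  # True under CPython: both are the cached object
print(big_a is big_b)      # False under CPython: two freshly allocated ints
```

Any per-sandbox allocator would have to decide what to do with this shared cache, since every sandbox doing arithmetic touches the refcounts of these cached objects.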
>So to sum up, by prohibiting mutable objects from being transferred >between sandboxes we can achieve scalability on multiple CPUs, making >threaded programming easier and more reliable, as a bonus get secure >sandboxes[1], and do that all while maintaining single-threaded >performance and requiring minimal changes to existing C modules >(recompiling). Unfortunately, you have only succeeded in restating the problem, not reducing its complexity. :) In fact, you may have increased the complexity, since now you need a threadsafe garbage collector, too. Oh, and don't forget - newstyle classes keep weak references to all their subclasses, which means for example that every time you subclass 'dict', you're modifying the "immutable" 'dict' class. So, unless you recreate all the classes in each sandbox, you're back to needing locking. And if you recreate everything in each sandbox, well, I think you've just reinvented "processes". :) From ncoghlan at gmail.com Sat Oct 8 03:06:28 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 08 Oct 2005 11:06:28 +1000 Subject: [Python-Dev] Sandboxed Threads in Python In-Reply-To: <5.1.1.6.0.20051007203427.02005e30@mail.telecommunity.com> References: <5.1.1.6.0.20051007203427.02005e30@mail.telecommunity.com> Message-ID: <43471B94.60104@gmail.com> Phillip J. Eby wrote: > Oh, and don't forget - newstyle classes keep weak references to all their > subclasses, which means for example that every time you subclass 'dict', > you're modifying the "immutable" 'dict' class. So, unless you recreate all > the classes in each sandbox, you're back to needing locking. And if you > recreate everything in each sandbox, well, I think you've just reinvented > "processes". :) After all, there's a reason Bruce Eckel's recent post about multi-processing attracted a fair amount of interest. Cheers, Nick. 
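[The subclass bookkeeping Phillip mentions is visible from pure Python: merely defining a subclass changes what `dict.__subclasses__()` reports, so state hanging off the supposedly immutable `dict` type really does mutate.]

```python
# Snapshot dict's subclass list, define a subclass, and compare.
before = set(dict.__subclasses__())

class MyDict(dict):
    pass

after = set(dict.__subclasses__())
print(MyDict in after - before)  # True: dict's bookkeeping changed
```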
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.blogspot.com From rhamph at gmail.com Sat Oct 8 03:17:01 2005 From: rhamph at gmail.com (Adam Olsen) Date: Fri, 7 Oct 2005 19:17:01 -0600 Subject: [Python-Dev] Sandboxed Threads in Python In-Reply-To: <5.1.1.6.0.20051007203427.02005e30@mail.telecommunity.com> References: <5.1.1.6.0.20051007203427.02005e30@mail.telecommunity.com> Message-ID: On 10/7/05, Phillip J. Eby wrote: > At 06:12 PM 10/7/2005 -0600, Adam Olsen wrote: > >Okay, basic principal first. You start with a sandboxed thread that > >has access to nothing. No modules, no builtins, *nothing*. This > >means it can run without the GIL but it can't do any work. > > It sure can't. You need at least the threadstate and a builtins dictionary > to do any work. > > > > To make it > >do something useful we need to give it two things: first, immutable > >types that can be safely accessed without locks, > > This is harder than it sounds. Integers, for example, have a custom > allocator and a free list, not to mention a small-integer cache. You would > somehow need to duplicate all that for each sandbox, or else you have to > make those integers immortal using your "magic constant". Yes, we'd probably want some per-sandbox allocators. I'm no expert on that but I know it can be done. > >Turns out it's quite easy and it doesn't harm performance of existing > >code or require modification (but a recompile is necessary). The idea > >is to only use a cyclic garbage collector for cleaning them up, > > Um, no, actually. You need a mark-and-sweep GC or something of that > ilk. Python's GC only works with objects that *have refcounts*, and it > works by clearing objects that are in cycles. The clearing causes > DECREF-ing, which then causes objects to be freed. If you have objects > without refcounts, they would be immortal and utterly unrecoverable. 
Perhaps I wasn't clear enough, I was assuming appropriate changes to the GC would be done. The important thing is it can be done without changing the interface that the existing modules use. > >which > >means we need to disable the reference counting. That requires we > >modify Py_INCREF and Py_DECREF to be a no-op if ob_refcnt is set to a > >magic constant (probably a negative value). > > And any object with the magic refcount will live *forever*, unless you > manually deallocate it. See above. > >That's all it takes. Modify Py_INCREF and Py_DECREFs to check for a > >magic constant. Ahh, but the performance? See for yourself. > > First, you need to implement a garbage collection scheme that can deal with > not having refcounts. Otherwise you're not comparing apples to apples > here, and your programs will leak like crazy. > > Note that implementing a root-based GC for Python is non-trivial, since > extension modules can store pointers to PyObjects anywhere they > like. Further, many Python objects don't even support being tracked by the > current cycle collector. > > So, changing this would probably require a lot of C extensions to be > rewritten to support the needed API changes for the new garbage collection > strategy. They only need to be rewritten if you want them to provide an immutable type that can be transferred between sandboxes. Short of that you can make the module object itself immutable, and from it create mutable instances that are private to each sandbox and not sharable. If you make no changes at all the module still works, but is only usable from the main thread. That allows us to transition incrementally. 
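[Adam's fallback (mutable instances private to one sandbox, reachable by others only through a single owning thread) amounts to a proxy plus a message queue. A minimal sketch with invented names; OwnerThreadProxy is not an API from the thread, just an illustration of the "proxy that handles all the methods in a single thread" idea mentioned at the start:]

```python
import queue
import threading

class OwnerThreadProxy:
    """Route every method call to the single thread that owns the
    wrapped mutable object, so the object itself needs no locking."""

    def __init__(self, obj):
        self._obj = obj
        self._calls = queue.Queue()
        self._thread = threading.Thread(target=self._loop, daemon=True)
        self._thread.start()

    def _loop(self):
        # Only this thread ever touches self._obj.
        while True:
            name, args, reply = self._calls.get()
            if name is None:
                return
            reply.put(getattr(self._obj, name)(*args))

    def call(self, name, *args):
        reply = queue.Queue()
        self._calls.put((name, args, reply))
        return reply.get()       # block until the owner thread answers

    def close(self):
        self._calls.put((None, (), None))

proxy = OwnerThreadProxy([])     # a plain mutable list, owned by one thread
proxy.call("append", 1)
proxy.call("append", 2)
print(proxy.call("__len__"))     # 2
proxy.close()
```

The cost is a queue round-trip per method call, which is why the thread treats this as a fallback for objects that can't be made immutable rather than the common path.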
> >So to sum up, by prohibiting mutable objects from being transferred > >between sandboxes we can achieve scalability on multiple CPUs, making > >threaded programming easier and more reliable, as a bonus get secure > >sandboxes[1], and do that all while maintaining single-threaded > >performance and requiring minimal changes to existing C modules > >(recompiling). > > Unfortunately, you have only succeeded in restating the problem, not > reducing its complexity. :) In fact, you may have increased the > complexity, since now you need a threadsafe garbage collector, too. > > Oh, and don't forget - newstyle classes keep weak references to all their > subclasses, which means for example that every time you subclass 'dict', > you're modifying the "immutable" 'dict' class. So, unless you recreate all > the classes in each sandbox, you're back to needing locking. And if you > recreate everything in each sandbox, well, I think you've just reinvented > "processes". :) I was aware that weakrefs needed some special handling (I just forgot to mention it), but I didn't know it was used by subclassing. Unfortunately I don't know what purpose it serves so I can't contemplate how to deal with it. I need to stress that *only* the new, immutable and "thread-safe mark-and-sweep" types would be affected by these changes. Everything else would continue to exist as it did before, and the benchmark exists to show they can coexist without killing performance. -- Adam Olsen, aka Rhamphoryncus From ncoghlan at gmail.com Sat Oct 8 03:19:58 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 08 Oct 2005 11:19:58 +1000 Subject: [Python-Dev] PEP 342 suggestion: start(), __call__() and unwind_call() methods In-Reply-To: <5.1.1.6.0.20051007134543.03b2f728@mail.telecommunity.com> References: <5.1.1.6.0.20051007134543.03b2f728@mail.telecommunity.com> Message-ID: <43471EBE.40401@gmail.com> Phillip J. 
Eby wrote: > At 09:50 PM 10/7/2005 +1000, Nick Coghlan wrote: > >> Notice how a non-coroutine callable can be yielded, and it will still >> work >> happily with the scheduler, because the desire to continue execution is >> indicated by the ContinueIteration exception, rather than by the type >> of the >> returned value. > > > Whaaaa? You raise an exception to indicate the *normal* case? That > seems, um... well, a Very Bad Idea. The sheer backwardness of my idea occurred to me after I'd got some sleep :) > Last, but far from least, as far as I can tell you can implement all of > these semantics using PEP 342 as it sits. That is, it's very simple to > make decorators or classes that add those semantics. I don't see > anything that requires them to be part of Python. Yeah, I've now realised that you can do all of this more simply by doing it directly in the scheduler using StopIteration to indicate when the coroutine is done, and using yield to indicate "I'm not done yet". So with a bit of thought, I came up with a scheduler that has all the benefits I described, and only uses the existing PEP 342 methods. When writing a coroutine for this scheduler, you can do 6 things via the scheduler:

1. Raise StopIteration to indicate "I'm done" and return None to your caller
2. Raise StopIteration with a single argument to return a value other than None to your caller
3. Raise a different exception and have that exception propagate up to your caller
4. Yield None to allow other coroutines to be executed
5. Yield a coroutine to request a call to that coroutine
6. Yield a callable to request an asynchronous call using that object

Yielding anything else, or trying to raise StopIteration with more than one argument results in a TypeError being raised *at the point of the offending yield or raise statement*, rather than taking out the scheduler itself. The more I explore the possibilities of PEP 342, the more impressed I am by the work that went into it! Cheers, Nick. P.S.
Here's the Trampoline scheduler described above:

import collections
import sys
import types

class Trampoline:
    """Manage communications between coroutines"""
    running = False

    def __init__(self):
        self.queue = collections.deque()

    def add(self, coroutine):
        """Request that a coroutine be executed"""
        self.schedule(coroutine)

    def run(self):
        result = None
        self.running = True
        try:
            while self.running and self.queue:
                func = self.queue.popleft()
                result = func()
            return result
        finally:
            self.running = False

    def stop(self):
        self.running = False

    def schedule(self, coroutine, stack=(), call_result=None, *exc):
        # Define the new pseudothread
        def pseudothread():
            try:
                if exc:
                    callee = coroutine.throw(call_result, *exc)
                else:
                    callee = coroutine(call_result)
            except (StopIteration), ex:
                # Coroutine finished cleanly
                if stack:
                    # Send the result to the caller
                    caller = stack[0]
                    prev_stack = stack[1]
                    if len(ex.args) > 1:
                        # Raise a TypeError in the current coroutine
                        self.schedule(coroutine, stack, TypeError,
                                      "Too many arguments to StopIteration")
                    elif ex.args:
                        self.schedule(caller, prev_stack, ex.args[0])
                    else:
                        self.schedule(caller, prev_stack)
            except:
                # Coroutine finished with an exception
                if stack:
                    # send the error back to the caller
                    caller = stack[0]
                    prev_stack = stack[1]
                    self.schedule(caller, prev_stack, *sys.exc_info())
                else:
                    # Nothing left in this pseudothread to handle it,
                    # let it propagate to the run loop
                    raise
            else:
                # Coroutine isn't finished yet
                if callee is None:
                    # Reschedule the current coroutine
                    self.schedule(coroutine, stack)
                elif isinstance(callee, types.GeneratorType):
                    # Requested a call to another coroutine
                    self.schedule(callee, (coroutine, stack))
                elif callable(callee):
                    # Requested an asynchronous call
                    self._make_async_call(callee, coroutine, stack)
                else:
                    # Raise a TypeError in the current coroutine
                    self.schedule(coroutine, stack, TypeError,
                                  "Illegal argument to yield")
        # Add the new pseudothread to the execution queue
        self.queue.append(pseudothread)

    def _make_async_call(self, blocking_call, caller, stack):
        # Assume @threaded decorator takes care of
        # - returning a function with a call method which kick starts
        #   the function execution and returns a Future object to give
        #   access to the result.
        # - farming call out to a physical thread pool
        # - keeping the Thread object executing the async call alive
        #   after this function exits
        @threaded
        def async_call():
            try:
                result = blocking_call()
            except:
                # send the error back to the caller
                self.schedule(caller, stack, *sys.exc_info())
            else:
                # Send the result back to the caller
                self.schedule(caller, stack, result)
        # Start the asynchronous call
        async_call()

-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.blogspot.com From jcarlson at uci.edu Sat Oct 8 04:05:57 2005 From: jcarlson at uci.edu (Josiah Carlson) Date: Fri, 07 Oct 2005 19:05:57 -0700 Subject: [Python-Dev] Sandboxed Threads in Python In-Reply-To: References: <5.1.1.6.0.20051007203427.02005e30@mail.telecommunity.com> Message-ID: <20051007183215.28A4.JCARLSON@uci.edu> Adam Olsen wrote: > I need to stress that *only* the new, immutable and "thread-safe > mark-and-sweep" types would be affected by these changes. Everything > else would continue to exist as it did before, and the benchmark > exists to show they can coexist without killing performance. All the benchmark showed was that checking for a constant in the refcount during in/decrefing, and not garbage collecting those objects, didn't adversely affect performance. As an aside, there's also the ugly bit about being able to guarantee that an object is immutable. I personally mutate Python strings in my C code all the time (long story, not to be discussed here), and if I can do it now, then any malicious or "inventive" person can do the same in this "sandboxed thread" Python "of the future".
At least in the case of integers, one could work the tagged integer idea to bypass the freelist issue that Phillip offered, but in general, I don't believe there exists a truly immutable type as long as there are C extensions and/or ctypes. Further, the work to actually implement a new garbage collector for Python in order to handle these 'immutable' types seems to me to be more trouble than it is worth. - Josiah From foom at fuhm.net Sat Oct 8 04:50:11 2005 From: foom at fuhm.net (James Y Knight) Date: Fri, 7 Oct 2005 22:50:11 -0400 Subject: [Python-Dev] Proposal for 2.5: Returning values from PEP 342 enhanced generators In-Reply-To: <4340C76E.8020502@satori.za.net> References: <4340C76E.8020502@satori.za.net> Message-ID: <4047F300-5065-4573-9D39-3117FC3D6E2B@fuhm.net> On Oct 3, 2005, at 1:53 AM, Piet Delport wrote: > For generators written in this style, "yield" means "suspend > execution of the > current call until the requested result/resource can be provided", and > "return" regains its full conventional meaning of "terminate the > current call > with a given result". > > The simplest / most straightforward implementation would be for > "return Foo" > to translate to "raise StopIteration, Foo". This is consistent with > "return" > translating to "raise StopIteration", and does not break any existing > generator code. > > (Another way to think about this change is that if a plain > StopIteration means > "the iterator terminated", then a valued StopIteration, by > extension, means > "the iterator terminated with the given value".) > It sounds like a nice idea to me. Of course, it is only useful to functions calling ".next()" explicitly; in something like a for loop, the return value would just be ignored.
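[Historical note: this is exactly the semantics Python later adopted via PEP 380 in Python 3.3, where `return value` in a generator raises StopIteration carrying the value, exposed as StopIteration.value. On a modern interpreter Piet's proposal can be demonstrated directly:]

```python
def computation():
    yield "working"   # suspend, as in Piet's description
    return 42         # Python 3.3+: raises StopIteration(42)

g = computation()
print(next(g))        # "working"
try:
    next(g)
except StopIteration as stop:
    print(stop.value) # 42 -- "the iterator terminated with the given value"

# And, as James notes, a for loop simply ignores the value:
for item in computation():
    pass              # the 42 is discarded here
```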
James From jcarlson at uci.edu Sat Oct 8 05:05:20 2005 From: jcarlson at uci.edu (Josiah Carlson) Date: Fri, 07 Oct 2005 20:05:20 -0700 Subject: [Python-Dev] Pythonic concurrency In-Reply-To: <200510072202.39129.ms@cerenity.org> References: <20051006221436.2892.JCARLSON@uci.edu> <200510072202.39129.ms@cerenity.org> Message-ID: <20051007190739.28AA.JCARLSON@uci.edu> Michael Sparks wrote: > [ Possibly overlengthy reply. However given a multiple sets of cans of > worms... ] > On Friday 07 October 2005 07:25, Josiah Carlson wrote: > > One thing I notice is absent from the Kamaelia page is benchmarks. > > That's largely for one simple reason: we haven't done any yet. Perfectly reasonable. If you ever do, I'd be happy to know! > At least not anything I'd call a benchmark. "There's lies, damn lies, > statistics and then there's benchmarks." Indeed. But it does allow people to get an idea whether a system could handle their workload. > The measure I used was simply framerate. This is a fair real value and has a > real use - if it drops too low, the system is simply unusable. I measured the > framerate before transforming the simplistic game to work well in the > framework, and after transforming it. The differences were: > * 5% drop in performance/framerate > * The ability to reuse much of the code in other systems and environments. Single process? Multi-process single machine? Multiprocess multiple machine? > Also from an even more pragmatic perspective, I would say if you're after > performance and throughput then I'd say use Twisted, since it's a proven > technology. I'm just curious. I keep my fingers away from Twisted as a matter of personal taste (I'm sure its great, but it's not for me). > All that said, I'm open to suggestion as to what sort of benchmark you'd like > to see. I'm more interested in benchmarks that actually mean something rather > than say X is better than Y though. 
I wouldn't dream of saying that X was better or worse than Y, unless one was obvious crap (since it works for you, and you've gotten new users to use it successfully, that is obviously not the case). There are five benchmarks that I think would be interesting to see:

1. Send ~500 bytes of data round-trip from process A to process B and back on the same machine as fast as you can (simulates a synchronous message passing and discovers transfer latencies) a few (tens of) thousands of times (A doesn't send message i until it has received message i-1 back from B).

2. Increase the number of processes that round trip with B. A quick chart of #senders vs. messages/second would be far more than adequate.

3. Have process B send ~500 byte messages to many listening processes via whatever is the fastest method (direct connections, multiple subscriptions to a 'channel', etc.). Knowing #listeners vs. messages/second would be cool.

4. Send blocks of data from process A to process B (any size you want). B immediately discards the data, but you pay attention to how much data/second B receives (a dual processor machine with proper processor affinities would be fine here).

5. Start increasing the number of processes that send data to B. A quick chart of #senders vs. total bytes/second would be far more than adequate.

I'm just offering the above as example benchmarks (you certainly don't need to do them to satisfy me, but I'll be doing those when my tuple space implementation is closer to being done). They are certainly not exhaustive, but they do offer a method by which one can measure latencies, message volume throughput, data volume throughput, and ability to handle many senders and/or recipients. > [ Network controlled Networked Audio Mixing Matrix ] > > Very neat. How much data? What kind of throughput? What kinds of > > latencies? > > For the test system we tested with 3 raw PCM audio data streams.
That's 3 x 44.1Khz, 16 bit stereo - which is around 4.2Mbit/s of data from the > network being processed realtime and output back to the network at > 1.4Mbit/s. So, not huge numbers, but not insignificant amounts of data > either. I suppose one thing I can take more time with now is to look at > the specific latency of the mixer. It didn't *appear* to be large however. > (there appeared to be similar latency in the system with or without the > mixer) 530Kbytes/second in, 176kbytes/second out. Not bad (I imagine you are using a C library/extension of some sort to do the mixing...perhaps numarray, Numeric, ...). How large are the blocks of data that you are shuffling around at one time? 1,5,10,50,150kbytes? > A more interesting effect we found was dealing with mouse movement in pygame > where we found that *huge* numbers of messages being sent one at a time and > processed one at a time (with yields after each) became a huge bottleneck. I can imagine. > The reason I like using pygame for these things is because a) it's relatively > raw and fast b) games are another often /naturally/ concurrent system. Also > it normally allows other senses beyond reading numbers/graphs to kick in when > evaluating changes "that looks better/worse", "There's something wrong > there". Indeed. I should get my fingers into PyGame, but haven't yet due to other responsibilities. > > I have two recent posts about the performance and features of a (hacked > > together) tuple space system > Great :-) I'll have a dig around. Make that 3. > > The only thing that it is missing is a prioritization mechanism (fifo, > > numeric priority, etc.), which would get us a job scheduling kernel. Not > > bad for a "message passing"/"tuple space"/"IPC" library. > Sounds interesting. I'll try and find some time to have a look and have a > play. FWIW, we're also missing a prioritisation mechanism right now.
Though > currently I have Simon Wittber's latest release of Nanothreads on my stack of > to look at. I do have a soft spot for Linda type approaches though :-) I've not yet released anything. The version I'm working on essentially indexes tuples in a set of specialized structures to make certain kinds of matching fast (both insertions and removals are also fast), which has a particular kind of queue at the 'leaf' (if one were to look at it as a tree). Those queues also support listeners which want to be notified about one or many tuples which happen to match up with the pattern, resulting in the tuple being consumed by one listener, broadcast to all listeners, etc. In the case of no listeners, but someone who just wants one tuple, one can prioritize tuple fetches based on fifo, numeric priority, lifo, or whatever other useful semantic that I get around to putting in there for whatever set of tuples matches it. - Josiah From pje at telecommunity.com Sat Oct 8 05:17:08 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Fri, 07 Oct 2005 23:17:08 -0400 Subject: [Python-Dev] Sandboxed Threads in Python In-Reply-To: References: <5.1.1.6.0.20051007203427.02005e30@mail.telecommunity.com> <5.1.1.6.0.20051007203427.02005e30@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20051007231002.02aae820@mail.telecommunity.com> At 07:17 PM 10/7/2005 -0600, Adam Olsen wrote: >On 10/7/05, Phillip J. Eby wrote: > > At 06:12 PM 10/7/2005 -0600, Adam Olsen wrote: > > >Turns out it's quite easy and it doesn't harm performance of existing > > >code or require modification (but a recompile is necessary). The idea > > >is to only use a cyclic garbage collector for cleaning them up, > > > > Um, no, actually. You need a mark-and-sweep GC or something of that > > ilk. Python's GC only works with objects that *have refcounts*, and it > > works by clearing objects that are in cycles. The clearing causes > > DECREF-ing, which then causes objects to be freed. If you have objects
If you have objects > > without refcounts, they would be immortal and utterly unrecoverable. > >Perhaps I wasn't clear enough, I was assuming appropriate changes to >the GC would be done. The important thing is it can be done without >changing the interface that the existing modules use. No, it can't. See more below. > > >That's all it takes. Modify Py_INCREF and Py_DECREFs to check for a > > >magic constant. Ahh, but the performance? See for yourself. > > > > First, you need to implement a garbage collection scheme that can deal with > > not having refcounts. Otherwise you're not comparing apples to apples > > here, and your programs will leak like crazy. > > > > Note that implementing a root-based GC for Python is non-trivial, since > > extension modules can store pointers to PyObjects anywhere they > > like. Further, many Python objects don't even support being tracked by the > > current cycle collector. > > > > So, changing this would probably require a lot of C extensions to be > > rewritten to support the needed API changes for the new garbage collection > > strategy. > >They only need to be rewritten if you want them to provide an >immutable type that can be transferred between sandboxes. No. You're missing my point. If they are able to *reference* these objects, then the garbage collector has to know about it, or else it can't know when to reclaim them. Ergo, these objects will leak, or else extensions will crash when they refer to the deallocated memory. In other words, you can't handwave the whole problem away by assuming "a garbage collector". The garbage collector has to actually be able to work, and you haven't specified *how* it can work without changing the C API. >I was aware that weakrefs needed some special handling (I just forgot >to mention it), but I didn't know it was used by subclassing. >Unfortunately I don't know what purpose it serves so I can't >contemplate how to deal with it. 
It allows changes to a supertype's C-level slots to propagate to subclasses. From ncoghlan at gmail.com Sat Oct 8 05:34:41 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 08 Oct 2005 13:34:41 +1000 Subject: [Python-Dev] Sourceforge CVS access In-Reply-To: References: <43468417.4000701@iinet.net.au> Message-ID: <43473E51.4010103@gmail.com> Guido van Rossum wrote: > I will, if you tell me your sourceforge username. Sorry, forgot about that little detail ;) Anyway, it's ncoghlan, same as the gmail account. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.blogspot.com From kbk at shore.net Sat Oct 8 06:34:33 2005 From: kbk at shore.net (Kurt B. Kaiser) Date: Sat, 8 Oct 2005 00:34:33 -0400 (EDT) Subject: [Python-Dev] Weekly Python Patch/Bug Summary Message-ID: <200510080434.j984YXHG031113@bayview.thirdcreek.com> Patch / Bug Summary ___________________ Patches : 341 open ( +4) / 2953 closed ( +6) / 3294 total (+10) Bugs : 884 open (-28) / 5321 closed (+43) / 6205 total (+15) RFE : 196 open ( +1) / 187 closed ( +0) / 383 total ( +1) New / Reopened Patches ______________________ Make fcntl work properly on AMD64 (2005-09-30) http://python.org/sf/1309352 opened by Brad Hards A wait4() implementation (2005-09-30) http://python.org/sf/1309579 opened by chads httplib error handling and HTTP/0.9 fix (2005-10-04) http://python.org/sf/1312980 opened by Yusuke Shinyama Speedup PyUnicode_DecodeCharmap (2005-10-05) http://python.org/sf/1313939 opened by Walter Dörwald os.makedirs - robust against partial path (2005-10-05) http://python.org/sf/1314067 opened by Jim Jewett ensure lock is released if exception is raised (2005-10-06) http://python.org/sf/1314396 opened by Eric Blossom ToolTip.py: fix main() function (2005-10-06) http://python.org/sf/1315161 opened by sebastien blanchet Py_INCREF/Py_DECREF with magic constant demo (2005-10-07)
http://python.org/sf/1316653 opened by Adam Olsen Patches Closed ______________ pyexpat.c: Two line fix for decoding crash (2005-09-29) http://python.org/sf/1309009 closed by nnorwitz patch IDLE to allow running anonymous code in editor window (2005-05-13) http://python.org/sf/1201522 closed by kbk BSD-style wait4 implementation (2004-07-29) http://python.org/sf/1000267 closed by nnorwitz Patch for (Doc) bug #1219273 (2005-06-25) http://python.org/sf/1227568 closed by nnorwitz Greatly enhanced webbrowser.py (2003-06-13) http://python.org/sf/754022 closed by birkenfeld Patch for [ 1163563 ] Sub threads execute in restricted mode (05/17/05) http://python.org/sf/1203393 closed by sf-robot New / Reopened Bugs ___________________ linechache module returns wrong results (2005-09-30) http://python.org/sf/1309567 opened by Thomas Heller 2.4.2 make problems (2005-10-03) http://python.org/sf/1311579 opened by Paul Mothersdill broken link in sets page (2005-10-03) CLOSED http://python.org/sf/1311674 opened by Fernando Canizo python.exe 2.4.2 compiled with VS2005 crashes (2005-10-03) http://python.org/sf/1311784 opened by Peter Klotz Python breakdown in windows (uses apsw) (2005-10-03) CLOSED http://python.org/sf/1311993 opened by Benjamin Hinrichs mac_roman codec missing "apple" codepoint (2005-10-04) http://python.org/sf/1313051 opened by Tony Nelson urlparse "caches" parses regardless of encoding (2005-10-04) http://python.org/sf/1313119 opened by Ken Kinder bisect C replacement doesn't accept named args (2005-10-04) CLOSED http://python.org/sf/1313496 opened by Drew Perttula Compile fails for configure "--without-threads" (2005-10-05) CLOSED http://python.org/sf/1313974 opened by seidl Issue in unicode args in logging (2005-10-05) CLOSED http://python.org/sf/1314107 reopened by tungwaiyip Issue in unicode args in logging (2005-10-05) CLOSED http://python.org/sf/1314107 opened by Wai Yip Tung crash in longobject (invalid memory access) (2005-10-05) CLOSED 
http://python.org/sf/1314182 opened by Jon Nelson logging run into deadlock in some error handling situation (2005-10-05) CLOSED http://python.org/sf/1314519 opened by Wai Yip Tung Trailing slash redirection for SimpleHTTPServer (2005-10-05) http://python.org/sf/1314572 opened by Josiah Carlson inspect.getsourcelines() broken (2005-10-07) http://python.org/sf/1315961 opened by Walter Dörwald gzip.GzipFile.seek missing second argument (2005-10-07) http://python.org/sf/1316069 opened by Neil Schemenauer Segmentation fault with invalid "coding" (2005-10-07) http://python.org/sf/1316162 opened by Humberto Diógenes Bugs Closed ___________ Datagram Socket Timeouts (2005-09-29) http://python.org/sf/1308042 closed by nnorwitz Unsatisfied symbols: _PyGILState_NoteThreadState (code) (2005-09-29) http://python.org/sf/1307978 closed by mwh subprocess.Popen locks on Cygwin (2005-09-29) http://python.org/sf/1307798 closed by jlt63 test_posix fails on cygwin (2005-04-10) http://python.org/sf/1180147 closed by jlt63 can't import thru cygwin symlink (2005-04-08) http://python.org/sf/1179412 closed by jlt63 popen4/cygwin ssh hangs (2005-01-13) http://python.org/sf/1101756 closed by jlt63 2.3.3 tests on cygwin (2003-12-22) http://python.org/sf/864374 closed by jlt63 Datagram Socket Timeouts (2005-09-28) http://python.org/sf/1307357 closed by nnorwitz log() on a big number fails on powerpc64 (2005-07-26) http://python.org/sf/1245381 closed by nnorwitz __getnewargs__ is in pickle docs but not in code (2005-09-30) http://python.org/sf/1309724 closed by nnorwitz Win registry problem (2005-07-15) http://python.org/sf/1239120 closed by birkenfeld unknown encoding -> MemoryError (2003-07-17) http://python.org/sf/772896 closed by nnorwitz 'Expression' AST Node not documented (2005-06-12) http://python.org/sf/1219273 closed by nnorwitz segfault if redirecting directory (2004-01-30) http://python.org/sf/887946 closed by nnorwitz Acroread aborts printing PDF documentation (2004-05-30)
http://python.org/sf/963321 closed by hgolden Python hangs up on I/O operations on the latest FreeBSD 4.10 (2004-06-09) http://python.org/sf/969492 closed by birkenfeld empty raise after handled exception (2004-06-15) http://python.org/sf/973103 closed by nnorwitz Unhelpful error message when mtime of a module is -1 (2004-06-21) http://python.org/sf/976608 closed by nnorwitz compile of code with incorrect encoding yields MemoryError (2004-06-25) http://python.org/sf/979739 closed by nnorwitz os.access reports true for read-only directories (2004-07-15) http://python.org/sf/991735 closed by nnorwitz seg fault when calling bsddb.hashopen() (2004-07-19) http://python.org/sf/994100 closed by montanaro os.major() os.minor() example and description change (2004-08-12) http://python.org/sf/1008310 closed by nnorwitz test_pep277 fails (2004-09-16) http://python.org/sf/1029561 closed by nnorwitz broken link in sets page (2005-10-03) http://python.org/sf/1311674 closed by fdrake Python breakdown in windows (uses apsw) (2005-10-03) http://python.org/sf/1311993 closed by birkenfeld Missing sk_SK in windows_locale (2005-07-13) http://python.org/sf/1237015 deleted by luks bisect C replacement doesn't accept named args (2005-10-05) http://python.org/sf/1313496 closed by rhettinger Compile fails for configure "--without-threads" (2005-10-06) http://python.org/sf/1313974 closed by perky Issue in unicode args in logging (2005-10-05) http://python.org/sf/1314107 closed by vsajip Issue in unicode args in logging (2005-10-05) http://python.org/sf/1314107 closed by vsajip crash in longobject (invalid memory access) (2005-10-05) http://python.org/sf/1314182 closed by tim_one logging run into deadlock in some error handling situation (2005-10-06) http://python.org/sf/1314519 closed by vsajip A command history for the idle interactive shell (2005-09-23) http://python.org/sf/1302267 closed by kbk New / Reopened RFE __________________ Add os.path.relpath (2005-09-30) 
http://python.org/sf/1309676 opened by Reinhold Birkenfeld From pjd at satori.za.net Sat Oct 8 08:20:30 2005 From: pjd at satori.za.net (Piet Delport) Date: Sat, 08 Oct 2005 08:20:30 +0200 Subject: [Python-Dev] PEP 342 suggestion: start(), __call__() and unwind_call() methods In-Reply-To: <43471EBE.40401@gmail.com> References: <5.1.1.6.0.20051007134543.03b2f728@mail.telecommunity.com> <43471EBE.40401@gmail.com> Message-ID: <4347652E.1090705@satori.za.net> Nick Coghlan wrote: > Phillip J. Eby wrote: >> Nick Coghlan wrote: > [...] > >> Last, but far from least, as far as I can tell you can implement all of >> these semantics using PEP 342 as it sits. That is, it's very simple to >> make decorators or classes that add those semantics. I don't see >> anything that requires them to be part of Python. > > > Yeah, I've now realised that you can do all of this more simply by doing it > directly in the scheduler using StopIteration to indicate when the coroutine > is done, and using yield to indicate "I'm not done yet". Earlier this week, I proposed legalizing "return Result" inside a generator, and making it act like "raise StopIteration( Result )", for exactly this reason. IMHO, this is an elegant and straightforward extension of the current semantics of returns inside generators, and is the final step toward making generator-based concurrent tasks[1] look just like the equivalent synchronous code (with the only difference, more-or-less, being the need for appropriate "yield" keywords, and a task runner/scheduler loop). This change would make a huge difference to the practical usability of these generator-based tasks. I think they're much less likely to catch on if you have to write "raise StopIteration( Result )" (or "_return( Result )") all the time. [1] a.k.a. coroutines, which I don't think is an accurate name, anymore.
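The proposal is easiest to see with a concrete trampoline. The sketch below is purely illustrative (the names run_task, fetch and main are made up, and it uses the proposed "return Result" semantics directly rather than the raise StopIteration( Result ) spelling): a scheduler loop catches StopIteration, treats its payload as the finished task's result, and sends that result back into the suspended caller.

```python
def run_task(task):
    """Drive a generator-based task to completion, returning its result."""
    stack = []       # callers suspended while a sub-task runs
    value = None     # value to send into the currently-running task
    while True:
        try:
            yielded = task.send(value)
        except StopIteration as stop:
            # "return Result" inside a generator surfaces here as
            # StopIteration(Result) -- exactly the proposed semantics.
            if not stack:
                return stop.value
            task, value = stack.pop(), stop.value
            continue
        if hasattr(yielded, 'send'):     # yielded a sub-task: descend into it
            stack.append(task)
            task, value = yielded, None
        else:                            # yielded a plain value: echo it back
            value = yielded

def fetch(n):
    yield 'pretend this waited on the network'
    return n * 2     # behaves like raise StopIteration(n * 2)

def main():
    a = yield fetch(10)
    b = yield fetch(11)
    return a + b

print(run_task(main()))  # -> 42
```

Note how main() reads just like the equivalent synchronous code apart from the "yield" keywords, which is precisely the point being argued above.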
From ncoghlan at iinet.net.au Sat Oct 8 10:23:57 2005 From: ncoghlan at iinet.net.au (Nick Coghlan) Date: Sat, 08 Oct 2005 18:23:57 +1000 Subject: [Python-Dev] test_cmd_line failure on Kubuntu 5.10 with GCC 4.0 Message-ID: <4347821D.1070105@iinet.net.au> Anyone else seeing any problems with test_cmd_line? I've got a few failures in test_cmd_line on Kubuntu 5.10 with GCC 4.0 relating to a missing "\n" line ending. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.blogspot.com From martin at v.loewis.de Sat Oct 8 12:32:00 2005 From: martin at v.loewis.de ("Martin v. Löwis") Date: Sat, 08 Oct 2005 12:32:00 +0200 Subject: [Python-Dev] PythonCore\CurrentVersion Message-ID: <4347A020.2050008@v.loewis.de> What happened to the CurrentVersion registry entry documented at http://www.python.org/windows/python/registry.html AFAICT, even the python15.wse file did not fill a value in this entry (perhaps I'm misinterpreting the wse file, though). So was this ever used? Why is it documented, and who documented it (unfortunately, registry.html is not in cvs/subversion, either)? Regards, Martin From ms at cerenity.org Sat Oct 8 12:44:00 2005 From: ms at cerenity.org (Michael Sparks) Date: Sat, 8 Oct 2005 11:44:00 +0100 Subject: [Python-Dev] Pythonic concurrency In-Reply-To: <20051007190739.28AA.JCARLSON@uci.edu> References: <20051006221436.2892.JCARLSON@uci.edu> <200510072202.39129.ms@cerenity.org> <20051007190739.28AA.JCARLSON@uci.edu> Message-ID: <200510081144.02122.ms@cerenity.org> On Saturday 08 October 2005 04:05, Josiah Carlson wrote: [ simplistic, informal benchmark of a test optimised version of the system, based on bouncing, scaling, rotating sprites around the screen. ] > Single process? Multi-process single machine? Multiprocess multiple > machine? Single process, single CPU, not very recent machine.
(600MHz Crusoe-based machine.) That machine wasn't hardware accelerated though, so was only able to handle several dozen sprites before slowing down. The slowdown was due to the hardware not being able to keep up with pygame's drawing requests though, rather than the framework. > I'm just offering the above as example benchmarks (you certainly don't > need to do them to satisfy me, but I'll be doing those when my tuple > space implementation is closer to being done). I'll note them as things worth doing - they look reasonable and interesting benchmarks. (I can think of a few modifications I might make though. For example in 3 you say "fastest". I might have that as a 3b. 3a could be "simplest to use/read" or "most likely to pick". Obviously there's a good chance that's not the fastest. (Could be optimised to be under the hood I suppose, but that wouldn't be the point of the test) > > [ Network controlled Networked Audio Mixing Matrix ] > I imagine you are using a C library/extension of some sort to do the > mixing...perhaps numarray, Numeric, ... Nope, just plain old python (I'm now using a 1.6GHz Centrino machine though). My mixing function is particularly naive as well. To me that says more about python than my code. I did consider using pyrex to wrap (or write) an optimised version, but there didn't seem to be any need for it last week (Though for a non-prototype something faster would be nice :). I'll save responding to the Linda things until I have a chance to read in detail what you've written. It sounds very promising though - having multiple approaches to different styles of concurrency that work nicely with each other safely is always a positive thing IMO. Thanks for the suggestions and best regards, Michael.
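For the record, a mixing function along the naive lines described above can be surprisingly small in plain Python. This is only a guess at its shape (the actual mixer code isn't shown in the thread): sum the corresponding signed 16-bit samples across the input streams and clamp the total to the legal range.

```python
from array import array

def mix(streams):
    """Naively mix equal-length blocks of signed 16-bit audio samples."""
    out = array('h', [0] * len(streams[0]))
    for i in range(len(out)):
        total = sum(block[i] for block in streams)
        # Clamp to the signed 16-bit range instead of wrapping around.
        out[i] = max(-32768, min(32767, total))
    return out

# Three tiny "streams" of three samples each:
a = array('h', [1000, -2000, 30000])
b = array('h', [1500, -500, 10000])
c = array('h', [-200, 100, -5])
print(list(mix([a, b, c])))  # -> [2300, -2400, 32767]
```

A per-sample Python loop like this is exactly the kind of thing that would benefit from Pyrex or a C extension in a non-prototype, but it is enough to mix a few streams in real time.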
-- "Though we are not now that which in days of old moved heaven and earth, that which we are, we are: one equal temper of heroic hearts made weak by time and fate but strong in will to strive, to seek, to find and not to yield" -- "Ulysses", Tennyson From rhamph at gmail.com Sat Oct 8 14:29:25 2005 From: rhamph at gmail.com (Adam Olsen) Date: Sat, 8 Oct 2005 06:29:25 -0600 Subject: [Python-Dev] Sandboxed Threads in Python In-Reply-To: <5.1.1.6.0.20051007231002.02aae820@mail.telecommunity.com> References: <5.1.1.6.0.20051007203427.02005e30@mail.telecommunity.com> <5.1.1.6.0.20051007231002.02aae820@mail.telecommunity.com> Message-ID: On 10/7/05, Phillip J. Eby wrote: > At 07:17 PM 10/7/2005 -0600, Adam Olsen wrote: > >On 10/7/05, Phillip J. Eby wrote: > > > Note that implementing a root-based GC for Python is non-trivial, since > > > extension modules can store pointers to PyObjects anywhere they > > > like. Further, many Python objects don't even support being tracked by the > > > current cycle collector. > > > > > > So, changing this would probably require a lot of C extensions to be > > > rewritten to support the needed API changes for the new garbage collection > > > strategy. > > > >They only need to be rewritten if you want them to provide an > >immutable type that can be transferred between sandboxes. > > No. You're missing my point. If they are able to *reference* these > objects, then the garbage collector has to know about it, or else it can't > know when to reclaim them. Ergo, these objects will leak, or else > extensions will crash when they refer to the deallocated memory. > > In other words, you can't handwave the whole problem away by assuming "a > garbage collector". The garbage collector has to actually be able to work, > and you haven't specified *how* it can work without changing the C API. Unfortunately the ramifications of your original statement didn't set in until well after I sent my reply.
You are right, it does make it impossible without changing the C API, so that much of the idea is dead. I wonder if it would be possible to use a wrapper around the immutable type instead.. something to ponder anyway. > >I was aware that weakrefs needed some special handling (I just forgot > >to mention it), but I didn't know it was used by subclassing. > >Unfortunately I don't know what purpose it serves so I can't > >contemplate how to deal with it. > > It allows changes to a supertype's C-level slots to propagate to subclasses. I see. Well, I would have required the supertype to be immutable, so there couldn't be any changes to the C-level slots. -- Adam Olsen, aka Rhamphoryncus From rhamph at gmail.com Sat Oct 8 14:34:19 2005 From: rhamph at gmail.com (Adam Olsen) Date: Sat, 8 Oct 2005 06:34:19 -0600 Subject: [Python-Dev] Sandboxed Threads in Python In-Reply-To: <20051007183215.28A4.JCARLSON@uci.edu> References: <5.1.1.6.0.20051007203427.02005e30@mail.telecommunity.com> <20051007183215.28A4.JCARLSON@uci.edu> Message-ID: On 10/7/05, Josiah Carlson wrote: > > Adam Olsen wrote: > > I need to stress that *only* the new, immutable and "thread-safe > > mark-and-sweep" types would be affected by these changes. Everything > > else would continue to exist as it did before, and the benchmark > > exists to show they can coexist without killing performance. > > All the benchmark showed was that checking for a constant in the > refcount during in/decrefing, and not garbage collecting those objects, > didn't adversely affect performance. > > As an aside, there's also the ugly bit about being able to guarantee > that an object is immutable. I personally mutate Python strings in my C > code all the time (long story, not to be discussed here), and if I can > do it now, then any malicious or "inventive" person can do the same in > this "sandboxed thread" Python "of the future". Malicious use is hardly a serious concern. 
Someone using C code could just as well crash the interpreter. Modifying a python string you just created before you expose it to python code should be fine. If that's not what you're doing.. I'm not sure I want to know *wink* > At least in the case of integers, one could work the tagged integer idea > to bypass the freelist issue that Phillip offered, but in general, I > don't believe there exists a truly immutable type as long as there are C > extensions and/or cTypes. Further, the work to actually implement a new > garbage collector for Python in order to handle these 'immutable' types > seems to me to be more trouble than it is worth. Maybe.. I'm not convinced. There's a lot of payback IMO. -- Adam Olsen, aka Rhamphoryncus From hyeshik at gmail.com Sat Oct 8 16:23:06 2005 From: hyeshik at gmail.com (Hye-Shik Chang) Date: Sat, 8 Oct 2005 23:23:06 +0900 Subject: [Python-Dev] test_cmd_line failure on Kubuntu 5.10 with GCC 4.0 In-Reply-To: <4347821D.1070105@iinet.net.au> References: <4347821D.1070105@iinet.net.au> Message-ID: <4f0b69dc0510080723s2585ae2cw23cbfbc71941cf92@mail.gmail.com> On 10/8/05, Nick Coghlan wrote: > Anyone else seeing any problems with test_cmd_line? I've got a few failures in > test_cmd_line on Kubuntu 5.10 with GCC 4.0 relating to a missing "\n" line ending. > Same problem here. (FreeBSD 6.0 with GCC 3.4.4) In my short inspection, popen2.popen4.read() returned just an empty string. I'll investigate more this weekend.
Hye-Shik From ncoghlan at gmail.com Sat Oct 8 18:18:27 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 09 Oct 2005 02:18:27 +1000 Subject: [Python-Dev] test_cmd_line failure on Kubuntu 5.10 with GCC 4.0 In-Reply-To: <4f0b69dc0510080723s2585ae2cw23cbfbc71941cf92@mail.gmail.com> References: <4347821D.1070105@iinet.net.au> <4f0b69dc0510080723s2585ae2cw23cbfbc71941cf92@mail.gmail.com> Message-ID: <4347F153.8050904@gmail.com> Hye-Shik Chang wrote: > On 10/8/05, Nick Coghlan wrote: > >>Anyone else seeing any problems with test_cmd_line? I've got a few failures in >>test_cmd_line on Kubuntu 5.10 with GCC 4.0 relating to a missing "\n" line ending. >> > > > Same problem here. (FreeBSD 6.0 with GCC 3.4.4) > In my short inspection, popen2.popen4.read() returned just an empty string. Good to know it isn't just a system quirk, as that's the same behaviour I'm getting. I noticed that the ones which appear to be failing (-E, -O, -S, -Q) are the ones which expect an interactive session to open. The tests which pass (-V, -h, directory as argument or stdin) are the ones which don't actually start the interpreter. If I explicitly write Ctrl-D to the subprocess's stdin for the tests which open the interpreter, then the tests pass. So it looks like some sort of buffering problem with standard out not getting flushed before the test tries to read the data. Cheers, Nick. 
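The empty string described above is easy to reproduce. The sketch below is not the actual test code (test_cmd_line used popen2 at the time); it simply demonstrates, with the subprocess module, that an interpreter whose piped stdin is closed immediately -- the programmatic equivalent of an instant Ctrl-D -- exits silently with nothing on stdout:

```python
import subprocess
import sys

# Launch an interpreter with stdin attached to a pipe, then close the
# pipe without writing anything (the equivalent of typing Ctrl-D at once).
proc = subprocess.Popen([sys.executable],
                        stdin=subprocess.PIPE,
                        stdout=subprocess.PIPE,
                        stderr=subprocess.PIPE)
out, err = proc.communicate(input=b'')

# With empty, non-tty input the interpreter executes nothing: stdout is
# empty, which matches the empty string the failing tests read back.
print(repr(out), proc.returncode)  # -> b'' 0
```

So a test that expects at least one line of output will fail on any platform where the subprocess machinery delivers the (correct) empty result.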
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.blogspot.com From jcarlson at uci.edu Sat Oct 8 20:03:31 2005 From: jcarlson at uci.edu (Josiah Carlson) Date: Sat, 08 Oct 2005 11:03:31 -0700 Subject: [Python-Dev] Sandboxed Threads in Python In-Reply-To: References: <20051007183215.28A4.JCARLSON@uci.edu> Message-ID: <20051008104605.28B3.JCARLSON@uci.edu> Adam Olsen wrote: > On 10/7/05, Josiah Carlson wrote: > > Adam Olsen wrote: > > > I need to stress that *only* the new, immutable and "thread-safe > > > mark-and-sweep" types would be affected by these changes. Everything > > > else would continue to exist as it did before, and the benchmark > > > exists to show they can coexist without killing performance. > > > > All the benchmark showed was that checking for a constant in the > > refcount during in/decrefing, and not garbage collecting those objects, > > didn't adversely affect performance. > > > > As an aside, there's also the ugly bit about being able to guarantee > > that an object is immutable. I personally mutate Python strings in my C > > code all the time (long story, not to be discussed here), and if I can > > do it now, then any malicious or "inventive" person can do the same in > > this "sandboxed thread" Python "of the future". > > Malicious use is hardly a serious concern. Someone using C code could > just as well crash the interpreter. Your malicious user is my inventive colleague. Here's one: performing zero-copy inter-thread IPC by modifying shared immutables. Attempting to enforce a policy of "don't do that, it's not supported" is not going to be effective, especially when doing unsupported things increase speed. People have known for decades that having anything run in kernel space beyond the kernel is dangerous, but they still do because it is faster. 
I can (but won't) point out examples for days of bad decisions made for the sake of speed, or policy that has been ignored for the sake of speed (some of these overlap and some don't). > Modifying a python string you just created before you expose it to > python code should be fine. If that's not what you're doing.. I'm not > sure I want to know *wink* You really don't want to know. > Maybe.. I'm not convinced. There's a lot of payback IMO. You've not convinced me either. Good luck in getting a group together to make it happen. - Josiah From jcarlson at uci.edu Sat Oct 8 20:42:32 2005 From: jcarlson at uci.edu (Josiah Carlson) Date: Sat, 08 Oct 2005 11:42:32 -0700 Subject: [Python-Dev] Pythonic concurrency In-Reply-To: <200510081144.02122.ms@cerenity.org> References: <20051007190739.28AA.JCARLSON@uci.edu> <200510081144.02122.ms@cerenity.org> Message-ID: <20051008100848.28B0.JCARLSON@uci.edu> Michael Sparks wrote: > On Saturday 08 October 2005 04:05, Josiah Carlson wrote: > > I'm just offering the above as example benchmarks (you certainly don't > > need to do them to satisfy me, but I'll be doing those when my tuple > > space implementation is closer to being done). > > I'll note them as things worth doing - they look reasonable and interesting > benchmarks. (I can think of a few modifications I might make though. For > example in 3 you say "fastest". I might have that as a 3b. 3a could be > "simplest to use/read" or "most likely to pick". Obviously there's a good > chance that's not the fastest. (Could be optimised to be under the hood > I suppose, but that wouldn't be the point of the test) Good point. 3a. Use 1024 byte blocks... 3b. Use whatever makes your system perform best (if you have the time to tune it)... > > > [ Network controlled Networked Audio Mixing Matrix ] > > I imagine you are using a C library/extension of some sort to do the > > mixing...perhaps numarray, Numeric, ... 
> > Nope, just plain old python (I'm now using a 1.6GHz Centrino machine > though). My mixing function is particularly naive as well. To me that says > more about python than my code. I did consider using pyrex to wrap (or > write) an optimised version, but there didn't seem to be any need for it > last week (Though for a non-prototype something faster would be > nice :). Indeed. A quick array.array('h',...) implementation is able to run 7-8x real time on 3->1 stream mixing on my 1.3GHz laptop. Maybe Numeric or numarray isn't necessary. > I'll save responding to the Linda things until I have a chance to read in detail > what you've written. It sounds very promising though - having multiple > approaches to different styles of concurrency that work nicely with each > other safely is always a positive thing IMO. > > Thanks for the suggestions and best regards, Thank you for the interesting and informative discussion. - Josiah From nnorwitz at gmail.com Sat Oct 8 20:52:52 2005 From: nnorwitz at gmail.com (Neal Norwitz) Date: Sat, 8 Oct 2005 11:52:52 -0700 Subject: [Python-Dev] test_cmd_line failure on Kubuntu 5.10 with GCC 4.0 In-Reply-To: <4347F153.8050904@gmail.com> References: <4347821D.1070105@iinet.net.au> <4f0b69dc0510080723s2585ae2cw23cbfbc71941cf92@mail.gmail.com> <4347F153.8050904@gmail.com> Message-ID: On 10/8/05, Nick Coghlan wrote: > Hye-Shik Chang wrote: > > On 10/8/05, Nick Coghlan wrote: > > > > > >>Anyone else seeing any problems with test_cmd_line? I've got a few failures in > > >>test_cmd_line on Kubuntu 5.10 with GCC 4.0 relating to a missing "\n" line ending. > > > > If I explicitly write Ctrl-D to the subprocess's stdin for the tests which > > open the interpreter, then the tests pass. So it looks like some sort of > > buffering problem with standard out not getting flushed before the test tries > > to read the data. > > Sorry, that's a new test I added recently. It works for me on gentoo. > The test is very simple and shouldn't be hard to fix. Can you fix > it?
I assume Guido (or someone) added you as a developer. If not, if you can give me enough info, I can try to fix it. n From rhamph at gmail.com Sat Oct 8 21:00:38 2005 From: rhamph at gmail.com (Adam Olsen) Date: Sat, 8 Oct 2005 13:00:38 -0600 Subject: [Python-Dev] Sandboxed Threads in Python In-Reply-To: <20051008104605.28B3.JCARLSON@uci.edu> References: <20051007183215.28A4.JCARLSON@uci.edu> <20051008104605.28B3.JCARLSON@uci.edu> Message-ID: On 10/8/05, Josiah Carlson wrote: > Your malicious user is my inventive colleague. Here's one: performing > zero-copy inter-thread IPC by modifying shared immutables. Attempting to > enforce a policy of "don't do that, it's not supported" is not going to > be effective, especially when doing unsupported things increase speed. I actually have no problem with that, so long as you use a custom type. It may not technically be immutable but it does provide its own clearly defined semantics for simultaneous modification, and that's enough. Anyway, the idea as I presented it is dead at this point, so I'll leave it at that. -- Adam Olsen, aka Rhamphoryncus From BruceEckel-Python3234 at mailblocks.com Sat Oct 8 21:14:25 2005 From: BruceEckel-Python3234 at mailblocks.com (Bruce Eckel) Date: Sat, 8 Oct 2005 13:14:25 -0600 Subject: [Python-Dev] Sandboxed Threads in Python In-Reply-To: <20051008104605.28B3.JCARLSON@uci.edu> References: <20051007183215.28A4.JCARLSON@uci.edu> <20051008104605.28B3.JCARLSON@uci.edu> Message-ID: <1377773721.20051008131425@MailBlocks.com> > I can (but won't) point out examples for days of bad decisions made for > the sake of speed, or policy that has been ignored for the sake of speed > (some of these overlap and some don't). As long as you've entered premature-optimization land, how about decisions made because it's *assumed* that (A) We must have speed here and (B) This will make it happen. 
My hope would be that we could find a solution that would by default keep you out of trouble when writing concurrent programs, but provide a back door if you wanted to do something special. If you choose to go in the back door, you have to do it consciously and take responsibility for the outcome. With Java, in contrast, as soon as you step into the world of concurrency (even if you step in by accident, which is not uncommon), lots of rules change. What was an ordinary method call before is now something risky that can cause great damage. Should I make this variable volatile? Is an operation atomic? You have to learn a lot of things all over again. I don't want that for Python. I'd like the move into concurrency to be a gentle slope, not a sudden reality-shift. If a novice decides they want to try game programming with concurrency, I want there to be training wheels on by default, so that their first experience will be a successful one, and they can then start learning more features and ideas incrementally, without trying a feature and suddenly having the whole thing get weird and crash down on their heads and cause them to run screaming away ... I know there have been some technologies that have already been mentioned on this list and I hope that we can continue to experiment with and discuss those and also new ideas until we shake out the fundamental issues and maybe even come up with a list of possible solutions. 
Bruce Eckel http://www.BruceEckel.com mailto:BruceEckel-Python3234 at mailblocks.com Contains electronic books: "Thinking in Java 3e" & "Thinking in C++ 2e" Web log: http://www.artima.com/weblogs/index.jsp?blogger=beckel Subscribe to my newsletter: http://www.mindview.net/Newsletter My schedule can be found at: http://www.mindview.net/Calendar From guido at python.org Sat Oct 8 22:28:26 2005 From: guido at python.org (Guido van Rossum) Date: Sat, 8 Oct 2005 13:28:26 -0700 Subject: [Python-Dev] test_cmd_line failure on Kubuntu 5.10 with GCC 4.0 In-Reply-To: References: <4347821D.1070105@iinet.net.au> <4f0b69dc0510080723s2585ae2cw23cbfbc71941cf92@mail.gmail.com> <4347F153.8050904@gmail.com> Message-ID: On 10/8/05, Neal Norwitz wrote: > On 10/8/05, Nick Coghlan wrote: > > Hye-Shik Chang wrote: > > > On 10/8/05, Nick Coghlan wrote: > > > > > >>Anyone else seeing any problems with test_cmd_line? I've got a few failures in > > >>test_cmd_line on Kubuntu 5.10 with GCC 4.0 relating to a missing "\n" line ending. > > > > If I explicitly write Ctrl-D to the subprocess's stdin for the tests which > > open the interpreter, then the tests pass. So it looks like some sort of > > buffering problem with standard out not getting flushed before the test tries > > to read the data. > > Sorry, that's a new test I added recently. It works for me on gentoo. > The test is very simple and shouldn't be hard to fix. Can you fix > it? I assume Guido (or someone) added you as a developer. If not, if > you can give me enough info, I can try to fix it. I guess Neal's test was expecting at least one line of output from python at all times, but on most systems it is completely silent when the input is empty. I fixed the test (also in 2.4) to allow empty input as well as input ending in \n.
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From jcarlson at uci.edu Sat Oct 8 22:38:45 2005 From: jcarlson at uci.edu (Josiah Carlson) Date: Sat, 08 Oct 2005 13:38:45 -0700 Subject: [Python-Dev] Sandboxed Threads in Python In-Reply-To: <1377773721.20051008131425@MailBlocks.com> References: <20051008104605.28B3.JCARLSON@uci.edu> <1377773721.20051008131425@MailBlocks.com> Message-ID: <20051008125655.28B8.JCARLSON@uci.edu> Bruce Eckel wrote: > > > I can (but won't) point out examples for days of bad decisions made for > > the sake of speed, or policy that has been ignored for the sake of speed > > (some of these overlap and some don't). > > As long as you've entered premature-optimization land, how about > decisions made because it's *assumed* that (A) We must have speed here > and (B) This will make it happen. A. From what I understand about sandboxing threads, the point was to remove the necessity for the GIL, so that every thread can go out on its own and run on its own processor. B. Shared memory vs. queues vs. pipes vs. ... Concurrency without communication is almost totally worthless. Historically, shared memory has tended to be one of the fastest (if not the fastest) communication methods available. Whether or not mutable shared memory would be faster or slower than queues is unknown, but I'm going to stick with my experience until I am proved wrong by this mythical free threaded system with immutables. > My hope would be that we could find a solution that would by default > keep you out of trouble when writing concurrent programs, but provide > a back door if you wanted to do something special. If you choose to go > in the back door, you have to do it consciously and take > responsibility for the outcome. > > With Java, in contrast, as soon as you step into the world of > concurrency (even if you step in by accident, which is not uncommon), > lots of rules change. 
What was an ordinary method call before is now > something risky that can cause great damage. Should I make this > variable volatile? Is an operation atomic? You have to learn a lot of > things all over again. > > I don't want that for Python. I'd like the move into concurrency to be > a gentle slope, not a sudden reality-shift. If a novice decides they > want to try game programming with concurrency, I want there to be > training wheels on by default, so that their first experience will be > a successful one, and they can then start learning more features and > ideas incrementally, without trying a feature and suddenly having the > whole thing get weird and crash down on their heads and cause them to > run screaming away ... I don't want to get into an argument here. While I agree that concurrent programming should be easier, my experience with MPI (and other similar systems) and writing parallel algorithms leads me to believe that even if you have a simple method for communication, even if you can guarantee that thread/process A won't clobber thread/process B, actually writing software which executes in some way which made the effort of making the software concurrent worthwhile, is less than easy. I'd love to be proved wrong (I'm hoping to do it myself with tuple spaces). I do, however, doubt that free threading approaches will be the future for concurrent programming in CPython. 
- Josiah From guido at python.org Sat Oct 8 23:29:21 2005 From: guido at python.org (Guido van Rossum) Date: Sat, 8 Oct 2005 14:29:21 -0700 Subject: [Python-Dev] PEP 342 suggestion: start(), __call__() and unwind_call() methods In-Reply-To: <4347652E.1090705@satori.za.net> References: <5.1.1.6.0.20051007134543.03b2f728@mail.telecommunity.com> <43471EBE.40401@gmail.com> <4347652E.1090705@satori.za.net> Message-ID: On 10/7/05, Piet Delport wrote: > Earlier this week, i proposed legalizing "return Result" inside a generator, > and making it act like "raise StopIteration( Result )", for exactly this reason. > > IMHO, this is an elegant and straightforward extension of the current > semantics of returns inside generators, and is the final step toward making > generator-based concurrent tasks[1] look just like the equivalent synchronous > code (with the only difference, more-or-less, being the need for appropriate > "yield" keywords, and a task runner/scheduler loop). > > This change would make a huge difference to the practical usability of these > generator-based tasks. I think they're much less likely to catch on if you > have to write "raise StopIteration( Result )" (or "_return( Result )") all the > time. > > [1] a.k.a. coroutines, which i don't think is an accurate name, anymore. Before we do this I'd like to see you show some programming examples that show how this would be used. I'm having a hard time understanding where you would need this but I realize I haven't used this paradigm enough to have a good feel for it, so I'm open for examples. At least this makes more sense than mapping "return X" into "yield X; return" as someone previously proposed. 
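A sketch of the sort of usage example being asked for, using a toy scheduler with hypothetical names. It is written with the return-in-generator form under discussion, which Python 3.3's PEP 380 eventually made legal by attaching the returned value to StopIteration as `.value` (in 2005 the same effect required `raise StopIteration(result)`):

```python
def square(x):
    # A generator-based task: yield to cooperate, return the result.
    yield
    return x * x  # becomes StopIteration whose .value is x * x

def main(results):
    # "Call" another task by yielding it; the scheduler sends back
    # the callee's return value as the result of the yield.
    r = yield square(7)
    results.append(r)

def run(root):
    # Tiny trampoline: drive a stack of generators, routing each
    # callee's return value back to its caller.
    stack, value = [root], None
    while stack:
        try:
            callee = stack[-1].send(value)
        except StopIteration as ex:
            stack.pop()
            value = ex.value  # the task's "return" value
        else:
            if callee is not None:
                stack.append(callee)
            value = None

out = []
run(main(out))
assert out == [49]
```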
:) -- --Guido van Rossum (home page: http://www.python.org/~guido/) From ncoghlan at gmail.com Sun Oct 9 03:10:56 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 09 Oct 2005 11:10:56 +1000 Subject: [Python-Dev] PEP 342 suggestion: start(), __call__() and unwind_call() methods In-Reply-To: References: <5.1.1.6.0.20051007134543.03b2f728@mail.telecommunity.com> <43471EBE.40401@gmail.com> <4347652E.1090705@satori.za.net> Message-ID: <43486E20.3010908@gmail.com> Guido van Rossum wrote: >>This change would make a huge difference to the practical usability of these >>generator-based tasks. I think they're much less likely to catch on if you >>have to write "raise StopIteration( Result )" (or "_return( Result )") all the >>time. >> >>[1] a.k.a. coroutines, which i don't think is an accurate name, anymore. > > > Before we do this I'd like to see you show some programming examples > that show how this would be used. I'm having a hard time understanding > where you would need this but I realize I haven't used this paradigm > enough to have a good feel for it, so I'm open for examples. > > At least this makes more sense than mapping "return X" into "yield X; > return" as someone previously proposed. :) It would be handy when the generators are being used as true pseudothreads with a scheduler like the one I posted earlier in this discussion. It allows these pseudothreads to "call" each other by yielding the call as a lambda or partial function application that produces a zero-argument callable. The called pseudothread can then yield as many times as it wants (either making its own calls, or just being a well-behaved member of a cooperatively MT environment), and then finally returning the value that the original caller requested. Using 'return' for this is actually a nice idea, and if we ever do make it legal to use 'return' in generators, these are the semantics it should have. 
However, I'm not sure it's something we should be adding *right now* as part of
PEP 342 - writing "raise StopIteration" and "raise StopIteration(result)", and
saying that a generator includes an implied "raise StopIteration" after its
last line of code really isn't that difficult to understand, and is completely
explicit about what is going on.

My basic concern is that I think replacing "raise StopIteration" with "return"
and "raise StopIteration(EXPR)" with "return EXPR" would actually make such
code easier to write at the expense of making it harder to *read*, because the
fact that an exception is being raised is obscured. Consider the following two
code snippets:

    def function():
        try:
            return
        except StopIteration:
            print "We never get here."

    def generator():
        yield
        try:
            return
        except StopIteration:
            print "But we would get here!"

So, instead of having "return" automatically map to "raise StopIteration"
inside generators, I'd like to suggest we keep it illegal to use "return"
inside a generator, and instead add a new attribute "result" to StopIteration
instances such that the following three conditions hold:

    # Result is None if there is no argument to StopIteration
    try:
        raise StopIteration
    except StopIteration, ex:
        assert ex.result is None

    # Result is the argument if there is exactly one argument
    try:
        raise StopIteration(expr)
    except StopIteration, ex:
        assert ex.result == ex.args[0]

    # Result is the argument tuple if there are multiple arguments
    try:
        raise StopIteration(expr1, expr2)
    except StopIteration, ex:
        assert ex.result == ex.args

This precisely parallels the behaviour of return statements:

    return                # Call returns None
    return expr           # Call returns expr
    return expr1, expr2   # Call returns (expr1, expr2)

Cheers,
Nick.
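Python never grew this exact `result` property (PEP 380 later added a `value` attribute with plainer semantics), but the three conditions can be modelled today with a hypothetical subclass:

```python
class StopIterationWithResult(StopIteration):
    # Hypothetical subclass modelling the proposed read-only property.
    @property
    def result(self):
        args = self.args
        if not args:
            return None      # raise StopIteration
        elif len(args) == 1:
            return args[0]   # raise StopIteration(expr)
        else:
            return args      # raise StopIteration(expr1, expr2)

# The three conditions from the proposal:
assert StopIterationWithResult().result is None
assert StopIterationWithResult(42).result == 42
assert StopIterationWithResult(1, 2).result == (1, 2)
```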
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.blogspot.com From ncoghlan at gmail.com Sun Oct 9 03:25:31 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 09 Oct 2005 11:25:31 +1000 Subject: [Python-Dev] Sandboxed Threads in Python In-Reply-To: <20051008125655.28B8.JCARLSON@uci.edu> References: <20051008104605.28B3.JCARLSON@uci.edu> <1377773721.20051008131425@MailBlocks.com> <20051008125655.28B8.JCARLSON@uci.edu> Message-ID: <4348718B.9000502@gmail.com> Josiah Carlson wrote: > I do, however, doubt that free threading approaches will be the future > for concurrent programming in CPython. Hear, hear! IMO, it's the combination of the GIL with a compiler which never decides to change the code execution order under the covers that makes threading *not* a pain in Python (so long as one remembers to release the GIL around blocking calls to external libraries, and to use threading.Queue to get info between threads wherever possible). The desire to change that seems to be a classic case of wanting to write C/C++/Java/whatever in Python, rather than writing Python in Python. And thanks to Bruce for starting the recent multi-processing discussion - hopefully one day we will have mechanisms in the standard library that scale relatively smoothly from PEP 342 based logical threads, through threading.Thread based physical threads, to based subprocesses. Cheers, Nick. 
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.blogspot.com From guido at python.org Sun Oct 9 03:25:39 2005 From: guido at python.org (Guido van Rossum) Date: Sat, 8 Oct 2005 18:25:39 -0700 Subject: [Python-Dev] PEP 342 suggestion: start(), __call__() and unwind_call() methods In-Reply-To: <43486E20.3010908@gmail.com> References: <5.1.1.6.0.20051007134543.03b2f728@mail.telecommunity.com> <43471EBE.40401@gmail.com> <4347652E.1090705@satori.za.net> <43486E20.3010908@gmail.com> Message-ID: > Guido van Rossum wrote: > > Before we do this I'd like to see you show some programming examples > > that show how this would be used. I'm having a hard time understanding > > where you would need this but I realize I haven't used this paradigm > > enough to have a good feel for it, so I'm open for examples. On 10/8/05, Nick Coghlan wrote: > It would be handy when the generators are being used as true pseudothreads > with a scheduler like the one I posted earlier in this discussion. It allows > these pseudothreads to "call" each other by yielding the call as a lambda or > partial function application that produces a zero-argument callable. The > called pseudothread can then yield as many times as it wants (either making > its own calls, or just being a well-behaved member of a cooperatively MT > environment), and then finally returning the value that the original caller > requested. > > Using 'return' for this is actually a nice idea, and if we ever do make it > legal to use 'return' in generators, these are the semantics it should have. 
>
> However, I'm not sure it's something we should be adding *right now* as part of
> PEP 342 - writing "raise StopIteration" and "raise StopIteration(result)", and
> saying that a generator includes an implied "raise StopIteration" after its
> last line of code really isn't that difficult to understand, and is completely
> explicit about what is going on.
>
> My basic concern is that I think replacing "raise StopIteration" with "return"
> and "raise StopIteration(EXPR)" with "return EXPR" would actually make such
> code easier to write at the expense of making it harder to *read*, because the
> fact that an exception is being raised is obscured. Consider the following two
> code snippets:
>
>     def function():
>         try:
>             return
>         except StopIteration:
>             print "We never get here."
>
>     def generator():
>         yield
>         try:
>             return
>         except StopIteration:
>             print "But we would get here!"

Right. Plus, Piet also remarked that the value is silently ignored when the
generator is used in a for-loop. Since that's likely to be the majority of
generators, I'd worry that accepting "return X" would increase the occurrence
of bugs caused by someone habitually writing "return X" where they meant
"yield X". (Assuming there's another yield in the generator, otherwise it
wouldn't be a generator and the error would reveal itself very differently.)
> So, instead of having "return" automatically map to "raise StopIteration"
> inside generators, I'd like to suggest we keep it illegal to use "return"
> inside a generator, and instead add a new attribute "result" to StopIteration
> instances such that the following three conditions hold:
>
>     # Result is None if there is no argument to StopIteration
>     try:
>         raise StopIteration
>     except StopIteration, ex:
>         assert ex.result is None
>
>     # Result is the argument if there is exactly one argument
>     try:
>         raise StopIteration(expr)
>     except StopIteration, ex:
>         assert ex.result == ex.args[0]
>
>     # Result is the argument tuple if there are multiple arguments
>     try:
>         raise StopIteration(expr1, expr2)
>     except StopIteration, ex:
>         assert ex.result == ex.args
>
> This precisely parallels the behaviour of return statements:
>     return                # Call returns None
>     return expr           # Call returns expr
>     return expr1, expr2   # Call returns (expr1, expr2)

This seems a bit overdesigned; I'd expect that the trampoline scheduler could
easily enough pick the args tuple apart to get the same effect without adding
another attribute unique to StopIteration. I'd like to keep StopIteration
really lightweight so it doesn't slow down its use in other places.
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From ncoghlan at gmail.com Sun Oct 9 04:43:34 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 09 Oct 2005 12:43:34 +1000 Subject: [Python-Dev] PEP 342 suggestion: start(), __call__() and unwind_call() methods In-Reply-To: References: <5.1.1.6.0.20051007134543.03b2f728@mail.telecommunity.com> <43471EBE.40401@gmail.com> <4347652E.1090705@satori.za.net> <43486E20.3010908@gmail.com> Message-ID: <434883D6.80009@gmail.com> Guido van Rossum wrote: > On 10/8/05, Nick Coghlan wrote: >>So, instead of having "return" automatically map to "raise StopIteration" >>inside generators, I'd like to suggest we keep it illegal to use "return" >>inside a generator, and instead add a new attribute "result" to StopIteration >>instances such that the following three conditions hold: >> >> # Result is None if there is no argument to StopIteration >> try: >> raise StopIteration >> except StopIteration, ex: >> assert ex.result is None >> >> # Result is the argument if there is exactly one argument >> try: >> raise StopIteration(expr) >> except StopIteration, ex: >> assert ex.result == ex.args[0] >> >> # Result is the argument tuple if there are multiple arguments >> try: >> raise StopIteration(expr1, expr2) >> except StopIteration, ex: >> assert ex.result == ex.args >> >>This precisely parallels the behaviour of return statements: >> return # Call returns None >> return expr # Call returns expr >> return expr1, expr2 # Call returns (expr1, expr2) > > > This seems a bit overdesigned; I'd expect that the trampoline > scheduler could easily enough pick the args tuple apart to get the > same effect without adding another attribute unique to StopIteration. > I'd like to keep StopIteration really lightweight so it doesn't slow > down its use in other places. True. 
And it would be easy enough for a framework to have a utility function that
looked like:

    def getresult(ex):
        args = ex.args
        if not args:
            return None
        elif len(args) == 1:
            return args[0]
        else:
            return args

Although, if StopIteration.result was a read-only property with the above
definition, wouldn't that give us the benefit of "one obvious way" to return a
value from a coroutine without imposing any runtime cost on normal use of
StopIteration to finish an iterator?

Cheers,
Nick.

-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia
---------------------------------------------------------------
http://boredomandlaziness.blogspot.com

From foom at fuhm.net Sun Oct 9 04:54:10 2005
From: foom at fuhm.net (James Y Knight)
Date: Sat, 8 Oct 2005 22:54:10 -0400
Subject: [Python-Dev] PEP 342 suggestion: start(), __call__() and unwind_call() methods
In-Reply-To: <43486E20.3010908@gmail.com>
References: <5.1.1.6.0.20051007134543.03b2f728@mail.telecommunity.com> <43471EBE.40401@gmail.com> <4347652E.1090705@satori.za.net> <43486E20.3010908@gmail.com>
Message-ID: <953B6108-C621-46C3-B492-6A726595403B@fuhm.net>

On Oct 8, 2005, at 9:10 PM, Nick Coghlan wrote:
> So, instead of having "return" automatically map to "raise
> StopIteration"
> inside generators, I'd like to suggest we keep it illegal to use
> "return"
> inside a generator

Only one issue with that: it's _not currently illegal_ to use return inside a
generator. From the view of the outsider, it currently effectively does map
to "raise StopIteration". But not on the inside, just like you'd expect to
happen. The only proposed change to the semantics is to also allow a value to
be provided with the return.

> def generator():
>     yield
>     try:
>         return
>     except StopIteration:
>         print "But we would get here!"

>>> def generator():
...     yield 5
...     try:
...         return
...     except StopIteration:
...         print "But we would get here!"
...
>>> x=generator()
>>> x.next()
5
>>> x.next()
Traceback (most recent call last):
  File "", line 1, in ?
StopIteration
>>>

James

From ncoghlan at gmail.com Sun Oct 9 05:26:15 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 09 Oct 2005 13:26:15 +1000
Subject: [Python-Dev] PEP 342 suggestion: start(), __call__() and unwind_call() methods
In-Reply-To: <953B6108-C621-46C3-B492-6A726595403B@fuhm.net>
References: <5.1.1.6.0.20051007134543.03b2f728@mail.telecommunity.com> <43471EBE.40401@gmail.com> <4347652E.1090705@satori.za.net> <43486E20.3010908@gmail.com> <953B6108-C621-46C3-B492-6A726595403B@fuhm.net>
Message-ID: <43488DD7.9010700@gmail.com>

James Y Knight wrote:
>
> On Oct 8, 2005, at 9:10 PM, Nick Coghlan wrote:
>
>> So, instead of having "return" automatically map to "raise
>> StopIteration"
>> inside generators, I'd like to suggest we keep it illegal to use
>> "return"
>> inside a generator
>
> Only one issue with that: it's _not currently illegal_ to use return
> inside a generator. From the view of the outsider, it currently
> effectively does map to "raise StopIteration". But not on the
> inside, just like you'd expect to happen. The only proposed change to
> the semantics is to also allow a value to be provided with the return.

Huh. I'd have sworn I'd tried that and it didn't work. Maybe I was using a
value with the return, and had forgotten the details of the error message.

In that case, I have far less of an objection to the idea - particularly
since it *does* forcibly terminate the generator's block without triggering
any exception handlers. I was forgetting that the StopIteration exception is
actually raised external to the generator code block - it's created by the
surrounding generator object once the code block terminates.

That means the actual change being proposed is smaller than I thought:

1. Change the compiler to allow an argument to return inside a generator
2.
Change generator objects to use the value returned by their internal code block as the argument to the StopIteration exception they create if the block terminates Note that this would change the behaviour of normal generators - they will raise "StopIteration(None)", rather than the current "StopIteration()". I actually kind of like that - it means that generators become even more like functions, with their return value being held in ex.args[0]. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.blogspot.com From mhammond at skippinet.com.au Sun Oct 9 10:29:27 2005 From: mhammond at skippinet.com.au (Mark Hammond) Date: Sun, 9 Oct 2005 18:29:27 +1000 Subject: [Python-Dev] PythonCore\CurrentVersion In-Reply-To: <4347A020.2050008@v.loewis.de> Message-ID: > What happened to the CurrentVersion registry entry documented at > > http://www.python.org/windows/python/registry.html > > AFAICT, even the python15.wse file did not fill a value in this > entry (perhaps I'm misinterpreting the wse file, though). > > So was this ever used? Why is it documented, and who documented it > (unfortunately, registry.html is not in cvs/subversion, either)? I believe I documented it many moons ago. I don't think CurrentVersion was ever implemented (or possibly was for a very short time before being removed). The "registered modules" concept was misguided and AFAIK is not used by anyone - IMO it should be deprecated (if not just removed!). Further, I believe the documentation in the file for PYTHONPATH is, as said in those docs, out of date, but that the comments in getpathp.c are correct. 
Cheers,
Mark

From ncoghlan at gmail.com Sun Oct 9 15:08:32 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 09 Oct 2005 23:08:32 +1000
Subject: [Python-Dev] New PEP 342 suggestion: result() and allow "return with arguments" in generators (was Re: PEP 342 suggestion: start(), __call__() and unwind_call() methods)
In-Reply-To: <434883D6.80009@gmail.com>
References: <5.1.1.6.0.20051007134543.03b2f728@mail.telecommunity.com> <43471EBE.40401@gmail.com> <4347652E.1090705@satori.za.net> <43486E20.3010908@gmail.com> <434883D6.80009@gmail.com>
Message-ID: <43491650.1020704@gmail.com>

Nick Coghlan wrote:
> Although, if StopIteration.result was a read-only property with the above
> definition, wouldn't that give us the benefit of "one obvious way" to return a
> value from a coroutine without imposing any runtime cost on normal use of
> StopIteration to finish an iterator?

Sometimes I miss the obvious. There's a *much*, *much* better place to store
the return value of a generator than on the StopIteration exception that it
raises when it finishes. Just save the return value in the *generator*.

And then provide a method on generators that is the functional equivalent of:

    def result(self):
        # Finish the generator if it isn't finished already
        for step in self:
            pass
        return self._result  # Return the result saved when the block finished

It doesn't matter that a for loop swallows the StopIteration exception any
more, because the return value is retrieved directly from the generator.

I also like that this interface could still be used even if the work of
getting the result is actually farmed off to a separate thread or process
behind the scenes.

Cheers,
Nick.

P.S. Here's what a basic trampoline scheduler without builtin asynchronous
call support would look like if coroutines could return values directly.
The bits that it cleans up are marked "NEW":

    import collections
    import sys
    import types

    class Trampoline:
        """Manage communications between coroutines"""
        running = False

        def __init__(self):
            self.queue = collections.deque()

        def add(self, coroutine):
            """Request that a coroutine be executed"""
            self.schedule(coroutine)

        def run(self):
            result = None
            self.running = True
            try:
                while self.running and self.queue:
                    func = self.queue.popleft()
                    result = func()
                return result
            finally:
                self.running = False

        def stop(self):
            self.running = False

        def schedule(self, coroutine, stack=(), call_result=None, *exc):
            # Define the new pseudothread
            def pseudothread():
                try:
                    if exc:
                        callee = coroutine.throw(call_result, *exc)
                    else:
                        callee = coroutine.send(call_result)
                except StopIteration:  # NEW: no need to name exception
                    # Coroutine finished cleanly
                    if stack:
                        # Send the result to the caller
                        caller = stack[0]
                        prev_stack = stack[1]
                        # NEW: get result directly from callee
                        self.schedule(caller, prev_stack, callee.result())
                except:
                    # Coroutine finished with an exception
                    if stack:
                        # send the error back to the caller
                        caller = stack[0]
                        prev_stack = stack[1]
                        self.schedule(caller, prev_stack, *sys.exc_info())
                    else:
                        # Nothing left in this pseudothread to handle it,
                        # let it propagate to the run loop
                        raise
                else:
                    # Coroutine isn't finished yet
                    if callee is None:
                        # Reschedule the current coroutine
                        self.schedule(coroutine, stack)
                    elif isinstance(callee, types.GeneratorType):
                        # Make a call to another coroutine
                        self.schedule(callee, (coroutine, stack))
                    elif callable(callee):
                        # Make a blocking call in a separate thread
                        self.schedule(threaded(callee), (coroutine, stack))
                    else:
                        # Raise a TypeError in the current coroutine
                        self.schedule(coroutine, stack, TypeError,
                                      "Illegal argument to yield")

            # Add the new pseudothread to the execution queue
            self.queue.append(pseudothread)

P.P.S.
Here's the simple coroutine that threads out a call to support asynchronous
calls with the above scheduler:

    def threaded(func):
        class run_func(threading.Thread):
            def __init__(self):
                super(run_func, self).__init__()
                self.finished = False
            def run(self):
                print "Making call"
                self.result = func()
                self.finished = True
                print "Made call"
        call = run_func()
        call.start()
        print "Started call"
        while not call.finished:
            yield  # Not finished yet so reschedule
        print "Finished call"
        return call.result

I tried this out by replacing 'yield' with 'yield None' and 'return
call.result' with 'print call.result':

    Py> x = threaded(lambda: "Hi there!")
    Py> x.next()
    Started call
    Making call
    Made call
    Py> x.next()
    Finished call
    Hi there!
    Traceback (most recent call last):
      File "", line 1, in ?
    StopIteration

-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia
---------------------------------------------------------------
http://boredomandlaziness.blogspot.com

From andersjm at inbound.dk Sun Oct 9 16:00:04 2005
From: andersjm at inbound.dk (Anders J. Munch)
Date: Sun, 09 Oct 2005 16:00:04 +0200
Subject: [Python-Dev] Proposed changes to PEP 343
In-Reply-To: <43468257.9030008@gmail.com>
References: <9B1795C95533CA46A83BA1EAD4B01030031F0B@flonidanmail.flonidan.net> <43468257.9030008@gmail.com>
Message-ID: <43492264.5080403@inbound.dk>

Nick Coghlan wrote:
> Anders J. Munch wrote:
>
>> Note that __with__ and __enter__ could be combined into one with no
>> loss of functionality:
>>
>>     abc,VAR = (EXPR).__with__()
>
> They can't be combined, because they're invoked on different objects.

Sure they can. The combined method first does what __with__ would have done
to create abc, and then does whatever abc.__enter__ would have done. Since
the type of 'abc' is always known to the author of __with__, this is
trivial. Strictly speaking there's no guarantee that the type of 'abc' is
known to the author of __with__, but I can't imagine an example where that
would not be the case.
> It would be like trying to combine __iter__() and next() into the same
> method for iterators...

The with-statement needs two pieces of information from the expression: Which object to bind to the user's variable (VAR) and which object takes care of block-exit cleanup (abc). A combined method would give these two equal standing rather than deriving one from the other. Nothing ugly about that. - Anders From guido at python.org Sun Oct 9 16:28:29 2005 From: guido at python.org (Guido van Rossum) Date: Sun, 9 Oct 2005 07:28:29 -0700 Subject: [Python-Dev] Proposed changes to PEP 343 In-Reply-To: <43492264.5080403@inbound.dk> References: <9B1795C95533CA46A83BA1EAD4B01030031F0B@flonidanmail.flonidan.net> <43468257.9030008@gmail.com> <43492264.5080403@inbound.dk> Message-ID: On 10/9/05, Anders J. Munch wrote: > Nick Coghlan wrote: > >Anders J. Munch wrote: > > > >>Note that __with__ and __enter__ could be combined into one with no > >>loss of functionality: > >> > >> abc,VAR = (EXPR).__with__() > >> > > > >They can't be combined, because they're invoked on different objects. > > > > Sure they can. The combined method first does what __with__ would > have done to create abc, and then does whatever abc.__enter__ would > have done. Since the type of 'abc' is always known to the author of > __with__, this is trivial. I'm sure it can be done, but I find this ugly API design. While I'm not keen on complicating the API, the decimal context example has convinced me that it's necessary. The separation into __with__ which asks EXPR for a context manager and __enter__ / __exit__ which handle try/finally feels right. An API returning a tuple is asking for bugs.
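The two shapes under debate can be sketched side by side. The class names here are hypothetical; only the `__with__`/`__enter__`/`__exit__` method names come from the proposal being discussed:

```python
class Manager:
    # The object that handles block-exit cleanup ("abc" in Anders' notation).
    def __enter__(self):
        return "VAR"          # the value bound to the user's variable
    def __exit__(self, *exc_info):
        pass                  # cleanup would go here

class Separate:
    # Separated shape: EXPR.__with__() returns a context manager;
    # the statement then calls __enter__ on it to get VAR.
    def __with__(self):
        return Manager()

class Combined:
    # Anders' combined shape: one call returns both pieces as a tuple,
    # i.e. abc, VAR = (EXPR).__with__()
    def __with__(self):
        mgr = Manager()
        return mgr, mgr.__enter__()

mgr = Separate().__with__()
assert mgr.__enter__() == "VAR"

abc, var = Combined().__with__()
assert var == "VAR"
```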
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Sun Oct 9 16:46:09 2005 From: guido at python.org (Guido van Rossum) Date: Sun, 9 Oct 2005 07:46:09 -0700 Subject: [Python-Dev] New PEP 342 suggestion: result() and allow "return with arguments" in generators (was Re: PEP 342 suggestion: start(), __call__() and unwind_call() methods) In-Reply-To: <43491650.1020704@gmail.com> References: <5.1.1.6.0.20051007134543.03b2f728@mail.telecommunity.com> <43471EBE.40401@gmail.com> <4347652E.1090705@satori.za.net> <43486E20.3010908@gmail.com> <434883D6.80009@gmail.com> <43491650.1020704@gmail.com> Message-ID: On 10/9/05, Nick Coghlan wrote: > Sometimes I miss the obvious. There's a *much*, *much* better place to store > the return value of a generator than on the StopIteration exception that it > raises when it finishes. Just save the return value in the *generator*. > > And then provide a method on generators that is the functional equivalent of: > > def result(): > # Finish the generator if it isn't finished already > for step in self: > pass > return self._result # Return the result saved when the block finished > > It doesn't matter that a for loop swallows the StopIteration exception any > more, because the return value is retrieved directly from the generator. Actually, I don't like this at all. It harks back to earlier proposals where state was stored on the generator (e.g. PEP 288). > I also like that this interface could still be used even if the work of > getting the result is actually farmed off to a separate thread or process > behind the scenes. That seems an odd use case for generators, better addressed by creating an explicit helper object when the need exists. I bet that object will need to exist anyway to hold other information related to the exchange of information between threads (like a lock or a Queue). Looking at your example, I have to say that I find the trampoline example from PEP 342 really hard to understand. 
It took me several days to get it after Phillip first put it in the PEP, and that was after having reconstructed the same functionality independently. (I have plans to replace or augment it with a different set of examples, but haven't gotten the time. Old story...) I don't think that something like that ought to be motivating generator extensions. I also think that using a thread for async I/O is the wrong approach -- if you wanted to use threads you should be using threads and you wouldn't be dealing with generators. There's a solution that uses select() which can handle as many sockets as you want without threads and without the clumsy polling ("is it ready yet? is it ready yet? is it ready yet?"). I urge you to leave well enough alone. There's room for extensions after people have built real systems with the raw material provided by PEP 342 and 343. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From jim at zope.com Sun Oct 9 18:33:12 2005 From: jim at zope.com (Jim Fulton) Date: Sun, 09 Oct 2005 12:33:12 -0400 Subject: [Python-Dev] defaultproperty (was: Re: RFC: readproperty) In-Reply-To: <433BA3CF.1090205@zope.com> References: <433AA5AC.6040509@zope.com> <433BA3CF.1090205@zope.com> Message-ID: <43494648.6040904@zope.com> Based on the discussion, I think I'd go with defaultproperty. Questions: - Should this be in builtins, alongside property, or in a library module? (Oleg suggested propertytools.) - Do we need a short PEP? Jim Jim Fulton wrote: > Guido van Rossum wrote: > >>On 9/28/05, Jim Fulton wrote: >> > > ... > >>I think we need to be real careful with chosing a name -- in Jim's >>example, *anyone* could assign to Spam().eggs to override the value. >>The name "readproperty" is too close to "readonlyproperty",
"Lazy" also doesn't really describe what's going >>on. > > > I agree. > > >>I believe some folks use a concept of "memo functions" which resemble >>this proposal except the notation is different: IIRC a memo function >>is always invoked as a function, but stores its result in a private >>instance variable, which it returns upon subsequent calls. This is a >>common pattern. Jim's proposal differs because the access looks like >>an attribute, not a method call. Still, perhaps memoproperty would be >>a possible name. >> >>Another way to look at the naming problem is to recognize that the >>provided function really computes a default value if the attribute >>isn't already set. So perhaps defaultproperty? > > > Works for me. > > Oleg Broytmann wrote: > > On Wed, Sep 28, 2005 at 10:16:12AM -0400, Jim Fulton wrote: > > > >> class readproperty(object): > > > > [skip] > > > >>I do this often enough > > > > > > I use it since about 2000 often enough under the name CachedAttribute: > > > > http://cvs.sourceforge.net/viewcvs.py/ppa/qps/qUtils.py > > Steven Bethard wrote: > > Jim Fulton wrote: > > > ... > > I've also needed behavior like this a few times, but I use a variant > > of Scott David Daniel's recipe[1]: > > > > class _LazyAttribute(object): > > > Yup, the Zope 3 sources have something very similar: > > http://svn.zope.org/Zope3/trunk/src/zope/cachedescriptors/property.py?view=markup > > I actually think this does too much. All it saves me, compared to what I proposed > is one assignment. I'd rather make that assignment explicit. > > Anyway, all I wanted with readproperty was a property that implemented only > __get__, as opposed to property, which implements __get__, __set__, and __delete__. > > I'd be happy to call it readproprty or getproperty or defaulproperty or whatever. :) > > I'd prefer that it's semantics stay fairly simple though. > > > Jim > -- Jim Fulton mailto:jim at zope.com Python Powered! 
CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From solipsis at pitrou.net Sun Oct 9 21:02:16 2005 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sun, 09 Oct 2005 21:02:16 +0200 Subject: [Python-Dev] async IO and helper threads In-Reply-To: References: <5.1.1.6.0.20051007134543.03b2f728@mail.telecommunity.com> <43471EBE.40401@gmail.com> <4347652E.1090705@satori.za.net> <43486E20.3010908@gmail.com> <434883D6.80009@gmail.com> <43491650.1020704@gmail.com> Message-ID: <1128884536.6142.10.camel@fsol> Le dimanche 09 octobre 2005 à 07:46 -0700, Guido van Rossum a écrit : > I > also think that using a thread for async I/O is the wrong approach -- > if you wanted to use threads you should be using threads and you > wouldn't be dealing with generators. There's a solution that uses > select() which can handle as many sockets as you want without threads > and without the clumsy polling select() works with sockets. But nothing else if you want to stay cross-platform, so async file IO and other things remain open questions. By the way, you don't need clumsy polling to wait for helper threads ;) You can just use a ConditionVariable from the threading package (or something else with the same semantics). BTW, I'm not arguing at all for the extension proposal. Integrating async stuff into generators does not need an API extension IMO. I'm already doing it in my scheduler. An example which just waits for an external command to finish and periodically spins a character in the meantime: http://svn.berlios.de/viewcvs/tasklets/trunk/examples/popen1.py?view=markup The scheduler code is here: http://svn.berlios.de/viewcvs/tasklets/trunk/softlets/core/switcher.py?view=markup Regards Antoine. 
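[The no-polling pattern Antoine describes needs nothing beyond the standard `threading` module. A minimal sketch, with all names (class, methods) made up for illustration rather than taken from his scheduler:]

```python
import threading

class HelperResult:
    """One-shot result slot: the waiter blocks on a condition variable
    instead of polling "is it ready yet?"."""

    def __init__(self):
        self._cond = threading.Condition()
        self._done = False
        self._value = None

    def set(self, value):
        with self._cond:
            self._value = value
            self._done = True
            self._cond.notify_all()    # wake any waiting threads

    def wait(self):
        with self._cond:
            while not self._done:      # loop guards against spurious wakeups
                self._cond.wait()
            return self._value

slot = HelperResult()
worker = threading.Thread(target=lambda: slot.set(6 * 7))
worker.start()
print(slot.wait())  # blocks until the helper thread delivers 42
worker.join()
```

The `while not self._done` loop, rather than a bare `wait()`, is the standard discipline for condition variables: a waiter must re-check its predicate after every wakeup.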
From greg.ewing at canterbury.ac.nz Mon Oct 10 02:33:42 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Mon, 10 Oct 2005 13:33:42 +1300 Subject: [Python-Dev] Removing the block stack (was Re: PEP 343 and __with__) In-Reply-To: <5.1.1.6.0.20051005191114.01f6f018@mail.telecommunity.com> References: <5.1.1.6.0.20051003151012.01f95ca0@mail.telecommunity.com> <5.1.1.6.0.20051003125254.01f95e50@mail.telecommunity.com> <5.1.1.6.0.20051003125254.01f95e50@mail.telecommunity.com> <5.1.1.6.0.20051003151012.01f95ca0@mail.telecommunity.com> <5.1.1.6.0.20051005191114.01f6f018@mail.telecommunity.com> Message-ID: <4349B6E6.4020804@canterbury.ac.nz> Phillip J. Eby wrote: > Clearly, the cost of function calls in Python lies somewhere else, and I'd > probably look next at parameter tuple allocation, For simple calls where there aren't any *args or other such complications, it seems like it should be possible to just copy the args from the calling frame straight into the called one. Or is this already done these days? -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg.ewing at canterbury.ac.nz +--------------------------------------+ From guido at python.org Mon Oct 10 02:35:55 2005 From: guido at python.org (Guido van Rossum) Date: Sun, 9 Oct 2005 17:35:55 -0700 Subject: [Python-Dev] defaultproperty (was: Re: RFC: readproperty) In-Reply-To: <43494648.6040904@zope.com> References: <433AA5AC.6040509@zope.com> <433BA3CF.1090205@zope.com> <43494648.6040904@zope.com> Message-ID: On 10/9/05, Jim Fulton wrote: > Based on the discussion, I think I'd go with defaultproperty. Great. > Questions: > > - Should this be in builtins, alongside property, or in > a library module? (Oleg suggested propertytools.) > > - Do we need a short PEP? I think so. 
From the responses I'd say there's at most lukewarm interest (including from me). You might also want to drop it and just add it to your personal (or Zope's) library. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From pje at telecommunity.com Mon Oct 10 03:18:30 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Sun, 09 Oct 2005 21:18:30 -0400 Subject: [Python-Dev] Removing the block stack (was Re: PEP 343 and __with__) In-Reply-To: <4349B6E6.4020804@canterbury.ac.nz> References: <5.1.1.6.0.20051005191114.01f6f018@mail.telecommunity.com> <5.1.1.6.0.20051003151012.01f95ca0@mail.telecommunity.com> <5.1.1.6.0.20051003125254.01f95e50@mail.telecommunity.com> <5.1.1.6.0.20051003125254.01f95e50@mail.telecommunity.com> <5.1.1.6.0.20051003151012.01f95ca0@mail.telecommunity.com> <5.1.1.6.0.20051005191114.01f6f018@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20051009211304.01f30270@mail.telecommunity.com> At 01:33 PM 10/10/2005 +1300, Greg Ewing wrote: >Phillip J. Eby wrote: > > > Clearly, the cost of function calls in Python lies somewhere else, and I'd > > probably look next at parameter tuple allocation, > >For simple calls where there aren't any *args or other >such complications, it seems like it should be possible >to just copy the args from the calling frame straight >into the called one. > >Or is this already done these days? It's already done, if the number of arguments matches, the code flags are just so, etc. 
From greg.ewing at canterbury.ac.nz Mon Oct 10 04:43:20 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Mon, 10 Oct 2005 15:43:20 +1300 Subject: [Python-Dev] New PEP 342 suggestion: result() and allow "return with arguments" in generators (was Re: PEP 342 suggestion: start(), __call__() and unwind_call() methods) In-Reply-To: <43491650.1020704@gmail.com> References: <5.1.1.6.0.20051007134543.03b2f728@mail.telecommunity.com> <43471EBE.40401@gmail.com> <4347652E.1090705@satori.za.net> <43486E20.3010908@gmail.com> <434883D6.80009@gmail.com> <43491650.1020704@gmail.com> Message-ID: <4349D548.3030000@canterbury.ac.nz> Nick Coghlan wrote: > Sometimes I miss the obvious. There's a *much*, *much* better place to store > the return value of a generator than on the StopIteration exception that it > raises when it finishes. Just save the return value in the *generator*. I'm not convinced that this is better, because it would make value-returning something specific to generators. On the other hand, raising StopIteration(value) is something that any iterator can easily do, whether it's implemented as a generator, a Python class, a C type, or whatever. Besides, it doesn't smell right to me -- sort of like returning a value from a function by storing it in a global rather than using a return statement. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. 
| greg.ewing at canterbury.ac.nz +--------------------------------------+ From greg.ewing at canterbury.ac.nz Mon Oct 10 04:43:31 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Mon, 10 Oct 2005 15:43:31 +1300 Subject: [Python-Dev] PEP 342 suggestion: start(), __call__() and unwind_call() methods In-Reply-To: References: <5.1.1.6.0.20051007134543.03b2f728@mail.telecommunity.com> <43471EBE.40401@gmail.com> <4347652E.1090705@satori.za.net> <43486E20.3010908@gmail.com> Message-ID: <4349D553.1060909@canterbury.ac.nz> Guido van Rossum wrote: > Plus, Piet also remarked that the value is silently ignored > when the generator is used in a for-loop. ... I'd worry that accepting > "return X" would increase the occurrence of bugs caused by someone > habitually writing "return X" where they meant "yield X". Then have for-loops raise an exception if they get a StopIteration with something other than None as an argument. > I'd like to keep StopIteration really lightweight so it doesn't slow > down its use in other places. You could leave StopIteration itself alone altogether and have a subclass StopIterationWithValue for returning things. This would make the for-loop situation even safer, since then you could distinguish between falling off the end of a generator and executing 'return None' inside it. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. 
| greg.ewing at canterbury.ac.nz +--------------------------------------+ From greg.ewing at canterbury.ac.nz Mon Oct 10 04:44:12 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Mon, 10 Oct 2005 15:44:12 +1300 Subject: [Python-Dev] Extending tuple unpacking In-Reply-To: References: <20051007172237.GA13288@localhost.localdomain> Message-ID: <4349D57C.7010509@canterbury.ac.nz> Guido van Rossum wrote: > I personally think this is adequately handled by writing: > > (first, second), rest = something[:2], something[2:] That's less than satisfying because it violates DRY three times (once for mentioning 'something' twice, once for mentioning the index twice, and once for needing to make sure the index agrees with the number of items on the LHS). > Argument lists are not tuples [*] and features of argument lists > should not be confused with features of tuple unpackings. I'm aware of the differences, but I still see a strong similarity where this particular feature is concerned. The pattern of thinking is the same: "I want to deal with the first n of these things individually, and the rest collectively." -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg.ewing at canterbury.ac.nz +--------------------------------------+ From fdrake at acm.org Mon Oct 10 05:04:58 2005 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Sun, 9 Oct 2005 23:04:58 -0400 Subject: [Python-Dev] Extending tuple unpacking In-Reply-To: <4349D57C.7010509@canterbury.ac.nz> References: <20051007172237.GA13288@localhost.localdomain> <4349D57C.7010509@canterbury.ac.nz> Message-ID: <200510092304.58638.fdrake@acm.org> On Sunday 09 October 2005 22:44, Greg Ewing wrote: > I'm aware of the differences, but I still see a strong > similarity where this particular feature is concerned. 
> The pattern of thinking is the same: "I want to deal > with the first n of these things individually, and the > rest collectively." Well stated. I'm in complete agreement on this matter. -Fred -- Fred L. Drake, Jr. From ncoghlan at gmail.com Mon Oct 10 00:28:14 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 10 Oct 2005 08:28:14 +1000 Subject: [Python-Dev] defaultproperty In-Reply-To: <43494648.6040904@zope.com> References: <433AA5AC.6040509@zope.com> <433BA3CF.1090205@zope.com> <43494648.6040904@zope.com> Message-ID: <4349997E.9010208@gmail.com> Jim Fulton wrote: > Based on the discussion, I think I'd go with defaultproperty. > > Questions: > > - Should this be in builtins, alongside property, or in > a library module? (Oleg suggested propertytools.) > > - Do we need a short PEP? The much-discussed never-created decorators module, perhaps? Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.blogspot.com From ironfroggy at gmail.com Mon Oct 10 07:35:55 2005 From: ironfroggy at gmail.com (Calvin Spealman) Date: Mon, 10 Oct 2005 01:35:55 -0400 Subject: [Python-Dev] Removing the block stack (was Re: PEP 343 and __with__) In-Reply-To: <5.1.1.6.0.20051006024517.01f6f0d0@mail.telecommunity.com> References: <5.1.1.6.0.20051003125254.01f95e50@mail.telecommunity.com> <5.1.1.6.0.20051003151012.01f95ca0@mail.telecommunity.com> <2mu0fxekdz.fsf@starship.python.net> <5.1.1.6.0.20051005191114.01f6f018@mail.telecommunity.com> <5.1.1.6.0.20051006024517.01f6f0d0@mail.telecommunity.com> Message-ID: <76fd5acf0510092235u82c5a32vb436e3e10a4118cf@mail.gmail.com> On 10/6/05, Phillip J. Eby wrote: > At 10:09 PM 10/5/2005 -0700, Neal Norwitz wrote: > >The general idea is to allocate the stack in one big hunk and just > >walk up/down it as functions are called/returned. This only means > >incrementing or decrementing pointers. 
This should allow us to avoid >a bunch of copying and tuple creation/destruction. Frames would >hopefully be the same size which would help. Note that even though >there is a free list for frames, there could still be >PyObject_GC_Resize()s often (or unused memory). With my idea, >hopefully there would be better memory locality, which could speed >things up. > > Yeah, unfortunately for your idea, generators would have to copy off bits > of the stack and then copy them back in, making generators slower. If it > weren't for that part, the idea would probably be a good one, as arguments, > locals, cells, and the block and value stacks could all be handled that > way, with the compiler treating all operations as base-pointer offsets, > thereby eliminating lots of more-complex pointer management in ceval.c and > frameobject.c. If we had these separate stacks for each thread, would it be possible to also create a stack for generator calls? The current call operations could possibly do a check to see if the function being called is a generator (if code objects don't already have a generator bit, could one be added to speed this up?). This generator-specific stack would be used for the generator's frame and any calls it makes on each iteration. This may pose the threat of a bottleneck, allocating a new stack in the heap for every generator call, but generators are generally iterated more than created and the stacks could be pooled, of course. I don't know as much as I'd like about the CPython internals, so I'm just throwing this out there for commenting by those in the know. 
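[For what it's worth, the "generator bit" asked about above does exist at the Python level: the compiler sets a `CO_GENERATOR` flag on the code object of any function containing `yield`, and current Python's `inspect.isgeneratorfunction` is essentially a wrapper around that flag test. A quick sketch:]

```python
import inspect

def gen():
    yield 1

def plain():
    return 1

# The compiler marks generator functions with CO_GENERATOR (0x20)
# in the code object's co_flags.
assert gen.__code__.co_flags & inspect.CO_GENERATOR
assert not plain.__code__.co_flags & inspect.CO_GENERATOR

# inspect exposes the same test for convenience.
assert inspect.isgeneratorfunction(gen)
assert not inspect.isgeneratorfunction(plain)
```

So a call site can distinguish generator functions from plain functions without calling them, which is the cheap check Calvin is asking for.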
From ironfroggy at gmail.com Mon Oct 10 07:47:32 2005 From: ironfroggy at gmail.com (Calvin Spealman) Date: Mon, 10 Oct 2005 01:47:32 -0400 Subject: [Python-Dev] Fwd: defaultproperty In-Reply-To: <76fd5acf0510092246q1861403flda2dd49ddbe02d6c@mail.gmail.com> References: <433AA5AC.6040509@zope.com> <433BA3CF.1090205@zope.com> <43494648.6040904@zope.com> <4349997E.9010208@gmail.com> <76fd5acf0510092246q1861403flda2dd49ddbe02d6c@mail.gmail.com> Message-ID: <76fd5acf0510092247m4d8383e8o40260de7950906@mail.gmail.com> Sorry, Nick. GMail, for some reason, doesn't follow the reply-to properly for python-dev. Forwarding to list now... On 10/9/05, Nick Coghlan wrote: > Jim Fulton wrote: > > Based on the discussion, I think I'd go with defaultproperty. > > > > Questions: > > > > - Should this be in builtins, alongside property, or in > > a library module? (Oleg suggested propertytools.) > > > > - Do we need a short PEP? > > The much-discussed never-created decorators module, perhaps? > > Cheers, > Nick. Never created for a reason? Lumping things together for having similar usage semantics, but unrelated purposes, might be something to avoid, and maybe that's why it hasn't happened yet for decorators. If ever there was a makethreadsafe decorator, it should go in the thread module, etc. I mean, come on, it's like making a module just to store a bunch of unrelated types just to lump them together because they're types. Who wants that? 
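[As a point of reference for the thread: the descriptor under discussion is tiny. A sketch matching Jim's description (only `__get__`, so ordinary attribute assignment can override it); the caching line is the one extra assignment Jim would rather keep explicit, so treat it as an optional variant:]

```python
class defaultproperty(object):
    """Non-data descriptor: computes a default value on first access.

    Because it defines only __get__ (no __set__ or __delete__), a plain
    assignment to the attribute shadows it via the instance __dict__.
    """

    def __init__(self, func):
        self.func = func
        self.__doc__ = func.__doc__

    def __get__(self, inst, cls=None):
        if inst is None:
            return self          # accessed on the class itself
        value = self.func(inst)
        # The "one assignment": cache so func runs at most once per instance.
        inst.__dict__[self.func.__name__] = value
        return value

class Spam(object):
    @defaultproperty
    def eggs(self):
        return 42

spam = Spam()
print(spam.eggs)   # computed on first access: 42
spam.eggs = 3      # *anyone* can override -- it isn't read-only
print(spam.eggs)   # 3
```

This is the behaviour Guido noted earlier in the thread: unlike `property`, nothing stops an assignment from replacing the computed value.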
From fredrik at pythonware.com Mon Oct 10 10:24:47 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Mon, 10 Oct 2005 10:24:47 +0200 Subject: [Python-Dev] defaultproperty References: <433AA5AC.6040509@zope.com><433BA3CF.1090205@zope.com> <43494648.6040904@zope.com><4349997E.9010208@gmail.com><76fd5acf0510092246q1861403flda2dd49ddbe02d6c@mail.gmail.com> <76fd5acf0510092247m4d8383e8o40260de7950906@mail.gmail.com> Message-ID: Calvin Spealman wrote: > I mean, come on, its like making a module just to store a bunch of > unrelated types just to lump them together because they're types. import types From ncoghlan at gmail.com Mon Oct 10 11:02:57 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 10 Oct 2005 19:02:57 +1000 Subject: [Python-Dev] New PEP 342 suggestion: result() and allow "return with arguments" in generators (was Re: PEP 342 suggestion: start(), __call__() and unwind_call() methods) In-Reply-To: <4349D548.3030000@canterbury.ac.nz> References: <5.1.1.6.0.20051007134543.03b2f728@mail.telecommunity.com> <43471EBE.40401@gmail.com> <4347652E.1090705@satori.za.net> <43486E20.3010908@gmail.com> <434883D6.80009@gmail.com> <43491650.1020704@gmail.com> <4349D548.3030000@canterbury.ac.nz> Message-ID: <434A2E41.7040903@gmail.com> Greg Ewing wrote: > Nick Coghlan wrote: > > >>Sometimes I miss the obvious. There's a *much*, *much* better place to store >>the return value of a generator than on the StopIteration exception that it >>raises when it finishes. Just save the return value in the *generator*. > > > I'm not convinced that this is better, because it would > make value-returning something specific to generators. > > On the other hand, raising StopIteration(value) is something > that any iterator can easily do, whether it's implemented > as a generator, a Python class, a C type, or whatever. > > Besides, it doesn't smell right to me -- sort of like returning > a value from a function by storing it in a global rather than > using a return statement. 
Yeah, the various responses have persuaded me that having generators resemble threads in that they don't have a defined "return value" isn't a bad thing at all. Although that means I've gone all the way back to preferring the status quo - if you want to pass data back from a generator when it terminates, just use StopIteration(result). I'm starting to think we want to let PEP 342 bake for at least one release cycle before deciding what (if any) additional behaviour should be added to generators. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.blogspot.com From ncoghlan at gmail.com Mon Oct 10 11:21:56 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 10 Oct 2005 19:21:56 +1000 Subject: [Python-Dev] Extending tuple unpacking In-Reply-To: <200510092304.58638.fdrake@acm.org> References: <20051007172237.GA13288@localhost.localdomain> <4349D57C.7010509@canterbury.ac.nz> <200510092304.58638.fdrake@acm.org> Message-ID: <434A32B4.1050709@gmail.com> Fred L. Drake, Jr. wrote: > On Sunday 09 October 2005 22:44, Greg Ewing wrote: > > I'm aware of the differences, but I still see a strong > > similarity where this particular feature is concerned. > > The pattern of thinking is the same: "I want to deal > > with the first n of these things individually, and the > > rest collectively." > > Well stated. I'm in complete agreement on this matter. It also works for situations where "the first n items are mandatory, the rest are optional". This usage was brought up in the context of a basic line interpreter: cmd, *args = input.split() Another usage is to have a Python function which doesn't support keywords for its positional arguments (to avoid namespace clashes in the keyword dict), but can still unpack the mandatory arguments easily: def func(*args, **kwds): arg1, arg2, *rest = args # Unpack the positional arguments Cheers, Nick. 
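[For readers with a modern interpreter: the unpacking syntax Nick uses above eventually became real Python (star-unpacking, PEP 3132, Python 3.0), so the semantics under discussion can be tried directly. A sketch, including the empty-input edge case:]

```python
# The starred target collects whatever is left over, as a list.
cmd, *args = "open file.txt rb".split()
assert cmd == "open"
assert args == ["file.txt", "rb"]

# When nothing is left over, the starred name is simply empty...
cmd, *args = ["help"]
assert args == []

# ...but the mandatory names still have to be satisfied, so an
# empty line makes the line-interpreter example fail.
try:
    cmd, *args = "".split()
except ValueError as exc:
    print(exc)  # e.g. "not enough values to unpack (expected at least 1, got 0)"
```

Note the last case is exactly the objection Guido raises about the line-interpreter example: the mandatory part of the pattern still has to match.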
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.blogspot.com From tzot at mediconsa.com Mon Oct 10 13:25:40 2005 From: tzot at mediconsa.com (Christos Georgiou) Date: Mon, 10 Oct 2005 14:25:40 +0300 Subject: [Python-Dev] PEP 3000 and exec Message-ID: This might be minor-- but I didn't see anyone mentioning it so far. If `exec` functionality is to be provided, then I think it still should be a keyword for the parser to know; currently bytecode generation is affected if `exec` is present. Even if that changes for Python 3k (we don't know yet), the paragraph for exec should be annotated with a note about this issue. From jim at zope.com Mon Oct 10 14:25:59 2005 From: jim at zope.com (Jim Fulton) Date: Mon, 10 Oct 2005 08:25:59 -0400 Subject: [Python-Dev] defaultproperty In-Reply-To: References: <433AA5AC.6040509@zope.com> <433BA3CF.1090205@zope.com> <43494648.6040904@zope.com> Message-ID: <434A5DD7.70707@zope.com> Guido van Rossum wrote: > On 10/9/05, Jim Fulton wrote: > >>Based on the discussion, I think I'd go with defaultproperty. > > > Great. > > >>Questions: >> >>- Should this be in builtins, alongside property, or in >> a library module? (Oleg suggested propertytools.) >> >>- Do we need a short PEP? > > > I think so. From the responses I'd say there's at most lukewarm > interest (including from me). Hm, I saw several responses from people who'd built something quite similar. This suggests to me that this is a common need. > You might also want to drop it and just > add it to your personal (or Zope's) library. I have something like this in Zope's library. I end up with a very small package that isn't logically part of other packages, but that is a dependency of lots of packages. I don't like that, but I guess I should get over it. I must say that I am of 2 minds about things like this. 
On the one hand, I'd like Python's standard library to be small with packaging systems to provide "extra batteries". OTOH, I often find small tools like this that would be nice to have readily available. Jim -- Jim Fulton mailto:jim at zope.com Python Powered! CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From circlecycle at gmail.com Sun Oct 9 02:04:13 2005 From: circlecycle at gmail.com (jamesr) Date: Sat, 8 Oct 2005 20:04:13 -0400 Subject: [Python-Dev] C.E.R. Thoughts Message-ID: <78d129adb4581d24b1d07844019a2afe@gmail.com> Congratulations heartily given. I missed the ternary op in c... Way to go! clean and easy and now I can do: if ((sys.argv[1] =='debug') if len(sys.argv) > 1 else False): pass and check variables IF AND ONLY if they exist, in a single line! but y'all knew that.. From phd at mail2.phd.pp.ru Mon Oct 10 15:02:48 2005 From: phd at mail2.phd.pp.ru (Oleg Broytmann) Date: Mon, 10 Oct 2005 17:02:48 +0400 Subject: [Python-Dev] C.E.R. Thoughts In-Reply-To: <78d129adb4581d24b1d07844019a2afe@gmail.com> References: <78d129adb4581d24b1d07844019a2afe@gmail.com> Message-ID: <20051010130248.GB19369@phd.pp.ru> On Sat, Oct 08, 2005 at 08:04:13PM -0400, jamesr wrote: > if ((sys.argv[1] =='debug') if len(sys.argv) > 1 else False): > pass Very good example! A very good example of why ternary operators must be forbidden! Oleg. -- Oleg Broytmann http://phd.pp.ru/ phd at phd.pp.ru Programmers don't die, they just GOSUB without RETURN. From exarkun at divmod.com Mon Oct 10 15:26:15 2005 From: exarkun at divmod.com (Jp Calderone) Date: Mon, 10 Oct 2005 09:26:15 -0400 Subject: [Python-Dev] C.E.R. Thoughts In-Reply-To: <78d129adb4581d24b1d07844019a2afe@gmail.com> Message-ID: <20051010132615.3914.1043000115.divmod.quotient.26309@ohm> On Sat, 8 Oct 2005 20:04:13 -0400, jamesr wrote: >Congratulations heartily given. I missed the ternary op in c... Way to >go! 
clean and easy and now i can do: > >if ((sys.argv[1] =='debug') if len(sys.argv) > 1 else False): > pass > >and check variables IF AND ONLY if they exist, in a single line! if len(sys.argv) > 1 and sys.argv[1] == 'debug': ... usually-wouldn't-but-can't-pass-it-up-ly y'rs, Jp From ark at acm.org Mon Oct 10 15:26:04 2005 From: ark at acm.org (Andrew Koenig) Date: Mon, 10 Oct 2005 09:26:04 -0400 Subject: [Python-Dev] C.E.R. Thoughts In-Reply-To: <78d129adb4581d24b1d07844019a2afe@gmail.com> Message-ID: <000901c5cd9e$2d7275d0$6402a8c0@arkdesktop> > Congragulations heartily given. I missed the ternary op in c... Way to > go! clean and easy and now i can do: > if ((sys.argv[1] =='debug') if len(sys.argv) > 1 else False): > pass > and check variables IF AND ONLY if they exist, in a single line! Umm... Is this a joke? From skip at pobox.com Mon Oct 10 15:48:30 2005 From: skip at pobox.com (skip@pobox.com) Date: Mon, 10 Oct 2005 08:48:30 -0500 Subject: [Python-Dev] C.E.R. Thoughts In-Reply-To: <000901c5cd9e$2d7275d0$6402a8c0@arkdesktop> References: <78d129adb4581d24b1d07844019a2afe@gmail.com> <000901c5cd9e$2d7275d0$6402a8c0@arkdesktop> Message-ID: <17226.28974.868184.492129@montanaro.dyndns.org> Andrew> Umm... Is this a joke? I hope so. I must admit the OP's intent didn't make itself known to me with the cursory glance I gave it. Jp's formulation is how I would have written it. Assuming of course, that was the OP's intent. 
Skip From barry at python.org Mon Oct 10 16:41:32 2005 From: barry at python.org (Barry Warsaw) Date: Mon, 10 Oct 2005 10:41:32 -0400 Subject: [Python-Dev] Fwd: defaultproperty In-Reply-To: <76fd5acf0510092247m4d8383e8o40260de7950906@mail.gmail.com> References: <433AA5AC.6040509@zope.com> <433BA3CF.1090205@zope.com> <43494648.6040904@zope.com> <4349997E.9010208@gmail.com> <76fd5acf0510092246q1861403flda2dd49ddbe02d6c@mail.gmail.com> <76fd5acf0510092247m4d8383e8o40260de7950906@mail.gmail.com> Message-ID: <1128955292.27841.2.camel@geddy.wooz.org> On Mon, 2005-10-10 at 01:47, Calvin Spealman wrote: > Never created for a reason? lumping things together for having the > similar usage semantics, but unrelated purposes, might be something to > avoid and maybe that's why it hasn't happened yet for decorators. If > ever there was a makethreadsafe decorator, it should go in the thread > module, etc. I mean, come on, its like making a module just to store a > bunch of unrelated types just to lump them together because they're > types. Who wants that? Like itertools? +1 for a decorators module. -Barry From abo at minkirri.apana.org.au Mon Oct 10 16:45:33 2005 From: abo at minkirri.apana.org.au (Donovan Baarda) Date: Mon, 10 Oct 2005 15:45:33 +0100 Subject: [Python-Dev] Pythonic concurrency In-Reply-To: <4346FC98.5050504@gmail.com> References: <20051006143740.287E.JCARLSON@uci.edu> <200510070145.17284.ms@cerenity.org> <20051006221436.2892.JCARLSON@uci.edu> <415220344.20051007104751@MailBlocks.com> <4346FC98.5050504@gmail.com> Message-ID: <1128955532.32340.141.camel@parabolic.corp.google.com> On Fri, 2005-10-07 at 23:54, Nick Coghlan wrote: [...] 
> The few times I have encountered anyone saying anything resembling "threading > is easy", it was because the full sentence went something like "threading is > easy if you use message passing and copy-on-send or release-reference-on-send > to communicate between threads, and limit the shared data structures to those > required to support the messaging infrastructure". And most of the time there > was an implied "compared to using semaphores and locks directly, " at the start. LOL! So threading is easy if you restrict inter-thread communication to message passing... and what makes multi-processing hard is your only inter-process communication mechanism is message passing :-) Sounds like yet another reason to avoid threading and use processes instead... effort spent on threading based message passing implementations could instead be spent on inter-process messaging. -- Donovan Baarda From guido at python.org Mon Oct 10 16:50:02 2005 From: guido at python.org (Guido van Rossum) Date: Mon, 10 Oct 2005 07:50:02 -0700 Subject: [Python-Dev] Extending tuple unpacking In-Reply-To: <434A32B4.1050709@gmail.com> References: <20051007172237.GA13288@localhost.localdomain> <4349D57C.7010509@canterbury.ac.nz> <200510092304.58638.fdrake@acm.org> <434A32B4.1050709@gmail.com> Message-ID: On 10/10/05, Nick Coghlan wrote: > It also works for situations where "the first n items are mandatory, the rest > are optional". This usage was brought up in the context of a basic line > interpreter: > > cmd, *args = input.split() That's a really poor example though. You really don't want a line interpreter to bomb if the line is empty! 
> Another usage is to have a Python function which doesn't support keywords for > its positional arguments (to avoid namespace clashes in the keyword dict), but > can still unpack the mandatory arguments easily: > > def func(*args, **kwds): > arg1, arg2, *rest = args # Unpack the positional arguments Again, I'd be more comfortable if this was preceded by a check for len(args) >= 2. I should add that I'm just -0 on this. I think proponents ought to find better motivating examples that aren't made-up. Perhaps Raymond's requirement would help -- find places in the standard library where this would make code more readable/maintainable. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Mon Oct 10 16:51:28 2005 From: guido at python.org (Guido van Rossum) Date: Mon, 10 Oct 2005 07:51:28 -0700 Subject: [Python-Dev] New PEP 342 suggestion: result() and allow "return with arguments" in generators (was Re: PEP 342 suggestion: start(), __call__() and unwind_call() methods) In-Reply-To: <434A2E41.7040903@gmail.com> References: <5.1.1.6.0.20051007134543.03b2f728@mail.telecommunity.com> <43471EBE.40401@gmail.com> <4347652E.1090705@satori.za.net> <43486E20.3010908@gmail.com> <434883D6.80009@gmail.com> <43491650.1020704@gmail.com> <4349D548.3030000@canterbury.ac.nz> <434A2E41.7040903@gmail.com> Message-ID: On 10/10/05, Nick Coghlan wrote: > I'm starting to think we want to let PEP 342 bake for at least one release > cycle before deciding what (if any) additional behaviour should be added to > generators. Yes please! 
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From abo at minkirri.apana.org.au Mon Oct 10 17:01:05 2005 From: abo at minkirri.apana.org.au (Donovan Baarda) Date: Mon, 10 Oct 2005 16:01:05 +0100 Subject: [Python-Dev] Pythonic concurrency In-Reply-To: <415220344.20051007104751@MailBlocks.com> References: <20051006143740.287E.JCARLSON@uci.edu> <200510070145.17284.ms@cerenity.org> <20051006221436.2892.JCARLSON@uci.edu> <415220344.20051007104751@MailBlocks.com> Message-ID: <1128956465.32337.152.camel@parabolic.corp.google.com> On Fri, 2005-10-07 at 17:47, Bruce Eckel wrote: > Early in this thread there was a comment to the effect that "if you > don't know how to use threads, don't use them," which I pointedly > avoided responding to because it seemed to me to simply be > inflammatory. But Ian Bicking just posted a weblog entry: > http://blog.ianbicking.org/concurrency-and-processes.html where he > says "threads aren't as hard as they imply" and "An especially poor > argument is one that tells me that I'm currently being beaten with a > stick, but apparently don't know it." The problem with threads is that at first glance they appear easy, which seduces many beginning programmers into using them. The hard part is knowing when and how to lock shared resources... at first glance you don't even realise you need to do this. So many threaded applications are broken and don't know it, because this kind of broken-ness is nearly always intermittent and very hard to reproduce and debug. One common alternative is async polling frameworks like Twisted. These scare beginners away because, at first glance, they appear hideously complicated. However, if you take the time to get your head around them, you get a better feel for all the nasty implications of concurrency, and end up designing better applications. This is the reason why, given a choice between an async and a threaded implementation of an application, I will always choose the async solution. 
Not because async is inherently better than threading, but because the programmer who bothered to grok async is more likely to get it right. -- Donovan Baarda From Michaels at rd.bbc.co.uk Mon Oct 10 15:58:25 2005 From: Michaels at rd.bbc.co.uk (Michael Sparks) Date: Mon, 10 Oct 2005 14:58:25 +0100 Subject: [Python-Dev] Pythonic concurrency In-Reply-To: <1128955532.32340.141.camel@parabolic.corp.google.com> References: <20051006143740.287E.JCARLSON@uci.edu> <4346FC98.5050504@gmail.com> <1128955532.32340.141.camel@parabolic.corp.google.com> Message-ID: <200510101458.25644.Michaels@rd.bbc.co.uk> On Monday 10 Oct 2005 15:45, Donovan Baarda wrote: > Sounds like yet another reason to avoid threading and use processes > instead... effort spent on threading based message passing > implementations could instead be spent on inter-process messaging. I can't let that pass (even if our threaded component has a couple of warts at the moment). # Blocking thread example (uses raw_input) to single threaded pygame # display ticker. (The display is rate limited to 8 words per second at # most since it was designed for subtitles) # from Axon.ThreadedComponent import threadedcomponent from Kamaelia.Util.PipelineComponent import pipeline from Kamaelia.UI.Pygame.Ticker import Ticker class ConsoleReader(threadedcomponent): def __init__(self, prompt=">>> "): super(ConsoleReader, self).__init__() self.prompt = prompt def run(self): # implementation wart, should be "main" while 1: line = raw_input(self.prompt) line = line + "\n" self.outqueues["outbox"].put(line) # implementation wart, should be self.send(line, "outbox") pipeline( ConsoleReader(), Ticker() # Single threaded pygame based text ticker ).run() There are other ways with other systems to achieve the same goal. Inter-process based messaging can be done in various ways. The API though can look pretty much the same.
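That API symmetry can be sketched with nothing but the standard library (a hedged illustration, not Kamaelia code): a component that talks only to the queue objects handed to it does not care whether they are thread queues or process queues.

```python
import queue
import threading

def ticker(inbox, outbox):
    # A "component" in the sense above: it only touches its queues, so
    # the identical code works whether the queues span a thread or a
    # process boundary.
    while True:
        line = inbox.get()
        if line is None:          # sentinel: shut down cleanly
            outbox.put(None)
            break
        outbox.put("TICKER: " + line)

inbox, outbox = queue.Queue(), queue.Queue()
worker = threading.Thread(target=ticker, args=(inbox, outbox))
worker.start()
for line in ["hello", "world"]:
    inbox.put(line)
inbox.put(None)
worker.join()

received = []
while True:
    item = outbox.get()
    if item is None:
        break
    received.append(item)
print(received)  # ['TICKER: hello', 'TICKER: world']
```

Swapping threading.Thread and queue.Queue for multiprocessing.Process and multiprocessing.Queue leaves ticker() untouched, which is the point being made about inter-process messaging.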
(There's obviously some implications of crossing process boundaries though, but that's for the system composer to deal with, not the components). Regards, Michael. -- Michael Sparks, Senior R&D Engineer, Digital Media Group Michael.Sparks at rd.bbc.co.uk, http://kamaelia.sourceforge.net/ British Broadcasting Corporation, Research and Development Kingswood Warren, Surrey KT20 6NP This e-mail may contain personal views which are not the views of the BBC. From robey at lag.net Mon Oct 10 19:18:13 2005 From: robey at lag.net (Robey Pointer) Date: Mon, 10 Oct 2005 10:18:13 -0700 Subject: [Python-Dev] C API doc fix In-Reply-To: References: <4092C34F-5A07-47D0-A27F-1781EBFE887A@lag.net> Message-ID: On 29 Sep 2005, at 12:06, Steven Bethard wrote: > On 9/29/05, Robey Pointer wrote: > >> Yesterday I ran into a bug in the C API docs. The top of this page: >> >> http://docs.python.org/api/unicodeObjects.html >> >> says: >> >> Py_UNICODE >> This type represents a 16-bit unsigned storage type which is >> used by Python internally as basis for holding Unicode ordinals. On >> platforms where wchar_t is available and also has 16-bits, Py_UNICODE >> is a typedef alias for wchar_t to enhance native platform >> compatibility. On all other platforms, Py_UNICODE is a typedef alias >> for unsigned short. >> > > I believe this is the same issue that was brought up in May[1]. My > impression was that people could not agree on a documentation patch. Would it help if I tried my hand at it? My impression so far is that extension coders should probably try not to worry about the size or content of Py_UNICODE. (The thread seems to have wandered off into nowhere again...) Py_UNICODE This type represents an unsigned storage type at least 16-bits long (but sometimes more) which is used by Python internally as basis for holding Unicode ordinals. On platforms where wchar_t is available and also has 16-bits, Py_UNICODE is a typedef alias for wchar_t to enhance native platform compatibility. 
In general, you should use PyUnicode_FromEncodedObject and PyUnicode_AsEncodedString to convert strings to/from unicode objects, and consider Py_UNICODE to be an implementation detail. robey From janssen at parc.com Mon Oct 10 19:59:54 2005 From: janssen at parc.com (Bill Janssen) Date: Mon, 10 Oct 2005 10:59:54 PDT Subject: [Python-Dev] Pythonic concurrency In-Reply-To: Your message of "Mon, 10 Oct 2005 08:01:05 PDT." <1128956465.32337.152.camel@parabolic.corp.google.com> Message-ID: <05Oct10.105958pdt."58617"@synergy1.parc.xerox.com> > The problem with threads is at first glance they appear easy... Anyone who thinks that a "glance" is enough to understand something is too far gone to worry about. On the other hand, you might be referring to a putative brokenness of the Python documentation on Python threads. I'm not sure they're broken, though. They just point out the threading that Python provides, for folks who want to use threads. Are they required to provide a full course in threads? > ...which seduces many beginning programmers into using them. Don't worry about this. That's how "beginning programmers" learn. > The hard part is knowing when and how to lock shared resources... Well, I might say the "careful part". > ...at first glance you don't even realise you need to do this. Again, I'm not sure why you care what "glancers" do and don't realize. You could say the same about most algorithms and data structures. Bill From skip at pobox.com Mon Oct 10 20:20:31 2005 From: skip at pobox.com (skip@pobox.com) Date: Mon, 10 Oct 2005 13:20:31 -0500 Subject: [Python-Dev] Pythonic concurrency In-Reply-To: <05Oct10.105958pdt."58617"@synergy1.parc.xerox.com> References: <1128956465.32337.152.camel@parabolic.corp.google.com> <05Oct10.105958pdt."58617"@synergy1.parc.xerox.com> Message-ID: <17226.45295.661911.542400@montanaro.dyndns.org> >> The hard part is knowing when and how to lock shared resources... Bill> Well, I might say the "careful part". 
With the Mojam middleware stuff I suffered quite awhile with a single-threaded implementation that would hang the entire webserver if a backend query took too long. I realized I needed to do something (threads, asyncore, whatever), but didn't think I understood the issues well enough to do it right. Once I finally bit the bullet and switched to a multithreaded implementation, I didn't have too much trouble. Of course, the application was pretty mature at that point and I understood what objects were shared and needed to be locked. Oh, and I took Aahz's admonition to heart and pretty much stuck to using Queues for all synchronization. It ain't rocket science, but it can be subtle. Skip From abo at minkirri.apana.org.au Mon Oct 10 20:39:58 2005 From: abo at minkirri.apana.org.au (Donovan Baarda) Date: Mon, 10 Oct 2005 19:39:58 +0100 Subject: [Python-Dev] Pythonic concurrency In-Reply-To: <05Oct10.105958pdt."58617"@synergy1.parc.xerox.com> References: <05Oct10.105958pdt."58617"@synergy1.parc.xerox.com> Message-ID: <1128969598.32345.224.camel@parabolic.corp.google.com> On Mon, 2005-10-10 at 18:59, Bill Janssen wrote: > > The problem with threads is at first glance they appear easy... > > Anyone who thinks that a "glance" is enough to understand something is > too far gone to worry about. On the other hand, you might be > referring to a putative brokenness of the Python documentation on > Python threads. I'm not sure they're broken, though. They just point > out the threading that Python provides, for folks who want to use > threads. Are they required to provide a full course in threads? I was speaking in general, not about Python in particular. If anything, Python is one of the simplest and safest platforms for threading (thanks mostly to the GIL). And I find the documentation excellent :-) > > ...which seduces many beginning programmers into using them. > > Don't worry about this. That's how "beginning programmers" learn. 
Many other things "beginning programmers" learn very quickly break if you do it wrong, until you learn to do it right. Threads are tricky in that they can "mostly work", and it can be a long while before you realise it is actually broken. I don't know how many bits of other people's code I've had to fix that worked for years until it was run on hardware fast enough to trigger that nasty race condition :-) -- Donovan Baarda From mal at egenix.com Mon Oct 10 21:09:58 2005 From: mal at egenix.com (M.-A. Lemburg) Date: Mon, 10 Oct 2005 21:09:58 +0200 Subject: [Python-Dev] C API doc fix In-Reply-To: References: <4092C34F-5A07-47D0-A27F-1781EBFE887A@lag.net> Message-ID: <434ABC86.80202@egenix.com> Robey Pointer wrote: > On 29 Sep 2005, at 12:06, Steven Bethard wrote: > > >>On 9/29/05, Robey Pointer wrote: >> >> >>>Yesterday I ran into a bug in the C API docs. The top of this page: >>> >>> http://docs.python.org/api/unicodeObjects.html >>> >>>says: >>> >>>Py_UNICODE >>> This type represents a 16-bit unsigned storage type which is >>>used by Python internally as basis for holding Unicode ordinals. On >>>platforms where wchar_t is available and also has 16-bits, Py_UNICODE >>>is a typedef alias for wchar_t to enhance native platform >>>compatibility. On all other platforms, Py_UNICODE is a typedef alias >>>for unsigned short. >>> >> >>I believe this is the same issue that was brought up in May[1]. My >>impression was that people could not agree on a documentation patch. FYI, I've fixed the Py_UNICODE description now. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Oct 10 2005) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! 
:::: From janssen at parc.com Mon Oct 10 21:26:45 2005 From: janssen at parc.com (Bill Janssen) Date: Mon, 10 Oct 2005 12:26:45 PDT Subject: [Python-Dev] Pythonic concurrency In-Reply-To: Your message of "Mon, 10 Oct 2005 11:20:31 PDT." <17226.45295.661911.542400@montanaro.dyndns.org> Message-ID: <05Oct10.122654pdt."58617"@synergy1.parc.xerox.com> Skip, > With the Mojam middleware stuff I suffered quite awhile with a > single-threaded implementation that would hang the entire webserver if a > backend query took too long. I realized I needed to do something (threads, > asyncore, whatever), but didn't think I understood the issues well enough to > do it right. Yes, there's a troublesome meme in the world: "threads are hard". They aren't, really. You just have to know what you're doing. But that meme seems to keep quite capable people from doing things they are well qualified to do. > Once I finally bit the bullet and switched to a multithreaded > implementation, I didn't have too much trouble. Yep. > Of course, the application > was pretty mature at that point and I understood what objects were shared > and needed to be locked. Kind of like managing people, isn't it :-?. I've done a lot of middleware myself, of course. ILU was based on a thread-safe C library and worked with Python threads quite well. Lately I've been building UpLib (a threaded Python service) on top of Medusa, which has worked quite well. UpLib handles calls sequentially, but uses threads internally to manage underlying data transformations. Medusa almost but not quite supports per-request threads; I'm wondering if I should just fix that and post a patch. Or would that just be re-creating ZServer, which I admit I haven't figured out how to look at? 
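For what it's worth, the thread-per-request dispatch being contemplated here reduces to a very small pattern (a generic sketch, nothing Medusa- or ZServer-specific; handle() and the request strings are invented for illustration):

```python
import queue
import threading

def handle(request):
    # Stand-in for real request processing.
    return "handled " + request

results = queue.Queue()

def dispatch(request):
    results.put(handle(request))

# One worker thread per incoming request; a Queue collects the
# responses, so no shared state needs explicit locking.
threads = [threading.Thread(target=dispatch, args=(r,))
           for r in ("a", "b", "c")]
for t in threads:
    t.start()
for t in threads:
    t.join()

responses = sorted(results.get() for _ in threads)
print(responses)  # ['handled a', 'handled b', 'handled c']
```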
Bill From paul.dubois at gmail.com Mon Oct 10 22:14:30 2005 From: paul.dubois at gmail.com (Paul Du Bois) Date: Mon, 10 Oct 2005 13:14:30 -0700 Subject: [Python-Dev] Extending tuple unpacking In-Reply-To: References: <20051007172237.GA13288@localhost.localdomain> <4349D57C.7010509@canterbury.ac.nz> <200510092304.58638.fdrake@acm.org> <434A32B4.1050709@gmail.com> Message-ID: <85f6a31f0510101314x4a1ccfdeu43d3d9436031fe3c@mail.gmail.com> On 10/10/05, Nick Coghlan wrote: > cmd, *args = input.split() These examples also have a reasonable implementation using list.pop(), albeit one that requires more typing. On the plus side, it does not violate DRY and is explicit about the error cases. args = input.split() try: cmd = input.pop(0) except IndexError: cmd = '' > def func(*args, **kwds): > arg1, arg2, *rest = args # Unpack the positional arguments rest = args # or args[:] if you really did want a copy try: arg1 = rest.pop(0) arg2 = rest.pop(0) except IndexError: raise TypeError("foo() takes at least 2 arguments") paul From BruceEckel-Python3234 at mailblocks.com Mon Oct 10 22:15:18 2005 From: BruceEckel-Python3234 at mailblocks.com (Bruce Eckel) Date: Mon, 10 Oct 2005 14:15:18 -0600 Subject: [Python-Dev] Pythonic concurrency In-Reply-To: <05Oct10.122654pdt."58617"@synergy1.parc.xerox.com> References: Your message of "Mon, 10 Oct 2005 11:20:31 PDT." <17226.45295.661911.542400@montanaro.dyndns.org> <05Oct10.122654pdt."58617"@synergy1.parc.xerox.com> Message-ID: <746109444.20051010141518@MailBlocks.com> > Yes, there's a troublesome meme in the world: "threads are hard". > They aren't, really. You just have to know what you're doing. I would say that the troublesome meme is that "threads are easy." I posted an earlier, rather longish message about this. The gist of which was: "when someone says that threads are easy, I have no idea what they mean by it." Perhaps this means "threads in Python are easier than threads in other languages." 
But I just finished a 150-page chapter on Concurrency in Java which took many months to write, based on a large chapter on Concurrency in C++ which probably took longer to write. I keep in reasonably good touch with some of the threading experts. I can't get any of them to say that it's easy, even though they really do understand the issues and think about it all the time. *Because* of that, they say that it's hard. So alright, I'll take the bait that you've laid down more than once, now. Perhaps you can go beyond saying that "threads really aren't hard" and explain the aspects of them that seem so easy to you. Perhaps you can give a nice clear explanation of cache coherency and memory barriers in multiprocessor machines? Or explain atomicity, volatility and visibility? Or, even better, maybe you can come up with a better concurrency model, which is what I think most of us are looking for in this discussion. Bruce Eckel http://www.BruceEckel.com mailto:BruceEckel-Python3234 at mailblocks.com Contains electronic books: "Thinking in Java 3e" & "Thinking in C++ 2e" Web log: http://www.artima.com/weblogs/index.jsp?blogger=beckel Subscribe to my newsletter: http://www.mindview.net/Newsletter My schedule can be found at: http://www.mindview.net/Calendar From bcannon at gmail.com Mon Oct 10 22:29:26 2005 From: bcannon at gmail.com (Brett Cannon) Date: Mon, 10 Oct 2005 13:29:26 -0700 Subject: [Python-Dev] PEP 3000 and exec In-Reply-To: References: Message-ID: On 10/10/05, Christos Georgiou wrote: > This might be minor-- but I didn't see anyone mentioning it so far. If > `exec` functionality is to be provided, then I think it still should be a > keyword for the parser to know; currently bytecode generation is affected if > `exec` is present. Even if that changes for Python 3k (we don't know yet), > the paragraph for exec should be annotated with a note about this issue. 
But the PEP says that 'exec' will become a function and thus no longer be a statement, so changing the grammar is not needed. -Brett From bcannon at gmail.com Mon Oct 10 22:33:15 2005 From: bcannon at gmail.com (Brett Cannon) Date: Mon, 10 Oct 2005 13:33:15 -0700 Subject: [Python-Dev] Fwd: defaultproperty In-Reply-To: <1128955292.27841.2.camel@geddy.wooz.org> References: <433AA5AC.6040509@zope.com> <433BA3CF.1090205@zope.com> <43494648.6040904@zope.com> <4349997E.9010208@gmail.com> <76fd5acf0510092246q1861403flda2dd49ddbe02d6c@mail.gmail.com> <76fd5acf0510092247m4d8383e8o40260de7950906@mail.gmail.com> <1128955292.27841.2.camel@geddy.wooz.org> Message-ID: On 10/10/05, Barry Warsaw wrote: > On Mon, 2005-10-10 at 01:47, Calvin Spealman wrote: > > > Never created for a reason? lumping things together for having the > > similar usage semantics, but unrelated purposes, might be something to > > avoid and maybe that's why it hasn't happened yet for decorators. If > > ever there was a makethreadsafe decorator, it should go in the thread > > module, etc. I mean, come on, its like making a module just to store a > > bunch of unrelated types just to lump them together because they're > > types. Who wants that? > > Like itertools? > > +1 for a decorators module. +1 from me as well. And placing defaultproperty in there makes sense if it is meant to be used as a decorator and not viewed as some spiffy descriptor. Should probably work in Michael's update_meta() function as well (albeit maybe with a different name since I think I remember Guido saying he didn't like the name).
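For readers without the earlier thread context: a plausible reading of the defaultproperty under discussion is a non-data descriptor that computes a value on first access and caches it in the instance dict (a sketch of those semantics only — the actual Zope implementation being talked about may differ):

```python
class defaultproperty:
    """Non-data descriptor: run func once per instance, then cache the
    result in the instance dict under the same attribute name."""
    def __init__(self, func):
        self.func = func
        self.__name__ = func.__name__
        self.__doc__ = func.__doc__

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        value = self.func(obj)
        # Because this is a non-data descriptor (no __set__), the cached
        # instance attribute shadows it on every later access.
        obj.__dict__[self.__name__] = value
        return value

class Widget:            # hypothetical example class
    calls = 0

    @defaultproperty
    def expensive(self):
        Widget.calls += 1
        return 42

w = Widget()
print(w.expensive, w.expensive, Widget.calls)  # 42 42 1
```

The decorator usage above is exactly why it would fit a decorators module: the descriptor mechanics stay hidden behind one line at the definition site.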
-Brett From ianb at colorstudy.com Mon Oct 10 23:57:24 2005 From: ianb at colorstudy.com (Ian Bicking) Date: Mon, 10 Oct 2005 16:57:24 -0500 Subject: [Python-Dev] Pythonic concurrency In-Reply-To: <5.1.1.6.0.20051007140112.03b2fc80@mail.telecommunity.com> References: <20051006221436.2892.JCARLSON@uci.edu> <20051006143740.287E.JCARLSON@uci.edu> <200510070145.17284.ms@cerenity.org> <20051006221436.2892.JCARLSON@uci.edu> <5.1.1.6.0.20051007140112.03b2fc80@mail.telecommunity.com> Message-ID: <434AE3C4.6030401@colorstudy.com> Phillip J. Eby wrote: > What the GIL-ranters don't get is that the GIL actually gives you just > enough determinism to be able to write threaded programs that don't crash, > and that maybe will even work if you treat every point of interaction > between threads as a minefield and program with appropriate care. So, if > threads are "easy" in Python compared to other langauges, it's *because of* > the GIL, not in spite of it. Three cheers for the GIL! For the record, since I was quoted at the beginning of this subthread, *I* don't think threads are easy. But among all ways to handle concurrency, I just don't think they are so bad. And unlike many alternatives, they are relatively easy to get started with, and you can do a lot of work in a threaded system without knowing anything about threads. Of course, threads aren't the only way to accomplish that, just one of the easiest. -- Ian Bicking / ianb at colorstudy.com / http://blog.ianbicking.org From skip at pobox.com Tue Oct 11 00:00:48 2005 From: skip at pobox.com (skip@pobox.com) Date: Mon, 10 Oct 2005 17:00:48 -0500 Subject: [Python-Dev] PEP 3000 and exec In-Reply-To: References: Message-ID: <17226.58512.451743.300957@montanaro.dyndns.org> >> This might be minor-- but I didn't see anyone mentioning it so far. >> If `exec` functionality is to be provided, then I think it still >> should be a keyword for the parser to know; currently bytecode >> generation is affected if `exec` is present. 
Even if that changes >> for Python 3k (we don't know yet), the paragraph for exec should be >> annotated with a note about this issue. Brett> But the PEP says that 'exec' will become a function and thus no Brett> longer become a built-in, so changing the grammar is not needed. I don't think that was the OP's point though it might not have been terribly clear. Today, the presence of the exec statement in a function changes how non-local load instructions are generated. Consider f and g with their dis.dis output: >>> def f(a): ... exec "import %s" % a ... print q ... >>> def g(a): ... __import__(a) ... print q ... >>> dis.dis(f) 2 0 LOAD_CONST 1 ('import %s') 3 LOAD_FAST 0 (a) 6 BINARY_MODULO 7 LOAD_CONST 0 (None) 10 DUP_TOP 11 EXEC_STMT 3 12 LOAD_NAME 1 (q) 15 PRINT_ITEM 16 PRINT_NEWLINE 17 LOAD_CONST 0 (None) 20 RETURN_VALUE >>> dis.dis(g) 2 0 LOAD_GLOBAL 0 (__import__) 3 LOAD_FAST 0 (a) 6 CALL_FUNCTION 1 9 POP_TOP 3 10 LOAD_GLOBAL 2 (q) 13 PRINT_ITEM 14 PRINT_NEWLINE 15 LOAD_CONST 0 (None) 18 RETURN_VALUE If the exec statement is replaced by a function, how will the bytecode generator know that q should be looked up using LOAD_NAME instead of LOAD_GLOBAL? Maybe it's a non-issue, but even if so, a note to that affect on the wiki page might be worthwhile. Skip From guido at python.org Tue Oct 11 00:05:56 2005 From: guido at python.org (Guido van Rossum) Date: Mon, 10 Oct 2005 15:05:56 -0700 Subject: [Python-Dev] PEP 3000 and exec In-Reply-To: <17226.58512.451743.300957@montanaro.dyndns.org> References: <17226.58512.451743.300957@montanaro.dyndns.org> Message-ID: My idea was to make the compiler smarter so that it would recognize exec() even if it was just a function. Another idea might be to change the exec() spec so that you are required to pass in a namespace (and you can't use locals() either!). Then the whole point becomes moot. On 10/10/05, skip at pobox.com wrote: > >> This might be minor-- but I didn't see anyone mentioning it so far. 
> >> If `exec` functionality is to be provided, then I think it still > >> should be a keyword for the parser to know; currently bytecode > >> generation is affected if `exec` is present. Even if that changes > >> for Python 3k (we don't know yet), the paragraph for exec should be > >> annotated with a note about this issue. > > Brett> But the PEP says that 'exec' will become a function and thus no > Brett> longer become a built-in, so changing the grammar is not needed. > > I don't think that was the OP's point though it might not have been terribly > clear. Today, the presence of the exec statement in a function changes how > non-local load instructions are generated. Consider f and g with their > dis.dis output: > > >>> def f(a): > ... exec "import %s" % a > ... print q > ... > >>> def g(a): > ... __import__(a) > ... print q > ... > >>> dis.dis(f) > 2 0 LOAD_CONST 1 ('import %s') > 3 LOAD_FAST 0 (a) > 6 BINARY_MODULO > 7 LOAD_CONST 0 (None) > 10 DUP_TOP > 11 EXEC_STMT > > 3 12 LOAD_NAME 1 (q) > 15 PRINT_ITEM > 16 PRINT_NEWLINE > 17 LOAD_CONST 0 (None) > 20 RETURN_VALUE > >>> dis.dis(g) > 2 0 LOAD_GLOBAL 0 (__import__) > 3 LOAD_FAST 0 (a) > 6 CALL_FUNCTION 1 > 9 POP_TOP > > 3 10 LOAD_GLOBAL 2 (q) > 13 PRINT_ITEM > 14 PRINT_NEWLINE > 15 LOAD_CONST 0 (None) > 18 RETURN_VALUE > > If the exec statement is replaced by a function, how will the bytecode > generator know that q should be looked up using LOAD_NAME instead of > LOAD_GLOBAL? Maybe it's a non-issue, but even if so, a note to that affect > on the wiki page might be worthwhile. 
> > Skip > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/guido%40python.org > -- --Guido van Rossum (home page: http://www.python.org/~guido/) From bcannon at gmail.com Tue Oct 11 00:15:39 2005 From: bcannon at gmail.com (Brett Cannon) Date: Mon, 10 Oct 2005 15:15:39 -0700 Subject: [Python-Dev] PEP 3000 and exec In-Reply-To: <17226.58512.451743.300957@montanaro.dyndns.org> References: <17226.58512.451743.300957@montanaro.dyndns.org> Message-ID: On 10/10/05, skip at pobox.com wrote: > >> This might be minor-- but I didn't see anyone mentioning it so far. > >> If `exec` functionality is to be provided, then I think it still > >> should be a keyword for the parser to know; currently bytecode > >> generation is affected if `exec` is present. Even if that changes > >> for Python 3k (we don't know yet), the paragraph for exec should be > >> annotated with a note about this issue. > > Brett> But the PEP says that 'exec' will become a function and thus no > Brett> longer become a built-in, so changing the grammar is not needed. > > I don't think that was the OP's point though it might not have been terribly > clear. Today, the presence of the exec statement in a function changes how > non-local load instructions are generated. Consider f and g with their > dis.dis output: > > >>> def f(a): > ... exec "import %s" % a > ... print q > ... > >>> def g(a): > ... __import__(a) > ... print q > ... 
> >>> dis.dis(f) > 2 0 LOAD_CONST 1 ('import %s') > 3 LOAD_FAST 0 (a) > 6 BINARY_MODULO > 7 LOAD_CONST 0 (None) > 10 DUP_TOP > 11 EXEC_STMT > > 3 12 LOAD_NAME 1 (q) > 15 PRINT_ITEM > 16 PRINT_NEWLINE > 17 LOAD_CONST 0 (None) > 20 RETURN_VALUE > >>> dis.dis(g) > 2 0 LOAD_GLOBAL 0 (__import__) > 3 LOAD_FAST 0 (a) > 6 CALL_FUNCTION 1 > 9 POP_TOP > > 3 10 LOAD_GLOBAL 2 (q) > 13 PRINT_ITEM > 14 PRINT_NEWLINE > 15 LOAD_CONST 0 (None) > 18 RETURN_VALUE > > If the exec statement is replaced by a function, how will the bytecode > generator know that q should be looked up using LOAD_NAME instead of > LOAD_GLOBAL? Maybe it's a non-issue, but even if so, a note to that affect > on the wiki page might be worthwhile. Ah, OK. That makes more sense. For a quick, on-the-spot answer, one possibility is for the 'exec' function to examine the execution stack, go back to the caller, and patch the bytecode so that it uses LOAD_NAME instead of LOAD_GLOBAL. Total hack, but it would work and since 'exec' is not exactly performance-critical to begin with something this expensive wouldn't necessarily out of the question. But the better answer is we will just find a way. =) -Brett From tim.peters at gmail.com Tue Oct 11 00:42:26 2005 From: tim.peters at gmail.com (Tim Peters) Date: Mon, 10 Oct 2005 18:42:26 -0400 Subject: [Python-Dev] PythonCore\CurrentVersion In-Reply-To: References: <4347A020.2050008@v.loewis.de> Message-ID: <1f7befae0510101542tf65c250ybf93eb7f11c187ae@mail.gmail.com> [Martin v. L?wis] >> What happened to the CurrentVersion registry entry documented at >> >> http://www.python.org/windows/python/registry.html >> >> AFAICT, even the python15.wse file did not fill a value in this >> entry (perhaps I'm misinterpreting the wse file, though). >> >> So was this ever used? Why is it documented, and who documented it >> (unfortunately, registry.html is not in cvs/subversion, either)? [Mark Hammond] > I believe I documented it many moons ago. 
I don't think CurrentVersion was > ever implemented (or possibly was for a very short time before being > removed). The "registered modules" concept was misguided and AFAIK is not > used by anyone - IMO it should be deprecated (if not just removed!). > Further, I believe the documentation in the file for PYTHONPATH is, as said > in those docs, out of date, but that the comments in getpathp.c are correct. It would be good to update that web page ;-) The construction of PYTHONPATH differs across platforms, which isn't good. Here's a key difference: playground/ someother/ script.py This is script.py: """ import sys from pprint import pprint pprint(sys.path) """ Suppose we run script.py while playground/ is the current directory. I'm using 2.4.2 here, but doubt it matters much. No Python-related envars are set. Windows (and the PIL and pywin32 extensions are installed here): C:\playground>\python24\python.exe someother\script.py ['C:\\playground\\someother', 'C:\\python24\\python24.zip', 'C:\\playground', 'C:\\python24\\DLLs', 'C:\\python24\\lib', 'C:\\python24\\lib\\plat-win', 'C:\\python24\\lib\\lib-tk', 'C:\\python24', 'C:\\python24\\lib\\site-packages', 'C:\\python24\\lib\\site-packages\\PIL', 'C:\\python24\\lib\\site-packages\\win32', 'C:\\python24\\lib\\site-packages\\win32\\lib', 'C:\\python24\\lib\\site-packages\\Pythonwin'] When PC/getpathp.c says: * Python always adds an empty entry at the start, which corresponds to the current directory. I'm not sure what it means. The directory containing the script we're _running_ shows up first in sys.path there, while the _current_ directory shows up third. 
Linux: the current directory doesn't show up at all: [playground]$ ~/Python-2.4.2/python someother/script.py ['/home/tim/playground/someother', '/usr/local/lib/python24.zip', '/home/tim/Python-2.4.2/Lib', '/home/tim/Python-2.4.2/Lib/plat-linux2', '/home/tim/Python-2.4.2/Lib/lib-tk', '/home/tim/Python-2.4.2/Modules', '/home/tim/Python-2.4.2/build/lib.linux-i686-2.4'] I have no concrete suggestion, as any change to sys.path will break something for someone. It's nevertheless not good that "current directory on sys.path?" doesn't have the same answer across platforms (unsure why, but I've been burned by that several times this year, but never before this year -- maybe sys.path _used_ to contain the current directory on Linux?). From tdelaney at avaya.com Tue Oct 11 00:50:39 2005 From: tdelaney at avaya.com (Delaney, Timothy (Tim)) Date: Tue, 11 Oct 2005 08:50:39 +1000 Subject: [Python-Dev] Extending tuple unpacking Message-ID: <2773CAC687FD5F4689F526998C7E4E5F4DB6B6@au3010avexu1.global.avaya.com> Paul Du Bois wrote: > On 10/10/05, Nick Coghlan wrote: >> cmd, *args = input.split() > > These examples also have a reasonable implementation using list.pop(), > albeit one that requires more typing. On the plus side, it does not > violate > DRY and is explicit about the error cases. > > args = input.split() > try: > cmd = input.pop(0) > except IndexError: > cmd = '' I'd say you violated it right there ... (should have been):: args = input.split() try: cmd = arg.pop() except IndexError: cmd = '' FWIW, I've been +1 on * unpacking since I first saw the proposal, and have yet to see a convincing argument against it other than people wanting to stick the * anywhere but at the end. Perhaps I'll take the stdlib challenge (unfortunately, I have to travel this weekend, but I'll see if I can make time). 
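Side by side and runnable, the two spellings being compared (a sketch: the starred form is the proposed syntax, which only exists as PEP 3132 in later Pythons, and the sample input line is made up):

```python
input_line = "cmd arg1 arg2"

# pop-based spelling, with the naming slip corrected
args = input_line.split()
try:
    cmd = args.pop(0)
except IndexError:
    cmd = ''

# proposed starred spelling (later accepted as PEP 3132, Python 3.0+)
cmd2, *rest = input_line.split()

print(cmd, args, cmd == cmd2 and args == rest)  # cmd ['arg1', 'arg2'] True
```

Note the two differ on empty input: the pop version falls back to cmd = '', while the starred version raises ValueError — which is exactly the explicit-error-case point being argued.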
Tim Delaney From tdelaney at avaya.com Tue Oct 11 00:54:22 2005 From: tdelaney at avaya.com (Delaney, Timothy (Tim)) Date: Tue, 11 Oct 2005 08:54:22 +1000 Subject: [Python-Dev] Extending tuple unpacking Message-ID: <2773CAC687FD5F4689F526998C7E4E5F4DB6B7@au3010avexu1.global.avaya.com> Delaney, Timothy (Tim) wrote: > args = input.split() > > try: > cmd = arg.pop() ^^^ args ... > except IndexError: > cmd = '' Can't even get it right myself - does that prove a point? Tim Delaney From mhammond at skippinet.com.au Tue Oct 11 01:20:43 2005 From: mhammond at skippinet.com.au (Mark Hammond) Date: Tue, 11 Oct 2005 09:20:43 +1000 Subject: [Python-Dev] PythonCore\CurrentVersion In-Reply-To: <1f7befae0510101542tf65c250ybf93eb7f11c187ae@mail.gmail.com> Message-ID: > Suppose we run script.py while playground/ is the current directory. > I'm using 2.4.2 here, but doubt it matters much. No Python-related > envars are set. > > Windows (and the PIL and pywin32 extensions are installed here): > > C:\playground>\python24\python.exe someother\script.py > ['C:\\playground\\someother', > 'C:\\python24\\python24.zip', > 'C:\\playground', ... > When PC/getpathp.c says: > > * Python always adds an empty entry at the start, which corresponds > to the current directory. I believe it used to mean that literally '' was at the start of sys.path, but all the way back to 1.5.2 it seems that it really is the dirname of the script. Up to 2.2 it was as specified in sys.argv, in 2.3 and later it was made absolute. > I'm not sure what it means. The directory containing the script we're > _running_ shows up first in sys.path there, while the _current_ > directory shows up third. That's strange - I don't see the current directory at all in any version. I get something very close to you when I have PYTHONPATH=. - although it then turns up as the second entry, consistent with the docs.
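The claim is easy to check empirically on a modern interpreter (a sketch using a throwaway directory; the layout mirrors Tim's playground/someother example, and the paths are invented for the test):

```python
import os
import subprocess
import sys
import tempfile

# Recreate the layout: playground/someother/script.py, run from playground/.
with tempfile.TemporaryDirectory() as playground:
    os.mkdir(os.path.join(playground, "someother"))
    with open(os.path.join(playground, "someother", "script.py"), "w") as f:
        f.write("import sys; print(sys.path[0])\n")
    proc = subprocess.run(
        [sys.executable, os.path.join("someother", "script.py")],
        cwd=playground, capture_output=True, text=True, check=True,
    )
    first_entry = proc.stdout.strip()

# sys.path[0] is the (absolute, since 2.3) directory of the script
# being run, not the process's current working directory.
print(os.path.isabs(first_entry), first_entry.endswith("someother"))
```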
Mark From nyamatongwe at gmail.com Tue Oct 11 01:52:36 2005 From: nyamatongwe at gmail.com (Neil Hodgson) Date: Tue, 11 Oct 2005 09:52:36 +1000 Subject: [Python-Dev] Pythonic concurrency In-Reply-To: <746109444.20051010141518@MailBlocks.com> References: <17226.45295.661911.542400@montanaro.dyndns.org> <746109444.20051010141518@MailBlocks.com> Message-ID: <50862ebd0510101652y70250d23vdf04d0e19872ee1b@mail.gmail.com> Bruce Eckel: > I would say that the troublesome meme is that "threads are easy." I > posted an earlier, rather longish message about this. The gist of > which was: "when someone says that threads are easy, I have no idea > what they mean by it." I think you are overcomplicating the issue by looking at too many levels at once. The memory model is something that implementers of threading support need to understand. Users of that threading support just need to know that concurrent access to variables is dangerous and that they should use locks to access shared variables or use other forms of packaged inter-thread communication. Double Checked Locking is an optimization (removal of a lock) of an attempt to better modularize code (by automating the helper object creation). I'd either just leave the lock in or if benchmarking revealed an unacceptable performance problem, allocate the helper object before the resource is accessible to more than one thread. For statics, expose an Init method that gets called when the application is in the initial one user thread state. > But I just finished a 150-page chapter on Concurrency in Java which > took many months to write, based on a large chapter on Concurrency in > C++ which probably took longer to write. I keep in reasonably good > touch with some of the threading experts. I can't get any of them to > say that it's easy, even though they really do understand the issues > and think about it all the time. *Because* of that, they say that it's > hard. Implementing threading is hard. Using threading is not that hard. 
It's a source of complexity but so are many aspects of development. I get scared by reentrance in UI code. Neil From greg.ewing at canterbury.ac.nz Tue Oct 11 02:09:03 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 11 Oct 2005 13:09:03 +1300 Subject: [Python-Dev] Pythonic concurrency In-Reply-To: <434AE3C4.6030401@colorstudy.com> References: <20051006221436.2892.JCARLSON@uci.edu> <20051006143740.287E.JCARLSON@uci.edu> <200510070145.17284.ms@cerenity.org> <20051006221436.2892.JCARLSON@uci.edu> <5.1.1.6.0.20051007140112.03b2fc80@mail.telecommunity.com> <434AE3C4.6030401@colorstudy.com> Message-ID: <434B029F.3050000@canterbury.ac.nz> Ian Bicking wrote: > What the GIL-ranters don't get is that the GIL actually gives you just > enough determinism to be able to write threaded programs that don't crash, The GIL no doubt helps, but your threads can still get preempted between bytecodes, so I can't see it making much difference at the Python thought-level. I'm wondering whether Python threads should be non-preemptive by default. Preemptive threading is massive overkill for many applications. You don't need it, for example, if you just want to use threads to structure your program, overlap processing with I/O, etc. Preemptive threading would still be there as an option to turn on when you really need it. Or perhaps there could be a priority system, with a thread only able to be preempted by a thread of higher priority. If you ignore priorities, all your threads default to the same priority, so there's no preemption. If you want a thread that can preempt others, you give it a higher priority. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. 
| greg.ewing at canterbury.ac.nz +--------------------------------------+ From guido at python.org Tue Oct 11 02:18:15 2005 From: guido at python.org (Guido van Rossum) Date: Mon, 10 Oct 2005 17:18:15 -0700 Subject: [Python-Dev] Pythonic concurrency In-Reply-To: <434B029F.3050000@canterbury.ac.nz> References: <20051006143740.287E.JCARLSON@uci.edu> <200510070145.17284.ms@cerenity.org> <20051006221436.2892.JCARLSON@uci.edu> <5.1.1.6.0.20051007140112.03b2fc80@mail.telecommunity.com> <434AE3C4.6030401@colorstudy.com> <434B029F.3050000@canterbury.ac.nz> Message-ID: On 10/10/05, Greg Ewing wrote: > I'm wondering whether Python threads should be > non-preemptive by default. Preemptive threading is > massive overkill for many applications. You don't > need it, for example, if you just want to use threads > to structure your program, overlap processing with I/O, > etc. I recall using a non-preemptive system in the past; in Amoeba, to be precise. Initially it worked great. But as we added more powerful APIs to the library, we started to run into bugs that were just as if you had preemptive scheduling: it wouldn't always be predictable whether a call into the library would need to do I/O or not (it might use some sort of cache) so it would sometimes allow other threads to run and sometimes not. Or a change to the library would change this behavior (making a call that didn't use to block into sometimes-blocking). Given the tendency of Python developers to build layers of abstractions I don't think it will help much. > Preemptive threading would still be there as an option > to turn on when you really need it. > > Or perhaps there could be a priority system, with a > thread only able to be preempted by a thread of higher > priority. If you ignore priorities, all your threads > default to the same priority, so there's no preemption. > If you want a thread that can preempt others, you give > it a higher priority. 
If you ask me, priorities are worse than the problem they are trying to solve. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From radeex at gmail.com Tue Oct 11 02:31:09 2005 From: radeex at gmail.com (Christopher Armstrong) Date: Tue, 11 Oct 2005 11:31:09 +1100 Subject: [Python-Dev] PEP 3000 and exec In-Reply-To: References: <17226.58512.451743.300957@montanaro.dyndns.org> Message-ID: <60ed19d40510101731x61359b1evb3256aefc82369b0@mail.gmail.com> On 10/11/05, Guido van Rossum wrote: > My idea was to make the compiler smarter so that it would recognize > exec() even if it was just a function. > > Another idea might be to change the exec() spec so that you are > required to pass in a namespace (and you can't use locals() either!). > Then the whole point becomes moot. I think that's a great idea. It goes a step towards a more analyzable Python, and really, I've never found a *good* use case for allowing this invisible munging of locals. I would guess that it would simplify the implementation, given that there are currently so many special cases around exec, including when used with nested scopes. -- Twisted | Christopher Armstrong: International Man of Twistery Radix | -- http://radix.twistedmatrix.com | Release Manager, Twisted Project \\\V/// | -- http://twistedmatrix.com |o O| | w----v----w-+ From greg.ewing at canterbury.ac.nz Tue Oct 11 02:41:05 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 11 Oct 2005 13:41:05 +1300 Subject: [Python-Dev] PEP 3000 and exec In-Reply-To: References: <17226.58512.451743.300957@montanaro.dyndns.org> Message-ID: <434B0A21.6070103@canterbury.ac.nz> Brett Cannon wrote: > But the better answer is we will just find a way. =) I think the best answer would be just to dump the idea of exec-in-local-namespace altogether. I don't think I've ever seen a use case for it that wasn't better done some other way. 
Most often it seems to be used to answer newbie "variable variable" questions, to which the *correct* answer is invariably "start thinking in Python, not bash/perl/tcl/PHP/ whatever." -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg.ewing at canterbury.ac.nz +--------------------------------------+ From bcannon at gmail.com Tue Oct 11 02:53:18 2005 From: bcannon at gmail.com (Brett Cannon) Date: Mon, 10 Oct 2005 17:53:18 -0700 Subject: [Python-Dev] PEP 3000 and exec In-Reply-To: <434B0A21.6070103@canterbury.ac.nz> References: <17226.58512.451743.300957@montanaro.dyndns.org> <434B0A21.6070103@canterbury.ac.nz> Message-ID: On 10/10/05, Greg Ewing wrote: > Brett Cannon wrote: > > > But the better answer is we will just find a way. =) > > I think the best answer would be just to dump the idea of > exec-in-local-namespace altogether. I don't think I've > ever seen a use case for it that wasn't better done some > other way. > I agree that 'exec' could really stand to be tweaked. As it stands now it is nasty to deal with when it comes to program analysis. Anything that will make that easier gets my vote. -Brett From radeex at gmail.com Tue Oct 11 03:01:03 2005 From: radeex at gmail.com (Christopher Armstrong) Date: Tue, 11 Oct 2005 12:01:03 +1100 Subject: [Python-Dev] Pythonic concurrency In-Reply-To: References: <20051006143740.287E.JCARLSON@uci.edu> <200510070145.17284.ms@cerenity.org> <20051006221436.2892.JCARLSON@uci.edu> <5.1.1.6.0.20051007140112.03b2fc80@mail.telecommunity.com> <434AE3C4.6030401@colorstudy.com> <434B029F.3050000@canterbury.ac.nz> Message-ID: <60ed19d40510101801i57e37379n4ed85a9703ba82e9@mail.gmail.com> On 10/11/05, Guido van Rossum wrote: > I recall using a non-preemptive system in the past; in Amoeba, to be precise. > > Initially it worked great. 
> > But as we added more powerful APIs to the library, we started to run > into bugs that were just as if you had preemptive scheduling: it > wouldn't always be predictable whether a call into the library would > need to do I/O or not (it might use some sort of cache) so it would > sometimes allow other threads to run and sometimes not. Or a change to > the library would change this behavior (making a call that didn't use > to block into sometimes-blocking). I'm going to be giving a talk at OSDC (in Melbourne) this year about concurrency systems, and I'm going to talk a lot about the subtleties between these various non-preemptive (let's call them cooperative :) systems. I advocate a system that gives you really straightforward-looking code, but still requires you to annotate the fact that context switches can occur on every frame where they might occur (i.e., with a yield). I've given examples before of my new 2.5-yield + twisted Deferred code here, but to recap it just means that you have to do: def foo(): x = yield getPage() return "Yay" when you want to download a web page, and the caller of 'foo' would *also* need to do something like "yay = yield foo()". I think this is a very worthwhile tradeoff for those obsessed with "natural" code. -- Twisted | Christopher Armstrong: International Man of Twistery Radix | -- http://radix.twistedmatrix.com | Release Manager, Twisted Project \\\V/// | -- http://twistedmatrix.com |o O| | w----v----w-+ From janssen at parc.com Tue Oct 11 03:05:59 2005 From: janssen at parc.com (Bill Janssen) Date: Mon, 10 Oct 2005 18:05:59 PDT Subject: [Python-Dev] Pythonic concurrency In-Reply-To: Your message of "Mon, 10 Oct 2005 17:18:15 PDT." Message-ID: <05Oct10.180605pdt."58617"@synergy1.parc.xerox.com> Guido writes: > Given the tendency of Python developers to build layers of > abstractions I don't think [non-preemptive threads] will help much. 
I think that's right, although I think adding priorities to Python's existing preemptive threads might be useful for real-time programmers (yes, as machines continue to get faster people are writing real-time software on top of VMs). IMO, if one understands the issues of simultaneous memory access by multiple threads, and understands condition variables (and their underlying concept of mutexes), threads are pretty easy to use. Getting into the habit of always writing thread-safe code is a good idea, too. It would be nice if some of these programming environments (IDLE, Emacs, Eclipse, Visual Studio) provided better support for analysis of threading issues in programs. I'd love to have the Interlisp thread inspector for Python. I sympathize with Bruce's Java experience, though. Java's original threading design is one of the many misfeatures of that somewhat horrible language (along with lack of multiple-inheritance, hybrid types, omission of unsigned integers, static typing, etc.). Synchronized methods is a weird way of presenting mutexes, IMO. Java's condition variables don't (didn't? has this been fixed?) quite work. The emphasis on portability and the resulting notions of red/green threading packages at the beginning didn't help either. Read Allen Holub's book. And Doug Lea's book. I understand much of this has been addressed with a new package in Java 1.5. Bill From fdrake at acm.org Tue Oct 11 03:09:37 2005 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Mon, 10 Oct 2005 21:09:37 -0400 Subject: [Python-Dev] PythonCore\CurrentVersion In-Reply-To: <1f7befae0510101542tf65c250ybf93eb7f11c187ae@mail.gmail.com> References: <4347A020.2050008@v.loewis.de> <1f7befae0510101542tf65c250ybf93eb7f11c187ae@mail.gmail.com> Message-ID: <200510102109.37690.fdrake@acm.org> On Monday 10 October 2005 18:42, Tim Peters wrote: > never before this year -- maybe sys.path _used_ to contain the current > directory on Linux?). 
It's been a long time since this was the case on Unix of any variety; I *think* this changed to the current state back before 2.0. -Fred -- Fred L. Drake, Jr. From nnorwitz at gmail.com Tue Oct 11 06:15:22 2005 From: nnorwitz at gmail.com (Neal Norwitz) Date: Mon, 10 Oct 2005 21:15:22 -0700 Subject: [Python-Dev] problem with genexp Message-ID: There's a problem with genexp's that I think really needs to get fixed. See http://python.org/sf/1167751 the details are below. This code: >>> foo(a = i for i in range(10)) generates "NameError: name 'i' is not defined" when run because: 2 0 LOAD_GLOBAL 0 (foo) 3 LOAD_CONST 1 ('a') 6 LOAD_GLOBAL 1 (i) 9 CALL_FUNCTION 256 12 POP_TOP 13 LOAD_CONST 0 (None) 16 RETURN_VALUE If you add parens around the code: foo(a = i for i in range(10)) You get something quite different: 2 0 LOAD_GLOBAL 0 (foo) 3 LOAD_CONST 1 ('a') 6 LOAD_CONST 2 ( at 0x2a960baae8, file "", line 2>) 9 MAKE_FUNCTION 0 12 LOAD_GLOBAL 1 (range) 15 LOAD_CONST 3 (10) 18 CALL_FUNCTION 1 21 GET_ITER 22 CALL_FUNCTION 1 25 CALL_FUNCTION 256 28 POP_TOP 29 LOAD_CONST 0 (None) 32 RETURN_VALUE I agree with the bug report that the code should either raise a SyntaxError or do the right thing. n From rrr at ronadam.com Tue Oct 11 06:13:53 2005 From: rrr at ronadam.com (Ron Adam) Date: Tue, 11 Oct 2005 00:13:53 -0400 Subject: [Python-Dev] Extending tuple unpacking In-Reply-To: <2773CAC687FD5F4689F526998C7E4E5F4DB6B6@au3010avexu1.global.avaya.com> References: <2773CAC687FD5F4689F526998C7E4E5F4DB6B6@au3010avexu1.global.avaya.com> Message-ID: <434B3C01.5030001@ronadam.com> Delaney, Timothy (Tim) wrote: > Paul Du Bois wrote: > > >>On 10/10/05, Nick Coghlan wrote: >> >>> cmd, *args = input.split() >> >>These examples also have a reasonable implementation using list.pop(), >>albeit one that requires more typing. On the plus side, it does not >>violate >>DRY and is explicit about the error cases. 
>> >> args = input.split() >> try: >> cmd = input.pop(0) >> except IndexError: >> cmd = '' > > > I'd say you violated it right there ... (should have been):: > > args = input.split() > > try: > cmd = arg.pop() > except IndexError: > cmd = '' > > FWIW, I've been +1 on * unpacking since I first saw the proposal, and > have yet to see a convincing argument against it other than people > wanting to stick the * anywhere but at the end. Perhaps I'll take the > stdlib challenge (unfortunately, I have to travel this weekend, but I'll > see if I can make time). > > Tim Delaney I'm +1 for some way to do partial tuple unpacking, yet -1 on using the * symbol for that purpose outside of function calls. The problem is the '*' means different things depending on where it's located. In a function def, it means to group or to pack, but from the calling end it's used to unpack. I don't expect it to change as it's been a part of Python for a long time and as long as it's only used with argument passing it's not too difficult to keep straight. My concern is if it's used outside of functions, then on the left hand side of assignments, it will be used to pack, but if used on the right hand side it will be to unpack. And if it becomes as commonplace as I think it will, it will present confusing uses and/or situations where you may have to think, "oh yeah, it's umm... unpacking here and umm... packing there, but multiplying there". The point is it could be a stumbling block, especially for new Python users. So I think a certain amount of caution should be in order on this item. At least check that it doesn't cause confusing situations. I really would like some form of easy and efficient tuple unpacking if possible. I've played around with using '/' and '-' to split and to partially unpack lists, but it's probably better to use a named method. That has the benefit of always reading the same. 
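For the record, the syntax being debated in this thread was later accepted as PEP 3132 (extended iterable unpacking) in Python 3.0; a sketch of how the cases discussed here behave there:

```python
# The motivating case from the thread:
cmd, *args = "copy src dst".split()
print(cmd, args)        # copy ['src', 'dst']

# The starred name always binds a list, whatever the source iterable was:
a, b, *rest = (1, 2, 3, 4, 5)
print(rest)             # [3, 4, 5]

# The star is also allowed in the middle or at the front:
first, *middle, last = range(5)
print(middle)           # [1, 2, 3]

# Too few items for the plain targets still raises ValueError,
# much like the IndexError in the pop()-based version quoted above:
try:
    cmd, *args = "".split()
except ValueError:
    cmd, args = "", []   # cmd == '', args == []
```

The starred target always yielding a list is also how the tuple-versus-list question raised later in the thread was eventually settled.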
Also packing tuples (other than in function defs) isn't needed if you have a way to do partial unpacking. a,b,c = alist[:2]+[alist[2:]] # a,b,rest Not the most efficient way I think, but maybe as a sequence method written in C it could be better? Cheers, Ron From guido at python.org Tue Oct 11 06:55:58 2005 From: guido at python.org (Guido van Rossum) Date: Mon, 10 Oct 2005 21:55:58 -0700 Subject: [Python-Dev] Extending tuple unpacking In-Reply-To: <434B3C01.5030001@ronadam.com> References: <2773CAC687FD5F4689F526998C7E4E5F4DB6B6@au3010avexu1.global.avaya.com> <434B3C01.5030001@ronadam.com> Message-ID: On 10/10/05, Ron Adam wrote: > The problem is the '*' means different things depending on where it's > located. In a function def, it means to group or to pack, but from the > calling end it's used to unpack. I don't expect it to change as it's > been a part of Python for a long time and as long as it's only used with > argument passing it's not too difficult to keep straight. > > My concern is if it's used outside of functions, then on the left hand > side of assignments, it will be used to pack, but if used on the right > hand side it will be to unpack. And if it becomes as common place as I > think it will, it will present confusing uses and or situations where > you may have to think, "oh yeah, it's umm... unpacking here and umm... > packing there, but multiplying there". The point is it could be a > stumbling block, especially for new Python users. So I think a certain > amount of caution should be in order on this item. At least check that > it's doesn't cause confusing situations. This particular concern, I believe, is a fallacy. If you squint the right way, using *rest for both packing and unpacking is totally logical. If a, b, *rest = (1, 2, 3, 4, 5) puts 1 into a, 2 into b, and (3, 4, 5) into rest, then it's totally logical and symmetrical if after that x = a, b, *rest puts (1, 2, 3, 4, 5) into x. BTW, what should [a, b, *rest] = (1, 2, 3, 4, 5) do? 
Should it set rest to (3, 4, 5) or to [3, 4, 5]? Suppose the latter. Then should we allow [*rest] = x as alternative syntax for rest = list(x) ? And then perhaps *rest = x should mean rest = tuple(x) Or should that be disallowed and would we have to write *rest, = x analogous to singleton tuples? There certainly is a need for doing the same from the end: *rest, a, b = (1, 2, 3, 4, 5) could set rest to (1, 2, 3), a to 4, and b to 5. From there it's a simple step towards a, b, *rest, d, e = (1, 2, 3, 4, 5) meaning a, b, rest, d, e = (1, 2, (3,), 4, 5) and so on. Where does it stop? BTW, and quite unrelated, I've always felt uncomfortable that you have to write f(a, b, foo=1, bar=2, *args, **kwds) I've always wanted to write that as f(a, b, *args, foo=1, bar=2, **kwds) but the current grammar doesn't allow it. Still -0 on the whole thing, -- --Guido van Rossum (home page: http://www.python.org/~guido/) From bcannon at gmail.com Tue Oct 11 07:10:18 2005 From: bcannon at gmail.com (Brett Cannon) Date: Mon, 10 Oct 2005 22:10:18 -0700 Subject: [Python-Dev] problem with genexp In-Reply-To: References: Message-ID: On 10/10/05, Neal Norwitz wrote: > There's a problem with genexp's that I think really needs to get > fixed. See http://python.org/sf/1167751 the details are below. This > code: > > >>> foo(a = i for i in range(10)) > > generates "NameError: name 'i' is not defined" when run because: [SNIP] > If you add parens around the code: foo(a = i for i in range(10)) > You get something quite different: Do you mean having ``(foo(a = i for i in range(10))``? Otherwise I see no difference when compared to the first value. 
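The parenthesized spelling Neal meant to paste can be sketched as follows (`foo` here is a hypothetical stand-in for the function in his example; current Python makes the unparenthesized form a SyntaxError, which is what the bug report asked for):

```python
def foo(a=None):
    """Hypothetical stand-in for the function in the example."""
    return list(a)

# foo(a = i for i in range(10))      # bare form: rejected as a SyntaxError today
result = foo(a=(i for i in range(10)))  # parenthesized genexp as the keyword value
print(result)   # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```

With the parentheses, the generator expression is built first and then passed as the value of the keyword argument, so the loop variable `i` never leaks into the calling scope.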
-Brett From nnorwitz at gmail.com Tue Oct 11 07:14:44 2005 From: nnorwitz at gmail.com (Neal Norwitz) Date: Mon, 10 Oct 2005 22:14:44 -0700 Subject: [Python-Dev] problem with genexp In-Reply-To: References: Message-ID: On 10/10/05, Brett Cannon wrote: > On 10/10/05, Neal Norwitz wrote: > > There's a problem with genexp's that I think really needs to get > > fixed. See http://python.org/sf/1167751 the details are below. This > > code: > > > > >>> foo(a = i for i in range(10)) > > > > generates "NameError: name 'i' is not defined" when run because: > [SNIP] > > If you add parens around the code: foo(a = i for i in range(10)) > > You get something quite different: > > Do you mean having ``(foo(a = i for i in range(10))``? Otherwise I > see no difference when compared to the first value. Sorry, I think I put it in the bug report, but forgot to add it here: >>> foo(a = (i for i in range(10))) n From martin at v.loewis.de Tue Oct 11 08:16:53 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 11 Oct 2005 08:16:53 +0200 Subject: [Python-Dev] PythonCore\CurrentVersion In-Reply-To: <200510102109.37690.fdrake@acm.org> References: <4347A020.2050008@v.loewis.de> <1f7befae0510101542tf65c250ybf93eb7f11c187ae@mail.gmail.com> <200510102109.37690.fdrake@acm.org> Message-ID: <434B58D5.7020206@v.loewis.de> Fred L. Drake, Jr. wrote: > On Monday 10 October 2005 18:42, Tim Peters wrote: > > never before this year -- maybe sys.path _used_ to contain the current > > directory on Linux?). > > It's been a long time since this was the case on Unix of any variety; I > *think* this changed to the current state back before 2.0. Please check again: [GCC 4.0.2 20050821 (prerelease) (Debian 4.0.1-6)] on linux2 Type "help", "copyright", "credits" or "license" for more information. 
>>> import sys >>> sys.path ['', '/usr/lib/python23.zip', '/usr/lib/python2.3', '/usr/lib/python2.3/plat-linux2', '/usr/lib/python2.3/lib-tk', '/usr/lib/python2.3/lib-dynload', '/usr/local/lib/python2.3/site-packages', '/usr/lib/python2.3/site-packages', '/usr/lib/python2.3/site-packages/Numeric', '/usr/lib/python2.3/site-packages/gtk-2.0', '/usr/lib/site-python'] We still have the empty string in sys.path, and it still denotes the current directory. Regards, Martin From greg.ewing at canterbury.ac.nz Tue Oct 11 09:21:06 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 11 Oct 2005 20:21:06 +1300 Subject: [Python-Dev] Extending tuple unpacking In-Reply-To: <434B3C01.5030001@ronadam.com> References: <2773CAC687FD5F4689F526998C7E4E5F4DB6B6@au3010avexu1.global.avaya.com> <434B3C01.5030001@ronadam.com> Message-ID: <434B67E2.8060909@canterbury.ac.nz> Ron Adam wrote: > My concern is if it's used outside of functions, then on the left hand > side of assignments, it will be used to pack, but if used on the right > hand side it will be to unpack. I don't see why that should be any more confusing than the fact that commas denote tuple packing on the right and unpacking on the left. Greg From greg.ewing at canterbury.ac.nz Tue Oct 11 09:39:38 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 11 Oct 2005 20:39:38 +1300 Subject: [Python-Dev] Extending tuple unpacking In-Reply-To: References: <2773CAC687FD5F4689F526998C7E4E5F4DB6B6@au3010avexu1.global.avaya.com> <434B3C01.5030001@ronadam.com> Message-ID: <434B6C3A.7020001@canterbury.ac.nz> Guido van Rossum wrote: > BTW, what should > > [a, b, *rest] = (1, 2, 3, 4, 5) > > do? Should it set rest to (3, 4, 5) or to [3, 4, 5]? Whatever type is chosen, it should be the same type, always. The rhs could be any iterable, not just a tuple or a list. Making a special case of preserving one or two types doesn't seem worth it to me. > Suppose the latter. 
Then should we allow > > [*rest] = x > > as alternative syntax for > > rest = list(x) That would be a consequence of that choice, yes, but so what? There are already infinitely many ways of writing any expression. > ? And then perhaps > > *rest = x > > should mean > > rest = tuple(x) > > Or should that be disallowed Why bother? What harm would result from the ability to write that? > There certainly is a need for doing the same from the end: > > *rest, a, b = (1, 2, 3, 4, 5) I wouldn't mind at all if *rest were only allowed at the end. There's a pragmatic reason for that if nothing else: the rhs can be any iterable, and there's no easy way of getting "all but the last n" items from a general iterable. > Where does it stop? For me, it stops with *rest only allowed at the end, and always yielding a predictable type (which could be either tuple or list, I don't care). > BTW, and quite unrelated, I've always felt uncomfortable that you have to write > > f(a, b, foo=1, bar=2, *args, **kwds) > > I've always wanted to write that as > > f(a, b, *args, foo=1, bar=2, **kwds) Yes, I'd like that too, with the additional meaning that foo and bar can only be specified by keyword, not by position. Greg From ncoghlan at gmail.com Tue Oct 11 11:05:36 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 11 Oct 2005 19:05:36 +1000 Subject: [Python-Dev] Fwd: defaultproperty In-Reply-To: References: <433AA5AC.6040509@zope.com> <433BA3CF.1090205@zope.com> <43494648.6040904@zope.com> <4349997E.9010208@gmail.com> <76fd5acf0510092246q1861403flda2dd49ddbe02d6c@mail.gmail.com> <76fd5acf0510092247m4d8383e8o40260de7950906@mail.gmail.com> <1128955292.27841.2.camel@geddy.wooz.org> Message-ID: <434B8060.6070903@gmail.com> Brett Cannon wrote: > On 10/10/05, Barry Warsaw wrote: > >>On Mon, 2005-10-10 at 01:47, Calvin Spealman wrote: >> >> >>>Never created for a reason? 
lumping things together for having the >>>similar usage semantics, but unrelated purposes, might be something to >>>avoid and maybe that's why it hasn't happened yet for decorators. If >>>ever there was a makethreadsafe decorator, it should go in the thread >>>module, etc. I mean, come on, its like making a module just to store a >>>bunch of unrelated types just to lump them together because they're >>>types. Who wants that? >> >>Like itertools? >> >>+1 for a decorators module. > > > +1 from me as well. And placing defaultproperty in there makes sense > if it is meant to be used as a decorator and not viewed as some spiffy > descriptor. > > Should probably work in Michael's update_meta() function as well > (albeit maybe with a different name since I think I remember Guido > saying he didn't like the name). I thought mimic was a nice name: @mimic(func) def wrapper(*args, **kwds): return func(*args, **kwds) As a location for this, I would actually suggest a module called something like "metatools", rather than "decorators". The things these have in common is that they're about manipulating the way functions and the like interact with the Python language infrastructure - they're tools to make metaprogramming a bit easier. If "contextmanager" isn't made a builtin, this module would also be the place for it. Ditto for any standard context managers (such as closing()) which aren't made builtins. At the moment, the only location for such things is the builtin namespace (e.g. classmethod, staticmethod). Regardless, a short PEP is needed to: a. pick a name for the module b. decide precisely what will be in it for Python 2.5 Cheers, Nick. 
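A minimal version of the `mimic` decorator factory sketched above can be built on `functools.update_wrapper`, which landed in Python 2.5 shortly after this thread; the name `mimic` is just the working title used in the discussion, not a real stdlib name:

```python
import functools

def mimic(func):
    """Return a decorator that copies func's metadata onto the wrapper."""
    def decorator(wrapper):
        functools.update_wrapper(wrapper, func)
        return wrapper
    return decorator

def func(x):
    "Docstring worth preserving."
    return x * 2

@mimic(func)
def wrapper(*args, **kwds):
    return func(*args, **kwds)

print(wrapper.__name__, wrapper(21))   # func 42
```

`update_wrapper` copies `__name__`, `__doc__`, and friends from `func` onto `wrapper`, which is exactly the "make the wrapper impersonate the wrapped function" behaviour the thread is after.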
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.blogspot.com From ncoghlan at iinet.net.au Tue Oct 11 12:02:39 2005 From: ncoghlan at iinet.net.au (Nick Coghlan) Date: Tue, 11 Oct 2005 20:02:39 +1000 Subject: [Python-Dev] Making Queue.Queue easier to use Message-ID: <434B8DBF.9080509@iinet.net.au> The multi-processing discussion reminded me that I have a few problems I run into every time I try to use Queue objects. My first problem is finding it: Py> from threading import Queue # Nope Traceback (most recent call last): File "", line 1, in ? ImportError: cannot import name Queue Py> from Queue import Queue # Ah, there it is What do people think of the idea of adding an alias to Queue into the threading module so that: a) the first line above works; and b) Queue can be documented with all of the other threading primitives, rather than being off somewhere else in its own top-level section. My second problem is with the current signatures of the put() and get() methods. Specifically, the following code blocks forever instead of raising an Empty exception after 500 milliseconds as one might expect: from Queue import Queue x = Queue() x.get(0.5) I assume the current signature is there for backward compatibility with the original version that didn't support timeouts (considering the difficulty of telling the difference between "x.get(1)" and "True = 1; x.get(True)" from inside the get() method). However, the need to write "x.get(True, 0.5)" seems seriously redundant, given that a single parameter can actually handle all the options (as is currently the case with Condition.wait()). The "put_nowait" and "get_nowait" functions are fine, because they serve a useful documentation purpose at the calling point (particularly given the current clumsy timeout signature). 
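The blocking surprise described above is easy to reproduce; a sketch using the Python 3 spelling of the module:

```python
import queue

q = queue.Queue()

# q.get(0.5) would hang forever here: 0.5 is swallowed by the `block`
# parameter (any truthy value means "block"), and `timeout` stays None.
# The unambiguous spelling needs both arguments:
try:
    q.get(True, 0.5)           # block for at most half a second
    timed_out = False
except queue.Empty:
    timed_out = True
print(timed_out)               # True -- the queue is empty

# get_nowait() at least documents the intent at the call site:
try:
    q.get_nowait()
except queue.Empty:
    pass
```

Passing the timeout by keyword, `q.get(timeout=0.5)`, sidesteps the positional trap entirely, which is roughly the usability fix being proposed here.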
What do people think of the idea of adding "put_wait" and "get_wait" methods with the signatures: put_wait(item[, timeout=None]) get_wait([timeout=None]) Optionally, the existing "put" and "get" methods could be deprecated, with the goal of eventually changing their signature to match the put_wait and get_wait methods above. If people are amenable to these ideas, I should be able to work up a patch for them this week. Regards, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.blogspot.com From ncoghlan at gmail.com Tue Oct 11 12:04:07 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 11 Oct 2005 20:04:07 +1000 Subject: [Python-Dev] Pythonic concurrency In-Reply-To: <1128955532.32340.141.camel@parabolic.corp.google.com> References: <20051006143740.287E.JCARLSON@uci.edu> <200510070145.17284.ms@cerenity.org> <20051006221436.2892.JCARLSON@uci.edu> <415220344.20051007104751@MailBlocks.com> <4346FC98.5050504@gmail.com> <1128955532.32340.141.camel@parabolic.corp.google.com> Message-ID: <434B8E17.9040105@gmail.com> Donovan Baarda wrote: > On Fri, 2005-10-07 at 23:54, Nick Coghlan wrote: > [...] > >>The few times I have encountered anyone saying anything resembling "threading >>is easy", it was because the full sentence went something like "threading is >>easy if you use message passing and copy-on-send or release-reference-on-send >>to communicate between threads, and limit the shared data structures to those >>required to support the messaging infrastructure". And most of the time there >>was an implied "compared to using semaphores and locks directly, " at the start. > > > LOL! So threading is easy if you restrict inter-thread communication to > message passing... and what makes multi-processing hard is your only > inter-process communication mechanism is message passing :-) > > Sounds like yet another reason to avoid threading and use processes > instead... 
effort spent on threading based message passing implementations could instead be spent on inter-process messaging. > Actually, I think it makes it worth building a decent message-passing paradigm (like, oh, PEP 342) that can then be scaled using backends with four different levels of complexity: - logical threading (generators) - physical threading (threading.Thread and Queue.Queue) - multiple processing - distributed processing Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.blogspot.com From ncoghlan at gmail.com Tue Oct 11 12:06:31 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 11 Oct 2005 20:06:31 +1000 Subject: [Python-Dev] problem with genexp In-Reply-To: References: Message-ID: <434B8EA7.5040205@gmail.com> Neal Norwitz wrote: > There's a problem with genexp's that I think really needs to get > fixed. See http://python.org/sf/1167751 the details are below. This > code: > I agree with the bug report that the code should either raise a > SyntaxError or do the right thing. I agree it should be a SyntaxError - I believe the AST compiler actually raises one in this situation. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.blogspot.com From ncoghlan at gmail.com Tue Oct 11 12:14:46 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 11 Oct 2005 20:14:46 +1000 Subject: [Python-Dev] C.E.R. Thoughts In-Reply-To: <78d129adb4581d24b1d07844019a2afe@gmail.com> References: <78d129adb4581d24b1d07844019a2afe@gmail.com> Message-ID: <434B9096.5040100@gmail.com> jamesr wrote: > Congratulations heartily given. I missed the ternary op in c... Way to > go! clean and easy and now I can do: > > if ((sys.argv[1] =='debug') if len(sys.argv) > 1 else False): > pass > > and check variables IF AND ONLY if they exist, in a single line! 
> > but y'all knew that.. Yep, it was a conscious decision to add a construct with the *potential* to be abused for use in places where the existing "and" and "or" expressions *are* being abused and resulting in buggy code. The code in your example is lousy because it's unreadable (and there are far more readable alternatives like a simple short-circuiting usage of "and"), but at least it's semantically correct (whereas the same can't be said for the current abuse of "and" and "or"). If code using a conditional expression is unclear, blame the programmer for choosing to write the code, don't blame the existence of the conditional expression :) We're-all-adults-here-ly yours, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.blogspot.com From ncoghlan at gmail.com Tue Oct 11 12:25:44 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 11 Oct 2005 20:25:44 +1000 Subject: [Python-Dev] PEP 3000 and exec In-Reply-To: References: <17226.58512.451743.300957@montanaro.dyndns.org> Message-ID: <434B9328.4030105@gmail.com> Guido van Rossum wrote: > My idea was to make the compiler smarter so that it would recognize > exec() even if it was just a function. > > Another idea might be to change the exec() spec so that you are > required to pass in a namespace (and you can't use locals() either!). > Then the whole point becomes moot. I vote for the latter option. Particularly if something like Namespace objects make their way into the standard lib before Py3k (a Namespace object is essentially designed to provide attribute style lookup into a string-keyed dictionary- you can fake it pretty well with an empty class, but there are a few quirks with doing it that way). Cheers, Nick. 
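The "Namespace object" Nick describes can be fleshed out in a few lines; Python 3.3 later shipped essentially this idea as types.SimpleNamespace. A minimal sketch in modern Python (the class name and repr format below are illustrative, not any particular proposed API):

```python
# A minimal sketch of a "Namespace" object: attribute-style lookup over an
# ordinary string-keyed dictionary. Python 3.3 eventually shipped much the
# same thing as types.SimpleNamespace.

class Namespace:
    """Attribute-style view of a string-keyed dict."""

    def __init__(self, **kwargs):
        self.__dict__.update(kwargs)

    def __repr__(self):
        items = ", ".join("%s=%r" % (k, v) for k, v in self.__dict__.items())
        return "Namespace(%s)" % items


# One of the "quirks" of faking this with an empty class: an empty class
# does give each instance a __dict__, but it offers no convenient
# constructor or repr, which is most of what the wrapper above adds.
ns = Namespace(x=1, y=2)
print(ns.x, ns.y)        # attribute-style lookup
ns.__dict__["z"] = 3     # the underlying dict is still right there
print(ns.z)
```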
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.blogspot.com From ncoghlan at gmail.com Tue Oct 11 12:36:59 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 11 Oct 2005 20:36:59 +1000 Subject: [Python-Dev] Pythonic concurrency In-Reply-To: <746109444.20051010141518@MailBlocks.com> References: Your message of "Mon, 10 Oct 2005 11:20:31 PDT." <17226.45295.661911.542400@montanaro.dyndns.org> <05Oct10.122654pdt."58617"@synergy1.parc.xerox.com> <746109444.20051010141518@MailBlocks.com> Message-ID: <434B95CB.5000107@gmail.com> Bruce Eckel wrote: >>Yes, there's a troublesome meme in the world: "threads are hard". >>They aren't, really. You just have to know what you're doing. > > > I would say that the troublesome meme is that "threads are easy." I > posted an earlier, rather longish message about this. The gist of > which was: "when someone says that threads are easy, I have no idea > what they mean by it." > > Perhaps this means "threads in Python are easier than threads in other > languages." One key thing is that the Python is so dynamic that the compiler can't get too fancy with the order in which it does things. However, Python threading has its own traps for the unwary (mainly related to badly-behaved C extensions, but they're still traps). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.blogspot.com From steve at holdenweb.com Tue Oct 11 13:06:39 2005 From: steve at holdenweb.com (Steve Holden) Date: Tue, 11 Oct 2005 12:06:39 +0100 Subject: [Python-Dev] Pythonic concurrency In-Reply-To: <746109444.20051010141518@MailBlocks.com> References: Your message of "Mon, 10 Oct 2005 11:20:31 PDT." 
<17226.45295.661911.542400@montanaro.dyndns.org> <05Oct10.122654pdt."58617"@synergy1.parc.xerox.com> <746109444.20051010141518@MailBlocks.com> Message-ID: Bruce Eckel wrote: [Bill Janssen] >>Yes, there's a troublesome meme in the world: "threads are hard". >>They aren't, really. You just have to know what you're doing. > But that begs the question, because there is a significant amount of evidence that when it comes to threads "knowing what you are doing" is hard to the point that people can *think* they do when they demonstrably don't! > > I would say that the troublesome meme is that "threads are easy." I > posted an earlier, rather longish message about this. The gist of > which was: "when someone says that threads are easy, I have no idea > what they mean by it." > I would suggest that the truth lies in the middle ground, and would say that "you can get yourself into a lot of trouble using threads without considering the subtleties". It's an area where anything but the most simplistic solutions are almost always wrong at some point. > Perhaps this means "threads in Python are easier than threads in other > languages." > > But I just finished a 150-page chapter on Concurrency in Java which > took many months to write, based on a large chapter on Concurrency in > C++ which probably took longer to write. I keep in reasonably good > touch with some of the threading experts. I can't get any of them to > say that it's easy, even though they really do understand the issues > and think about it all the time. *Because* of that, they say that it's > hard. > > So alright, I'll take the bait that you've laid down more than once, > now. Perhaps you can go beyond saying that "threads really aren't > hard" and explain the aspects of them that seem so easy to you. > Perhaps you can give a nice clear explanation of cache coherency and > memory barriers in multiprocessor machines? Or explain atomicity, > volatility and visibility? 
Or, even better, maybe you can come up with > a better concurrency model, which is what I think most of us are > looking for in this discussion. > The nice thing about Python threads (or rather threading.threads) is that since each thread is an instance it's *relatively* easy to ensure that a thread restricts itself to manipulating thread-local resources (i.e. instance members). This makes it possible to write algorithms parameterized for the number of "worker threads" where the workers are taking their tasks off a Queue with entries generated by a single producer thread. With care, multiple producers can be used. More complex inter-thread communications are problematic, and arbitrary access to foreign-thread state is a nightmare (although the position has been somewhat alleviated by the introduction of threading.local). Beyond the single-producer many-consumers model there is still plenty of room to shoot yourself in the foot. In the case of threads true sophistication is staying away from the difficult cases, an option which unfortunately isn't always available in the real world. regards Steve -- Steve Holden +44 150 684 7255 +1 800 494 3119 Holden Web LLC www.holdenweb.com PyCon TX 2006 www.python.org/pycon/ From jeremy at alum.mit.edu Tue Oct 11 15:40:56 2005 From: jeremy at alum.mit.edu (Jeremy Hylton) Date: Tue, 11 Oct 2005 09:40:56 -0400 Subject: [Python-Dev] problem with genexp In-Reply-To: <434B8EA7.5040205@gmail.com> References: <434B8EA7.5040205@gmail.com> Message-ID: On 10/11/05, Nick Coghlan wrote: > Neal Norwitz wrote: > > There's a problem with genexp's that I think really needs to get > > fixed. See http://python.org/sf/1167751 the details are below. This > > code: > > I agree with the bug report that the code should either raise a > > SyntaxError or do the right thing. > > I agree it should be a SyntaxError - I believe the AST compiler actually > raises one in this situation. Could someone add a test for this on the AST branch? 
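For the record, the behaviour being requested is the one that stuck: in modern CPython an unparenthesised generator expression used as a keyword-argument value is rejected at compile time. A self-contained check (a sketch, not the actual test added to the AST branch) reduces to two compile() probes:

```python
# Check the genexp-as-keyword-argument behaviour discussed in this thread.
# In modern CPython, "foo(x=i for i in range(10))" is a compile-time
# SyntaxError rather than being misparsed as a keyword argument "x=i"
# with trailing junk; the parenthesised form compiles fine.

def compiles(src):
    try:
        compile(src, "<test>", "eval")
        return True
    except SyntaxError:
        return False

print(compiles("foo(x=i for i in range(10))"))    # False: rejected outright
print(compiles("foo(x=(i for i in range(10)))"))  # True: parenthesised form
```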
BTW, it looks like doctest is the way to go for SyntaxError tests. There are older tests, like test_scope.py, that use separate files with bad syntax (and lots of extra kludges in the infrastructure to ignore the fact that those .py files can't be compiled). Jeremy From ncoghlan at gmail.com Tue Oct 11 15:44:59 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 11 Oct 2005 23:44:59 +1000 Subject: [Python-Dev] problem with genexp In-Reply-To: <434B8EA7.5040205@gmail.com> References: <434B8EA7.5040205@gmail.com> Message-ID: <434BC1DB.4040806@gmail.com> Nick Coghlan wrote: > Neal Norwitz wrote: > >>There's a problem with genexp's that I think really needs to get >>fixed. See http://python.org/sf/1167751 the details are below. This >>code: >>I agree with the bug report that the code should either raise a >>SyntaxError or do the right thing. > > > I agree it should be a SyntaxError - I believe the AST compiler actually > raises one in this situation. I was half right. Both the normal compiler and the AST compiler give a SyntaxError if you write: foo((a=i for i in range(10))) The problem is definitely on the parser end though: Py> compiler.parse("foo(x=i for i in range(10))") Module(None, Stmt([Discard(CallFunc(Name('foo'), [Keyword('x', Name('i'))], None, None))])) It's getting to what looks like a valid keyword argument in "x=i" and throwing the rest of it away, when it should be flagging a syntax error (the parser's limited lookahead should still be enough to spot the erroneous 'for' keyword and bail out). The error will be even more obscure if there is an "i" visible from the location of the function call. Whereas when it's parenthesised correctly, the parse tree looks more like this: Py> compiler.parse("foo(x=(i for i in range(10)))") Module(None, Stmt([Discard(CallFunc(Name('foo'), [Keyword('x', GenExpr(GenExprInner(Name('i'), [GenExprFor(AssName('i', 'OP_ASSIGN'), CallFunc(Name('range'), [Const(10)], None, None), [])])))], None, None))])) Cheers, Nick. 
P.S. I added a comment showing the parser output to the SF bug report. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.blogspot.com From tim.peters at gmail.com Tue Oct 11 15:51:06 2005 From: tim.peters at gmail.com (Tim Peters) Date: Tue, 11 Oct 2005 09:51:06 -0400 Subject: [Python-Dev] PythonCore\CurrentVersion In-Reply-To: <434B58D5.7020206@v.loewis.de> References: <4347A020.2050008@v.loewis.de> <1f7befae0510101542tf65c250ybf93eb7f11c187ae@mail.gmail.com> <200510102109.37690.fdrake@acm.org> <434B58D5.7020206@v.loewis.de> Message-ID: <1f7befae0510110651o504958det5d2409b3f724070e@mail.gmail.com> [Tim Peters] >>> never before this year -- maybe sys.path _used_ to contain the current >>> directory on Linux?). [Fred L. Drake, Jr.] >> It's been a long time since this was the case on Unix of any variety; I >> *think* this changed to the current state back before 2.0. [Martin v. L?wis] > Please check again: > > [GCC 4.0.2 20050821 (prerelease) (Debian 4.0.1-6)] on linux2 > Type "help", "copyright", "credits" or "license" for more information. > >>> import sys > >>> sys.path > ['', '/usr/lib/python23.zip', '/usr/lib/python2.3', > '/usr/lib/python2.3/plat-linux2', '/usr/lib/python2.3/lib-tk', > '/usr/lib/python2.3/lib-dynload', > '/usr/local/lib/python2.3/site-packages', > '/usr/lib/python2.3/site-packages', > '/usr/lib/python2.3/site-packages/Numeric', > '/usr/lib/python2.3/site-packages/gtk-2.0', '/usr/lib/site-python'] > > We still have the empty string in sys.path, and it still > denotes the current directory. Well, that's in interactive mode, and I see sys.path[0] == "" on both Windows and Linux then. 
I don't see "" in sys.path on either box in batch mode, although I do see the absolutized path to the current directory in sys.path in batch mode on Windows but not on Linux -- but Mark Hammond says he doesn't see (any form of) the current directory in sys.path in batch mode on Windows. It's a bit confusing ;-) From fumanchu at amor.org Tue Oct 11 16:46:40 2005 From: fumanchu at amor.org (Robert Brewer) Date: Tue, 11 Oct 2005 07:46:40 -0700 Subject: [Python-Dev] Pythonic concurrency References: Your message of "Mon, 10 Oct 2005 11:20:31 PDT." <17226.45295.661911.542400@montanaro.dyndns.org> <05Oct10.122654pdt."58617"@synergy1.parc.xerox.com> <746109444.20051010141518@MailBlocks.com> Message-ID: Steve Holden wrote: > The nice thing about Python threads (or rather threading.threads) is > that since each thread is an instance it's *relatively* easy to ensure > that a thread restricts itself to manipulating thread-local resources > (i.e. instance members). > > This makes it possible to write algorithms parameterized for the number > of "worker threads" where the workers are taking their tasks off a Queue > with entries generated by a single producer thread. With care, multiple > producers can be used. More complex inter-thread communications are > problematic, and arbitrary access to foreign-thread state is a nightmare > (although the position has been somewhat alleviated by the introduction > of threading.local). "Somewhat alleviated" and somewhat worsened. I've had half a dozen conversations in the last year about sharing data between threads; in every case, I've had to work quite hard to convince the other person that threading.local is *not* magic pixie thread dust. Each time, they had come to the conclusion that if they had a global variable, they could just stick a reference to it into a threading.local object and instantly have safe, concurrent access to it. 
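Robert's point is easy to demonstrate concretely (a sketch in modern Python): threading.local gives each thread its own *binding*, but storing a reference to a shared object in it does nothing to protect the object itself.

```python
import threading

# threading.local is not magic pixie thread dust: stashing a reference to
# a shared mutable object in it leaves every thread mutating the one and
# only underlying object.

shared = []
local = threading.local()

def worker(name):
    local.data = shared          # a reference, not a per-thread copy
    local.data.append(name)      # mutates the single shared list

threads = [threading.Thread(target=worker, args=(i,)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(shared))  # 4 -- all four threads wrote into the same object
```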
Robert Brewer System Architect Amor Ministries fumanchu at amor.org -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20051011/39c9073f/attachment.html From ncoghlan at gmail.com Tue Oct 11 16:51:30 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 12 Oct 2005 00:51:30 +1000 Subject: [Python-Dev] Extending tuple unpacking In-Reply-To: <434B6C3A.7020001@canterbury.ac.nz> References: <2773CAC687FD5F4689F526998C7E4E5F4DB6B6@au3010avexu1.global.avaya.com> <434B3C01.5030001@ronadam.com> <434B6C3A.7020001@canterbury.ac.nz> Message-ID: <434BD172.3030004@gmail.com> Greg Ewing wrote: > Guido van Rossum wrote: > > >>BTW, what should >> >> [a, b, *rest] = (1, 2, 3, 4, 5) >> >>do? Should it set rest to (3, 4, 5) or to [3, 4, 5]? > > > Whatever type is chosen, it should be the same type, always. > The rhs could be any iterable, not just a tuple or a list. > Making a special case of preserving one or two types doesn't > seem worth it to me. And, for consistency with functions, the type chosen should be a tuple. I'm also trying to figure out why you would ever write: [a, b, c, d] = seq instead of: a, b, c, d = seq or: (a, b, c, d) = seq It's not like the square brackets generate different code: Py> def foo(): ... x, y = 1, 2 ... (x, y) = 1, 2 ... [x, y] = 1, 2 ... Py> dis.dis(foo) 2 0 LOAD_CONST 3 ((1, 2)) 3 UNPACK_SEQUENCE 2 6 STORE_FAST 1 (x) 9 STORE_FAST 0 (y) 3 12 LOAD_CONST 4 ((1, 2)) 15 UNPACK_SEQUENCE 2 18 STORE_FAST 1 (x) 21 STORE_FAST 0 (y) 4 24 LOAD_CONST 5 ((1, 2)) 27 UNPACK_SEQUENCE 2 30 STORE_FAST 1 (x) 33 STORE_FAST 0 (y) 36 LOAD_CONST 0 (None) 39 RETURN_VALUE So my vote would actually go for deprecating the use of square brackets to surround an assignment target list - it makes it look like an actual list object should be involved somewhere, but there isn't one. >>? 
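For reference, the semantics Python eventually adopted for this construct (PEP 3132, Extended Iterable Unpacking, in Python 3.0) answered several of the questions being debated: the starred target always receives a list, and the star is not restricted to the last position.

```python
# The semantics Python 3.0 adopted (PEP 3132) for starred assignment
# targets: the starred name always gets a list, whatever the type of the
# right-hand side.

a, b, *rest = (1, 2, 3, 4, 5)
print(a, b, rest)        # 1 2 [3, 4, 5] -- a list, even from a tuple

*front, y, z = range(5)  # the star is allowed at the front, too
print(front, y, z)       # [0, 1, 2] 3 4

x, *empty = [1]          # an exhausted star target yields [], not an error
print(x, empty)          # 1 []

# A bare "*rest = x" was NOT allowed: the starred target must appear
# inside a list or tuple target, so that form stays a SyntaxError.
```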
And then perhaps >> >> *rest = x >> >>should mean >> >> rest = tuple(x) >> >>Or should that be disallowed > > Why bother? What harm would result from the ability to write that? Given that: def foo(*args): print args is legal, I would have no problem with "*rest = x" being legal. >>There certainly is a need for doing the same from the end: >> >> *rest, a, b = (1, 2, 3, 4, 5) > > > I wouldn't mind at all if *rest were only allowed at the end. > There's a pragmatic reason for that if nothing else: the rhs > can be any iterable, and there's no easy way of getting "all > but the last n" items from a general iterable. Agreed. The goal here is to make the name binding rules consistent between for loops, tuple assigment and function entry, not to create different rules. >>Where does it stop? > For me, it stops with *rest only allowed at the end, and > always yielding a predictable type (which could be either tuple > or list, I don't care). For me, it stops when the rules for positional name binding are more consistent across operations that bind names (although complete consistency isn't possible, given that function calls don't unpack sequences automatically). Firstly, let's list the operations that permit name binding to a list of identifiers: - binding of function parameters to function arguments - binding of assignment target list to assigned sequence - binding of iteration variables to iteration values However, that function argument case needs to be recognised as a two step operation, whereby the arguments are *always* packed into a tuple before being bound to the parameters. 
That is something very vaguely like: if numargs > 0: if numargs == 1: argtuple = args, # One argument gives singleton tuple else: argtuple = args # More arguments gives appropriate tuple argtuple += tuple(starargs) # Extended arguments are added to the tuple param1, param2, *rest = argtuple # Tuple is unpacked to parameters This means that the current behaviour of function parameters is actually the same as assignment target lists and iteration variables, in that the argument tuple is *always* unpacked into the parameter list - the only difference is that a single argument is always considered a singleton tuple. You can get the same behaviour with target lists and iteration variables by only using tuples of identifiers as targets (i.e., use "x," rather than just "x"). So the proposal at this stage is simply to mimic the unpacking of the argument tuple into the formal parameter list in the other two name list binding cases, such that the pseudocode above would actually do the same thing as building an argument list and binding it to its formal parameters does. Now, when it came to tuple *packing* syntax (i.e., extended call syntax) The appropriate behaviour would be for: 1, 2, 3, *range(10) to translate (roughly) to: (1, 2, 3) + tuple(range(10)) However, given that the equivalent code works just fine anywhere it really matters (assignment value, return value, yield value), and is clearer about what is going on, this option is probably worth avoiding. >>BTW, and quite unrelated, I've always felt uncomfortable that you have to write >> >> f(a, b, foo=1, bar=2, *args, **kwds) >> >>I've always wanted to write that as >> >> f(a, b, *args, foo=1, bar=2, **kwds) > > > Yes, I'd like that too, with the additional meaning that > foo and bar can only be specified by keyword, not by > position. Indeed. It's a (minor) pain that optional flag variables and variable length argument lists are currently mutually exclusive. 
Although, if you had that rule, I'd want to be able to write: def f(a, b, *, foo=1, bar=2): pass to get a function which required exactly two positional arguments, but had a couple of optional keyword arguments, rather than having to do: def f(a, b, *args, foo=1, bar=2): if args: raise TypeError("f() takes exactly 2 positional arguments (%d given)" % (2 + len(args))) Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.blogspot.com From ncoghlan at gmail.com Tue Oct 11 16:54:19 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 12 Oct 2005 00:54:19 +1000 Subject: [Python-Dev] Extending tuple unpacking In-Reply-To: <434BD172.3030004@gmail.com> References: <2773CAC687FD5F4689F526998C7E4E5F4DB6B6@au3010avexu1.global.avaya.com> <434B3C01.5030001@ronadam.com> <434B6C3A.7020001@canterbury.ac.nz> <434BD172.3030004@gmail.com> Message-ID: <434BD21B.8020208@gmail.com> Nick Coghlan wrote: > For me, it stops when the rules for positional name binding are more > consistent across operations that bind names (although complete consistency > isn't possible, given that function calls don't unpack sequences automatically). Oops - forgot to delete this bit once I realised that functions actually *do* unpack the argument tuple automatically. It's just that an argument which is a single sequence gets put into a singleton tuple before being unpacked. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.blogspot.com From ncoghlan at gmail.com Tue Oct 11 16:55:31 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 12 Oct 2005 00:55:31 +1000 Subject: [Python-Dev] Pythonic concurrency In-Reply-To: References: Your message of "Mon, 10 Oct 2005 11:20:31 PDT."
<17226.45295.661911.542400@montanaro.dyndns.org> <05Oct10.122654pdt."58617"@synergy1.parc.xerox.com> <746109444.20051010141518@MailBlocks.com> Message-ID: <434BD263.40907@gmail.com> Robert Brewer wrote: > "Somewhat alleviated" and somewhat worsened. I've had half a dozen > conversations in the last year about sharing data between threads; in > every case, I've had to work quite hard to convince the other person > that threading.local is *not* magic pixie thread dust. Each time, they > had come to the conclusion that if they had a global variable, they > could just stick a reference to it into a threading.local object and > instantly have safe, concurrent access to it. Ouch. Copy, yes, reference, no. . . Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.blogspot.com From guido at python.org Tue Oct 11 17:12:03 2005 From: guido at python.org (Guido van Rossum) Date: Tue, 11 Oct 2005 08:12:03 -0700 Subject: [Python-Dev] Making Queue.Queue easier to use In-Reply-To: <434B8DBF.9080509@iinet.net.au> References: <434B8DBF.9080509@iinet.net.au> Message-ID: On 10/11/05, Nick Coghlan wrote: > The multi-processing discussion reminded me that I have a few problems I run > into every time I try to use Queue objects. > > My first problem is finding it: > > Py> from threading import Queue # Nope > Traceback (most recent call last): > File "", line 1, in ? > ImportError: cannot import name Queue > Py> from Queue import Queue # Ah, there it is I don't think that's a reason to move it. >>> from sys import Queue ImportError: cannot import name Queue >>> from os import Queue ImportError: cannot import name Queue >>> # Well where the heck is it?! > What do people think of the idea of adding an alias to Queue into the > threading module so that: > a) the first line above works; and I see no need.
Code that *doesn't* need Queue but does use threading shouldn't have to pay for loading Queue.py. > b) Queue can be documented with all of the other threading primitives, > rather than being off somewhere else in its own top-level section. Do top-level sections have to limit themselves to a single module? Even if they do, I think it's fine to plant a prominent link to the Queue module. You can't really expect people to learn how to use threads wisely from reading the library reference anyway. > My second problem is with the current signatures of the put() and get() > methods. Specifically, the following code blocks forever instead of raising an > Empty exception after 500 milliseconds as one might expect: > from Queue import Queue > x = Queue() > x.get(0.5) I'm not sure if I have much sympathy with a bug due to refusing to read the docs... :) > I assume the current signature is there for backward compatibility with the > original version that didn't support timeouts (considering the difficulty of > telling the difference between "x.get(1)" and "True = 1; x.get(True)" from > inside the get() method) Huh? What a bizarre idea. Why would you do that? I guess I don't understand where you're coming from. > However, the need to write "x.get(True, 0.5)" seems seriously redundant, given > that a single parameter can actually handle all the options (as is currently > the case with Condition.wait()). So write x.get(timeout=0.5). That's clear and unambiguous. > The "put_nowait" and "get_nowait" functions are fine, because they serve a > useful documentation purpose at the calling point (particularly given the > current clumsy timeout signature). > > What do people think of the idea of adding "put_wait" and "get_wait" methods > with the signatures: > put_wait(item[, timeout=None]) > get_wait([timeout=None]) -1. I'd rather not tweak the current Queue module at all until Python 3000. Then we could force people to use keyword args.
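For reference, the keyword form recommended here is exactly what survived into the modern queue module (the Queue module was renamed to queue in Python 3). A short sketch:

```python
import queue  # Python 3's rename of the Queue module

q = queue.Queue()

# The keyword form is unambiguous: block (the default) until an item
# arrives or the timeout expires, then raise queue.Empty.
try:
    q.get(timeout=0.1)
    timed_out = False
except queue.Empty:
    timed_out = True
print(timed_out)  # True -- nothing was ever put on the queue

# The positional trap from the thread is still there, though: get(0.5)
# passes 0.5 as the *block* flag (truthy), i.e. "block forever", not a
# half-second timeout.

q.put("item")
item = q.get_nowait()  # returns immediately; raises Empty if queue is empty
print(item)
```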
> Optionally, the existing "put" and "get" methods could be deprecated, with the > goal of eventually changing their signature to match the put_wait and get_wait > methods above. Apart from trying to guess the API without reading the docs (:-), what are the use cases for using put/get with a timeout? I have a feeling it's not that common. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Tue Oct 11 17:19:00 2005 From: guido at python.org (Guido van Rossum) Date: Tue, 11 Oct 2005 08:19:00 -0700 Subject: [Python-Dev] PythonCore\CurrentVersion In-Reply-To: <1f7befae0510110651o504958det5d2409b3f724070e@mail.gmail.com> References: <4347A020.2050008@v.loewis.de> <1f7befae0510101542tf65c250ybf93eb7f11c187ae@mail.gmail.com> <200510102109.37690.fdrake@acm.org> <434B58D5.7020206@v.loewis.de> <1f7befae0510110651o504958det5d2409b3f724070e@mail.gmail.com> Message-ID: On 10/11/05, Tim Peters wrote: > Well, that's in interactive mode, and I see sys.path[0] == "" on both > Windows and Linux then. I don't see "" in sys.path on either box in > batch mode, although I do see the absolutized path to the current > directory in sys.path in batch mode on Windows but not on Linux -- but > Mark Hammond says he doesn't see (any form of) the current directory > in sys.path in batch mode on Windows. > > It's a bit confusing ;-) How did you test batch mode? All: sys.path[0] is *not* defined to be the current directory. It is defined to be the directory of the script that was used to invoke python (sys.argv[0], typically). If there is no script, or it is being read from stdin, the default is ''. 
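The definition Guido gives is easy to verify mechanically. A sketch in modern Python (the throwaway script name is made up for the demonstration) writes a script into a temporary directory, runs it in "batch mode", and checks that sys.path[0] is the script's directory rather than the invoking directory:

```python
import os
import subprocess
import sys
import tempfile

# For "batch mode" (python path/to/script.py), sys.path[0] should be the
# directory containing the script, not the directory you invoked it from.

with tempfile.TemporaryDirectory() as tmp:
    script = os.path.join(tmp, "show_path0.py")
    with open(script, "w") as f:
        f.write("import sys\nprint(sys.path[0])\n")

    path0 = subprocess.run([sys.executable, script],
                           capture_output=True, text=True).stdout.strip()
    # realpath() smooths over symlinked temp dirs (e.g. /tmp on macOS)
    same = os.path.realpath(path0) == os.path.realpath(tmp)

print(same)  # True: path[0] is the script's directory
```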
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From tim.peters at gmail.com Tue Oct 11 18:08:42 2005 From: tim.peters at gmail.com (Tim Peters) Date: Tue, 11 Oct 2005 12:08:42 -0400 Subject: [Python-Dev] PythonCore\CurrentVersion In-Reply-To: References: <4347A020.2050008@v.loewis.de> <1f7befae0510101542tf65c250ybf93eb7f11c187ae@mail.gmail.com> <200510102109.37690.fdrake@acm.org> <434B58D5.7020206@v.loewis.de> <1f7befae0510110651o504958det5d2409b3f724070e@mail.gmail.com> Message-ID: <1f7befae0510110908g2bada0a7vfd98df080c9af745@mail.gmail.com> [Tim] >> Well, that's in interactive mode, and I see sys.path[0] == "" on both >> Windows and Linux then. I don't see "" in sys.path on either box in >> batch mode, although I do see the absolutized path to the current >> directory in sys.path in batch mode on Windows but not on Linux -- but >> Mark Hammond says he doesn't see (any form of) the current directory >> in sys.path in batch mode on Windows. >> >> It's a bit confusing ;-) [Guido] > How did you test batch mode? I gave full code (it's brief) and screen-scrapes from Windows and Linux yesterday: http://mail.python.org/pipermail/python-dev/2005-October/057162.html By batch mode, I meant invoking path_to_python path_to_python_script.py from a shell prompt. > All: > > sys.path[0] is *not* defined to be the current directory. > > It is defined to be the directory of the script that was used to > invoke python (sys.argv[0], typically). In my runs, sys.argv[0] was the path to the Python executable, not to the script being run. The directory of the script being run was nevertheless in sys.path[0] on both Windows and Linux. On Windows, but not on Linux, the _current_ directory (the directory I happened to be in at the time I invoked Python) was also on sys.path; Mark Hammond said it was not when he tried, but he didn't show exactly what he did so I'm not sure what he saw. > If there is no script, or it is being read from stdin, the default is ''. 
I believe everyone sees that. From guido at python.org Tue Oct 11 18:22:43 2005 From: guido at python.org (Guido van Rossum) Date: Tue, 11 Oct 2005 09:22:43 -0700 Subject: [Python-Dev] PythonCore\CurrentVersion In-Reply-To: <1f7befae0510110908g2bada0a7vfd98df080c9af745@mail.gmail.com> References: <4347A020.2050008@v.loewis.de> <1f7befae0510101542tf65c250ybf93eb7f11c187ae@mail.gmail.com> <200510102109.37690.fdrake@acm.org> <434B58D5.7020206@v.loewis.de> <1f7befae0510110651o504958det5d2409b3f724070e@mail.gmail.com> <1f7befae0510110908g2bada0a7vfd98df080c9af745@mail.gmail.com> Message-ID: On 10/11/05, Tim Peters wrote: > [Tim] > >> Well, that's in interactive mode, and I see sys.path[0] == "" on both > >> Windows and Linux then. I don't see "" in sys.path on either box in > >> batch mode, although I do see the absolutized path to the current > >> directory in sys.path in batch mode on Windows but not on Linux -- but > >> Mark Hammond says he doesn't see (any form of) the current directory > >> in sys.path in batch mode on Windows. > >> > >> It's a bit confusing ;-) > > [Guido] > > How did you test batch mode? > > I gave full code (it's brief) and screen-scrapes from Windows and > Linux yesterday: > > http://mail.python.org/pipermail/python-dev/2005-October/057162.html > > By batch mode, I meant invoking > > path_to_python path_to_python_script.py > > from a shell prompt. > > > All: > > > > sys.path[0] is *not* defined to be the current directory. > > > > It is defined to be the directory of the script that was used to > > invoke python (sys.argv[0], typically). > > In my runs, sys.argv[0] was the path to the Python executable, not to > the script being run. I tried your experiment but added 'print sys.argv[0]' and didn't see that. sys.argv[0] is the path to the script. > The directory of the script being run was > nevertheless in sys.path[0] on both Windows and Linux. 
On Windows, > but not on Linux, the _current_ directory (the directory I happened to > be in at the time I invoked Python) was also on sys.path; Mark Hammond > said it was not when he tried, but he didn't show exactly what he did > so I'm not sure what he saw. I see what you see. The first entry is the script's directory, the 2nd is a nonexistent zip file, the 3rd is the current directory, then the rest is standard library stuff. I suppose PC/getpathp.c puts it there, per your post quoted above? -- --Guido van Rossum (home page: http://www.python.org/~guido/) From steve at holdenweb.com Tue Oct 11 18:22:06 2005 From: steve at holdenweb.com (Steve Holden) Date: Tue, 11 Oct 2005 17:22:06 +0100 Subject: [Python-Dev] Extending tuple unpacking In-Reply-To: <434BD172.3030004@gmail.com> References: <2773CAC687FD5F4689F526998C7E4E5F4DB6B6@au3010avexu1.global.avaya.com> <434B3C01.5030001@ronadam.com> <434B6C3A.7020001@canterbury.ac.nz> <434BD172.3030004@gmail.com> Message-ID: Nick Coghlan wrote: > Greg Ewing wrote: > >>Guido van Rossum wrote: >> >> >> >>>BTW, what should >>> >>> [a, b, *rest] = (1, 2, 3, 4, 5) >>> >>>do? Should it set rest to (3, 4, 5) or to [3, 4, 5]? >> >> >>Whatever type is chosen, it should be the same type, always. >>The rhs could be any iterable, not just a tuple or a list. >>Making a special case of preserving one or two types doesn't >>seem worth it to me. > > > And, for consistency with functions, the type chosen should be a tuple. > > I'm also trying to figure out why you would ever write: > [a, b, c, d] = seq > > instead of: > a, b, c, d = seq > > or: > (a, b, c, d) = seq > [...] > So my vote would actually go for deprecating the use of square brackets to > surround an assignment target list - it makes it look like an actual list > object should be involved somewhere, but there isn't one. 
> But don't forget that at present unpacking can be used at several levels: >>> ((a, b), c) = ((1, 2), 3) >>> a, b, c (1, 2, 3) >>> So presumably you'd need to be able to say ((a, *b), c, *d) = ((1, 2, 3), 4, 5, 6) and see a, b, c, d == 1, (2, 3), 4, (5, 6) if we are to retain today's multi-level consistency. And are you also proposing to allow a, *b = [1] to put the empty list into b, or is that an unpacking error? > >>>? And then perhaps >>> >>> *rest = x >>> >>>should mean >>> >>> rest = tuple(x) >>> >>>Or should that be disallowed >> >>Why bother? What harm would result from the ability to write that? > > > Given that: > def foo(*args): > print args > > is legal, I would have no problem with "*rest = x" being legal. > Though presumably we'd still be raising TypeError is x weren't a sequence. > >>>There certainly is a need for doing the same from the end: >>> >>> *rest, a, b = (1, 2, 3, 4, 5) >> >> >>I wouldn't mind at all if *rest were only allowed at the end. >>There's a pragmatic reason for that if nothing else: the rhs >>can be any iterable, and there's no easy way of getting "all >>but the last n" items from a general iterable. > > > Agreed. The goal here is to make the name binding rules consistent between for > loops, tuple assigment and function entry, not to create different rules. > > >>>Where does it stop? >> >>For me, it stops with *rest only allowed at the end, and >>always yielding a predictable type (which could be either tuple >>or list, I don't care). > > > For me, it stops when the rules for positional name binding are more > consistent across operations that bind names (although complete consistency > isn't possible, given that function calls don't unpack sequences automatically). > Hmm. Given that today we can write >>> def foo((a, b), c): ... print a, b, c ... >>> foo((1, 2, 3)) Traceback (most recent call last): File "", line 1, in ? 
TypeError: foo() takes exactly 2 arguments (1 given) >>> foo((1, 2), 3) 1 2 3 >>> does this mean that you'd also like to be able to write def foo((a, *b), *c): print a, b, c and then call it like foo((1, 2, 3, 4), 5, 6) to see 1, (2, 3, 4), (5, 6) [...] > >>>BTW, and quite unrelated, I've always felt uncomfortable that you have to write >>> >>> f(a, b, foo=1, bar=2, *args, **kwds) >>> >>>I've always wanted to write that as >>> >>> f(a, b, *args, foo=1, bar=2, **kwds) >> >> >>Yes, I'd like that too, with the additional meaning that >>foo and bar can only be specified by keyword, not by >>position. > > > Indeed. It's a (minor) pain that optional flag variables and variable length > argument lists are currently mutually exclusive. Although, if you had that > rule, I'd want to be able to write: > > def f(a, b, *, foo=1, bar=2): pass > > to get a function which required exactly two positional arguments, but had a > couple of optional keyword arguments, rather than having to do: > > def f(a, b, *args, foo=1, bar=2): > if args: > raise TypeError("f() takes exactly 2 positional arguments (%d given)" > % (2 + len(args))) > I do feel that for Python 3 it might be better to make a clean separation between keywords and positionals: in other words, if the function definition specifies a keyword argument then a keyword must be used to present it. This would allow users to provide an arbitrary number of positionals rather than having them become keyword arguments. At present it's difficult to specify that.
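The keyword-only behaviour being discussed can be approximated in current Python by rejecting extra positionals by hand; a sketch (the function name and its arguments are invented for the example):

```python
def f(a, b, *args, **kwds):
    # Emulates the proposed "def f(a, b, *, foo=1, bar=2)": refuse
    # extra positional arguments, then pull the keyword-only
    # arguments out of **kwds manually.
    if args:
        raise TypeError("f() takes exactly 2 positional arguments "
                        "(%d given)" % (2 + len(args)))
    foo = kwds.pop('foo', 1)
    bar = kwds.pop('bar', 2)
    if kwds:
        raise TypeError("f() got unexpected keyword arguments %r"
                        % sorted(kwds))
    return a, b, foo, bar

assert f(1, 2) == (1, 2, 1, 2)
assert f(1, 2, bar=7) == (1, 2, 1, 7)
try:
    f(1, 2, 3)   # a third positional argument must be rejected
except TypeError:
    pass
else:
    raise AssertionError("extra positional argument was accepted")
```

The boilerplate in the body is exactly what the proposed bare-`*` syntax would let the compiler do for you.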
regards Steve -- Steve Holden +44 150 684 7255 +1 800 494 3119 Holden Web LLC www.holdenweb.com PyCon TX 2006 www.python.org/pycon/ From solipsis at pitrou.net Tue Oct 11 18:46:18 2005 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 11 Oct 2005 18:46:18 +0200 Subject: [Python-Dev] Extending tuple unpacking In-Reply-To: References: <2773CAC687FD5F4689F526998C7E4E5F4DB6B6@au3010avexu1.global.avaya.com> <434B3C01.5030001@ronadam.com> <434B6C3A.7020001@canterbury.ac.nz> <434BD172.3030004@gmail.com> Message-ID: <1129049178.6162.5.camel@fsol> (my own 2 eurocents) > I do feel that for Python 3 it might be better to make a clean > separation between keywords and positionals: in other words, if the > function definition specifies a keyword argument then a keyword must be > used to present it. Do you mean it would also be forbidden to invoke a "positional" argument using its keyword? It would be a huge step back in usability IMO. Some people like invoking by position (because it's shorter) and some others prefer invoking by keyword (because it's more explicit). Why should the implementer of the API have to make a choice for the user of the API? From BruceEckel-Python3234 at mailblocks.com Tue Oct 11 18:53:02 2005 From: BruceEckel-Python3234 at mailblocks.com (Bruce Eckel) Date: Tue, 11 Oct 2005 10:53:02 -0600 Subject: [Python-Dev] Pythonic concurrency In-Reply-To: <05Oct10.180605pdt."58617"@synergy1.parc.xerox.com> References: Your message of "Mon, 10 Oct 2005 17:18:15 PDT." <05Oct10.180605pdt."58617"@synergy1.parc.xerox.com> Message-ID: <638942434.20051011105302@MailBlocks.com> > Java's condition variables don't (didn't? has this been fixed?) quite > work. The emphasis on portability and the resulting notions of > red/green threading packages at the beginning didn't help either. > Read Allen Holub's book. And Doug Lea's book. I understand much of > this has been addressed with a new package in Java 1.5.
Not only are there significant new library components in java.util.concurrent in J2SE5, but perhaps more important is the new memory model that deals with issues that are (especially) revealed in multiprocessor environments. The new memory model represents new work in the computer science field; apparently the original paper is written by Ph.D.s and is a bit too theoretical for the normal person to follow. But the smart threading guys studied this and came up with the new Java memory model -- so that volatile, for example, which didn't work quite right before, does now. This is part of J2SE5, and this work is being incorporated into the upcoming C++0x. Java concurrency is certainly one of the bad examples of language design. Apparently, they grabbed stuff from C++ (mostly the volatile keyword) and combined it with what they knew about pthreads, and decided that being able to declare a method as synchronized made the whole thing object-oriented. But you can see how ill-thought-out the design was because in later versions of Java some fundamental methods: stop(), suspend(), resume() and destroy(), were deprecated because ... oops, we didn't really think those out very well. And then finally, with J2SE5, it *appears* that all the kinks have been fixed, but only with some really smart folks like Doug Lea, Brian Goetz, and that gang, working long and hard on all these issues and (we hope) figuring them all out. I think threading *can* be much simpler, and I *want* it to be that way in Python. But that can only happen if the right model is chosen, and that model is not pthreads. People migrate to pthreads if they already understand it and so it might seem "simple" to them because of that. But I think we need something that supports an object-oriented approach to concurrency that doesn't prevent beginners from using it safely.
Bruce Eckel From jcarlson at uci.edu Tue Oct 11 20:07:24 2005 From: jcarlson at uci.edu (Josiah Carlson) Date: Tue, 11 Oct 2005 11:07:24 -0700 Subject: [Python-Dev] Making Queue.Queue easier to use In-Reply-To: References: <434B8DBF.9080509@iinet.net.au> Message-ID: <20051011105128.28E0.JCARLSON@uci.edu> Guido van Rossum wrote: > > Optionally, the existing "put" and "get" methods could be deprecated, with the > > goal of eventually changing their signature to match the put_wait and get_wait > > methods above. > > Apart from trying to guess the API without reading the docs (:-), what > are the use cases for using put/get with a timeout? I have a feeling > it's not that common. With timeout=0, a shared connection/resource pool (perhaps DB, etc., I use one in the tuple space implementation I have for connections to the tuple space). Note that technically speaking, Queue.Queue from Pythons prior to 2.4 is broken: get_nowait() may not get an object even if the Queue is full, this is caused by "elif not self.esema.acquire(0):" being called for non-blocking requests. Tim did more than simplify the structure by rewriting it, he fixed this bug. With block=True, timeout=None, worker threads pulling from a work-to-do queue, and even a thread which handles the output of those threads via a result queue. - Josiah From steven.bethard at gmail.com Tue Oct 11 20:09:01 2005 From: steven.bethard at gmail.com (Steven Bethard) Date: Tue, 11 Oct 2005 12:09:01 -0600 Subject: [Python-Dev] Extending tuple unpacking In-Reply-To: <434BD172.3030004@gmail.com> References: <2773CAC687FD5F4689F526998C7E4E5F4DB6B6@au3010avexu1.global.avaya.com> <434B3C01.5030001@ronadam.com> <434B6C3A.7020001@canterbury.ac.nz> <434BD172.3030004@gmail.com> Message-ID: Nick Coghlan wrote: > So my vote would actually go for deprecating the use of square brackets to > surround an assignment target list - it makes it look like an actual list > object should be involved somewhere, but there isn't one. 
I've found myself using square brackets a few times for more complicated unpacking, e.g.: try: x, y = args except ValueError: [x], y = args, None where I thought that (x,), y = args, None would have been more confusing. OTOH, I usually end up rewriting this to x, = args y = None because even the bracketed form is a bit confusing. So I wouldn't really be upset if the brackets went away. STeVe -- You can wordify anything if you just verb it. --- Bucky Katt, Get Fuzzy From john.m.camara at comcast.net Tue Oct 11 20:09:25 2005 From: john.m.camara at comcast.net (john.m.camara@comcast.net) Date: Tue, 11 Oct 2005 18:09:25 +0000 Subject: [Python-Dev] Python-Dev Digest, Vol 27, Issue 44 Message-ID: <101120051809.17864.434BFFD10007FB43000045C822007358340E9D0E030E0CD203D202080106@comcast.net> > Date: Tue, 11 Oct 2005 09:51:06 -0400 > From: Tim Peters > Subject: Re: [Python-Dev] PythonCore\CurrentVersion > To: Martin v. L?wis > Cc: python-dev at python.org > Message-ID: > <1f7befae0510110651o504958det5d2409b3f724070e at mail.gmail.com> > Content-Type: text/plain; charset=ISO-8859-1 > > [Tim Peters] > >>> never before this year -- maybe sys.path _used_ to contain the current > >>> directory on Linux?). > > [Fred L. Drake, Jr.] > >> It's been a long time since this was the case on Unix of any variety; I > >> *think* this changed to the current state back before 2.0. > > [Martin v. L?wis] > > Please check again: > > > > [GCC 4.0.2 20050821 (prerelease) (Debian 4.0.1-6)] on linux2 > > Type "help", "copyright", "credits" or "license" for more information. 
> > >>> import sys > > >>> sys.path > > ['', '/usr/lib/python23.zip', '/usr/lib/python2.3', > > '/usr/lib/python2.3/plat-linux2', '/usr/lib/python2.3/lib-tk', > > '/usr/lib/python2.3/lib-dynload', > > '/usr/local/lib/python2.3/site-packages', > > '/usr/lib/python2.3/site-packages', > > '/usr/lib/python2.3/site-packages/Numeric', > > '/usr/lib/python2.3/site-packages/gtk-2.0', '/usr/lib/site-python'] > > > > We still have the empty string in sys.path, and it still > > denotes the current directory. > > Well, that's in interactive mode, and I see sys.path[0] == "" on both > Windows and Linux then. I don't see "" in sys.path on either box in > batch mode, although I do see the absolutized path to the current > directory in sys.path in batch mode on Windows but not on Linux -- but > Mark Hammond says he doesn't see (any form of) the current directory > in sys.path in batch mode on Windows. > > It's a bit confusing ;-) > Been bit by this in the past. On windows, it's a relative path in interactive mode and absolute path in non-interactive mode. -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20051011/735447ed/attachment-0001.html From jcarlson at uci.edu Tue Oct 11 20:26:42 2005 From: jcarlson at uci.edu (Josiah Carlson) Date: Tue, 11 Oct 2005 11:26:42 -0700 Subject: [Python-Dev] Pythonic concurrency In-Reply-To: References: Message-ID: <20051011104737.28DD.JCARLSON@uci.edu> "Robert Brewer" wrote: > "Somewhat alleviated" and somewhat worsened. I've had half a dozen > conversations in the last year about sharing data between threads; in > every case, I've had to work quite hard to convince the other person > that threading.local is *not* magic pixie thread dust. Each time, they > had come to the conclusion that if they had a global variable, they > could just stick a reference to it into a threading.local object and > instantly have safe, concurrent access to it. 
*boggles* Perhaps there should be an entry in the documentation about this. Here is a proposed modification. Despite desires and assumptions to the contrary, threading.local is not magic. Placing references to global shared objects into threading.local will not make them magically threadsafe. Only by using threadsafe shared objects (by design with Queue.Queue, or by desire with lock.acquire()/release() placed around object accesses) will you have the potential for doing safe things. - Josiah From tim.peters at gmail.com Tue Oct 11 20:35:52 2005 From: tim.peters at gmail.com (Tim Peters) Date: Tue, 11 Oct 2005 14:35:52 -0400 Subject: [Python-Dev] Making Queue.Queue easier to use In-Reply-To: <20051011105128.28E0.JCARLSON@uci.edu> References: <434B8DBF.9080509@iinet.net.au> <20051011105128.28E0.JCARLSON@uci.edu> Message-ID: <1f7befae0510111135n48591bcawb39d7cae0698d9ff@mail.gmail.com> [Guido] >> Apart from trying to guess the API without reading the docs (:-), what >> are the use cases for using put/get with a timeout? I have a feeling >> it's not that common. [Josiah Carlson] > With timeout=0, a shared connection/resource pool (perhaps DB, etc., I > use one in the tuple space implementation I have for connections to the > tuple space). Passing timeout=0 is goofy: use {get,put}_nowait() instead. There's no difference in semantics. > Note that technically speaking, Queue.Queue from Pythons > prior to 2.4 is broken: get_nowait() may not get an object even if the > Queue is full, this is caused by "elif not self.esema.acquire(0):" being > called for non-blocking requests. Tim did more than simplify the > structure by rewriting it, he fixed this bug. I don't agree it was a bug, but I did get fatally weary of arguing with people who insisted it was ;-) It's certainly easier to explain (and the code is easier to read) now. 
> With block=True, timeout=None, worker threads pulling from a work-to-do > queue, and even a thread which handles the output of those threads via > a result queue. Guido understands use cases for blocking and non-blocking put/get, and Queue always supported those possibilities. The timeout argument got added later, and it's not really clear _why_ it was added. timeout=0 isn't a sane use case (because the same effect can be gotten with non-blocking put/get). From guido at python.org Tue Oct 11 20:45:28 2005 From: guido at python.org (Guido van Rossum) Date: Tue, 11 Oct 2005 11:45:28 -0700 Subject: [Python-Dev] Making Queue.Queue easier to use In-Reply-To: <1f7befae0510111135n48591bcawb39d7cae0698d9ff@mail.gmail.com> References: <434B8DBF.9080509@iinet.net.au> <20051011105128.28E0.JCARLSON@uci.edu> <1f7befae0510111135n48591bcawb39d7cae0698d9ff@mail.gmail.com> Message-ID: On 10/11/05, Tim Peters wrote: > Guido understands use cases for blocking and non-blocking put/get, and > Queue always supported those possibilities. The timeout argument got > added later, and it's not really clear _why_ it was added. timeout=0 > isn't a sane use case (because the same effect can be gotten with > non-blocking put/get). In the socket world, a similar bifurcation of the API has happened (also under my supervision, even though the idea and prototype code were contributed by others). The API there is very different because the blocking or timeout is an attribute of the socket, not passed in to every call. But one lesson we can learn from sockets (or perhaps the reason why people kept asking for timeout=0 to be "fixed" :) is that timeout=0 is just a different way to spell blocking=False. The socket module makes sure that the socket ends up in exactly the same state no matter which API is used; and in fact the setblocking() API is redundant. 
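The equivalence between the two spellings is easy to verify directly; a minimal check (no data is transferred, and the sockets are closed immediately):

```python
import socket

s1 = socket.socket()
s1.setblocking(False)    # one spelling of "non-blocking"

s2 = socket.socket()
s2.settimeout(0.0)       # the other spelling

# Both calls leave the socket in the identical non-blocking state:
# gettimeout() reports 0.0 either way.
assert s1.gettimeout() == s2.gettimeout() == 0.0

s1.close()
s2.close()
```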
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From reinhold-birkenfeld-nospam at wolke7.net Tue Oct 11 20:43:59 2005 From: reinhold-birkenfeld-nospam at wolke7.net (Reinhold Birkenfeld) Date: Tue, 11 Oct 2005 20:43:59 +0200 Subject: [Python-Dev] Extending tuple unpacking In-Reply-To: <434B6C3A.7020001@canterbury.ac.nz> References: <2773CAC687FD5F4689F526998C7E4E5F4DB6B6@au3010avexu1.global.avaya.com> <434B3C01.5030001@ronadam.com> <434B6C3A.7020001@canterbury.ac.nz> Message-ID: Greg Ewing wrote: > Guido van Rossum wrote: > >> BTW, what should >> >> [a, b, *rest] = (1, 2, 3, 4, 5) >> >> do? Should it set rest to (3, 4, 5) or to [3, 4, 5]? > > Whatever type is chosen, it should be the same type, always. > The rhs could be any iterable, not just a tuple or a list. > Making a special case of preserving one or two types doesn't > seem worth it to me. I don't think that [a, b, c] = iterable is good style right now, so I'd say that [a, b, *rest] = iterable should be disallowed or be the same as with parentheses. It's not intuitive that rest could be a list here. >> ? And then perhaps >> >> *rest = x >> >> should mean >> >> rest = tuple(x) >> >> Or should that be disallowed > > Why bother? What harm would result from the ability to write that? > >> There certainly is a need for doing the same from the end: >> >> *rest, a, b = (1, 2, 3, 4, 5) > > I wouldn't mind at all if *rest were only allowed at the end. > There's a pragmatic reason for that if nothing else: the rhs > can be any iterable, and there's no easy way of getting "all > but the last n" items from a general iterable. > >> Where does it stop? > > For me, it stops with *rest only allowed at the end, and > always yielding a predictable type (which could be either tuple > or list, I don't care). +1. Tuple is more consistent. 
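Greg's observation that there is no easy way to take "all but the last n" items from a general iterable can be worked around with a small rolling buffer. A sketch of what '*rest, a, b = iterable' would have to do internally if * were allowed anywhere but the end (the helper name split_tail is invented):

```python
from collections import deque

def split_tail(iterable, n):
    """Return (head, last_n) for any iterable, buffering only n items."""
    head = []
    buf = deque()
    for item in iterable:
        buf.append(item)
        if len(buf) > n:
            # the oldest buffered item is now guaranteed
            # not to be among the last n
            head.append(buf.popleft())
    if len(buf) < n:
        raise ValueError("need at least %d items" % n)
    return head, tuple(buf)

head, (a, b) = split_tail(iter(range(5)), 2)   # works on any iterable
assert head == [0, 1, 2]
assert (a, b) == (3, 4)
```

Note that the whole head still has to be accumulated before the last n items are known, which is exactly the pragmatic cost Greg alludes to.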
>> BTW, and quite unrelated, I've always felt uncomfortable that you have to write >> >> f(a, b, foo=1, bar=2, *args, **kwds) >> >> I've always wanted to write that as >> >> f(a, b, *args, foo=1, bar=2, **kwds) > > Yes, I'd like that too, with the additional meaning that > foo and bar can only be specified by keyword, not by > position. That would be a logical consequence. But one should also be able to give default values for positional parameters. So: foo(a, b, c=1, *args, d=2, e=5, **kwargs) ^^^^^^^^^ ^^^^^^^^ positional only with kw or with kw Reinhold -- Mail address is perfectly valid! From tim.peters at gmail.com Tue Oct 11 21:09:02 2005 From: tim.peters at gmail.com (Tim Peters) Date: Tue, 11 Oct 2005 15:09:02 -0400 Subject: [Python-Dev] PythonCore\CurrentVersion In-Reply-To: References: <4347A020.2050008@v.loewis.de> <1f7befae0510101542tf65c250ybf93eb7f11c187ae@mail.gmail.com> <200510102109.37690.fdrake@acm.org> <434B58D5.7020206@v.loewis.de> <1f7befae0510110651o504958det5d2409b3f724070e@mail.gmail.com> <1f7befae0510110908g2bada0a7vfd98df080c9af745@mail.gmail.com> Message-ID: <1f7befae0510111209j40975a17w98bec7350d6c4874@mail.gmail.com> [Guido] > I tried your experiment but added 'print sys.argv[0]' and didn't see > that. sys.argv[0] is the path to the script. My mistake! You're right, sys.argv[0] is the path to the script for me too. [Tim] >> The directory of the script being run was >> nevertheless in sys.path[0] on both Windows and Linux. On Windows, >> but not on Linux, the _current_ directory (the directory I happened to >> be in at the time I invoked Python) was also on sys.path; Mark Hammond >> said it was not when he tried, but he didn't show exactly what he did >> so I'm not sure what he saw. [Guido] > I see what you see. The first entry is the script's directory, the > 2nd is a nonexistent zip file, the 3rd is the current directory, then > the rest is standard library stuff. So why doesn't Mark see that? 
I'll ask him ;-) > I suppose PC/getpathp.c puts it there, per your post quoted above? I don't think it does (although I understand why it's sane to believe that it must). Curiously, I do _not_ see the current directory on sys.path on Windows if I run from current CVS HEAD. I do see it running Pythons 2.2.3, 2.3.5 and 2.4.2. PC/getpathp.c doesn't appear to have changed in a relevant way.

blor.py:
"""
import sys
from pprint import pprint
print sys.version_info
pprint(sys.path)
"""

C:\>\code\python\PCbuild\python.exe code\blor.py    # C:\ not in sys.path
(2, 5, 0, 'alpha', 0)
['C:\\code',
 'C:\\code\\python\\PCbuild\\python25.zip',
 'C:\\code\\python\\DLLs',
 'C:\\code\\python\\lib',
 'C:\\code\\python\\lib\\plat-win',
 'C:\\code\\python\\lib\\lib-tk',
 'C:\\code\\python\\PCbuild',
 'C:\\code\\python',
 'C:\\code\\python\\lib\\site-packages']

C:\>\python24\python.exe code\blor.py    # C:\ in sys.path
(2, 4, 2, 'final', 0)
['C:\\code',
 'C:\\python24\\python24.zip',
 'C:\\',
 'C:\\python24\\DLLs',
 'C:\\python24\\lib',
 'C:\\python24\\lib\\plat-win',
 'C:\\python24\\lib\\lib-tk',
 'C:\\python24',
 'C:\\python24\\lib\\site-packages',
 'C:\\python24\\lib\\site-packages\\PIL',
 'C:\\python24\\lib\\site-packages\\win32',
 'C:\\python24\\lib\\site-packages\\win32\\lib',
 'C:\\python24\\lib\\site-packages\\Pythonwin']

From jason.orendorff at gmail.com Tue Oct 11 22:15:03 2005 From: jason.orendorff at gmail.com (Jason Orendorff) Date: Tue, 11 Oct 2005 16:15:03 -0400 Subject: [Python-Dev] Proposed changes to PEP 343 In-Reply-To: References: <4346467D.5010005@iinet.net.au> <43466C3B.50704@gmail.com> Message-ID: On 10/7/05, Fredrik Lundh wrote: > the whole concept might be perfectly fine on the "this construct corre- > sponds to this code" level, but if you immediately end up with things that > are not what they seem, and names that don't mean what they say, either > the design or the description of it needs work.
> > ("yes, I know you can use this class to manage the context, but it's not > really a context manager, because it's that method that's a manager, not > the class itself. yes, all the information that belongs to the context are > managed by the class, but that doesn't make... oh, shut up and read the > PEP") Good points... Maybe it is the description that needs work. Here is a description of iterators, to illustrate the parallels: An object that has an __iter__ method is iterable. It can plug into the Python 'for' statement. obj.__iter__() returns an iterator. An iterator is a single-use, forward-only view of a sequence. 'for' calls __iter__() and uses the resulting iterator's next() method. (This is just as complicated as PEP343+changes, but not as mindboggling, because the terminology is better. Also because we're used to iterators.) Now contexts, per PEP 343 with Nick's proposed changes: An object that has a __with__ method is a context. It can plug into the Python 'with' statement. obj.__with__() returns a context manager. A context manager is a single-use object that manages a single visit into a context. 'with' calls __with__() and uses the resulting context manager's __enter__() and __exit__() methods. A contextmanager is a function that returns a new context manager. Okay, that last bit is weird. But note that PEP 343 has this oddness even without the proposed changes. Perhaps either "context manager" or contextmanager should be renamed, regardless of whether Nick's changes are accepted. With the changes, context managers will be (conceptually) single-use. So maybe a different term might be appropriate. Perhaps "ticket". "A ticket is a single-use object that manages a single visit into a context." 
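A runnable sketch of the parallel Jason draws, with the protocol driven by hand the way a 'with' statement would drive it. The class names (Room, Visit) are invented, and __with__ here follows the proposal under discussion, not necessarily the final protocol:

```python
class Visit(object):
    """A single-use object that manages one visit into a context,
    analogous to the single-use iterator handed out by __iter__()."""
    def __init__(self, log):
        self.log = log
    def __enter__(self):
        self.log.append('enter')
        return self
    def __exit__(self, exc_type, exc_val, exc_tb):
        self.log.append('exit')
        return False   # do not swallow exceptions

class Room(object):
    """The reusable 'context': each __with__() call hands out a fresh
    single-use manager, just as each __iter__() call hands out a
    fresh iterator."""
    def __init__(self):
        self.log = []
    def __with__(self):
        return Visit(self.log)

# Drive the protocol manually, the way 'with' would:
room = Room()
mgr = room.__with__()
mgr.__enter__()
try:
    room.log.append('body')
finally:
    mgr.__exit__(None, None, None)

assert room.log == ['enter', 'body', 'exit']
```

Spelling it out this way makes the "__with__ : manager :: __iter__ : iterator" analogy concrete.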
-j From rrr at ronadam.com Tue Oct 11 22:41:01 2005 From: rrr at ronadam.com (Ron Adam) Date: Tue, 11 Oct 2005 16:41:01 -0400 Subject: [Python-Dev] Extending tuple unpacking In-Reply-To: References: <2773CAC687FD5F4689F526998C7E4E5F4DB6B6@au3010avexu1.global.avaya.com> <434B3C01.5030001@ronadam.com> <434B6C3A.7020001@canterbury.ac.nz> Message-ID: <434C235D.1060404@ronadam.com> Reinhold Birkenfeld wrote: > Greg Ewing wrote: > >>Guido van Rossum wrote: >> >> >>>BTW, what should >>> >>> [a, b, *rest] = (1, 2, 3, 4, 5) >>> >>>do? Should it set rest to (3, 4, 5) or to [3, 4, 5]? >> >>Whatever type is chosen, it should be the same type, always. >>The rhs could be any iterable, not just a tuple or a list. >>Making a special case of preserving one or two types doesn't >>seem worth it to me. > > > I don't think that > > [a, b, c] = iterable > > is good style right now, so I'd say that > > [a, b, *rest] = iterable > > should be disallowed or be the same as with parentheses. It's not > intuitive that rest could be a list here. I wonder if something like the following would fulfill the need? This divides a sequence at given indexes by using a divider iterator on it.

class xlist(list):
    def div_at(self, *args):
        """ return a divided sequence """
        return [x for x in self.div_iter(*args)]

    def div_iter(self, *args):
        """ return a sequence divider-iter """
        s = None
        for n in args:
            yield self[s:n]
            s = n
        yield self[n:]

seq = xlist(range(10))

(a,b),rest = seq.div_at(2)
print a,b,rest              # 0 1 [2, 3, 4, 5, 6, 7, 8, 9]

(a,b),c,(d,e),rest = seq.div_at(2,4,6)
print seq.div_at(2,4,6)     # [[0, 1], [2, 3], [4, 5], [6, 7, 8, 9]]
print a,b,c,d,e,rest        # 0 1 [2, 3] 4 5 [6, 7, 8, 9]

This addresses the issue of repeating the name of the iterable.
Cheers, Ron From jcarlson at uci.edu Wed Oct 12 02:41:06 2005 From: jcarlson at uci.edu (Josiah Carlson) Date: Tue, 11 Oct 2005 17:41:06 -0700 Subject: [Python-Dev] Making Queue.Queue easier to use In-Reply-To: <1f7befae0510111135n48591bcawb39d7cae0698d9ff@mail.gmail.com> References: <20051011105128.28E0.JCARLSON@uci.edu> <1f7befae0510111135n48591bcawb39d7cae0698d9ff@mail.gmail.com> Message-ID: <20051011165924.28FC.JCARLSON@uci.edu> [Guido] > >> Apart from trying to guess the API without reading the docs (:-), what > >> are the use cases for using put/get with a timeout? I have a feeling > >> it's not that common. [Josiah Carlson] > > With timeout=0, a shared connection/resource pool (perhaps DB, etc., I > > use one in the tuple space implementation I have for connections to the > > tuple space). [Tim Peters] > Passing timeout=0 is goofy: use {get,put}_nowait() instead. There's > no difference in semantics. I understand this, as do many others who use it. However, having both manually and automatically tuned timeouts myself in certain applications, the timeout=0 case is useful. Uncommon? Likely, I've not yet seen any examples of anyone using this particular timeout method at koders.com . > > Note that technically speaking, Queue.Queue from Pythons > > prior to 2.4 is broken: get_nowait() may not get an object even if the > > Queue is full, this is caused by "elif not self.esema.acquire(0):" being > > called for non-blocking requests. Tim did more than simplify the > > structure by rewriting it, he fixed this bug. > > I don't agree it was a bug, but I did get fatally weary of arguing > with people who insisted it was ;-) It's certainly easier to explain > (and the code is easier to read) now. When getting an object from a non-empty queue fails because some other thread already had the lock, and it is a fair assumption that the other thread will release the lock within the next context switch... 
Because I still develop on Python 2.3 (I need to support a commercial codebase made with 2.3), I was working around it by using the timeout parameter:

try:
    connection = connection_queue.get(timeout=.000001)
except Queue.Empty:
    connection = make_new_connection()

With only get_nowait() calls, by the time I hit 3-4 threads, it was failing to pick up connections even when there were hundreds in the queue, and I quickly ran into the file handle limit for my platform, not to mention that the server I was connecting to used asynchronous sockets and select, which died at the 513th incoming socket. I have since copied the implementation of 2.4's queue into certain portions of code which make use of get_nowait() and its variants (handling the deque reference as necessary). Any time one needs to work around a "not buggy feature" with some claimed "unnecessary feature", it tends to smell less than pristine to my nose. > > With block=True, timeout=None, worker threads pulling from a work-to-do > queue, and even a thread which handles the output of those threads via > a result queue. > > Guido understands use cases for blocking and non-blocking put/get, and > Queue always supported those possibilities. The timeout argument got > added later, and it's not really clear _why_ it was added. timeout=0 > isn't a sane use case (because the same effect can be gotten with > non-blocking put/get).

def t():
    try:
        #thread state setup...
        while not QUIT:
            try:
                work = q.get(timeout=5)
            except Queue.Empty:
                continue
            #handle work
    finally:
        #thread state cleanup...

Could the above be daemonized? Certainly, but then the thread state wouldn't be cleaned up. If you can provide me with a way of doing the above with equivalent behavior, using only get_nowait() and get(), then put it in the documentation. If not, then I'd say that the timeout argument is a necessarily useful feature.
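That loop can be turned into a self-contained demonstration of why the timeout matters: it lets a blocked worker wake periodically to notice a quit flag, which neither a bare blocking get() nor a spinning get_nowait() loop does cleanly. All names here are invented, and the timeout is shrunk so the demo finishes quickly (written against the Python 3 spelling of the module, which was called Queue at the time of this thread):

```python
import queue
import threading
import time

def run_demo():
    q = queue.Queue()
    quit_flag = threading.Event()
    results = []

    def worker():
        # get(timeout=...) blocks efficiently, but still wakes up
        # periodically so the thread can notice quit_flag and run
        # its cleanup path instead of sleeping forever.
        while not quit_flag.is_set():
            try:
                work = q.get(timeout=0.05)
            except queue.Empty:
                continue
            results.append(work * 2)

    t = threading.Thread(target=worker)
    t.start()
    for item in (1, 2, 3):
        q.put(item)

    # Wait (bounded) for the worker to drain the queue, then shut down.
    deadline = time.time() + 5.0
    while len(results) < 3 and time.time() < deadline:
        time.sleep(0.01)
    quit_flag.set()
    t.join()
    return results

assert sorted(run_demo()) == [2, 4, 6]
```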
[Guido] > But one lesson we can learn from sockets (or perhaps the reason why > people kept asking for timeout=0 to be "fixed" :) is that timeout=0 is > just a different way to spell blocking=False. The socket module makes > sure that the socket ends up in exactly the same state no matter which > API is used; and in fact the setblocking() API is redundant. This would suggest to me that at least for sockets, setblocking() could be deprecated, as could the block parameter in Queue. I wouldn't vote for either deprecation, but it would seem to make more sense than to remove the timeout arguments from both. - Josiah From greg.ewing at canterbury.ac.nz Wed Oct 12 03:26:58 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 12 Oct 2005 14:26:58 +1300 Subject: [Python-Dev] Autoloading? (Making Queue.Queue easier to use) In-Reply-To: References: <434B8DBF.9080509@iinet.net.au> Message-ID: <434C6662.4040503@canterbury.ac.nz> Guido van Rossum wrote: > I see no need. Code that *doesn't* need Queue but does use threading > shouldn't have to pay for loading Queue.py. However, it does seem awkward to have a whole module providing just one small class that logically is so closely related to other threading facilities. What we want in this kind of situation is some sort of autoloading mechanism, so you can import something from a module and have it trigger the loading of another module behind the scenes to provide it. Another place I'd like this is in my PyGUI library, where I want all the commonly-used class names to appear in the top-level package, but ideally not import the code to implement them until they're actually used. There are various ways of hacking up such functionality today, but it would be nice if there were some kind of language or library support for it. Maybe something like a descriptor mechanism for lookups in module namespaces. 
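One way to hack up the autoloading Greg describes is a module-like proxy whose attribute access triggers the deferred import. This is a sketch only; the class name and the attribute-to-module mapping convention are invented for illustration:

```python
import importlib
import types

class LazyModule(types.ModuleType):
    """Module-like object that performs a deferred import the first
    time one of the registered attributes is touched."""
    def __init__(self, name, lazy_map):
        types.ModuleType.__init__(self, name)
        self._lazy_map = lazy_map   # attribute name -> module name
    def __getattr__(self, attr):
        # Only called when normal lookup fails, i.e. before the
        # attribute has been loaded and cached.
        try:
            modname = self._lazy_map[attr]
        except KeyError:
            raise AttributeError(attr)
        value = getattr(importlib.import_module(modname), attr)
        setattr(self, attr, value)  # cache: later lookups bypass __getattr__
        return value

toolkit = LazyModule('toolkit', {'Queue': 'queue'})
import queue
assert toolkit.Queue is queue.Queue   # resolved on first access
assert 'Queue' in toolkit.__dict__    # and cached afterwards
```

Nothing is imported until the attribute is first used, which is the pay-for-what-you-touch behaviour being asked for.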
-- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg.ewing at canterbury.ac.nz +--------------------------------------+ From greg.ewing at canterbury.ac.nz Wed Oct 12 04:10:27 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 12 Oct 2005 15:10:27 +1300 Subject: [Python-Dev] Proposed changes to PEP 343 In-Reply-To: References: <4346467D.5010005@iinet.net.au> <43466C3B.50704@gmail.com> Message-ID: <434C7093.9070802@canterbury.ac.nz> Jason Orendorff wrote: > A contextmanager is a function that returns a new context manager. > > Okay, that last bit is weird. If the name of the decorator is to be 'contextmanager', it really needs to say something like The contextmanager decorator turns a generator into a function that returns a context manager. So maybe the decorator should be called 'contextmanagergenerator'. Or perhaps not, since that's getting rather too much of an eyeful to parse... -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. 
| greg.ewing at canterbury.ac.nz +--------------------------------------+ From greg.ewing at canterbury.ac.nz Wed Oct 12 04:15:51 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 12 Oct 2005 15:15:51 +1300 Subject: [Python-Dev] Fwd: defaultproperty In-Reply-To: <434B8060.6070903@gmail.com> References: <433AA5AC.6040509@zope.com> <433BA3CF.1090205@zope.com> <43494648.6040904@zope.com> <4349997E.9010208@gmail.com> <76fd5acf0510092246q1861403flda2dd49ddbe02d6c@mail.gmail.com> <76fd5acf0510092247m4d8383e8o40260de7950906@mail.gmail.com> <1128955292.27841.2.camel@geddy.wooz.org> <434B8060.6070903@gmail.com> Message-ID: <434C71D7.5090703@canterbury.ac.nz> Nick Coghlan wrote: > As a location for this, I would actually suggest a module called something > like "metatools", -1, too vague and meaningless a name. If "decorator" is the official term for this kind of function, then calling the module "decorators" is precise and helpful. Other kinds of meta-level tools should go in their own suitably-named modules. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. 
| greg.ewing at canterbury.ac.nz +--------------------------------------+ From greg.ewing at canterbury.ac.nz Wed Oct 12 04:16:16 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 12 Oct 2005 15:16:16 +1300 Subject: [Python-Dev] Extending tuple unpacking In-Reply-To: <434BD172.3030004@gmail.com> References: <2773CAC687FD5F4689F526998C7E4E5F4DB6B6@au3010avexu1.global.avaya.com> <434B3C01.5030001@ronadam.com> <434B6C3A.7020001@canterbury.ac.nz> <434BD172.3030004@gmail.com> Message-ID: <434C71F0.9000609@canterbury.ac.nz> Nick Coghlan wrote: > I'm also trying to figure out why you would ever write: > [a, b, c, d] = seq I think the ability to use square brackets is a holdover from some ancient Python version where you had to match the type of the thing being unpacked with the appropriate syntax on the lhs. It was a silly requirement from the beginning, and it became unworkable as soon as things other than lists and tuples could be unpacked. In Py3k I expect that [...] for unpacking will no longer be allowed. > Indeed. It's a (minor) pain that optional flag variables and variable length > argument lists are currently mutually exclusive. Although, if you had that > rule, I'd want to be able to write: > > def f(a, b, *, foo=1, bar=2): pass Yes, I agree. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. 
| greg.ewing at canterbury.ac.nz +--------------------------------------+ From greg.ewing at canterbury.ac.nz Wed Oct 12 04:17:02 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 12 Oct 2005 15:17:02 +1300 Subject: [Python-Dev] Extending tuple unpacking In-Reply-To: References: <2773CAC687FD5F4689F526998C7E4E5F4DB6B6@au3010avexu1.global.avaya.com> <434B3C01.5030001@ronadam.com> <434B6C3A.7020001@canterbury.ac.nz> <434BD172.3030004@gmail.com> Message-ID: <434C721E.6090802@canterbury.ac.nz> Steve Holden wrote: > So presumably you'd need to be able to say > > ((a, *b), c, *d) = ((1, 2, 3), 4, 5, 6) Yes. > a, *b = [1] > > to put the empty list into b, or is that an unpacking error? Empty sequence in b (of whatever type is chosen). > does this mean that you'd also like to be able to write > > def foo((a, *b), *c): That would follow, yes. > I do feel that for Python 3 it might be better to make a clean > separation between keywords and positionals: in other words, if the > function definition specifies a keyword argument then a keyword must be > used to present it. But then how would you give a positional arg a default value without turning it into a keyword arg? It seems to me that the suggested extension covers all the possibilities quite nicely. You can have named positional args with or without default values, optional extra positional args with *, named keyword-only args with or without default values, and unnamed extra keyword-only args with **. The only thing it doesn't give you directly is mandatory positional-only args, and you can get that by catching them with * and unpacking them afterwards. This would actually synergise nicely with * in tuple unpacking: def f(*args): a, b, *rest = args And with one further small extension, you could even get that into the argument list as well: def f(*(a, b, *rest)): ...
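The semantics being debated here eventually landed in Python 3 as PEP 3132, with one notable difference from some of the guesses in this thread: the starred target always collects the leftovers as a list, not a tuple. A short sketch (anachronistic relative to this 2005 discussion, since it requires Python 3):

```python
# Extended iterable unpacking as it later landed in Python 3 (PEP 3132).
# The starred target collects leftovers as a *list*, not a tuple.

a, *b = [1]
assert (a, b) == (1, [])            # empty leftovers, not an error

# Multi-level unpacking combines with the star at each level:
(x, *y), z, *rest = ((1, 2, 3), 4, 5, 6)
assert (x, y, z, rest) == (1, [2, 3], 4, [5, 6])

# The star can also appear in the middle of the target list:
first, *middle, last = range(5)
assert (first, middle, last) == (0, [1, 2, 3], 4)
```

Greg's `def f(*(a, b, *rest))` form, by contrast, never made it into the language; Python 3 removed tuple parameters entirely (PEP 3113).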
-- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg.ewing at canterbury.ac.nz +--------------------------------------+ From jspoerri+dated+1129516162.13b96a at earthlink.net Wed Oct 12 04:35:17 2005 From: jspoerri+dated+1129516162.13b96a at earthlink.net (Joshua Spoerri) Date: Wed, 12 Oct 2005 02:35:17 +0000 (UTC) Subject: [Python-Dev] Pythonic concurrency References: <2mll1ghsjc.fsf@starship.python.net> <397621172.20050927111836@MailBlocks.com> <433AE8A8.3010500@v.loewis.de> <329633301.20050929074337@MailBlocks.com> <2mll1ghsjc.fsf@starship.python.net> <5.1.1.6.0.20050929121236.0399ed88@mail.telecommunity.com> <160502469.20050929104837@MailBlocks.com> Message-ID: that stm paper isn't the end. there's a java implementation which seems to be exactly what we want: http://research.microsoft.com/~tharris/papers/2003-oopsla.pdf From ncoghlan at gmail.com Wed Oct 12 11:23:58 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 12 Oct 2005 19:23:58 +1000 Subject: [Python-Dev] Proposed changes to PEP 343 In-Reply-To: <434C7093.9070802@canterbury.ac.nz> References: <4346467D.5010005@iinet.net.au> <43466C3B.50704@gmail.com> <434C7093.9070802@canterbury.ac.nz> Message-ID: <434CD62E.4020901@gmail.com> Greg Ewing wrote: > Jason Orendorff wrote: > > >> A contextmanager is a function that returns a new context manager. >> >>Okay, that last bit is weird. > > > If the name of the decorator is to be 'contextmanager', it > really needs to say something like > > The contextmanager decorator turns a generator into a > function that returns a context manager. > > So maybe the decorator should be called 'contextmanagergenerator'. > Or perhaps not, since that's getting rather too much of an > eyeful to parse... Strictly speaking this fits in with the existing confusion of "generator factory" and "generator": Py> def g(): ... 
yield None ... Py> type(g) <type 'function'> Py> type(g()) <type 'generator'> Most people would call "g" a generator, even though it's really just a factory function that returns generator objects. So technically, the "contextmanager" decorator turns a generator factory function into a context manager factory function. But it's easier to simply say that the decorator turns a generator into a context manager, even if that's technically incorrect. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.blogspot.com From ncoghlan at gmail.com Wed Oct 12 11:41:56 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 12 Oct 2005 19:41:56 +1000 Subject: [Python-Dev] Extending tuple unpacking In-Reply-To: References: <2773CAC687FD5F4689F526998C7E4E5F4DB6B6@au3010avexu1.global.avaya.com> <434B3C01.5030001@ronadam.com> <434B6C3A.7020001@canterbury.ac.nz> <434BD172.3030004@gmail.com> Message-ID: <434CDA64.9070302@gmail.com> Steve Holden wrote: > But don't forget that at present unpacking can be used at several levels: > > >>> ((a, b), c) = ((1, 2), 3) > >>> a, b, c > (1, 2, 3) > >>> > > So presumably you'd need to be able to say > > ((a, *b), c, *d) = ((1, 2, 3), 4, 5, 6) > > and see > > a, b, c, d == 1, (2, 3), 4, (5, 6) > > if we are to retain today's multi-level consistency. That seems reasonable enough. I'd consider such code bad style, though. > And are you also proposing to allow > > a, *b = [1] > > to put the empty list into b, or is that an unpacking error? It does the same as function parameter unpacking does, by making b the empty tuple. > This would allow users to provide an arbitrary number of positionals > rather than having them become keyword arguments. At present it's > difficult to specify that.
That's the reasoning behind the "* without a name" idea: def f(a, b, c=default, *, foo=1, bar=2): pass Here, c is a positional argument with a default value, while foo and bar are forced to be keyword arguments. Completely nuts idea #576 would involve extending this concept past the keyword dict as well to get function default values which aren't arguments: def f(pos1, pos2, pos3=default, *, kw1=1, kw2=2, **, const="Nutty idea"): pass Py> f(const=1) Traceback (most recent call last): File "<stdin>", line 1, in ? TypeError: f() got an unexpected keyword argument 'const' Cheers, Nick.
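The bare-`*` idea Nick sketches here became real syntax in Python 3 (PEP 3102, keyword-only arguments); the "nutty" `**, const=...` extension did not. A sketch of the part that landed, again anachronistic relative to this thread:

```python
def f(a, b, c=0, *, foo=1, bar=2):
    # c may be passed positionally or by keyword; foo and bar can only
    # be supplied by keyword because of the bare * before them.
    return (a, b, c, foo, bar)

assert f(1, 2) == (1, 2, 0, 1, 2)
assert f(1, 2, 3, bar=5) == (1, 2, 3, 1, 5)

try:
    f(1, 2, 3, 4)            # attempts to pass foo positionally
except TypeError:
    pass                     # rejected: foo is keyword-only
else:
    raise AssertionError("expected TypeError")
```

Guido's wished-for call/definition form `f(a, b, *args, foo=1, bar=2, **kwds)` also became legal under the same PEP.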
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.blogspot.com From ncoghlan at gmail.com Wed Oct 12 12:09:24 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 12 Oct 2005 20:09:24 +1000 Subject: [Python-Dev] Extending tuple unpacking In-Reply-To: <434C235D.1060404@ronadam.com> References: <2773CAC687FD5F4689F526998C7E4E5F4DB6B6@au3010avexu1.global.avaya.com> <434B3C01.5030001@ronadam.com> <434B6C3A.7020001@canterbury.ac.nz> <434C235D.1060404@ronadam.com> Message-ID: <434CE0D4.3070809@gmail.com> Ron Adam wrote: > I wonder if something like the following would fulfill the need? Funny you should say that. . . A pre-PEP proposing itertools.iunpack (amongst other things): http://mail.python.org/pipermail/python-dev/2004-November/050043.html And the reason that PEP was never actually created: http://mail.python.org/pipermail/python-dev/2004-November/050068.html Obviously, I've changed my views over the last year or so ;) Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.blogspot.com From ncoghlan at gmail.com Wed Oct 12 12:25:18 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 12 Oct 2005 20:25:18 +1000 Subject: [Python-Dev] Proposed changes to PEP 343 In-Reply-To: References: <4346467D.5010005@iinet.net.au> <43466C3B.50704@gmail.com> <434C7093.9070802@canterbury.ac.nz> <434CD62E.4020901@gmail.com> Message-ID: <434CE48E.9090305@gmail.com> Jason Orendorff wrote: > On 10/12/05, Nick Coghlan wrote: > >>Strictly speaking this fits in with the existing confusion of "generator >>factory" and "generator": >> >>Py> def g(): >>... yield None >>... >>Py> type(g) >><type 'function'> >>Py> type(g()) >><type 'generator'> >> >>Most people would call "g" a generator, even though it's really just a factory >>function that returns generator objects. > > > Not the same.
A precise term exists for "g": it's a generator function. > PEP 255 explicitly talks about this: > > "...Note that when > the intent is clear from context, the unqualified name "generator" may > be used to refer either to a generator-function or a generator- > iterator." > > What would the corresponding paragraph be for PEP 343? "...Note that when the intent is clear from context, the unqualified name 'context manager' may be used to refer either to a 'context manager function' or to an actual 'context manager object'. This distinction is primarily relevant for generator-based context managers, and is similar to that between a normal generator-function and a generator-iterator." Basically, a context manager object is an object with __enter__ and __exit__ methods, while the __with__ method itself is a context manager function. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.blogspot.com From john.m.camara at comcast.net Wed Oct 12 13:03:58 2005 From: john.m.camara at comcast.net (john.m.camara@comcast.net) Date: Wed, 12 Oct 2005 11:03:58 +0000 Subject: [Python-Dev] Autoloading? (Making Queue.Queue easier to use) Message-ID: <101220051103.1985.434CED9E0009C73C000007C122007503300E9D0E030E0CD203D202080106@comcast.net> Greg Ewing wrote: > > Guido van Rossum wrote: > > > I see no need. Code that *doesn't* need Queue but does use threading > > shouldn't have to pay for loading Queue.py. > > However, it does seem awkward to have a whole module > providing just one small class that logically is so > closely related to other threading facilities. > > What we want in this kind of situation is some sort > of autoloading mechanism, so you can import something > from a module and have it trigger the loading of another > module behind the scenes to provide it. > Bad idea unless it is tied to a namespace. 
So that users know where this auto-loaded functionality is coming from. Otherwise it's just as bad as 'from xxx import *'. John M. Camara From mcherm at mcherm.com Wed Oct 12 13:35:18 2005 From: mcherm at mcherm.com (Michael Chermside) Date: Wed, 12 Oct 2005 04:35:18 -0700 Subject: [Python-Dev] Autoloading? (Making Queue.Queue easier to use) Message-ID: <20051012043518.x88h3g8pehsgo84c@login.werra.lunarpages.com> > Guido van Rossum writes: > Code that *doesn't* need Queue but does use threading > shouldn't have to pay for loading Queue.py. Greg Ewing responds: > What we want in this kind of situation is some sort > of autoloading mechanism, so you can import something > from a module and have it trigger the loading of another > module behind the scenes to provide it. John Camera comments: > Bad idea unless it is tied to a namespace. So that users know > where this auto-loaded functionality is coming from. Otherwise > it's just as bad as 'from xxx import *'. John, I think what Greg is suggesting is that we include Queue in the threading module, but that we use a Clever Trick(TM) to address Guido's point by not actually loading the Queue code until the first time (if ever) that it is used. I'm not familiar with the clever trick Greg is proposing, but I do agree that _IF_ everything else were equal, then Queue seems to belong in the threading module. My biggest reason is that I think anyone who is new to threading probably shouldn't use any communication mechanism OTHER than Queue or something similar which has been carefully designed by someone knowledgeable.
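A minimal sketch of the Queue-only communication style Michael recommends for newcomers, using the Python 3 module names (`queue`, `threading`); in the 2005 library discussed here the class was `Queue.Queue`. The `None` sentinel echoes Nick's earlier observation that shutdown requests can travel through the queue itself instead of relying on timeouts:

```python
import queue
import threading

def worker(tasks, results):
    # Pull work items until the shutdown sentinel arrives; no timeouts
    # are needed because the shutdown request goes through the queue.
    while True:
        item = tasks.get()
        if item is None:
            break
        results.put(item * 2)

tasks, results = queue.Queue(), queue.Queue()
t = threading.Thread(target=worker, args=(tasks, results))
t.start()

for n in (1, 2, 3):
    tasks.put(n)
tasks.put(None)              # shutdown sentinel
t.join()

collected = [results.get_nowait() for _ in range(3)]
assert collected == [2, 4, 6]
```

All cross-thread traffic goes through the two queues; the threads share no other mutable state, which is exactly what makes this pattern hard to get wrong.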
-- Michael Chermside From mcherm at mcherm.com Wed Oct 12 13:54:38 2005 From: mcherm at mcherm.com (Michael Chermside) Date: Wed, 12 Oct 2005 04:54:38 -0700 Subject: [Python-Dev] Extending tuple unpacking Message-ID: <20051012045438.gz4jb9pc1wwskwcw@login.werra.lunarpages.com> Steve Holden writes: > I do feel that for Python 3 it might be better to make a clean > separation between keywords and positionals: in other words, if the > function definition specifies a keyword argument then a keyword must be > used to present it. > > This would allow users to provide an arbitrary number of positionals > rather than having them become keyword arguments. At present it's > difficult to specify that. Antoine Pitrou already responded: > Do you mean it would also be forbidden to invoke a "positional" argument > using its keyword? It would be a huge step back in usability IMO. Some > people like invoking by position (because it's shorter) and some others > prefer invoking by keyword (because it's more explicit). Why should the > implementer of the API have to make a choice for the user of the API ? I strongly agree with Antoine here, but the combination of "keyword arguments after the star": > foo(a, b, c=1, *args, d=2, e=5, **kwargs) > ^^^^^^^^^ ^^^^^^^^ > positional only with kw > or with kw with "star without a name": > def f(a, b, c=default, *, foo=1, bar=2): pass > > Here, c is a positional argument with a default value, while foo and bar are > forced to be keyword arguments. is quite tempting. It satisfies Steve by allowing the implementer of the function to require keyword arguments. It satisfies Antoine and myself by also allowing the implementor of the function to permit positional OR keyword use, and making this the default behavior. It is logically consistent. There's just one big problem that I know of: Guido writes: > I've always wanted to write that as > > f(a, b, *args, foo=1, bar=2, **kwds) > > but the current grammar doesn't allow it. Hmm....
why doesn't the current grammar allow it, and can we fix that? I don't see that it's a limitation of the one-token-lookahead, could we permit this syntax by rearranging bits of the grammar? -- Michael Chermside From cludwig at cdc.informatik.tu-darmstadt.de Wed Oct 12 14:09:18 2005 From: cludwig at cdc.informatik.tu-darmstadt.de (Christoph Ludwig) Date: Wed, 12 Oct 2005 14:09:18 +0200 Subject: [Python-Dev] [C++-sig] GCC version compatibility In-Reply-To: <20050716101357.GC3607@lap200.cdc.informatik.tu-darmstadt.de> References: <42CDA654.2080106@v.loewis.de> <20050708072807.GC3581@lap200.cdc.informatik.tu-darmstadt.de> <42CEF948.3010908@v.loewis.de> <20050709102010.GA3836@lap200.cdc.informatik.tu-darmstadt.de> <42D0D215.9000708@v.loewis.de> <20050710125458.GA3587@lap200.cdc.informatik.tu-darmstadt.de> <42D15DB2.3020300@v.loewis.de> <20050716101357.GC3607@lap200.cdc.informatik.tu-darmstadt.de> Message-ID: <20051012120917.GA11058@lap200.cdc.informatik.tu-darmstadt.de> Hi, this is to continue a discussion started back in July by a posting by Dave Abrahams regarding the compiler (C vs. C++) used to compile python's main() and to link the executable. On Sat, Jul 16, 2005 at 12:13:58PM +0200, Christoph Ludwig wrote: > On Sun, Jul 10, 2005 at 07:41:06PM +0200, "Martin v. Löwis" wrote: > > Maybe. For Python 2.4, feel free to contribute a more complex test. For > > Python 2.5, I would prefer if the entire code around ccpython.cc was > > removed. > > I submitted patch #1239112 that implements the test involving two TUs for > Python 2.4. I plan to work on a more comprehensive patch for Python 2.5 but > that will take some time. I finally had the spare time to look into this problem again and submitted patch #1324762. The proposed patch implements the following: 1) The configure option --with-cxx is renamed --with-cxx-main. This was done to avoid surprising the user by the changed meaning.
Furthermore, it is now possible that CXX has a different value than provided by --with-cxx-main, so the old name would have been confusing. 2) The compiler used to translate python's main() function is stored in the configure / Makefile variable MAINCC. By default, MAINCC=$(CC). If --with-cxx-main is given (without an appended compiler name), then MAINCC=$(CXX). If --with-cxx-main=<compiler> is on the configure command line, then MAINCC=<compiler>. Additionally, configure sets CXX=<compiler> unless CXX was already set on the configure command line. 3) The command used to link the python executable is (as before) stored in LINKCC. By default, LINKCC='$(PURIFY) $(MAINCC)', i.e. the linker front-end is the compiler used to translate main(). If necessary, LINKCC can be set on the configure command line in which case it won't be altered. 4) If CXX is not set by the user (on the command line or via --with-cxx-main), then configure tries several likely C++ compiler names. CXX is assigned the first name that refers to a callable program in the system. (CXX is set even if python is built with a C compiler only, so distutils can build C++ extensions.) 5) Modules/ccpython.cc is no longer used and can be removed. I think that makes it possible to build python appropriately on every platform: - By default, python is built with the C compiler only; CXX is assigned the name of a "likely" C++ compiler. This works fine, e.g., on ELF systems like x86 / Linux where python should not have any dependency on the C++ runtime to avoid conflicts with C++ extensions. distutils can still build C++ extensions since CXX names a callable C++ compiler. - On platforms that require main() to be a C++ function if C++ extensions are to be imported, the user can configure python --with-cxx-main. On platforms where one must compile main() with a C++ compiler, but does not need to link the executable with the same compiler, the user can specify both --with-cxx-main and LINKCC on the configure command line.
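Under the proposed scheme, typical invocations would look something like the following (the option names are taken from the patch description above, not from a configure script released at the time, so treat this as an illustration):

```shell
# Default build: main() is compiled with the C compiler; CXX is still
# probed so distutils can build C++ extensions.
./configure

# Compile main() with the default C++ compiler (MAINCC=$(CXX)):
./configure --with-cxx-main

# Compile main() with a specific C++ compiler:
./configure --with-cxx-main=g++

# Compile main() with C++ but use a different link front-end by
# overriding LINKCC explicitly:
./configure --with-cxx-main=g++ LINKCC='$(PURIFY) gcc'
```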
Best regards Christoph -- http://www.informatik.tu-darmstadt.de/TI/Mitarbeiter/cludwig.html LiDIA: http://www.informatik.tu-darmstadt.de/TI/LiDIA/Welcome.html From john.m.camara at comcast.net Wed Oct 12 15:12:05 2005 From: john.m.camara at comcast.net (john.m.camara@comcast.net) Date: Wed, 12 Oct 2005 13:12:05 +0000 Subject: [Python-Dev] Autoloading? (Making Queue.Queue easier to use) Message-ID: <101220051312.4429.434D0BA5000808D40000114D22007503300E9D0E030E0CD203D202080106@comcast.net> > > Guido van Rossum writes: > > Code that *doesn't* need Queue but does use threading > > shouldn't have to pay for loading Queue.py. > > Greg Ewing responds: > > What we want in this kind of situation is some sort > > of autoloading mechanism, so you can import something > > from a module and have it trigger the loading of another > > module behind the scenes to provide it. > > John Camera comments: > > Bad idea unless it is tied to a namespace. So that users know > > where this auto-loaded functionality is coming from. Otherwise > > it's just as bad as 'from xxx import *'. > > Michael Chermside comments: > John, I think what Greg is suggesting is that we include Queue > in the threading module, but that we use a Clever Trick(TM) to > address Guido's point by not actually loading the Queue code > until the first time (if ever) that it is used. > > I'm not familiar with the clever trick Greg is proposing, but I > do agree that _IF_ everything else were equal, then Queue seems > to belong in the threading module. My biggest reason is that I > think anyone who is new to threading probably shouldn't use any > communication mechanism OTHER than Queue or something similar > which has been carefully designed by someone knowledgeable.
I guess from Greg’s comments I’m not sure if he wants to import threading and as a result ‘Queue’ becomes available in the local namespace and bound/loaded when it is first needed and thus saves himself from typing ‘import Queue’ immediately after ‘import threading’ or Queue becomes part of the threading namespace and bound/loaded when it is first needed. Queue then becomes accessible through ‘threading.Queue’ When Greg says > However, it does seem awkward to have a whole module > providing just one small class that logically is so > closely related to other threading facilities. It sounds like he feels Queue should just be part of threading but queues can be used in other contexts besides threading. So having separate modules is a good thing. The idea of delaying an import until it’s needed sounds like a great idea and having built-in support for this would be great. Here are 2 possible suggestions for the import statements import Queue asneeded delayedimport Queue # can't think of a better name at this time But auto-loading a module by a module on behalf of a client just doesn’t sit too well for me. How about the confusion it would cause? Is Queue in the threading module a reference to a Queue in a Queue module or a new class altogether? If we go down this slippery slope we will see modules like array, struct, etc getting referenced and getting auto-loaded on behalf of the client. Where will it end? John M. Camara From mcherm at mcherm.com Wed Oct 12 16:25:28 2005 From: mcherm at mcherm.com (Michael Chermside) Date: Wed, 12 Oct 2005 07:25:28 -0700 Subject: [Python-Dev] Autoloading? (Making Queue.Queue easier to use) Message-ID: <20051012072528.68vls5q6143k08c0@login.werra.lunarpages.com> John Camera writes: > It sounds like he feels Queue should just be part of threading but queues > can be used in other contexts besides threading. So having separate > modules is a good thing. Perhaps I am wrong here, but the Queue.Queue class is designed specifically for synchronization, and I have always been under the impression that it was probably NOT the best tool for normal queues that have nothing to do with threading. Why incur the overhead of synchronization locks when you don't intend to use them. I would advise against using Queue.Queue in any context besides threading. continued... > I guess from Greg's comments I'm not sure if he wants to [...] I'm going to stop trying to channel Greg here, he can speak for himself. But I will be quite surprised if _anyone_ supports the idea of having a module modify the local namespace of the module importing it. and later...
> Here are 2 possible suggestions for the import statements > > import Queue asneeded > delayedimport Queue # can't think of a better name at this time Woah! There is no need for new syntax here! If you want to import Queue only when needed use this (currently legal) syntax: if queueIsNeeded: import Queue If you want to add a module (call it "Queue") to the namespace, but delay executing some of the code for now, then just use "import Queue" and modify the module so that it doesn't do all its work at import time, but delays some of it until needed. That too is possible today: # start of module initialized = False def doSomething(): if not initialized: initialize() # ... Python today is incredibly dynamic and flexible... despite the usual tenor of conversations on python-dev, it is very rare to encounter a problem that cannot be solved (and readably so) using the existing tools and constructs. -- Michael Chermside From guido at python.org Wed Oct 12 16:32:17 2005 From: guido at python.org (Guido van Rossum) Date: Wed, 12 Oct 2005 07:32:17 -0700 Subject: [Python-Dev] Europeans attention please! Message-ID: I have some 65%-off passes to EuroOSCON which starts next Monday in Amsterdam. Anybody interested? http://conferences.oreillynet.com/eurooscon/grid/ -- --Guido van Rossum (home page: http://www.python.org/~guido/) From skip at pobox.com Wed Oct 12 16:46:52 2005 From: skip at pobox.com (skip@pobox.com) Date: Wed, 12 Oct 2005 09:46:52 -0500 Subject: [Python-Dev] Autoloading? (Making Queue.Queue easier to use) In-Reply-To: <20051012043518.x88h3g8pehsgo84c@login.werra.lunarpages.com> References: <20051012043518.x88h3g8pehsgo84c@login.werra.lunarpages.com> Message-ID: <17229.8668.114534.151179@montanaro.dyndns.org> Michael> I'm not familiar with the clever trick Greg is proposing, but I Michael> do agree that _IF_ everything else were equal, then Queue seems Michael> to belong in the threading module. 
My biggest reason is that I Michael> think anyone who is new to threading probably shouldn't use any Michael> communication mechanism OTHER than Queue or something similar Michael> which has been carefully designed by someone knowledgeable. Is the Queue class very useful outside a multithreaded context? The notion of a queue as a data structure has meaning outside of threaded applications. Its presence might seduce a new programmer into thinking it is subtly different than it really is. A cursory test suggests that it works, though q.get() on an empty queue seems a bit counterproductive. Also, Queue objects are probably quite a bit less efficient than lists. Taken as a whole, perhaps a stronger attachment with the threading module isn't such a bad idea. Skip From guido at python.org Wed Oct 12 16:58:10 2005 From: guido at python.org (Guido van Rossum) Date: Wed, 12 Oct 2005 07:58:10 -0700 Subject: [Python-Dev] Autoloading? (Making Queue.Queue easier to use) In-Reply-To: <20051012043518.x88h3g8pehsgo84c@login.werra.lunarpages.com> References: <20051012043518.x88h3g8pehsgo84c@login.werra.lunarpages.com> Message-ID: On 10/12/05, Michael Chermside wrote: > I'm not familiar with the clever trick Greg is proposing, but I > do agree that _IF_ everything else were equal, then Queue seems > to belong in the threading module. My biggest reason is that I > think anyone who is new to threading probably shouldn't use any > communication mechanism OTHER than Queue or something similar > which has been carefully designed by someone knowledgeable. I *still* disagree. At some level, Queue is just an application of threading, while the threading module provides the basic API (never mind that there's an even more basic API, the thread module -- it's too low-level to consider and we actively recommend against it, at least I hope we do).
While at this point there may be no other "applications" of threading in the standard library, that may not remain the case; it's quite possible that some of the discussions of threading APIs will eventually lead to a PEP proposing a different threading paradigm built on top of the threading module. I'm using the word "application" loosely here because I realize one person's application is another's primitive operation. But I object to the idea that just because A and B are often used together or A is recommended for programs using B that A and B should live in the same module. We don't put urllib and httplib in the socket module either! Now, if we had a package structure, I would sure like to see threading and Queue end up as neighbors in the same package. But I don't think it's right to package them all up in the same module. (Not to say that autoloading is a bad idea; I'm -0 on it for myself, but I can see use cases; but it doesn't change my mind on whether Queue should become threading.Queue. I guess I didn't articulate my reasoning for being against that well previously and tried to hide behind the load time argument.) BTW, Queue.Queue violates a recent module naming standard; it is now considered bad style to name the class and the module the same. Modules and packages should have short all-lowercase names, classes should be CapWords. Even the same but different case is bad style. (I'd suggest queueing.Queue except nobody can type that right. :) -- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Wed Oct 12 17:00:30 2005 From: guido at python.org (Guido van Rossum) Date: Wed, 12 Oct 2005 08:00:30 -0700 Subject: [Python-Dev] Autoloading?
(Making Queue.Queue easier to use) In-Reply-To: <17229.8668.114534.151179@montanaro.dyndns.org> References: <20051012043518.x88h3g8pehsgo84c@login.werra.lunarpages.com> <17229.8668.114534.151179@montanaro.dyndns.org> Message-ID: On 10/12/05, skip at pobox.com wrote: > Is the Queue class very useful outside a multithreaded context? No. It was designed specifically for inter-thread communication. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From pje at telecommunity.com Wed Oct 12 17:19:00 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Wed, 12 Oct 2005 11:19:00 -0400 Subject: [Python-Dev] Pythonic concurrency In-Reply-To: References: <2mll1ghsjc.fsf@starship.python.net> <397621172.20050927111836@MailBlocks.com> <433AE8A8.3010500@v.loewis.de> <329633301.20050929074337@MailBlocks.com> <2mll1ghsjc.fsf@starship.python.net> <5.1.1.6.0.20050929121236.0399ed88@mail.telecommunity.com> <160502469.20050929104837@MailBlocks.com> Message-ID: <5.1.1.6.0.20051012111425.01f3eec8@mail.telecommunity.com> At 02:35 AM 10/12/2005 +0000, Joshua Spoerri wrote: >that stm paper isn't the end. > >there's a java implementation which seems to be exactly what we want: >http://research.microsoft.com/~tharris/papers/2003-oopsla.pdf There's already a Python implementation of what's described in the paper. It's called ZODB. :) Just use the memory backend if you don't want the objects to persist. Granted, if you want automatic retry you'll need to create a decorator that catches conflict errors. But basically, ZODB implements a similar optimistic conflict management transaction algorithm to that described in the paper. Certainly, it's the closest thing you can get in CPython without a complete redesign of the VM. 
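The automatic-retry decorator Phillip mentions can be sketched generically. The `ConflictError` class below is a local stand-in for whatever the transactional system raises on a failed optimistic commit (ZODB has its own exception of that name), so this is an illustration of the retry pattern rather than tested ZODB code:

```python
import functools

class ConflictError(Exception):
    # Stand-in for the conflict exception raised when an optimistic
    # commit fails because another transaction touched the same data.
    pass

def retry_on_conflict(attempts=3):
    """Re-run a transactional function on conflict, up to `attempts` times."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(attempts):
                try:
                    return func(*args, **kwargs)
                except ConflictError:
                    if attempt == attempts - 1:
                        raise        # give up after the final attempt
        return wrapper
    return decorator

calls = []

@retry_on_conflict(attempts=3)
def commit_something():
    calls.append(1)
    if len(calls) < 3:
        raise ConflictError("optimistic commit failed, retrying")
    return "committed"

assert commit_something() == "committed"
assert len(calls) == 3       # failed twice, succeeded on the third try
```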
From skromta at gmail.com Sat Oct 8 09:55:43 2005 From: skromta at gmail.com (Kalle Anke) Date: Sat, 8 Oct 2005 09:55:43 +0200 Subject: [Python-Dev] Pythonic concurrency References: <20051006143740.287E.JCARLSON@uci.edu> <200510070145.17284.ms@cerenity.org> <20051006221436.2892.JCARLSON@uci.edu> <415220344.20051007104751@MailBlocks.com> Message-ID: <0001HW.BF6D481F0162024EF0407550@news.gmane.org> On Fri, 7 Oct 2005 18:47:51 +0200, Bruce Eckel wrote (in article <415220344.20051007104751 at MailBlocks.com>): > It's hard to know how to answer. I've met enough brilliant people to > know that it's just possible that the person posting really does > easily grok concurrency issues and thus I must seem irreconcilably > thick. This may actually be one of those people for whom threading is > obvious (and Ian has always seemed like a smart guy, for example). I think it depends on which "level" you're talking about, concurrency IS very easy and "natural" at a conceptual level. It's also quite easy for doing basic stuff ... but it can become very complicated if you introduce different requirements and/or the system becomes complex and/or you're going to implement the actual mechanism. That's my limited experience (personally, I really like concurrency ... and to be honest, some people can't really understand the concept at all while others have no problem so it's a "personal thing" also) From john.m.camara at comcast.net Wed Oct 12 17:37:24 2005 From: john.m.camara at comcast.net (john.m.camara@comcast.net) Date: Wed, 12 Oct 2005 15:37:24 +0000 Subject: [Python-Dev] Autoloading? (Making Queue.Queue easier to use) Message-ID: <101220051537.5148.434D2DB3000F41170000141C22007503300E9D0E030E0CD203D202080106@comcast.net> > John Camera writes: > > It sounds like he feels Queue should just be part of threading but queues > > can be used in other contexts besides threading. So having separate > > modules is a good thing. 
> > Michael Chermside > Perhaps I am wrong here, but the Queue.Queue class is designed specifically > for synchronization, and I have always been under the impression that > it was probably NOT the best tool for normal queues that have nothing > to do with threading. Why incur the overhead of synchronization locks > when you don't intend to use them? I would advise against using Queue.Queue > in any context besides threading. I haven't used the Queue class before as I normally use a list for a queue. I just assumed a Queue was just a queue that was perhaps optimized for performance. I guess I would have expected the Queue class as defined in the standard library to have a different name if it wasn't just a queue. Well, I should have known better than to make assumptions on this list. :) I now see where Greg is coming from, but I'm still not comfortable having it in the threading module. To me threads and queues are two different beasts. John M. Camara From skip at pobox.com Wed Oct 12 18:02:38 2005 From: skip at pobox.com (skip@pobox.com) Date: Wed, 12 Oct 2005 11:02:38 -0500 Subject: [Python-Dev] Autoloading? (Making Queue.Queue easier to use) In-Reply-To: References: <20051012043518.x88h3g8pehsgo84c@login.werra.lunarpages.com> Message-ID: <17229.13214.981827.304999@montanaro.dyndns.org> Guido> At some level, Queue is just an application of threading, while Guido> the threading module provides the basic API ... While Queue is built on top of threading Lock and Condition objects, it is a highly useful synchronization mechanism in its own right, and is almost certainly easier to use correctly (at least for novices) than the lower-level synchronization objects the threading module provides. If threading is the "friendly" version of thread, perhaps Queue should be considered the "friendly" synchronization object. (I'm playing the devil's advocate here. I'm fine with Queue being where it is.)
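Skip's "friendly synchronization object" point is easiest to see in the standard producer/consumer pattern, sketched here with the Python 3 spelling of the module name (`queue`); no explicit locks appear anywhere:

```python
import queue
import threading

q = queue.Queue()
results = []

def producer():
    for i in range(5):
        q.put(i)    # blocks only if the queue is full (unbounded here)
    q.put(None)     # sentinel: tells the consumer to stop

def consumer():
    while True:
        item = q.get()  # blocks until an item arrives -- no explicit locks
        if item is None:
            break
        results.append(item * 10)

threads = [threading.Thread(target=producer), threading.Thread(target=consumer)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(results)  # -> [0, 10, 20, 30, 40]
```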
Skip From john.m.camara at comcast.net Wed Oct 12 18:04:13 2005 From: john.m.camara at comcast.net (john.m.camara@comcast.net) Date: Wed, 12 Oct 2005 16:04:13 +0000 Subject: [Python-Dev] Autoloading? (Making Queue.Queue easier to use) Message-ID: <101220051604.20801.434D33FD000335700000514122007614380E9D0E030E0CD203D202080106@comcast.net> > Skip writes: > Is the Queue class very useful outside a multithreaded context? The notion > of a queue as a data structure has meaning outside of threaded applications. > Its presence might seduce a new programmer into thinking it is subtly > different than it really is. A cursory test suggests that it works, though > q.get() on an empty queue seems a bit counterproductive. Also, Queue objects > are probably quite a bit less efficient than lists. Taken as a whole, > perhaps a stronger attachment with the threading module isn't such a bad > idea. > Maybe Queue belongs in a module called synchronize to avoid any confusion. John M. Camara From solipsis at pitrou.net Wed Oct 12 18:11:40 2005 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 12 Oct 2005 18:11:40 +0200 Subject: [Python-Dev] Autoloading? (Making Queue.Queue easier to use) In-Reply-To: <101220051604.20801.434D33FD000335700000514122007614380E9D0E030E0CD203D202080106@comcast.net> References: <101220051604.20801.434D33FD000335700000514122007614380E9D0E030E0CD203D202080106@comcast.net> Message-ID: <1129133500.6178.16.camel@fsol> > Maybe Queue belongs in a module called synchronize to avoid any confusion. Why not /just/ make the doc a little bit more explicit? Instead of saying: It is especially useful in threads programming when information must be exchanged safely between multiple threads. Replace it with: It is dedicated to threads programming for safe exchange of information between multiple threads. On the other hand, if you are only looking for a single-thread queue structure, use the built-in list type, or the deque class from the collections module.
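The single-thread alternative Antoine points at takes only a couple of lines; `collections.deque` gives O(1) appends and pops from both ends with none of Queue's locking overhead:

```python
from collections import deque

fifo = deque()
for job in ("parse", "compile", "link"):
    fifo.append(job)  # enqueue on the right

done = []
while fifo:
    done.append(fifo.popleft())  # dequeue from the left, O(1)

print(done)  # -> ['parse', 'compile', 'link']
```

(In modern CPython, queue.Queue itself stores its items in a deque.)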
If necessary, put it in bold ;) From aahz at pythoncraft.com Wed Oct 12 19:47:25 2005 From: aahz at pythoncraft.com (Aahz) Date: Wed, 12 Oct 2005 10:47:25 -0700 Subject: [Python-Dev] Autoloading? (Making Queue.Queue easier to use) In-Reply-To: <434C6662.4040503@canterbury.ac.nz> References: <434B8DBF.9080509@iinet.net.au> <434C6662.4040503@canterbury.ac.nz> Message-ID: <20051012174725.GA26101@panix.com> On Wed, Oct 12, 2005, Greg Ewing wrote: > Guido van Rossum wrote: >> >> I see no need. Code that *doesn't* need Queue but does use threading >> shouldn't have to pay for loading Queue.py. I'd argue that such code is rare enough (given the current emphasis on Queue) that the performance issue doesn't matter. > However, it does seem awkward to have a whole module providing > just one small class that logically is so closely related to other > threading facilities. The problem is that historically Queue did not use ``threading``; it was built directly on top of ``thread``, and people were told to use Queue regardless of whether they were using ``thread`` or ``threading``. Obviously, there is no use case for putting Queue into ``thread``, so off it went into its own module. At this point, my opinion is that we should leave reorganizing the thread stuff until Python 3.0. (Python 3.0 should "deprecate" ``thread`` by renaming it to ``_thread``). -- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ "If you think it's expensive to hire a professional to do the job, wait until you hire an amateur." --Red Adair From guido at python.org Wed Oct 12 19:55:06 2005 From: guido at python.org (Guido van Rossum) Date: Wed, 12 Oct 2005 10:55:06 -0700 Subject: [Python-Dev] Autoloading? 
(Making Queue.Queue easier to use) In-Reply-To: <20051012174725.GA26101@panix.com> References: <434B8DBF.9080509@iinet.net.au> <434C6662.4040503@canterbury.ac.nz> <20051012174725.GA26101@panix.com> Message-ID: On 10/12/05, Aahz wrote: > (Python 3.0 > should "deprecate" ``thread`` by renaming it to ``_thread``). +1. (We could even start doing this before 3.0.) -- --Guido van Rossum (home page: http://www.python.org/~guido/) From mcherm at mcherm.com Wed Oct 12 21:33:18 2005 From: mcherm at mcherm.com (Michael Chermside) Date: Wed, 12 Oct 2005 12:33:18 -0700 Subject: [Python-Dev] Autoloading? (Making Queue.Queue easier to use) Message-ID: <20051012123318.vb9uvd4meu40sc0c@login.werra.lunarpages.com> Aahz writes: > (Python 3.0 should "deprecate" ``thread`` by renaming it to ``_thread``). Guido says: > +1. (We could even start doing this before 3.0.) Before 3.0, let's deprecate it by listing it in the Deprecated modules section within the documentation... no need to gratuitously break code by renaming it until 3.0 arrives. -- Michael Chermside From rrr at ronadam.com Wed Oct 12 22:52:20 2005 From: rrr at ronadam.com (Ron Adam) Date: Wed, 12 Oct 2005 16:52:20 -0400 Subject: [Python-Dev] Extending tuple unpacking In-Reply-To: <434CE0D4.3070809@gmail.com> References: <2773CAC687FD5F4689F526998C7E4E5F4DB6B6@au3010avexu1.global.avaya.com> <434B3C01.5030001@ronadam.com> <434B6C3A.7020001@canterbury.ac.nz> <434C235D.1060404@ronadam.com> <434CE0D4.3070809@gmail.com> Message-ID: <434D7784.7020209@ronadam.com> Nick Coghlan wrote: > Ron Adam wrote: > >>I wonder if something like the following would fulfill the need? > > > Funny you should say that. . . 
> > A pre-PEP proposing itertools.iunpack (amongst other things): > http://mail.python.org/pipermail/python-dev/2004-November/050043.html > > And the reason that PEP was never actually created: > http://mail.python.org/pipermail/python-dev/2004-November/050068.html > > Obviously, I've changed my views over the last year or so ;) > > Cheers, > Nick. It looked like the PEP didn't get created because there wasn't enough interest at the time, not because there was anything wrong with the idea. And the motivation was, surprisingly, that this would be discussed again, and here it is. ;-) I reversed my view in the other direction in the past 6 months or so. Mostly because when chaining methods or functions with * and **, my mind (which often doesn't have enough sleep) wants to think they mean the same thing at both ends of the method. For example... (with small correction from the previous example) def div_at(self, *args): return self.__class__(self.div_iter(*args)) This would read better to me if it was. # (just an example, not a suggestion.) def div_at(self, *args): return self.__class__(self.div_iter(/args)) But I may be one of only a few for whom this is a minor annoyance. I wonder if you make '*' work outside of function argument lists, if requests to do the same for '**' would follow? Cheers, Ron From aahz at pythoncraft.com Wed Oct 12 23:02:41 2005 From: aahz at pythoncraft.com (Aahz) Date: Wed, 12 Oct 2005 14:02:41 -0700 Subject: [Python-Dev] Autoloading? (Making Queue.Queue easier to use) In-Reply-To: <20051012123318.vb9uvd4meu40sc0c@login.werra.lunarpages.com> References: <20051012123318.vb9uvd4meu40sc0c@login.werra.lunarpages.com> Message-ID: <20051012210241.GA887@panix.com> On Wed, Oct 12, 2005, Michael Chermside wrote: > Guido says: >> Aahz writes: >>> >>> (Python 3.0 should "deprecate" ``thread`` by renaming it to ``_thread``). >> >> +1. (We could even start doing this before 3.0.)
> > Before 3.0, let's deprecate it by listing it in the Deprecated modules > section within the documentation... no need to gratuitously break code > by renaming it until 3.0 arrives. Note carefully the deprecation in quotes. It's not going to be literally deprecated, only renamed, similar to the way _socket and socket work together. We could also rename to _threading, but I prefer the simpler change of only a prepended underscore. -- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ "If you think it's expensive to hire a professional to do the job, wait until you hire an amateur." --Red Adair From guido at python.org Wed Oct 12 23:24:49 2005 From: guido at python.org (Guido van Rossum) Date: Wed, 12 Oct 2005 14:24:49 -0700 Subject: [Python-Dev] Autoloading? (Making Queue.Queue easier to use) In-Reply-To: <20051012210241.GA887@panix.com> References: <20051012123318.vb9uvd4meu40sc0c@login.werra.lunarpages.com> <20051012210241.GA887@panix.com> Message-ID: On 10/12/05, Aahz wrote: > Note carefully the deprecation in quotes. It's not going to be > literally deprecated, only renamed, similar to the way _socket and > socket work together. We could also rename to _threading, but I prefer > the simpler change of only a prepended underscore. Could you specify exactly what you have in mind? How would backwards compatibility be maintained in 2.x? -- --Guido van Rossum (home page: http://www.python.org/~guido/) From aahz at pythoncraft.com Wed Oct 12 23:48:30 2005 From: aahz at pythoncraft.com (Aahz) Date: Wed, 12 Oct 2005 14:48:30 -0700 Subject: [Python-Dev] Autoloading? (Making Queue.Queue easier to use) In-Reply-To: References: <20051012123318.vb9uvd4meu40sc0c@login.werra.lunarpages.com> <20051012210241.GA887@panix.com> Message-ID: <20051012214830.GA24007@panix.com> On Wed, Oct 12, 2005, Guido van Rossum wrote: > On 10/12/05, Aahz wrote: >> >> Note carefully the deprecation in quotes. 
It's not going to be >> literally deprecated, only renamed, similar to the way _socket and >> socket work together. We could also rename to _threading, but I prefer >> the simpler change of only a prepended underscore. > > Could you specify exactly what you have in mind? How would backwards > compatibility be maintained in 2.x? I'm suggesting that we add a doc note that using the thread module is discouraged and that it will be renamed in 3.0. -- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ "If you think it's expensive to hire a professional to do the job, wait until you hire an amateur." --Red Adair From mcherm at mcherm.com Thu Oct 13 00:00:17 2005 From: mcherm at mcherm.com (Michael Chermside) Date: Wed, 12 Oct 2005 15:00:17 -0700 Subject: [Python-Dev] Autoloading? (Making Queue.Queue easier to use) Message-ID: <20051012150017.njcyu2ftthc0wosk@login.werra.lunarpages.com> Aahz writes: > I'm suggesting that we add a doc note that using the thread module is > discouraged and that it will be renamed in 3.0. Then we're apparently all in agreement. -- Michael Chermside From eyal.lotem at gmail.com Tue Oct 11 23:31:42 2005 From: eyal.lotem at gmail.com (Eyal Lotem) Date: Tue, 11 Oct 2005 23:31:42 +0200 Subject: [Python-Dev] Early PEP draft (For Python 3000?) Message-ID: I would like to re-suggest a suggestion I have made in the past, but with a mild difference, and a narrower scope. Name: Attribute access for all namespaces Rationale: globals() access is conceptually the same as setting the module's attributes but uses a different idiom (access of the dict directly). Also, locals() returns a dict, which implies it can affect the local scope, but quietly ignores changes. Using attribute access to access the local/global namespaces just as that is used in instance namespaces and other modules' namespaces, could reduce the mental footprint of Python. Method: All namespace accesses are attribute accesses, and not direct __dict__ accesses. 
Thus globals() is replaced by a "module" keyword (or "magic variable"?) that evaluates to the module object. Thus, reading/writing globals in module X uses getattr/setattr on the module object, just as doing so from module Y would. locals() would be replaced by a function that returns the frame object (or a weaker equivalent of a frame object) of the currently running function. This object will represent the local namespace and will allow attribute getting/setting to read/write attributes. Or it can disallow attribute setting. Examples: global x ; x = 1 Replaced by: module.x = 1 or: globals()[x] = 1 Replaced by: setattr(module, x, 1) locals()['x'] = 1 # Quietly fails! Replaced by: frame.x = 1 # Raises error x = locals()[varname] Replaced by: x = getattr(frame, varname) Advantages: - Python becomes more consistent w.r.t. namespacing and scopes. Disadvantages: - "module" is already possible by importing one's own module, but that is: * Confusing and unnecessarily requires naming one's self redundantly (Making renaming of the module a bit more difficult). * Not easily possible in a __main__/importable module. * No equivalent for locals() - Automatic script conversion may be difficult in some use cases of globals()/locals() From greg.ewing at canterbury.ac.nz Thu Oct 13 02:40:44 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 13 Oct 2005 13:40:44 +1300 Subject: [Python-Dev] Autoloading? (Making Queue.Queue easier to use) In-Reply-To: <20051012043518.x88h3g8pehsgo84c@login.werra.lunarpages.com> References: <20051012043518.x88h3g8pehsgo84c@login.werra.lunarpages.com> Message-ID: <434DAD0C.3060507@canterbury.ac.nz> Michael Chermside wrote: > John, I think what Greg is suggesting is that we include Queue > in the threading module, but that we use a Clever Trick(TM) to > address Guido's point by not actually loading the Queue code > until the first time (if ever) that it is used.
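For what it's worth, the Clever Trick(TM) became straightforward much later: Python 3.7 added module-level __getattr__ (PEP 562), which lets a module defer an import until first attribute access. A rough sketch, building the module object by hand rather than from a file:

```python
import sys
import types

# Build the module object by hand instead of from a file, purely for the demo.
mod = types.ModuleType("lazydemo")
loaded = []  # records when the deferred import actually happens

def _lazy_getattr(name):
    # Module-level __getattr__ (PEP 562, honored since Python 3.7): called
    # only when normal attribute lookup on the module fails.
    if name == "Queue":
        from queue import Queue  # the real import, deferred to first access
        loaded.append(name)
        mod.Queue = Queue        # cache it so __getattr__ isn't hit again
        return Queue
    raise AttributeError(name)

mod.__getattr__ = _lazy_getattr
sys.modules["lazydemo"] = mod

import lazydemo
print(loaded)         # -> [] (nothing imported yet)
q = lazydemo.Queue()  # first access triggers the deferred import
print(loaded)         # -> ['Queue']
```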
I wasn't actually going so far as to suggest doing this, rather pointing out that, if we had an autoloading mechanism, this would be an obvious use case for it. > I'm not familiar with the clever trick Greg is proposing, I'll see if I can cook up an example of it to show. Be warned, it is very hackish... -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg.ewing at canterbury.ac.nz +--------------------------------------+ From greg.ewing at canterbury.ac.nz Thu Oct 13 02:47:29 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 13 Oct 2005 13:47:29 +1300 Subject: [Python-Dev] Assignment to __class__ of module? (Autoloading? (Making Queue.Queue easier to use)) In-Reply-To: <20051012043518.x88h3g8pehsgo84c@login.werra.lunarpages.com> References: <20051012043518.x88h3g8pehsgo84c@login.werra.lunarpages.com> Message-ID: <434DAEA1.9000904@canterbury.ac.nz> I just tried to implement an autoloader using a technique I'm sure I used in an earlier Python version, but it no longer seems to be allowed. I'm trying to change the __class__ of a newly-imported module to a subclass of types.ModuleType, but I'm getting TypeError: __class__ assignment: only for heap types Have the rules concerning assignment to __class__ been made more restrictive recently? -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg.ewing at canterbury.ac.nz +--------------------------------------+ From pje at telecommunity.com Thu Oct 13 04:16:57 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Wed, 12 Oct 2005 22:16:57 -0400 Subject: [Python-Dev] Assignment to __class__ of module? (Autoloading?
(Making Queue.Queue easier to use)) In-Reply-To: <434DAEA1.9000904@canterbury.ac.nz> References: <20051012043518.x88h3g8pehsgo84c@login.werra.lunarpages.com> <20051012043518.x88h3g8pehsgo84c@login.werra.lunarpages.com> Message-ID: <5.1.1.6.0.20051012221440.01f41e10@mail.telecommunity.com> At 01:47 PM 10/13/2005 +1300, Greg Ewing wrote: >I just tried to implement an autoloader using a technique >I'm sure I used in an earlier Python version, but it no >longer seems to be allowed. > >I'm trying to change the __class__ of a newly-imported >module to a subclass of types.ModuleType, but I'm getting > > TypeError: __class__ assignment: only for heap types > >Have the rules concerning assignent to __class__ been >made more restrictive recently? It happened in Python 2.3, actually. The best way to work around this is to add an instance of your subclass to sys.modules *first*, then call reload() on it to make the normal import process work. PEAK uses this to implement lazy loading. Actually, for your purposes, you might be able to just replace the module object and copy its contents to the new module's dictionary. From greg.ewing at canterbury.ac.nz Thu Oct 13 05:44:12 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 13 Oct 2005 16:44:12 +1300 Subject: [Python-Dev] Autoloading? (Making Queue.Queue easier to use) In-Reply-To: <101220051537.5148.434D2DB3000F41170000141C22007503300E9D0E030E0CD203D202080106@comcast.net> References: <101220051537.5148.434D2DB3000F41170000141C22007503300E9D0E030E0CD203D202080106@comcast.net> Message-ID: <434DD80C.80207@canterbury.ac.nz> john.m.camara at comcast.net wrote: > I now see where Greg is coming from but I'm still not comfortable having > it in the threading module. To me threads and queues are two different > beasts. 
All right then, how about putting it in a module called threadutils or something like that, which is clearly related to threading, but is open for the addition of future thread-related features that might arise. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg.ewing at canterbury.ac.nz +--------------------------------------+ From greg.ewing at canterbury.ac.nz Thu Oct 13 05:44:21 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 13 Oct 2005 16:44:21 +1300 Subject: [Python-Dev] Autoloading? (Making Queue.Queue easier to use) In-Reply-To: <20051012072528.68vls5q6143k08c0@login.werra.lunarpages.com> References: <20051012072528.68vls5q6143k08c0@login.werra.lunarpages.com> Message-ID: <434DD815.8070909@canterbury.ac.nz> Michael Chermside wrote: > # start of module > initialized = False > > def doSomething(): > if not initialized: > initialize() But how do you do this if the thing in question is a class rather than a function? The module could export a function getSomeClass() that clients were required to use instead of just referencing the class, but that would be klunky in the extreme. BTW, I agree that special *syntax* isn't necessarily needed. But it does seem to me that some sort of hook is needed somewhere to make this doable smoothly, that doesn't exist today. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg.ewing at canterbury.ac.nz +--------------------------------------+ From greg.ewing at canterbury.ac.nz Thu Oct 13 05:44:25 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 13 Oct 2005 16:44:25 +1300 Subject: [Python-Dev] Autoloading? 
(Making Queue.Queue easier to use) In-Reply-To: <101220051312.4429.434D0BA5000808D40000114D22007503300E9D0E030E0CD203D202080106@comcast.net> References: <101220051312.4429.434D0BA5000808D40000114D22007503300E9D0E030E0CD203D202080106@comcast.net> Message-ID: <434DD819.7090705@canterbury.ac.nz> john.m.camara at comcast.net wrote: > I guess from Greg's comments I'm not sure if he wants to > > import threading > > and as a result > > "Queue" becomes available in the local namespace No!!! > Queue becomes part of the threading namespace and bound/loaded > when it is first needed. Queue then becomes accessible through > "threading.Queue" Yes. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg.ewing at canterbury.ac.nz +--------------------------------------+ From greg.ewing at canterbury.ac.nz Thu Oct 13 07:20:58 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 13 Oct 2005 18:20:58 +1300 Subject: [Python-Dev] Assignment to __class__ of module? (Autoloading? (Making Queue.Queue easier to use)) In-Reply-To: <5.1.1.6.0.20051012221440.01f41e10@mail.telecommunity.com> References: <20051012043518.x88h3g8pehsgo84c@login.werra.lunarpages.com> <20051012043518.x88h3g8pehsgo84c@login.werra.lunarpages.com> <5.1.1.6.0.20051012221440.01f41e10@mail.telecommunity.com> Message-ID: <434DEEBA.7050905@canterbury.ac.nz> Phillip J. Eby wrote: > At 01:47 PM 10/13/2005 +1300, Greg Ewing wrote: > >> I'm trying to change the __class__ of a newly-imported >> module to a subclass of types.ModuleType > > It happened in Python 2.3, actually. Is there a discussion anywhere about the reason this was done? It would be useful if this capability could be regained somehow without breaking things.
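The capability was in fact regained later: since Python 3.5 (the release that added importlib.util.LazyLoader, which relies on exactly this trick), a module's __class__ may once again be set to a types.ModuleType subclass. A minimal sketch of Greg's original approach as it works on 3.5+:

```python
import sys
import types

class AutoloadingModule(types.ModuleType):
    """Module subclass that fills in missing attributes on first access."""
    def __getattr__(self, name):
        if name == "Queue":
            from queue import Queue  # deferred import, as in the autoloader idea
            setattr(self, name, Queue)  # cache on the module
            return Queue
        raise AttributeError(name)

mod = types.ModuleType("classdemo")
sys.modules["classdemo"] = mod

# Between Python 2.3 and 3.4 the next line raised
# "TypeError: __class__ assignment: only for heap types"; on 3.5+ it works.
mod.__class__ = AutoloadingModule

import classdemo
print(classdemo.Queue)  # supplied lazily by the subclass's __getattr__
```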
-- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg.ewing at canterbury.ac.nz +--------------------------------------+ From greg.ewing at canterbury.ac.nz Thu Oct 13 07:25:58 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 13 Oct 2005 18:25:58 +1300 Subject: [Python-Dev] Autoloading? (Making Queue.Queue easier to use) In-Reply-To: <434DAD0C.3060507@canterbury.ac.nz> References: <20051012043518.x88h3g8pehsgo84c@login.werra.lunarpages.com> <434DAD0C.3060507@canterbury.ac.nz> Message-ID: <434DEFE6.7030909@canterbury.ac.nz> I wrote: > I'll see if I can cook up an example of it to show. Be > warned, it is very hackish... Well, here it is. It's even slightly uglier than I thought it would be due to the inability to change the class of a module these days. When you run it, you should get Imported my_module Loading the spam module Glorious processed meat product! Glorious processed meat product! #-------------------------------------------------------------- # # test.py # import my_module print "Imported my_module" my_module.spam() my_module.spam() # # my_module.py # import autoloading autoloading.register(__name__, {'spam': 'spam_module'}) # # spam_module.py # print "Loading the spam module" def spam(): print "Glorious processed meat product!" # # autoloading.py # import sys class AutoloadingModule(object): def __getattr__(self, name): modname = self.__dict__['_autoload'][name] module = __import__(modname, self.__dict__, {}, [name]) value = getattr(module, name) setattr(self, name, value) return value def register(module_name, mapping): module = sys.modules[module_name] m2 = AutoloadingModule() m2.__name__ = module.__name__ m2.__dict__ = module.__dict__ # Drop all references to the original module before assigning # the _autoload attribute. 
Otherwise, when the original module # gets cleared, _autoload is set to None. sys.modules[module_name] = m2 del module m2._autoload = mapping #-------------------------------------------------------------- -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg.ewing at canterbury.ac.nz +--------------------------------------+ From ncoghlan at gmail.com Thu Oct 13 11:47:31 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 13 Oct 2005 19:47:31 +1000 Subject: [Python-Dev] Extending tuple unpacking In-Reply-To: <434D7784.7020209@ronadam.com> References: <2773CAC687FD5F4689F526998C7E4E5F4DB6B6@au3010avexu1.global.avaya.com> <434B3C01.5030001@ronadam.com> <434B6C3A.7020001@canterbury.ac.nz> <434C235D.1060404@ronadam.com> <434CE0D4.3070809@gmail.com> <434D7784.7020209@ronadam.com> Message-ID: <434E2D33.50204@gmail.com> Ron Adam wrote: > I wonder if you make '*' work outside of functions arguments lists, if > requests to do the same for '**' would follow? Only if keyword unpacking were to be permitted elsewhere first. That is: Py> data = dict(a=1, b=2, c=3) Py> (a, b, c) = **data Py> print a, b, c (1, 2, 3) Cheers, Nick. 
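The effect of the hypothetical `(a, b, c) = **data` syntax can be approximated with `operator.itemgetter`, which returns a plain tuple when given several keys and so feeds ordinary unpacking:

```python
from operator import itemgetter

data = dict(a=1, b=2, c=3)

# itemgetter with several keys returns a tuple, ready for normal unpacking;
# the names and the dict keys stay visibly paired.
a, b, c = itemgetter("a", "b", "c")(data)

print(a, b, c)  # -> 1 2 3
```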
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.blogspot.com From ncoghlan at gmail.com Thu Oct 13 11:54:15 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 13 Oct 2005 19:54:15 +1000 Subject: [Python-Dev] Extending tuple unpacking In-Reply-To: <20051012045438.gz4jb9pc1wwskwcw@login.werra.lunarpages.com> References: <20051012045438.gz4jb9pc1wwskwcw@login.werra.lunarpages.com> Message-ID: <434E2EC7.5060101@gmail.com> Michael Chermside wrote: > Guido writes: > >>I've always wanted to write that as >> >> f(a, b, *args, foo=1, bar=2, **kwds) >> >>but the current grammar doesn't allow it. > > > Hmm.... why doesn't the current grammar allow it, and can we fix that? > I don't see that it's a limitation of the one-token-lookahead, could > we permit this syntax by rearranging bits of the grammar? I griped about this a while back, and got the impression from Guido that fixing it was possible, but it had simply never bugged anyone enough for them to actually get around to fixing it. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.blogspot.com From ncoghlan at gmail.com Thu Oct 13 12:46:42 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 13 Oct 2005 20:46:42 +1000 Subject: [Python-Dev] threadtools (was Re: Autoloading? (Making Queue.Queue easier to use)) In-Reply-To: <17229.13214.981827.304999@montanaro.dyndns.org> References: <20051012043518.x88h3g8pehsgo84c@login.werra.lunarpages.com> <17229.13214.981827.304999@montanaro.dyndns.org> Message-ID: <434E3B12.1000108@gmail.com> skip at pobox.com wrote: > Guido> At some level, Queue is just an application of threading, while > Guido> the threading module provides the basic API ...
> > While Queue is built on top of threading Lock and Condition objects, it is a > highly useful synchronization mechanism in its own right, and is almost > certainly easier to use correctly (at least for novices) than the > lower-level synchronization objects the threading module provides. If > threading is the "friendly" version of thread, perhaps Queue should be > considered the "friendly" synchronization object. > > (I'm playing the devil's advocate here. I'm fine with Queue being where it > is.) If we *don't* make Queue a part of the basic threading API (and I think Guido is right that it doesn't need to be), then I suggest we create a threadtools module. So the thread-related API would actually have three layers: - _thread (currently "thread") for the low-level guts - threading for the basic thread API that any threaded app needs - threadtools for the more complex "application-specific" items Initially threadtools would just contain Queue, but other candidates for inclusion in the future might be standard implementations of: - PeriodicTimer (see below) - FutureCall (threading out a call, only blocking when you need the result) - QueueThread (a thread with "inbox" and "outbox" Queues) - ThreadPool (up to the application to make sure the Threads are reusable) - threading-related decorators Cheers, Nick. P.S. PeriodicTimer would be a variant of threading.Timer which simply replaces the run method with: def run(self): while 1: self.finished.wait(self.interval) if self.finished.isSet(): break self.function(*self.args, **self.kwds) -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.blogspot.com From ncoghlan at gmail.com Thu Oct 13 12:51:05 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 13 Oct 2005 20:51:05 +1000 Subject: [Python-Dev] Autoloading?
(Making Queue.Queue easier to use) In-Reply-To: <101220051312.4429.434D0BA5000808D40000114D22007503300E9D0E030E0CD203D202080106@comcast.net> References: <101220051312.4429.434D0BA5000808D40000114D22007503300E9D0E030E0CD203D202080106@comcast.net> Message-ID: <434E3C19.10101@gmail.com> john.m.camara at comcast.net wrote: > It sounds like he feels Queue should just be part of threading but queues > can be used in other contexts besides threading. So having separate > modules is a good thing. If threads aren't involved, you should use "collections.deque" directly, rather than going through "Queue.Queue". The latter jumps through a lot of hoops in order to be thread-safe. This confusion is one of the reasons I have a problem with the current name of the Queue module. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.blogspot.com From steve at holdenweb.com Thu Oct 13 12:55:52 2005 From: steve at holdenweb.com (Steve Holden) Date: Thu, 13 Oct 2005 11:55:52 +0100 Subject: [Python-Dev] Extending tuple unpacking In-Reply-To: <434E2D33.50204@gmail.com> References: <2773CAC687FD5F4689F526998C7E4E5F4DB6B6@au3010avexu1.global.avaya.com> <434B3C01.5030001@ronadam.com> <434B6C3A.7020001@canterbury.ac.nz> <434C235D.1060404@ronadam.com> <434CE0D4.3070809@gmail.com> <434D7784.7020209@ronadam.com> <434E2D33.50204@gmail.com> Message-ID: Nick Coghlan wrote: > Ron Adam wrote: > >>I wonder if you make '*' work outside of functions arguments lists, if >>requests to do the same for '**' would follow? > > > Only if keyword unpacking were to be permitted elsewhere first. That is: > > Py> data = dict(a=1, b=2, c=3) > Py> (a, b, c) = **data > Py> print a, b, c > (1, 2, 3) > > Cheers, > Nick. > This gets too weird, though. What about: (a, **d) = **data Should this be equivalent to a = 1 d = dict(b=2, c=3) ? Basically I suspect we are heading towards the outer limits here. 
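For what it's worth, the semantics Steve sketches for `(a, **d) = **data` already have a plain (if wordier) spelling: copy the mapping and pop the explicitly named keys.

```python
data = dict(a=1, b=2, c=3)

# Equivalent of the hypothetical (a, **d) = **data
d = dict(data)  # copy, so the original mapping is untouched
a = d.pop("a")  # bind the named key; the rest stays in d

print(a)  # -> 1
print(d)  # -> {'b': 2, 'c': 3}
```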
regards Steve -- Steve Holden +44 150 684 7255 +1 800 494 3119 Holden Web LLC www.holdenweb.com PyCon TX 2006 www.python.org/pycon/ From ncoghlan at gmail.com Thu Oct 13 13:41:31 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 13 Oct 2005 21:41:31 +1000 Subject: [Python-Dev] Autoloading? (Making Queue.Queue easier to use) In-Reply-To: <434DD815.8070909@canterbury.ac.nz> References: <20051012072528.68vls5q6143k08c0@login.werra.lunarpages.com> <434DD815.8070909@canterbury.ac.nz> Message-ID: <434E47EB.8090909@gmail.com> Greg Ewing wrote: > BTW, I agree that special *syntax* isn't necessarily > needed. But it does seem to me that some sort of > hook is needed somewhere to make this doable > smoothly, that doesn't exist today. Having module attribute access obey the descriptor protocol (__get__, __set__, __delete__) sounds like a pretty good option to me. It would even be pretty backwards compatible, as I'd be hardpressed to think why anyone would have a descriptor *instance* as a top-level object in a module (descriptor definition, yes, but not an instance). Consider lazy instance attributes: Py> def lazyattr(func): ... class wrapper(object): ... def __get__(self, instance, cls): ... val = func() ... setattr(instance, func.__name__, val) ... return val ... return wrapper() ... Py> class test(object): ... @lazyattr ... def foo(): ... print "Evaluating foo!" ... return "Instance attribute" ... Py> t = test() Py> t.foo Evaluating foo! 'Instance attribute' Py> t.foo 'Instance attribute' And lazy class attributes: Py> def lazyclassattr(func): ... class wrapper(object): ... def __get__(self, instance, cls): ... val = func() ... setattr(cls, func.__name__, val) ... return val ... return wrapper() ... Py> class test(object): ... @lazyclassattr ... def bar(): ... print "Evaluating bar!" ... return "Class attribute" ... Py> test.bar Evaluating bar! 
'Class attribute' Py> test.bar 'Class attribute' Unfortunately, that trick doesn't work at the module level: Py> def lazymoduleattr(func): ... class wrapper(object): ... def __get__(self, instance, cls): ... val = func() ... globals()[func.__name__] = val ... return val ... return wrapper() ... Py> @lazymoduleattr ... def baz(): ... print "Evaluating baz!" ... return "Module attribute" ... Py> baz # Descriptor not invoked <__main__.wrapper object at 0x00B9E3B0> Py> import sys Py> main = sys.modules["__main__"] Py> main.baz # Descriptor STILL not invoked :( <__main__.wrapper object at 0x00B9E3B0> But putting the exact same descriptor in a class lets it work its magic: Py> class lazy(object): ... baz = baz ... Py> lazy.baz Evaluating baz! 'Module attribute' Py> baz 'Module attribute' Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.blogspot.com From gjc at inescporto.pt Thu Oct 13 15:07:06 2005 From: gjc at inescporto.pt (Gustavo J. A. M. Carneiro) Date: Thu, 13 Oct 2005 14:07:06 +0100 Subject: [Python-Dev] Making Queue.Queue easier to use In-Reply-To: <434B8DBF.9080509@iinet.net.au> References: <434B8DBF.9080509@iinet.net.au> Message-ID: <1129208826.31838.13.camel@localhost> I'd just like to point out that Queue is not quite as useful as people seem to think in this thread. The main problem is that I can't integrate Queue into a select/poll based main loop. The other day I wanted extended a python main loop, which uses poll(), to be thread safe, so I could queue idle functions from separate threads. Obviously Queue doesn't work (no file descriptor to poll), so I just ended up creating a pipe, to which I send a single byte when I want to "wake up" the main loop to make it realize changes in its configuration, such as a new callback added. I guess this is partly an unix problem. 
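[Editorial note: the pipe workaround Gustavo describes is the classic "self-pipe trick". A minimal sketch, assuming a Unix platform; the function names here are illustrative, not from his code.]

```python
import os
import select

# A pipe whose read end is watched by the select()-based main loop.
wakeup_r, wakeup_w = os.pipe()

def wake_main_loop():
    # Safe to call from any thread: the single byte makes select() return.
    os.write(wakeup_w, b"x")

def main_loop(watched_fds):
    while True:
        ready, _, _ = select.select([wakeup_r] + watched_fds, [], [])
        if wakeup_r in ready:
            os.read(wakeup_r, 1)   # drain the wake-up byte
            # ...re-check for callbacks queued by other threads here...
```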
There's no system call to say like "wake me up when one of these descriptors has data OR when this condition variable is set". Windows has WaitForMultipleObjects, which I suspect is quite a bit more powerful. Regards. -- Gustavo J. A. M. Carneiro The universe is always one step beyond logic. From mwh at python.net Thu Oct 13 16:36:21 2005 From: mwh at python.net (Michael Hudson) Date: Thu, 13 Oct 2005 15:36:21 +0100 Subject: [Python-Dev] Pythonic concurrency In-Reply-To: <638942434.20051011105302@MailBlocks.com> (Bruce Eckel's message of "Tue, 11 Oct 2005 10:53:02 -0600") References: <05Oct10.180605pdt."58617"@synergy1.parc.xerox.com> <638942434.20051011105302@MailBlocks.com> Message-ID: <2machdbi1m.fsf@starship.python.net> Bruce Eckel writes: > Not only are there significant new library components in > java.util.concurrent in J2SE5, but perhaps more important is the new > memory model that deals with issues that are (especially) revealed in > multiprocessor environments. The new memory model represents new work > in the computer science field; apparently the original paper is > written by Ph.D.s and is a bit too theoretical for the normal person > to follow. But the smart threading guys studied this and came up with > the new Java memory model -- so that volatile, for example, which > didn't work quite right before, does now. This is part of J2SE5, and > this work is being incorporated into the upcoming C++0x. Do you have a link that explains this sort of thing for the layman? Cheers, mwh -- When physicists speak of a TOE, they don't really mean a theory of *everything*. Taken literally, "Everything" covers a lot of ground, including biology, art, decoherence and the best way to barbecue ribs. -- John Baez, sci.physics.research From mwh at python.net Thu Oct 13 17:02:17 2005 From: mwh at python.net (Michael Hudson) Date: Thu, 13 Oct 2005 16:02:17 +0100 Subject: [Python-Dev] Assignment to __class__ of module? (Autoloading? 
(Making Queue.Queue easier to use)) In-Reply-To: <434DEEBA.7050905@canterbury.ac.nz> (Greg Ewing's message of "Thu, 13 Oct 2005 18:20:58 +1300") References: <20051012043518.x88h3g8pehsgo84c@login.werra.lunarpages.com> <20051012043518.x88h3g8pehsgo84c@login.werra.lunarpages.com> <5.1.1.6.0.20051012221440.01f41e10@mail.telecommunity.com> <434DEEBA.7050905@canterbury.ac.nz> Message-ID: <2m1x2pbgue.fsf@starship.python.net> Greg Ewing writes: > Phillip J. Eby wrote: >> At 01:47 PM 10/13/2005 +1300, Greg Ewing wrote: >> >>> I'm trying to change the __class__ of a newly-imported >>> module to a subclass of types.ModuleType >> >> It happened in Python 2.3, actually. > > Is there a discussion anywhere about the reason this was > done? It would be useful if this capability could be > regained somehow without breaking things. Well, I think it's undesirable that you be able to do this to, e.g., strings. Modules are something of a greyer area, I guess. Cheers, mwh -- You sound surprised. We're talking about a government department here - they have procedures, not intelligence. -- Ben Hutchings, cam.misc From skip at pobox.com Thu Oct 13 18:07:07 2005 From: skip at pobox.com (skip@pobox.com) Date: Thu, 13 Oct 2005 11:07:07 -0500 Subject: [Python-Dev] Threading and synchronization primitives In-Reply-To: <434DD80C.80207@canterbury.ac.nz> References: <101220051537.5148.434D2DB3000F41170000141C22007503300E9D0E030E0CD203D202080106@comcast.net> <434DD80C.80207@canterbury.ac.nz> Message-ID: <17230.34347.295074.10528@montanaro.dyndns.org> Greg> All right then, how about putting it in a module called Greg> threadutils or something like that, which is clearly related to Greg> threading, but is open for the addition of future thread-related Greg> features that might arise. Then Lock, RLock, Semaphore, etc belong there instead of in threading don't they? We have two things here, the basic thread object and the stuff it does (run, start, etc) and the synchronization primitives. 
Thread objects come in two levels of abstraction: thread.thread and threading.Thread. The synchronization primitives come in three levels of abstraction: thread.lock, threading.{Lock,Semaphore,...} and Queue.Queue. Each level of abstraction builds on the level below. In the typical case I think we want to encourage programmers to use the highest levels of abstraction available and leave the lower level stuff to the real pros. That means most programmers using threads should use threading.Thread and Queue.Queue. Partitioning the various classes to different modules might look like this: Module Thread Classes Sync Primitives ------ -------------- --------------- _thread thread lock threadutils Lock, RLock, Semaphore thread Thread Queue Programmers would clearly be discouraged from using the _thread module (currently thread). The typical case would be to import the thread module (currently threading) and use its Thread and Queue objects. For specialized use the threadutils programmer can import the threadutils module to get at the synchronization primitives it contains. Skip From skip at pobox.com Thu Oct 13 18:15:15 2005 From: skip at pobox.com (skip@pobox.com) Date: Thu, 13 Oct 2005 11:15:15 -0500 Subject: [Python-Dev] threadtools (was Re: Autoloading? 
(Making Queue.Queue easier to use)) In-Reply-To: <434E3B12.1000108@gmail.com> References: <20051012043518.x88h3g8pehsgo84c@login.werra.lunarpages.com> <17229.13214.981827.304999@montanaro.dyndns.org> <434E3B12.1000108@gmail.com> Message-ID: <17230.34835.844742.854871@montanaro.dyndns.org> Nick> So the thread-related API would actually have three layers: Nick> - _thread (currently "_thread") for the low-level guts Nick> - threading for the basic thread API that any threaded app needs Nick> - threadtools for the more complex "application-specific" items Nick> Initially threadtools would just contain Queue, but other candidates for Nick> inclusion in the future might be standard implementations of: Nick> - PeriodicTimer (see below) Nick> - FutureCall (threading out a call, only blocking when you need the result Nick> - QueueThread (a thread with "inbox" and "outbox" Queues) Nick> - ThreadPool (up to the application to make sure the Threads are reusable) Nick> - threading related decorators Given your list of stuff to go in a threadtools module, I still think you need something to hold Lock, RLock, Condition and Semaphore. See my previous post (subject: Threading and synchronization primitives) about a threadutils module to hold these somewhat lower-level sync primitives. In most cases I don't think programmers need them. OTOH, providing some higher level abstractions seems to make sense. (I have to admit I have no idea what a QueueThread's outbox queue would be used for. Queues are generally multi-producer, single-consumer objects. It makes sense for a thread to have an inbox. I'm not so sure about an outbox.) Skip From aahz at pythoncraft.com Thu Oct 13 19:08:17 2005 From: aahz at pythoncraft.com (Aahz) Date: Thu, 13 Oct 2005 10:08:17 -0700 Subject: [Python-Dev] threadtools (was Re: Autoloading? 
(Making Queue.Queue easier to use)) In-Reply-To: <17230.34835.844742.854871@montanaro.dyndns.org> References: <20051012043518.x88h3g8pehsgo84c@login.werra.lunarpages.com> <17229.13214.981827.304999@montanaro.dyndns.org> <434E3B12.1000108@gmail.com> <17230.34835.844742.854871@montanaro.dyndns.org> Message-ID: <20051013170817.GA20568@panix.com> On Thu, Oct 13, 2005, skip at pobox.com wrote: > > Given your list of stuff to go in a threadtools module, I still think > you need something to hold Lock, RLock, Condition and Semaphore. See > my previous post (subject: Threading and synchronization primitives) > about a threadutils module to hold these somewhat lower-level sync > primitives. In most cases I don't think programmers need them. OTOH, > providing some higher level abstractions seems to make sense. (I > have to admit I have no idea what a QueueThread's outbox queue would > be used for. Queues are generally multi-producer, single-consumer > objects. It makes sense for a thread to have an inbox. I'm not so > sure about an outbox.) If you look at my thread tutorial, the spider thread pool uses a single-producer, multiple-consumer queue to feed URLs to the retrieving threads. -- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ "If you think it's expensive to hire a professional to do the job, wait until you hire an amateur." --Red Adair From BruceEckel-Python3234 at mailblocks.com Thu Oct 13 19:10:03 2005 From: BruceEckel-Python3234 at mailblocks.com (Bruce Eckel) Date: Thu, 13 Oct 2005 11:10:03 -0600 Subject: [Python-Dev] Pythonic concurrency In-Reply-To: <2machdbi1m.fsf@starship.python.net> References: <05Oct10.180605pdt."58617"@synergy1.parc.xerox.com> <638942434.20051011105302@MailBlocks.com> <2machdbi1m.fsf@starship.python.net> Message-ID: <1569974120.20051013111003@MailBlocks.com> I don't know of anything that exists. 
There is an upcoming book that may help: Java Concurrency in Practice, by Brian Goetz, Tim Peierls, Joshua Bloch, Joseph Bowbeer, David Holmes, and Doug Lea (Addison-Wesley 2006). I have had assistance from some of the authors, but don't know if it introduces the concepts from the research paper. Estimated publication is February. However, you might get something from Scott Meyer's analysis of the concurrency issues surrounding the double-checked locking algorithm: http://www.aristeia.com/Papers/DDJ_Jul_Aug_2004_revised.pdf Thursday, October 13, 2005, 8:36:21 AM, Michael Hudson wrote: > Bruce Eckel writes: >> Not only are there significant new library components in >> java.util.concurrent in J2SE5, but perhaps more important is the new >> memory model that deals with issues that are (especially) revealed in >> multiprocessor environments. The new memory model represents new work >> in the computer science field; apparently the original paper is >> written by Ph.D.s and is a bit too theoretical for the normal person >> to follow. But the smart threading guys studied this and came up with >> the new Java memory model -- so that volatile, for example, which >> didn't work quite right before, does now. This is part of J2SE5, and >> this work is being incorporated into the upcoming C++0x. > Do you have a link that explains this sort of thing for the layman? > Cheers, > mwh Bruce Eckel http://www.BruceEckel.com mailto:BruceEckel-Python3234 at mailblocks.com Contains electronic books: "Thinking in Java 3e" & "Thinking in C++ 2e" Web log: http://www.artima.com/weblogs/index.jsp?blogger=beckel Subscribe to my newsletter: http://www.mindview.net/Newsletter My schedule can be found at: http://www.mindview.net/Calendar From pje at telecommunity.com Thu Oct 13 19:46:28 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Thu, 13 Oct 2005 13:46:28 -0400 Subject: [Python-Dev] Assignment to __class__ of module? (Autoloading? 
(Making Queue.Queue easier to use)) In-Reply-To: <2m1x2pbgue.fsf@starship.python.net> References: <434DEEBA.7050905@canterbury.ac.nz> <20051012043518.x88h3g8pehsgo84c@login.werra.lunarpages.com> <20051012043518.x88h3g8pehsgo84c@login.werra.lunarpages.com> <5.1.1.6.0.20051012221440.01f41e10@mail.telecommunity.com> <434DEEBA.7050905@canterbury.ac.nz> Message-ID: <5.1.1.6.0.20051013134103.01f23a18@mail.telecommunity.com> At 04:02 PM 10/13/2005 +0100, Michael Hudson wrote: >Greg Ewing writes: > > > Phillip J. Eby wrote: > >> At 01:47 PM 10/13/2005 +1300, Greg Ewing wrote: > >> > >>> I'm trying to change the __class__ of a newly-imported > >>> module to a subclass of types.ModuleType > >> > >> It happened in Python 2.3, actually. > > > > Is there a discussion anywhere about the reason this was > > done? It would be useful if this capability could be > > regained somehow without breaking things. > >Well, I think it's undesirable that you be able to do this to, e.g., >strings. Modules are something of a greyer area, I guess. Actually, it's desirable to be *able* to do it for anything. But certainly for otherwise-immutable objects it can lead to aliasing issues. For mutable objects, it's *very* desirable, and I think the rules added in 2.3 might have been overly strict, as they disallow you changing any built-in type to a non built-in type, even if the allocator is the same. It seems to me the safety check could perhaps be reduced to just checking whether the old and new classes have the same tp_free. (Apart from the layout and other inheritance-related checks, I mean.) (By the way, for an example use case other than modules, note that somebody wrote an "observables" package that could detect mutation of lists and dictionaries in Python 2.2 using __class__ changes, which then became useless as of Python 2.3.) 
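[Editorial note: one way to get descriptor behaviour on a module without assigning to `__class__` at all is to plant an instance of a `types.ModuleType` subclass in `sys.modules` before anyone imports it; a later `import` statement simply returns that instance. A sketch; the `demo` and `answer` names are invented for illustration.]

```python
import sys
import types

class ModuleWithProperties(types.ModuleType):
    # Class-level descriptors fire normally on instances of this subclass.
    @property
    def answer(self):
        return 6 * 7

mod = ModuleWithProperties('demo')
sys.modules['demo'] = mod   # a later 'import demo' now yields this instance

import demo
assert demo.answer == 42    # the property ran on attribute access
```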
From eyal.lotem at gmail.com Thu Oct 13 19:52:32 2005 From: eyal.lotem at gmail.com (Eyal Lotem) Date: Thu, 13 Oct 2005 19:52:32 +0200 Subject: [Python-Dev] Assignment to __class__ of module? (Autoloading? (Making Queue.Queue easier to use)) In-Reply-To: <5.1.1.6.0.20051013134103.01f23a18@mail.telecommunity.com> References: <20051012043518.x88h3g8pehsgo84c@login.werra.lunarpages.com> <5.1.1.6.0.20051012221440.01f41e10@mail.telecommunity.com> <434DEEBA.7050905@canterbury.ac.nz> <2m1x2pbgue.fsf@starship.python.net> <5.1.1.6.0.20051013134103.01f23a18@mail.telecommunity.com> Message-ID: Why not lazily import modules by importing them when they are needed (i.e inside functions), and not in the top-level module scope? On 10/13/05, Phillip J. Eby wrote: > At 04:02 PM 10/13/2005 +0100, Michael Hudson wrote: > >Greg Ewing writes: > > > > > Phillip J. Eby wrote: > > >> At 01:47 PM 10/13/2005 +1300, Greg Ewing wrote: > > >> > > >>> I'm trying to change the __class__ of a newly-imported > > >>> module to a subclass of types.ModuleType > > >> > > >> It happened in Python 2.3, actually. > > > > > > Is there a discussion anywhere about the reason this was > > > done? It would be useful if this capability could be > > > regained somehow without breaking things. > > > >Well, I think it's undesirable that you be able to do this to, e.g., > >strings. Modules are something of a greyer area, I guess. > > Actually, it's desirable to be *able* to do it for anything. But certainly > > for otherwise-immutable objects it can lead to aliasing issues. > > For mutable objects, it's *very* desirable, and I think the rules added in > 2.3 might have been overly strict, as they disallow you changing any > built-in type to a non built-in type, even if the allocator is the > same. It seems to me the safety check could perhaps be reduced to just > checking whether the old and new classes have the same tp_free. (Apart > from the layout and other inheritance-related checks, I mean.) 
> > (By the way, for an example use case other than modules, note that somebody > > wrote an "observables" package that could detect mutation of lists and > dictionaries in Python 2.2 using __class__ changes, which then became > useless as of Python 2.3.) > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > http://mail.python.org/mailman/options/python-dev/eyal.lotem%40gmail.com > From jcarlson at uci.edu Thu Oct 13 20:13:23 2005 From: jcarlson at uci.edu (Josiah Carlson) Date: Thu, 13 Oct 2005 11:13:23 -0700 Subject: [Python-Dev] Assignment to __class__ of module? (Autoloading? (Making Queue.Queue easier to use)) In-Reply-To: References: <5.1.1.6.0.20051013134103.01f23a18@mail.telecommunity.com> Message-ID: <20051013110837.918C.JCARLSON@uci.edu> Eyal Lotem wrote: > Why not lazily import modules by importing them when they are needed > (i.e inside functions), and not in the top-level module scope? Because then it wouldn't be automatic. The earlier portion of this discussion came from... import module #module.foo does not reference a module module.foo #now module.foo references a module The discussion is about how we can get that kind of behavior. - Josiah From fredrik at pythonware.com Thu Oct 13 20:07:08 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Thu, 13 Oct 2005 20:07:08 +0200 Subject: [Python-Dev] Autoloading? (Making Queue.Queue easier to use) References: <20051012043518.x88h3g8pehsgo84c@login.werra.lunarpages.com> Message-ID: Guido van Rossum wrote: > BTW, Queue.Queue violates a recent module naming standard; it is now > considered bad style to name the class and the module the same. > Modules and packages should have short all-lowercase names, classes > should be CapWords. Even the same but different case is bad style. 
unfortunately, this standard seem to result in generic "spamtools" modules into which people throw everything that's even remotely related to "spam", followed by complaints about bloat and performance from users, followed by various more or less stupid attempts to implement lazy loading of hidden in- ternal modules, followed by more complaints from users who no longer has a clear view of what's really going on in there... I think I'll stick to the old standard for a few more years... From guido at python.org Thu Oct 13 20:37:34 2005 From: guido at python.org (Guido van Rossum) Date: Thu, 13 Oct 2005 11:37:34 -0700 Subject: [Python-Dev] Autoloading? (Making Queue.Queue easier to use) In-Reply-To: References: <20051012043518.x88h3g8pehsgo84c@login.werra.lunarpages.com> Message-ID: On 10/13/05, Fredrik Lundh wrote: > Guido van Rossum wrote: > > > BTW, Queue.Queue violates a recent module naming standard; it is now > > considered bad style to name the class and the module the same. > > Modules and packages should have short all-lowercase names, classes > > should be CapWords. Even the same but different case is bad style. > > unfortunately, this standard seem to result in generic "spamtools" modules > into which people throw everything that's even remotely related to "spam", > followed by complaints about bloat and performance from users, followed by > various more or less stupid attempts to implement lazy loading of hidden in- > ternal modules, followed by more complaints from users who no longer has > a clear view of what's really going on in there... > > I think I'll stick to the old standard for a few more years... Yeah, until you've learned to use packages. :( -- --Guido van Rossum (home page: http://www.python.org/~guido/) From solipsis at pitrou.net Thu Oct 13 20:40:28 2005 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 13 Oct 2005 20:40:28 +0200 Subject: [Python-Dev] Autoloading? 
(Making Queue.Queue easier to use) In-Reply-To: References: <20051012043518.x88h3g8pehsgo84c@login.werra.lunarpages.com> Message-ID: <1129228828.7198.24.camel@fsol> > unfortunately, this standard seem to result in generic "spamtools" modules > into which people throw everything that's even remotely related to "spam", > followed by complaints about bloat and performance from users, followed by > various more or less stupid attempts to implement lazy loading of hidden in- > ternal modules, followed by more complaints from users who no longer has > a clear view of what's really going on in there... BTW, what's the performance problem in importing unnecessary stuff (assuming pyc files are already generated) ? Has it been evaluated somewhere ? From guido at python.org Thu Oct 13 20:42:01 2005 From: guido at python.org (Guido van Rossum) Date: Thu, 13 Oct 2005 11:42:01 -0700 Subject: [Python-Dev] Threading and synchronization primitives In-Reply-To: <17230.34347.295074.10528@montanaro.dyndns.org> References: <101220051537.5148.434D2DB3000F41170000141C22007503300E9D0E030E0CD203D202080106@comcast.net> <434DD80C.80207@canterbury.ac.nz> <17230.34347.295074.10528@montanaro.dyndns.org> Message-ID: On 10/13/05, skip at pobox.com wrote: > > Greg> All right then, how about putting it in a module called > Greg> threadutils or something like that, which is clearly related to > Greg> threading, but is open for the addition of future thread-related > Greg> features that might arise. > > Then Lock, RLock, Semaphore, etc belong there instead of in threading don't > they? No. Locks and semaphores are the lowest-level threading primitives. They go in the basic module. 
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Thu Oct 13 20:44:05 2005 From: guido at python.org (Guido van Rossum) Date: Thu, 13 Oct 2005 11:44:05 -0700 Subject: [Python-Dev] Making Queue.Queue easier to use In-Reply-To: <1129208826.31838.13.camel@localhost> References: <434B8DBF.9080509@iinet.net.au> <1129208826.31838.13.camel@localhost> Message-ID: On 10/13/05, Gustavo J. A. M. Carneiro wrote: > I'd just like to point out that Queue is not quite as useful as people > seem to think in this thread. The main problem is that I can't > integrate Queue into a select/poll based main loop. Well, you're mixing two incompatible paradigms there, so that's to be expected, right? Either you're using async I/O or you're using threads. Mixing the two causes confusion and bugs no matter what you try. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From jeremy at alum.mit.edu Thu Oct 13 22:52:14 2005 From: jeremy at alum.mit.edu (Jeremy Hylton) Date: Thu, 13 Oct 2005 16:52:14 -0400 Subject: [Python-Dev] AST branch update Message-ID: Neil and I have been working on the AST branch for the last week. We're nearly ready to merge the changes to the head. I imagine we'll do it this weekend, barring last minute glitches. There are a few open issues that remain. I'd like to merge the branch before resolving them. Please let me know if you disagree. The current status of the AST branch is that only two tests fail: test_trace and test_symtable. The causes for these failures are described below. We did not merge the current head to the branch again, but I diffed the test suite between head and branch and did not see any substantive changes since the last merge. Some of the finer points of generating the line number table (lnotab) are wrong. There is some very delicate code to support single stepping with the debugger. We'll get that fixed soon, but we'd like to temporarily disable the failing tests in test_trace. 
The symtable module exposed parts of the internal representation of the old symbol table. The representation changed, and the module is going to need to change. The old module was poorly documented and tested, so I'd like to start over. Again, I'd like to disable a couple of failing tests until after the merge occurs. I don't think the current test suite covers all of the possible syntax errors that can be raised. I'd like to add a new test suite that covers all of the remaining cases, perhaps moving some existing tests into this module as well. I'd like to do that after the merge, which means there may be some loose ends where syntax errors aren't handled gracefully. For those of you familiar with the ast work, I'll summarize the recent changes: We added line numbers to expressions in the AST. There are many cases where a statement spans multiple lines. We couldn't generate a correct lnotab without knowing the lines that expressions occur on. We merged the peephole optimizer into the new compiler and restored PyNode_Compile() so that the parser module works again. The parser module will still expose the old parse trees (just what it's users want). We should probably provide a similar AST module, but I'm not sure if we'll get to that. We fixed some flawed logic in the symbol table for handling nested scopes. Luckily, the test cases for nested scopes are pretty thorough. They all pass now. Jeremy From jepler at unpythonic.net Fri Oct 14 00:08:41 2005 From: jepler at unpythonic.net (jepler@unpythonic.net) Date: Thu, 13 Oct 2005 17:08:41 -0500 Subject: [Python-Dev] AST branch update In-Reply-To: References: Message-ID: <20051013220841.GB8826@unpythonic.net> I'm excited to see work continuing (resuming?) on the AST tree. I don't know how many machines you've been able to test the AST branch on. I have a linux/amd64 machine handy and I've tried to run the test suite with a fresh copy of the ast-branch. test_trace segfaults consistently, even when run alone. 
You didn't give me the impression that the failure was a segfault, so I'll include more information about it below. With '-x test_trace -x test_codecencodings_kr', I get through the testsuite run. Compared to a build of HEAD, also from today, I get additional failures in test_genexps test_grp test_pwd test_symtable and additional unexpected skips of: test_email test_email_codecs The 'pwd' and 'grp' failures look like they're due to a change not merged from HEAD. I'm not sure what to make of the 'genexps' failure. Is it just a harmless output difference? I didn't see you mention that in your message. Here is some of the relevant-looking output: $ ./python -E -tt ./Lib/test/regrtest.py [...] ********************************************************************** File "/usr/src/python-ast/Lib/test/test_genexps.py", line ?, in test.test_genexps.__test__.doctests Failed example: (y for y in (1,2)) = 10 Expected: Traceback (most recent call last): ... SyntaxError: assign to generator expression not possible Got: Traceback (most recent call last): File "/usr/src/python-ast/Lib/doctest.py", line 1243, in __run compileflags, 1) in test.globs File "", line 1 SyntaxError: assignment to generator expression not possible (, line 1) ********************************************************************** File "/usr/src/python-ast/Lib/test/test_genexps.py", line ?, in test.test_genexps.__test__.doctests Failed example: (y for y in (1,2)) += 10 Expected: Traceback (most recent call last): ... SyntaxError: augmented assign to tuple literal or generator expression not possible Got: Traceback (most recent call last): File "/usr/src/python-ast/Lib/doctest.py", line 1243, in __run compileflags, 1) in test.globs File "", line 1 SyntaxError: augmented assignment to generator expression not possible (, line 1) ********************************************************************** [...] 
test test_grp failed -- Traceback (most recent call last): File "/usr/src/python-ast/Lib/test/test_grp.py", line 29, in test_values e2 = grp.getgrgid(e.gr_gid) OverflowError: signed integer is greater than maximum [...] test test_pwd failed -- Traceback (most recent call last): File "/usr/src/python-ast/Lib/test/test_pwd.py", line 42, in test_values self.assert_(pwd.getpwuid(e.pw_uid) in entriesbyuid[e.pw_uid]) OverflowError: signed integer is greater than maximum The segfault in test_trace looks like this: $ gdb ./python (gdb) source Misc/gdbinit (gdb) run Lib/test/test_trace.py [...] test_10_no_jump_to_except_1 (__main__.JumpTestCase) ... FAIL Program received signal SIGSEGV, Segmentation fault. [Switching to Thread 46912496260768 (LWP 11945)] PyEval_EvalFrame (f=0x652c30) at Python/ceval.c:1994 [1967 case COMPARE_OP:) 1994 Py_DECREF(v); (gdb) print oparg $1 = 10 [PyCmp_EXC_MATCH?] (gdb) pyo v NULL $2 = void #0 PyEval_EvalFrame (f=0x652c30) at Python/ceval.c:1994 #1 0x0000000000475800 in PyEval_EvalFrame (f=0x697390) at Python/ceval.c:3618 #2 0x0000000000475800 in PyEval_EvalFrame (f=0x694f10) at Python/ceval.c:3618 #3 0x0000000000475800 in PyEval_EvalFrame (f=0x649fa0) at Python/ceval.c:3618 [...] #50 0x00000000004113bb in Py_Main (argc=Variable "argc" is not available.) 
at Modules/main.c:484 (gdb) pystack Lib/test/test_trace.py (447): no_jump_to_except_2 Lib/test/test_trace.py (447): run_test Lib/test/test_trace.py (557): test_11_no_jump_to_except_2 /usr/src/python-ast/Lib/unittest.py (581): run /usr/src/python-ast/Lib/unittest.py (280): __call__ /usr/src/python-ast/Lib/unittest.py (420): run /usr/src/python-ast/Lib/unittest.py (427): __call__ /usr/src/python-ast/Lib/unittest.py (420): run /usr/src/python-ast/Lib/unittest.py (427): __call__ /usr/src/python-ast/Lib/unittest.py (692): run /usr/src/python-ast/Lib/test/test_support.py (692): run_suite /usr/src/python-ast/Lib/test/test_support.py (278): run_unittest Lib/test/test_trace.py (600): test_main Lib/test/test_trace.py (600): I'm not sure what other information from gdb to furnish. Jeff -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: not available Url : http://mail.python.org/pipermail/python-dev/attachments/20051013/b025eac6/attachment.pgp From nas at arctrix.com Fri Oct 14 00:16:51 2005 From: nas at arctrix.com (Neil Schemenauer) Date: Thu, 13 Oct 2005 16:16:51 -0600 Subject: [Python-Dev] AST branch update In-Reply-To: <20051013220841.GB8826@unpythonic.net> References: <20051013220841.GB8826@unpythonic.net> Message-ID: <20051013221650.GA8676@mems-exchange.org> On Thu, Oct 13, 2005 at 05:08:41PM -0500, jepler at unpythonic.net wrote: > test_trace segfaults consistently, even when run alone. That's a bug in frame_setlineno(), IMO. It's failing to detect an invalid jump because the lnotab generated by the new compiler is slightly different (DUP_TOP opcode corresponds to a different line). > I'm not sure what to make of the 'genexps' failure. Is it just a harmless > output difference? I didn't see you mention that in your message. It's a bug in the traceback.py module, IMO. See bug 1326077. 
Neil From greg.ewing at canterbury.ac.nz Fri Oct 14 02:32:34 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 14 Oct 2005 13:32:34 +1300 Subject: [Python-Dev] Assignment to __class__ of module? (Autoloading? (Making Queue.Queue easier to use)) In-Reply-To: <5.1.1.6.0.20051013134103.01f23a18@mail.telecommunity.com> References: <434DEEBA.7050905@canterbury.ac.nz> <20051012043518.x88h3g8pehsgo84c@login.werra.lunarpages.com> <20051012043518.x88h3g8pehsgo84c@login.werra.lunarpages.com> <5.1.1.6.0.20051012221440.01f41e10@mail.telecommunity.com> <434DEEBA.7050905@canterbury.ac.nz> <5.1.1.6.0.20051013134103.01f23a18@mail.telecommunity.com> Message-ID: <434EFCA2.4080100@canterbury.ac.nz> Phillip J. Eby wrote: > Actually, it's desirable to be *able* to do it for anything. But certainly > for otherwise-immutable objects it can lead to aliasing issues. Even for immutables, it could be useful to be able to add behaviour that doesn't mutate anything. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg.ewing at canterbury.ac.nz +--------------------------------------+ From greg.ewing at canterbury.ac.nz Fri Oct 14 02:43:28 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 14 Oct 2005 13:43:28 +1300 Subject: [Python-Dev] Simplify lnotab? (AST branch update) In-Reply-To: References: Message-ID: <434EFF30.3040703@canterbury.ac.nz> Jeremy Hylton wrote: > Some of the finer points of generating the line number table (lnotab) > are wrong. There is some very delicate code to support single > stepping with the debugger. With disk and memory sizes being what they are nowadays, is it still worth making heroic efforts to compress the lnotab table? How about getting rid of all the delicate code and replacing it with something much simpler? 
-- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg.ewing at canterbury.ac.nz +--------------------------------------+ From greg.ewing at canterbury.ac.nz Fri Oct 14 02:49:44 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 14 Oct 2005 13:49:44 +1300 Subject: [Python-Dev] Assignment to __class__ of module? (Autoloading? (Making Queue.Queue easier to use)) In-Reply-To: <20051013110837.918C.JCARLSON@uci.edu> References: <5.1.1.6.0.20051013134103.01f23a18@mail.telecommunity.com> <20051013110837.918C.JCARLSON@uci.edu> Message-ID: <434F00A8.1080306@canterbury.ac.nz> Josiah Carlson wrote: > The earlier portion of this discussion came from... > > import module > #module.foo does not reference a module > module.foo > #now module.foo references a module Or more generally, module.foo now references *something*, not necessarily a module. (In my use case it's a class.) -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg.ewing at canterbury.ac.nz +--------------------------------------+ From pje at telecommunity.com Fri Oct 14 02:59:36 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Thu, 13 Oct 2005 20:59:36 -0400 Subject: [Python-Dev] Simplify lnotab? (AST branch update) In-Reply-To: <434EFF30.3040703@canterbury.ac.nz> References: Message-ID: <5.1.1.6.0.20051013205631.02f38e00@mail.telecommunity.com> At 01:43 PM 10/14/2005 +1300, Greg Ewing wrote: >Jeremy Hylton wrote: > > > Some of the finer points of generating the line number table (lnotab) > > are wrong. There is some very delicate code to support single > > stepping with the debugger. 
>
>With disk and memory sizes being what they are nowadays,
>is it still worth making heroic efforts to compress the
>lnotab table? How about getting rid of all the delicate
>code and replacing it with something much simpler?

+1. I'd be especially interested in lifting the current requirement
that line ranges and byte ranges both increase monotonically. Even
better if the lines for a particular piece of code don't have to all
come from the same file.

It'd be nice to be able to do the equivalent of '#line' directives for
Python code that's generated by other tools, such as parser generators
and the like.

From greg.ewing at canterbury.ac.nz  Fri Oct 14 03:25:26 2005
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 14 Oct 2005 14:25:26 +1300
Subject: [Python-Dev] Simplify lnotab? (AST branch update)
In-Reply-To: <5.1.1.6.0.20051013205631.02f38e00@mail.telecommunity.com>
References: <5.1.1.6.0.20051013205631.02f38e00@mail.telecommunity.com>
Message-ID: <434F0906.6090508@canterbury.ac.nz>

Phillip J. Eby wrote:

> +1. I'd be especially interested in lifting the current requirement
> that line ranges and byte ranges both increase monotonically. Even
> better if the lines for a particular piece of code don't have to all
> come from the same file.

How about an array of:

  +----------------+----------------+----------------+
  | bytecode index |    file no.    |    line no.    |
  +----------------+----------------+----------------+

Entries are sorted by bytecode index, with each entry applying from
that bytecode position up to the position of the next entry. The
file no. indexes a tuple of file names attached to the code object.
All entries are 32-bit integers.

Easy to generate, easy to look up with a binary search, should be
big enough for everyone except those generating obscenely huge code
objects on 64-bit platforms.
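The binary-search lookup over such a flat table is only a few lines with bisect; a sketch with a made-up table (the entries and helper name are mine, for illustration):

```python
import bisect

# Hypothetical flat table, sorted by bytecode index:
# (bytecode index, file no., line no.)
table = [(0, 0, 1), (6, 0, 2), (14, 1, 3)]

def lookup(table, byte_offset):
    """Return the (file no., line no.) in effect at byte_offset:
    the last entry whose bytecode index is <= byte_offset."""
    starts = [entry[0] for entry in table]
    i = bisect.bisect_right(starts, byte_offset) - 1
    return table[i][1], table[i][2]
```

Generating the table is equally simple: append an entry whenever the file or line changes, which is what makes the scheme attractive compared with the delta-encoded lnotab.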
-- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg.ewing at canterbury.ac.nz +--------------------------------------+ From pje at telecommunity.com Fri Oct 14 03:42:32 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Thu, 13 Oct 2005 21:42:32 -0400 Subject: [Python-Dev] Assignment to __class__ of module? (Autoloading? (Making Queue.Queue easier to use)) In-Reply-To: <434EFCA2.4080100@canterbury.ac.nz> References: <5.1.1.6.0.20051013134103.01f23a18@mail.telecommunity.com> <434DEEBA.7050905@canterbury.ac.nz> <20051012043518.x88h3g8pehsgo84c@login.werra.lunarpages.com> <20051012043518.x88h3g8pehsgo84c@login.werra.lunarpages.com> <5.1.1.6.0.20051012221440.01f41e10@mail.telecommunity.com> <434DEEBA.7050905@canterbury.ac.nz> <5.1.1.6.0.20051013134103.01f23a18@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20051013214118.0312e0e8@mail.telecommunity.com> At 01:32 PM 10/14/2005 +1300, Greg Ewing wrote: >Phillip J. Eby wrote: > > > Actually, it's desirable to be *able* to do it for anything. But > certainly > > for otherwise-immutable objects it can lead to aliasing issues. > >Even for immutables, it could be useful to be able to >add behaviour that doesn't mutate anything. I meant that just changing its class is a mutation, and since immutables can be shared or cached, that could lead to problems. So I do think it's a reasonable implementation limit to disallow changing the __class__ of an immutable. From pinard at iro.umontreal.ca Fri Oct 14 03:41:35 2005 From: pinard at iro.umontreal.ca (=?iso-8859-1?Q?Fran=E7ois?= Pinard) Date: Thu, 13 Oct 2005 21:41:35 -0400 Subject: [Python-Dev] Simplify lnotab? 
(AST branch update)
In-Reply-To: <5.1.1.6.0.20051013205631.02f38e00@mail.telecommunity.com>
References: <5.1.1.6.0.20051013205631.02f38e00@mail.telecommunity.com>
Message-ID: <20051014014135.GA22105@alcyon.progiciels-bpi.ca>

[Phillip J. Eby]
> It'd be nice to be able to do the equivalent of '#line' directives for
> Python code that's generated by other tools, such as parser generators
> and the like.

I had such a need a few times in the past, and it was tedious having to
do indirections through generated Python for finding the real source as
referenced by comments. Yet, granted also that the need has not been
frequent, for me.

--
François Pinard   http://pinard.progiciels-bpi.ca

From pje at telecommunity.com  Fri Oct 14 03:55:20 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu, 13 Oct 2005 21:55:20 -0400
Subject: [Python-Dev] Simplify lnotab? (AST branch update)
In-Reply-To: <434F0906.6090508@canterbury.ac.nz>
References: <5.1.1.6.0.20051013205631.02f38e00@mail.telecommunity.com>
 <5.1.1.6.0.20051013205631.02f38e00@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20051013214331.02f320a0@mail.telecommunity.com>

At 02:25 PM 10/14/2005 +1300, Greg Ewing wrote:
>Phillip J. Eby wrote:
>
> > +1. I'd be especially interested in lifting the current requirement
> > that line ranges and byte ranges both increase monotonically. Even
> > better if the lines for a particular piece of code don't have to all
> > come from the same file.
>
>How about an array of:
>
>  +----------------+----------------+----------------+
>  | bytecode index |    file no.    |    line no.    |
>  +----------------+----------------+----------------+
>
>Entries are sorted by bytecode index, with each entry
>applying from that bytecode position up to the position
>of the next entry. The file no. indexes a tuple of file
>names attached to the code object. All entries are 32-bit
>integers.

The file number could be 16-bit - I don't see a use case for referring
to 65,000 different filenames.
;) But that doesn't save much space. Anyway, in the common case, this scheme will use 10 more bytes per line of Python code, which translates to a megabyte or so for the standard library. I definitely like the simplicity, but a meg's a meg. A more compact scheme is possible, by using two tables - a bytecode->line number table, and a line number-> file table. In the single-file case, you can omit the second table, and the first table then only uses 6 more bytes per line than we're currently using. Not fantastic, but probably more acceptable. If you have to encode multiple files, you just offset their line numbers by the size of the other files, and put entries in the line->file table to match. When computing the line number, you subtract the matching entry in the line->file table to get the actual line number within that file. From greg.ewing at canterbury.ac.nz Fri Oct 14 05:14:08 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 14 Oct 2005 16:14:08 +1300 Subject: [Python-Dev] Early PEP draft (For Python 3000?) In-Reply-To: References: Message-ID: <434F2280.2020000@canterbury.ac.nz> Eyal Lotem wrote: > locals()['x'] = 1 # Quietly fails! > Replaced by: > frame.x = 1 # Raises error Or even better, replaced by frame.x = 1 # Does the right thing The frame object knows enough to be able to find the correct locals slot and update it, so there's no need for this to fail. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg.ewing at canterbury.ac.nz +--------------------------------------+ From falcon at intercable.ru Tue Oct 11 08:33:43 2005 From: falcon at intercable.ru (Sokolov Yura) Date: Tue, 11 Oct 2005 10:33:43 +0400 Subject: [Python-Dev] PEP 3000 and exec Message-ID: <434B5CC7.2030009@intercable.ru> Agree. 
>>>i=1
>>>def a():
       i=2
       def b():
           print i
       return b
>>>a()()
2
>>>def a():
       i=2
       def b():
           exec "print i"
       return b
>>>a()()
1

From falcon at intercable.ru  Tue Oct 11 08:55:41 2005
From: falcon at intercable.ru (Sokolov Yura)
Date: Tue, 11 Oct 2005 10:55:41 +0400
Subject: [Python-Dev] Pythonic concurrency - offtopic
Message-ID: <434B61ED.4080503@intercable.ru>

Offtopic:

Microsoft Windows [Version 5.2.3790]
(C) Copyright 1985-2003 Microsoft Corp.

G:\Working\1>c:\Python24\python
Python 2.4.1 (#65, Mar 30 2005, 09:13:57) [MSC v.1310 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> from os import fork
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
ImportError: cannot import name fork
>>>

From support at intercable.ru  Tue Oct 11 08:57:03 2005
From: support at intercable.ru (Technical Support of Intercable Co)
Date: Tue, 11 Oct 2005 10:57:03 +0400
Subject: [Python-Dev] C.E.R. Thoughts
Message-ID: <434B623F.9030000@intercable.ru>

And why not

    if len(sys.argv) > 1 take sys.argv[1] == 'debug':
        ...

It was not so bad :-)

    A = len(sys.argv)==0 take None or sys.argv[1]

Sorry for being noisy :-)

From tonynelson at georgeanelson.com  Wed Oct 12 04:10:29 2005
From: tonynelson at georgeanelson.com (Tony Nelson)
Date: Tue, 11 Oct 2005 22:10:29 -0400
Subject: [Python-Dev] Unicode charmap decoders slow
Message-ID:

I have written my fastcharmap decoder and encoder. It's not meant to be
better than the patch and other changes to come in a future version of
Python, but it does work now with the current codecs. Using Hye-Shik
Chang's benchmark, decoding is about 4.3x faster than the base, and
encoding is about 2x faster than the base (that's comparing the base and
the fast versions on my machine).

If fastcharmap would be useful, please tell me where I should make it
available, and any changes that are needed. I would also need to write an
installer (distutils I guess).
Fastcharmap is written in Python and Pyrex 0.9.3, and the .pyx file will
need to be compiled before use. I used:

    pyrexc _fastcharmap.pyx
    gcc -c -fPIC -I/usr/include/python2.3/ _fastcharmap.c
    gcc -shared _fastcharmap.o -o _fastcharmap.so

To use, hook each codec to be sped up:

    import fastcharmap
    help(fastcharmap)
    fastcharmap.hook('name_of_codec')
    u = unicode('some text', 'name_of_codec')
    s = u.encode('name_of_codec')

No codecs were rewritten. It took me a while to learn enough to do this
(Pyrex, more Python, some Python C API), and there were some surprises.
Hooking in is grosser than I would have liked. I've only used it on
Python 2.3 on FC3. Still, it should work going forward, and, if the
dicts are replaced by something else, fastcharmap will know to leave
everything alone. There are still a few debugging print statements in it.

>At 8:36 AM +0200 10/5/05, Martin v. Löwis wrote:
>>Tony Nelson wrote:
> ...
>>> Encoding can be made fast using a simple hash table with external chaining.
>>> There are max 256 codepoints to encode, and they will normally be well
>>> distributed in their lower 8 bits. Hash on the low 8 bits (just mask), and
>>> chain to an area with 256 entries. Modest storage, normally short chains,
>>> therefore fast encoding.
>>
>>This is what is currently done: a hash map with 256 keys. You are
>>complaining about the performance of that algorithm. The issue of
>>external chaining is likely irrelevant: there likely are no collisions,
>>even though Python uses open addressing.
>
>I think I'm complaining about the implementation, though on decode, not
>encode.
>
>In any case, there are likely to be collisions in my scheme. Over the
>next few days I will try to do it myself, but I will need to learn Pyrex,
>some of the Python C API, and more about Python to do it.
>
>
>>>>...I suggest instead just /caching/ the translation in C arrays stored
>>>>with the codec object.
The cache would be invalidated on any write to the
>>>>codec's mapping dictionary, and rebuilt the next time anything was
>>>>translated. This would maintain the present semantics, work with current
>>>>codecs, and still provide the desired speed improvement.
>>
>>That is not implementable. You cannot catch writes to the dictionary.
>
>I should have been more clear. I am thinking about using a proxy object
>in the codec's 'encoding_map' and 'decoding_map' slots, that will forward
>all the dictionary stuff. The proxy will delete the cache on any call
>which changes the dictionary contents. There are proxy classes and
>dictproxy (don't know how it's implemented yet) so it seems doable, at
>least as far as I've gotten so far.
>
>
>>> Note that this caching is done by new code added to the existing C
>>> functions (which, if I have it right, are in unicodeobject.c). No
>>> architectural changes are made; no existing codecs need to be changed;
>>> everything will just work
>>
>>Please try to implement it. You will find that you cannot. I don't
>>see how regenerating/editing the codecs could be avoided.
>
>Will do!
____________________________________________________________________
TonyN.:' '

From falcon at intercable.ru  Thu Oct 13 10:48:56 2005
From: falcon at intercable.ru (Sokolov Yura)
Date: Thu, 13 Oct 2005 12:48:56 +0400
Subject: [Python-Dev] Autoloading? (Making Queue.Queue easier to use)
Message-ID: <434E1F78.7020504@intercable.ru>

Maybe allow modules to define __getattr__?
def __getattr__(thing):
    try:
        return __some_standard_way__(thing)
    except AttributeError:
        if thing=="Queue":
            import sys
            from Queue import Queue
            setattr(sys.modules[__name__],"Queue",Queue)
            return Queue
        raise

From raymond.hettinger at verizon.net  Fri Oct 14 07:03:28 2005
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Fri, 14 Oct 2005 01:03:28 -0400
Subject: [Python-Dev] AST branch update
In-Reply-To:
Message-ID: <005601c5d07c$9ea717c0$1fac958d@oemcomputer>

> Neil and I have been working on the AST branch for the last week.
> We're nearly ready to merge the changes to the head.

Nice work.

> I don't think the current test suite covers all of the possible syntax
> errors that can be raised.

Does the AST branch generate a syntax error for:

    foo(a = i for i in range(10))

?

Raymond

From jcarlson at uci.edu  Fri Oct 14 07:11:49 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Thu, 13 Oct 2005 22:11:49 -0700
Subject: [Python-Dev] Pythonic concurrency - offtopic
In-Reply-To: <434B61ED.4080503@intercable.ru>
References: <434B61ED.4080503@intercable.ru>
Message-ID: <20051013220748.9195.JCARLSON@uci.edu>

Sokolov Yura wrote:
>
> Offtopic:
>
> Microsoft Windows [Version 5.2.3790]
> (C) Copyright 1985-2003 Microsoft Corp.
>
> G:\Working\1>c:\Python24\python
> Python 2.4.1 (#65, Mar 30 2005, 09:13:57) [MSC v.1310 32 bit (Intel)] on
> win32
> Type "help", "copyright", "credits" or "license" for more information.
> >>> from os import fork
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
> ImportError: cannot import name fork
> >>>

Python for Windows, if I remember correctly, has never supported forking.
This is because the underlying process execution code does not have
support for the standard copy-on-write semantic which makes unix fork
fast. Cygwin Python does support fork, but I believe this is through a
literal copying of the memory space, which is far slower than unix fork.
Until Microsoft adds kernel support for fork, don't expect standard Windows Python to support it. - Josiah From nas at arctrix.com Fri Oct 14 07:11:47 2005 From: nas at arctrix.com (Neil Schemenauer) Date: Thu, 13 Oct 2005 23:11:47 -0600 Subject: [Python-Dev] AST branch update In-Reply-To: <005601c5d07c$9ea717c0$1fac958d@oemcomputer> References: <005601c5d07c$9ea717c0$1fac958d@oemcomputer> Message-ID: <20051014051147.GA9906@mems-exchange.org> On Fri, Oct 14, 2005 at 01:03:28AM -0400, Raymond Hettinger wrote: > Do the AST branch generate a syntax error for: > > foo(a = i for i in range(10)) No. It generates the same broken code as the current compiler. Neil From jcarlson at uci.edu Fri Oct 14 07:15:06 2005 From: jcarlson at uci.edu (Josiah Carlson) Date: Thu, 13 Oct 2005 22:15:06 -0700 Subject: [Python-Dev] C.E.R. Thoughts In-Reply-To: <434B623F.9030000@intercable.ru> References: <434B623F.9030000@intercable.ru> Message-ID: <20051013221327.9198.JCARLSON@uci.edu> Technical Support of Intercable Co wrote: > > And why not > if len(sys.argv) > 1 take sys.argv[1] == 'debug': > ... > > It was not so bad :-) > > A = len(sys.argv)==0 take None or sys.argv[1] > > Sorry for being noisy :-) The syntax for 2.5 has already been decided upon. Except for an act by Guido, it is likely to stay (None if len(sys.argv) == 0 else sys.argv[1]). - Josiah From fredrik at pythonware.com Fri Oct 14 07:34:50 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Fri, 14 Oct 2005 07:34:50 +0200 Subject: [Python-Dev] Autoloading? (Making Queue.Queue easier to use) References: <20051012043518.x88h3g8pehsgo84c@login.werra.lunarpages.com> Message-ID: Guido van Rossum wrote: > > > BTW, Queue.Queue violates a recent module naming standard; it is now > > > considered bad style to name the class and the module the same. > > > Modules and packages should have short all-lowercase names, classes > > > should be CapWords. Even the same but different case is bad style. 
> > unfortunately, this standard seems to result in generic "spamtools" modules
> > into which people throw everything that's even remotely related to "spam",
> > followed by complaints about bloat and performance from users, followed by
> > various more or less stupid attempts to implement lazy loading of hidden
> > internal modules, followed by more complaints from users who no longer have
> > a clear view of what's really going on in there...
> >
> > I think I'll stick to the old standard for a few more years...
>
> Yeah, until you've learned to use packages. :(

what do packages have to do with this? does this new module naming
standard only apply to toplevel package names?

From ironfroggy at gmail.com  Fri Oct 14 08:16:16 2005
From: ironfroggy at gmail.com (Calvin Spealman)
Date: Fri, 14 Oct 2005 02:16:16 -0400
Subject: [Python-Dev] Early PEP draft (For Python 3000?)
In-Reply-To:
References:
Message-ID: <76fd5acf0510132316x6a8bcc8ck1c3d5a812abd447e@mail.gmail.com>

On 10/11/05, Eyal Lotem wrote:
> locals()['x'] = 1 # Quietly fails!
> Replaced by:
> frame.x = 1 # Raises error

What about the possibility of making this hypothetical frame object an
indexable, such that frame[0] is the current scope, frame[1] is the
calling scope, etc.? On the same lines, what about closure[0] for the
current frame, while closure[1] resolves to the closure the function
was defined in? These would ensure that you could reliably access any
namespace you would need, without nasty stack tricks and such, and
would make working around some of the limitations of closures easier
when you have such a need. One might even consider a __resolve__ to be
defined in any namespace, allowing all the namespace resolution rules
to be overridden by code at any level.
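The frame[n] indexing proposed here can be approximated today with sys._getframe(); a read-only toy sketch (FrameChain is my name, not part of any proposal, and it deliberately avoids the unreliable writes to f_locals that started this thread):

```python
import sys

class FrameChain:
    """Read-only approximation of the proposed frame[n] indexing:
    frame[0] is the current scope, frame[1] the calling scope, etc."""
    def __getitem__(self, n):
        # +1 skips this __getitem__ frame itself.
        return sys._getframe(n + 1).f_locals

frame = FrameChain()

def outer():
    x = 42
    return inner()

def inner():
    return frame[1]['x']   # read a local from the calling scope
```

Writing through such an object is exactly what CPython does not support for function frames, which is why the proposal pairs the indexing syntax with making assignment either work or raise.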
From guido at python.org Fri Oct 14 08:18:45 2005 From: guido at python.org (Guido van Rossum) Date: Thu, 13 Oct 2005 23:18:45 -0700 Subject: [Python-Dev] AST branch update In-Reply-To: <005601c5d07c$9ea717c0$1fac958d@oemcomputer> References: <005601c5d07c$9ea717c0$1fac958d@oemcomputer> Message-ID: [Jeremy] > > Neil and I have been working on the AST branch for the last week. > > We're nearly ready to merge the changes to the head. [Raymond] > Nice work. Indeed. I should've threatened to kill the AST branch long ago! :) -- --Guido van Rossum (home page: http://www.python.org/~guido/) From greg.ewing at canterbury.ac.nz Fri Oct 14 08:50:29 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 14 Oct 2005 19:50:29 +1300 Subject: [Python-Dev] Assignment to __class__ of module? (Autoloading? (Making Queue.Queue easier to use)) In-Reply-To: <5.1.1.6.0.20051013214118.0312e0e8@mail.telecommunity.com> References: <5.1.1.6.0.20051013134103.01f23a18@mail.telecommunity.com> <434DEEBA.7050905@canterbury.ac.nz> <20051012043518.x88h3g8pehsgo84c@login.werra.lunarpages.com> <20051012043518.x88h3g8pehsgo84c@login.werra.lunarpages.com> <5.1.1.6.0.20051012221440.01f41e10@mail.telecommunity.com> <434DEEBA.7050905@canterbury.ac.nz> <5.1.1.6.0.20051013134103.01f23a18@mail.telecommunity.com> <5.1.1.6.0.20051013214118.0312e0e8@mail.telecommunity.com> Message-ID: <434F5535.4010201@canterbury.ac.nz> Phillip J. Eby wrote: > I meant that just changing its class is a mutation, and since immutables > can be shared or cached, that could lead to problems. So I do think > it's a reasonable implementation limit to disallow changing the > __class__ of an immutable. That's a fair point. Although I was actually thinking recently of a use case for changing the class of a tuple, inside a Pyrex module for database access. The idea was that the user would be able to supply a custom subclass of tuple for returning the records. 
To avoid extra copying of the data, I was going to create a normal
uninitialised tuple, stuff the data into it, and then change its class
to the user-supplied one. But seeing as all this would be happening in
Pyrex where the normal restrictions don't apply anyway, I suppose it
wouldn't matter if user code wasn't allowed to do this.

Greg

From greg.ewing at canterbury.ac.nz  Fri Oct 14 08:52:34 2005
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 14 Oct 2005 19:52:34 +1300
Subject: [Python-Dev] Autoloading? (Making Queue.Queue easier to use)
In-Reply-To: <434E1F78.7020504@intercable.ru>
References: <434E1F78.7020504@intercable.ru>
Message-ID: <434F55B2.1030106@canterbury.ac.nz>

Sokolov Yura wrote:

> Maybe allow modules to define __getattr__?

I think I like the descriptor idea better. Besides being more in
keeping with modern practice, it would allow for things like

    from autoloading import autoload

    Foo = autoload('foomodule', 'Foo')
    Blarg = autoload('blargmodule', 'Blarg')

where autoload is defined as a suitable descriptor subclass. I guess
we could do with a PEP on this...

Greg

From martin at v.loewis.de  Fri Oct 14 09:14:23 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Fri, 14 Oct 2005 09:14:23 +0200
Subject: [Python-Dev] Unicode charmap decoders slow
In-Reply-To:
References:
Message-ID: <434F5ACF.3000802@v.loewis.de>

Tony Nelson wrote:
> I have written my fastcharmap decoder and encoder. It's not meant to be
> better than the patch and other changes to come in a future version of
> Python, but it does work now with the current codecs.

It's an interesting solution.

> To use, hook each codec to be sped up:
>
>     import fastcharmap
>     help(fastcharmap)
>     fastcharmap.hook('name_of_codec')
>     u = unicode('some text', 'name_of_codec')
>     s = u.encode('name_of_codec')
>
> No codecs were rewritten. It took me a while to learn enough to do this
> (Pyrex, more Python, some Python C API), and there were some surprises.
> Hooking in is grosser than I would have liked. I've only used it on Python > 2.3 on FC3. Indeed, and I would claim that you did not completely achieve your "no changes necessary" goal: you still have to install the hooks explicitly. I also think overriding codecs.charmap_{encode,decode} is really ugly. Even if this could be simplified if you would modify the existing codecs, I still don't think supporting changes to the encoding dict is worthwhile. People will probably want to update the codecs in-place, but I don't think we need to make a guarantee that that such an approach works independent of the Python version. People would be much better off writing their own codecs if they think the distributed ones are incorrect. Regards, Martin From greg.ewing at canterbury.ac.nz Fri Oct 14 09:07:05 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 14 Oct 2005 20:07:05 +1300 Subject: [Python-Dev] Simplify lnotab? (AST branch update) In-Reply-To: <5.1.1.6.0.20051013214331.02f320a0@mail.telecommunity.com> References: <5.1.1.6.0.20051013205631.02f38e00@mail.telecommunity.com> <5.1.1.6.0.20051013205631.02f38e00@mail.telecommunity.com> <5.1.1.6.0.20051013214331.02f320a0@mail.telecommunity.com> Message-ID: <434F5919.6040002@canterbury.ac.nz> Phillip J. Eby wrote: > A more > compact scheme is possible, by using two tables - a bytecode->line > number table, and a line number-> file table. > > If you have to encode multiple files, you just offset their line numbers > by the size of the other files, More straightforwardly, the second table could just be a bytecode -> file number mapping. The filename is likely to change much less often than the line number, so this file would contain far fewer entries than the line number table. In the case of only one file, it would contain just a single entry, so it probably wouldn't even be worth the bother of special-casing that. 
You could save a bit more by having two kinds of line number table, "small" (16-bit entries) and "large" (32-bit entries) depending on the size of the code object and range of line numbers. The small one would be sufficient for almost all code objects, so the most common case would use only about 4 bytes per line of code. That's only twice as much as the current scheme uses. Greg From jcarlson at uci.edu Fri Oct 14 09:23:44 2005 From: jcarlson at uci.edu (Josiah Carlson) Date: Fri, 14 Oct 2005 00:23:44 -0700 Subject: [Python-Dev] Early PEP draft (For Python 3000?) In-Reply-To: <76fd5acf0510132316x6a8bcc8ck1c3d5a812abd447e@mail.gmail.com> References: <76fd5acf0510132316x6a8bcc8ck1c3d5a812abd447e@mail.gmail.com> Message-ID: <20051014000927.919E.JCARLSON@uci.edu> Calvin Spealman wrote: > > On 10/11/05, Eyal Lotem wrote: > > locals()['x'] = 1 # Quietly fails! > > Replaced by: > > frame.x = 1 # Raises error > > What about the possibility of making this hypothetic frame object an > indexable, such that frame[0] is the current scope, frame[1] is the > calling scope, etc.? On the same lines, what about closure[0] for the > current frame, while closure[1] resolves to the closure the function > was defined in? These would ensure that you could reliably access any > namespace you would need, without nasty stack tricks and such, and > would make working around some of the limitation of the closures, when > you have such a need. One might even consider a __resolve__ to be > defined in any namespace, allowing all the namespace resolution rules > to be overridden by code at any level. -1000 If you want a namespace, create one and pass it around. If the writer of a function or method wanted you monkeying around with a namespace, they would have given you one to work with. As for closure monkeywork, you've got to be kidding. 
Closures in Python are a clever and interesting way of keeping around
certain things, but are actually unnecessary with the existence of class
and instance namespaces. Every example of a closure can be re-done as a
class/instance, and many end up looking better.

 - Josiah

From nnorwitz at gmail.com  Fri Oct 14 09:46:14 2005
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Fri, 14 Oct 2005 00:46:14 -0700
Subject: [Python-Dev] AST branch update
In-Reply-To:
References: <005601c5d07c$9ea717c0$1fac958d@oemcomputer>
Message-ID:

On 10/13/05, Guido van Rossum wrote:
>
> Indeed. I should've threatened to kill the AST branch long ago! :)

:-) I decreased a lot of the memory leaks. Here are some more to work on.
I doubt this list is complete, but it's a start:

PyObject_Malloc (obmalloc.c:717)
_PyObject_DebugMalloc (obmalloc.c:1014)
compiler_enter_scope (newcompile.c:1204)
compiler_mod (newcompile.c:1894)
PyAST_Compile (newcompile.c:471)
Py_CompileStringFlags (pythonrun.c:1240)
builtin_compile (bltinmodule.c:391)

Tuple (Python-ast.c:907)
ast_for_testlist (ast.c:1782)
ast_for_classdef (ast.c:2677)
ast_for_stmt (ast.c:2758)
PyAST_FromNode (ast.c:233)
PyParser_ASTFromFile (pythonrun.c:1291)
parse_source_module (import.c:762)
load_source_module (import.c:886)

new_arena (obmalloc.c:500)
PyObject_Malloc (obmalloc.c:699)
PyObject_Realloc (obmalloc.c:837)
_PyObject_DebugRealloc (obmalloc.c:1077)
PyNode_AddChild (node.c:95)
shift (parser.c:112)
PyParser_AddToken (parser.c:244)
parsetok (parsetok.c:165)
PyParser_ParseFileFlags (parsetok.c:89)
PyParser_ASTFromFile (pythonrun.c:1288)
parse_source_module (import.c:762)
load_source_module (import.c:886)

Lambda (Python-ast.c:610)
ast_for_lambdef (ast.c:859)
ast_for_expr (ast.c:1443)
ast_for_testlist (ast.c:1776)
ast_for_expr_stmt (ast.c:1845)
ast_for_stmt (ast.c:2716)
PyAST_FromNode (ast.c:233)
PyParser_ASTFromString (pythonrun.c:1271)
Py_CompileStringFlags (pythonrun.c:1237)
builtin_compile (bltinmodule.c:391)

BinOp (Python-ast.c:557)
ast_for_binop (ast.c:1389)
ast_for_expr (ast.c:1531)
ast_for_testlist (ast.c:1776)
ast_for_expr_stmt (ast.c:1845)
ast_for_stmt (ast.c:2716)
PyAST_FromNode (ast.c:233)
PyParser_ASTFromString (pythonrun.c:1271)
Py_CompileStringFlags (pythonrun.c:1237)
builtin_compile (bltinmodule.c:391)

Name (Python-ast.c:865)
ast_for_atom (ast.c:1201)
ast_for_expr (ast.c:1555)
ast_for_testlist (ast.c:1776)
ast_for_expr_stmt (ast.c:1798)
ast_for_stmt (ast.c:2716)
PyAST_FromNode (ast.c:233)
PyParser_ASTFromString (pythonrun.c:1271)
Py_CompileStringFlags (pythonrun.c:1237)
builtin_compile (bltinmodule.c:391)

From nnorwitz at gmail.com  Fri Oct 14 09:55:59 2005
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Fri, 14 Oct 2005 00:55:59 -0700
Subject: [Python-Dev] AST branch update
In-Reply-To:
References: <005601c5d07c$9ea717c0$1fac958d@oemcomputer>
Message-ID:

On 10/14/05, Neal Norwitz wrote:
>
> I decreased a lot of the memory leaks. Here are some more to work on.
> I doubt this list is complete, but it's a start:

Oh and since I fixed the memory leaks in a generated file
Python/Python-ast.c, the changes still need to be implemented in the
right place (ie, Parser/asdl_c.py).

Valgrind didn't report any invalid uses of memory, though there is also
a lot of potentially leaked memory. It seemed a lot higher than what I
remembered, so I'm not sure if it's an issue or not. I'll look into that
after we get the definite memory leaks plugged.

n

From mwh at python.net  Fri Oct 14 10:23:40 2005
From: mwh at python.net (Michael Hudson)
Date: Fri, 14 Oct 2005 09:23:40 +0100
Subject: [Python-Dev] Simplify lnotab? (AST branch update)
In-Reply-To: <5.1.1.6.0.20051013205631.02f38e00@mail.telecommunity.com>
 (Phillip J. Eby's message of "Thu, 13 Oct 2005 20:59:36 -0400")
References: <5.1.1.6.0.20051013205631.02f38e00@mail.telecommunity.com>
Message-ID: <2mwtkga4mr.fsf@starship.python.net>

"Phillip J.
Eby" writes: > Even better if the lines for a particular piece of code don't have > to all come from the same file. This seems _fairly_ esoteric to me. Why do you need it? I can think of two uses for lnotab information: printing source lines and locating source lines on the filesystem. For both, I think I'd rather see some kind of defined protocol (methods on the code object, maybe?) rather than inventing some kind of insane too-general-for-the-common-case data structure. Cheers, mwh -- 42. You can measure a programmer's perspective by noting his attitude on the continuing vitality of FORTRAN. -- Alan Perlis, http://www.cs.yale.edu/homes/perlis-alan/quotes.html From pje at telecommunity.com Fri Oct 14 17:20:45 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Fri, 14 Oct 2005 11:20:45 -0400 Subject: [Python-Dev] Simplify lnotab? (AST branch update) In-Reply-To: <434F5919.6040002@canterbury.ac.nz> References: <5.1.1.6.0.20051013214331.02f320a0@mail.telecommunity.com> <5.1.1.6.0.20051013205631.02f38e00@mail.telecommunity.com> <5.1.1.6.0.20051013205631.02f38e00@mail.telecommunity.com> <5.1.1.6.0.20051013214331.02f320a0@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20051014111829.01f89288@mail.telecommunity.com> At 08:07 PM 10/14/2005 +1300, Greg Ewing wrote: >Phillip J. Eby wrote: > > > A more > > compact scheme is possible, by using two tables - a bytecode->line > > number table, and a line number-> file table. > > > > If you have to encode multiple files, you just offset their line numbers > > by the size of the other files, > >More straightforwardly, the second table could just be a >bytecode -> file number mapping. That would use more space in any case involving multiple files. >In the case of only one file, it would contain just a single >entry, so it probably wouldn't even be worth the bother of >special-casing that. A line->file mapping would also have only one entry in that case. 
>You could save a bit more by having two kinds of line number >table, "small" (16-bit entries) and "large" (32-bit entries) >depending on the size of the code object and range of line >numbers. The small one would be sufficient for almost all >code objects, so the most common case would use only about >4 bytes per line of code. That's only twice as much as the >current scheme uses. That'd probably work. From pje at telecommunity.com Fri Oct 14 17:28:01 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Fri, 14 Oct 2005 11:28:01 -0400 Subject: [Python-Dev] Simplify lnotab? (AST branch update) In-Reply-To: <2mwtkga4mr.fsf@starship.python.net> References: <5.1.1.6.0.20051013205631.02f38e00@mail.telecommunity.com> <5.1.1.6.0.20051013205631.02f38e00@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20051014112056.021b5eb0@mail.telecommunity.com> At 09:23 AM 10/14/2005 +0100, Michael Hudson wrote: >"Phillip J. Eby" writes: > > > Even better if the lines for a particular piece of code don't have > > to all come from the same file. > >This seems _fairly_ esoteric to me. Why do you need it? Compilers that inline function calls, but want the code to still be debuggable. AOP tools that weave bytecode. Overloaded functions implemented by combining bytecode. Okay, those are fairly esoteric use cases, I admit. :) However, PyPy already has some inlining capability in its optimizer, so it's not all that crazy of an idea that Python in general will need it. >I can think of two uses for lnotab information: printing source lines >and locating source lines on the filesystem. For both, I think I'd >rather see some kind of defined protocol (methods on the code object, >maybe?) rather than inventing some kind of insane >too-general-for-the-common-case data structure. Certainly a protocol would be nice; right now one is forced to interpret the data structure directly. Being able to say, "give me the file and line number for a given byte offset" would be handy in any case. 
However, since you can't subclass code objects, the capability would have to be part of the core. From mwh at python.net Fri Oct 14 17:53:17 2005 From: mwh at python.net (Michael Hudson) Date: Fri, 14 Oct 2005 16:53:17 +0100 Subject: [Python-Dev] Simplify lnotab? (AST branch update) In-Reply-To: <5.1.1.6.0.20051014112056.021b5eb0@mail.telecommunity.com> (Phillip J. Eby's message of "Fri, 14 Oct 2005 11:28:01 -0400") References: <5.1.1.6.0.20051013205631.02f38e00@mail.telecommunity.com> <5.1.1.6.0.20051013205631.02f38e00@mail.telecommunity.com> <5.1.1.6.0.20051014112056.021b5eb0@mail.telecommunity.com> Message-ID: <2moe5s9jte.fsf@starship.python.net> "Phillip J. Eby" writes: > At 09:23 AM 10/14/2005 +0100, Michael Hudson wrote: >>"Phillip J. Eby" writes: >> >> > Even better if the lines for a particular piece of code don't have >> > to all come from the same file. >> >>This seems _fairly_ esoteric to me. Why do you need it? > > Compilers that inline function calls, but want the code to still be > debuggable. AOP tools that weave bytecode. Overloaded functions > implemented by combining bytecode. Err... > Okay, those are fairly esoteric use cases, I admit. :) However, PyPy > already has some inlining capability in its optimizer, so it's not all that > crazy of an idea that Python in general will need it. Um. Well, _I_ still think it's pretty crazy. >>I can think of two uses for lnotab information: printing source lines >>and locating source lines on the filesystem. For both, I think I'd >>rather see some kind of defined protocol (methods on the code object, >>maybe?) rather than inventing some kind of insane >>too-general-for-the-common-case data structure. > > Certainly a protocol would be nice; right now one is forced to interpret > the data structure directly. Being able to say, "give me the file and line > number for a given byte offset" would be handy in any case. 
> > However, since you can't subclass code objects, the capability would have > to be part of the core. Clearly, but any changes to co_lnotab would have to be part of the core too. Let's not make a complicated situation _worse_. Something I didn't say was that a protocol like this would also let us remove the horrors of functions like inspect.getsourcelines() (see SF bugs passim). Cheers, mwh -- There's an aura of unholy black magic about CLISP. It works, but I have no idea how it does it. I suspect there's a goat involved somewhere. -- Johann Hibschman, comp.lang.scheme From walter at livinglogic.de Fri Oct 14 18:26:37 2005 From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=) Date: Fri, 14 Oct 2005 18:26:37 +0200 Subject: [Python-Dev] Unicode charmap decoders slow In-Reply-To: <434F5ACF.3000802@v.loewis.de> References: <434F5ACF.3000802@v.loewis.de> Message-ID: <434FDC3D.9040202@livinglogic.de> Martin v. Löwis wrote: > Tony Nelson wrote: > >> I have written my fastcharmap decoder and encoder. It's not meant to be >> better than the patch and other changes to come in a future version of >> Python, but it does work now with the current codecs. > > It's an interesting solution. I like the fact that encoding doesn't need a special data structure. >> To use, hook each codec to be sped up: >> >> import fastcharmap >> help(fastcharmap) >> fastcharmap.hook('name_of_codec') >> u = unicode('some text', 'name_of_codec') >> s = u.encode('name_of_codec') >> >> No codecs were rewritten. It took me a while to learn enough to do this >> (Pyrex, more Python, some Python C API), and there were some surprises. >> Hooking in is grosser than I would have liked. I've only used it on >> Python >> 2.3 on FC3. > > Indeed, and I would claim that you did not completely achieve your "no > changes necessary" goal: you still have to install the hooks explicitly. > I also think overriding codecs.charmap_{encode,decode} is really ugly.
> > Even if this could be simplified if you would modify the existing > codecs, I still don't think supporting changes to the encoding dict > is worthwhile. People will probably want to update the codecs in-place, > but I don't think we need to make a guarantee that such an approach > works independent of the Python version. People would be much better off > writing their own codecs if they think the distributed ones are > incorrect. Exactly. If you need another codec, write your own instead of patching an existing one on the fly! Of course we can't accept Pyrex code in the Python core, so it would be great to rewrite the encoder as a patch to PyUnicode_EncodeCharmap(). This version must be able to cope with encoding tables that are random strings without crashing. We've already taken care of decoding. What we still need is a new gencodec.py and regenerated codecs. Bye, Walter Dörwald From mal at egenix.com Fri Oct 14 19:03:54 2005 From: mal at egenix.com (M.-A. Lemburg) Date: Fri, 14 Oct 2005 19:03:54 +0200 Subject: [Python-Dev] Unicode charmap decoders slow In-Reply-To: <434FDC3D.9040202@livinglogic.de> References: <434F5ACF.3000802@v.loewis.de> <434FDC3D.9040202@livinglogic.de> Message-ID: <434FE4FA.8000307@egenix.com> Walter Dörwald wrote: > We've already taken care of decoding. What we still need is a new > gencodec.py and regenerated codecs. I'll take care of that; just haven't gotten around to it yet. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Oct 14 2005) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free !
:::: From reinhold-birkenfeld-nospam at wolke7.net Fri Oct 14 19:30:05 2005 From: reinhold-birkenfeld-nospam at wolke7.net (Reinhold Birkenfeld) Date: Fri, 14 Oct 2005 19:30:05 +0200 Subject: [Python-Dev] Autoloading? (Making Queue.Queue easier to use) In-Reply-To: <434E1F78.7020504@intercable.ru> References: <434E1F78.7020504@intercable.ru> Message-ID: Sokolov Yura wrote: > May be allow modules to define __getattr__ ? > > def __getattr__(thing): > try: > return __some_standart_way__(thing) > except AttributeError: > if thing=="Queue": > import sys > from Queue import Queue > setattr(sys.modules[__name__],"Queue",Queue) > return Queue > raise I proposed something like this in the RFE tracker a while ago, but no one commented on it. Reinhold -- Mail address is perfectly valid! From cfbolz at gmx.de Fri Oct 14 19:25:45 2005 From: cfbolz at gmx.de (Carl Friedrich Bolz) Date: Fri, 14 Oct 2005 19:25:45 +0200 Subject: [Python-Dev] Simplify lnotab? (AST branch update) In-Reply-To: <5.1.1.6.0.20051014112056.021b5eb0@mail.telecommunity.com> References: <5.1.1.6.0.20051013205631.02f38e00@mail.telecommunity.com> <5.1.1.6.0.20051013205631.02f38e00@mail.telecommunity.com> <2mwtkga4mr.fsf@starship.python.net> <5.1.1.6.0.20051014112056.021b5eb0@mail.telecommunity.com> Message-ID: Hi! Phillip J. Eby wrote: [snip] > Okay, those are fairly esoteric use cases, I admit. :) However, PyPy > already has some inlining capability in its optimizer, so it's not all that > crazy of an idea that Python in general will need it. It's kind of strange to argue with PyPy's inlining capabilities, since inlining in PyPy happens on a completely different level, that has nothing at all to do with Python code objects any more. So your proposed changes would not make a difference for PyPy (not even to speak about benefits). 
[snip] cheers, Carl Friedrich Bolz From raymond.hettinger at verizon.net Fri Oct 14 20:41:24 2005 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Fri, 14 Oct 2005 14:41:24 -0400 Subject: [Python-Dev] Simplify lnotab? (AST branch update) In-Reply-To: <2moe5s9jte.fsf@starship.python.net> Message-ID: <000e01c5d0ee$e1c97f80$1fac958d@oemcomputer> > >> > Even better if the lines for a particular piece of code don't have > >> > to all come from the same file. > >> > >>This seems _fairly_ esoteric to me. Why do you need it? > > > > Compilers that inline function calls, but want the code to still be > > debuggable. AOP tools that weave bytecode. Overloaded functions > > implemented by combining bytecode. > > Err... > > > Okay, those are fairly esoteric use cases, I admit. :) However, PyPy > > already has some inlining capability in its optimizer, so it's not all > that > > crazy of an idea that Python in general will need it. > > Um. Well, _I_ still think it's pretty crazy. YAGNI Raymond From pje at telecommunity.com Fri Oct 14 21:43:43 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Fri, 14 Oct 2005 15:43:43 -0400 Subject: [Python-Dev] LOAD_SELF and SELF_ATTR opcodes Message-ID: <5.1.1.6.0.20051014151215.01f76bb0@mail.telecommunity.com> I ran across an interesting paper about some VM optimizations yesterday: http://www.object-arts.com/Papers/TheInterpreterIsDead.PDF One thing mentioned was that saving even one cycle in their 'PUSH_SELF' opcode improved interpreter performance by 5%. I thought that was pretty cool, and then I realized CPython doesn't even *have* a PUSH_SELF opcode. So, today, I took a stab at implementing one, by converting "LOAD_FAST 0" calls to a "LOAD_SELF" opcode. Pystone and Parrotbench improved by about 2% or so. That wasn't great, so I added a "SELF_ATTR" opcode that combines a LOAD_SELF and a LOAD_ATTR in the same opcode while avoiding extra stack and refcount manipulation. 
This raised the total improvement for pystone to about 5%, but didn't seem to improve parrotbench any further. I guess parrotbench doesn't do much self.attr stuff in places that really count, and looking at the code it indeed seems that most self.* stuff is done at higher levels of the parsing benchmark, not the innermost loops. Indeed, even pystone doesn't do much attribute access on the first argument of most of its functions, especially not those in inner loops. Only Proc1() and the Record.copy() method do anything that would be helped by SELF_ATTR. But it seems to me that this is very unusual for object-oriented code, and that more common uses of Python should be helped a lot more by this. Do we have any benchmarks that don't use 'foo = self.foo' type shortcuts in their inner loops? Anyway, my main question is, do these sound like worthwhile optimizations? The code isn't that complex; the only tricky thing I did was having the opcodes' error case (unbound local) fall through to the LOAD_FAST opcode so as not to duplicate the error handling code, in the hopes of keeping the eval loop size down. From pinard at iro.umontreal.ca Fri Oct 14 21:45:07 2005 From: pinard at iro.umontreal.ca (=?iso-8859-1?Q?Fran=E7ois?= Pinard) Date: Fri, 14 Oct 2005 15:45:07 -0400 Subject: [Python-Dev] Simplify lnotab? (AST branch update) In-Reply-To: <2mwtkga4mr.fsf@starship.python.net> References: <5.1.1.6.0.20051013205631.02f38e00@mail.telecommunity.com> <2mwtkga4mr.fsf@starship.python.net> Message-ID: <20051014194507.GA6435@alcyon.progiciels-bpi.ca> [Michael Hudson] > "Phillip J. Eby" writes: > > Even better if the lines for a particular piece of code don't have > > to all come from the same file. > This seems _fairly_ esoteric to me. Why do you need it? For when Python code is generated from more than one original source file (referring to the `#line' directive message, a little earlier this week). For example, think include files. 
-- François Pinard http://pinard.progiciels-bpi.ca From pinard at iro.umontreal.ca Fri Oct 14 21:46:29 2005 From: pinard at iro.umontreal.ca (=?iso-8859-1?Q?Fran=E7ois?= Pinard) Date: Fri, 14 Oct 2005 15:46:29 -0400 Subject: [Python-Dev] Simplify lnotab? (AST branch update) In-Reply-To: <000e01c5d0ee$e1c97f80$1fac958d@oemcomputer> References: <2moe5s9jte.fsf@starship.python.net> <000e01c5d0ee$e1c97f80$1fac958d@oemcomputer> Message-ID: <20051014194629.GB6435@alcyon.progiciels-bpi.ca> [Raymond Hettinger] > > >> > Even better if the lines for a particular piece of code don't > > >> > have to all come from the same file. > YAGNI I surely needed it, more than once. Don't be so assertive. :-) -- François Pinard http://pinard.progiciels-bpi.ca From pje at telecommunity.com Fri Oct 14 22:05:13 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Fri, 14 Oct 2005 16:05:13 -0400 Subject: [Python-Dev] Simplify lnotab? (AST branch update) In-Reply-To: <000e01c5d0ee$e1c97f80$1fac958d@oemcomputer> References: <2moe5s9jte.fsf@starship.python.net> Message-ID: <5.1.1.6.0.20051014160208.01f73c70@mail.telecommunity.com> At 02:41 PM 10/14/2005 -0400, Raymond Hettinger wrote: >YAGNI If the feature were there, I'd have used it already, so I wouldn't consider it YAGNI. In the cases where I would've used it, I instead split generated code into separate functions so I could compile() each one with a different filename. Also, I notice that the peephole optimizer contains stuff to avoid making co_lnotab "too complex", although I haven't looked at it to be sure it'd actually benefit from an expanded lnotab format.
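For readers following the lnotab thread: the single-file co_lnotab format that the proposals above would extend is a byte string of (address increment, line increment) pairs applied cumulatively, starting from co_firstlineno. A minimal sketch of the offset-to-line lookup (illustrative names, not a real API; later CPython versions moved to signed increments and, from 3.10 on, to the co_lines() protocol):

```python
# Sketch of the classic co_lnotab lookup: walk the (addr_increment,
# line_increment) byte pairs, accumulating both counters, and stop once
# the accumulated address passes the offset we are asking about.
def line_for_offset(firstlineno, lnotab, target_offset):
    lineno = firstlineno
    addr = 0
    for i in range(0, len(lnotab) - 1, 2):
        addr += lnotab[i]          # advance to the start of the next range
        if addr > target_offset:   # target lies in the previous range
            break
        lineno += lnotab[i + 1]
    return lineno

# Example table: offsets 0-5 map to line 10, 6-13 to line 11, 14+ to line 13.
table = bytes([6, 1, 8, 2])
```

A second, line-to-file table of the kind Phillip and Greg discuss would then sit alongside this, translating the computed line number into a (file, local line) pair.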
From skip at pobox.com Sat Oct 15 00:20:52 2005 From: skip at pobox.com (skip@pobox.com) Date: Fri, 14 Oct 2005 17:20:52 -0500 Subject: [Python-Dev] LOAD_SELF and SELF_ATTR opcodes In-Reply-To: <5.1.1.6.0.20051014151215.01f76bb0@mail.telecommunity.com> References: <5.1.1.6.0.20051014151215.01f76bb0@mail.telecommunity.com> Message-ID: <17232.12100.667664.919625@montanaro.dyndns.org> Phillip> Indeed, even pystone doesn't do much attribute access on the Phillip> first argument of most of its functions, especially not those Phillip> in inner loops. Only Proc1() and the Record.copy() method do Phillip> anything that would be helped by SELF_ATTR. But it seems to me Phillip> that this is very unusual for object-oriented code, and that Phillip> more common uses of Python should be helped a lot more by this. Phillip> Do we have any benchmarks that don't use 'foo = self.foo' type Phillip> shortcuts in their inner loops? (Just thinking out loud...) Maybe we should create an alternate "object-oriented" version of pystone as a way to inject more attribute access into a convenient benchmark. Even if it's completely artificial and has no connection to other versions of the Dhrystone benchmark, it might be useful for testing improvements to attribute access. Skip From martin at v.loewis.de Sat Oct 15 00:22:37 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 15 Oct 2005 00:22:37 +0200 Subject: [Python-Dev] Unicode charmap decoders slow In-Reply-To: <434FDC3D.9040202@livinglogic.de> References: <434F5ACF.3000802@v.loewis.de> <434FDC3D.9040202@livinglogic.de> Message-ID: <43502FAD.2030500@v.loewis.de> Walter Dörwald wrote: > Of course we can't accept Pyrex code in the Python core, so it would be > great to rewrite the encoder as a patch to PyUnicode_EncodeCharmap(). > This version must be able to cope with encoding tables that are random > strings without crashing. I don't think this will be necessary.
I personally dislike the decoding tables, as I think a straight-forward trie will do better than a hashtable. Regards, Martin From martin at v.loewis.de Sat Oct 15 00:33:24 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 15 Oct 2005 00:33:24 +0200 Subject: [Python-Dev] LOAD_SELF and SELF_ATTR opcodes In-Reply-To: <5.1.1.6.0.20051014151215.01f76bb0@mail.telecommunity.com> References: <5.1.1.6.0.20051014151215.01f76bb0@mail.telecommunity.com> Message-ID: <43503234.4030902@v.loewis.de> Phillip J. Eby wrote: > Anyway, my main question is, do these sound like worthwhile > optimizations? In the past, I think the analysis was always "no". It adds an opcode, so increases the size of the switch, causing more pressure on the cache, with an overall questionable effect. As for measuring the effect of the change: how often does that pattern occur in the standard library? (compared to what total number of LOAD_ATTR) Regards, Martin From pje at telecommunity.com Sat Oct 15 01:33:44 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Fri, 14 Oct 2005 19:33:44 -0400 Subject: [Python-Dev] LOAD_SELF and SELF_ATTR opcodes In-Reply-To: <43503234.4030902@v.loewis.de> References: <5.1.1.6.0.20051014151215.01f76bb0@mail.telecommunity.com> <5.1.1.6.0.20051014151215.01f76bb0@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20051014190111.0382c220@mail.telecommunity.com> At 12:33 AM 10/15/2005 +0200, Martin v. Löwis wrote: >Phillip J. Eby wrote: > > Anyway, my main question is, do these sound like worthwhile > > optimizations? >In the past, I think the analysis was always "no". It adds >an opcode, so increases the size of the switch, causing >more pressure on the cache, with an overall questionable >effect. Hm. I'd have thought 5% pystone and 2% pybench is nothing to sneeze at, for such a minor change. I thought Skip's peephole optimizer originally only produced a 5% or so speedup.
In any case, in relation to this specific kind of optimization, this is the only thing I found: http://mail.python.org/pipermail/python-dev/2002-February/019854.html which is a proposal by Guido to do the same thing, but also speeding up the actual attribute lookup. I didn't find any follow-up suggesting that anybody tried this, but perhaps it was put on hold pending the AST branch? :) >As for measuring the effect of the change: how often >does that pattern occur in the standard library? >(compared to what total number of LOAD_ATTR) [pje at ns src]$ grep 'self\.[A-Za-z_]' Lib/*.py | wc -l 9919 [pje at ns src]$ grep '[a-zA-Z_][a-zA-Z_0-9]*\.[a-zA-Z_]' Lib/*.py | wc -l 19804 So, something like 50% of lines doing an attribute access include a 'self' attribute access. This very rough estimate may be thrown off by: * Import statements (causing an error in favor of more non-self attributes) * Functions whose first argument isn't 'self' (error in favor of non-self attributes) * Comments or docstrings talking about attributes or modules (could go either way) * Multiple attribute accesses on the same line (could go either way) The parrotbench code shows a similar ratio of self to non-self attribute usage, but nearly all of parrotbench's self-attribute usage is in b0.py, and not called in the innermost loop. That also suggests that the volume of usage of 'self.' isn't the best way to determine the performance impact, because pystone has almost no 'self.' usage at all, but still got a 5% total boost. From skip at pobox.com Sat Oct 15 02:22:53 2005 From: skip at pobox.com (skip@pobox.com) Date: Fri, 14 Oct 2005 19:22:53 -0500 Subject: [Python-Dev] LOAD_SELF and SELF_ATTR opcodes In-Reply-To: <5.1.1.6.0.20051014190111.0382c220@mail.telecommunity.com> References: <5.1.1.6.0.20051014151215.01f76bb0@mail.telecommunity.com> <5.1.1.6.0.20051014190111.0382c220@mail.telecommunity.com> Message-ID: <17232.19421.938597.152797@montanaro.dyndns.org> >> Phillip J. 
Eby wrote: >> > Anyway, my main question is, do these sound like worthwhile >> > optimizations? >> >> In the past, I think the analysis was always "no". It adds an opcode, >> so increases the size of the switch, causing more pressure on the >> cache, with an overall questionable effect. Phillip> Hm. I'd have thought 5% pystone and 2% pybench is nothing to Phillip> sneeze at, for such a minor change. We've added lots of new opcodes over the years. CPU caches have grown steadily in that time as well, from maybe 128KB-256KB in the early 90's to around 1MB today. I suspect cache size has kept up with the growth of Python's VM inner loop. At any rate, each change has to be judged on its own merits. If it speeds things up and is uncontroversial implementation-wise, I see no reason it should be rejected out-of-hand. (Send it to Raymond H. He'll probably sneak it in when Martin's not looking. ) Skip From falcon at intercable.ru Fri Oct 14 13:05:30 2005 From: falcon at intercable.ru (Sokolov Yura) Date: Fri, 14 Oct 2005 15:05:30 +0400 Subject: [Python-Dev] Pythonic concurrency - offtopic In-Reply-To: <20051013220748.9195.JCARLSON@uci.edu> References: <434B61ED.4080503@intercable.ru> <20051013220748.9195.JCARLSON@uci.edu> Message-ID: <434F90FA.6060007@intercable.ru> Josiah Carlson wrote: >Sokolov Yura wrote: > > >>Offtopic: >> >>Microsoft Windows [Version 5.2.3790] >>(C) Copyright 1985-2003 Microsoft Corp. >> >>G:\Working\1>c:\Python24\python >>Python 2.4.1 (#65, Mar 30 2005, 09:13:57) [MSC v.1310 32 bit (Intel)] on >>win32 >>Type "help", "copyright", "credits" or "license" for more information. >> >>> from os import fork >>Traceback (most recent call last): >> File "", line 1, in ? >>ImportError: cannot import name fork >> >>> >> >> > >Python for Windows, if I remember correctly, has never supported forking. >This is because the underlying process execution code does not have >support for the standard copy-on-write semantic which makes unix fork >fast. 
> >Cygwin Python does support fork, but I believe this is through a literal >copying of the memory space, which is far slower than unix fork. > >Until Microsoft adds kernel support for fork, don't expect standard >Windows Python to support it. > > - Josiah > > > > > That is what i mean... sorry for being noisy... From tcdelaney at optusnet.com.au Sat Oct 15 02:30:07 2005 From: tcdelaney at optusnet.com.au (Tim Delaney) Date: Sat, 15 Oct 2005 10:30:07 +1000 Subject: [Python-Dev] LOAD_SELF and SELF_ATTR opcodes Message-ID: <000501c5d11f$98368b70$0201a8c0@ryoko> Sorry I can't reply to the message (I'm at home, and don't currently have python-dev sent there). I have a version of Raymond's constant binding recipe: http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/277940 that also binds all attribute accesses all the way down into a single constant call e.g. LOAD_FAST 0 LOAD_ATTR 'a' LOAD_ATTR 'b' LOAD_ATTR 'c' LOAD_ATTR 'd' is bound to a single constant: LOAD_CONST 5 where constant 5 is the object obtained from `self.a.b.c.d`. Unfortunately, I think it's at work - don't seem to have a copy here :( Obviously, this isn't applicable to as many cases, but it might be interesting to compare what kind of results this produces compared to LOAD_SELF/SELF_ATTR. Tim Delaney From blais at furius.ca Sat Oct 15 08:50:21 2005 From: blais at furius.ca (Martin Blais) Date: Sat, 15 Oct 2005 02:50:21 -0400 Subject: [Python-Dev] Divorcing str and unicode (no more implicit conversions). In-Reply-To: <2mbr27f0th.fsf@starship.python.net> References: <8393fff0510022309l5977e899led262cb3c1a7a805@mail.gmail.com> <2mbr27f0th.fsf@starship.python.net> Message-ID: <8393fff0510142350l81ba453md20cc47a445642ce@mail.gmail.com> On 10/3/05, Michael Hudson wrote: > Martin Blais writes: > > > How hard would that be to implement? > > import sys > reload(sys) > sys.setdefaultencoding('undefined') Hmmm any particular reason for the call to reload() here? 
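On the fork digression above: native Windows builds of Python omit os.fork because the Win32 process model has no copy-on-write fork, so portable code spawns a fresh process instead of forking. A small sketch of the portable route (subprocess.run is the modern spelling, well after this 2005 thread):

```python
import os
import subprocess
import sys

# os.fork exists only on POSIX builds; on native Windows the attribute is
# simply absent, which is why the quoted "cannot import name fork" appears.
has_fork = hasattr(os, "fork")

# Portable alternative: spawn a new interpreter as a child process.
result = subprocess.run(
    [sys.executable, "-c", "print(6 * 7)"],
    capture_output=True, text=True, check=True,
)
child_output = result.stdout.strip()  # "42"
```

The multiprocessing module, added in Python 2.6, later packaged this spawn-a-worker pattern for parallel code on both Unix and Windows.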
From martin at v.loewis.de Sat Oct 15 10:03:32 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 15 Oct 2005 10:03:32 +0200 Subject: [Python-Dev] LOAD_SELF and SELF_ATTR opcodes In-Reply-To: <17232.19421.938597.152797@montanaro.dyndns.org> References: <5.1.1.6.0.20051014151215.01f76bb0@mail.telecommunity.com> <5.1.1.6.0.20051014190111.0382c220@mail.telecommunity.com> <17232.19421.938597.152797@montanaro.dyndns.org> Message-ID: <4350B7D4.4000102@v.loewis.de> skip at pobox.com wrote: > (Send it to Raymond H. He'll probably sneak it in when Martin's not > looking. ) I'm not personally objecting :-) I just recall that there was that kind of objection when I proposed similar changes myself a couple of years ago. Regards, Martin From reinhold-birkenfeld-nospam at wolke7.net Sat Oct 15 10:01:14 2005 From: reinhold-birkenfeld-nospam at wolke7.net (Reinhold Birkenfeld) Date: Sat, 15 Oct 2005 10:01:14 +0200 Subject: [Python-Dev] Divorcing str and unicode (no more implicit conversions). In-Reply-To: <8393fff0510142350l81ba453md20cc47a445642ce@mail.gmail.com> References: <8393fff0510022309l5977e899led262cb3c1a7a805@mail.gmail.com> <2mbr27f0th.fsf@starship.python.net> <8393fff0510142350l81ba453md20cc47a445642ce@mail.gmail.com> Message-ID: Martin Blais wrote: > On 10/3/05, Michael Hudson wrote: >> Martin Blais writes: >> >> > How hard would that be to implement? >> >> import sys >> reload(sys) >> sys.setdefaultencoding('undefined') > > Hmmm any particular reason for the call to reload() here? Yes. setdefaultencoding() is removed from sys by site.py. To get it again you must reload sys. Reinhold -- Mail address is perfectly valid! From mwh at python.net Sat Oct 15 10:17:36 2005 From: mwh at python.net (Michael Hudson) Date: Sat, 15 Oct 2005 09:17:36 +0100 Subject: [Python-Dev] LOAD_SELF and SELF_ATTR opcodes In-Reply-To: <5.1.1.6.0.20051014151215.01f76bb0@mail.telecommunity.com> (Phillip J. 
Eby's message of "Fri, 14 Oct 2005 15:43:43 -0400") References: <5.1.1.6.0.20051014151215.01f76bb0@mail.telecommunity.com> Message-ID: <2mk6gf9otb.fsf@starship.python.net> "Phillip J. Eby" writes: > Indeed, even pystone doesn't do much attribute access on the first argument > of most of its functions, especially not those in inner loops. Only > Proc1() and the Record.copy() method do anything that would be helped by > SELF_ATTR. But it seems to me that this is very unusual for > object-oriented code, and that more common uses of Python should be helped > a lot more by this. Is it that unusual though? I don't think it's that unreasonable to suppose that 'typical smalltalk code' sends messages to self a good deal more often than 'typical python code'. I'm not saying that this *is* the case, but my intuition is that it might be (not all Python code is that object oriented, after all). Cheers, mwh -- The source passes lint without any complaint (if invoked with >/dev/null). -- Daniel Fischer, http://www.ioccc.org/1998/df.hint From greg.ewing at canterbury.ac.nz Sat Oct 15 11:58:12 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 15 Oct 2005 22:58:12 +1300 Subject: [Python-Dev] Simplify lnotab? (AST branch update) In-Reply-To: <5.1.1.6.0.20051014111829.01f89288@mail.telecommunity.com> References: <5.1.1.6.0.20051013214331.02f320a0@mail.telecommunity.com> <5.1.1.6.0.20051013205631.02f38e00@mail.telecommunity.com> <5.1.1.6.0.20051013205631.02f38e00@mail.telecommunity.com> <5.1.1.6.0.20051013214331.02f320a0@mail.telecommunity.com> <5.1.1.6.0.20051014111829.01f89288@mail.telecommunity.com> Message-ID: <4350D2B4.2040401@canterbury.ac.nz> Phillip J. Eby wrote: > At 08:07 PM 10/14/2005 +1300, Greg Ewing wrote: > >> More straightforwardly, the second table could just be a >> bytecode -> file number mapping. > > That would use more space in any case involving multiple files. Are you sure? 
Most of the time you're going to have chunks of contiguous lines coming from the same file, and the bytecode->filename table will only have an entry for the first bytecode of the first line of each chunk. I don't see how that works out differently from mapping bytecodes->lines and then lines->files. > That'd probably work. Greg From fredrik at pythonware.com Sat Oct 15 14:35:17 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Sat, 15 Oct 2005 14:35:17 +0200 Subject: [Python-Dev] Autoloading? (Making Queue.Queue easier to use) References: <20051012043518.x88h3g8pehsgo84c@login.werra.lunarpages.com> <1129228828.7198.24.camel@fsol> Message-ID: Antoine Pitrou wrote: > > unfortunately, this standard seem to result in generic "spamtools" modules > > into which people throw everything that's even remotely related to "spam", > > followed by complaints about bloat and performance from users, followed by > > various more or less stupid attempts to implement lazy loading of hidden in- > > ternal modules, followed by more complaints from users who no longer has > > a clear view of what's really going on in there... > > BTW, what's the performance problem in importing unnecessary stuff > (assuming pyc files are already generated) ? larger modules can easily take 0.1-0.2 seconds to import (at least if they use enough external dependencies). that may not be a lot of time in itself, but it can result in several seconds extra startup time for a larger program. importing unneeded modules also add to the process size, of course. you don't need to import too many modules to gobble up a couple of megabytes... From skip at pobox.com Sat Oct 15 15:15:43 2005 From: skip at pobox.com (skip@pobox.com) Date: Sat, 15 Oct 2005 08:15:43 -0500 Subject: [Python-Dev] Autoloading? 
(Making Queue.Queue easier to use) In-Reply-To: References: <20051012043518.x88h3g8pehsgo84c@login.werra.lunarpages.com> <1129228828.7198.24.camel@fsol> Message-ID: <17233.255.298868.972667@montanaro.dyndns.org> >> BTW, what's the performance problem in importing unnecessary stuff >> (assuming pyc files are already generated) ? Fredrik> larger modules can easily take 0.1-0.2 seconds to import (at Fredrik> least if they use enough external dependencies). I wish it was that short. At work we use lots of SWIG-wrapped C++ libraries. Whole lotta dynamic linking goin' on... In our case I don't think autoloading would help all that much. We actually use all that stuff. The best we could do would be to defer the link step for a couple seconds. Skip From pje at telecommunity.com Sat Oct 15 18:24:33 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Sat, 15 Oct 2005 12:24:33 -0400 Subject: [Python-Dev] LOAD_SELF and SELF_ATTR opcodes In-Reply-To: <2mk6gf9otb.fsf@starship.python.net> References: <5.1.1.6.0.20051014151215.01f76bb0@mail.telecommunity.com> <5.1.1.6.0.20051014151215.01f76bb0@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20051015121403.01f2fbf0@mail.telecommunity.com> At 09:17 AM 10/15/2005 +0100, Michael Hudson wrote: >"Phillip J. Eby" writes: > > > Indeed, even pystone doesn't do much attribute access on the first > argument > > of most of its functions, especially not those in inner loops. Only > > Proc1() and the Record.copy() method do anything that would be helped by > > SELF_ATTR. But it seems to me that this is very unusual for > > object-oriented code, and that more common uses of Python should be helped > > a lot more by this. > >Is it that unusual though? I don't think it's that unreasonable to >suppose that 'typical smalltalk code' sends messages to self a good >deal more often than 'typical python code'. I'm not saying that this >*is* the case, but my intuition is that it might be (not all Python >code is that object oriented, after all). 
Well, my greps on the stdlib suggest that about 50% of all lines doing attribute access include an attribute access on 'self'. So for the stdlib, it's darn common. Plus, all functions benefit a tiny bit from faster access to their first argument via the LOAD_SELF opcode, which is what produced the roughly 2% improvement on parrotbench. The overall performance question has more to do with whether any of those accesses to self or self attributes are in loops. A person who's experienced at doing Python performance tuning will probably lift as many of them out of the innermost loops as possible, thereby lessening the impact of the change somewhat. But someone who doesn't know to do that, or just hasn't done it yet, will get more benefit from the change, but not as much as they'd get by lifting out the attribute access. Thus my guess is that it'll speed up "typical", un-tuned Python code by a few %, and is unlikely to slow anything down - even compilation. (The compiler changes are extremely minimal and localized to the relevant bytecode emission points.) Seems like a freebie, all in all. From tcdelaney at optusnet.com.au Sat Oct 15 22:24:30 2005 From: tcdelaney at optusnet.com.au (Tim Delaney) Date: Sun, 16 Oct 2005 06:24:30 +1000 Subject: [Python-Dev] LOAD_SELF and SELF_ATTR opcodes Message-ID: <026101c5d1c6$72d1d090$0201a8c0@ryoko> Tim Delaney wrote: > that also binds all attribute accesses all the way down into a single > constant call e.g. > > LOAD_FAST 0 > LOAD_ATTR 'a' > LOAD_ATTR 'b' > LOAD_ATTR 'c' > LOAD_ATTR 'd' > > is bound to a single constant: > > LOAD_CONST 5
Tim Delaney From tonynelson at georgeanelson.com Sun Oct 16 02:12:23 2005 From: tonynelson at georgeanelson.com (Tony Nelson) Date: Sat, 15 Oct 2005 20:12:23 -0400 Subject: [Python-Dev] Unicode charmap decoders slow In-Reply-To: <434F5ACF.3000802@v.loewis.de> References: Message-ID: I have put up a new, packaged version of my fast charmap module at . Hopefully it is packaged properly and works properly (it works on my FC3 Python 2.3.4 system). This version is over 5 times faster than the base codec according to Hye-Shik Chang's benchmark (mostly from compiling it with -O3). I bring it up here mostly because I mention in its docs that improved faster charmap codecs are coming from the Python developers. Is it OK to say that, and have I said it right? I'll take that out if you folks want. I understand that my module is not favored by Martin v. Löwis, and I don't advocate it becoming part of Python. My web page and docs say that it may be useful until Python has the faster codecs. It allows changing the mappings because that is part of the current semantics -- a new version of Python can certainly change those semantics. I want to thank you all for so quickly going to work on the problem of making charmap codecs faster. It's to the benefit of Python users everywhere to have faster charmap codecs in Python. Your quickness impressed me. BTW, Martin, if you care to, would you explain to me how a Trie would be used for charmap encoding? I know a couple of approaches, but I don't know how to do it fast. (I've never actually had the occasion to use a Trie.) ____________________________________________________________________ TonyN.:' ' From ncoghlan at gmail.com Sun Oct 16 05:01:24 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 16 Oct 2005 13:01:24 +1000 Subject: [Python-Dev] Sourceforge CVS access In-Reply-To: References: <43468417.4000701@iinet.net.au> <43473E51.4010103@gmail.com> Message-ID: <4351C284.1040303@gmail.com> Guido van Rossum wrote: > You're in.
Use it wisely. Let me know if there are things you still > cannot do. (But I'm not used to being SF project admin any more; other > admins may be able to help you quicker...) Almost there - checking out over SSH failed to work. I checked the python SF admin page, and I still only have read access to the CVS repository. So if one of the SF admins could flip that last switch, that would be great :) Regards, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.blogspot.com From kbk at shore.net Sun Oct 16 06:34:07 2005 From: kbk at shore.net (Kurt B. Kaiser) Date: Sun, 16 Oct 2005 00:34:07 -0400 (EDT) Subject: [Python-Dev] Weekly Python Patch/Bug Summary Message-ID: <200510160434.j9G4Y7HG022965@bayview.thirdcreek.com> Patch / Bug Summary ___________________ Patches : 344 open ( +3) / 2955 closed ( +2) / 3299 total ( +5) Bugs : 883 open ( -1) / 5341 closed (+20) / 6224 total (+19) RFE : 201 open ( +5) / 187 closed ( +0) / 388 total ( +5) New / Reopened Patches ______________________ Compiling and linking main() with C++ compiler (2005-10-12) http://python.org/sf/1324762 opened by Christoph Ludwig Adding redblack tree to collections module (2005-10-12) http://python.org/sf/1324770 opened by Hye-Shik Chang Letting "build_ext --libraries" take more than one lib (2005-10-13) http://python.org/sf/1326113 opened by Stephen A. 
Langer VS NET 2003 Project Files contain per file compiler settings (2005-10-15) http://python.org/sf/1327377 opened by Juan-Carlos Lopez-Garcia Static Windows Build fails to locate existing installation (2005-10-15) http://python.org/sf/1327594 opened by Doe, Baby Patches Closed ______________ Encoding alias "unicode-1-1-utf-7" (2005-07-26) http://python.org/sf/1245379 closed by doerwalter Py_INCREF/Py_DECREF with magic constant demo (2005-10-07) http://python.org/sf/1316653 closed by rhamphoryncus New / Reopened Bugs ___________________ irregular behavior within class using __setitem__ (2005-10-08) CLOSED http://python.org/sf/1317376 opened by capnSTABN Missing Library Modules (2005-10-09) CLOSED http://python.org/sf/1321736 opened by George LeCompte Minor error in the Library Reference doc (2005-10-10) CLOSED http://python.org/sf/1323294 opened by Colin J. Williams getwindowsversion() constants in sys module (2005-10-11) http://python.org/sf/1323369 opened by Tony Meyer C API doc for PySequence_Tuple duplicated (2005-10-11) CLOSED http://python.org/sf/1323739 opened by George Yoshida MSI installer not working (2005-10-11) CLOSED http://python.org/sf/1323810 opened by Eric Rucker ISO8859-9 broken (2005-10-11) http://python.org/sf/1324237 opened by Eray Ozkural Curses module doesn't install on Solaris 2.8 (2005-10-12) http://python.org/sf/1324799 opened by Andrew Koenig "as" keyword sometimes highlighted in strings (2005-10-12) http://python.org/sf/1325071 opened by Artur de Sousa Rocha binary code made by freeze results "unknown encoding" (2005-10-13) CLOSED http://python.org/sf/1325491 opened by greatPython Curses,h (2005-10-13) CLOSED http://python.org/sf/1325611 reopened by rbrenner Curses,h (2005-10-13) CLOSED http://python.org/sf/1325611 opened by hafnium Curses,h (2005-10-13) CLOSED http://python.org/sf/1325903 opened by hafnium traceback.py formats SyntaxError differently (2005-10-13) http://python.org/sf/1326077 opened by Neil Schemenauer odd behaviour 
when making lists of lambda forms (2005-10-13) CLOSED http://python.org/sf/1326195 opened by Johan Hidding itertools.count wraps around after maxint (2005-10-13) http://python.org/sf/1326277 opened by paul rubin pdb breaks programs which import from __main__ (2005-10-13) http://python.org/sf/1326406 opened by Ilya Sandler set.__getstate__ is not overriden (2005-10-14) http://python.org/sf/1326448 opened by George Sakkis SIGALRM alarm signal kills interpreter (2005-10-14) CLOSED http://python.org/sf/1326841 opened by paul rubin wrong TypeError traceback in generator expressions (2005-10-14) http://python.org/sf/1327110 opened by Yusuke Shinyama title() uppercases latin1 strings after accented letters (2005-10-14) CLOSED http://python.org/sf/1327233 opened by Humberto Di?genes Bugs Closed ___________ Segmentation fault with invalid "coding" (2005-10-07) http://python.org/sf/1316162 closed by birkenfeld irregular behavior within class using __setitem__ (2005-10-08) http://python.org/sf/1317376 closed by ncoghlan python.exe 2.4.2 compiled with VS2005 crashes (2005-10-03) http://python.org/sf/1311784 closed by loewis 2.4.2 make problems (2005-10-03) http://python.org/sf/1311579 closed by loewis 2.4.1 windows MSI has no _socket (2005-09-24) http://python.org/sf/1302793 closed by loewis The _ssl build process for 2.3.5 is broken (2005-09-16) http://python.org/sf/1292634 closed by loewis codecs.readline sometimes removes newline chars (2005-04-02) http://python.org/sf/1175396 closed by doerwalter Missing Library Modules (2005-10-09) http://python.org/sf/1321736 closed by nnorwitz inspect.getsourcelines() broken (2005-10-07) http://python.org/sf/1315961 closed by doerwalter Minor error in the Library Reference doc (2005-10-10) http://python.org/sf/1323294 closed by nnorwitz failure to build RPM on rhel 3 (2005-07-28) http://python.org/sf/1246900 closed by jafo C API doc for PySequence_Tuple duplicated (2005-10-11) http://python.org/sf/1323739 closed by nnorwitz MSI 
installer not working (2005-10-11) http://python.org/sf/1323810 closed by loewis [AST] Patch [ 1190012 ] should've checked for SyntaxWarnings (2005-05-04) http://python.org/sf/1195576 closed by nascheme binary code made by freeze results "unknown encoding" (2005-10-13) http://python.org/sf/1325491 closed by perky Curses,h (2005-10-13) http://python.org/sf/1325611 closed by birkenfeld Curses,h (2005-10-13) http://python.org/sf/1325611 closed by perky Curses,h (2005-10-13) http://python.org/sf/1325903 closed by birkenfeld odd behaviour when making lists of lambda forms (2005-10-13) http://python.org/sf/1326195 closed by rhettinger SIGALRM alarm signal kills interpreter (2005-10-14) http://python.org/sf/1326841 closed by loewis title() uppercases latin1 strings after accented letters (2005-10-15) http://python.org/sf/1327233 closed by perky New / Reopened RFE __________________ itemgetter built-in? (2005-10-10) http://python.org/sf/1322308 opened by capnSTABN fix for ms stdio tables (2005-10-11) http://python.org/sf/1324176 opened by Vambola Kotkas __name__ available during class dictionary build (2005-10-12) http://python.org/sf/1324261 opened by Adal Chiriliuc python scratchpad (2005-10-14) http://python.org/sf/1326830 opened by paul rubin From guido at python.org Sun Oct 16 06:39:02 2005 From: guido at python.org (Guido van Rossum) Date: Sat, 15 Oct 2005 21:39:02 -0700 Subject: [Python-Dev] Sourceforge CVS access In-Reply-To: <4351C284.1040303@gmail.com> References: <43468417.4000701@iinet.net.au> <43473E51.4010103@gmail.com> <4351C284.1040303@gmail.com> Message-ID: Somebody help Nick! This is beyond my SF-fu! :-( On 10/15/05, Nick Coghlan wrote: > Guido van Rossum wrote: > > You're in. Use it wisely. Let me know if there are things you still > > cannot do. (But I'm not used to being SF project admin any more; other > > admins may be able to help you quicker...) > > Almost there - checking out over SSH failed to work.
I checked the python SF > admin page, and I still only have read access to the CVS repository. So if one > of the SF admins could flip that last switch, that would be great :) > > Regards, > Nick. > > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > --------------------------------------------------------------- > http://boredomandlaziness.blogspot.com > > -- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Sun Oct 16 06:45:41 2005 From: guido at python.org (Guido van Rossum) Date: Sat, 15 Oct 2005 21:45:41 -0700 Subject: [Python-Dev] Sourceforge CVS access In-Reply-To: References: <43468417.4000701@iinet.net.au> <43473E51.4010103@gmail.com> <4351C284.1040303@gmail.com> Message-ID: With Neal's help I've fixed Nick's permissions. Enjoy, Nick! On 10/15/05, Guido van Rossum wrote: > Somebody help Nick! This is beyond my SF-fu! :-( > > On 10/15/05, Nick Coghlan wrote: > > Guido van Rossum wrote: > > > You're in. Use it wisely. Let me know if there are things you still > > > cannot do. (But I'm not used to being SF project admin any more; other > > > admins may be able to help you quicker...) > > > > Almost there - checking out over SSH failed to work. I checked the python SF > > admin page, and I still only have read access to the CVS repository. So if one > > of the SF admins could flip that last switch, that would be great :) > > > > Regards, > > Nick. 
> > > > -- > > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > > --------------------------------------------------------------- > > http://boredomandlaziness.blogspot.com > > > > > > > -- > --Guido van Rossum (home page: http://www.python.org/~guido/) > > -- --Guido van Rossum (home page: http://www.python.org/~guido/) From ncoghlan at gmail.com Sun Oct 16 06:54:44 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 16 Oct 2005 14:54:44 +1000 Subject: [Python-Dev] Sourceforge CVS access In-Reply-To: References: <43468417.4000701@iinet.net.au> <43473E51.4010103@gmail.com> <4351C284.1040303@gmail.com> Message-ID: <4351DD14.4050302@gmail.com> Guido van Rossum wrote: > With Neal's help I've fixed Nick's permissions. Enjoy, Nick! Thanks folks - it looks to be all good now. Regards, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.blogspot.com From jeremy at alum.mit.edu Sun Oct 16 07:30:26 2005 From: jeremy at alum.mit.edu (Jeremy Hylton) Date: Sun, 16 Oct 2005 01:30:26 -0400 Subject: [Python-Dev] AST branch merge status Message-ID: I just merged the head back to the AST branch for what I hope is the last time. I plan to merge the branch to the head on Sunday evening. I'd appreciate it if folks could hold off on making changes on the trunk until that merge happens. If this is a non-trivial inconvenience for anyone, go ahead with the changes but send me mail to make sure that I don't lose the changes when I do the merge. Regardless, the compiler and Grammar are off limits. I'll blow away any changes you make there <0.1 wink>. 
Jeremy From martin at v.loewis.de Sun Oct 16 11:56:01 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 16 Oct 2005 11:56:01 +0200 Subject: [Python-Dev] Unicode charmap decoders slow In-Reply-To: References: Message-ID: <435223B1.2020209@v.loewis.de> Tony Nelson wrote: > BTW, Martin, if you care to, would you explain to me how a Trie would be > used for charmap encoding? I know a couple of approaches, but I don't know > how to do it fast. (I've never actually had the occasion to use a Trie.) I currently envision a three-level trie, with 5, 4, and 7 bits. You take the Unicode character (only characters below U+FFFF supported), and take the uppermost 5 bits, as an index into an array. There you find the base of a second array, to which you add the next 4 bits, which gives you an index into a third array, where you add the last 7 bits. This gives you the character, or 0 if it is unmappable. struct encoding_map{ unsigned char level0[32]; unsigned char *level1; unsigned char *level2; }; struct encoding_map *table; Py_UNICODE character; int level1 = table->level0[character>>11]; if(level1==0xFF)raise unmapped; int level2 = table->level1[16*level1 + ((character>>7) & 0xF)]; if(level2==0xFF)raise unmapped; int mapped = table->level2[128*level2 + (character & 0x7F)]; if(mapped==0)raise unmapped; Over a hashtable, this has the advantage of not having to deal with collisions. Instead, it guarantees you a lookup in a constant time. It is also quite space-efficient: all tables use bytes as indices. As each level0 entry deals with 2048 characters, most character maps will only use one or two level1 blocks, meaning 16 or 32 bytes for level1. The number of level2 blocks required depends on the number of 128-character rows which the encoding spans; for most encodings, three or four such blocks will be sufficient (with ASCII spanning one such block typically), causing the entire memory consumption for many encodings to be less than 600 bytes.
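For illustration, the 5-4-7 lookup described above can be sketched in Python (a toy sketch only, not the proposed C implementation: the table builder is simplified, block counts are assumed to stay below 0xFF, and the input is a plain {code point: byte} dict):

```python
# Toy sketch of the 5-4-7 three-level trie for charmap encoding.
# 0xFF in level0/level1 marks "no block"; 0 in level2 marks "unmappable"
# (U+0000 itself would need to be special-cased, as noted above).

def build_tables(mapping):
    """Build (level0, level1, level2) byte tables from a
    {unicode_codepoint: byte} dict (code points below U+FFFF)."""
    level0 = [0xFF] * 32          # indexed by top 5 bits
    level1 = []                   # 16-entry blocks, indexed by next 4 bits
    level2 = []                   # 128-entry blocks, indexed by low 7 bits
    for cp, byte in mapping.items():
        i0, i1, i2 = cp >> 11, (cp >> 7) & 0xF, cp & 0x7F
        if level0[i0] == 0xFF:                 # allocate a level1 block
            level0[i0] = len(level1) // 16
            level1.extend([0xFF] * 16)
        block1 = level0[i0]
        if level1[16 * block1 + i1] == 0xFF:   # allocate a level2 block
            level1[16 * block1 + i1] = len(level2) // 128
            level2.extend([0] * 128)
        block2 = level1[16 * block1 + i1]
        level2[128 * block2 + i2] = byte
    return level0, level1, level2

def encode_char(tables, cp):
    """Constant-time lookup: returns the mapped byte, or None."""
    level0, level1, level2 = tables
    block1 = level0[cp >> 11]
    if block1 == 0xFF:
        return None
    block2 = level1[16 * block1 + ((cp >> 7) & 0xF)]
    if block2 == 0xFF:
        return None
    byte = level2[128 * block2 + (cp & 0x7F)]
    return byte or None
```

Two mapped code points that share the top 5 bits but differ in the middle 4 bits end up in one level1 block but two level2 blocks, matching the space figures given above.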
It would be possible to remove one level of indirection (giving some more speed) at the expense of additional memory: for example, an 8-8 trie would use 256 bytes for level 0, and then 256 bytes for each Unicode row where the encoding has characters, likely resulting in 1kiB for a typical encoding. Hye-Shik Chang's version of the fast codec uses such an 8-8 trie, but conserves space by observing that most 256-char rows are only sparsely used by encodings (e.g. if you have only ASCII, you use only 128 characters from row 0). He therefore strips unused cells from the row, by also recording the first and last character per row. This brings some space back, at the expense of slow-down again. Regards, Martin From ncoghlan at iinet.net.au Sun Oct 16 14:46:44 2005 From: ncoghlan at iinet.net.au (Nick Coghlan) Date: Sun, 16 Oct 2005 22:46:44 +1000 Subject: [Python-Dev] PEP 343 updated Message-ID: <43524BB4.7040808@iinet.net.au> PEP 343 has been updated on python.org. Highlights of the changes: - changed the name of the PEP to be simply "The 'with' Statement" - added __with__() method - added section on standard terminology (that is, contexts/context managers) - changed generator context decorator name to "context" - Updated "Resolved Issues" section - updated decimal.Context() example - updated closing() example so it works for objects without close methods I also added a new Open Issues section with the questions: - should the decorator be called "context" or something else, such as the old "contextmanager"? (The PEP currently says "context") - should the decorator be a builtin? (The PEP currently says yes) - should the decorator be applied automatically to generators used to write __with__ methods? (The PEP currently says yes) Cheers, Nick.
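For readers following along, the generator-based decorator the PEP discusses can be sketched like this (an illustration only: it borrows the PEP's name "context" by aliasing `contextlib.contextmanager`, the spelling that eventually shipped, and `opening` is our own example name, not part of the PEP):

```python
# Sketch of PEP 343's generator decorator, approximated with the
# contextlib machinery so the example actually runs.
from contextlib import contextmanager as context

@context
def opening(filename):
    # Code before the yield runs on entry to the with block;
    # the finally clause runs on exit, exception or not.
    f = open(filename)
    try:
        yield f
    finally:
        f.close()
```

Usage is then `with opening(name) as f: ...`, and the file is guaranteed closed on the way out of the block.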
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.blogspot.com From ncoghlan at iinet.net.au Sun Oct 16 15:56:10 2005 From: ncoghlan at iinet.net.au (Nick Coghlan) Date: Sun, 16 Oct 2005 23:56:10 +1000 Subject: [Python-Dev] Definining properties - a use case for class decorators? Message-ID: <43525BFA.9090309@iinet.net.au> On and off, I've been looking for an elegant way to handle properties using decorators. It hasn't really worked, because decorators are inherently single function, and properties span multiple functions. However, it occurred to me that Python already contains a construct for grouping multiple related functions together: classes. And that thought led me to this decorator: def def_property(cls_func): cls = cls_func() try: fget = cls.get.im_func except AttributeError: fget = None try: fset = cls.set.im_func except AttributeError: fset = None try: fdel = cls.delete.im_func except AttributeError: fdel = None return property(fget, fset, fdel, cls.__doc__) Obviously, this decorator can only be used by decorating a function that returns the class of interest: class Demo(object): @def_property def test(): class prop: """This is a test property""" def get(self): print "Getting attribute on instance" def set(self, value): print "Setting attribute on instance" def delete(self): print "Deleting attribute on instance" return prop Which gives the following behaviour: Py> Demo.test Py> Demo().test Getting attribute on instance Py> Demo().test = 1 Setting attribute on instance Py> del Demo().test Deleting attribute on instance Py> help(Demo.test) Help on property: This is a test property = get(self) = set(self, value) = delete(self) If we had class decorators, though, the decorator could be modified to skip the function invocation: def def_property(cls): try: fget = cls.get.im_func except AttributeError: fget = None try: fset = cls.set.im_func except 
AttributeError: fset = None try: fdel = cls.delete.im_func except AttributeError: fdel = None return property(fget, fset, fdel, cls.__doc__) And the usage would be much more direct: class Demo(object): @def_property class test: """This is a test property""" def get(self): print "Getting attribute on instance" def set(self, value): print "Setting attribute on instance" def delete(self): print "Deleting attribute on instance" Comments? Screams of horror? Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.blogspot.com From solipsis at pitrou.net Sun Oct 16 16:08:09 2005 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sun, 16 Oct 2005 16:08:09 +0200 Subject: [Python-Dev] Definining properties - a use case for class decorators? In-Reply-To: <43525BFA.9090309@iinet.net.au> References: <43525BFA.9090309@iinet.net.au> Message-ID: <1129471690.6133.9.camel@fsol> > class Demo(object): > @def_property > class test: > """This is a test property""" > def get(self): > print "Getting attribute on instance" > def set(self, value): > print "Setting attribute on instance" > def delete(self): > print "Deleting attribute on instance" The code looks like "self" refers to a test instance, but it will actually refer to a Demo instance. This is quite misleading. From gary at modernsongs.com Sun Oct 16 16:18:54 2005 From: gary at modernsongs.com (Gary Poster) Date: Sun, 16 Oct 2005 10:18:54 -0400 Subject: [Python-Dev] Definining properties - a use case for class decorators? In-Reply-To: <43525BFA.9090309@iinet.net.au> References: <43525BFA.9090309@iinet.net.au> Message-ID: <414317A5-0B04-4250-B458-A8B8A74AC221@modernsongs.com> On Oct 16, 2005, at 9:56 AM, Nick Coghlan wrote: > On and off, I've been looking for an elegant way to handle > properties using > decorators. 
This isn't my idea, and it might have been brought up here in the past to the same sorts of screams of horror to which you refer later, but I use the 'apply' pattern without too many internal objections for this: class Foo(object): # just a simple example, practically pointless _my_property = None @apply def my_property(): def get(self): return self._my_property def set(self, value): self._my_property = value return property(get, set) IMHO, I find this easier to parse than either of your two examples. Apologies if this has already been screamed at. :-) Gary From guido at python.org Sun Oct 16 17:19:07 2005 From: guido at python.org (Guido van Rossum) Date: Sun, 16 Oct 2005 08:19:07 -0700 Subject: [Python-Dev] PEP 343 updated In-Reply-To: <43524BB4.7040808@iinet.net.au> References: <43524BB4.7040808@iinet.net.au> Message-ID: On 10/16/05, Nick Coghlan wrote: > PEP 343 has been updated on python.org. > > Highlights of the changes: > > - changed the name of the PEP to be simply "The 'with' Statement" > - added __with__() method > - added section on standard terminology (that is, contexts/context managers) > - changed generator context decorator name to "context" > - Updated "Resolved Issues" section > - updated decimal.Context() example > - updated closing() example so it works for objects without close methods > > I also added a new Open Issues section with the questions: > > - should the decorator be called "context" or something else, such as the > old "contextmanager"? (The PEP currently says "context") > - should the decorator be a builtin? (The PEP currently says yes) > - should the decorator be applied automatically to generators used to write > __with__ methods? (The PEP currently says yes) I hope you reverted the status to "Proposed"... On the latter: I think it shouldn't; I don't like this kind of magic. I'll have to read it before I can comment on the rest. 
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Sun Oct 16 17:23:15 2005 From: guido at python.org (Guido van Rossum) Date: Sun, 16 Oct 2005 08:23:15 -0700 Subject: [Python-Dev] Definining properties - a use case for class decorators? In-Reply-To: <43525BFA.9090309@iinet.net.au> References: <43525BFA.9090309@iinet.net.au> Message-ID: On 10/16/05, Nick Coghlan wrote: > On and off, I've been looking for an elegant way to handle properties using > decorators. > > It hasn't really worked, because decorators are inherently single function, > and properties span multiple functions. > > However, it occurred to me that Python already contains a construct for > grouping multiple related functions together: classes. Nick, and everybody else trying to find a "solution" for this "problem", please don't. There's nothing wrong with having the three accessor methods explicitly in the namespace, it's clear, and probably less typing (and certainly less indenting!). Just write this: class C: def getFoo(self): ... def setFoo(self): ... foo = property(getFoo, setFoo) -- --Guido van Rossum (home page: http://www.python.org/~guido/) From ark at acm.org Sun Oct 16 17:26:09 2005 From: ark at acm.org (Andrew Koenig) Date: Sun, 16 Oct 2005 11:26:09 -0400 Subject: [Python-Dev] PEP 343 updated In-Reply-To: <43524BB4.7040808@iinet.net.au> Message-ID: <000f01c5d265$f232c2f0$6402a8c0@arkdesktop> > PEP 343 has been updated on python.org. > Highlights of the changes: > - changed the name of the PEP to be simply "The 'with' Statement" Do you mean PEP 346, perchance? PEP 343 is something else entirely. From ironfroggy at gmail.com Sun Oct 16 17:51:30 2005 From: ironfroggy at gmail.com (Calvin Spealman) Date: Sun, 16 Oct 2005 11:51:30 -0400 Subject: [Python-Dev] Definining properties - a use case for class decorators? 
In-Reply-To: References: <43525BFA.9090309@iinet.net.au> Message-ID: <76fd5acf0510160851l2aeacc6x156a203ca0c7ca60@mail.gmail.com> On 10/16/05, Guido van Rossum wrote: > On 10/16/05, Nick Coghlan wrote: > > On and off, I've been looking for an elegant way to handle properties using > > decorators. > > > > It hasn't really worked, because decorators are inherently single function, > > and properties span multiple functions. > > > > However, it occurred to me that Python already contains a construct for > > grouping multiple related functions together: classes. > > Nick, and everybody else trying to find a "solution" for this > "problem", please don't. There's nothing wrong with having the three > accessor methods explicitly in the namespace, it's clear, and probably > less typing (and certainly less indenting!). Just write this: > > class C: > def getFoo(self): ... > def setFoo(self): ... > foo = property(getFoo, setFoo) Does this necessarily mean a 'no' still for class decorators, or do you just not like this particular use case for them? Or, are you perhaps against this proposal due to its use of nested classes? From ironfroggy at gmail.com Sun Oct 16 17:56:36 2005 From: ironfroggy at gmail.com (Calvin Spealman) Date: Sun, 16 Oct 2005 11:56:36 -0400 Subject: [Python-Dev] Early PEP draft (For Python 3000?) In-Reply-To: <20051014000927.919E.JCARLSON@uci.edu> References: <76fd5acf0510132316x6a8bcc8ck1c3d5a812abd447e@mail.gmail.com> <20051014000927.919E.JCARLSON@uci.edu> Message-ID: <76fd5acf0510160856r72f87f7fj51ad48c97003b810@mail.gmail.com> On 10/14/05, Josiah Carlson wrote: > > Calvin Spealman wrote: > > > > On 10/11/05, Eyal Lotem wrote: > > > locals()['x'] = 1 # Quietly fails! > > > Replaced by: > > > frame.x = 1 # Raises error > > > > What about the possibility of making this hypothetic frame object an > > indexable, such that frame[0] is the current scope, frame[1] is the
On the same lines, what about closure[0] for the > > current frame, while closure[1] resolves to the closure the function > > was defined in? These would ensure that you could reliably access any > > namespace you would need, without nasty stack tricks and such, and > > would make working around some of the limitation of the closures, when > > you have such a need. One might even consider a __resolve__ to be > > defined in any namespace, allowing all the namespace resolution rules > > to be overridden by code at any level. > > -1000 If you want a namespace, create one and pass it around. If the > writer of a function or method wanted you monkeying around with a > namespace, they would have given you one to work with. If they want you monkeying around with their namespace or not, you can do so with various tricks introspecting the frame stack and other internals. I was merely suggesting this as something more standardized, perhaps across the various Python implementations. It would also provide a single point of restriction when you want to disable such things. > As for closure monkeywork, you've got to be kidding. Closures in Python > are a clever and interesting way of keeping around certain things, but > are actually unnecessary with the existance of class and instance > namespaces. Every example of a closure can be re-done as a > class/instance, and many end up looking better. i mostly put that in there for completeness. From guido at python.org Sun Oct 16 18:20:13 2005 From: guido at python.org (Guido van Rossum) Date: Sun, 16 Oct 2005 09:20:13 -0700 Subject: [Python-Dev] Definining properties - a use case for class decorators? 
In-Reply-To: <76fd5acf0510160851l2aeacc6x156a203ca0c7ca60@mail.gmail.com> References: <43525BFA.9090309@iinet.net.au> <76fd5acf0510160851l2aeacc6x156a203ca0c7ca60@mail.gmail.com> Message-ID: On 10/16/05, Calvin Spealman wrote: > On 10/16/05, Guido van Rossum wrote: > > Nick, and everybody else trying to find a "solution" for this > > "problem", please don't. There's nothing wrong with having the three > > accessor methods explicitly in the namespace, it's clear, and probably > > less typing (and certainly less indenting!). Just write this: > > > > class C: > > def getFoo(self): ... > > def setFoo(self): ... > > foo = property(getFoo, setFoo) > > Does this necessisarily mean a 'no' still for class decorators, or do > you just not like this particular use case for them. Or, are you > perhaps against this proposal due to its use of nested classes? I'm still -0 on class decorators pending good use cases. I'm -1 on using a class decorator (if we were to introduce them) for get/set properties; it doesn't save writing and it doesn't save reading. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From martin at v.loewis.de Sun Oct 16 18:53:24 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 16 Oct 2005 18:53:24 +0200 Subject: [Python-Dev] Unicode charmap decoders slow In-Reply-To: References: Message-ID: <43528584.7090306@v.loewis.de> Tony Nelson wrote: > Umm, 0 (NUL) is a valid output character in most of the 8-bit character > sets. It could be handled by having a separate "exceptions" string of the > unicode code points that actually map to the exception char. Yes. But only U+0000 should normally map to 0. It could be special-cased altogether. > As you are concerned about pathological cases for hashing (that would make > the hash chains long), it is worth noting that in such cases this data > structure could take 64K bytes. 
Of course, no such case occurs in standard > encodings, and 64K is affordable as long as it is rare. I'm not concerned with long hash chains, I dislike having collisions in the first place (even if they are only for two code points). Having to deal with collisions makes the code more complicated, and less predictable. It's primarily a matter of taste: avoid hashtables if you can :-) Regards, Martin From jcarlson at uci.edu Sun Oct 16 19:22:14 2005 From: jcarlson at uci.edu (Josiah Carlson) Date: Sun, 16 Oct 2005 10:22:14 -0700 Subject: [Python-Dev] Early PEP draft (For Python 3000?) In-Reply-To: <76fd5acf0510160856r72f87f7fj51ad48c97003b810@mail.gmail.com> References: <20051014000927.919E.JCARLSON@uci.edu> <76fd5acf0510160856r72f87f7fj51ad48c97003b810@mail.gmail.com> Message-ID: <20051016100016.37E3.JCARLSON@uci.edu> Calvin Spealman wrote: > > On 10/14/05, Josiah Carlson wrote: > > > > Calvin Spealman wrote: > > > > > > On 10/11/05, Eyal Lotem wrote: > > > > locals()['x'] = 1 # Quietly fails! > > > > Replaced by: > > > > frame.x = 1 # Raises error > > > > > > What about the possibility of making this hypothetic frame object an > > > indexable, such that frame[0] is the current scope, frame[1] is the > > > calling scope, etc.? On the same lines, what about closure[0] for the > > > current frame, while closure[1] resolves to the closure the function > > > was defined in? These would ensure that you could reliably access any > > > namespace you would need, without nasty stack tricks and such, and > > > would make working around some of the limitation of the closures, when > > > you have such a need. One might even consider a __resolve__ to be > > > defined in any namespace, allowing all the namespace resolution rules > > > to be overridden by code at any level. > > > > -1000 If you want a namespace, create one and pass it around. If the > > > writer of a function or method wanted you monkeying around with a > > > namespace, they would have given you one to work with.
> > If they want you monkeying around with their namespace or not, you can > do so with various tricks introspecting the frame stack and other > internals. I was merely suggesting this as something more > standardized, perhaps across the various Python implementations. It > would also provide a single point of restriction when you want to > disable such things. What I'm saying is that whether or not you can modify the contents of stack frames via tricks, you shouldn't. Why? Because as I said, if the writer wanted you to be hacking around with a namespace, they should have passed you a shared namespace. From what I understand, there are very few (good) reasons why a user should muck with stack frames, among them because it is quite convenient to write custom traceback printers (like web CGI, etc.), and if one is tricky, limit the callers of a function/method to those "allowable". There may be other good reasons, but until you offer a use-case that is compelling for reasons why it should be easier to access and/or modify the contents of stack frames, I'm going to remain at -1000. - Josiah From ironfroggy at gmail.com Sun Oct 16 19:37:17 2005 From: ironfroggy at gmail.com (Calvin Spealman) Date: Sun, 16 Oct 2005 13:37:17 -0400 Subject: [Python-Dev] Early PEP draft (For Python 3000?) In-Reply-To: <76fd5acf0510161036i4ab09e2cu39bd6961a60df783@mail.gmail.com> References: <20051014000927.919E.JCARLSON@uci.edu> <76fd5acf0510160856r72f87f7fj51ad48c97003b810@mail.gmail.com> <20051016100016.37E3.JCARLSON@uci.edu> <76fd5acf0510161036i4ab09e2cu39bd6961a60df783@mail.gmail.com> Message-ID: <76fd5acf0510161037v477874b0w5595e3edffe71511@mail.gmail.com> On 10/16/05, Josiah Carlson wrote: > > Calvin Spealman wrote: > > > > On 10/14/05, Josiah Carlson wrote: > > > > > > Calvin Spealman wrote: > > > > > > > > On 10/11/05, Eyal Lotem wrote: > > > > > locals()['x'] = 1 # Quietly fails! 
> > > > > Replaced by: > > > > > frame.x = 1 # Raises error > > > > > > > > What about the possibility of making this hypothetic frame object an > > > > indexable, such that frame[0] is the current scope, frame[1] is the > > > > calling scope, etc.? On the same lines, what about closure[0] for the > > > > current frame, while closure[1] resolves to the closure the function > > > > was defined in? These would ensure that you could reliably access any > > > > namespace you would need, without nasty stack tricks and such, and > > > > would make working around some of the limitation of the closures, when > > > > you have such a need. One might even consider a __resolve__ to be > > > > defined in any namespace, allowing all the namespace resolution rules > > > > to be overridden by code at any level. > > > > > > -1000 If you want a namespace, create one and pass it around. If the > > > writer of a function or method wanted you monkeying around with a > > > namespace, they would have given you one to work with. > > > > If they want you monkeying around with their namespace or not, you can > > do so with various tricks introspecting the frame stack and other > > internals. I was merely suggesting this as something more > > standardized, perhaps across the various Python implementations. It > > would also provide a single point of restriction when you want to > > disable such things. > > What I'm saying is that whether or not you can modify the contents of > stack frames via tricks, you shouldn't. Why? Because as I said, if the > writer wanted you to be hacking around with a namespace, they should > have passed you a shared namespace. > > From what I understand, there are very few (good) reasons why a user > should muck with stack frames, among them because it is quite convenient > to write custom traceback printers (like web CGI, etc.), and if one is > tricky, limit the callers of a function/method to those "allowable". 
> There may be other good reasons, but until you offer a use-case that is > compelling for reasons why it should be easier to access and/or modify > the contents of stack frames, I'm going to remain at -1000. I think I was wording this badly. I meant to suggest this as a way to define nested functions (or classes?) and probably access names from various levels of scope. In this way, a nested function would be able to say "bind the name 'a' in the namespace in which I am defined to this object", thus offering more fine grained approached than the current global keyword. I know there has been talk of this issue before, but I don't know if it works with or against anything said for this previously. From blais at furius.ca Mon Oct 17 02:26:43 2005 From: blais at furius.ca (Martin Blais) Date: Sun, 16 Oct 2005 20:26:43 -0400 Subject: [Python-Dev] Divorcing str and unicode (no more implicit conversions). In-Reply-To: References: <8393fff0510022309l5977e899led262cb3c1a7a805@mail.gmail.com> <2mbr27f0th.fsf@starship.python.net> <8393fff0510142350l81ba453md20cc47a445642ce@mail.gmail.com> Message-ID: <8393fff0510161726i6fd5c798u38aa3875c8f6ac4d@mail.gmail.com> On 10/15/05, Reinhold Birkenfeld wrote: > Martin Blais wrote: > > On 10/3/05, Michael Hudson wrote: > >> Martin Blais writes: > >> > >> > How hard would that be to implement? > >> > >> import sys > >> reload(sys) > >> sys.setdefaultencoding('undefined') > > > > Hmmm any particular reason for the call to reload() here? > > Yes. setdefaultencoding() is removed from sys by site.py. To get it > again you must reload sys. Thanks. cheers, From greg.ewing at canterbury.ac.nz Mon Oct 17 03:42:02 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Mon, 17 Oct 2005 14:42:02 +1300 Subject: [Python-Dev] Definining properties - a use case for class decorators? 
In-Reply-To: References: <43525BFA.9090309@iinet.net.au> Message-ID: <4353016A.1010707@canterbury.ac.nz> Guido van Rossum wrote: > Nick, and everybody else trying to find a "solution" for this > "problem", please don't. Denying that there's a problem isn't going to make it go away. Many people, including me, have the feeling that the standard way of defining properties at the moment leaves something to be desired, for all the same reasons that have led to @-decorators. However, I agree that trying to keep the accessor method names out of the class namespace isn't necessary, and may not even be desirable. The way I'm defining properties in PyGUI at the moment looks like this: class C: foo = overridable_property('foo', "The foo property") def get_foo(self): ... def set_foo(self, x): ... This has the advantage that the accessor methods can be overridden in subclasses with the expected effect. This is particularly important in PyGUI, where I have a generic class definition which establishes the valid properties and their docstrings, and implementation subclasses for different platforms which supply the accessor methods. The only wart is the necessity of mentioning the property name twice, once on the lhs and once as an argument. I haven't thought of a good solution to that, yet. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg.ewing at canterbury.ac.nz +--------------------------------------+ From tdelaney at avaya.com Mon Oct 17 04:39:15 2005 From: tdelaney at avaya.com (Delaney, Timothy (Tim)) Date: Mon, 17 Oct 2005 12:39:15 +1000 Subject: [Python-Dev] Definining properties - a use case for class decorators? Message-ID: <2773CAC687FD5F4689F526998C7E4E5F4DB6E2@au3010avexu1.global.avaya.com> Greg Ewing wrote: > class C: > > foo = overridable_property('foo', "The foo property") > > def get_foo(self): > ... 
> > def set_foo(self, x): > ... > > This has the advantage that the accessor methods can be > overridden in subclasses with the expected effect. This is a point I was going to bring up. > The only wart is the necessity of mentioning the property > name twice, once on the lhs and once as an argument. > I haven't thought of a good solution to that, yet. Have a look at my comment to Steven Bethard's recipe: http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/408713 Tim Delaney From guido at python.org Mon Oct 17 05:06:23 2005 From: guido at python.org (Guido van Rossum) Date: Sun, 16 Oct 2005 20:06:23 -0700 Subject: [Python-Dev] Definining properties - a use case for class decorators? In-Reply-To: <4353016A.1010707@canterbury.ac.nz> References: <43525BFA.9090309@iinet.net.au> <4353016A.1010707@canterbury.ac.nz> Message-ID: [Guido] > > Nick, and everybody else trying to find a "solution" for this > > "problem", please don't. [Greg Ewing] > Denying that there's a problem isn't going to make it > go away. Many people, including me, have the feeling that > the standard way of defining properties at the moment leaves > something to be desired, for all the same reasons that have > led to @-decorators. My challenge to many people, including you, is to make that feeling more concrete. Sometimes when you have such a feeling it just means you haven't drunk the kool-aid yet. :) With decorators there was a concrete issue: the modifier trailed after the function body, in a real sense "hiding" from the reader. I don't see such an issue with properties. Certainly the proposed solutions so far are worse than the problem. > However, I agree that trying to keep the accessor method > names out of the class namespace isn't necessary, and may > not even be desirable. The way I'm defining properties in > PyGUI at the moment looks like this: > > class C: > > foo = overridable_property('foo', "The foo property") > > def get_foo(self): > ... > > def set_foo(self, x): > ... 
> > This has the advantage that the accessor methods can be > overridden in subclasses with the expected effect. This > is particularly important in PyGUI, where I have a generic > class definition which establishes the valid properties > and their docstrings, and implementation subclasses for > different platforms which supply the accessor methods. But since you define the API, are you sure that you need properties at all? Maybe the users would be happy to write widget.get_foo() and widget.set_foo(x) instead of widget.foo or widget.foo = x? > The only wart is the necessity of mentioning the property > name twice, once on the lhs and once as an argument. > I haven't thought of a good solution to that, yet. To which Tim Delaney responded, "have a look at my response here:" http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/408713 I looked at that, and now I believe it's actually *better* to mention the property name twice, at least compared to Tim' s approach. Looking at that version, I think it's obscuring the semantics; it (ab)uses the fact that a function's name is accessible through its __name__ attribute. But (unlike Greg's version) it breaks down when one of the arguments is not a plain function. This makes it brittle in the context of renaming operations, e.g.: getx = lambda self: 42 def sety(self, value): self._y = value setx = sety x = LateBindingProperty(getx, setx) -- --Guido van Rossum (home page: http://www.python.org/~guido/) From nnorwitz at gmail.com Mon Oct 17 05:07:37 2005 From: nnorwitz at gmail.com (Neal Norwitz) Date: Sun, 16 Oct 2005 20:07:37 -0700 Subject: [Python-Dev] Guido v. Python, Round 1 Message-ID: We all know Guido likes Python. But the real question is do pythons like Guido? 
http://python.org/neal/ n From jeremy at alum.mit.edu Mon Oct 17 05:26:31 2005 From: jeremy at alum.mit.edu (Jeremy Hylton) Date: Sun, 16 Oct 2005 23:26:31 -0400 Subject: [Python-Dev] AST branch merge status In-Reply-To: References: Message-ID: Real life interfered with the planned merge tonight. I hope you'll all excuse and wait until tomorrow night. Jeremy On 10/16/05, Jeremy Hylton wrote: > I just merged the head back to the AST branch for what I hope is the > last time. I plan to merge the branch to the head on Sunday evening. > I'd appreciate it if folks could hold off on making changes on the > trunk until that merge happens. > > If this is a non-trivial inconvenience for anyone, go ahead with the > changes but send me mail to make sure that I don't lose the changes > when I do the merge. Regardless, the compiler and Grammar are off > limits. I'll blow away any changes you make there <0.1 wink>. > > Jeremy > > From steven.bethard at gmail.com Mon Oct 17 06:21:31 2005 From: steven.bethard at gmail.com (Steven Bethard) Date: Sun, 16 Oct 2005 22:21:31 -0600 Subject: [Python-Dev] Autoloading? (Making Queue.Queue easier to use) In-Reply-To: <434E47EB.8090909@gmail.com> References: <20051012072528.68vls5q6143k08c0@login.werra.lunarpages.com> <434DD815.8070909@canterbury.ac.nz> <434E47EB.8090909@gmail.com> Message-ID: Nick Coghlan wrote: > Having module attribute access obey the descriptor protocol (__get__, __set__, > __delete__) sounds like a pretty good option to me. > > It would even be pretty backwards compatible, as I'd be hardpressed to think > why anyone would have a descriptor *instance* as a top-level object in a > module (descriptor definition, yes, but not an instance). Aren't all functions descriptors? py> def baz(): ... print "Evaluating baz!" ... return "Module attribute" ... py> baz() Evaluating baz! 
'Module attribute' py> baz.__get__(__import__(__name__), None) > py> baz.__get__(__import__(__name__), None)() Traceback (most recent call last): File "", line 1, in ? TypeError: baz() takes no arguments (1 given) How would your proposal change the invocation of module-level functions? STeVe -- You can wordify anything if you just verb it. --- Bucky Katt, Get Fuzzy From tdelaney at avaya.com Mon Oct 17 06:53:55 2005 From: tdelaney at avaya.com (Delaney, Timothy (Tim)) Date: Mon, 17 Oct 2005 14:53:55 +1000 Subject: [Python-Dev] Definining properties - a use case for classdecorators? Message-ID: <2773CAC687FD5F4689F526998C7E4E5F4DB6E3@au3010avexu1.global.avaya.com> Guido van Rossum wrote: > To which Tim Delaney responded, "have a look at my response here:" > http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/408713 > > I looked at that, and now I believe it's actually *better* to mention > the property name twice, at least compared to Tim' s approach. I never said it was a *good* approach - just *an* approach ;) Tim Delaney From nnorwitz at gmail.com Mon Oct 17 07:21:00 2005 From: nnorwitz at gmail.com (Neal Norwitz) Date: Sun, 16 Oct 2005 22:21:00 -0700 Subject: [Python-Dev] problem with genexp In-Reply-To: References: Message-ID: On 10/10/05, Neal Norwitz wrote: > There's a problem with genexp's that I think really needs to get > fixed. See http://python.org/sf/1167751 the details are below. This > code: > > >>> foo(a = i for i in range(10)) > > I agree with the bug report that the code should either raise a > SyntaxError or do the right thing. The change to Grammar/Grammar below seems to fix the problem and all the tests pass. Can anyone comment on whether this fix is correct/appropriate? Is there a better way to fix the problem? 
-argument: [test '='] test [gen_for] # Really [keyword '='] test +argument: test [gen_for] | test '=' test ['(' gen_for ')'] # Really [keyword '='] test n From martin at v.loewis.de Mon Oct 17 08:36:17 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 17 Oct 2005 08:36:17 +0200 Subject: [Python-Dev] Divorcing str and unicode (no more implicit conversions). In-Reply-To: <8393fff0510161726i6fd5c798u38aa3875c8f6ac4d@mail.gmail.com> References: <8393fff0510022309l5977e899led262cb3c1a7a805@mail.gmail.com> <2mbr27f0th.fsf@starship.python.net> <8393fff0510142350l81ba453md20cc47a445642ce@mail.gmail.com> <8393fff0510161726i6fd5c798u38aa3875c8f6ac4d@mail.gmail.com> Message-ID: <43534661.9050804@v.loewis.de> Martin Blais wrote: >>Yes. setdefaultencoding() is removed from sys by site.py. To get it >>again you must reload sys. > > > Thanks. Actually, I should take the opportunity to advise people that setdefaultencoding doesn't really work. With the default default encoding, strings and Unicode objects hash equal when they are equal. If you change the default encoding, this property goes away (perhaps unless you change it to Latin-1). As a result, dictionaries where you mix string and Unicode keys won't work: you might not find a value for a string key when looking up with a Unicode object, and vice versa. Regards, Martin From murman at gmail.com Mon Oct 17 09:09:12 2005 From: murman at gmail.com (Michael Urman) Date: Mon, 17 Oct 2005 02:09:12 -0500 Subject: [Python-Dev] Definining properties - a use case for class decorators? In-Reply-To: <43525BFA.9090309@iinet.net.au> References: <43525BFA.9090309@iinet.net.au> Message-ID: On 10/16/05, Nick Coghlan wrote: > On and off, I've been looking for an elegant way to handle properties using > decorators. Why use decorators when a metaclass will already do the trick, and save you a line? 
This doesn't necessarily get around Antoine's complaint that it looks like self refers to the wrong type, but I'm not convinced anyone would be confused.

class MetaProperty(type):
    def __new__(cls, name, bases, dct):
        if bases[0] is object:  # allow us to create class Property
            return type.__new__(cls, name, bases, dct)
        return property(dct.get('get'), dct.get('set'),
                        dct.get('delete'), dct.get('__doc__'))

    def __init__(cls, name, bases, dct):
        if bases[0] is object:
            return type.__init__(cls, name, bases, dct)

class Property(object):
    __metaclass__ = MetaProperty

class Test(object):
    class foo(Property):
        """The foo property"""
        def get(self): return self._foo
        def set(self, val): self._foo = val
        def delete(self): del self._foo

test = Test()
test.foo = 'Yay!'
assert test._foo == 'Yay!'

From seojiwon at gmail.com Mon Oct 17 10:59:20 2005 From: seojiwon at gmail.com (Jiwon Seo) Date: Mon, 17 Oct 2005 01:59:20 -0700 Subject: [Python-Dev] problem with genexp In-Reply-To: References: Message-ID: On 10/16/05, Neal Norwitz wrote: > On 10/10/05, Neal Norwitz wrote: > > There's a problem with genexp's that I think really needs to get > > fixed. See http://python.org/sf/1167751 the details are below. This > > code: > > > > >>> foo(a = i for i in range(10)) > > > > I agree with the bug report that the code should either raise a > > SyntaxError or do the right thing. > > The change to Grammar/Grammar below seems to fix the problem and all > the tests pass. Can anyone comment on whether this fix is > correct/appropriate? Is there a better way to fix the problem?
> > -argument: [test '='] test [gen_for] # Really [keyword '='] test > +argument: test [gen_for] | test '=' test ['(' gen_for ')'] # Really > [keyword '='] test The other option would be changes in the Python/compile.c (somewhat) like following diff -r2.352 compile.c 6356c6356,6362 - if (TYPE(n) == argument && NCH(n) == 3) { --- + if (TYPE(n) == argument && NCH(n) == 4) { + PyErr_SetString(PyExc_SyntaxError, + "invalid syntax"); + symtable_error(st, n->n_lineno); + return; + } + else if (TYPE(n) == argument && NCH(n) == 3) { but IMO, changing the Grammar looks more obvious. > > n > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/seojiwon%40gmail.com > > From ncoghlan at gmail.com Mon Oct 17 11:06:47 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 17 Oct 2005 19:06:47 +1000 Subject: [Python-Dev] Definining properties - a use case for class decorators? In-Reply-To: References: <43525BFA.9090309@iinet.net.au> Message-ID: <435369A7.1020601@gmail.com> Michael Urman wrote: > class Test(object): > class foo(Property): > """The foo property""" > def get(self): return self._foo > def set(self, val): self._foo = val > def delete(self): del self._foo > > test = Test() > test.foo = 'Yay!' > assert test._foo == 'Yay!' Thus proving once again, that metaclasses are the one true way to monkey with classes ;) Cheers, Nick. P.S. I think I need an email program that disables the send button after 11 pm. . . 
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.blogspot.com From ncoghlan at gmail.com Mon Oct 17 11:32:33 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 17 Oct 2005 19:32:33 +1000 Subject: [Python-Dev] PEP 343 updated In-Reply-To: References: <43524BB4.7040808@iinet.net.au> Message-ID: <43536FB1.7080505@gmail.com> Guido van Rossum wrote: > On 10/16/05, Nick Coghlan wrote: > I hope you reverted the status to "Proposed"... I hadn't, but I've now fixed that in CVS (both in the PEP and the PEP index), and added some text into the PEP saying why it was reverted to Draft. > On the latter: I think it shouldn't; I don't like this kind of magic. > I'll have to read it before I can comment on the rest. I don't particularly like treating __with__ specially either, but I'm not sure I like the alternative. The alternative is that we'd never be able to safely define a __with__ method directly on generators - the reason is that we would want a "def __with__" where the @context decorator has been forgotten to trigger a TypeError when it is used. If generator-iterators were to provide a context manager to automatically invoke close(), then leaving out "@context" would result in a very obscure bug (as g.close() would be used to finish the context, instead of g.next() or g.throw()). On the other hand, if the context decorator is invoked automatically when a generator function is supplied to populate the __with__ slot, then using a generator to define a __with__ method will "just work", instead of "only works if you also apply the context decorator" Cheers, Nick. 
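The distinction Nick describes can be sketched with the decorator that later shipped as contextlib.contextmanager (an analogy chosen for illustration, not the PEP 343 draft's @context machinery): with the decorator applied, a generator function becomes a working context manager; with it forgotten, the bare generator object is rejected outright rather than being silently driven via close().

```python
import contextlib

@contextlib.contextmanager
def decorated():
    yield "resource"      # body runs up to the yield on entry, resumes on exit

with decorated() as r:
    assert r == "resource"

def undecorated():        # the same generator, decorator forgotten
    yield "resource"

try:
    with undecorated():   # a generator object is not a context manager...
        pass
except (AttributeError, TypeError):
    pass                  # ...so the mistake fails loudly instead of silently
```

The loud failure here is the desirable outcome; the hazard Nick is guarding against is a design where the bare generator would be accepted and quietly closed instead of executed.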
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.blogspot.com From ncoghlan at gmail.com Mon Oct 17 11:35:34 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 17 Oct 2005 19:35:34 +1000 Subject: [Python-Dev] PEP 343 updated In-Reply-To: <000f01c5d265$f232c2f0$6402a8c0@arkdesktop> References: <000f01c5d265$f232c2f0$6402a8c0@arkdesktop> Message-ID: <43537066.8050009@gmail.com> Andrew Koenig wrote: >> PEP 343 has been updated on python.org. > >> Highlights of the changes: > >> - changed the name of the PEP to be simply "The 'with' Statement" > > Do you mean PEP 346, perchance? PEP 343 is something else entirely. No, I mean PEP 343 - it describes Guido's proposal for a "with" statement. The old name made perfect sense if you'd been part of the PEP 340 discussion, but was rather obscure otherwise. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.blogspot.com From ncoghlan at gmail.com Mon Oct 17 11:55:13 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 17 Oct 2005 19:55:13 +1000 Subject: [Python-Dev] Autoloading? (Making Queue.Queue easier to use) In-Reply-To: References: <20051012072528.68vls5q6143k08c0@login.werra.lunarpages.com> <434DD815.8070909@canterbury.ac.nz> <434E47EB.8090909@gmail.com> Message-ID: <43537501.6080501@gmail.com> Steven Bethard wrote: > Nick Coghlan wrote: >> Having module attribute access obey the descriptor protocol (__get__, __set__, >> __delete__) sounds like a pretty good option to me. >> >> It would even be pretty backwards compatible, as I'd be hardpressed to think >> why anyone would have a descriptor *instance* as a top-level object in a >> module (descriptor definition, yes, but not an instance). > > Aren't all functions descriptors? So Josh pointed out. > py> def baz(): > ... print "Evaluating baz!" 
> ... return "Module attribute" > ... > py> baz() > Evaluating baz! > 'Module attribute' > py> baz.__get__(__import__(__name__), None) > > > py> baz.__get__(__import__(__name__), None)() > Traceback (most recent call last): > File "", line 1, in ? > TypeError: baz() takes no arguments (1 given) > > How would your proposal change the invocation of module-level functions? It would, alas, break it. And now that I think about it, functions have to work the way they do, otherwise binding an arbitrary function to a class variable wouldn't work properly. So the class descriptor protocol can't be used as is at the module level, because functions are descriptor instances. Ah well, another idea runs aground on the harsh rocks of reality. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.blogspot.com From steve at holdenweb.com Mon Oct 17 13:55:00 2005 From: steve at holdenweb.com (Steve Holden) Date: Mon, 17 Oct 2005 12:55:00 +0100 Subject: [Python-Dev] Guido v. Python, Round 1 In-Reply-To: References: Message-ID: Neal Norwitz wrote: > We all know Guido likes Python. But the real question is do pythons like Guido? > > http://python.org/neal/ > Neal: Getting a 404 on this one right now. regards Steve -- Steve Holden +44 150 684 7255 +1 800 494 3119 Holden Web LLC www.holdenweb.com PyCon TX 2006 www.python.org/pycon/ From phd at mail2.phd.pp.ru Mon Oct 17 14:14:54 2005 From: phd at mail2.phd.pp.ru (Oleg Broytmann) Date: Mon, 17 Oct 2005 16:14:54 +0400 Subject: [Python-Dev] Guido v. Python, Round 1 In-Reply-To: References: Message-ID: <20051017121454.GA1213@phd.pp.ru> On Mon, Oct 17, 2005 at 12:55:00PM +0100, Steve Holden wrote: > > http://python.org/neal/ > > > Getting a 404 on this one right now. No problems here, very nice fotos! :) Oleg. -- Oleg Broytmann http://phd.pp.ru/ phd at phd.pp.ru Programmers don't die, they just GOSUB without RETURN. 
From jimjjewett at gmail.com Mon Oct 17 17:19:22 2005 From: jimjjewett at gmail.com (Jim Jewett) Date: Mon, 17 Oct 2005 11:19:22 -0400 Subject: [Python-Dev] PEP 3000 and exec Message-ID: Guido van Rossum wrote: > Another idea might be to change the exec() spec so that you are > required to pass in a namespace (and you can't use locals() either!). > Then the whole point becomes moot. I think of exec as having two major uses: (1) A run-time compiler (2) A way to change the local namespace, based on run-time information (such as a config file). By turning exec into a function with its own namespace (and enforcing a readonly locals()), the second use is eliminated. Is this intentional for security/style/efficiency/predictability? If so, could exec/eval at least (1) Be treatable as nested functions, so that they can *read* the current namespace. (2) Grow a return value, so that they can more easily pass information back to at least a (tuple of) known variable name(s). -jJ From guido at python.org Mon Oct 17 17:40:59 2005 From: guido at python.org (Guido van Rossum) Date: Mon, 17 Oct 2005 08:40:59 -0700 Subject: [Python-Dev] Autoloading? (Making Queue.Queue easier to use) In-Reply-To: <43537501.6080501@gmail.com> References: <20051012072528.68vls5q6143k08c0@login.werra.lunarpages.com> <434DD815.8070909@canterbury.ac.nz> <434E47EB.8090909@gmail.com> <43537501.6080501@gmail.com> Message-ID: On 10/17/05, Nick Coghlan wrote: > Ah well, another idea runs aground on the harsh rocks of reality. I should point out that it's intentional that there are very few similarities between modules and classes. Many attempts have been made to unify the two, but these never work right, because the module can't decide whether it behaves like a class or like an instance. Also the direct access to global variables prevents you from putting any kind of code in the get-attribute path.
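The stumbling block in this subthread, that every plain function is already a descriptor instance, can be demonstrated in a few lines (a minimal sketch in Python 3 syntax, not code from the thread):

```python
def f(obj):
    return obj

# Plain functions implement __get__, i.e. they are descriptor instances:
assert hasattr(f, "__get__")

# That is exactly how classes turn functions into bound methods:
class C:
    method = f

c = C()
assert c.method() is c          # attribute access invoked f.__get__(c, C)
assert f.__get__(c, C)() is c   # the same binding, spelled out by hand
```

If module attribute access obeyed the same protocol, looking up a module-level function would route through its __get__ as well, instead of simply returning the function.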
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Mon Oct 17 17:49:44 2005 From: guido at python.org (Guido van Rossum) Date: Mon, 17 Oct 2005 08:49:44 -0700 Subject: [Python-Dev] PEP 3000 and exec In-Reply-To: References: Message-ID: On 10/17/05, Jim Jewett wrote: > Guido van Rossum wrote: > > > Another idea might be to change the exec() spec so that you are > > required to pass in a namespace (and you can't use locals() either!). > > Then the whole point becomes moot. > > I think of exec as having two major uses: > > (1) A run-time compiler > (2) A way to change the local namespace, based on run-time > information (such as a config file). > > By turning exec into a function with its own namespace (and > enforcing a readonly locals()), the second use is eliminated. > > Is this intentional for security/style/efficiency/predictability? Yes, there are lots of problems with (2); both the human reader and the compiler often don't quite know what the intended effect is. > If so, could exec/eval at least > > (1) Be treatable as nested functions, so that they can *read* the > current namespace. There will be a way to get the current namespace (similar to locals() but without its bugs). But it's probably better to create an empty namespace and explicitly copy into it only those things that you are willing to expose to the exec'ed code (or the things it needs). > (2) Grow a return value, so that they can more easily pass > information back to at least a (tuple of) known variable name(s). You can easily build that functionality yourself; after running exec(), you just pick certain things out of the namespace that you expect it to create. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From skip at pobox.com Mon Oct 17 18:06:36 2005 From: skip at pobox.com (skip@pobox.com) Date: Mon, 17 Oct 2005 11:06:36 -0500 Subject: [Python-Dev] Guido v. 
Python, Round 1 In-Reply-To: References: Message-ID: <17235.52236.458715.854015@montanaro.dyndns.org> Neal> We all know Guido likes Python. But the real question is do Neal> pythons like Guido? Neal> http://python.org/neal/ Like Steve (and unlike Oleg), I get 404s for this page. I also tried "www.python.org" and "~neal". Skip From steve at holdenweb.com Mon Oct 17 18:27:52 2005 From: steve at holdenweb.com (Steve Holden) Date: Mon, 17 Oct 2005 17:27:52 +0100 Subject: [Python-Dev] Guido v. Python, Round 1 In-Reply-To: <17235.52236.458715.854015@montanaro.dyndns.org> References: <17235.52236.458715.854015@montanaro.dyndns.org> Message-ID: <4353D108.1060107@holdenweb.com> skip at pobox.com wrote: > Neal> We all know Guido likes Python. But the real question is do > Neal> pythons like Guido? > > Neal> http://python.org/neal/ > > Like Steve (and unlike Oleg), I get 404s for this page. I also tried > "www.python.org" and "~neal". > This appears to be a DNS issue: the stuff is on creosote, 213.84.134.214, not www or dinsdale. regards Steve -- Steve Holden +44 150 684 7255 +1 800 494 3119 Holden Web LLC www.holdenweb.com PyCon TX 2006 www.python.org/pycon/ From tim.peters at gmail.com Mon Oct 17 18:34:03 2005 From: tim.peters at gmail.com (Tim Peters) Date: Mon, 17 Oct 2005 12:34:03 -0400 Subject: [Python-Dev] Guido v. Python, Round 1 In-Reply-To: <17235.52236.458715.854015@montanaro.dyndns.org> References: <17235.52236.458715.854015@montanaro.dyndns.org> Message-ID: <1f7befae0510170934h21d072dbn2037d634af7d17f@mail.gmail.com> [Skip] > Like Steve (and unlike Oleg), I get 404s for this page. I also tried > "www.python.org" and "~neal". The original http://python.org/neal/ worked fine for me, and still does. OTOH, http://www.python.org/neal/ gets a 404, and (the original without the trailing backslash) http://python.org/neal "changes itself" to the 404 on . 
From arigo at tunes.org Mon Oct 17 18:52:09 2005 From: arigo at tunes.org (Armin Rigo) Date: Mon, 17 Oct 2005 18:52:09 +0200 Subject: [Python-Dev] AST branch update In-Reply-To: References: Message-ID: <20051017165209.GA14358@code1.codespeak.net> Hi Jeremy, On Thu, Oct 13, 2005 at 04:52:14PM -0400, Jeremy Hylton wrote: > I don't think the current test suite covers all of the possible syntax > errors that can be raised. I'd like to add a new test suite that > covers all of the remaining cases, perhaps moving some existing tests > into this module as well. You might be interested in PyPy's test suite here. In particular, http://codespeak.net/svn/pypy/dist/pypy/interpreter/test/test_syntax.py contains a list of syntactically valid and invalid corner cases. If you are willing to check out the whole of PyPy (i.e.
http://codespeak.net/svn/pypy/dist) you should also be able to run the whole test suite, or at least the following tests:

python test_all.py pypy/interpreter/test/test_compiler.py
python test_all.py pypy/interpreter/pyparser/

which compare CPython's builtin compiler with our own compilers; as of PyPy revision 18722 these tests pass on all CPython versions (2.3.5, 2.4.2, HEAD). A bientot, Armin. From jimjjewett at gmail.com Mon Oct 17 19:06:20 2005 From: jimjjewett at gmail.com (Jim Jewett) Date: Mon, 17 Oct 2005 13:06:20 -0400 Subject: [Python-Dev] PEP 3000 and exec Message-ID: For communicating with an exec/eval child, once exec cannot run in the current namespace, I asked that it be possible to pass a read-only "current view" and to see a return value. (Guido): >... it's probably better to create an empty namespace and > explicitly copy into it ... > ... just pick certain things out of the namespace [afterwards] Yes and no. If the exec'ed code is well defined (and it needs to be if security is a concern), then that works well. For more exploratory code, it can be hard to know in advance what the code will need, or to agree on the names of return variables. The simplest general API that I can come up with is "You're allowed to see anything I can" (even if it is in a nested scope or base class, and I realize that you *probably* won't need it). "Return value is whatever you explicitly choose to return" (Lisp's "last result" might be even simpler, but would probably lead to confusion in other places.) -jJ From nnorwitz at gmail.com Mon Oct 17 19:11:43 2005 From: nnorwitz at gmail.com (Neal Norwitz) Date: Mon, 17 Oct 2005 10:11:43 -0700 Subject: [Python-Dev] Guido v.
Python, Round 1 In-Reply-To: <1f7befae0510170934h21d072dbn2037d634af7d17f@mail.gmail.com> References: <17235.52236.458715.854015@montanaro.dyndns.org> <1f7befae0510170934h21d072dbn2037d634af7d17f@mail.gmail.com> Message-ID: On 10/17/05, Tim Peters wrote: > > [Skip] > > Like Steve (and unlike Oleg), I get 404s for this page. I also tried > > "www.python.org " and "~neal". > > The original > > http://python.org/neal/ > > worked fine for me, and still does. OTOH, > > http://www.python.org/neal/ > > gets a 404, and (the original without the trailing backslash) > Yup, as most people already pointed out, I only put this up on creosote and should have added that to the URL. I don't have an account on dinsdale and can't copy stuff up there AFAIK. This URL should work for a while longer. http://creosote.python.org/neal/ n -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20051017/fd16bf23/attachment.html From tonynelson at georgeanelson.com Sun Oct 16 18:33:25 2005 From: tonynelson at georgeanelson.com (Tony Nelson) Date: Sun, 16 Oct 2005 12:33:25 -0400 Subject: [Python-Dev] Unicode charmap decoders slow In-Reply-To: <435223B1.2020209@v.loewis.de> References: Message-ID: At 11:56 AM +0200 10/16/05, Martin v. Löwis wrote: >Tony Nelson wrote: >> BTW, Martin, if you care to, would you explain to me how a Trie would be >> used for charmap encoding? I know a couple of approaches, but I don't know >> how to do it fast. (I've never actually had the occasion to use a Trie.) > >I currently envision a three-level trie, with 5, 4, and 7 bits. You take >the Unicode character (only characters below U+FFFF supported), and take >the uppermost 5 bits, as an index into an array. There you find the base >of a second array, to which you add the next 4 bits, which gives you an >index into a third array, where you add the last 7 bits. This gives >you the character, or 0 if it is unmappable.
Umm, 0 (NUL) is a valid output character in most of the 8-bit character sets. It could be handled by having a separate "exceptions" string of the unicode code points that actually map to the exception char. Usually "exceptions" would be a string of length 1. Suggested changes below.

>struct encoding_map{
>    unsigned char level0[32];
>    unsigned char *level1;
>    unsigned char *level2;
     Py_UNICODE *exceptions;
>};
>
>struct encoding_map *table;
>Py_UNICODE character;
>int level1 = table->level0[character>>11];
>if(level1==0xFF)raise unmapped;
>int level2 = table->level1[16*level1 + ((character>>7) & 0xF)];
>if(level2==0xFF)raise unmapped;
>int mapped = table->level2[128*level2 + (character & 0x7F)];

change:

>if(mapped==0)raise unmapped;

to:

if(mapped==0) {
    Py_UNICODE *ep;
    for(ep=table->exceptions; *ep; ep++)
        if(*ep==character) break;
    if(!*ep)raise unmapped;
}

>Over a hashtable, this has the advantage of not having to deal with >collisions. Instead, it guarantees you a lookup in a constant time. OK, I see the benefit. Your code is about the same amount of work as the hash table lookup in instructions, indirections, and branches, normally uses less of the data cache, and has a fixed running time. It may use one more branch, but its branches are easily predicted. Thank you for explaining it. >It is also quite space-efficient: all tables use bytes as indices. >As each level0 deals with 2048 characters, most character maps >will only use one or two level1 blocks, meaning 16 or 32 bytes >for level1. The number of level2 blocks required depends on >the number of 128-character rows which the encoding spans; >for most encodings, three or four such blocks will be sufficient >(with ASCII spanning one such block typically), causing the >entire memory consumption for many encodings to be less than >600 bytes. ... As you are concerned about pathological cases for hashing (that would make the hash chains long), it is worth noting that in such cases this data structure could take 64K bytes.
Of course, no such case occurs in standard encodings, and 64K is affordable as long as it is rare. ____________________________________________________________________ TonyN.:' ' From skip at pobox.com Mon Oct 17 20:14:00 2005 From: skip at pobox.com (skip@pobox.com) Date: Mon, 17 Oct 2005 13:14:00 -0500 Subject: [Python-Dev] Guido v. Python, Round 1 In-Reply-To: References: <17235.52236.458715.854015@montanaro.dyndns.org> <1f7befae0510170934h21d072dbn2037d634af7d17f@mail.gmail.com> Message-ID: <17235.59880.819873.541201@montanaro.dyndns.org> Neal> This URL should work for a while longer. Neal> http://creosote.python.org/neal/ Ah, the vagaries of URL redirection. Thanks... Skip From greg.ewing at canterbury.ac.nz Tue Oct 18 02:42:50 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 18 Oct 2005 13:42:50 +1300 Subject: [Python-Dev] Guido v. Python, Round 1 In-Reply-To: References: Message-ID: <4354450A.2020400@canterbury.ac.nz> Neal Norwitz wrote: > We all know Guido likes Python. But the real question is do pythons like Guido? > > http://python.org/neal/ ??? I get a 404 for this. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg.ewing at canterbury.ac.nz +--------------------------------------+ From greg.ewing at canterbury.ac.nz Tue Oct 18 03:15:45 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 18 Oct 2005 14:15:45 +1300 Subject: [Python-Dev] Definining properties - a use case for class decorators? In-Reply-To: References: <43525BFA.9090309@iinet.net.au> <4353016A.1010707@canterbury.ac.nz> Message-ID: <43544CC1.5050204@canterbury.ac.nz> Guido van Rossum wrote: > With decorators there was a concrete issue: the modifier trailed after > the function body, in a real sense "hiding" from the reader.
A similar thing happens with properties: the property definition (which is the public interface) trails after the accessor methods (which are an implementation detail). > Certainly the proposed solutions so far are worse than > the problem. I agree with that (except for mine, of course :-). I still feel that the ultimate solution lies in some form of syntactic support, although I haven't decided what yet. > But since you define the API, are you sure that you need properties at > all? Maybe the users would be happy to write widget.get_foo() and > widget.set_foo(x) instead of widget.foo or widget.foo = x? I'm one of my main users, and I wouldn't be happy with it. I *have* thought about this quite carefully. An early version of the PyGUI API (predating properties) did things that way, and people complained. After re-doing it with properties, and getting some experience using the result, I'm convinced that properties are the way to go for this particular application. > To which Tim Delaney responded, "have a look at my response here:" > http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/408713 > > I looked at that, and now I believe it's actually *better* to mention > the property name twice, at least compared to Tim's approach. I'm inclined to agree. Passing functions that you're not going to use as functions but just use the name of doesn't seem right. And in my version, it's not *really* redundant, since the name is only used to derive the names of the accessor methods. It doesn't *have* to be the same as the property name, although using anything else could justifiably be regarded as insane... -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc.
| greg.ewing at canterbury.ac.nz +--------------------------------------+ From guido at python.org Tue Oct 18 03:55:48 2005 From: guido at python.org (Guido van Rossum) Date: Mon, 17 Oct 2005 18:55:48 -0700 Subject: [Python-Dev] Definining properties - a use case for class decorators? In-Reply-To: <43544CC1.5050204@canterbury.ac.nz> References: <43525BFA.9090309@iinet.net.au> <4353016A.1010707@canterbury.ac.nz> <43544CC1.5050204@canterbury.ac.nz> Message-ID: [Guido] > > I looked at that, and now I believe it's actually *better* to mention > > the property name twice, at least compared to Tim's approach. [Greg Ewing] > I'm inclined to agree. Passing functions that you're not > going to use as functions but just use the name of doesn't > seem right. > > And in my version, it's not *really* redundant, since the > name is only used to derive the names of the accessor methods. > It doesn't *have* to be the same as the property name, although > using anything else could justifiably be regarded as insane... OK, so how's this for a radical proposal. Let's change the property built-in so that its arguments can be either functions or strings (or None). If they are functions or None, it behaves exactly like it always has. If an argument is a string, it should be a method name, and the method is looked up by that name each time the property is used. Because this is late binding, it can be put before the method definitions, and a subclass can override the methods. Example:

class C:
    foo = property('getFoo', 'setFoo', None, 'the foo property')

    def getFoo(self):
        return self._foo

    def setFoo(self, foo):
        self._foo = foo

What do you think? -- --Guido van Rossum (home page: http://www.python.org/~guido/) From greg.ewing at canterbury.ac.nz Tue Oct 18 04:08:12 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 18 Oct 2005 15:08:12 +1300 Subject: [Python-Dev] Definining properties - a use case for class decorators?
In-Reply-To: References: <43525BFA.9090309@iinet.net.au> <4353016A.1010707@canterbury.ac.nz> <43544CC1.5050204@canterbury.ac.nz> Message-ID: <4354590C.5070708@canterbury.ac.nz> Guido van Rossum wrote: > Let's change the property built-in so that its arguments can be either > functions or strings (or None). > > If an argument is a string, it should be a method name, and the method > is looked up by that name each time the property is used. That sounds reasonable. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg.ewing at canterbury.ac.nz +--------------------------------------+ From barry at python.org Tue Oct 18 04:08:48 2005 From: barry at python.org (Barry Warsaw) Date: Mon, 17 Oct 2005 22:08:48 -0400 Subject: [Python-Dev] Definining properties - a use case for class decorators? In-Reply-To: References: <43525BFA.9090309@iinet.net.au> <4353016A.1010707@canterbury.ac.nz> <43544CC1.5050204@canterbury.ac.nz> Message-ID: <1129601328.9405.13.camel@geddy.wooz.org> On Mon, 2005-10-17 at 21:55, Guido van Rossum wrote: > Let's change the property built-in so that its arguments can be either > functions or strings (or None). If they are functions or None, it > behaves exactly like it always has. > > If an argument is a string, it should be a method name, and the method > is looked up by that name each time the property is used. Because this > is late binding, it can be put before the method definitions, and a > subclass can override the methods. Example: > > class C: > > foo = property('getFoo', 'setFoo', None, 'the foo property') > > def getFoo(self): > return self._foo > > def setFoo(self, foo): > self._foo = foo > > What do you think? Ick, for all the reasons that strings are less appealing than names. IMO, there's not enough advantage in having the property() call before the functions than after. 
-Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/python-dev/attachments/20051017/a14d6183/attachment-0001.pgp From greg.ewing at canterbury.ac.nz Tue Oct 18 04:15:43 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 18 Oct 2005 15:15:43 +1300 Subject: [Python-Dev] Definining properties - a use case for class decorators? In-Reply-To: <1129601328.9405.13.camel@geddy.wooz.org> References: <43525BFA.9090309@iinet.net.au> <4353016A.1010707@canterbury.ac.nz> <43544CC1.5050204@canterbury.ac.nz> <1129601328.9405.13.camel@geddy.wooz.org> Message-ID: <43545ACF.7060004@canterbury.ac.nz> Barry Warsaw wrote: > Ick, for all the reasons that strings are less appealing than names. > IMO, there's not enough advantage in having the property() call before > the functions than after. That's not the only benefit - you also get overridability of the accessor methods. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg.ewing at canterbury.ac.nz +--------------------------------------+ From guido at python.org Tue Oct 18 04:24:36 2005 From: guido at python.org (Guido van Rossum) Date: Mon, 17 Oct 2005 19:24:36 -0700 Subject: [Python-Dev] Definining properties - a use case for class decorators? In-Reply-To: <1129601328.9405.13.camel@geddy.wooz.org> References: <43525BFA.9090309@iinet.net.au> <4353016A.1010707@canterbury.ac.nz> <43544CC1.5050204@canterbury.ac.nz> <1129601328.9405.13.camel@geddy.wooz.org> Message-ID: [Guido] > > Let's change the property built-in so that its arguments can be either > > functions or strings (or None). If they are functions or None, it > > behaves exactly like it always has. 
> > > > If an argument is a string, it should be a method name, and the method > > is looked up by that name each time the property is used. Because this > > is late binding, it can be put before the method definitions, and a > > subclass can override the methods. Example:
> >
> > class C:
> >     foo = property('getFoo', 'setFoo', None, 'the foo property')
> >
> >     def getFoo(self):
> >         return self._foo
> >
> >     def setFoo(self, foo):
> >         self._foo = foo
> >
> > What do you think? [Barry] > Ick, for all the reasons that strings are less appealing than names. I usually wholeheartedly agree with that argument, but here I don't see an alternative. > IMO, there's not enough advantage in having the property() call before > the functions than after. Maybe you didn't see the use case that Greg had in mind? He wants to be able to override the getter and/or setter in a subclass, without changing the docstring or having to repeat the property() call. That requires us to do a late binding lookup based on a string. Tim Delaney had a different solution where you would pass in the functions but all it did was use their __name__ attribute to look up the real function at runtime. The problem with that is that the __name__ attribute may not be what you expect, as it may not correspond to the name of the object passed in. Example:

class C:
    def getx(self): ...something...
    gety = getx
    y = property(gety)

class D(C):
    def gety(self): ...something else...

Here, the intention is clearly to override the way the property's value is computed, but it doesn't work right -- gety.__name__ is 'getx', and D doesn't override getx, so D().y calls C.getx() instead of D.gety(). If you can think of a solution that looks better than mine, you're a genius.
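For what it's worth, the string-argument variant can be prototyped today as a custom descriptor, with getattr() doing the lookup on every access; that late binding is exactly what makes the subclass override work. (A sketch only: `NamedProperty` is a made-up name, and the real property() would presumably grow this behaviour in C.)

```python
class NamedProperty(object):
    """Sketch: like property(), but accessors may be given as method *names*."""

    def __init__(self, fget=None, fset=None, fdel=None, doc=None):
        self.fget, self.fset, self.fdel = fget, fset, fdel
        self.__doc__ = doc

    def _bind(self, obj, accessor):
        if accessor is None:
            raise AttributeError("no accessor defined")
        if isinstance(accessor, str):
            return getattr(obj, accessor)    # late binding: honours overrides
        return lambda *args: accessor(obj, *args)

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return self._bind(obj, self.fget)()

    def __set__(self, obj, value):
        self._bind(obj, self.fset)(value)

    def __delete__(self, obj):
        self._bind(obj, self.fdel)()

class C(object):
    foo = NamedProperty('getFoo', 'setFoo', None, 'the foo property')
    def getFoo(self):
        return self._foo
    def setFoo(self, foo):
        self._foo = foo

class D(C):
    def getFoo(self):            # picked up without re-declaring foo
        return C.getFoo(self) * 2
```

Here D overrides only the getter; the property declaration, docstring and setter are inherited untouched, which is the use case Greg described.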
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From barry at python.org Tue Oct 18 05:10:50 2005 From: barry at python.org (Barry Warsaw) Date: Mon, 17 Oct 2005 23:10:50 -0400 Subject: [Python-Dev] Definining properties - a use case for class decorators? In-Reply-To: References: <43525BFA.9090309@iinet.net.au> <4353016A.1010707@canterbury.ac.nz> <43544CC1.5050204@canterbury.ac.nz> <1129601328.9405.13.camel@geddy.wooz.org> Message-ID: <1129605050.9405.29.camel@geddy.wooz.org> On Mon, 2005-10-17 at 22:24, Guido van Rossum wrote: > > IMO, there's not enough advantage in having the property() call before > > the functions than after. > > Maybe you didn't see the use case that Greg had in mind? He wants to > be able to override the getter and/or setter in a subclass, without > changing the docstring or having to repeat the property() call. That > requires us to do a late binding lookup based on a string. True, I missed that use case. But can't you already support override-ability just by refactoring the getter and setter into separate methods? IOW, the getter and setter aren't overridden, but they call other methods that implement the core functionality and that /are/ overridden. Okay, that means a few extra methods per property, but that still doesn't seem too bad. > If you can think of a solution that looks better than mine, you're a genius. Oh, I know that's not the case, but it's such a tempting challenge, I'll try anyway :).
class A(object):
    def __init__(self):
        self._x = 0
    def set_x(self, x):
        self._set_x(x)
    def _set_x(self, x):
        print 'A._set_x()'
        self._x = x
    def get_x(self):
        return self._get_x()
    def _get_x(self):
        print 'A._get_x()'
        return self._x
    x = property(get_x, set_x)

class B(A):
    def _set_x(self, x):
        print 'B._set_x()'
        super(B, self)._set_x(x)
    def _get_x(self):
        print 'B._get_x()'
        return super(B, self)._get_x()

a = A()
b = B()
a.x = 7
b.x = 9
print a.x
print b.x

Basically A.get_x() and A.set_x() are just wrappers to make the property machinery work the way you want. -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/python-dev/attachments/20051017/da1f0a5b/attachment.pgp From guido at python.org Tue Oct 18 05:46:47 2005 From: guido at python.org (Guido van Rossum) Date: Mon, 17 Oct 2005 20:46:47 -0700 Subject: [Python-Dev] Definining properties - a use case for class decorators? In-Reply-To: <1129605050.9405.29.camel@geddy.wooz.org> References: <43525BFA.9090309@iinet.net.au> <4353016A.1010707@canterbury.ac.nz> <43544CC1.5050204@canterbury.ac.nz> <1129601328.9405.13.camel@geddy.wooz.org> <1129605050.9405.29.camel@geddy.wooz.org> Message-ID: [Barry] > > > IMO, there's not enough advantage in having the property() call before > > > the functions than after. [Guido] > > Maybe you didn't see the use case that Greg had in mind? He wants to > > be able to override the getter and/or setter in a subclass, without > > changing the docstring or having to repeat the property() call. That > > requires us to do a late binding lookup based on a string. [Barry] > True, I missed that use case. But can't you already support > override-ability just by refactoring the getter and setter into separate > methods?
IOW, the getter and setter aren't overridden, but they call > other methods that implement the core functionality and that /are/ > overridden. Okay, that means a few extra methods per property, but that > still doesn't seem too bad. > > > If you can think of a solution that looks better than mine, you're a genius. > > Oh, I know that's not the case, but it's such a tempting challenge, I'll > try anyway :). [...] Nice try. I guess it's similar to this, which is a bit more concise and doesn't require as many underscores:

class B:
    def get_x(self):
        return self._x
    def set_x(self, x):
        self._x = x
    x = property(lambda self: self.get_x(),
                 lambda self, x: self.set_x(x))

But I still like the version with strings better:

x = property('get_x', 'set_x')

This trades two lambdas for two pairs of string quotes; a good deal IMO! Now, if I were to follow Paul Graham's recommendations strictly (http://www.paulgraham.com/diff.html), point 7 says that Python should have a symbol type. I've always maintained that this is unnecessary and that we can just as well use regular strings. This makes it easy to construct names on the fly that you pass to getattr() and setattr() using standard string operations. Suppose the symbol type were written as \foo (meaning a quoted reference to the identifier 'foo'). Then the above could be written like this:

x = property(\get_x, \set_x)

But I'm not sure this buys us anything, so I still believe that using 'set_x' and 'get_x' is just fine here. Greg Ewing, whose taste in language features is hard to beat, seems to agree. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From aahz at pythoncraft.com Tue Oct 18 06:11:12 2005 From: aahz at pythoncraft.com (Aahz) Date: Mon, 17 Oct 2005 21:11:12 -0700 Subject: [Python-Dev] Definining properties - a use case for class decorators?
In-Reply-To: References: <43525BFA.9090309@iinet.net.au> <4353016A.1010707@canterbury.ac.nz> <43544CC1.5050204@canterbury.ac.nz> Message-ID: <20051018041112.GA14975@panix.com> On Mon, Oct 17, 2005, Guido van Rossum wrote: > > If an argument is a string, it should be a method name, and the method > is looked up by that name each time the property is used. Because this > is late binding, it can be put before the method definitions, and a > subclass can override the methods. Example:
>
> class C:
>     foo = property('getFoo', 'setFoo', None, 'the foo property')

+1 The only other alternative is to introduce some kind of block. This is a good solution that is not particularly intrusive; it leaves the door open to a well-designed block structure later on. The one niggle I have is that it's going to be a little unwieldy to explain, but people who create properties really ought to understand Python well enough to deal with it. -- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ "If you think it's expensive to hire a professional to do the job, wait until you hire an amateur." --Red Adair From pje at telecommunity.com Tue Oct 18 06:35:19 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Tue, 18 Oct 2005 00:35:19 -0400 Subject: [Python-Dev] Definining properties - a use case for class decorators? In-Reply-To: References: <1129605050.9405.29.camel@geddy.wooz.org> <43525BFA.9090309@iinet.net.au> <4353016A.1010707@canterbury.ac.nz> <43544CC1.5050204@canterbury.ac.nz> <1129601328.9405.13.camel@geddy.wooz.org> <1129605050.9405.29.camel@geddy.wooz.org> Message-ID: <5.1.1.6.0.20051018001632.01eec1d0@mail.telecommunity.com> At 08:46 PM 10/17/2005 -0700, Guido van Rossum wrote: >Now, if I were to follow Paul Graham's recommendations strictly >(http://www.paulgraham.com/diff.html), point 7 says that Python should >have a symbol type. I've always maintained that this is unnecessary >and that we can just as well use regular strings.
Well, unless you're going to also do #8 ("a notation for code"), I'd agree. :) But then again, Graham also lists #6 ("programs composed of expressions"), and even though I'm often tempted by the desire to write something as a big expression, the truth is that most people's brains (mine included) just don't have enough stack space for it. The people that have that much mental stack space can already write lambda+listcomp atrocities for the rest of us to boggle at. :) Logix (http://livelogix.net/logix/) basically adds everything on Graham's list to Python, and then compiles it to Python bytecode. But the result is something that still doesn't seem very Pythonic to me. Of course, with good restraint, it seems to me that Logix allows some very tasteful language extensions (John Landahl created a nice syntax sugar for generic functions with it), but making full-tilt use of Graham's 9 features seems to result in a very Lisp-like experience, even without the parentheses. At the same time, I would note that Ruby does seem to have an edge on Python in terms of ability to create "little languages" of the sort that Logix also excels at. Compare SCons (Python) with Rakefiles (Ruby), for example, or SQLObject (Python) to Rails' ActiveRecord. In each case, the Python DSL syntax is okay, but Ruby's is better. Even PEP 340 in its heyday wasn't going to improve on it much, because Ruby DSLs benefit mainly from being able to pass the blocks to functions which could then hold on to them for later use. (Also, in an ironic twist, Ruby requires fewer parentheses than Python for such operations, so the invocation looks more like user-defined syntax.) From steven.bethard at gmail.com Tue Oct 18 06:46:12 2005 From: steven.bethard at gmail.com (Steven Bethard) Date: Mon, 17 Oct 2005 22:46:12 -0600 Subject: [Python-Dev] Definining properties - a use case for class decorators?
In-Reply-To: <1129601328.9405.13.camel@geddy.wooz.org> References: <43525BFA.9090309@iinet.net.au> <4353016A.1010707@canterbury.ac.nz> <43544CC1.5050204@canterbury.ac.nz> <1129601328.9405.13.camel@geddy.wooz.org> Message-ID: Barry Warsaw wrote: > On Mon, 2005-10-17 at 21:55, Guido van Rossum wrote: > > > Let's change the property built-in so that its arguments can be either > > functions or strings (or None). If they are functions or None, it > > behaves exactly like it always has. > > > > If an argument is a string, it should be a method name, and the method > > is looked up by that name each time the property is used. Because this > > is late binding, it can be put before the method definitions, and a > > subclass can override the methods. Example: > > > > class C: > > > > foo = property('getFoo', 'setFoo', None, 'the foo property') > > > > def getFoo(self): > > return self._foo > > > > def setFoo(self, foo): > > self._foo = foo > > > > What do you think? > > Ick, for all the reasons that strings are less appealing than names. I'm not sure if you'll like it any better, but I combined Michael Urman's suggestion with my late-binding property recipe to get: http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/442418 It solves the name-repetition problem and the late-binding problem (I believe), at the cost of either adding an extra argument to the functions forming the property or confusing the "self" argument a little. STeVe -- You can wordify anything if you just verb it. --- Bucky Katt, Get Fuzzy From guido at python.org Tue Oct 18 06:59:18 2005 From: guido at python.org (Guido van Rossum) Date: Mon, 17 Oct 2005 21:59:18 -0700 Subject: [Python-Dev] Definining properties - a use case for class decorators? 
In-Reply-To: References: <43525BFA.9090309@iinet.net.au> <4353016A.1010707@canterbury.ac.nz> <43544CC1.5050204@canterbury.ac.nz> <1129601328.9405.13.camel@geddy.wooz.org> Message-ID: On 10/17/05, Steven Bethard wrote: > I'm not sure if you'll like it any better, but I combined Michael > Urman's suggestion with my late-binding property recipe to get: > http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/442418 > It solves the name-repetition problem and the late-binding problem (I > believe), at the cost of either adding an extra argument to the > functions forming the property or confusing the "self" argument a > little. That is probably as good as you can get it *if* you prefer the nested class over a property call with string arguments. Personally, I find the nested class inheriting from Property a lot more "magical" than the call to property() with string arguments. I wonder if at some point in the future Python will have to develop a macro syntax so that you can write

Property foo:
    def get(self):
        return self._foo
    ...etc...

which would somehow translate into code similar to your recipe. But until then, I prefer the simplicity of

foo = property('get_foo', 'set_foo')

-- --Guido van Rossum (home page: http://www.python.org/~guido/) From gjc at inescporto.pt Tue Oct 18 13:01:28 2005 From: gjc at inescporto.pt (Gustavo J. A. M. Carneiro) Date: Tue, 18 Oct 2005 12:01:28 +0100 Subject: [Python-Dev] Coroutines, generators, function calling Message-ID: <1129633289.12510.24.camel@localhost> There's one thing about coroutines using Python generators that is still troubling, and I was wondering if anyone sees any potential solution at the language level...
Suppose you have a complex coroutine (this is just an example, not so complex, but you get the idea, I hope):

def show_message(msg):
    win = create_window(msg)
    # slide down
    for y in xrange(10):
        win.move(0, y*20)
        yield Timeout(0.1)
    # timeout
    yield Timeout(3)
    # slide up
    for y in xrange(10, 0, -1):
        win.move(0, y*20)
        yield Timeout(0.1)
    win.destroy()

Suppose now I want to move the window animation to a function, to factorize the code:

def animate(win, steps):
    for y in steps:
        win.move(0, y*20)
        yield Timeout(0.1)

def show_message(msg):
    win = create_window(msg)
    animate(win, xrange(10)) # slide down
    yield Timeout(3)
    animate(win, xrange(10, 0, -1)) # slide up
    win.destroy()

This obviously doesn't work, because calling animate() produces another generator, instead of calling the function. In a coroutine context, it's as if it produces another coroutine, while all I wanted was to call a function. I don't suppose there could be a way to make the yield inside the subfunction have the same effect as if it was inside the function that called it? Perhaps some special notation, either at function calling or at function definition? -- Gustavo J. A. M. Carneiro The universe is always one step beyond logic. From ncoghlan at gmail.com Tue Oct 18 15:36:21 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 18 Oct 2005 23:36:21 +1000 Subject: [Python-Dev] Coroutines, generators, function calling In-Reply-To: <1129633289.12510.24.camel@localhost> References: <1129633289.12510.24.camel@localhost> Message-ID: <4354FA55.4040909@gmail.com> Gustavo J. A. M. Carneiro wrote: > I don't suppose there could be a way to make the yield inside the > subfunction have the same effect as if it was inside the function that > called it? Perhaps some special notation, either at function calling or > at function definition? You mean like a for loop?
;)

def show_message(msg):
    win = create_window(msg)
    for step in animate(win, xrange(10)): # slide down
        yield step
    yield Timeout(3)
    for step in animate(win, xrange(10, 0, -1)): # slide up
        yield step
    win.destroy()

Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.blogspot.com From jimjjewett at gmail.com Tue Oct 18 15:37:12 2005 From: jimjjewett at gmail.com (Jim Jewett) Date: Tue, 18 Oct 2005 09:37:12 -0400 Subject: [Python-Dev] Defining properties - a use case for class decorators? Message-ID: Greg Ewing wrote: >> ... the standard way of defining properties at the moment >> leaves something to be desired, for all the same reasons >> that have led to @-decorators. Guido wrote: > ... make that feeling more concrete. ... > With decorators there was a concrete issue: the modifier > trailed after the function body, in a real sense "hiding" > from the reader. I don't see such an issue with properties. For me, the property declaration (including the function declarations) is too verbose, and ends up hiding the rest of the class. My ideal syntax would look something like:

# Declare "x" to name a property.  Create a storage slot,
# with the generic get/set/del/doc.  (doc == "property x"?)
property(x)

# Change the setter, possibly in a subclass
property(x) set:
    if x<5:
        __x = x

If I don't want anything special on the get, I shouldn't have to add any "get" boilerplate to my code. An alternative might be

slots=[x, y, z]

to automatically create default properties for x, y, and z, while declaring that instances won't have arbitrary fields. That said, I'm not sure the benefit is enough to justify the extra complications, and your suggestion of allowing strings for method names may be close enough. I agree that the use of strings is awkward, but ... probably no worse than using them with __dict__ today.
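The `slots=[x, y, z]` idea above can be approximated without new syntax; here is a sketch built from a property factory plus a class decorator (`default_property` and `auto_properties` are made-up names, and class decorators are themselves still hypothetical syntax at this point, which is rather the thread's title):

```python
def default_property(name):
    """A property with generic get/set storing in a private attribute."""
    key = '_' + name
    def get(self):
        return getattr(self, key, 0)   # defaulting to 0 is an assumption
    def set(self, value):
        setattr(self, key, value)
    return property(get, set, None, 'property %s' % name)

def auto_properties(*names):
    """Class decorator: install a default property for each name."""
    def decorate(cls):
        for name in names:
            setattr(cls, name, default_property(name))
        return cls
    return decorate

@auto_properties('x', 'y')
class Point(object):
    pass
```

A subclass (or the decorated class itself) can still replace any one of the generated properties with a hand-written one, which covers the "change the setter, possibly in a subclass" case.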
-jJ

From gjc at inescporto.pt Tue Oct 18 15:47:08 2005
From: gjc at inescporto.pt (Gustavo J. A. M. Carneiro)
Date: Tue, 18 Oct 2005 14:47:08 +0100
Subject: [Python-Dev] Coroutines, generators, function calling
In-Reply-To:
References:
Message-ID: <1129643229.12510.37.camel@localhost>

On Tue, 2005-10-18 at 09:07 -0400, Jim Jewett wrote:
> Suppose now I want to move the window animation to a function, to
> factorize the code:
>
>     def animate(win, steps):
>         for y in steps:
>             win.move(0, y*20)
>             yield Timeout(0.1)
>
>     def show_message(msg):
>         win = create_window(msg)
>         animate(win, xrange(10))        # slide down
>         yield Timeout(3)
>         animate(win, xrange(10, 0, -1)) # slide up
>         win.destroy()
>
> This obviously doesn't work, because calling animate() produces
> another generator, instead of calling the function. In coroutines
> context, it's like it produces another coroutine, while all I wanted was
> to call a function.
>
> I don't suppose there could be a way to make the yield inside the
> subfunction have the same effect as if it was inside the function that
> called it? Perhaps some special notation, either at function calling or
> at function definition?
>
> ---------------------------------
>
> I may be missing something, but to me the answer looks like:
>
>     def show_message(msg):
>         win = create_window(msg)
>         for v in animate(win, xrange(10)):        # slide down
>             yield v
>         yield Timeout(3)
>         for v in animate(win, xrange(10, 0, -1)): # slide up
>             yield v
>         win.destroy()

Sure, that would work. Or even this, if the scheduler would automatically recognize generator objects being yielded and so would run the nested coroutine until it finishes:

    def show_message(msg):
        win = create_window(msg)
        yield animate(win, xrange(10))        # slide down
        yield Timeout(3)
        yield animate(win, xrange(10, 0, -1)) # slide up
        win.destroy()

Sure, it could work, but still... I wish for something that would avoid creating a nested coroutine. Maybe I'm asking for too much, I don't know.
Just trying to get some feedback...

Regards.

--
Gustavo J. A. M. Carneiro
The universe is always one step beyond logic.

From ncoghlan at gmail.com Tue Oct 18 15:59:03 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 18 Oct 2005 23:59:03 +1000
Subject: [Python-Dev] Defining properties - a use case for class decorators?
In-Reply-To:
References:
Message-ID: <4354FFA7.6020204@gmail.com>

Jim Jewett wrote:
> That said, I'm not sure the benefit is enough to justify the
> extra complications, and your suggestion of allowing strings
> for method names may be close enough. I agree that the
> use of strings is awkward, but ... probably no worse than
> using them with __dict__ today.

An idea that was kicked around on c.l.p a long while back was "statement local variables", where you could define some extra names just for a single simple statement:

    x = property(get, set, delete, doc) given:
        doc = "Property x (must be less than 5)"
        def get(self):
            try:
                return self._x
            except AttributeError:
                self._x = 0
                return 0
        def set(self, value):
            if value >= 5:
                raise ValueError("value too big")
            self._x = value
        def delete(self):
            del self._x

As I recall, the idea died due to problems with figuring out how to allow the simple statement to both see the names from the nested block and modify the surrounding namespace, but prevent the names from the nested block from affecting the surrounding namespace after the statement was completed.

Another option would be to allow attribute reference targets when binding function names:

    x = property("Property x (must be less than 5)")
    def x.get(instance):
        try:
            return instance._x
        except AttributeError:
            instance._x = 0
            return 0
    def x.set(instance, value):
        if value >= 5:
            raise ValueError("value too big")
        instance._x = value
    def x.delete(instance):
        del instance._x

Cheers,
Nick.
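[The strictly-local flavour of a "given:" block can already be emulated with a throwaway factory function whose locals feed property(). A runnable sketch of the same x property, with the bound check spelled out; the `make_x_property` name is invented:]

```python
def make_x_property():
    # Stand-in for the proposed "given:" block: names defined here stay
    # local, and only the property object escapes into the class body.
    doc = "Property x (must be less than 5)"
    def get(self):
        try:
            return self._x
        except AttributeError:
            self._x = 0
            return 0
    def set(self, value):
        if value >= 5:
            raise ValueError("value too big")
        self._x = value
    def delete(self):
        del self._x
    return property(get, set, delete, doc)


class C(object):
    x = make_x_property()

c = C()
assert c.x == 0       # default materializes on first access
c.x = 4
assert c.x == 4
try:
    c.x = 5           # bound check fires
except ValueError:
    pass
else:
    raise AssertionError("expected ValueError")
```

[What this cannot do, of course, is assign back into the surrounding namespace from inside the block, which is exactly the sticking point described above.]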
--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia
---------------------------------------------------------------
http://boredomandlaziness.blogspot.com

From ark at acm.org Tue Oct 18 16:04:36 2005
From: ark at acm.org (Andrew Koenig)
Date: Tue, 18 Oct 2005 10:04:36 -0400
Subject: [Python-Dev] Coroutines, generators, function calling
In-Reply-To: <1129643229.12510.37.camel@localhost>
Message-ID: <004801c5d3ec$e29b5360$6402a8c0@arkdesktop>

> Sure, that would work. Or even this, if the scheduler would
> automatically recognize generator objects being yielded and so would run
> the nested coroutine until it finishes:

This idea has been discussed before. I think the problem with recognizing generators as the subject of "yield" statements is that then you can't yield a generator even if you want to.

The best syntax I can think of without adding a new keyword looks like this:

    yield from x

which would be equivalent to

    for i in x:
        yield i

Note that this equivalence would imply that x can be any iterable, not just a generator. For example:

    yield from ['Hello', 'world']

would be equivalent to

    yield 'Hello'
    yield 'world'

From ncoghlan at gmail.com Tue Oct 18 16:17:22 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 19 Oct 2005 00:17:22 +1000
Subject: [Python-Dev] Coroutines, generators, function calling
In-Reply-To: <004801c5d3ec$e29b5360$6402a8c0@arkdesktop>
References: <004801c5d3ec$e29b5360$6402a8c0@arkdesktop>
Message-ID: <435503F2.2090103@gmail.com>

Andrew Koenig wrote:
>> Sure, that would work. Or even this, if the scheduler would
>> automatically recognize generator objects being yielded and so would run
>> the nested coroutine until it finishes:
>
> This idea has been discussed before. I think the problem with recognizing
> generators as the subject of "yield" statements is that then you can't yield
> a generator even if you want to.
>
> The best syntax I can think of without adding a new keyword looks like this:
>
>     yield from x
>
> which would be equivalent to
>
>     for i in x:
>         yield i
>
> Note that this equivalence would imply that x can be any iterable, not just
> a generator. For example:
>
>     yield from ['Hello', 'world']
>
> would be equivalent to
>
>     yield 'Hello'
>     yield 'world'

Hmm, I actually quite like that. The best I came up with was "yield for", and that just didn't read correctly. Whereas "yield from seq" says exactly what it is doing.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia
---------------------------------------------------------------
http://boredomandlaziness.blogspot.com

From barry at python.org Tue Oct 18 16:57:01 2005
From: barry at python.org (Barry Warsaw)
Date: Tue, 18 Oct 2005 10:57:01 -0400
Subject: [Python-Dev] Definining properties - a use case for class decorators?
In-Reply-To:
References: <43525BFA.9090309@iinet.net.au> <4353016A.1010707@canterbury.ac.nz> <43544CC1.5050204@canterbury.ac.nz> <1129601328.9405.13.camel@geddy.wooz.org> <1129605050.9405.29.camel@geddy.wooz.org>
Message-ID: <1129647421.24013.7.camel@geddy.wooz.org>

On Mon, 2005-10-17 at 23:46, Guido van Rossum wrote:
> But I still like the version with strings better:
>
>     x = property('get_x', 'set_x')
>
> This trades two lambdas for two pairs of string quotes; a good deal IMO!

You could of course "just" do the wrapping in property(). I put that in quotes because you'd have the problem of knowing when to wrap and when not to, but there would be ways to solve that. But I won't belabor the point any longer, except to ask what happens when you typo one of those strings? What kind of exception do you get and when do you get it?

-Barry

-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 307 bytes
Desc: This is a digitally signed message part
Url : http://mail.python.org/pipermail/python-dev/attachments/20051018/21468041/attachment.pgp

From solipsis at pitrou.net Tue Oct 18 17:05:34 2005
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Tue, 18 Oct 2005 17:05:34 +0200
Subject: [Python-Dev] Definining properties - a use case for class decorators?
In-Reply-To: <1129647421.24013.7.camel@geddy.wooz.org>
References: <43525BFA.9090309@iinet.net.au> <4353016A.1010707@canterbury.ac.nz> <43544CC1.5050204@canterbury.ac.nz> <1129601328.9405.13.camel@geddy.wooz.org> <1129605050.9405.29.camel@geddy.wooz.org> <1129647421.24013.7.camel@geddy.wooz.org>
Message-ID: <1129647934.6135.23.camel@fsol>

Le mardi 18 octobre 2005 ? 10:57 -0400, Barry Warsaw a ?crit :
> On Mon, 2005-10-17 at 23:46, Guido van Rossum wrote:
>
> > But I still like the version with strings better:
> >
> >     x = property('get_x', 'set_x')
> >
> > This trades two lambdas for two pairs of string quotes; a good deal IMO!
>
> You could of course "just" do the wrapping in property(). I put that in
> quotes because you'd have the problem of knowing when to wrap and when
> not to, but there would be ways to solve that. But I won't belabor the
> point any longer, except to ask what happens when you typo one of those
> strings? What kind of exception do you get and when do you get it?

AttributeError when actually accessing the property, no?

Guido's proposal seems quite nice to me. It helps group all property declarations at the beginning, and having accessor methods at the end with other non-public methods.

Currently I never use properties, because it makes classes much less readable for the same kind of reasons as what Jim wrote:

Le mardi 18 octobre 2005 ? 09:37 -0400, Jim Jewett a ?crit :
> For me, the property declaration (including the function
> declarations) is too verbose, and ends up hiding the rest
> of the class.
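[A rough prototype of the string-based late binding under discussion. `LateProperty` is a made-up name and the sketch ignores fdel and some edge cases, but it shows both the subclass override working (the point of late binding) and the AttributeError a typo produces at access time:]

```python
class LateProperty(object):
    """Sketch of the string-based proposal: accessor *names* are stored,
    and looked up on the instance only when the property is used."""
    def __init__(self, fget=None, fset=None, doc=None):
        self.fget = fget
        self.fset = fset
        self.__doc__ = doc

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return getattr(obj, self.fget)()   # a typo -> AttributeError here

    def __set__(self, obj, value):
        getattr(obj, self.fset)(value)


class C(object):
    foo = LateProperty('getFoo', 'setFoo', 'the foo property')
    def getFoo(self):
        return self._foo
    def setFoo(self, value):
        self._foo = value

class D(C):
    def getFoo(self):                      # late binding: override is seen
        return C.getFoo(self) * 2

c = C(); c.foo = 21
d = D(); d.foo = 21
assert (c.foo, d.foo) == (21, 42)

class Typo(object):
    bar = LateProperty('get_barr')         # misspelled on purpose
try:
    Typo().bar
except AttributeError:
    pass
else:
    raise AssertionError("expected AttributeError")
```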
From pje at telecommunity.com Tue Oct 18 17:26:04 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 18 Oct 2005 11:26:04 -0400
Subject: [Python-Dev] Coroutines, generators, function calling
In-Reply-To: <1129633289.12510.24.camel@localhost>
Message-ID: <5.1.1.6.0.20051018112412.01f1f9e8@mail.telecommunity.com>

At 12:01 PM 10/18/2005 +0100, Gustavo J. A. M. Carneiro wrote:
> def show_message(msg):
>     win = create_window(msg)
>     animate(win, xrange(10))        # slide down
>     yield Timeout(3)
>     animate(win, xrange(10, 0, -1)) # slide up
>     win.destroy()
>
> This obviously doesn't work, because calling animate() produces
> another generator, instead of calling the function. In coroutines
> context, it's like it produces another coroutine, while all I wanted was
> to call a function.

Just 'yield animate(win, xrange(10))' and have the trampoline recognize generators. See the PEP 342 trampoline example, which does this. When the animate() is exhausted, it'll resume the "calling" function.

> I don't suppose there could be a way to make the yield inside the
> subfunction have the same effect as if it was inside the function that
> called it? Perhaps some special notation, either at function calling or
> at function definition?

Yes, it's 'yield' at the function calling. :)

From pje at telecommunity.com Tue Oct 18 17:31:32 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 18 Oct 2005 11:31:32 -0400
Subject: [Python-Dev] Defining properties - a use case for class decorators?
In-Reply-To: <4354FFA7.6020204@gmail.com>
References:
Message-ID: <5.1.1.6.0.20051018112713.01f10250@mail.telecommunity.com>

At 11:59 PM 10/18/2005 +1000, Nick Coghlan wrote:
> An idea that was kicked around on c.l.p a long while back was "statement
> local variables", where you could define some extra names just for a single
> simple statement:
>
>     x = property(get, set, delete, doc) given:
>         doc = "Property x (must be less than 5)"
>         def get(self):
>             try:
>                 return self._x
>             except AttributeError:
>                 self._x = 0
>                 return 0
> ...
>
> As I recall, the idea died due to problems with figuring out how to allow the
> simple statement to both see the names from the nested block and modify the
> surrounding namespace, but prevent the names from the nested block from
> affecting the surrounding namespace after the statement was completed.

Haskell's "where" statement does this, but the block *doesn't* modify the surrounding namespace; it's strictly local. With those semantics, the Python translation of the above could just be something like:

    def _tmp():
        doc = "blah"
        def get(self):
            # etc.
        # ...
        return property(get, set, delete, doc)
    x = _tmp()

Which works great except for the part that co_lnotab won't let you identify that "return" line as being the original expression line, due to the monotonically-increasing bit. ;)

Note that a "where" or "given" statement like this could make it a little easier to drop lambda.

From michele.simionato at gmail.com Tue Oct 18 17:38:40 2005
From: michele.simionato at gmail.com (Michele Simionato)
Date: Tue, 18 Oct 2005 15:38:40 +0000
Subject: [Python-Dev] Definining properties - a use case for class decorators?
In-Reply-To: <1129647934.6135.23.camel@fsol>
References: <43525BFA.9090309@iinet.net.au> <43544CC1.5050204@canterbury.ac.nz> <1129601328.9405.13.camel@geddy.wooz.org> <1129605050.9405.29.camel@geddy.wooz.org> <1129647421.24013.7.camel@geddy.wooz.org> <1129647934.6135.23.camel@fsol>
Message-ID: <4edc17eb0510180838u2fab1cebv6e8975525ece9944@mail.gmail.com>

On 10/18/05, Antoine Pitrou wrote:
> Le mardi 18 octobre 2005 ? 10:57 -0400, Barry Warsaw a ?crit :
> Currently I never use properties, because it makes classes much less
> readable for the same kind of reasons as what Jim wrote.

Me too, I never use properties directly. However I have experimented with using helper functions to generate the properties:

    _dic = {}

    def makeproperty(x):
        def getx(self):
            return _dic[self, x]
        def setx(self, value):
            _dic[self, x] = value
        return property(getx, setx)

    class C(object):
        x = makeproperty('x')

    c = C()
    c.x = 1
    print c.x

But in general I prefer to write a custom descriptor class, since it gives me much more control.

   Michele Simionato

From aahz at pythoncraft.com Tue Oct 18 17:49:00 2005
From: aahz at pythoncraft.com (Aahz)
Date: Tue, 18 Oct 2005 08:49:00 -0700
Subject: [Python-Dev] Definining properties - a use case for class decorators?
In-Reply-To: <1129647421.24013.7.camel@geddy.wooz.org>
References: <4353016A.1010707@canterbury.ac.nz> <43544CC1.5050204@canterbury.ac.nz> <1129601328.9405.13.camel@geddy.wooz.org> <1129605050.9405.29.camel@geddy.wooz.org> <1129647421.24013.7.camel@geddy.wooz.org>
Message-ID: <20051018154900.GB22469@panix.com>

On Tue, Oct 18, 2005, Barry Warsaw wrote:
>
> You could of course "just" do the wrapping in property(). I put that in
> quotes because you'd have the problem of knowing when to wrap and when
> not to, but there would be ways to solve that. But I won't belabor the
> point any longer, except to ask what happens when you typo one of those
> strings? What kind of exception do you get and when do you get it?
AttributeError, just like this:

    class C:
        pass

    C().foo()

Last night I was thinking that maybe TypeError would be better, but AttributeError is going to be what the internal machinery raises, and I decided there was no point trying to translate it.

--
Aahz (aahz at pythoncraft.com)           <*>         http://www.pythoncraft.com/

"If you think it's expensive to hire a professional to do the job, wait
until you hire an amateur."  --Red Adair

From jcarlson at uci.edu Tue Oct 18 18:55:56 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Tue, 18 Oct 2005 09:55:56 -0700
Subject: [Python-Dev] Definining properties - a use case for class decorators?
In-Reply-To: <20051018041112.GA14975@panix.com>
References: <20051018041112.GA14975@panix.com>
Message-ID: <20051018094159.37EE.JCARLSON@uci.edu>

Aahz wrote:
>
> On Mon, Oct 17, 2005, Guido van Rossum wrote:
> >
> > If an argument is a string, it should be a method name, and the method
> > is looked up by that name each time the property is used. Because this
> > is late binding, it can be put before the method definitions, and a
> > subclass can override the methods. Example:
> >
> >     class C:
> >         foo = property('getFoo', 'setFoo', None, 'the foo property')
>
> +1
>
> The only other alternative is to introduce some kind of block. This is
> a good solution that is not particularly intrusive; it leaves the door
> open to a well-designed block structure later on. The one niggle I have
> is that it's going to be a little unwieldy to explain, but people who
> create properties really ought to understand Python well enough to deal
> with it.

I remember posing an unanswered question back when blocks were being offered, and since you have brought up blocks again, I'll ask a more specific variant of my original question:

What would this mythical block statement look like that would make properties easier to write than the above late-binding or the subclass Property recipe?
- Josiah

From solipsis at pitrou.net Tue Oct 18 19:17:14 2005
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Tue, 18 Oct 2005 19:17:14 +0200
Subject: [Python-Dev] properties and block statement
In-Reply-To: <20051018094159.37EE.JCARLSON@uci.edu>
References: <20051018041112.GA14975@panix.com> <20051018094159.37EE.JCARLSON@uci.edu>
Message-ID: <1129655834.8244.9.camel@fsol>

> What would this mythical block statement look like that would make
> properties easier to write than the above late-binding or the subclass
> Property recipe?

I suppose something like:

    class C(object):
        x = prop:
            """ Yay for property x! """
            def __get__(self):
                return self._x
            def __set__(self, value):
                self._x = value

and then:

    def prop(@block):
        return property(
            fget=block.get("__get__"),
            fset=block.get("__set__"),
            fdel=block.get("__delete__"),
            doc=block.get("__doc__", ""),
        )

(where "@bargs" would be the syntax to refer to block args as a dict, the same way "**kargs" already exists)

From jcarlson at uci.edu Tue Oct 18 19:23:59 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Tue, 18 Oct 2005 10:23:59 -0700
Subject: [Python-Dev] Defining properties - a use case for class decorators?
In-Reply-To: <4354FFA7.6020204@gmail.com>
References: <4354FFA7.6020204@gmail.com>
Message-ID: <20051018095537.37F1.JCARLSON@uci.edu>

Nick Coghlan wrote:
>
> Jim Jewett wrote:
> > That said, I'm not sure the benefit is enough to justify the
> > extra complications, and your suggestion of allowing strings
> > for method names may be close enough. I agree that the
> > use of strings is awkward, but ... probably no worse than
> > using them with __dict__ today.
>
> An idea that was kicked around on c.l.p a long while back was "statement
> local variables", where you could define some extra names just for a single
> simple statement:
>
>     x = property(get, set, delete, doc) given:
>         doc = "Property x (must be less than 5)"
>         def get(self):
>             try:
>                 return self._x
>             except AttributeError:
>                 self._x = 0
>                 return 0
>         def set(self, value):
>             if value >= 5:
>                 raise ValueError("value too big")
>             self._x = value
>         def delete(self):
>             del self._x
>
> As I recall, the idea died due to problems with figuring out how to allow the
> simple statement to both see the names from the nested block and modify the
> surrounding namespace, but prevent the names from the nested block from
> affecting the surrounding namespace after the statement was completed.
An unfortunate side-effect of with statement early-binding of 'as VAR' is that unless one works quite hard at mucking about with frames, the following has a wholly ugly implementation (whether or not one cares about the persistance of the variables defined within the block, you still need to modify x when you are done, which may as well cause a cleanup of the objects defined within the block...if such things are possible)... with Property as x: ... > Another option would be to allow attribute reference targets when binding > function names: *shivers at the proposal* That's scary. It took me a few minutes just to figure out what the heck that was supposed to do. - Josiah From solipsis at pitrou.net Tue Oct 18 19:32:46 2005 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 18 Oct 2005 19:32:46 +0200 Subject: [Python-Dev] properties and block statement In-Reply-To: <1129655834.8244.9.camel@fsol> References: <20051018041112.GA14975@panix.com> <20051018094159.37EE.JCARLSON@uci.edu> <1129655834.8244.9.camel@fsol> Message-ID: <1129656766.8244.20.camel@fsol> Le mardi 18 octobre 2005 ? 19:17 +0200, Antoine Pitrou a ?crit : > > What would this mythical block statement look like that would make > > properties easier to write than the above late-binding or the subclass > > Property recipe? > > I suppose something like: > > class C(object): > x = prop: > """ Yay for property x! """ > def __get__(self): > return self._x > def __set__(self, value): > self._x = x An example of applicating this scheme to Twisted: domain_name = "www.google.com" reactor.resolve(domain_name): def callback(value): print "%s resolves to %s" % (domain_name, value) def errback(error): print "failed to resolve %s!" And then in the Reactor class: def resolve(self, name, @block): ... 
        d = defer.Deferred()
        cb = block.pop('callback')
        if cb is not None:
            d.addCallback(cb)
        eb = block.pop('errback')
        if eb is not None:
            d.addErrback(eb)
        if block:
            raise ValueError("spurious blockargs: %s" % str(block))
        return d

From jcarlson at uci.edu Tue Oct 18 21:56:32 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Tue, 18 Oct 2005 12:56:32 -0700
Subject: [Python-Dev] properties and block statement
In-Reply-To: <1129655834.8244.9.camel@fsol>
References: <20051018094159.37EE.JCARLSON@uci.edu> <1129655834.8244.9.camel@fsol>
Message-ID: <20051018124010.37F4.JCARLSON@uci.edu>

Antoine Pitrou wrote:
>
> > What would this mythical block statement look like that would make
> > properties easier to write than the above late-binding or the subclass
> > Property recipe?
>
> I suppose something like:
>
>     class C(object):
>         x = prop:
>             """ Yay for property x! """
>             def __get__(self):
>                 return self._x
>             def __set__(self, value):
>                 self._x = value
>
> and then:
>
>     def prop(@block):
>         return property(
>             fget=block.get("__get__"),
>             fset=block.get("__set__"),
>             fdel=block.get("__delete__"),
>             doc=block.get("__doc__", ""),
>         )
>
> (where "@bargs" would be the syntax to refer to block args as a dict,
> the same way "**kargs" already exist)

You are saving 3 lines over the decorator/function approach at the cost of possible confusion over blocks and an easily forgotten/not read @ just after an open paren. Thanks, but I'll stick to the Property decorator, Property subclass, property late bindings, or even a Property metaclass*, and not need to modify Python syntax.
- Josiah

* Property metaclass in an embedded class definition:

    class Property(type):
        def __init__(*args):
            pass
        def __new__(cls, name, bases, dct):
            return property(dct.get('get'), dct.get('set'),
                            dct.get('delete'), dct.get('__doc__', ''))

    class foo(object):
        class x(object):
            __metaclass__ = Property
            'hello'
            def get(self):
                try:
                    return self._x
                except AttributeError:
                    self._x = 0
                    return 0
            def set(self, value):
                if value >= 5:
                    raise ValueError("value too big")
                self._x = value
            def delete(self):
                del self._x

From solipsis at pitrou.net Tue Oct 18 22:48:10 2005
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Tue, 18 Oct 2005 22:48:10 +0200
Subject: [Python-Dev] properties and block statement
In-Reply-To: <20051018124010.37F4.JCARLSON@uci.edu>
References: <20051018094159.37EE.JCARLSON@uci.edu> <1129655834.8244.9.camel@fsol> <20051018124010.37F4.JCARLSON@uci.edu>
Message-ID: <1129668490.8464.16.camel@fsol>

Le mardi 18 octobre 2005 ? 12:56 -0700, Josiah Carlson a ?crit :
> You are saving 3 lines over the decorator/function approach [...]

Well, obviously, the point of a block statement or construct is that it can be applied to many other things than properties. Otherwise it is overkill, as you imply.

(I'm not actively advocating this by the way; I was just answering a request for an example.)

From martin at v.loewis.de Wed Oct 19 00:19:01 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 19 Oct 2005 00:19:01 +0200
Subject: [Python-Dev] Migrating to subversion
Message-ID: <435574D5.2040604@v.loewis.de>

AFAICT, everything is now set up to actually switch to subversion. The infrastructure is complete, the conversion procedure is complete, and Guido pronounced that the migration could happen.

One open issue is where to do the hosting: whether to pay a commercial hosting company (i.e. wush.net), or whether to let it be volunteer-hosted on svn.python.org.
Guido was undecided, like several other developers; the majority of the rest apparently was in favour of trying it on svn.python.org. Anthony Baxter specifically told me that he would now be fine with hosting it on svn.python.org as it gives us more control. If it doesn't work out, we can still go to a commercial hoster.

So I would like to start a conversion in the near future. One complication apparently is that SF doesn't manage to create nightly CVS tarballs anymore; the one I just got is 5 days old. So we would submit a support request that they manually trigger tarball generation to shorten the freeze period.

If people want to test the installation before the switch happens, this would be the time to do it.

Regards,
Martin

From pje at telecommunity.com Wed Oct 19 00:43:53 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 18 Oct 2005 18:43:53 -0400
Subject: [Python-Dev] Migrating to subversion
In-Reply-To: <435574D5.2040604@v.loewis.de>
Message-ID: <5.1.1.6.0.20051018184316.01fa0d90@mail.telecommunity.com>

At 12:19 AM 10/19/2005 +0200, Martin v. L?wis wrote:
> If people want to test the installation before the switch happens,
> this would be the time to do it.

What will the procedure be for getting a login? I assume our SF logins won't simply be transferred/transferrable.

From skip at pobox.com Wed Oct 19 01:16:36 2005
From: skip at pobox.com (skip@pobox.com)
Date: Tue, 18 Oct 2005 18:16:36 -0500
Subject: [Python-Dev] Migrating to subversion
In-Reply-To: <435574D5.2040604@v.loewis.de>
References: <435574D5.2040604@v.loewis.de>
Message-ID: <17237.33364.928502.548360@montanaro.dyndns.org>

    Martin> If people want to test the installation before the switch
    Martin> happens, this would be the time to do it.

Martin,

Can you let us know again the magic incantation to check out the source from the repository?
Thx,

Skip

From greg.ewing at canterbury.ac.nz Wed Oct 19 01:20:32 2005
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Wed, 19 Oct 2005 12:20:32 +1300
Subject: [Python-Dev] Property syntax for Py3k (properties and block statement)
In-Reply-To: <1129655834.8244.9.camel@fsol>
References: <20051018041112.GA14975@panix.com> <20051018094159.37EE.JCARLSON@uci.edu> <1129655834.8244.9.camel@fsol>
Message-ID: <43558340.1010008@canterbury.ac.nz>

Antoine Pitrou wrote:
> I suppose something like:
>
>     class C(object):
>         x = prop:
>             """ Yay for property x! """
>             def __get__(self):
>                 return self._x
>             def __set__(self, value):
>                 self._x = value

I've just looked at Steven Bethard's recipe, and it seems to give you something very like this. Two problems with it are the abuse of 'class' to define something that's not really used as a class, and the need to explicitly inherit from the base class's property descriptor.

In Py3k, I'd like to see 'property' renamed to 'Property', and 'property' become a keyword used something like

    class C:
        property x:
            """This is the x property."""
            def __get__(self):
                ...
            def __set__(self, value):
                ...
            def __del__(self):
                ...

The accessors should be overridable in subclasses, so you can do

    class D(C):
        property x:
            def __get__(self):
                """This overrides the __get__ property for x in C"""
                ...

Greg

From michel at cignex.com Wed Oct 19 02:01:47 2005
From: michel at cignex.com (Michel Pelletier)
Date: Tue, 18 Oct 2005 17:01:47 -0700
Subject: [Python-Dev] Guido v. Python, Round 1
In-Reply-To: <17235.59880.819873.541201@montanaro.dyndns.org>
References: <17235.52236.458715.854015@montanaro.dyndns.org> <1f7befae0510170934h21d072dbn2037d634af7d17f@mail.gmail.com> <17235.59880.819873.541201@montanaro.dyndns.org>
Message-ID: <43558CEB.2020304@cignex.com>

skip at pobox.com wrote:
> Neal> This URL should work for a while longer.
>
> Neal> http://creosote.python.org/neal/
>
> Ah, the vagaries of URL redirection. Thanks...
The front of his shirt says ++ungood; is that the whole joke, or is the punchline on the back?

-Michel

From bcannon at gmail.com Wed Oct 19 02:39:49 2005
From: bcannon at gmail.com (Brett Cannon)
Date: Tue, 18 Oct 2005 17:39:49 -0700
Subject: [Python-Dev] Migrating to subversion
In-Reply-To: <17237.33364.928502.548360@montanaro.dyndns.org>
References: <435574D5.2040604@v.loewis.de> <17237.33364.928502.548360@montanaro.dyndns.org>
Message-ID:

On 10/18/05, skip at pobox.com wrote:
>
>     Martin> If people want to test the installation before the switch
>     Martin> happens, this would be the time to do it.
>
> Martin,
>
> Can you let us know again the magic incantation to check out the source from
> the repository?

And any other problems people come across or questions they have about Subversion and its use, please do ask. I will try to start a new section in the dev FAQ for svn-specific issues.

-Brett

From dave at pythonapocrypha.com Wed Oct 19 05:28:14 2005
From: dave at pythonapocrypha.com (Dave Brueck)
Date: Tue, 18 Oct 2005 21:28:14 -0600
Subject: [Python-Dev] Guido v. Python, Round 1
In-Reply-To: <43558CEB.2020304@cignex.com>
References: <17235.52236.458715.854015@montanaro.dyndns.org> <1f7befae0510170934h21d072dbn2037d634af7d17f@mail.gmail.com> <17235.59880.819873.541201@montanaro.dyndns.org> <43558CEB.2020304@cignex.com>
Message-ID: <4355BD4E.9010506@pythonapocrypha.com>

Michel Pelletier wrote:
> skip at pobox.com wrote:
>
>> Neal> This URL should work for a while longer.
>>
>> Neal> http://creosote.python.org/neal/
>>
>> Ah, the vagaries of URL redirection. Thanks...
>
> The front of his shirt says ++ungood; Is that the whole joke or is the
> punchline on the back?
http://www.ee.surrey.ac.uk/Personal/L.Wood/double-plus-ungood/

From martin at v.loewis.de Wed Oct 19 07:50:58 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 19 Oct 2005 07:50:58 +0200
Subject: [Python-Dev] Migrating to subversion
In-Reply-To: <5.1.1.6.0.20051018184316.01fa0d90@mail.telecommunity.com>
References: <5.1.1.6.0.20051018184316.01fa0d90@mail.telecommunity.com>
Message-ID: <4355DEC2.3030103@v.loewis.de>

Phillip J. Eby wrote:
> What will the procedure be for getting a login? I assume our SF logins
> won't simply be transferred/transferrable.

You should send your SSH2 public key along with your preferred logname (firstname.lastname) to pydotorg at python.org.

Regards,
Martin

From martin at v.loewis.de Wed Oct 19 07:54:42 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 19 Oct 2005 07:54:42 +0200
Subject: [Python-Dev] Migrating to subversion
In-Reply-To: <17237.33364.928502.548360@montanaro.dyndns.org>
References: <435574D5.2040604@v.loewis.de> <17237.33364.928502.548360@montanaro.dyndns.org>
Message-ID: <4355DFA2.3000801@v.loewis.de>

skip at pobox.com wrote:
> Can you let us know again the magic incantation to check out the source from
> the repository?
See http://www.python.org/dev/svn.html

It's (say)

    svn co svn+ssh://pythondev at svn.python.org/python/trunk/Misc

for read-write access, and

    svn co http://svn.python.org/projects/python/trunk/Misc

for read-only access; viewcvs is at

    http://svn.python.org/view

Regards,
Martin

From martin at v.loewis.de Wed Oct 19 07:55:57 2005
From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=)
Date: Wed, 19 Oct 2005 07:55:57 +0200
Subject: [Python-Dev] Migrating to subversion
In-Reply-To:
References: <435574D5.2040604@v.loewis.de> <17237.33364.928502.548360@montanaro.dyndns.org>
Message-ID: <4355DFED.2020004@v.loewis.de>

Brett Cannon wrote:
> And any other problems people come across or questions they have about
> Subversion and its use, please do ask. I will try to start a new
> section in the dev FAQ for svn-specific issues.

Please integrate http://www.python.org/dev/svn.html (linked from 1.3 of devfaq.html) if possible.

Regards,
Martin

From stefan.rank at ofai.at Wed Oct 19 09:01:21 2005
From: stefan.rank at ofai.at (Stefan Rank)
Date: Wed, 19 Oct 2005 09:01:21 +0200
Subject: [Python-Dev] properties and block statement
In-Reply-To: <1129655834.8244.9.camel@fsol>
References: <20051018041112.GA14975@panix.com> <20051018094159.37EE.JCARLSON@uci.edu> <1129655834.8244.9.camel@fsol>
Message-ID: <4355EF41.4010803@ofai.at>

on 18.10.2005 19:17 Antoine Pitrou said the following:
>> What would this mythical block statement look like that would make
>> properties easier to write than the above late-binding or the subclass
>> Property recipe?
>
> I suppose something like:
>
>     class C(object):
>         x = prop:
>             """ Yay for property x!
""" > def __get__(self): > return self._x > def __set__(self, value): > self._x = x > > and then: > > def prop(@block): > return property( > fget=block.get("__get__"), > fset=block.get("__set__"), > fdel=block.get("__delete__"), > doc=block.get("__doc__", ""), > ) > > (where "@bargs" would be the syntax to refer to block args as a dict, > the same way "**kargs" already exist) > I think there is no need for a special @syntax for this to work. I suppose it would be possible to allow a trailing block after any function invocation, with the effect of creating a new namespace that gets treated as containing keyword arguments. No additional function needed for the property example:: class C(object): x = property(): doc = """ Yay for property x! """ def fget(self): return self._x def fset(self, value): self._x = x (This does not help with the problem of overridability though...) A drawback is that such a "keyword block" would only be possible for the last function invocation of a statement. Although the block could also be inside the method invocation parentheses? I do not think that this is a pretty sight but I'll spell it out anyways ;-) :: class C(object): x = property(: doc = """ Yay for property x! """ def fget(self): return self._x def fset(self, value): self._x = x ) --stefan From michele.simionato at gmail.com Wed Oct 19 10:51:50 2005 From: michele.simionato at gmail.com (Michele Simionato) Date: Wed, 19 Oct 2005 08:51:50 +0000 Subject: [Python-Dev] Definining properties - a use case for class decorators? In-Reply-To: References: <43525BFA.9090309@iinet.net.au> <4353016A.1010707@canterbury.ac.nz> <43544CC1.5050204@canterbury.ac.nz> <1129601328.9405.13.camel@geddy.wooz.org> Message-ID: <4edc17eb0510190151v6a078f54n2135b8d8386c796a@mail.gmail.com> On 10/18/05, Guido van Rossum wrote: > I wonder if at some point in the future Python will have to develop a > macro syntax so that you can write > > Property foo: > def get(self): return self._foo > ...etc... 
This reminds me of an idea I have kept in my drawer for a couple of years or so.
Here is my proposition: we could have the statement syntax

    <callable> <name> <args>:
        <definitions>

to be syntactic sugar for

    <name> = <callable>(<name>, <args>, <definitions>)

For instance properties could be defined as follows:

    def Property(name, args, dic):
        return property(
            dic.get('fget'), dic.get('fset'),
            dic.get('fdel'), dic.get('__doc__'))

    Property p():
        "I am a property"
        def fget(self):
            pass
        def fset(self):
            pass
        def fdel(self):
            pass

Another typical use case could be a dispatcher:

    class Dispatcher(object):
        def __init__(self, name, args, dic):
            self.dic = dic
        def __call__(self, action, *args, **kw):
            return self.dic.get(action)(*args, **kw)

    Dispatcher dispatch(action):
        def do_this():
            pass
        def do_that():
            pass
        def default():
            pass

    dispatch('do_this')

Notice that the proposal is already implementable by abusing the class statement:

    class <name>:
        __metaclass__ = <callable>
        <definitions>

But abusing metaclasses for this task is ugly. BTW, if the proposal was
implemented, the 'class' statement would become redundant and could be
replaced by 'type':

    class <name>:
        <definitions>

    <=>

    type <name>:
        <definitions>

;)

           Michele Simionato

From duncan.booth at suttoncourtenay.org.uk  Wed Oct 19 11:11:16 2005
From: duncan.booth at suttoncourtenay.org.uk (Duncan Booth)
Date: Wed, 19 Oct 2005 10:11:16 +0100
Subject: [Python-Dev] properties and block statement
References: <1129655834.8244.9.camel@fsol> <4355EF41.4010803@ofai.at>
Message-ID: 

Stefan Rank wrote in news:4355EF41.4010803 at ofai.at:

> I think there is no need for a special @syntax for this to work.
>
> I suppose it would be possible to allow a trailing block after any
> function invocation, with the effect of creating a new namespace that
> gets treated as containing keyword arguments.
>
I suspect that without any syntax changes at all it will be possible (for
some stack crawling implementation of 'propertycontext', and assuming
nobody makes property objects immutable) to do:

    class C(object):
        with propertycontext() as x:
            doc = """ Yay for property x!
""" def fget(self): return self._x def fset(self, value): self._x = value for inheritance you would have to specify the base property: class D(C): with propertycontext(C.x) as x: def fset(self, value): self._x = value+1 propertycontext could look something like: import sys @contextmanager def propertycontext(parent=None): classframe = sys._getframe(2) cvars = classframe.f_locals marker = object() keys = ('fget', 'fset', 'fdel', 'doc') old = [cvars.get(key, marker) for key in keys] if parent: pvars = [getattr(parent, key) for key in ('fget', 'fset', 'fdel', '__doc__')] else: pvars = [None]*4 args = dict(zip(keys, pvars)) prop = property() try: yield prop for key, orig in zip(keys, old): v = cvars.get(key, marker) if v is not orig: args[key] = v prop.__init__(**args) finally: for k,v in zip(keys,old): if v is marker: if k in cvars: del cvars[k] else: cvars[k] = v From ncoghlan at gmail.com Wed Oct 19 11:25:05 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 19 Oct 2005 19:25:05 +1000 Subject: [Python-Dev] Defining properties - a use case for class decorators? In-Reply-To: <20051018095537.37F1.JCARLSON@uci.edu> References: <4354FFA7.6020204@gmail.com> <20051018095537.37F1.JCARLSON@uci.edu> Message-ID: <435610F1.60207@gmail.com> Josiah Carlson wrote: >> Another option would be to allow attribute reference targets when binding >> function names: > > *shivers at the proposal* That's scary. It took me a few minutes just > to figure out what the heck that was supposed to do. Yeah, I think it's a concept with many, many more downsides than upsides. A "given" or "where" clause based solution would be far easier to read: x.get = f given: def f(): pass A given clause has its own problems though (the out-of-order execution it involves being the one which seems to raise the most hackles). Cheers, Nick. 
--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
 http://boredomandlaziness.blogspot.com

From p.f.moore at gmail.com  Wed Oct 19 11:42:25 2005
From: p.f.moore at gmail.com (Paul Moore)
Date: Wed, 19 Oct 2005 10:42:25 +0100
Subject: [Python-Dev] Definining properties - a use case for class decorators?
In-Reply-To: <4edc17eb0510190151v6a078f54n2135b8d8386c796a@mail.gmail.com>
References: <43525BFA.9090309@iinet.net.au> <4353016A.1010707@canterbury.ac.nz>
	<43544CC1.5050204@canterbury.ac.nz> <1129601328.9405.13.camel@geddy.wooz.org>
	<4edc17eb0510190151v6a078f54n2135b8d8386c796a@mail.gmail.com>
Message-ID: <79990c6b0510190242l10dee1b8va08d9245252dcf8d@mail.gmail.com>

On 10/19/05, Michele Simionato wrote:
> On 10/18/05, Guido van Rossum wrote:
> > I wonder if at some point in the future Python will have to develop a
> > macro syntax so that you can write
> >
> > Property foo:
> >     def get(self): return self._foo
> >     ...etc...
>
> This reminds me of an idea I have kept in my drawer for a couple of years or so.
> Here is my proposition: we could have the statement syntax
>
>     <callable> <name> <args>:
>         <definitions>
>
> to be syntactic sugar for
>
>     <name> = <callable>(<name>, <args>, <definitions>)

Cor. That looks like very neat/scary stuff. I'm not sure if I feel that
that is a good thing or a bad thing :-)

One question - in the expansion, "name" is used on both sides of the
assignment. Consider

    something name():
        <definitions>

This expands to

    name = something(name, (), <definitions>)

What should happen if name wasn't defined before? A literal
translation will result in a NameError. Maybe an expansion

    name = something('name', (), <definitions>)

would be better (ie, the callable gets the *name* of the target as an
argument, rather than the old value).

Also, the <definitions> bit needs some clarification. I'm guessing
that it would be a suite, executed in a new, empty namespace, and the
<definitions> is the resulting modified namespace (with
__builtins__ removed?)
In other words, take <definitions>, and do

    d = {}
    exec <definitions> in d
    del d['__builtins__']

then <definitions> is the resulting value of d.

Interesting idea...

Paul.

From ncoghlan at gmail.com  Wed Oct 19 11:47:17 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 19 Oct 2005 19:47:17 +1000
Subject: [Python-Dev] Defining properties - a use case for class decorators?
In-Reply-To: <5.1.1.6.0.20051018112713.01f10250@mail.telecommunity.com>
References: <5.1.1.6.0.20051018112713.01f10250@mail.telecommunity.com>
Message-ID: <43561625.4030902@gmail.com>

Phillip J. Eby wrote:
> Note that a "where" or "given" statement like this could make it a
> little easier to drop lambda.

I think the "lambda will disappear in Py3k" concept might have been what
triggered the original 'where' statement discussion.

The idea was to be able to lift an arbitrary subexpression out of a function
call or assignment statement without having to worry about affecting the
surrounding namespace, and without distracting attention from the original
statement. Basically, let a local refactoring *stay* local.

The discussion wandered fairly far afield from that original goal though.

One reason it fell apart was trying to answer the seemingly simple question
"What would this print?":

    def f():
        a = 1
        b = 2
        print 1, locals()
        print 3, locals() given:
            a = 2
            c = 3
            print 2, locals()
        print 4, locals()

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
 http://boredomandlaziness.blogspot.com

From michele.simionato at gmail.com  Wed Oct 19 12:10:53 2005
From: michele.simionato at gmail.com (Michele Simionato)
Date: Wed, 19 Oct 2005 10:10:53 +0000
Subject: [Python-Dev] Definining properties - a use case for class decorators?
In-Reply-To: <79990c6b0510190242l10dee1b8va08d9245252dcf8d@mail.gmail.com>
References: <43525BFA.9090309@iinet.net.au> <4353016A.1010707@canterbury.ac.nz>
	<43544CC1.5050204@canterbury.ac.nz> <1129601328.9405.13.camel@geddy.wooz.org>
	<4edc17eb0510190151v6a078f54n2135b8d8386c796a@mail.gmail.com>
	<79990c6b0510190242l10dee1b8va08d9245252dcf8d@mail.gmail.com>
Message-ID: <4edc17eb0510190310o637d141bvaaad939b0f3c072b@mail.gmail.com>

On 10/19/05, Paul Moore wrote:
>
> One question - in the expansion, "name" is used on both sides of the
> assignment. Consider
>
>     something name():
>         <definitions>
>
> This expands to
>
>     name = something(name, (), <definitions>)
>
> What should happen if name wasn't defined before? A literal
> translation will result in a NameError. Maybe an expansion
>
>     name = something('name', (), <definitions>)
>
> would be better (ie, the callable gets the *name* of the target as an
> argument, rather than the old value).
>
> Also, the <definitions> bit needs some clarification. I'm guessing
> that it would be a suite, executed in a new, empty namespace, and the
> <definitions> is the resulting modified namespace (with
> __builtins__ removed?)
>
> In other words, take <definitions>, and do
>
>     d = {}
>     exec <definitions> in d
>     del d['__builtins__']
>
> then <definitions> is the resulting value of d.
>
> Interesting idea...
>
> Paul.

<name> would be a string and <definitions> a dictionary. As I said, the
semantics would be exactly the same as the current way of doing it:

    class <name>:
        __metaclass__ = <callable>
        <definitions>

I am just advocating for syntactic sugar, the functionality is already there.

           Michele Simionato

From solipsis at pitrou.net  Wed Oct 19 13:12:26 2005
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Wed, 19 Oct 2005 13:12:26 +0200
Subject: [Python-Dev] Definining properties - a use case for class decorators?
In-Reply-To: <4edc17eb0510190151v6a078f54n2135b8d8386c796a@mail.gmail.com> References: <43525BFA.9090309@iinet.net.au> <4353016A.1010707@canterbury.ac.nz> <43544CC1.5050204@canterbury.ac.nz> <1129601328.9405.13.camel@geddy.wooz.org> <4edc17eb0510190151v6a078f54n2135b8d8386c796a@mail.gmail.com> Message-ID: <1129720346.6215.18.camel@fsol> Hi Michele, > Property p(): > "I am a property" > def fget(self): > pass > def fset(self): > pass > def fdel(self): > pass In effect this is quite similar to the proposal I've done (except that you've reversed the traditional assignment order from "p = Property()" to "Property p()") If others find it interesting and Guido doesn't frown on it, maybe we should sit down and start writing a PEP ? ciao Antoine. From janc13 at gmail.com Wed Oct 19 13:59:59 2005 From: janc13 at gmail.com (JanC) Date: Wed, 19 Oct 2005 13:59:59 +0200 Subject: [Python-Dev] Pythonic concurrency - offtopic In-Reply-To: <20051013220748.9195.JCARLSON@uci.edu> References: <434B61ED.4080503@intercable.ru> <20051013220748.9195.JCARLSON@uci.edu> Message-ID: <984838bf0510190459o695d99bcqa44e7eb072cf230d@mail.gmail.com> On 10/14/05, Josiah Carlson wrote: > Until Microsoft adds kernel support for fork, don't expect standard > Windows Python to support it. AFAIK the NT kernel has support for fork, but the Win32 subsystem doesn't support it (you can only use it with the POSIX subsystem). -- JanC From jimjjewett at gmail.com Wed Oct 19 15:23:35 2005 From: jimjjewett at gmail.com (Jim Jewett) Date: Wed, 19 Oct 2005 09:23:35 -0400 Subject: [Python-Dev] Early PEP draft (For Python 3000?) Message-ID: (In http://mail.python.org/pipermail/python-dev/2005-October/057251.html) Eyal Lotem wrote: > Name: Attribute access for all namespaces ... > global x ; x = 1 > Replaced by: > module.x = 1 Attribute access as an option would be nice, but might be slower. 
Also note that one common use for a __dict__ is that you don't know what keys are available; meeting this use case with attribute access would require some extra machinery, such as an iterator over attributes. -jJ From jimjjewett at gmail.com Wed Oct 19 15:44:09 2005 From: jimjjewett at gmail.com (Jim Jewett) Date: Wed, 19 Oct 2005 09:44:09 -0400 Subject: [Python-Dev] Defining properties - a use case for class decorators? Message-ID: (In http://mail.python.org/pipermail/python-dev/2005-October/057409.html,) Nick Coghlan suggested allowing attribute references as binding targets. > x = property("Property x (must be less than 5)") > def x.get(instance): ... Josiah shivered and said it was hard to tell what was even intended, and (in http://mail.python.org/pipermail/python-dev/2005-October/057437.html) Nick agreed that it was worse than > x.get = f given: > def f(): ... Could someone explain to me why it is worse? I understand not wanting to modify object x outside of its definition. I understand that there is some trickiness about instancemethods and bound variables. But these objections seem equally strong for both forms, as well as for the current "equivalent" of def f(): ... x.get = f The first form (def x.get) at least avoids repeating (or even creating) the temporary function name. The justification for decorators was to solve this very problem within a module or class. How is this different? Is it just that attributes shouldn't be functions, and this might encourage the practice? -jJ From steven.bethard at gmail.com Wed Oct 19 17:38:12 2005 From: steven.bethard at gmail.com (Steven Bethard) Date: Wed, 19 Oct 2005 09:38:12 -0600 Subject: [Python-Dev] Definining properties - a use case for class decorators? 
In-Reply-To: <4edc17eb0510190151v6a078f54n2135b8d8386c796a@mail.gmail.com>
References: <43525BFA.9090309@iinet.net.au> <4353016A.1010707@canterbury.ac.nz>
	<43544CC1.5050204@canterbury.ac.nz> <1129601328.9405.13.camel@geddy.wooz.org>
	<4edc17eb0510190151v6a078f54n2135b8d8386c796a@mail.gmail.com>
Message-ID: 

Michele Simionato wrote:
> This reminds me of an idea I have kept in my drawer for a couple of years or so.
> Here is my proposition: we could have the statement syntax
>
>     <callable> <name> <args>:
>         <definitions>
>
> to be syntactic sugar for
>
>     <name> = <callable>(<name>, <args>, <definitions>)
>
[snip]
> BTW, if the proposal was implemented, the 'class' would become
> redundant and could be replaced by 'type':
>
>     class <name>:
>         <definitions>
>
>     <=>
>
>     type <name>:
>         <definitions>

Wow, that's really neat. And you save a keyword! ;-)  I'd definitely like
to see a PEP.

STeVE
--
You can wordify anything if you just verb it.
        --- Bucky Katt, Get Fuzzy

From pje at telecommunity.com  Wed Oct 19 17:42:05 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Wed, 19 Oct 2005 11:42:05 -0400
Subject: [Python-Dev] Defining properties - a use case for class decorators?
In-Reply-To: <43561625.4030902@gmail.com>
References: <5.1.1.6.0.20051018112713.01f10250@mail.telecommunity.com>
	<5.1.1.6.0.20051018112713.01f10250@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20051019113729.01fa79f8@mail.telecommunity.com>

At 07:47 PM 10/19/2005 +1000, Nick Coghlan wrote:
>Phillip J. Eby wrote:
> > Note that a "where" or "given" statement like this could make it a
> > little easier to drop lambda.
>
>I think the "lambda will disappear in Py3k" concept might have been what
>triggered the original 'where' statement discussion.
>
>The idea was to be able to lift an arbitrary subexpression out of a function
>call or assignment statement without having to worry about affecting the
>surrounding namespace, and without distracting attention from the original
>statement. Basically, let a local refactoring *stay* local.
>
>The discussion wandered fairly far afield from that original goal though.
> >One reason it fell apart was trying to answer the seemingly simple question >"What would this print?": > > def f(): > a = 1 > b = 2 > print 1, locals() > print 3, locals() given: > a = 2 > c = 3 > print 2, locals() > print 4, locals() It would print "SyntaxError", because the 'given' or 'where' clause should only work on an expression or assignment statement, not print. :) In Python 3000, where print is a function, it should print the numbers in sequence, with 1+4 showing the outer locals, and 2+3 showing the inner locals (not including 'b', since b is not a local variable in the nested block). I don't see what's hard about the question, if you view the block as syntax sugar for a function definition and invocation on the right hand side of an assignment. Of course, if you assume it can occur on *any* statement (e.g. print), I suppose things could seem more hairy. From jeremy at alum.mit.edu Wed Oct 19 17:46:01 2005 From: jeremy at alum.mit.edu (Jeremy Hylton) Date: Wed, 19 Oct 2005 11:46:01 -0400 Subject: [Python-Dev] AST branch merge status In-Reply-To: References: Message-ID: I'm still making slow progress on this front. I have a versioned merged to the CVS head. I'd like to make a final pass over the patch. I'd upload it to SF, but I can't connect to a web server there. If anyone would like to eyeball that patch before I commit it, I can email it to you. Jeremy On 10/16/05, Jeremy Hylton wrote: > Real life interfered with the planned merge tonight. I hope you'll > all excuse and wait until tomorrow night. > > Jeremy > > On 10/16/05, Jeremy Hylton wrote: > > I just merged the head back to the AST branch for what I hope is the > > last time. I plan to merge the branch to the head on Sunday evening. > > I'd appreciate it if folks could hold off on making changes on the > > trunk until that merge happens. 
> >
> > If this is a non-trivial inconvenience for anyone, go ahead with the
> > changes but send me mail to make sure that I don't lose the changes
> > when I do the merge. Regardless, the compiler and Grammar are off
> > limits. I'll blow away any changes you make there <0.1 wink>.
> >
> > Jeremy
> >
>

From skip at pobox.com  Wed Oct 19 18:43:59 2005
From: skip at pobox.com (skip@pobox.com)
Date: Wed, 19 Oct 2005 11:43:59 -0500
Subject: [Python-Dev] Definining properties - a use case for class decorators?
In-Reply-To: 
References: <43525BFA.9090309@iinet.net.au> <4353016A.1010707@canterbury.ac.nz>
	<43544CC1.5050204@canterbury.ac.nz> <1129601328.9405.13.camel@geddy.wooz.org>
	<4edc17eb0510190151v6a078f54n2135b8d8386c796a@mail.gmail.com>
Message-ID: <17238.30671.916987.907009@montanaro.dyndns.org>

    >> <callable> <name> <args>:
    >>     <definitions>
    ...
    Steve> Wow, that's really neat. And you save a keyword! ;-)

Two if you add a builtin called "function" (get rid of "def").

Skip

From pje at telecommunity.com  Wed Oct 19 18:49:47 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Wed, 19 Oct 2005 12:49:47 -0400
Subject: [Python-Dev] Definining properties - a use case for class decorators?
In-Reply-To: <17238.30671.916987.907009@montanaro.dyndns.org>
References: <43525BFA.9090309@iinet.net.au> <4353016A.1010707@canterbury.ac.nz>
	<43544CC1.5050204@canterbury.ac.nz> <1129601328.9405.13.camel@geddy.wooz.org>
	<4edc17eb0510190151v6a078f54n2135b8d8386c796a@mail.gmail.com>
Message-ID: <5.1.1.6.0.20051019124840.01fa73b8@mail.telecommunity.com>

At 11:43 AM 10/19/2005 -0500, skip at pobox.com wrote:
>    >> <callable> <name> <args>:
>    >>     <definitions>
>    ...
>    Steve> Wow, that's really neat. And you save a keyword! ;-)
>
>Two if you add a builtin called "function" (get rid of "def").

Not unless the tuple is passed in as an abstract syntax tree or something.
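The "abusing the class statement" emulation Michele refers to earlier in the thread can be sketched concretely. This is an illustrative reconstruction, not code from the thread, and it is spelled with Python 3's `metaclass=` keyword rather than the `__metaclass__` attribute in use in 2005; the names are invented for the example:

```python
# Sketch: the callable named as the metaclass receives
# (name, bases, namespace) and may return any object at all,
# so a class body can be hijacked to build a property.
def Property(name, bases, dic):
    return property(dic.get('fget'), dic.get('fset'),
                    dic.get('fdel'), dic.get('__doc__'))

class C(object):
    # In 2005 this would have been "__metaclass__ = Property"
    # inside the body of a plain class statement.
    class x(metaclass=Property):
        "I am a property"
        def fget(self):
            return self._x
        def fset(self, value):
            self._x = value

c = C()
c.x = 42
print(c.x)  # prints 42
```

Here `Property` never builds a class at all: the class statement is borrowed only for its name-and-body-capture machinery, which is exactly what the proposed sugar would make explicit.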
From jcarlson at uci.edu Wed Oct 19 20:01:06 2005 From: jcarlson at uci.edu (Josiah Carlson) Date: Wed, 19 Oct 2005 11:01:06 -0700 Subject: [Python-Dev] Pythonic concurrency - offtopic In-Reply-To: <984838bf0510190459o695d99bcqa44e7eb072cf230d@mail.gmail.com> References: <20051013220748.9195.JCARLSON@uci.edu> <984838bf0510190459o695d99bcqa44e7eb072cf230d@mail.gmail.com> Message-ID: <20051019103148.380E.JCARLSON@uci.edu> JanC wrote: > > On 10/14/05, Josiah Carlson wrote: > > Until Microsoft adds kernel support for fork, don't expect standard > > Windows Python to support it. > > AFAIK the NT kernel has support for fork, but the Win32 subsystem > doesn't support it (you can only use it with the POSIX subsystem). Good to know. But if I remember subsystem semantics properly, you can use a single subsystem at any time, so if one wanted to use fork from the POSIX subsystem, one would necessarily have to massage the rest of Python into NT's POSIX subsystem, which could be a problem because NT/2K's posix subsystem doesn't support network interfaces, memory mapped files, ... http://msdn2.microsoft.com/en-us/library/y23kc048 Based on this page: http://www.cygwin.com/cygwin-ug-net/highlights.html ...it does seem possible to borrow cygwin's implementation of fork for use on win32, but I would guess that most people would be disappointed with its performance in comparison to unix fork. - Josiah From jcarlson at uci.edu Wed Oct 19 20:03:13 2005 From: jcarlson at uci.edu (Josiah Carlson) Date: Wed, 19 Oct 2005 11:03:13 -0700 Subject: [Python-Dev] Early PEP draft (For Python 3000?) In-Reply-To: References: Message-ID: <20051019105405.3817.JCARLSON@uci.edu> Jim Jewett wrote: > > (In http://mail.python.org/pipermail/python-dev/2005-October/057251.html) > Eyal Lotem wrote: > > > Name: Attribute access for all namespaces ... > > > global x ; x = 1 > > Replaced by: > > module.x = 1 > > Attribute access as an option would be nice, but might be slower. 
> > Also note that one common use for a __dict__ is that you don't > know what keys are available; meeting this use case with > attribute access would require some extra machinery, such as > an iterator over attributes. This particular use case is easily handled. Put the following once at the top of the module... module = __import__(__name__) Then one can access (though perhaps not quickly) the module-level variables for that module. To access attributes, it is a quick scan through module.__dict__, dir(), or vars(). Want to make that automatic? Write an import hook that puts a reference to the module in the module itself on load/reload. - Josiah From jcarlson at uci.edu Wed Oct 19 20:58:06 2005 From: jcarlson at uci.edu (Josiah Carlson) Date: Wed, 19 Oct 2005 11:58:06 -0700 Subject: [Python-Dev] Early PEP draft (For Python 3000?) In-Reply-To: <76fd5acf0510161037v477874b0w5595e3edffe71511@mail.gmail.com> References: <76fd5acf0510161036i4ab09e2cu39bd6961a60df783@mail.gmail.com> <76fd5acf0510161037v477874b0w5595e3edffe71511@mail.gmail.com> Message-ID: <20051019110251.381A.JCARLSON@uci.edu> Calvin Spealman wrote: > On 10/16/05, Josiah Carlson wrote: [snip] > > What I'm saying is that whether or not you can modify the contents of > > stack frames via tricks, you shouldn't. Why? Because as I said, if the > > writer wanted you to be hacking around with a namespace, they should > > have passed you a shared namespace. > > > > From what I understand, there are very few (good) reasons why a user > > should muck with stack frames, among them because it is quite convenient > > to write custom traceback printers (like web CGI, etc.), and if one is > > tricky, limit the callers of a function/method to those "allowable". > > There may be other good reasons, but until you offer a use-case that is > > compelling for reasons why it should be easier to access and/or modify > > the contents of stack frames, I'm going to remain at -1000. > > I think I was wording this badly. 
I meant to suggest this as a way to > define nested functions (or classes?) and probably access names from > various levels of scope. In this way, a nested function would be able > to say "bind the name 'a' in the namespace in which I am defined to > this object", thus offering more fine grained approached than the > current global keyword. I know there has been talk of this issue > before, but I don't know if it works with or against anything said for > this previously. And as I have said, if you want people to modify a namespace, you should be creating a namespace and passing it around. If you want people to have access to some embedded definition, then you expose it. If some writer of some module/class/whatever decides that they want to embed some thing that you think should have been exposed to the outside world, then complain the the writer that they have designed it poorly. Take a walk though the standard library. You will likely note the rarity of embedded function/class definitions. In those cases where they are used, it is generally for a good reason. You will also note the general rarity of stack frame access. Prior to the cycle-removing garbage collector, this was because accessing stack frames could result in memory leaks of stack frames. You may also note the rarity of modification of stack frame contents (I'm not sure there are any), which can be quite dangerous. Right now it is difficult to go and access the value of a local 'x' three callers above you in the stack frame. I think this is great, working as intended in fact. Being able to read and/or modify arbitrary stack frame contents, and/or being able to pass those stack frames around: foo(frame[3]), is quite dangerous. I'm still -1000. - Josiah From skip at pobox.com Wed Oct 19 21:22:06 2005 From: skip at pobox.com (skip@pobox.com) Date: Wed, 19 Oct 2005 14:22:06 -0500 Subject: [Python-Dev] Definining properties - a use case for class decorators? 
In-Reply-To: <5.1.1.6.0.20051019124840.01fa73b8@mail.telecommunity.com> References: <43525BFA.9090309@iinet.net.au> <4353016A.1010707@canterbury.ac.nz> <43544CC1.5050204@canterbury.ac.nz> <1129601328.9405.13.camel@geddy.wooz.org> <4edc17eb0510190151v6a078f54n2135b8d8386c796a@mail.gmail.com> <5.1.1.6.0.20051019124840.01fa73b8@mail.telecommunity.com> Message-ID: <17238.40158.735826.504410@montanaro.dyndns.org> >>>>> "Phillip" == Phillip J Eby writes: Phillip> At 11:43 AM 10/19/2005 -0500, skip at pobox.com wrote: >> >> : >> >> >> ... >> Steve> Wow, that's really neat. And you save a keyword! ;-) >> >> Two if you add a builtin called "function" (get rid of "def"). Phillip> Not unless the tuple is passed in as an abstract syntax tree or Phillip> something. Hmmm... Maybe I misread something then. I saw (I think) that type Foo (base): def __init__(self): pass would be equivalent to class Foo (base): def __init__(self): pass and thought that function myfunc(arg1, arg2): pass would be equivalent to def myfunc(arg1, arg2): pass where "function" a builtin that when called returns a new function. Skip From jcarlson at uci.edu Wed Oct 19 21:46:12 2005 From: jcarlson at uci.edu (Josiah Carlson) Date: Wed, 19 Oct 2005 12:46:12 -0700 Subject: [Python-Dev] Definining properties - a use case for class decorators? In-Reply-To: <17238.40158.735826.504410@montanaro.dyndns.org> References: <5.1.1.6.0.20051019124840.01fa73b8@mail.telecommunity.com> <17238.40158.735826.504410@montanaro.dyndns.org> Message-ID: <20051019124508.3836.JCARLSON@uci.edu> skip at pobox.com wrote: > > >>>>> "Phillip" == Phillip J Eby writes: > > Phillip> At 11:43 AM 10/19/2005 -0500, skip at pobox.com wrote: > >> >> : > >> >> > >> ... > >> > Steve> Wow, that's really neat. And you save a keyword! ;-) > >> > >> Two if you add a builtin called "function" (get rid of "def"). > > Phillip> Not unless the tuple is passed in as an abstract syntax tree or > Phillip> something. > > Hmmm... Maybe I misread something then. 
I saw (I think) that > > type Foo (base): > def __init__(self): > pass > > would be equivalent to > > class Foo (base): > def __init__(self): > pass > > and thought that > > function myfunc(arg1, arg2): > pass > > would be equivalent to > > def myfunc(arg1, arg2): > pass > > where "function" a builtin that when called returns a new function. For it to work in classes, it would need to execute the body of the class, which is precisely why it can't work with functions. - Josiah From jcarlson at uci.edu Wed Oct 19 22:10:30 2005 From: jcarlson at uci.edu (Josiah Carlson) Date: Wed, 19 Oct 2005 13:10:30 -0700 Subject: [Python-Dev] Defining properties - a use case for class decorators? In-Reply-To: References: Message-ID: <20051019102954.380B.JCARLSON@uci.edu> Jim Jewett wrote: > (In http://mail.python.org/pipermail/python-dev/2005-October/057409.html,) > Nick Coghlan suggested allowing attribute references as binding targets. > > > x = property("Property x (must be less than 5)") > > > def x.get(instance): ... > > Josiah shivered and said it was hard to tell what was even intended, and > (in http://mail.python.org/pipermail/python-dev/2005-October/057437.html) > Nick agreed that it was worse than > > > x.get = f given: > > def f(): ... > > Could someone explain to me why it is worse? def x.get(...): ... Seems to imply that one is defining a method on x. This is not the case. It is also confused by the x.get(instance) terminology that I doubt has ever seen light of day in production code. Instance of what? Instance of x? The class? ... I'm personally averse to the 'given:' syntax, if only because under certain situations, it can be reasonably emulated. > I understand not wanting to modify object x outside of its definition. > > I understand that there is some trickiness about instancemethods > and bound variables. > > But these objections seem equally strong for both forms, as well > as for the current "equivalent" of > > def f(): ... 
> x.get = f > > The first form (def x.get) at least avoids repeating (or even creating) > the temporary function name. > > The justification for decorators was to solve this very problem within > a module or class. How is this different? Is it just that attributes > shouldn't be functions, and this might encourage the practice? Many will agree that there is a problem with how properties are defined. There are many proposed solutions, some of which use decorators, custom subclasses, metaclasses, etc. I have a problem with it because from the description, you could use... def x.y.z.a.b.c.foobarbaz(...): ... ...and it woud be unclear to the reader or writer what the hell x.y.z.a.b.c is (class, instance, module), which can come up if the definition/import of x is far enough away from the definition of x. . Again, ick. - Josiah From pje at telecommunity.com Wed Oct 19 22:15:33 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Wed, 19 Oct 2005 16:15:33 -0400 Subject: [Python-Dev] Definining properties - a use case for class decorators? In-Reply-To: <20051019124508.3836.JCARLSON@uci.edu> References: <17238.40158.735826.504410@montanaro.dyndns.org> <5.1.1.6.0.20051019124840.01fa73b8@mail.telecommunity.com> <17238.40158.735826.504410@montanaro.dyndns.org> Message-ID: <5.1.1.6.0.20051019161405.01fb3218@mail.telecommunity.com> At 12:46 PM 10/19/2005 -0700, Josiah Carlson wrote: >skip at pobox.com wrote: > > >>>>> "Phillip" == Phillip J Eby writes: > > > > Phillip> Not unless the tuple is passed in as an abstract syntax > tree or > > Phillip> something. > > > > Hmmm... Maybe I misread something then. 
I saw (I think) that > > > > type Foo (base): > > def __init__(self): > > pass > > > > would be equivalent to > > > > class Foo (base): > > def __init__(self): > > pass > > > > and thought that > > > > function myfunc(arg1, arg2): > > pass > > > > would be equivalent to > > > > def myfunc(arg1, arg2): > > pass > > > > where "function" a builtin that when called returns a new function. > >For it to work in classes, it would need to execute the body of the >class, which is precisely why it can't work with functions. Not only that, but the '(arg1, arg2)' for classes is a tuple of *values*, but for functions it's just a function signature, not an expression! Which is why this would effectively have to be a macro facility. From fredrik at pythonware.com Wed Oct 19 22:23:35 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Wed, 19 Oct 2005 22:23:35 +0200 Subject: [Python-Dev] Definining properties - a use case for classdecorators? References: <43525BFA.9090309@iinet.net.au><4353016A.1010707@canterbury.ac.nz><43544CC1.5050204@canterbury.ac.nz> Message-ID: Guido van Rossum wrote: > OK, so how's this for a radical proposal. > > Let's change the property built-in so that its arguments can be either > functions or strings (or None). If they are functions or None, it > behaves exactly like it always has. > > If an argument is a string, it should be a method name, and the method > is looked up by that name each time the property is used. Because this > is late binding, it can be put before the method definitions, and a > subclass can override the methods. Example: > > class C: > > foo = property('getFoo', 'setFoo', None, 'the foo property') > > def getFoo(self): > return self._foo > > def setFoo(self, foo): > self._foo = foo > > What do you think? +1 from here. > If you can think of a solution that looks better than mine, you're a genius. letting "class" inject a slightly magic "self" variable into the class namespace ? 
class C:

    foo = property(self.getFoo, self.setFoo, None, 'the foo property')

    def getFoo(self):
        return self._foo

    def setFoo(self, foo):
        self._foo = foo

(figuring out exactly what "self" should be is left as an exercise etc)

From guido at python.org Wed Oct 19 22:53:04 2005 From: guido at python.org (Guido van Rossum) Date: Wed, 19 Oct 2005 13:53:04 -0700 Subject: [Python-Dev] Definining properties - a use case for classdecorators? In-Reply-To: References: <43525BFA.9090309@iinet.net.au> <4353016A.1010707@canterbury.ac.nz> <43544CC1.5050204@canterbury.ac.nz> Message-ID:

On 10/19/05, Fredrik Lundh wrote: > letting "class" inject a slightly magic "self" variable into the class namespace ? > > class C: > > foo = property(self.getFoo, self.setFoo, None, 'the foo property') > > def getFoo(self): > return self._foo > > def setFoo(self, foo): > self._foo = foo > > (figuring out exactly what "self" should be is left as an exercise etc)

It's magical enough to deserve to be called __self__. But even so: I've seen proposals like this a few times in other contexts. I may even have endorsed the idea at one time. The goal is always the same: forcing delayed evaluation of a getattr operation without using either a string literal or a lambda. But I find it quite a bit too magical, for all values of xyzzy, that xyzzy.foo would return a function of one argument that, when called with an argument x, returns x.foo. Even if it's easy enough to write the solution (*), that sentence describing it gives me goosebumps. And the logical consequence, xyzzy.foo(x), which is an obfuscated way to write x.foo, makes me nervous.
(*) Here's the solution:

    class XYZZY(object):
        def __getattr__(self, name):
            return lambda arg: getattr(arg, name)

    xyzzy = XYZZY()

-- --Guido van Rossum (home page: http://www.python.org/~guido/)

From blais at furius.ca Wed Oct 19 22:54:16 2005 From: blais at furius.ca (Martin Blais) Date: Wed, 19 Oct 2005 16:54:16 -0400 Subject: [Python-Dev] enumerate with a start index Message-ID: <8393fff0510191354s3676682dk58a4db65edee1fee@mail.gmail.com>

Hi

Just wondering, would anyone think of it as a good idea if the enumerate() builtin could accept a "start" argument? I've run across a few cases where this would have been useful. It seems generic enough too.

From michel at cignex.com Thu Oct 20 00:14:31 2005 From: michel at cignex.com (Michel Pelletier) Date: Wed, 19 Oct 2005 15:14:31 -0700 Subject: [Python-Dev] Coroutines, generators, function calling In-Reply-To: <004801c5d3ec$e29b5360$6402a8c0@arkdesktop> References: <1129643229.12510.37.camel@localhost> <004801c5d3ec$e29b5360$6402a8c0@arkdesktop> Message-ID: <4356C547.8020402@cignex.com>

Andrew Koenig wrote: >> Sure, that would work. Or even this, if the scheduler would >>automatically recognize generator objects being yielded and so would run >>the nested coroutine until finish: > > > This idea has been discussed before. I think the problem with recognizing > generators as the subject of "yield" statements is that then you can't yield > a generator even if you want to. > > The best syntax I can think of without adding a new keyword looks like this: > > yield from x > > which would be equivalent to > > for i in x: > yield i

My eyes really like the syntax, but I wonder about its usefulness.
In rdflib, particularly here:

http://svn.rdflib.net/trunk/rdflib/backends/IOMemory.py

We yield values from inside for loops all over the place, but the yielded value is very rarely just the index value (only 1 of 14 yields), but something calculated from the index value, so the new syntax would not be useful, unless it was something that provided access to the index item as a variable, like:

    yield foo(i) for i in x

which barely saves you anything (a colon, a newline, and an indent). (hey wait, isn't that a generator comprehension? Haven't really encountered those yet). Of course rdflib could be the minority case and most folks who yield in loops are yielding only the index value directly.

off to read the generator comprehension docs...

-Michel

From michel at cignex.com Thu Oct 20 01:03:52 2005 From: michel at cignex.com (Michel Pelletier) Date: Wed, 19 Oct 2005 16:03:52 -0700 Subject: [Python-Dev] enumerate with a start index In-Reply-To: <8393fff0510191354s3676682dk58a4db65edee1fee@mail.gmail.com> References: <8393fff0510191354s3676682dk58a4db65edee1fee@mail.gmail.com> Message-ID: <4356D0D8.9010209@cignex.com>

Martin Blais wrote: > Hi > > Just wondering, would anyone think of it as a good idea if the > enumerate() builtin could accept a "start" argument? I've run across > a few cases where this would have been useful. It seems generic > enough too.

+1, but something more useful might be a cross between enumerate and zip, where you pass N iterables and it yields N-tuples.
-Michel > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/python-python-dev%40m.gmane.org >

From ark at acm.org Thu Oct 20 01:17:05 2005 From: ark at acm.org (Andrew Koenig) Date: Wed, 19 Oct 2005 19:17:05 -0400 Subject: [Python-Dev] Coroutines, generators, function calling In-Reply-To: <4356C547.8020402@cignex.com> Message-ID: <00ad01c5d503$3c529370$6402a8c0@arkdesktop>

> We yield values from inside for loops all over the place, but the > yielded value is very rarely just the index value (only 1 of 14 yields) > , but something calculated from the index value, so the new syntax would > not be useful, unless it was something that provided access to the index > item as a variable, like: > > yield foo(i) for i in x > > which barely saves you anything (a colon, a newline, and an indent). > (hey wait, isn't that a generator comprehension?

Here's a use case:

    def preorder(tree):
        if tree:
            yield tree
            yield from preorder(tree.left)
            yield from preorder(tree.right)

From jcarlson at uci.edu Thu Oct 20 01:28:29 2005 From: jcarlson at uci.edu (Josiah Carlson) Date: Wed, 19 Oct 2005 16:28:29 -0700 Subject: [Python-Dev] enumerate with a start index In-Reply-To: <4356D0D8.9010209@cignex.com> References: <8393fff0510191354s3676682dk58a4db65edee1fee@mail.gmail.com> <4356D0D8.9010209@cignex.com> Message-ID: <20051019162606.385D.JCARLSON@uci.edu>

Michel Pelletier wrote: > > Martin Blais wrote: > > Hi > > > > Just wondering, would anyone think of it as a good idea if the > > enumerate() builtin could accept a "start" argument? I've run across > > a few cases where this would have been useful. It seems generic > > enough too. > > +1, but something more useful might be a cross between enumerate and > zip, where you pass N iterables and it yields N-tuples.
Then you could > do something like: > > zipyield(range(10, 20), mygenerator()) > > and it would be like you wanted for enumerate, but starting from 10 in this case.

All of this already exists:

    from itertools import izip, count

    for i, j in izip(count(start), iterable):
        ...

Read your standard library.

- Josiah

From pje at telecommunity.com Thu Oct 20 03:40:35 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Wed, 19 Oct 2005 21:40:35 -0400 Subject: [Python-Dev] Pre-PEP: Task-local variables Message-ID: <5.1.1.6.0.20051019213825.01fafa00@mail.telecommunity.com>

This is still rather rough, but I figured it's easier to let everybody fill in the remaining gaps by arguments than it is for me to pick a position I like and try to convince everybody else that it's right. :) Your feedback is requested and welcome.

PEP: XXX
Title: Task-local Variables
Author: Phillip J. Eby
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 19-Oct-2005
Python-Version: 2.5
Post-History: 19-Oct-2005

Abstract
========

Many Python modules provide some kind of global or thread-local state, which is relatively easy to implement. With the acceptance of PEP 342, however, co-routines will become more common, and it will be desirable in many cases to treat each as its own logical thread of execution. So, many kinds of state that might now be kept as a thread-specific variable (such as the "current transaction" in ZODB or the "current database connection" in SQLObject) will not work with coroutines.

This PEP proposes a simple mechanism akin to thread-local variables, but which will make it easy and efficient for co-routine schedulers to switch state between tasks. The mechanism is proposed for the standard library because its usefulness is dependent on its adoption by standard library modules, such as the ``decimal`` module.
The proposed features can be implemented as pure Python code, and as such are suitable for use by other Python implementations (including older versions of Python, if desired).

Motivation
==========

PEP 343's new "with" statement makes it very attractive to temporarily alter some aspect of system state, and then restore it, using a context manager. Many of PEP 343's examples are of this nature, whether they are temporarily redirecting ``sys.stdout``, or temporarily altering decimal precision.

But when this attractive feature is combined with PEP 342-style co-routines, a new challenge emerges. Consider this code, which may misbehave if run as a co-routine::

    with opening(filename, "w") as f:
        with redirecting_stdout(f):
            print "Hello world"
            yield pause(5)
            print "Goodbye world"

Problems can arise from this code in two ways. First, the redirection of output "leaks out" to other coroutines during the pause. Second, when this coroutine is finished, it resets stdout to whatever it was at the beginning of the coroutine, regardless of what another co-routine might have been using.

Similar issues can be demonstrated using the decimal context, transactions, database connections, etc., which are all likely to be popular contexts for the "with" statement. However, if these new context managers are written to use global or thread-local state, coroutines will be locked out of the market, so to speak.

Therefore, this PEP proposes to provide and promote a standard way of managing per-execution-context state, such that coroutine schedulers can keep each coroutine's state distinct. If this mechanism is then used by library modules (such as ``decimal``) to maintain their current state, then they will be transparently compatible with co-routines as well as threaded and threadless code.

(Note that for Python 2.x versions, backward compatibility requires that we continue to allow direct reassignment to e.g. ``sys.stdout``.
So, it will still of course be possible to write code that will interoperate poorly with co-routines. But for Python 3.x it seems worth considering making some of the ``sys`` module's contents into task-local variables rather than assignment targets.)

Specification
=============

This PEP proposes to offer a standard library module called ``context``, with the following core contents:

Variable
    A class that allows creation of a context variable (see below).

snapshot()
    Returns a snapshot of the current execution context.

swap(ctx)
    Set the current context to `ctx`, returning a snapshot of the current context.

The basic idea here is that a co-routine scheduler can switch between tasks by doing something like::

    last_coroutine.state = context.swap(next_coroutine.state)

Or perhaps more like::

    # ... execute coroutine iteration
    last_coroutine.state = context.snapshot()
    # ... figure out what routine to run next
    context.swap(next_coroutine.state)

Each ``context.Variable`` stores and retrieves its state using the current execution context, which is thread-specific. (Thus, each thread may execute any number of concurrent tasks, although most practical systems today have only one thread that executes coroutines, the other threads being reserved for operations that would otherwise block co-routine execution. Nonetheless, such other threads will often still require context variables of their own.)

Context Variable Objects
------------------------

A context variable object provides the following methods:

get(default=None)
    Return the value of the variable in the current execution context, or `default` if not set.

set(value)
    Set the value of the variable for the current execution context.

unset()
    Delete the value of the variable for the current execution context.

__call__(*value)
    If called with an argument, return a context manager that sets the variable to the specified value, then restores the old value upon ``__exit__``.
If called without an argument, return the value of the variable for the current execution context, or raise an error if no value is set. Thus::

    with some_variable(value):
        foo()

would be roughly equivalent to::

    old = some_variable()
    some_variable.set(value)
    try:
        foo()
    finally:
        some_variable.set(old)

Implementation Details
----------------------

The simplest possible implementation is for ``Variable`` objects to use themselves as unique keys into an execution context dictionary. The context dictionary would be stored in another dictionary, keyed by ``get_thread_ident()``. This approach would work with almost any version or implementation of Python.

For efficiency's sake, however, CPython could simply store the execution context dictionary in its "thread state" structure, creating an empty dictionary at thread initialization time. This would make it somewhat easier to offer a C API for access to context variables, especially where efficiency of access is desirable. But the proposal does not depend on this.

In the PEP author's experiments, a simple copy-on-write optimization to the ``set()`` and ``unset()`` methods allows for high performance task switching. By placing a "frozen" flag in the context dictionary when a snapshot is taken, and then checking for the flag before making changes, a single snapshot can be shared by multiple callers, and thus a ``swap()`` operation is little more than two dictionary writes and a read. This leads to higher performance in the typical case, because context variables are more likely to be set in outer loops, but task switches are more likely to occur in inner loops. A copy-on-write approach thus prevents copying from occurring during most task switches.

Possible Enhancements
---------------------

The core of this proposal is extremely minimalist, as it should be possible to do almost anything desired using combinations of ``Variable`` objects or by simply using variables whose values are mutable objects.
There are, however, a variety of options for enhancement:

``manager`` decorator
    The ``context`` module could perhaps be the home of the PEP 343 ``contextmanager`` decorator, effectively renamed to ``context.manager``. This could be a natural fit, in that it would remind the creators of new context managers that they should consider tracking any associated state in a ``context.Variable``.

Proxy class
    Sometimes it's useful to have an object that looks like a module global (e.g. ``sys.stdout``) but which actually delegates its behavior to a context-specific instance. Thus, you could have one ``sys.stdout``, but its actual output would be directed based on the current execution context. The simplest form of such a proxy class might look something like::

        class Proxy(object):
            def __init__(self, initial_value):
                self.var = context.Variable()
                self.var.set(initial_value)

            def __call__(self, *value):
                return object.__getattribute__(self, 'var')(*value)

            def __getattribute__(self, attr):
                var = object.__getattribute__(self, 'var')
                return getattr(var, attr)

        sys.stdout = Proxy(sys.stdout)  # make sys.stdout selectable

        with sys.stdout(somefile):  # temporary redirect in current context
            print "hey!"

    The main open issues in implementing this sort of proxy are in the precise set of special methods (e.g. ``__getitem__``, ``__setattr__``, etc.) that should be supported, and what API should be supplied for changing the value, setting a default value for new threads, etc.

Low-level API
    Currently, this PEP does not specify an API for accessing and modifying the current execution context, nor a C API for such access. It currently assumes that ``snapshot()``, ``swap()`` and ``Variable`` are the only public means of accessing context information. It may be desirable to offer finer-grained APIs for use by more advanced uses (such as creating an API for management of proxies). And it may be desirable to have a C API for use by Python extensions that wish convenient access to context variables.
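To make the proposal above concrete, here is a toy version of the proposed ``context`` module (hypothetical code, written in present-day Python; it uses the simplest "dictionary keyed by thread id" strategy from the Implementation Details section and omits both the copy-on-write optimization and the ``__call__`` context-manager support):

```python
import threading

# Maps thread id -> that thread's execution context dictionary.
_contexts = {}

def _current():
    """Return the execution context dict for the calling thread."""
    return _contexts.setdefault(threading.get_ident(), {})

def snapshot():
    """Return a snapshot (shallow copy) of the current execution context."""
    return dict(_current())

def swap(ctx):
    """Install `ctx` as the current context, returning the old context."""
    old = dict(_current())
    _contexts[threading.get_ident()] = dict(ctx)
    return old

class Variable(object):
    """A context variable; uses itself as a key into the context dict."""
    def get(self, default=None):
        return _current().get(self, default)

    def set(self, value):
        _current()[self] = value

    def unset(self):
        _current().pop(self, None)
```

A scheduler built on this sketch would save and restore task state exactly as the Specification describes, e.g. ``last.state = swap(next.state)``.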
Rationale
=========

Different libraries have different uses for maintaining a "current" state, be it global or local to a specific thread or task. There is currently no way for task-management code to find and switch all of these "current" states. And even if it could, task switching performance would degrade linearly as new libraries were added.

One possible alternative approach to this proposal would be for explicit task objects to exist, and to provide a way to give them identities, so that libraries could instead store their own state as a property of the task, rather than storing their state in a task-specific mapping. This offers similar potential performance to a copy-on-write strategy, but would use more memory than this proposal when only one task is involved. (Because each variable would have a dictionary mapping from task to the variable's value, but in this proposal there is simply a single dictionary for the task.)

Some languages offer "dynamically scoped" variables that are somewhat similar in behavior to the context variables proposed by this PEP. The principal differences are that:

1. Context variables are objects used to obtain or save a value, rather than being a syntactic construct of the language.

2. PEP 343 allows for *controlled* manipulation of context variables, helping to prevent "duelling libraries" from changing state on each other. Also, a library can potentially ``snapshot()`` a desired state at startup, and use ``swap()`` to restore that state on re-entry. (And could even define a simple decorator to wrap its entry points to ensure this.)

3. The PEP author is not aware of any language that explicitly offers coroutine-scoped variables, but presumes that they can be modelled with monads or continuations in functional languages like Haskell. (And I only mention this to forestall the otherwise-inevitable response from fans of such techniques, pointing out that it's possible.)
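The task-switching pattern at the heart of the Specification can be demonstrated with a self-contained toy trampoline (hypothetical names — ``scheduler``, ``worker``, and a bare module-level ``current`` dict standing in for the thread's execution context — none of which are part of the proposal):

```python
# `current` stands in for the thread's execution context; the scheduler
# swaps each task's private context in before resuming it and saves the
# context back when the task yields.
current = {}

def swap(ctx):
    """Install `ctx` as the current context, returning the old one."""
    global current
    old, current = current, ctx
    return old

def scheduler(tasks):
    """Round-robin trampoline; collects everything the tasks yield."""
    results = []
    states = dict((id(t), {}) for t in tasks)  # per-task context dicts
    queue = list(tasks)
    while queue:
        task = queue.pop(0)
        old = swap(states[id(task)])      # resume with the task's own context
        try:
            results.append(next(task))
            queue.append(task)            # still running: requeue it
        except StopIteration:
            pass
        finally:
            states[id(task)] = swap(old)  # suspend: save the context back
    return results

def worker(name, n):
    """A task that keeps 'task-local' state in the current context."""
    for i in range(n):
        current[name] = i                 # only this task's dict is mutated
        yield (name, dict(current))       # snapshot of what this task sees
```

Running ``scheduler([worker("a", 2), worker("b", 2)])`` interleaves the two generators, yet each one only ever sees its own variable — which is precisely the isolation the PEP's ``swap()`` is meant to provide.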
Reference Implementation
========================

The author has prototyped an implementation with somewhat fancier features than shown here, but prefers not to publish it until the basic features and choices of optional functionality have been discussed on Python-Dev.

Copyright
=========

This document has been placed in the public domain.

..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   End:

From jcarlson at uci.edu Thu Oct 20 04:30:54 2005 From: jcarlson at uci.edu (Josiah Carlson) Date: Wed, 19 Oct 2005 19:30:54 -0700 Subject: [Python-Dev] Pre-PEP: Task-local variables In-Reply-To: <5.1.1.6.0.20051019213825.01fafa00@mail.telecommunity.com> References: <5.1.1.6.0.20051019213825.01fafa00@mail.telecommunity.com> Message-ID: <20051019191804.386C.JCARLSON@uci.edu>

"Phillip J. Eby" wrote: > For efficiency's sake, however, CPython could simply store the > execution context dictionary in its "thread state" structure, creating > an empty dictionary at thread initialization time. This would make it > somewhat easier to offer a C API for access to context variables, > especially where efficiency of access is desirable. But the proposal > does not depend on this.

What about a situation in which coroutines are handled by multiple threads? Any time a coroutine is passed from one thread to another, it would lose its state.

While I agree with the obvious "don't do that" response, I don't believe that the proposal will actually go very far in preventing real problems when using context managers and generators or coroutines. Why? How much task state is going to be monitored/saved? Just sys? Perhaps sys and the module in which a coroutine was defined? Eventually you will have someone who says, "I need Python to be saving and restoring the state of the entire interpreter so that I can have a per-user execution environment that cannot be corrupted by another user." But how much farther out is that?
- Josiah

From pje at telecommunity.com Thu Oct 20 06:13:29 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Thu, 20 Oct 2005 00:13:29 -0400 Subject: [Python-Dev] Pre-PEP: Task-local variables In-Reply-To: <20051019191804.386C.JCARLSON@uci.edu> References: <5.1.1.6.0.20051019213825.01fafa00@mail.telecommunity.com> <5.1.1.6.0.20051019213825.01fafa00@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20051020000801.01fafe18@mail.telecommunity.com>

At 07:30 PM 10/19/2005 -0700, Josiah Carlson wrote: >What about a situation in which coroutines are handled by multiple >threads? Any time a coroutine is passed from one thread to another, it >would lose its state.

It's the responsibility of a coroutine scheduler to take a snapshot() when a task is suspended, and to swap() it in when resumed. So it doesn't matter that you've changed what thread you're running in, as long as you keep the context with the coroutine that "owns" it.

>While I agree with the obvious "don't do that" response, I don't believe >that the proposal will actually go very far in preventing real problems >when using context managers and generators or coroutines. Why? How much >task state is going to be monitored/saved? Just sys? Perhaps sys and >the module in which a coroutine was defined?

As I mentioned in the PEP, I don't think that we would bother having Python-defined variables be context-specific until Python 3.0. This is mainly intended for the kinds of things described in the proposal: ZODB current transaction, current database connection, decimal context, etc. Basically, anything that you'd have a thread-local for now, and indeed most anything that you'd use a global variable and 'with:' for.

> Eventually you will have >someone who says, "I need Python to be saving and restoring the state of >the entire interpreter so that I can have a per-user execution >environment that cannot be corrupted by another user." But how much >farther out is that?

I don't see how that's even related.
This is simply a replacement for thread-local variables that allows you to also be compatible with "lightweight" (coroutine-based) threads. From jcarlson at uci.edu Thu Oct 20 07:29:18 2005 From: jcarlson at uci.edu (Josiah Carlson) Date: Wed, 19 Oct 2005 22:29:18 -0700 Subject: [Python-Dev] Pre-PEP: Task-local variables In-Reply-To: <5.1.1.6.0.20051020000801.01fafe18@mail.telecommunity.com> References: <20051019191804.386C.JCARLSON@uci.edu> <5.1.1.6.0.20051020000801.01fafe18@mail.telecommunity.com> Message-ID: <20051019221804.3880.JCARLSON@uci.edu> "Phillip J. Eby" wrote: > It's the responsibility of a coroutine scheduler to take a snapshot() when > a task is suspended, and to swap() it in when resumed. So it doesn't > matter that you've changed what thread you're running in, as long as you > keep the context with the coroutine that "owns" it. > > As I mentioned in the PEP, I don't think that we would bother having > Python-defined variables be context-specific until Python 3.0. This is > mainly intended for the kinds of things described in the proposal: ZODB > current transaction, current database connection, decimal context, > etc. Basically, anything that you'd have a thread-local for now, and > indeed most anything that you'd use a global variable and 'with:' for. > > I don't see how that's even related. This is simply a replacement for > thread-local variables that allows you to also be compatible with > "lightweight" (coroutine-based) threads. I just re-read the proposal with your clarifications in mind. Looks good. +1 - Josiah From michele.simionato at gmail.com Thu Oct 20 09:35:17 2005 From: michele.simionato at gmail.com (Michele Simionato) Date: Thu, 20 Oct 2005 07:35:17 +0000 Subject: [Python-Dev] Definining properties - a use case for class decorators? 
In-Reply-To: <17238.40158.735826.504410@montanaro.dyndns.org> References: <43544CC1.5050204@canterbury.ac.nz> <1129601328.9405.13.camel@geddy.wooz.org> <4edc17eb0510190151v6a078f54n2135b8d8386c796a@mail.gmail.com> <5.1.1.6.0.20051019124840.01fa73b8@mail.telecommunity.com> <17238.40158.735826.504410@montanaro.dyndns.org> Message-ID: <4edc17eb0510200035u370b57f9ub1d66b4e99d1be62@mail.gmail.com>

As others explained, the syntax would not work for functions (and it is not intended to). A possible use case I had in mind is to define inlined modules to be used as bunches of attributes. For instance, I could define a module as

    module m():
        a = 1
        b = 2

where 'module' would be the following function:

    def module(name, args, dic):
        mod = types.ModuleType(name, dic.get('__doc__'))
        for k in dic:
            setattr(mod, k, dic[k])
        return mod

From ncoghlan at gmail.com Thu Oct 20 14:40:07 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 20 Oct 2005 22:40:07 +1000 Subject: [Python-Dev] Pre-PEP: Task-local variables In-Reply-To: <5.1.1.6.0.20051019213825.01fafa00@mail.telecommunity.com> References: <5.1.1.6.0.20051019213825.01fafa00@mail.telecommunity.com> Message-ID: <43579027.6040007@gmail.com>

Phillip J. Eby wrote: > This is still rather rough, but I figured it's easier to let everybody fill > in the remaining gaps by arguments than it is for me to pick a position I > like and try to convince everybody else that it's right. :) Your feedback > is requested and welcome.

I think you're actually highlighting a bigger issue with the behaviour of "yield" inside a "with" block, and working around it rather than fixing the fundamental problem.

The issue with "yield" causing changes to leak to outer scopes isn't limited to coroutine style usage - it can happen with generator-iterators, too. What's missing is a general way of saying "suspend this context temporarily, and resume it when done".

An example use-case not involving 'yield' at all is the "asynchronise" functionality.
A generator-iterator that works in a high precision decimal.Context(), but wants to return values from inside a loop using normal precision is another example not involving coroutines.

The basic idea would be to provide syntax that allows a with statement to be "suspended", along the lines of:

    with EXPR as VAR:
        for VAR2 in EXPR2:
            without:
                BLOCK

To mean:

    abc = (EXPR).__with__()
    exc = (None, None, None)
    VAR = abc.__enter__()
    try:
        for VAR2 in EXPR2:
            try:
                abc.__suspend__()
                try:
                    BLOCK
                finally:
                    abc.__resume__()
            except:
                exc = sys.exc_info()
                raise
    finally:
        abc.__exit__(*exc)

To keep things simple, just as 'break' and 'continue' work only on the innermost loop, 'without' would only apply to the innermost 'with' statement.

Locks, for example, could support this via:

    class Lock(object):
        def __with__(self):
            return self
        def __enter__(self):
            self.acquire()
            return self
        def __resume__(self):
            self.acquire()
        def __suspend__(self):
            self.release()
        def __exit__(self, *exc):
            self.release()

(Note that there's a potential problem if the call to acquire() in __resume__ fails, but that's no different than if this same dance is done manually).

Cheers, Nick.

P.S. Here's a different generator wrapper that could be used to create a generator-based "suspendable context" that can be invoked multiple times through use of the "without" keyword.
If applied to the PEP 343 decimal.Context() __with__ method example, it would automatically restore the original context for the duration of the "without" block: class SuspendableGeneratorContext(object): def __init__(self, func, args, kwds): self.gen = None self.func = func self.args = args self.kwds = kwds def __with__(self): return self def __enter__(self): if self.gen is not None: raise RuntimeError("context already in use") gen = self.func(*args, **kwds) try: result = gen.next() except StopIteration: raise RuntimeError("generator didn't yield") self.gen = gen return result def __resume__(self): if self.gen is None: raise RuntimeError("context not suspended") gen = self.func(*args, **kwds) try: gen.next() except StopIteration: raise RuntimeError("generator didn't yield") self.gen = gen def __suspend__(self): try: self.gen.next() except StopIteration: return else: raise RuntimeError("generator didn't stop") def __exit__(self, type, value, traceback): gen = self.gen self.gen = None if type is None: try: gen.next() except StopIteration: return else: raise RuntimeError("generator didn't stop") else: try: gen.throw(type, value, traceback) except (type, StopIteration): return else: raise RuntimeError("generator caught exception") def suspendable_context(func): def helper(*args, **kwds): return SuspendableGeneratorContext(func, args, kwds) return helper -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.blogspot.com From ncoghlan at gmail.com Thu Oct 20 15:25:48 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 20 Oct 2005 23:25:48 +1000 Subject: [Python-Dev] Pre-PEP: Task-local variables In-Reply-To: <43579027.6040007@gmail.com> References: <5.1.1.6.0.20051019213825.01fafa00@mail.telecommunity.com> <43579027.6040007@gmail.com> Message-ID: <43579ADC.80006@gmail.com> Nick Coghlan wrote: > P.S. 
Here's a different generator wrapper that could be used to create a > generator-based "suspendable context" that can be invoked multiple times > through use of the "without" keyword. If applied to the PEP 343 > decimal.Context() __with__ method example, it would automatically restore the > original context for the duration of the "without" block.

I realised this isn't actually true for the version I posted, and the __with__ method example in the PEP - changes made to the decimal context in the "without" block would be visible after the "with" block. Consider the following:

    def iter_sin(iterable):
        # Point A
        with decimal.getcontext() as ctx:
            ctx.prec += 10
            for r in iterable:
                y = sin(r)  # Very high precision during calculation
                without:
                    yield +y  # Interim results have normal precision
        # Point B

What I posted would essentially work for this example, but there isn't a guarantee that the context at Point A is the same as the context at Point B - the reason is that the thread-local context may be changed within the without block (i.e., external to this iterator), and that changed context would get saved when the decimal.Context context manager was resumed.

To fix that, the arguments to StopIteration in __suspend__ would need to be used as arguments when the generator is recreated in __resume__.
That is, the context manager would look like:

    @suspendable
    def __with__(self, oldctx=None):  # Accept argument in __resume__
        newctx = self.copy()
        if oldctx is None:
            oldctx = decimal.getcontext()
        decimal.setcontext(newctx)
        try:
            yield newctx
        finally:
            decimal.setcontext(oldctx)
        raise StopIteration(oldctx)  # Return result in __suspend__

(This might look cleaner if "return arg" in a generator was equivalent to "raise StopIteration(arg)" as previously discussed)

And (including reversion to 'one-use-only' status) the wrapper class would look like:

    class SuspendableGeneratorContext(object):

        def __init__(self, func, args, kwds):
            self.gen = func(*args, **kwds)
            self.func = func
            self.args = None

        def __with__(self):
            return self

        def __enter__(self):
            try:
                return self.gen.next()
            except StopIteration:
                raise RuntimeError("generator didn't yield")

        def __suspend__(self):
            try:
                self.gen.next()
            except StopIteration, ex:
                # Use the return value as the arguments for resumption
                self.args = ex.args
                return
            else:
                raise RuntimeError("generator didn't stop")

        def __resume__(self):
            if self.args is None:
                raise RuntimeError("context not suspended")
            self.gen = self.func(*self.args)
            try:
                self.gen.next()
            except StopIteration:
                raise RuntimeError("generator didn't yield")

        def __exit__(self, type, value, traceback):
            if type is None:
                try:
                    self.gen.next()
                except StopIteration:
                    return
                else:
                    raise RuntimeError("generator didn't stop")
            else:
                try:
                    self.gen.throw(type, value, traceback)
                except (type, StopIteration):
                    return
                else:
                    raise RuntimeError("generator caught exception")

    def suspendable_context(func):
        def helper(*args, **kwds):
            return SuspendableGeneratorContext(func, args, kwds)
        return helper

Cheers, Nick.
--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia
---------------------------------------------------------------
http://boredomandlaziness.blogspot.com

From jimjjewett at gmail.com Thu Oct 20 15:48:06 2005
From: jimjjewett at gmail.com (Jim Jewett)
Date: Thu, 20 Oct 2005 09:48:06 -0400
Subject: [Python-Dev] Early PEP draft (For Python 3000?)
Message-ID:

I'll try to be more explicit; if Josiah and I are talking past each other, then the explanation was clearly not yet mature.

(In http://mail.python.org/pipermail/python-dev/2005-October/057251.html) Eyal Lotem suggested:

> Name: Attribute access for all namespaces ...
> global x ; x = 1
> Replaced by:
> module.x = 1

I responded:

> Attribute access as an option would be nice, but might be slower.
> Also note that one common use for a __dict__ is that you don't
> know what keys are available; meeting this use case with
> attribute access would require some extra machinery, such as
> an iterator over attributes.

Josiah Carlson responded (http://mail.python.org/pipermail/python-dev/2005-October/057451.html)

> This particular use case is easily handled. Put the following
> once at the top of the module...
>
> module = __import__(__name__)
>
> Then one can access (though perhaps not quickly) the module-level
> variables for that module. To access attributes, it is a quick scan
> through module.__dict__, dir(), or vars().

My understanding of the request was that all namespaces -- including those returned by globals() and locals() -- should be used with attribute access *instead* of __dict__ access.

    module.x is certainly nicer than module.__dict__['x']

Even with globals() and locals(), I usually *wish* I could use attribute access, to avoid creating a string when what I really want is a name. The catch is that sometimes I don't know the names in advance, and have to iterate over the dict -- as you suggested. That works fine today; my question is what to do instead if __dict__ is unavailable.
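Josiah's self-import trick is easy to try; `sys.modules[__name__]` is used here in place of `__import__(__name__)` because it has the same effect for an already-imported module and also works when run as a script:

```python
import sys

# Grab a reference to the current module; attribute access on it then
# reads and writes the module's globals.
module = sys.modules[__name__]

x = 1
module.x = 2          # rebinds the global x via attribute access

# When the names aren't known in advance, scan them via dir()/getattr()
names = [n for n in dir(module) if not n.startswith('_')]
```

After the rebinding, `x` and `module.x` are the same binding, and the dir() scan covers the "don't know the keys" use case without touching __dict__.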
Note that vars(obj) itself conceptually returns a NameSpace rather than a dict, so that isn't the answer. My inclination is to add an __iterattr__ that returns (attribute name, attribute value) pairs, and to make this the default iterator for NameSpace objects. Whether the good of (1) not needing to mess with __dict__, and (2) not needing to pretend that strings are names is enough to justify an extra magic method ... I'm not as sure. -jJ From guido at python.org Thu Oct 20 17:57:49 2005 From: guido at python.org (Guido van Rossum) Date: Thu, 20 Oct 2005 08:57:49 -0700 Subject: [Python-Dev] Pre-PEP: Task-local variables In-Reply-To: <43579ADC.80006@gmail.com> References: <5.1.1.6.0.20051019213825.01fafa00@mail.telecommunity.com> <43579027.6040007@gmail.com> <43579ADC.80006@gmail.com> Message-ID: Whoa, folks! Can I ask the gentlemen to curb their enthusiasm? PEP 343 is still (back) on the drawing table, PEP 342 has barely been implemented (did it survive the AST-branch merge?), and already you are talking about adding more stuff. Please put on the brakes! If there's anything this discussion shows me, it's that implicit contexts are a dangerous concept, and should be treated with much skepticism. I would recommend that if you find yourself needing context data while programming an asynchronous application using generator trampolines simulating coroutines, you ought to refactor the app so that the context is explicitly passed along rather than grabbed implicitly. Zope doesn't *require* you to get the context from a thread-local, and I presume that SQLObject also has a way to explicitly use a specific connection (I'm assuming cursors and similar data structures have an explicit reference to the connection used to create them). Heck, even Decimal allows you to invoke every operation as a method on a decimal.Context object! I'd rather not tie implicit contexts to the with statement, conceptually. Most uses of the with-statement are purely local (e.g. 
"with open(fn) as f"), or don't apply to coroutines (e.g. "with my_lock"). I'd say that "with redirect_stdout(f)" also doesn't apply -- we already know it doesn't work in threaded applications, and that restriction is easily and logically extended to coroutines. If you're writing a trampoline for an app that needs to modify decimal contexts, the decimal module already provides the APIs for explicitly saving and restoring contexts. I know that somewhere in the proto-PEP Phillip argues that the context API needs to be made a part of the standard library so that his trampoline can efficiently swap implicit contexts required by arbitrary standard and third-party library code. My response to that is that library code (whether standard or third-party) should not depend on implicit context unless it assumes it can assume complete control over the application. (That rules out pretty much everything except Zope, which is fine with me. :-) Also, Nick wants the name 'context' for PEP-343 style context managers. I think it's overloading too much to use the same word for per-thread or per-coroutine context. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From jcarlson at uci.edu Thu Oct 20 18:48:43 2005 From: jcarlson at uci.edu (Josiah Carlson) Date: Thu, 20 Oct 2005 09:48:43 -0700 Subject: [Python-Dev] Early PEP draft (For Python 3000?) In-Reply-To: References: Message-ID: <20051020093601.3889.JCARLSON@uci.edu> Jim Jewett wrote: > I'll try to be more explicit; if Josiah and I are talking past each > other, than the explanation was clearly not yet mature. > > (In http://mail.python.org/pipermail/python-dev/2005-October/057251.html) > Eyal Lotem suggested: > > > Name: Attribute access for all namespaces ... > > > global x ; x = 1 > > Replaced by: > > module.x = 1 > > I responded: > > Attribute access as an option would be nice, but might be slower. 
> > Also note that one common use for a __dict__ is that you don't
> > know what keys are available; meeting this use case with
> > attribute access would require some extra machinery, such as
> > an iterator over attributes.
>
> Josiah Carlson responded
> (http://mail.python.org/pipermail/python-dev/2005-October/057451.html)
>
> > This particular use case is easily handled. Put the following
> > once at the top of the module...
> >
> > module = __import__(__name__)
> >
> > Then one can access (though perhaps not quickly) the module-level
> > variables for that module. To access attributes, it is a quick scan
> > through module.__dict__, dir(), or vars().
>
> My understanding of the request was that all namespaces --
> including those returned by globals() and locals() -- should
> be used with attribute access *instead* of __dict__ access.

Yeah, I missed the transition from arbitrary stack frame access to strictly global and local scope attribute access.

> module.x is certainly nicer than module.__dict__['x']
>
> Even with globals() and locals(), I usually *wish* I could
> use attribute access, to avoid creating a string when what
> I really want is a name.

Indeed.

> The catch is that sometimes I don't know the names in
> advance, and have to iterate over the dict -- as you
> suggested. That works fine today; my question is what
> to do instead if __dict__ is unavailable.
>
> Note that vars(obj) itself conceptually returns a NameSpace
> rather than a dict, so that isn't the answer.

>>> help(vars)
vars(...)
    vars([object]) -> dictionary

    Without arguments, equivalent to locals().
    With an argument, equivalent to object.__dict__.

When an object lacks a dictionary, dir() works just fine.

>>> help(dir)
Help on built-in function dir:

dir(...)
    dir([object]) -> list of strings

    Return an alphabetized list of names comprising (some of) the attributes
    of the given object, and of attributes reachable from it:

    No argument:  the names in the current scope.
    Module object:  the module attributes.
    Type or class object:  its attributes, and recursively the attributes of
        its bases.
    Otherwise:  its attributes, its class's attributes, and recursively the
        attributes of its class's base classes.

> My inclination is to add an __iterattr__ that returns
> (attribute name, attribute value) pairs, and to make this the
> default iterator for NameSpace objects.

    def __iterattr__(obj):
        for i in dir(obj):
            yield i, getattr(obj, i)

> Whether the good of
> (1) not needing to mess with __dict__, and
> (2) not needing to pretend that strings are names
> is enough to justify an extra magic method ... I'm not as sure.

I don't know, but leaning towards no; dir() works pretty well. Yeah, you have to use getattr(), but there are worse things.

- Josiah

From dalcinl at gmail.com Thu Oct 20 19:04:03 2005
From: dalcinl at gmail.com (Lisandro Dalcin)
Date: Thu, 20 Oct 2005 14:04:03 -0300
Subject: [Python-Dev] enumerate with a start index
In-Reply-To: <8393fff0510191354s3676682dk58a4db65edee1fee@mail.gmail.com>
References: <8393fff0510191354s3676682dk58a4db65edee1fee@mail.gmail.com>
Message-ID:

On 10/19/05, Martin Blais wrote:
> Just wondering, would anyone think of it as a good idea if the
> enumerate() builtin could accept a "start" argument?

And why not an additional "step" argument? Anyway, perhaps all this can be done with a 'xrange' object...

--
Lisandro Dalcín
---------------
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594

From pje at telecommunity.com Thu Oct 20 19:14:08 2005
From: pje at telecommunity.com (Phillip J.
Eby)
Date: Thu, 20 Oct 2005 13:14:08 -0400
Subject: [Python-Dev] Pre-PEP: Task-local variables
In-Reply-To: <43579027.6040007@gmail.com>
References: <5.1.1.6.0.20051019213825.01fafa00@mail.telecommunity.com> <5.1.1.6.0.20051019213825.01fafa00@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20051020131024.02033d58@mail.telecommunity.com>

At 10:40 PM 10/20/2005 +1000, Nick Coghlan wrote:
>Phillip J. Eby wrote:
> > This is still rather rough, but I figured it's easier to let everybody fill
> > in the remaining gaps by arguments than it is for me to pick a position I
> > like and try to convince everybody else that it's right. :) Your feedback
> > is requested and welcome.
>
>I think you're actually highlighting a bigger issue with the behaviour of
>"yield" inside a "with" block, and working around it rather than fixing the
>fundamental problem.
>
>The issue with "yield" causing changes to leak to outer scopes isn't limited
>to coroutine style usage - it can happen with generator-iterators, too.
>
>What's missing is a general way of saying "suspend this context temporarily,
>and resume it when done".

Actually, it's fairly simple to write a generator decorator using context.swap() that saves and restores the current execution state around next()/send()/throw() calls, if you prefer it to be the generator's responsibility to maintain such context.

From michel at cignex.com Thu Oct 20 00:14:31 2005
From: michel at cignex.com (Michel Pelletier)
Date: Wed, 19 Oct 2005 15:14:31 -0700
Subject: [Python-Dev] Coroutines, generators, function calling
In-Reply-To: <004801c5d3ec$e29b5360$6402a8c0@arkdesktop>
References: <1129643229.12510.37.camel@localhost> <004801c5d3ec$e29b5360$6402a8c0@arkdesktop>
Message-ID: <4356C547.8020402@cignex.com>

Andrew Koenig wrote:
>> Sure, that would work. Or even this, if the scheduler would
>> automatically recognize generator objects being yielded and so would run
>> the nested coroutine until finish:
>
> This idea has been discussed before.
> I think the problem with recognizing generators as the subject of "yield"
> statements is that then you can't yield a generator even if you want to.
>
> The best syntax I can think of without adding a new keyword looks like this:
>
>     yield from x
>
> which would be equivalent to
>
>     for i in x:
>         yield i

My eyes really like the syntax, but I wonder about its usefulness. In rdflib, particularly here: http://svn.rdflib.net/trunk/rdflib/backends/IOMemory.py We yield values from inside for loops all over the place, but the yielded value is very rarely just the index value (only 1 of 14 yields), but something calculated from the index value, so the new syntax would not be useful, unless it was something that provided access to the index item as a variable, like:

    yield foo(i) for i in x

which barely saves you anything (a colon, a newline, and an indent). (hey wait, isn't that a generator comprehension? Haven't really encountered those yet). Of course rdflib could be the minority case and most folks who yield in loops are yielding only the index value directly. off to read the generator comprehension docs...

-Michel

From jeremy at alum.mit.edu Thu Oct 20 22:04:27 2005
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Thu, 20 Oct 2005 16:04:27 -0400
Subject: [Python-Dev] Pre-PEP: Task-local variables
In-Reply-To:
References: <5.1.1.6.0.20051019213825.01fafa00@mail.telecommunity.com> <43579027.6040007@gmail.com> <43579ADC.80006@gmail.com>
Message-ID:

On 10/20/05, Guido van Rossum wrote:
> Whoa, folks! Can I ask the gentlemen to curb their enthusiasm?
>
> PEP 343 is still (back) on the drawing table, PEP 342 has barely been
> implemented (did it survive the AST-branch merge?), and already you
> are talking about adding more stuff. Please put on the brakes!

Yes. PEP 342 survived the merge of the AST branch. I wonder, though, if the Grammar for it can be simplified at all. I haven't read the PEP closely, but I found the changes a little hard to follow.
That is, why was the grammar changed the way it was -- or how would you describe the intent of the changes? It was hard when doing the transformation in ast.c to be sure that the intent of the changes was honored. On the other hand, it seemed to have extensive tests and they all pass. Jeremy From pje at telecommunity.com Thu Oct 20 22:29:12 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Thu, 20 Oct 2005 16:29:12 -0400 Subject: [Python-Dev] Pre-PEP: Task-local variables In-Reply-To: References: <5.1.1.6.0.20051019213825.01fafa00@mail.telecommunity.com> <43579027.6040007@gmail.com> <43579ADC.80006@gmail.com> Message-ID: <5.1.1.6.0.20051020162549.01faedf0@mail.telecommunity.com> At 04:04 PM 10/20/2005 -0400, Jeremy Hylton wrote: >On 10/20/05, Guido van Rossum wrote: > > Whoa, folks! Can I ask the gentlemen to curb their enthusiasm? > > > > PEP 343 is still (back) on the drawing table, PEP 342 has barely been > > implemented (did it survive the AST-branch merge?), and already you > > are talking about adding more stuff. Please put on the brakes! > >Yes. PEP 342 survived the merge of the AST branch. I wonder, though, >if the Grammar for it can be simplified at all. I haven't read the >PEP closely, but I found the changes a little hard to follow. That >is, why was the grammar changed the way it was -- or how would you >describe the intent of the changes? The intent was to make it so that '(yield optional_expr)' always works, and also that [lvalue =] yield optional_expr works. If you can find another way to hack the grammar so that both of 'em work, it's certainly okay by me. The changes I made were just the simplest things I could figure out to do. I seem to recall that the hard part was the need for 'yield expr,expr' to be interpreted as '(yield expr,expr)', not '(yield expr),expr', for backward compatibility reasons. 
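The precedence Phillip describes is easy to confirm empirically (a minimal sketch):

```python
def pairs():
    # 'yield 1, 2' parses as 'yield (1, 2)', not '(yield 1), 2'
    yield 1, 2

first = next(pairs())   # a single tuple comes out, not two values
```

So a bare `yield expr, expr` statement yields one tuple, which is the backward-compatible behaviour the grammar change had to preserve.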
From tzot at mediconsa.com Thu Oct 20 23:08:12 2005
From: tzot at mediconsa.com (Christos Georgiou)
Date: Fri, 21 Oct 2005 00:08:12 +0300
Subject: [Python-Dev] list splicing
References: <432E4BC9.1020100@canterbury.ac.nz>
Message-ID:

"Greg Ewing" wrote in message news:432E4BC9.1020100 at canterbury.ac.nz...
> Karl Chen wrote:
>> Hi, has anybody considered adding something like this:
>> a = [1, 2]
>> [ 'x', *a, 'y']
>>
>> as syntactic sugar for
>> a = [1, 2]
>> [ 'x' ] + a + [ 'y' ].
>
> You can write that as
> a = [1, 2]
> a[1:1] = a

I'm sure you meant to write:

    a = [1, 2]
    b = ['x', 'y']
    b[1:1] = a

Occasional absence of mind makes other people feel useful!

PS actually one *can* write

    a = [1, 2]
    ['x', 'y'][1:1] = a

since this is not actually an assignment but rather syntactic sugar for a function call, but I don't know how one would use the modified list, since

    b = ['x','y'][1:1] = a

doesn't quite fulfill the initial requirement ;)

From pje at telecommunity.com Thu Oct 20 23:35:31 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu, 20 Oct 2005 17:35:31 -0400
Subject: [Python-Dev] Pre-PEP: Task-local variables
In-Reply-To:
References: <43579ADC.80006@gmail.com> <5.1.1.6.0.20051019213825.01fafa00@mail.telecommunity.com> <43579027.6040007@gmail.com> <43579ADC.80006@gmail.com>
Message-ID: <5.1.1.6.0.20051020163313.01faf660@mail.telecommunity.com>

At 08:57 AM 10/20/2005 -0700, Guido van Rossum wrote:
>Whoa, folks! Can I ask the gentlemen to curb their enthusiasm?
>
>PEP 343 is still (back) on the drawing table, PEP 342 has barely been
>implemented (did it survive the AST-branch merge?), and already you
>are talking about adding more stuff. Please put on the brakes!

Sorry. I thought that 343 was just getting a minor tune-up.
In the months since the discussion and approval (and implementation; Michael Hudson actually had a PEP 343 patch out there), I've been doing a lot of thinking about how they will be used in applications, and thought that it would be a good idea to promote people using task-specific variables in place of globals or thread-locals. The conventional wisdom is that global variables are bad, but the truth is that they're very attractive because they allow you to have one less thing to pass around and think about in every line of code. Without globals, you would sooner or later end up with every function taking twenty arguments to pass through states down to other code, or else trying to cram all this data into some kind of "context" object, which then won't work with code that doesn't know about *your* definition of what a context is. Globals are thus extremely attractive for practical software development. If they weren't so useful, it wouldn't be necessary to warn people not to use them, after all. :) The problem with globals, however, is that sometimes they need to be changed in a particular context. PEP 343 makes it safer to use globals because you can always offer a context manager that changes them temporarily, without having to hand-write a try-finally block. This will make it even *more* attractive to use globals, which is not a problem as long as the code has no multitasking of any sort. Of course, the multithreading scenario is usually fixed by using thread-locals. All I'm proposing is that we replace thread locals with task locals, and promote the use of task-local variables for managed contexts (such as the decimal context) *that would otherwise be a global or a thread-local variable*. This doesn't seem to me like a very big deal; just an encouragement for people to make their stuff easy to use with PEP 342 and 343. 
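The pattern Phillip refers to - a context manager that changes a global temporarily and restores it without a hand-written try-finally at each call site - can be sketched with contextlib (`VERBOSITY` and `verbosity` are made-up names for illustration):

```python
from contextlib import contextmanager

VERBOSITY = 0   # a hypothetical module-level setting

@contextmanager
def verbosity(level):
    # Temporarily rebind the global; the finally clause guarantees
    # restoration even if the body raises.
    global VERBOSITY
    saved, VERBOSITY = VERBOSITY, level
    try:
        yield
    finally:
        VERBOSITY = saved

with verbosity(2):
    inside = VERBOSITY   # the temporary value is visible here
after = VERBOSITY        # and the original value is back here
```

The one try-finally lives in the context manager, so every use site gets the restore-on-error behaviour for free - which is exactly what makes the globals safer to keep.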
By the way, I don't know if you do much with Java these days, but a big part of the whole J2EE fiasco and the rise of the so-called "lightweight containers" in Java has all been about how to manage implicit context so that you don't get stuck with either the inflexibility of globals or the deadweight of passing tons of parameters around. One of the big selling points of AspectJ is that it lets you implicitly funnel parameters from point A to point B without having to modify all the call signatures in between. In other words, its use is promoted for precisely the sort of thing that 'with' plus a task variable would be ideal for. As far as I can tell, 'with' plus a task variable is *much* easier to explain, use, and understand than an aspect-oriented programming tool is! (Especially from the "if the implementation is easy to explain, it may be a good idea" perspective.) >I know that somewhere in the proto-PEP Phillip argues that the context >API needs to be made a part of the standard library so that his >trampoline can efficiently swap implicit contexts required by >arbitrary standard and third-party library code. My response to that >is that library code (whether standard or third-party) should not >depend on implicit context unless it assumes it can assume complete >control over the application. I think maybe there's some confusion here, at least on my part. :) I see two ways to read your statement, one of which seems to be saying that we should get rid of the decimal context (because it doesn't have complete control over the application), and the other way of reading it doesn't seem connected to what I proposed. Anything that's a global variable is an "implicit context". Because of that, I spent considerable time and effort in PEAK trying to utterly stamp out global variables. *Everything* in PEAK has an explicit context. 
But that then becomes more of a pain to *use*, because you are now stuck with managing it, even if you cram it into a Zope-style acquisition tree so there's only one "context" to deal with. Plus, it assumes that everything the developer wants to do can be supplied by *one* framework, be it PEAK, Zope, or whatever, which is rarely the case but still forces framework developers to duplicate everybody else's stuff. In other words, I've come to realize that the path the major Python application frameworks have taken is not really Pythonic. A Pythonic framework shouldn't load you down with new management burdens and keep you from using other frameworks. It should make life easier, and make your code *more* interoperable, not less. Indeed, I've pretty much come to agreement with the part of the Python developer community that says Frameworks Are Evil. A primary source of this evil in the big three frameworks (PEAK, Twisted, and Zope) stems from their various approaches to dealing with this issue of context, which lack the simplicity of global (or task-local) variables.

So, the lesson I've taken from my attempt to make everything explicit is that what developers *really* want is to have global variables, just without the downsides of uncontrolled modifications, and inter-thread or inter-task pollution. Explicit isn't always better than implicit, because oftentimes the practicality of having implicit things is much more important than the purity of making them all explicit. Simple is better than complex, and task-local variables are *much* simpler than trying to make everything explicit.

>Also, Nick wants the name 'context' for PEP-343 style context
>managers. I think it's overloading too much to use the same word for
>per-thread or per-coroutine context.

Actually, I was the one who originally proposed the term "context manager", and it doesn't seem like a conflict to me. Indeed, I suggested in the pre-PEP that "@context.manager" might be where we could put the decorator.
The overload was intentional, to suggest that when creating a new context manager, it's worth considering whether the state should be kept in a context variable, rather than a global variable. The naming choice was for propaganda purposes, in other words. :) Anyway, I'll withdraw the proposal for now. We can always leave it out of 2.5, I can release an independent implementation, and then submit it for consideration again in the 2.6 timeframe. I just thought it would be a no-brainer to use task locals where thread locals are currently being used, and that's really all I was proposing we do as far as stdlib changes anyway. I was also hoping to get good input from Python-dev regarding some of the open issues, to try and build a consensus on them from the beginning. From tzot at mediconsa.com Thu Oct 20 23:51:10 2005 From: tzot at mediconsa.com (Christos Georgiou) Date: Fri, 21 Oct 2005 00:51:10 +0300 Subject: [Python-Dev] bool(iter([])) changed between 2.3 and 2.4 References: <001c01c5be3c$53130dc0$6522c797@oemcomputer> Message-ID: "Guido van Rossum" wrote in message news:ca471dc205092017071f2eb1e8 at mail.gmail.com... >> [Fred] >> > think iterators shouldn't have length at all: >> > they're *not* containers and shouldn't act that way. >> >> Some iterators can usefully report their length with the invariant: >> len(it) == len(list(it)). > >I still consider this an erroneous hypergeneralization of the concept >of iterators. Iterators should be pure iterators and not also act as >containers. Which other object type implements __len__ but not >__getitem__? 
Too late, and probably irrelevant by now; the answer though is set([1,2,3]) From guido at python.org Fri Oct 21 02:23:04 2005 From: guido at python.org (Guido van Rossum) Date: Thu, 20 Oct 2005 17:23:04 -0700 Subject: [Python-Dev] Pre-PEP: Task-local variables In-Reply-To: <5.1.1.6.0.20051020162549.01faedf0@mail.telecommunity.com> References: <5.1.1.6.0.20051019213825.01fafa00@mail.telecommunity.com> <43579027.6040007@gmail.com> <43579ADC.80006@gmail.com> <5.1.1.6.0.20051020162549.01faedf0@mail.telecommunity.com> Message-ID: On 10/20/05, Phillip J. Eby wrote: > At 04:04 PM 10/20/2005 -0400, Jeremy Hylton wrote: > >On 10/20/05, Guido van Rossum wrote: > > > Whoa, folks! Can I ask the gentlemen to curb their enthusiasm? > > > > > > PEP 343 is still (back) on the drawing table, PEP 342 has barely been > > > implemented (did it survive the AST-branch merge?), and already you > > > are talking about adding more stuff. Please put on the brakes! > > > >Yes. PEP 342 survived the merge of the AST branch. I wonder, though, > >if the Grammar for it can be simplified at all. I haven't read the > >PEP closely, but I found the changes a little hard to follow. That > >is, why was the grammar changed the way it was -- or how would you > >describe the intent of the changes? > > The intent was to make it so that '(yield optional_expr)' always works, and > also that [lvalue =] yield optional_expr works. If you can find another > way to hack the grammar so that both of 'em work, it's certainly okay by > me. The changes I made were just the simplest things I could figure out to do. Right. > I seem to recall that the hard part was the need for 'yield expr,expr' to > be interpreted as '(yield expr,expr)', not '(yield expr),expr', for > backward compatibility reasons. But only at the statement level. 
These should be errors IMO:

    foo(yield expr, expr)
    foo(expr, yield expr)
    foo(1 + yield expr)
    x = yield expr, expr
    x = expr, yield expr
    x = 1 + yield expr

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From anthony at interlink.com.au Fri Oct 21 04:02:11 2005
From: anthony at interlink.com.au (Anthony Baxter)
Date: Fri, 21 Oct 2005 12:02:11 +1000
Subject: [Python-Dev] AST branch is in?
Message-ID: <200510211202.12015.anthony@interlink.com.au>

So it looks like the AST branch has landed. Wooo! Well done to all who were involved - it seems like it's been a huge amount of work.

Could someone involved give a short email laying out what concrete (no pun intended) advantages this new compiler gives us? Does it just allow us to do new and interesting manipulations of the code during compilation? Cleaner, easier to maintain, or the like?

Anthony
--
Anthony Baxter
It's never too late to have a happy childhood.

From nnorwitz at gmail.com Fri Oct 21 04:32:56 2005
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Thu, 20 Oct 2005 19:32:56 -0700
Subject: [Python-Dev] AST branch is in?
In-Reply-To: <200510211202.12015.anthony@interlink.com.au>
References: <200510211202.12015.anthony@interlink.com.au>
Message-ID:

On 10/20/05, Anthony Baxter wrote:
>
> Could someone involved give a short email laying out what concrete (no
> pun intended) advantages this new compiler gives us? Does it just
> allow us to do new and interesting manipulations of the code during
> compilation? Cleaner, easier to maintain, or the like?

The Grammar is (was at one point at least) shared between Jython and would allow more tools to be able to share infrastructure. The idea is to eventually be able to have [JP]ython output the same AST to tools.

There is quite a bit of generated code based on the Grammar. So some stuff should be easier. Other stuff is just moved. You still need to convert from the AST to the byte code.
Hopefully it will be easier to do various sorts of optimization and general manipulation of an AST rather than what existed before. Only time will tell if we can achieve many of the benefits, so it would be good if people could review the code and see if things look more complex/complicated and suggest improvements. I'm not all that familiar with the structure, I'm more of a hopeful consumer of it.

HTH,
n

From simon.belak at hruska.si Fri Oct 21 04:34:02 2005
From: simon.belak at hruska.si (Simon Belak)
Date: Fri, 21 Oct 2005 04:34:02 +0200
Subject: [Python-Dev] A solution to the evils of static typing and interfaces?
Message-ID: <4358539A.7050901@hruska.si>

Hi,

I was thinking why not have a separate file for all the proposed optional meta-information (in particular interfaces, static types)? Something along the lines of IDLs in CORBA (with pythonic syntax, of course). This way most of the benefits are retained without "contaminating" the actual syntax (dare I be so pretentious to even hope making both sides happy?).

For the sole purpose of illustration, let meta-files have extension .pym and linking to source-files be name based:

    parrot.py
    parrot.pym
    (parrot.pyc)

With some utilities like a prototype generator (to and from meta-files) and a synchronization tool, time penalty on development for having two separate files could be kept within reason.

We could even go as far as introducing a syntax allowing custom meta-information to be added. For example something akin to decorators.

parrot.pym:

    @sharedinstance
    class Parrot:

        # Methods
        # note these are only prototypes so no semicolon or suite is needed
        @cache
        def playDead(a : int, b : int) -> None

        # Attributes
        @const
        name : str

where sharedinstance, cache and const are custom meta-information.

This opens up countless possibilities for third-party interpreter enhancements and/or optimisations by providing fully portable (as all meta-information are optional) language extensions.

P.S.
my sincerest apologies if I am reopening a can of worms here From guido at python.org Fri Oct 21 04:57:16 2005 From: guido at python.org (Guido van Rossum) Date: Thu, 20 Oct 2005 19:57:16 -0700 Subject: [Python-Dev] Pre-PEP: Task-local variables In-Reply-To: <5.1.1.6.0.20051020163313.01faf660@mail.telecommunity.com> References: <5.1.1.6.0.20051019213825.01fafa00@mail.telecommunity.com> <43579027.6040007@gmail.com> <43579ADC.80006@gmail.com> <5.1.1.6.0.20051020163313.01faf660@mail.telecommunity.com> Message-ID: On 10/20/05, Phillip J. Eby wrote: > At 08:57 AM 10/20/2005 -0700, Guido van Rossum wrote: > >Whoa, folks! Can I ask the gentlemen to curb their enthusiasm? > > > >PEP 343 is still (back) on the drawing table, PEP 342 has barely been > >implemented (did it survive the AST-branch merge?), and already you > >are talking about adding more stuff. Please put on the brakes! > > Sorry. I thought that 343 was just getting a minor tune-up. Maybe, but the issues on the table are naming issues -- is __with__ the right name, or should it be __context__? Should the decorator be applied implicitly? Should the decorator be called @context or @contextmanager? > In the months > since the discussion and approval (and implementation; Michael Hudson > actually had a PEP 343 patch out there), Which he described previously as "a hack" and apparently didn't feel comfortable checking in. At least some of it will have to be redone, (a) for the AST code, and (b) for the revised PEP. > I've been doing a lot of thinking > about how they will be used in applications, and thought that it would be a > good idea to promote people using task-specific variables in place of > globals or thread-locals. That's clear, yes. :-) I still find it unlikely that a lot of people will be using trampoline frameworks. You and Twisted, that's all I expect. 
> The conventional wisdom is that global variables are bad, but the truth is > that they're very attractive because they allow you to have one less thing > to pass around and think about in every line of code. Which doesn't make them less bad -- they're still there and perhaps more likely to trip you up when you least expect it. I think there's a lot of truth in that conventional wisdom. > Without globals, you > would sooner or later end up with every function taking twenty arguments to > pass through states down to other code, or else trying to cram all this > data into some kind of "context" object, which then won't work with code > that doesn't know about *your* definition of what a context is. Methinks you are exaggerating for effect. > Globals are thus extremely attractive for practical software > development. If they weren't so useful, it wouldn't be necessary to warn > people not to use them, after all. :) > > The problem with globals, however, is that sometimes they need to be > changed in a particular context. PEP 343 makes it safer to use globals > because you can always offer a context manager that changes them > temporarily, without having to hand-write a try-finally block. This will > make it even *more* attractive to use globals, which is not a problem as > long as the code has no multitasking of any sort. Hm. There are different kinds of globals. Most globals don't need to be context-managed at all, because they can safely be shared between threads, tasks or coroutines. Caches usually fall in this category (e.g. the compiled regex cache). A little locking is all it takes. The globals that need to be context-managed are the pernicious kind of which you can never have too few. :-) They aren't just accumulating global state, they are implicit parameters, thereby truly invoking the reasons why globals are frowned upon. > Of course, the multithreading scenario is usually fixed by using > thread-locals. 
All I'm proposing is that we replace thread locals with > task locals, and promote the use of task-local variables for managed > contexts (such as the decimal context) *that would otherwise be a global or > a thread-local variable*. This doesn't seem to me like a very big deal; > just an encouragement for people to make their stuff easy to use with PEP > 342 and 343. I'm all for encouraging people to make their stuff easy to use with these PEPs, and with multi-threading use. But IMO the best way to accomplish those goals is to refrain from global (or thread-local or task-local) context as much as possible, for example by passing along explicit context. The mere existence of a standard library module to make handling task-specific contexts easier sends the wrong signal; it suggests that it's a good pattern to use, which it isn't -- it's a last-resort pattern, when all other solutions fail. If it weren't for Python's operator overloading, the decimal module would have used explicit contexts (like the Java version); but since it would be really strange to have such a fundamental numeric type without the ability to use the conventional operator notation, we resorted to per-thread context. Even that doesn't always do the right thing -- handling decimal contexts is surprisingly subtle (as Nick can testify based on his experiences attempting to write a decimal context manager for the with-statement!). Yes, coroutines make it even subtler. But I haven't seen the use case yet for mixing coroutines with changes to decimal context settings; somehow it doesn't strike me as a likely use case (not that you can't construct one, so don't bother -- I can imagine it too, I just think YAGNI). 
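Guido's point about decimal's implicit per-thread context can be made concrete. The sketch below uses `decimal.localcontext()`, the with-statement helper that ultimately shipped in Python 2.5, to temporarily change the implicit context that operator overloading consults (illustrative, not from the original mail):

```python
from decimal import Decimal, getcontext, localcontext

getcontext().prec = 28            # the thread's implicit context

with localcontext() as ctx:       # temporarily install a copy
    ctx.prec = 5
    inside = Decimal(1) / Decimal(7)   # operators consult the implicit context

outside = Decimal(1) / Decimal(7)      # original 28-digit context is restored

print(inside)    # 0.14286
print(outside)   # 0.1428571428571428571428571429
```

The division operator has no slot for an explicit context argument, which is exactly why the implicit per-thread context exists.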
> By the way, I don't know if you do much with Java these days, but a big > part of the whole J2EE fiasco and the rise of the so-called "lightweight > containers" in Java has all been about how to manage implicit context so > that you don't get stuck with either the inflexibility of globals or the > deadweight of passing tons of parameters around. I have to trust your word on that; I'm using Tomcat and not liking it but overly long parameter lists or context management aren't on my list of gripes. I have no idea what a "lightweight container" is. It sounds (especially since you put it in scare quotes :-) like a typical Java understatement. > One of the big selling > points of AspectJ is that it lets you implicitly funnel parameters from > point A to point B without having to modify all the call signatures in > between. Again, I'll have to trust you on this. I've never tried AspectJ or any other aspect-oriented system. But frankly I believe the idea is overhyped -- there are a few example cases that everyone uses to show it off (persistence, thread-safety) but I'm not sure these warrant the weight of the solution. > In other words, its use is promoted for precisely the sort of > thing that 'with' plus a task variable would be ideal for. This I simply don't follow (except that you seem to agree with me that AspectJ is overkill :-). The with-statement is primarily useful for mandatory cleanup (or release) and for restoring temporary changes. Even if decimal contexts were always passed around explicitly, a with-statement around a block with temporarily increased precision or changed error handling would make sense. > As far as I can > tell, 'with' plus a task variable is *much* easier to explain, use, and > understand than an aspect-oriented programming tool is! (Especially from > the "if the implementation is easy to explain, it may be a good idea" > perspective.) And this may be a very good thing. 
But I still expect that the number of people who need these is a lot smaller than you think (since clearly *you* need it :-). > >I know that somewhere in the proto-PEP Phillip argues that the context > >API needs to be made a part of the standard library so that his > >trampoline can efficiently swap implicit contexts required by > >arbitrary standard and third-party library code. My response to that > >is that library code (whether standard or third-party) should not > >depend on implicit context unless it assumes it can assume complete > >control over the application. > > I think maybe there's some confusion here, at least on my part. :) I see > two ways to read your statement, one of which seems to be saying that we > should get rid of the decimal context (because it doesn't have complete > control over the application), and the other way of reading it doesn't seem > connected to what I proposed. I simply see decimal as the exception that proves the rule. > Anything that's a global variable is an "implicit context". See above for my quibbles with that. > Because of > that, I spent considerable time and effort in PEAK trying to utterly stamp > out global variables. *Everything* in PEAK has an explicit context. But > that then becomes more of a pain to *use*, because you are now stuck with > managing it, even if you cram it into a Zope-style acquisition tree so > there's only one "context" to deal with. Plus, it assumes that everything > the developer wants to do can be supplied by *one* framework, be it PEAK, > Zope, or whatever, which is rarely the case but still forces framework > developers to duplicate everybody else's stuff. Well, face it. Frameworks want to control the world. Multiple frameworks rarely cooperate until they somehow agree on a common ground. That usually doesn't happen until both frameworks are already mature, and then it's painful of course. But I don't see a solution -- that's just the nature of frameworks. 
> In other words, I've come to realize that the path the major Python
> application frameworks is not really Pythonic.

(Is there a missing word "take" after "frameworks"?)

> A Pythonic framework shouldn't load you down with new
> management burdens and keep you from using
> other frameworks. It should make life easier, and make your code *more*
> interoperable, not less. Indeed, I've pretty much come to agreement with
> the part of the Python developer community that says Frameworks Are
> Evil.

I would agree, yes. :-)

> A primary source of this evil in the big three frameworks (PEAK,
> Twisted, and Zope) stems from their various approaches to dealing with this
> issue of context, which lack the simplicity of global (or task-local)
> variables.

I think that's rather an exaggeration (again for effect?). They're frameworks, they want you to do everything in a way that reflects the framework's philosophy. Python, in its design philosophy, tries hard *not* to be a framework. (This sets it apart from Java, which is hostile to non-Java code.) Python tries to be helpful when you want to solve part of your problem using a different tool. It tries to work well even if Python is only a small part of your total solution. It tries to be agnostic of platform-specific frameworks, optionally working with them (e.g. fork and pipes on Unix) but not depending or relying on them. Even threads are quite optional to Python.

> So, the lesson I've taken from my attempt to make everything explicit is
> that what developers *really* want is to have global variables, just
> without the downsides of uncontrolled modifications, and inter-thread or
> inter-task pollution. Explicit isn't always better than implicit, because
> oftentimes the practicality of having implicit things is much more
> important than the purity of making them all explicit. Simple is better
> than complex, and task-local variables are *much* simpler than trying to
> make everything explicit.

Maybe.
But this may just be a case where you simply can't have your cake and eat it too. I expect that having 100 task-local variables would prove to be just as big a pain as 100 other forms of context, implicit or explicit.

> >Also, Nick wants the name 'context' for PEP-343 style context
> >managers. I think it's overloading too much to use the same word for
> >per-thread or per-coroutine context.
>
> Actually, I was the one who originally proposed the term "context manager",
> and it doesn't seem like a conflict to me. Indeed, I suggested in the
> pre-PEP that "@context.manager" might be where we could put the
> decorator. The overload was intentional, to suggest that when creating a
> new context manager, it's worth considering whether the state should be
> kept in a context variable, rather than a global variable. The naming
> choice was for propaganda purposes, in other words. :)

That may be, but I think it's confusing, since most of the popular uses of the with-statement will have nothing to do with task-locals.

> Anyway, I'll withdraw the proposal for now.

Thanks.

> We can always leave it out of
> 2.5, I can release an independent implementation, and then submit it for
> consideration again in the 2.6 timeframe.

That sounds like a much better plan than rushing into it now.

> I just thought it would be a
> no-brainer to use task locals where thread locals are currently being used,
> and that's really all I was proposing we do as far as stdlib changes
> anyway. I was also hoping to get good input from Python-dev regarding
> some of the open issues, to try and build a consensus on them from
> the beginning.

If you look at the code in decimal.py, it already has three different ways to handle contexts, depending on the Python version and whether it has threads. Adding task-locals would just complicate matters.

(Sorry for the long post -- there just wasn't anything you said that I felt could be left unquoted.
:-)

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Fri Oct 21 04:59:42 2005
From: guido at python.org (Guido van Rossum)
Date: Thu, 20 Oct 2005 19:59:42 -0700
Subject: [Python-Dev] AST branch is in?
In-Reply-To: <200510211202.12015.anthony@interlink.com.au>
References: <200510211202.12015.anthony@interlink.com.au>
Message-ID:

On 10/20/05, Anthony Baxter wrote:
> So it looks like the AST branch has landed. Wooo! Well done to all who
> were involved - it seems like it's been a huge amount of work.

Hear, hear. Great news! Thanks to Jeremy, Neil and all the others. I can't wait to check it out!

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From ark at acm.org  Fri Oct 21 06:58:28 2005
From: ark at acm.org (Andrew Koenig)
Date: Fri, 21 Oct 2005 00:58:28 -0400
Subject: [Python-Dev] Coroutines, generators, function calling
In-Reply-To: <4356C547.8020402@cignex.com>
Message-ID: <006101c5d5fc$16ddb2b0$6402a8c0@arkdesktop>

> so the new syntax would
> not be useful, unless it was something that provided access to the index
> item as a variable, like:
>
>     yield foo(i) for i in x
>
> which barely saves you anything (a colon, a newline, and an indent).

Not even that, because you can omit the newline and indent:

    for i in x: yield foo(i)

There's a bigger difference between

    for i in x:
        yield i

and

    yield from x

Moreover, I can imagine optimization opportunities for "yield from" that would not make sense in the context of comprehensions.

From nnorwitz at gmail.com  Fri Oct 21 07:00:16 2005
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Thu, 20 Oct 2005 22:00:16 -0700
Subject: [Python-Dev] Questionable AST wibbles
Message-ID:

Jeremy,

There are a bunch of mods from the AST branch that got integrated into head. Hopefully, by doing this on python-dev more people will get involved. I'll describe high level things first, but there will be a ton of details later on.
If people don't want to see this crap on python-dev, I can take this offline.

Highlevel overview of code size (rough decrease of 300 C source lines):
 * Python/compile.c    -2733 (was 6822 now 4089)
 * Python/Python-ast.c +2281 (new file)
 * Python/asdl.c       +92 (new file)
 * plus other minor mods

symtable.h has lots of changes to structs and APIs. Not sure what needs to be doc'ed.

I was very glad to see that ./python compileall.py Lib took virtually the same time before and after AST. Yeah! Unfortunately, I can't say the same for memory usage for running compileall:

    Before AST: [10120 refs]
    After AST:  [916096 refs]

I believe there aren't that many true memory leaks from running valgrind. Though there are likely some ref leaks. Most of this is probably stuff that we are just hanging on to that is not required. I will continue to run valgrind to find more problems.

A bunch of APIs changed and there is some additional name pollution. Since these are pretty internal APIs, I'm not sure that part is a big deal. I will try to find more name pollution and eliminate it by prefixing with Py.

One API change which I think was a mistake was _Py_Mangle() losing 2 parameters (I think this was how it was a long time ago). See typeobject.c, Python.h, compile.c.

pythonrun.h has a bunch of changes. I think a lot of the APIs changed, but there might be backwards-compatible macros. I'm not sure. I need to review closely.

Some #defines are history (I think they are in the enum now): TYPE_*.

code.h was added, but it mostly contains stuff from compile.h. Should we remove code.h and just put everything in compile.h? This will remove lots of little changes. code.h & compile.h are tightly coupled. If we keep them separate, I would like to see some other changes.
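The `_Py_Mangle()` function mentioned above implements private-name mangling for the compiler; its effect is visible from pure Python, which makes for an easy sanity check (an illustrative sketch, not part of the original mail):

```python
class C:
    def probe(self):
        # The compiler mangles __secret to _C__secret inside class C,
        # which is the transformation _Py_Mangle() computes at compile time.
        return self.__secret

# The mangled attribute name shows up in the compiled code's names.
assert "_C__secret" in C.probe.__code__.co_names

# Setting the mangled name is how the attribute is actually stored.
obj = C()
obj._C__secret = 42
print(obj.probe())   # 42
```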
This probably is not a big deal, but I was surprised by this change:

+++ test_repr.py	20 Oct 2005 19:59:24 -0000	1.20
@@ -123,7 +123,7 @@

     def test_lambda(self):
         self.failUnless(repr(lambda x: x).startswith(
-            "<function <lambda"
+            "<function lambda"

This one may be only marginally worse (names w/parameter unpacking):

test_grammar.py

-    verify(f4.func_code.co_varnames == ('two', '.2', 'compound',
-                                        'argument', 'list'))
+    vereq(f4.func_code.co_varnames,
+          ('two', '.1', 'compound', 'argument', 'list'))

There are still more things I need to review. These were the biggest issues I found. I don't think most are that big of a deal, just wanted to point stuff out.

n

From: pje at telecommunity.com (Phillip J. Eby)
Subject: [Python-Dev] Pre-PEP: Task-local variables
References: <5.1.1.6.0.20051020163313.01faf660@mail.telecommunity.com> <5.1.1.6.0.20051019213825.01fafa00@mail.telecommunity.com> <43579027.6040007@gmail.com> <43579ADC.80006@gmail.com> <5.1.1.6.0.20051020163313.01faf660@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20051021012649.02a37070@mail.telecommunity.com>

At 07:57 PM 10/20/2005 -0700, Guido van Rossum wrote:
>(Sorry for the long post -- there just wasn't anything you said that I
>felt could be left unquoted. :-)

Wow. You've brought up an awful lot of stuff I want to respond to, about the nature of frameworks, AOP, Chandler, PEP 342, software deployment, etc. But I know you're busy, and the draft I was working on in reply to this has gotten simply huge and still unfinished, so I think I should just turn it all into a blog article on "Why Frameworks Are Evil And What We Can Do To Stop Them". :)

I don't think I've exaggerated anything, though. I think maybe you're perceiving more vehemence than I actually have on the issue. Context variables are a very small thing and I've not been arguing that they're a big one. In the scope of the coming Global War On Frameworks, they are pretty small potatoes. :)

From nnorwitz at gmail.com  Fri Oct 21 08:35:57 2005
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Thu, 20 Oct 2005 23:35:57 -0700
Subject: [Python-Dev] problem with genexp
In-Reply-To:
References:
Message-ID:

On 10/16/05, Neal Norwitz wrote:
> On 10/10/05, Neal Norwitz wrote:
> > There's a problem with genexp's that I think really needs to get
> > fixed. See http://python.org/sf/1167751 the details are below. This
> > code:
> >
> > >>> foo(a = i for i in range(10))
> >
> > I agree with the bug report that the code should either raise a
> > SyntaxError or do the right thing.
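The ambiguity Neal describes was ultimately resolved in favour of a SyntaxError: a bare genexp cannot follow `a =` inside a call, while the explicitly parenthesized form stays legal. A quick check against a modern interpreter (illustrative; `foo` is just a placeholder name):

```python
# Unparenthesized: is `a` a keyword argument, or a name inside the genexp?
# Modern Python refuses to guess and rejects it at compile time.
try:
    compile("foo(a = i for i in range(10))", "<example>", "eval")
    outcome = "compiled"
except SyntaxError:
    outcome = "SyntaxError"
print(outcome)   # SyntaxError

# The unambiguous spelling with explicit parentheses compiles fine.
compile("foo(a=(i for i in range(10)))", "<example>", "eval")
```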
> > The change to Grammar/Grammar below seems to fix the problem and all > the tests pass. Can anyone comment on whether this fix is > correct/appropriate? Is there a better way to fix the problem? Since no one responded other than Jiwon, I checked in this change. I did *not* backport it since what was syntactically correct in 2.4.2 would raise an error in 2.4.3. I'm not sure which is worse. I'll leave it up to Anthony whether this should be backported. BTW, the change was the same regardless of old code vs. new AST code. n From mwh at python.net Fri Oct 21 14:41:47 2005 From: mwh at python.net (Michael Hudson) Date: Fri, 21 Oct 2005 13:41:47 +0100 Subject: [Python-Dev] Pre-PEP: Task-local variables In-Reply-To: (Guido van Rossum's message of "Thu, 20 Oct 2005 19:57:16 -0700") References: <5.1.1.6.0.20051019213825.01fafa00@mail.telecommunity.com> <43579027.6040007@gmail.com> <43579ADC.80006@gmail.com> <5.1.1.6.0.20051020163313.01faf660@mail.telecommunity.com> Message-ID: <2mr7af6nzo.fsf@starship.python.net> Guido van Rossum writes: > On 10/20/05, Phillip J. Eby wrote: >> At 08:57 AM 10/20/2005 -0700, Guido van Rossum wrote: >> >Whoa, folks! Can I ask the gentlemen to curb their enthusiasm? >> > >> >PEP 343 is still (back) on the drawing table, PEP 342 has barely been >> >implemented (did it survive the AST-branch merge?), and already you >> >are talking about adding more stuff. Please put on the brakes! >> >> Sorry. I thought that 343 was just getting a minor tune-up. > > Maybe, but the issues on the table are naming issues -- is __with__ > the right name, or should it be __context__? Should the decorator be > applied implicitly? Should the decorator be called @context or > @contextmanager? > >> In the months >> since the discussion and approval (and implementation; Michael Hudson >> actually had a PEP 343 patch out there), > > Which he described previously as "a hack" Err, that was the code I used for my talk at EuroPython. That really *was* a hack. 
The code on SF is much better.

> and apparently didn't feel comfortable checking in.

Well, I was kind of hoping for a review, or positive comment on the tracker, or *something* (Phillip posted half a review here a couple of weeks ago, but I've been stupidly stupidly busy since then).

> At least some of it will have to be redone, (a) for the AST code,

Indeed. Not much, I hope, the compiler changes were fairly simple.

> and (b) for the revised PEP.

Which I still haven't digested :-/

Cheers,
mwh

--
I'm about to search Google for contract assassins to go to Iomega and HP's programming groups and kill everyone there with some kind of electrically charged rusty barbed thing.
  -- http://bofhcam.org/journal/journal.html, 2002-01-08

From ncoghlan at gmail.com  Fri Oct 21 15:34:14 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri, 21 Oct 2005 23:34:14 +1000
Subject: [Python-Dev] Pre-PEP: Task-local variables
In-Reply-To: <5.1.1.6.0.20051020131024.02033d58@mail.telecommunity.com>
References: <5.1.1.6.0.20051019213825.01fafa00@mail.telecommunity.com> <5.1.1.6.0.20051019213825.01fafa00@mail.telecommunity.com> <5.1.1.6.0.20051020131024.02033d58@mail.telecommunity.com>
Message-ID: <4358EE56.4010903@gmail.com>

Phillip J. Eby wrote:
> Actually, it's fairly simple to write a generator decorator using
> context.swap() that saves and restores the current execution state
> around next()/send()/throw() calls, if you prefer it to be the
> generator's responsibility to maintain such context.

Yeah, I also realised there's a fairly obvious solution to my decimal.Context "problem" too:

    def iter_sin(iterable):
        orig_ctx = decimal.getcontext()
        with orig_ctx as ctx:
            ctx.prec += 10
            for r in iterable:
                y = sin(r)       # Very high precision during calculation
                with orig_ctx:
                    yield +y     # Interim results have normal precision
                # We get "ctx" back here
        # We get "orig_ctx" back here

That is, if you want to be able to restore the original context just *save* the damn thing...
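Nick's save-and-restore pattern can be written against the decimal API as it eventually shipped (`decimal.localcontext`, which accepts a context to install a copy of). This runnable sketch substitutes `Decimal.sqrt()` for `sin()` so it needs no external math helpers:

```python
import decimal

def iter_sqrt(iterable):
    orig_ctx = decimal.getcontext()       # just *save* the thing
    with decimal.localcontext() as ctx:
        ctx.prec += 10                    # extra precision for intermediates
        for r in iterable:
            y = decimal.Decimal(r).sqrt()          # high-precision calculation
            with decimal.localcontext(orig_ctx):
                yield +y                  # unary + rounds to the caller's precision

decimal.getcontext().prec = 6
print(list(iter_sqrt(["2"])))   # [Decimal('1.41421')]
```

The inner `with` also means the caller sees its own context while the generator is suspended at the yield, which is the subtlety the thread is circling around.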
Ah well, chalk the __suspend__/__resume__ idea up as another case of me getting overly enthusiastic about a complex idea without looking for simpler solutions first. It's not like it would be the first time ;)

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
 http://boredomandlaziness.blogspot.com

From ncoghlan at gmail.com  Fri Oct 21 16:30:26 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 22 Oct 2005 00:30:26 +1000
Subject: [Python-Dev] AST branch is in?
In-Reply-To: <200510211202.12015.anthony@interlink.com.au>
References: <200510211202.12015.anthony@interlink.com.au>
Message-ID: <4358FB82.9040109@gmail.com>

Anthony Baxter wrote:
> So it looks like the AST branch has landed. Wooo! Well done to all who
> were involved - it seems like it's been a huge amount of work.

Congratulations from this quarter, too. I really liked the structure of the new compiler in the limited time I spent working with it on the AST branch, and am glad it has made its way onto the HEAD for Python 2.5.

Cheers,
Nick.
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.blogspot.com From ncoghlan at gmail.com Fri Oct 21 17:08:43 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 22 Oct 2005 01:08:43 +1000 Subject: [Python-Dev] Pre-PEP: Task-local variables In-Reply-To: References: <5.1.1.6.0.20051019213825.01fafa00@mail.telecommunity.com> <43579027.6040007@gmail.com> <43579ADC.80006@gmail.com> <5.1.1.6.0.20051020163313.01faf660@mail.telecommunity.com> Message-ID: <4359047B.6020203@gmail.com> Guido van Rossum wrote: > If it weren't for Python's operator overloading, the decimal module > would have used explicit contexts (like the Java version); but since > it would be really strange to have such a fundamental numeric type > without the ability to use the conventional operator notation, we > resorted to per-thread context. Even that doesn't always do the right > thing -- handling decimal contexts is surprisingly subtle (as Nick can > testify based on his experiences attempting to write a decimal context > manager for the with-statement!). Indeed. Fortunately it isn't as complicated as I feared last night (it turned out to be a problem with me trying to hit a small nail with the new sledgehammer I was playing with, forgetting entirely about the trusty old normal hammer still in the toolkit). > But I haven't seen the use case yet for mixing coroutines with changes > to decimal context settings; somehow it doesn't strike me as a likely > use case (not that you can't construct one, so don't bother -- I can > imagine it too, I just think YAGNI). For Python 2.5, I think the approach of generators explicitly reverting altered contexts around yield expressions is a reasonable way to go. 
This concept is workable for generators, because they *know* when they're going to lose control (i.e., by invoking yield), whereas it's impossible for threads to know when the eval loop is going to drop them in favour of a different thread.

I think the parallel between __iter__ and __with__ continues to work here, too - alternate context managers to handle reversion of the context (e.g., Lock.released()) can be provided as separate methods, just as alternative iterators are provided (e.g., dict.iteritems(), dict.itervalues()).

Also, just as we eventually added "itertools" to support specific ways of working with iterators, I expect to eventually see "contexttools" to support specific ways of working with contexts (e.g. duck-typed contexts like "closing", or a 'nested' context that allowed multiple resources to be properly managed by a single with statement).

contexttools would also be the place for ideas like suspending and resuming a context - rather than requiring specific syntax, it could be implemented as a context manager:

    ctx = suspendable_context(EXPR)
    with ctx as VAR:
        # VAR would still be the result of (EXPR).__with__().__enter__()
        # It's just that suspendable_context would be taking care of
        # making that happen, rather than it happening the usual way
        with ctx.suspended():
            # Context is suspended here
        # Context is resumed here

I do *not* think we should add contexttools in Python 2.5, because there's far too much chance of YAGNI. We need experience with the 'with' statement before we can really identify the tools that are appropriate.

Cheers,
Nick.
--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
 http://boredomandlaziness.blogspot.com

From jeremy at alum.mit.edu  Fri Oct 21 17:13:47 2005
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Fri, 21 Oct 2005 11:13:47 -0400
Subject: [Python-Dev] Questionable AST wibbles
In-Reply-To:
References:
Message-ID:

On 10/21/05, Neal Norwitz wrote:
> There are a bunch of mods from the AST branch that got integrated into
> head. Hopefully, by doing this on python-dev more people will get
> involved. I'll describe high level things first, but there will be a
> ton of details later on. If people don't want to see this crap on
> python-dev, I can take this offline.

Thanks for all the notes and questions, Neal. There were a lot of changes made over a long time, and it's good to discuss some of them.

> Highlevel overview of code size (rough decrease of 300 C source lines):
> * Python/compile.c -2733 (was 6822 now 4089)
> * Python/Python-ast.c +2281 (new file)
> * Python/asdl.c +92 (new file)
> * plus other minor mods
>
> symtable.h has lots of changes to structs and APIs. Not sure what
> needs to be doc'ed.

The old symtable wasn't well documented and the API it exposed to Python programmers was lousy. We need to figure out a good Python API and document it.

> I was very glad to see that ./python compileall.py Lib took virtually
> the same time before and after AST. Yeah! Unfortunately, I can't say
> the same for memory usage for running compileall:
>
> Before AST: [10120 refs]
> After AST: [916096 refs]

That's great news! That is, I expected it to be a lot slower to compile and didn't have any particularly good ideas about how to speed it up. I expected there to be a lot of memory bloat and think we can fix that without undue effort :-).

> A bunch of APIs changed and there is some additional name pollution.
> Since these are pretty internal APIs, I'm not sure that part is a big
> deal.
I will try to find more name pollution and eliminate it by > prefixing with Py. Right. The code isn't binary compatible with Python 2.4 right now, but given the APIs that changed I wasn't too concerned about that. I'm not sure who should make the final decision there. > One API change which I think was a mistake was _Py_Mangle() losing 2 > parameters (I think this was how it was a long time ago). See > typeobject.c, Python.h, compile.c. I don't mind this one since it's an _Py function. I don't think code outside the core should use it. > pythonrun.h has a bunch of changes. I think a lot of the APIs > changed, but there might be backwards compatible macros. I'm not > sure. I need to review closely. We should double-check. I tried to get rid of the nest of different functions that call each other by replacing the old ones with macros that call the newest ones (the functions that take the most arguments). It's not really a related change, except that it seemed like cleanup of compiler-related code. Also, a bunch of functions started taking const char* instead of char*. I think that's a net win, too. > code.h was added, but it mostly contains stuff from compile.h. Should > we remove code.h and just put everything in compile.h? This will > remove lots little changes. > code.h & compile.h are tightly coupled. If we keep them separate, I > would like to see some other changes. I would like to keep them separate. The compiler produces code objects, but consumers of code objects don't need to know anything about the compiler. You did remind me that I intended to remove the #include "compile.h" lines from a bunch of files that merely consume code objects. What other changes would you like to see? 
> This probably is not a big deal, but I was surprised by this change:
>
> +++ test_repr.py 20 Oct 2005 19:59:24 -0000 1.20
> @@ -123,7 +123,7 @@
>
>     def test_lambda(self):
>         self.failUnless(repr(lambda x: x).startswith(
> -            "<function <lambda"
> +            "<function lambda"
>
> This one may be only marginally worse (names w/parameter unpacking):
>
> test_grammar.py
>
> -    verify(f4.func_code.co_varnames == ('two', '.2', 'compound',
> -                                        'argument', 'list'))
> +    vereq(f4.func_code.co_varnames,
> +          ('two', '.1', 'compound', 'argument', 'list'))
>
> There are still more things I need to review. These were the biggest
> issues I found. I don't think most are that big of a deal, just
> wanted to point stuff out.

I don't have a strong sense for how important these changes are. I don't think the old behavior was documented, but I can imagine some code depending on these implementation details.

Jeremy

From guido at python.org  Fri Oct 21 17:26:35 2005
From: guido at python.org (Guido van Rossum)
Date: Fri, 21 Oct 2005 08:26:35 -0700
Subject: [Python-Dev] Questionable AST wibbles
In-Reply-To:
References:
Message-ID:

On 10/21/05, Jeremy Hylton wrote:
> On 10/21/05, Neal Norwitz wrote:
> > This probably is not a big deal, but I was surprised by this change:
> >
> > +++ test_repr.py 20 Oct 2005 19:59:24 -0000 1.20
> > @@ -123,7 +123,7 @@
> >
> >     def test_lambda(self):
> >         self.failUnless(repr(lambda x: x).startswith(
> > -            "<function <lambda"
> > +            "<function lambda"

If the repr() of a lambda now starts with "<function lambda" rather than "<function <lambda", please change it back. The angle brackets make it stand out more, and I imagine people might be checking for this to handle it specially.

> > This one may be only marginally worse (names w/parameter unpacking):
> >
> > test_grammar.py
> >
> > -    verify(f4.func_code.co_varnames == ('two', '.2', 'compound',
> > -                                        'argument', 'list'))
> > +    vereq(f4.func_code.co_varnames,
> > +          ('two', '.1', 'compound', 'argument', 'list'))

This doesn't bother me.
--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From pje at telecommunity.com  Fri Oct 21 17:42:44 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Fri, 21 Oct 2005 11:42:44 -0400
Subject: [Python-Dev] Questionable AST wibbles
In-Reply-To:
References:
Message-ID: <5.1.1.6.0.20051021114101.01faf080@mail.telecommunity.com>

At 11:13 AM 10/21/2005 -0400, Jeremy Hylton wrote:
>I don't have a strong sense for how important these changes are. I
>don't think the old behavior was documented, but I can imagine some
>code depending on these implementation details.

I'm pretty sure I've seen code in the field (e.g. recipes in the online Python cookbook) that checked for a function's name being '<lambda>'. That's also a thing that's likely to show up in people's doctests.

From jeremy at alum.mit.edu  Fri Oct 21 18:03:54 2005
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Fri, 21 Oct 2005 12:03:54 -0400
Subject: [Python-Dev] AST branch is in?
In-Reply-To:
References: <200510211202.12015.anthony@interlink.com.au>
Message-ID:

On 10/20/05, Neal Norwitz wrote:
> On 10/20/05, Anthony Baxter wrote:
> >
> > Could someone involved give a short email laying out what concrete (no
> > pun intended) advantages this new compiler gives us? Does it just
> > allow us to do new and interesting manipulations of the code during
> > compilation? Cleaner, easier to maintain, or the like?
>

I just wanted to clarify that Neal meant the abstract syntax, not the grammar. It should allow people to write tools to analyze Python source code without having to worry about the often irrelevant details of the exact tokens or the way they are parsed. We should be able to get to a state where tools using the AST work with Python and Jython (and maybe IronPython, who knows). The tokenize and parser modules still exist for tools for which those details aren't irrelevant.
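The tool-friendly abstract syntax Jeremy describes eventually surfaced as the standard `ast` module (added in Python 2.6, well after this thread); a small sketch of the kind of source analysis being discussed:

```python
import ast

tree = ast.parse("a = 1")

# The AST exposes structure, not tokens: one Assign node with a Name
# target, with no trace of the parse-tree plumbing in between.
assign = tree.body[0]
assert isinstance(assign, ast.Assign)
assert assign.targets[0].id == "a"

# Tools can walk the tree generically, e.g. to collect assigned names.
names = [n.id for n in ast.walk(tree) if isinstance(n, ast.Name)]
print(names)   # ['a']
```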
We should also think about how to migrate the compiler module from its current AST to the new AST, although the backwards compatibility issues there are a bit tricky.

> The Grammar is (was at one point at least) shared between Jython and
> CPython, which would allow more tools to be able to share
> infrastructure.  The idea is to eventually be able to have [JP]ython
> output the same AST to tools.  There is quite a bit of generated code
> based on the Grammar.  So some stuff should be easier.  Other stuff is
> just moved.  You still need to convert from the AST to the byte code.
>
> Hopefully it will be easier to do various sorts of optimization and
> general manipulation of an AST rather than what existed before.

I think it should be a lot easier to write tools for the C Python compiler that do extra analysis or optimization.  The existing peephole optimizer could be improved by integrating it with the bytecode assembler (for example, eliminating all NOP bytecodes).

Jeremy

From jeremy at alum.mit.edu  Fri Oct 21 18:06:42 2005
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Fri, 21 Oct 2005 12:06:42 -0400
Subject: [Python-Dev] AST branch is in?
In-Reply-To: References: <200510211202.12015.anthony@interlink.com.au> Message-ID:

On 10/20/05, Guido van Rossum wrote:
> On 10/20/05, Anthony Baxter wrote:
> > So it looks like the AST branch has landed. Wooo! Well done to all who
> > were involved - it seems like it's been a huge amount of work.
>
> Hear, hear. Great news! Thanks to Jeremy, Neil and all the others. I
> can't wait to check it out!

I want to thank all the people who made it possible by writing code and debugging.  I hope this is a complete list:

Armin Rigo
Brett Cannon
Grant Edwards
John Ehresman
Kurt Kaiser
Neal Norwitz
Neil Schemenauer
Nick Coghlan
Tim Peters

And thanks to the PSF and PyCon organizers for hosting the formerly annual ast-branch sprints!
Jeremy

From nas at arctrix.com  Fri Oct 21 20:32:22 2005
From: nas at arctrix.com (Neil Schemenauer)
Date: Fri, 21 Oct 2005 18:32:22 +0000 (UTC)
Subject: [Python-Dev] AST branch is in?
References: <200510211202.12015.anthony@interlink.com.au> Message-ID:

Anthony Baxter wrote:
> Could someone involved give a short email laying out what concrete (no
> pun intended) advantages this new compiler gives us?

One advantage is that it decreases the coupling between the parser and the backend of the compiler.  For example, it should be possible to replace the parser without modifying a lot of the compiler.

Also, the concrete syntax tree (CST) generated by Python's parser is not a convenient data structure to deal with.  Anyone who's used the 'parser' module probably experienced the pain:

    >>> parser.ast2list(parser.suite('a = 1'))
    [257, [266, [267, [268, [269, [320, [298, [299, [300, [301,
    [303, [304, [305, [306, [307, [308, [309, [310, [311, [1,
    'a']]]]]]]]]]]]]]], [22, '='], [320, [298, [299, [300, [301, [303,
    [304, [305, [306, [307, [308, [309, [310, [311, [2,
    '1']]]]]]]]]]]]]]]]], [4, '']]], [0, '']]

> Does it just allow us to do new and interesting manipulations of
> the code during compilation?

Well, that's a pretty big deal, IMHO.  For example, adding pychecker-like functionality should be straightforward now.  I also hope some of the namespace optimizations get explored (e.g. PEP 267).

> Cleaner, easier to maintain, or the like?

At this point, the old and new compiler are pretty similar in terms of complexity.  However, the new compiler is a much better base to build upon.

Neil

From guido at python.org  Fri Oct 21 21:13:36 2005
From: guido at python.org (Guido van Rossum)
Date: Fri, 21 Oct 2005 12:13:36 -0700
Subject: [Python-Dev] AST branch is in?
In-Reply-To: References: <200510211202.12015.anthony@interlink.com.au> Message-ID: On 10/21/05, Neil Schemenauer wrote: > Also, the concrete syntax tree (CST) generated by Python's parser is > not a convenient data structure to deal with. Anyone who's used the > 'parser' module probably experienced the pain: > > >>> parser.ast2list(parser.suite('a = 1')) > [257, [266, [267, [268, [269, [320, [298, [299, [300, [301, > [303, [304, [305, [306, [307, [308, [309, [310, [311, [1, > 'a']]]]]]]]]]]]]]], [22, '='], [320, [298, [299, [300, [301, [303, > [304, [305, [306, [307, [308, [309, [310, [311, [2, > '1']]]]]]]]]]]]]]]]], [4, '']]], [0, '']] That's the fault of the 'parser' extension module though, and this affects tools using the parser module, not the bytecode compiler itself. The CST exposed to C programmers is slightly higher level. (But the new AST is higher level still, of course.) BTW, Elemental is letting me open-source a reimplementation of pgen in Python. This also includes a nifty way to generate ASTs. This should become available within a few weeks. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From ncoghlan at gmail.com Fri Oct 21 23:43:01 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 22 Oct 2005 07:43:01 +1000 Subject: [Python-Dev] Questionable AST wibbles In-Reply-To: References: Message-ID: <435960E5.7090502@gmail.com> Jeremy Hylton wrote: > I would like to keep them separate. The compiler produces code > objects, but consumers of code objects don't need to know anything > about the compiler. Please do keep these separate - the only reason I've ever had to muck with code objects is to check if a function is a generator or not, and including the entire compiler header just for that seemed like overkill. It's not a huge issue for me, but the separate header files do give better 'separation of concerns'. Cheers, Nick. 
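[The generator check Nick mentions needs nothing from the compiler headers; at the Python level it is answerable from the code object's flags alone. A hedged sketch in the modern spelling -- in 2005 this would have been ``func.func_code.co_flags``:]

```python
import inspect

# Is a function a generator?  Decidable from the code object alone via
# the CO_GENERATOR flag -- no compiler internals required.
def is_generator(func):
    return bool(func.__code__.co_flags & inspect.CO_GENERATOR)

def plain():
    return 1

def gen():
    yield 1

print(is_generator(plain), is_generator(gen))  # False True
```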
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.blogspot.com From mal at egenix.com Sat Oct 22 00:01:06 2005 From: mal at egenix.com (M.-A. Lemburg) Date: Sat, 22 Oct 2005 00:01:06 +0200 Subject: [Python-Dev] Questionable AST wibbles In-Reply-To: References: Message-ID: <43596522.7090209@egenix.com> Neal Norwitz wrote: > Jeremy, > > There are a bunch of mods from the AST branch that got integrated into > head. Hopefully, by doing this on python-dev more people will get > involved. I'll describe high level things first, but there will be a > ton of details later on. If people don't want to see this crap on > python-dev, I can take this offline. > > Highlevel overview of code size (rough decrease of 300 C source lines): > * Python/compile.c -2733 (was 6822 now 4089) > * Python/Python-ast.c +2281 (new file) > * Python/asdl.c +92 (new file) > * plus other minor mods FYI, I'm getting these warnings: Python/Python-ast.c: In function `marshal_write_expr_context': Python/Python-ast.c:1995: warning: unused variable `i' Python/Python-ast.c: In function `marshal_write_boolop': Python/Python-ast.c:2070: warning: unused variable `i' Python/Python-ast.c: In function `marshal_write_operator': Python/Python-ast.c:2085: warning: unused variable `i' Python/Python-ast.c: In function `marshal_write_unaryop': Python/Python-ast.c:2130: warning: unused variable `i' Python/Python-ast.c: In function `marshal_write_cmpop': Python/Python-ast.c:2151: warning: unused variable `i' Python/Python-ast.c: In function `marshal_write_keyword': Python/Python-ast.c:2261: warning: unused variable `i' Python/Python-ast.c: In function `marshal_write_alias': Python/Python-ast.c:2270: warning: unused variable `i' -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Oct 21 2005) >>> Python/Zope Consulting and Support ... 
http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From mal at egenix.com Sat Oct 22 00:04:20 2005 From: mal at egenix.com (M.-A. Lemburg) Date: Sat, 22 Oct 2005 00:04:20 +0200 Subject: [Python-Dev] New codecs checked in Message-ID: <435965E4.5050207@egenix.com> I've checked in a whole bunch of newly generated codecs which now make use of the faster charmap decoding variant added by Walter a short while ago. Please let me know if you find any problems. Some codecs (esp. the Mac OS X ones) have minor changes. These originate from updated mapping files on ftp.unicode.org. I also added an alias iso8859_1 -> latin_1, so that applications using the iso8859_1 encoding name can benefit from the faster native implementation of the latin_1 codec. Regards, -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Oct 22 2005) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From jimjjewett at gmail.com Sat Oct 22 00:25:47 2005 From: jimjjewett at gmail.com (Jim Jewett) Date: Fri, 21 Oct 2005 18:25:47 -0400 Subject: [Python-Dev] PEP 267 -- is the semantics change OK? Message-ID: (In http://mail.python.org/pipermail/python-dev/2005-October/057501.html) Neil Schemenauer suggested PEP 267 as an example of something that might be easier with the AST compiler. As written, PEP 267 does propose a slight semantics change -- but it might be an improvement, if it is acceptable. 
Today, after

    from othermod import val1
    import othermod
    val2 = othermod.val2
    othermod.val3     # Just making sure it was referenced early

    othermod.val1 = "new1"
    othermod.val2 = "new2"
    othermod.val3 = "new3"

    print val1, val2, othermod.val3

The print statement will see the updated val3, but will still have the original values for val1 and val2.

Under PEP 267, all three variables would be compiled to a slot access in othermod, and would see the updated objects.

In many cases, this would be a *good* thing.  It might allow reload to be rewritten to do what people expect.  On the other hand, it would be a change.  Would it be an acceptable change?

-jJ

From jeremy at alum.mit.edu  Sat Oct 22 00:48:34 2005
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Fri, 21 Oct 2005 18:48:34 -0400
Subject: [Python-Dev] PEP 267 -- is the semantics change OK?
In-Reply-To: References: Message-ID:

On 10/21/05, Jim Jewett wrote:
> (In http://mail.python.org/pipermail/python-dev/2005-October/057501.html)
> Neil Schemenauer suggested PEP 267 as an example of something that
> might be easier with the AST compiler.
>
> As written, PEP 267 does propose a slight semantics change -- but it
> might be an improvement, if it is acceptable.

No, it does not.  PEP 267 suggests a way to preserve the existing semantics.  You could probably come up with a much simpler approach if you agreed to change semantics.

Jeremy

> Today, after
>
>     from othermod import val1
>     import othermod
>     val2 = othermod.val2
>     othermod.val3     # Just making sure it was referenced early
>
>     othermod.val1 = "new1"
>     othermod.val2 = "new2"
>     othermod.val3 = "new3"
>
>     print val1, val2, othermod.val3
>
> The print statement will see the updated val3, but will still have
> the original values for val1 and val2.
>
> Under PEP 267, all three variables would be compiled to a slot
> access in othermod, and would see the updated objects.
>
> In many cases, this would be a *good* thing.  It might allow
> reload to be rewritten to do what people expect.
> On the other hand, it would be a change.  Would it be an acceptable
> change?
>
> -jJ
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: http://mail.python.org/mailman/options/python-dev/jeremy%40alum.mit.edu
>

From t-meyer at ihug.co.nz  Sat Oct 22 02:05:11 2005
From: t-meyer at ihug.co.nz (Tony Meyer)
Date: Sat, 22 Oct 2005 13:05:11 +1300
Subject: [Python-Dev] DRAFT: python-dev Summary for 2005-09-01 through 2005-09-16
Message-ID: <3A5D2942-E0C3-4768-94D5-E43B914BB80B@ihug.co.nz>

This is over a month late, sorry, but here it is (Steve did his threads ages ago; I've fallen really behind).  Summaries for the second half of September and the first half of October will soon follow.  As always, if anyone is able to give this a quick look that would be great.  Feedback to me or Steve (steven.bethard at gmail.com).  Thanks!

=============
Announcements
=============

-----------------------------
QOTF: Quotes of the Fortnight
-----------------------------

In the thread on the print statement, Charles Cazabon provided some nice imagery for Guido's Python 3.0 strategy.  Our first QOTF is his comment about the print statement:

    It's an anomaly.  It stands out in the language as a sore thumb
    waiting for Guido's hammer.

We also learned something important about the evolution of Python thanks to Paul Moore.  In the thread on the Python 3.0 executable name, Greg Ewing worried that if the Python 3.0 executable is named "py":

    Python 4.0 is going to just be called "p", and by the time we get
    to Python 5.0, the name will have vanished altogether!

Fortunately, as Paul Moore explains in our second QOTF, these naming conventions are exactly as we should expect them:

    That's OK, by the time Python 5.0 comes out, it will have taken
    over the world and be the default language for everything.
    So omitting the name is exactly right :-)

[SJB]

Contributing threads:

- `Replacement for print in Python 3.0 `__
- `Python 3 executable name `__

--------------------------------------------------
The "Swiss Army Knife (...Not)" API design pattern
--------------------------------------------------

This fortnight saw a number of different discussions on what Guido's guiding principles are in making design decisions about Python.  Guido introduced the "Swiss Army Knife (...Not)" API design pattern, which has been lauded by some as `the long-lost 20th principle from the Zen of Python`_.  A direct quote from Guido:

    [I]nstead of a single "swiss-army-knife" function with various
    options that choose different behavior variants, it's better to
    have different dedicated functions for each of the major
    functionality types.

This principle is the basis for pairs like str.split() and str.rsplit() or str.find() and str.rfind().  The goal is to keep cognitive overhead down by associating with each use case a single function with a minimal number of parameters.

[SJB]

.. _the long-lost 20th principle from the Zen of Python: http://mail.python.org/pipermail/python-dev/2005-September/056228.html

Contributing threads:

- `Replacement for print in Python 3.0 `__
- `Replacement for print in Python 3.0 `__

------------------------
A Python-to-C++ compiler
------------------------

Mark Dufour announced `Shed Skin`_, an experimental Python-to-C++ compiler, which can convert many Python programs into optimized C++ code, using static type inference techniques.  It works best for Python programs written in a relatively static C++-style; much work remains to be done, and Mark would like anyone interested in getting involved to contact him.  Shed Skin was one of the recent `Google`_ `Summer of Code`_ projects.

.. _Shed Skin: http://shedskin.sourceforge.net
.. _Google: http://www.google.com
..
_Summer of Code: http://code.google.com/summerofcode.html

[TAM]

Contributing thread:

- `First release of Shed Skin, a Python-to-C++ compiler. `__

--------------------------------------------------------------
python-checkins followups now stay on the python-checkins list
--------------------------------------------------------------

In a follow-up to a `thread in early July`_, the python-checkins mailing list Reply-To header munging has been turned off.  Previously, follow-ups to python-checkins would be addressed to python-dev; now, follow-ups will stay on the python-checkins list by default.

.. _thread in early July: http://www.python.org/dev/summary/2005-07-01_2005-07-15.html#behavior-of-sourceforge-when-replying-to-python-checkins

[TAM]

Contributing thread:

- `python-checkins reply-to `__

=========
Summaries
=========

--------------------------------------------
Converting print to a function in Python 3.0
--------------------------------------------

In Python 3.0, Guido wants to change print from a statement to a function.  Some of his motivation for this change:

* Converting code that uses the print statement to instead use the logging package, a UI package, etc. is complicated because of the syntax.  Parentheses, commas and >>s all behave differently in the print statement than they would in a function call.

* Having print as a statement makes the language harder to evolve.  For example, if it's determined that Python should gain printf behavior of some sort, adding this is harder -- as a statement, it would require the introduction of new syntax; as a function, it would feel like a second-class citizen compared to print.

* Since the print statement always inserts spaces, code that doesn't want these spaces will often have to use a completely different style of formatting (e.g. using sys.stdout.write and/or string formatting).

* Changing the behavior of statements is hard, while builtin functions can simply be replaced by setting an attribute of __builtin__.
Guido's initial proposal suggested three methods to be adopted by all stream (file-like) objects::

    stream.write(a1, a2, ...)
        equivalent to: map(stream.write, map(str, [a1, a2, ...]))
    stream.writeln(a1, a2, ...)
        equivalent to: stream.write(a1, a2, ..., "\n")
    stream.writef(fmt, a1, a2, ...)
        equivalent to: stream.write(fmt % (a1, a2, ...))

Additionally, three new builtins would appear, write(), writeln() and writef(), which called the corresponding methods on sys.stdout.

People had a number of problems with this initial proposal:

* People make heavy use of the space-insertion behavior of the current print statement.  With Guido's initial proposal, inserting spaces would require manually adding space characters, e.g. ``write(foo, " ", bar, " ", baz)``.

* People want to keep the stream API simple.  With Guido's initial proposal, all file-like objects would probably need to support these three new methods.  (But see also `Deriving file-like object methods from read() and write()`_.)

* People primarily (about 85% of the time) use the print statement to print complete lines.  With Guido's initial proposal, the function to do this, writeln(), has the longer name than the less-frequently needed write().

There were a variety of proposals following Guido's that attempted to address the issues above, most of which were posted to the wiki_.  They generally all proposed a function something like::

    def print(*args):
        sys.stdout.write(' '.join(str(arg) for arg in args))
        sys.stdout.write('\n')

with support for a file= keyword parameter to specify a stream other than sys.stdout, and a sep= keyword parameter to specify a separator other than ' '.  There was some discussion about how the final newline could be suppressed, including a nl= keyword parameter and the usage of the Ellipsis object (e.g. so that ``print(foo, bar, ...)`` would not print the final newline).
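A minimal sketch of the proposal described above (``print_`` stands in for the then-keyword name ``print``; the ``file=``, ``sep=`` and ``nl=`` spellings are the thread's, not the ``end=`` that Python 3.0 ultimately adopted):

```python
import sys

def print_(*args, **kwds):
    # Defaults mimic the print statement: spaces between arguments,
    # trailing newline, output to sys.stdout.
    file = kwds.pop("file", sys.stdout)
    sep = kwds.pop("sep", " ")
    nl = kwds.pop("nl", True)
    if kwds:
        raise TypeError("unexpected keyword arguments: %r" % sorted(kwds))
    file.write(sep.join(str(arg) for arg in args))
    if nl:
        file.write("\n")

print_("spam", 3, "eggs")        # spam 3 eggs
print_("no", "spaces", sep="")   # nospaces
```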
There was also substantial support for a formatting variant like::

    def printf(fmt, *args):
        print(fmt % args)

In the end, Guido seemed to be leaning towards supporting three printing variants:

* print(...) would be much like the proposals above, calling str() on each argument and then printing them with spaces in between and a following newline

* printraw(...) or printbare(...) would also call str() on each argument and print them, but with no intervening spaces and no final newline

* printf(fmt, ...) would string-substitute the arguments into the format string and then write the format string

Each of these functions would also accept a keyword parameter for specifying a stream other than sys.stdout.  Because ``print`` is a keyword, ``from __future__ import printing`` would be required to use the new print() function.  At this point, the thread trailed off, and no final decisions were made.

[SJB]

.. _wiki: http://wiki.python.org/moin/PrintAsFunction

Contributing threads:

- `Python 3 design principles `__
- `Replacement for print in Python 3.0 `__
- `New Wiki page - PrintAsFunction `__
- `Hacking print (was: Replacement for print in Python 3.0) `__
- `Pascaloid print substitute (Replacement for print in Python 3.0) `__

----------------------------------
Making C code easier in Python 3.0
----------------------------------

Nick Jacobson asked whether reference counting would be replaced in Python 3.0.  Guido pointed out that the (CPython) implementation would have to be completely changed, and that isn't planned; many people also pointed out that reference counting is an implementation detail, not part of the language specification, and that there are other options that can be explored (e.g. `PyPy`_, `Jython`_, `IronPython`_).  Arising from this question was a suggestion from Greg Ewing to build something akin to `Pyrex`_ (which takes care of reference count/garbage collection issues automatically) into the standard Python distribution.
This suggestion was met with general enthusiasm; some general discussion about which cases were most appropriate for Pyrex use (e.g. extension modules, wrapping C libraries, modules implemented in C for performance reasons) also followed.

.. _PyPy: http://codespeak.net/pypy/
.. _Jython: http://www.jython.org/
.. _IronPython: http://www.ironpython.com/
.. _Pyrex: http://www.cosc.canterbury.ac.nz/~greg/python/Pyrex/

[TAM]

Contributing thread:

- `reference counting in Py3K `__

--------------------------
Multiple views of a string
--------------------------

Tim Delaney suggested that str.partition could return 'views' of a string, rather than new string objects, as the substrings, to avoid the time needed to create strings that are potentially unused.  Raymond Hettinger pointed out that the practical cost is unlikely to be significant, as the strings are likely to be empty, small, or used, and that all the pre-Python 2.5 methods would still be available for those times when they would be more appropriate.  However, using string 'views' (objects that reference the 'parent' string, rather than copying the data) caught the imagination of several Python-dev'ers.  Discussion ensued about how this object could work (Skip Montanaro threw together a sample implementation); towards the end it was pointed out that buffer() objects, with some additional string methods, could provide this slice-like instance with low memory requirements.  Guido also mentioned `NSString`_, the NextStep string type used by `ObjC`_, which is fairly similar.

.. _ObjC: http://en.wikipedia.org/wiki/Objc
..
_NSString: http://developer.apple.com/documentation/Cocoa/Reference/Foundation/ObjC_classic/Classes/NSString.html

[TAM]

Contributing threads:

- `Proof of the pudding: str.partition() `__
- `String views (was: Re: Proof of the pudding: str.partition()) `__
- `String views `__

-------------------------------
String-formatting in Python 3.0
-------------------------------

Currently, using ``%`` for string formatting has a number of inconvenient consequences:

* precedence issues: ``"a%sa" % "b"*4`` produces ``'abaabaabaaba'``, not ``'abbbba'``

* special-cased tuples: ``"%s" % x`` produces a string representation of x *unless* x is a tuple (in which case it unpacks it, raising a TypeError if ``len(x) != 1``).

* keyword formatting issues: a number of people have complained that ``%(myvar)s`` is much more complicated than it needs to be (hence string.Template's ``$myvar``).

To address the first two issues, Raymond Hettinger proposed that string formatting become a builtin function, and others proposed that formatting become a method of str/unicode objects.  Guido definitely agreed with the move from ``%`` to a callable, but it was unclear as to his preference for function or method.

Nick Coghlan tried to address the ``%(myvar)s`` issue by exploring a few extensions to string.Template formatting.  He produced a format() function where arguments could be specified either by position (e.g. $1, $2, etc.) or with keywords (e.g. $item, $quantity), and where the usual C-style format specifiers were still supported::

    format("$item: $[0.2f]quantity", quantity=0.5, item='Bees')
    format("$1: $[0.2f]2", 'Bees', 0.5)

Nick also briefly explored format specifiers for expanding iterables, but Guido disliked the idea, explaining that adding or removing a print from a program should not drastically change the program's behavior (as it might if a print accidentally consumed an iterable that you weren't done with).
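The positional half of Nick's experiment can be approximated by widening string.Template's placeholder pattern (a hedged sketch only; his actual implementation, including the ``$[0.2f]`` specifiers, is not reproduced here):

```python
from string import Template

class PosTemplate(Template):
    # Accept $1, $2, ... in addition to ordinary $name placeholders.
    idpattern = r"(?:[_a-z][_a-z0-9]*|[0-9]+)"

def format_(fmt, *args, **kwds):
    # Map positional arguments onto the keys '1', '2', ... and merge in
    # any keyword arguments.
    mapping = dict((str(i + 1), arg) for i, arg in enumerate(args))
    mapping.update(kwds)
    return PosTemplate(fmt).substitute(mapping)

print(format_("$item: $quantity", item="Bees", quantity=0.5))  # Bees: 0.5
print(format_("$1: $2", "Bees", 0.5))                          # Bees: 0.5
```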
There was also some high-level discussion about internationalization concerns, where format strings need to be easy for translators to read and reorganize.  Since word orders may change, having either keyword parameters or positional parameters (as in Nick's scheme) is crucial.  Unfortunately, this discussion seemed to get lost in the massive `Converting print to a function in Python 3.0`_ discussion, and no decisions were made about either a formatting function or Nick's format specifier extensions.

[SJB]

Contributing threads:

- `Replacement for print in Python 3.0 `__
- `string formatting options and removing basestring.__mod__ (WAS: Replacement for print in Python 3.0) `__
- `string formatting and i18n `__
- `string.Template format enhancements (Re: Replacement for print in Python 3.0) `__

---------------------------------------------------------
Deriving file-like object methods from read() and write()
---------------------------------------------------------

A variety of methods on the file object, including __iter__(), next(), readline(), readlines() and writelines(), are all derivable from the read() and write() methods.  At least twice this fortnight, the issue was raised about making it easier for file-like objects to add the derivable methods if they've defined read() and write().  One suggestion was to provide a FileMixin class (like the DictMixin of UserDict) that other types could inherit from.  This has the problem that the creator of the file-like object must determine at the time that the class is defined that it should support the additional methods.  It is also more difficult to use mixin classes in C code (because multiple inheritance requires dealing with the type's metaclass).  Fredrik Lundh suggested that something along the lines of `PEP 246`_'s object adaptation might be appropriate, but there was still some disagreement on the issue.

[SJB]

..
_PEP 246: http://www.python.org/peps/pep-0246.html

Contributing threads:

- `Mixin classes in the standard library `__
- `Simplify the file-like-object interface (Replacement for print in Python 3.0) `__
- `Simplify the file-like-object interface `__

------------------------------------
Making new-style classes the default
------------------------------------

Lisandro Dalcin proposed that something like::

    from __future__ import new_style_classes

be introduced to have newly defined classes implicitly derive from object.  It was pointed out that this functionality is already available through the module-level statement::

    __metaclass__ = type

The argument was made that the __future__ version would be easier for non-experts to understand and to Google for, but Guido declared that the current syntax is fine -- there are much more important issues to be dealt with right now.

[SJB]

Contributing thread:

- `PEP 3000 and new style classes `__

--------------------------------------------------
Using __future__ to have builtins return iterators
--------------------------------------------------

Lisandro Dalcin requested a __future__ import of some sort that would

* make range() and zip() return iterators
* remove xrange()
* make the dict.keys(), dict.values(), dict.items() etc. methods return iterators

Guido indicated that an alternate builtins module could be provided so that the first point could be covered with something like::

    from future_builtins import zip, range

However, there wasn't really a good way to change the dict methods.  Simply importing a new dict object from "future_builtins" wouldn't solve the problem because using anyone's module that used the old dict object would mean a mix of the two types in your module.  And since __future__ imports are intended to affect only the module which includes them, changing the builtin dict object globally would be inappropriate (as it would let an import in one module break code in another).
[SJB]

Contributing thread:

- `PEP 3000 and iterators `__

-------------------------------------------------------------
Using compiled re methods vs. using module-level re functions
-------------------------------------------------------------

After Michael Chermside commented that users should be encouraged to use the methods on compiled re objects instead of the re functions available at the module level (and after Stephen J. Turnbull promised to look into supplying such a documentation patch), there was a brief discussion about how much of a difference using the compiled re objects really makes.  As it turns out, in the CPython implementation, the module-level functions cache the first 100 patterns, so in many cases, the only additional cost of using the module-level functions is a dictionary lookup.

[SJB]

Contributing thread:

- `Revising RE docs `__

--------------------------------------
urlparse and urls with too many '../'s
--------------------------------------

Fabien Schwob pointed out that urlparse.urljoin() doesn't strip out any extraneous '..' directories (e.g. http://example.com/../index.html).  While Guido indicated that he found the current behaviour acceptable, Jeff Epler pointed out that `RFC 2396`_ states that invalid URIs like this may be handled by removing the ".." segments from the resolved path (although this is an implementation detail).  Armin Rigo indicated that, even if this is theoretically not a bug, a proposed patch with this motivation would be welcome.

.. _RFC 2396: http://www.faqs.org/rfcs/rfc2396.html

[TAM]

Contributing thread:

- `bug in urlparse `__

-----------------------------------------------
Using an iterator instead of a tuple for \*args
-----------------------------------------------

Nick Coghlan suggested that in Python 3.0, the \*args extended function call syntax should produce an iterator instead of a tuple as it currently does.
That would mean that code like::

    output(*some_long_iterator)

would not load the entire iterator into memory before processing it.  I pointed him to a previous discussion Raymond Hettinger and I had about the subject that indicated that for \*args, sequences were preferable to iterators in a number of situations.  Guido agreed, indicating that \*args will continue to be sequences in Python 3.0.

[SJB]

Contributing thread:

- `iterators and extended function call syntax (WAS: Replacement for print in Python 3.0) `__

----------------------------------------
Constructing traceback objects in Python
----------------------------------------

Contributing thread:

- `Asynchronous use of Traceback objects `__

-------------------------------
No dedent() methods for strings
-------------------------------

Contributing thread:

- `str.dedent `__

------------------------
Arguments vs. parameters
------------------------

- `Term unification `__

-----------------------------------------------------------------------
Removing sequence support from the return value of stat() in Python 3.0
-----------------------------------------------------------------------

Terry J. Reedy proposed that, in Python 3.0, instead of os.stat() returning a sequence (where the order of the items is only of historical significance), a proper stat object be returned.  This was met with general support, and so seems likely to occur.  Skip Montanaro also proposed that the st_ prefixes in the attribute names be removed, since there are no namespace issues to be concerned with, which met with some approval, but concern from Guido that the forms with the prefixes would be more familiar to users, and make Googling or grepping simpler.

[TAM]

Contributing thread:

- `stat() return value (was: Re: Proof of the pudding: str.partition()) `__

--------------------------------------------------
Making code in the Tools directory more accessible
--------------------------------------------------

Installation of Python typically doesn't include the Tools directory;
combined with the lack of mention of these scripts in the documentation,
this means that knowledge of these generally useful scripts is fairly
limited. Tim Peters noted that historically a Tools directory was only
added to the Windows installer if it was specifically requested; as
such, the audiopy, bgen, compiler, faqwiz, framer, modulator, msi,
unicode, and world Tools directories are not currently included in the
Windows installer. Nick Coghlan added that Tools/README.txt isn't
included in the Windows installer, so Windows users don't get a synopsis
of the tools that are included; he also suggested that adding this
readme to the "undocumented modules" section of the standard library
would be a simple improvement. Non-Windows users typically don't get the
Tools directory at all with an install. Remaining questions included how
the directory should be documented (e.g. man pages for the scripts, a
documentation page for them), where to install them on non-Windows
installations (e.g. /usr/share/python, /usr/lib/pythonX.Y/Tools), and
whether the Windows installer should include all of the directories.

[TAM]

Contributing thread:

- `Tools directory (Was RE: Replacement for print in Python 3.0) `__

----------------------------------
Responsiveness of IDLE development
----------------------------------

Noam Raphael posted a request for help getting a large patch to IDLE
committed to CVS. He was concerned that there hasn't been any IDLE
development recently, and that patches are not being considered.
He indicated that his group was considering offering a fork of IDLE with
the improvements, but that they would much prefer integrating the
improvements into the core distribution. It was pointed out that a fork
might be the best solution, for various reasons (e.g. the improvements
may not be of general interest, the release time would be much quicker),
and that this was how the current version of IDLE was developed. The
discussion died out, so it seems likely that a fork will be the
resulting solution.

[TAM]

Contributing thread:

- `IDLE development `__

-----------------------------
Speeding up list append calls
-----------------------------

A `comp.lang.python message from Tim Peters`_ prompted Neal Norwitz to
investigate how the code that Tim posted could be sped up. He hacked the
code to replace var.append() with the LIST_APPEND opcode, and achieved a
roughly 200% speed increase. Although this doesn't work in general, Neal
wondered if it could be used as a peephole optimization when a variable
is known to be a list. Martin v. Löwis suggested that the code could
simply check whether it was a list; Phillip J. Eby and Fredrik Lundh
pointed out that this is similar to what various math operators do (e.g.
speeding up int + int calls).

.. _comp.lang.python message from Tim Peters: http://groups.google.com/group/comp.lang.python/msg/9075a3bc59c334c9

[TAM]

Contributing thread:

- `speeding up list append calls `__

------------------------------------
Allowing str.strip to remove "words"
------------------------------------

Jonny Reichwald proposed an enhancement to str.strip(): in addition to
its current form, where it takes a string of characters to strip, it
would accept any iterable containing either character lists or string
lists, so that it is possible to remove entire words from the stripped
string.

For example::

    #A char list gives the same result as the standard strip
    >>> my_strip("abcdeed", "de")
    'abc'

    #A list of strings instead
    >>> my_strip("abcdeed", ("ed",))
    'abcde'

    #The char order in the strings to be stripped are of importance
    >>> my_strip("abcdeed", ("ad", "eb"))
    'abcdeed'

Raymond Hettinger queried whether there was actual demand for such a
change, and whether such demand was sufficient to justify the added
complexity; Josiah Carlson also pointed out that implementing this only
requires a four-line function. Judging from the lack of responses, it
seems likely that there isn't enough demand.

Contributing thread:

- `str.strip() enhancement `__

================
Deferred Threads
================

- `C coding experiment `__
- `os.path.diff(path1, path2) `__

===============
Skipped Threads
===============

- `import exceptions `__
- `[Python-checkins] python/dist/src/Lib/test test_re.py, 1.45.6.3, 1.45.6.4 `__
- `setdefault's second argument `__
- `Alternative imports (Re: Python 3 design principles) `__
- `python/dist/src/Lib/test test_re.py, 1.45.6.3, 1.45.6.4 `__
- `Status of PEP 328 `__
- `Weekly Python Patch/Bug Summary `__
- `itertools.chain should take an iterable ? `__
- `partition() (was: Remove str.find in 3.0?) `__
- `gdbinit problem `__
- `Exception Reorg PEP checked in `__
- `international python `__
- `SIGPIPE => SIG_IGN? `__
- `[draft] python-dev Summary for 2005-08-16 through 2005-08-31 `__
- `[Python-checkins] python/dist/src/Lib urllib.py, 1.169, 1.170 `__
- `Wanting to learn `__
- `Python code.interact() and UTF-8 locale `__
- `pygettext() without newlines (Was: Re: Replacement for print in Python 3.0) `__
- `Python 3 executable name (was: Re: PEP 3000 and iterators) `__
- `Python 3 executable name `__
- `Skiping searching throw dictionaries of mro() members.
`__
- `Fwd: [Python-checkins] python/dist/src/Misc NEWS, 1.1193.2.94, 1.1193.2.95 `__
- `[Python-checkins] python/dist/src/Lib/test regrtest.py, 1.171, 1.172 test_ioctl.py, 1.2, 1.3 `__
- `python/dist/src/Lib urllib.py, 1.165.2.1, 1.165.2.2 `__
- `Variant of removing GIL. `__
- `Compatibility between Python 2.3.x and Python 2.4.x `__
- `Example for "property" violates "Python is not a one pass compiler" `__
- `python optimization `__

From guido at python.org  Sat Oct 22 03:39:47 2005
From: guido at python.org (Guido van Rossum)
Date: Fri, 21 Oct 2005 18:39:47 -0700
Subject: [Python-Dev] DRAFT: python-dev Summary for 2005-09-01 through 2005-09-16
In-Reply-To: <3A5D2942-E0C3-4768-94D5-E43B914BB80B@ihug.co.nz>
References: <3A5D2942-E0C3-4768-94D5-E43B914BB80B@ihug.co.nz>
Message-ID:

On 10/21/05, Tony Meyer wrote:
> This is over a month late, sorry, but here it is (Steve did his
> threads ages ago; I've fallen really behind).

Better late than never! These summaries are awesome. Just one nit:

> ----------------------------------
> Responsiveness of IDLE development
> ----------------------------------
>
> Noam Raphael posted a request for help getting a large patch to IDLE
> committed to CVS. He was concerned that there hasn't been any IDLE
> development recently, and that patches are not being considered. He
> indicated that his group was considering offering a fork of IDLE with
> the improvements, but that they would much prefer integrating the
> improvements into the core distribution.
>
> It was pointed out that a fork might be the best solution, for
> various reasons (e.g. the improvements may not be of general
> interest, the release time would be much quicker), and that this was
> how the current version of IDLE was developed. The dicussion died
> out, so it seems likely that a fork will be the resulting solution.

Later, it turned out that Kurt Kaiser had missed this message on
python-dev (which he only reads occasionally); he redirected the thread
to idle-dev where it seems that his issues with the contribution are
being resolved and a fork is averted. Whew!

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From tim.peters at gmail.com  Sat Oct 22 04:52:36 2005
From: tim.peters at gmail.com (Tim Peters)
Date: Fri, 21 Oct 2005 22:52:36 -0400
Subject: [Python-Dev] int(string) (was: DRAFT: python-dev Summary for 2005-09-01 through 2005-09-16)
Message-ID: <1f7befae0510211952x5eb2000bicdf3c1a80a3f5749@mail.gmail.com>

...
> -----------------------------
> Speeding up list append calls
> -----------------------------
>
> A `comp.lang.python message from Tim Peters`_ prompted Neal Norwitz
> to investigate how the code that Tim posted could be sped up. He
> hacked the code to replace var.append() with the LIST_APPEND opcode,
> ....

Someone want a finite project that would _really_ help their Uncle
Timmy in his slow-motion crusade to get Python on the list of "solved
it!" languages for each problem on that magnificent site?

    http://spoj.sphere.pl

It turns out that many of the problems there have input encoded as vast
quantities of integers (stdin is a mass of decimal integers on one or
more lines). Most infamous for Python is this tutorial (you don't get
points for solving it) problem, which is _trying_ to test whether your
language of choice can read from stdin "fast enough":

    http://spoj.sphere.pl/problems/INTEST/

"""
The input begins with two positive integers n k (n, k<=10**7). The
next n lines of input contain one positive integer t_i, not greater
than 10**9, each.

Output

Write a single integer to output, denoting how many integers t_i are
divisable by k.

Example

Input:
7 3
1
51
966369
7
9
999996
11

Output:
4
"""

There's an 8-second time limit, and I believe stdin is about 8MB
(you're never allowed to see the actual input they use).
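
The arithmetic INTEST asks for is trivial, which is the point; a direct,
unoptimized sketch (illustrative names only, nothing like the tuned
submission described next) shows that nearly all the work is int() calls
on the input tokens:

```python
def count_divisible(data):
    # First two whitespace-separated tokens are n and k; the remaining
    # n tokens are the t_i values.  A real submission would feed this
    # sys.stdin.read(), which is exactly where the int() calls dominate.
    tokens = data.split()
    k = int(tokens[1])
    return sum(1 for t in tokens[2:] if int(t) % k == 0)

# The example from the problem statement:
print(count_divisible("7 3\n1\n51\n966369\n7\n9\n999996\n11\n"))  # 4
```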

They have a slower machine than you use ;-), so it's harder than it
sounds. To date, 975 people have submitted a program that passed, but
only a few managed to do it using Python. I did, and it required every
trick in the book, including using psyco.

Turns out it's _not_ input speed that's the problem here, and not even
mainly the speed of integer mod: the bulk of the time is spent in
int(string) (and, yes, that's also far more important to the problem
Neal was looking at than list.append time). If you can even track all
the levels of C function calls that ends up invoking <wink>, you find
yourself in PyOS_strtoul(), which is a nifty all-purpose routine that
accepts inputs in bases 2 thru 36, can auto-detect base, and does
platform-independent overflow checking at the cost of a division per
digit. All those are features, but it makes for sloooow conversion.

I assume it's the overflow-checking that's the major time sink, and
it's not correct anyway: it does the check slightly differently for
base 10 than for any other base, explained only in the checkin comment
for rev 2.13, 8 years ago:

    For base 10, cast unsigned long to long before testing overflow.
    This prevents 4294967296 from being an acceptable way to spell zero!

So what are the odds that base 10 was the _only_ base that had a "bad
input" case for the overflow-check method used?
If you thought "slim", you were right ;-) Here are other bad cases,
under all Python versions to date (on a 32-bit box; if sizeof(long) ==
8, there are different bad cases):

    int('102002022201221111211', 3) = 0
    int('32244002423141', 5) = 0
    int('1550104015504', 6) = 0
    int('211301422354', 7) = 0
    int('12068657454', 9) = 0
    int('1904440554', 11) = 0
    int('9ba461594', 12) = 0
    int('535a79889', 13) = 0
    int('2ca5b7464', 14) = 0
    int('1a20dcd81', 15) = 0
    int('a7ffda91', 17) = 0
    int('704he7g4', 18) = 0
    int('4f5aff66', 19) = 0
    int('3723ai4g', 20) = 0
    int('281d55i4', 21) = 0
    int('1fj8b184', 22) = 0
    int('1606k7ic', 23) = 0
    int('mb994ag', 24) = 0
    int('hek2mgl', 25) = 0
    int('dnchbnm', 26) = 0
    int('b28jpdm', 27) = 0
    int('8pfgih4', 28) = 0
    int('76beigg', 29) = 0
    int('5qmcpqg', 30) = 0
    int('4q0jto4', 31) = 0
    int('3aokq94', 33) = 0
    int('2qhxjli', 34) = 0
    int('2br45qb', 35) = 0
    int('1z141z4', 36) = 0

IOW, the only bases that _aren't_ "bad" are powers of 2, and 10 because
it's special-cased (BTW, I'm not sure that base 10 doesn't have a
different bad case now, but don't care enough to prove it one way or
the other).

Now fixing that is easy: the problem comes from being too clever, doing
both a multiply and an addition before checking for overflow. Check
each operation on its own and it would be bulletproof, without
special-casing. But that might be even slower (it would remove the
branch special-casing 10, but add a cheap integer addition overflow
check with its own branch).

The challenge (should you decide to accept it <wink>) is to replace the
overflow-checking with something both correct _and_ much faster than
doing n integer divisions for an n-character input. For example,
36**6 < 2**32-1, so whenever the input has no more than 6 digits
overflow is impossible regardless of base and regardless of platform.
That's simple and exploitable.
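
The "6 digits are always safe" bound generalizes per base; a quick Python
sketch (a hypothetical helper, not anything in CPython) computes the digit
count below which an overflow-free fast path could skip checking entirely:

```python
def max_safe_digits(base, bits=32):
    # Largest n such that every n-digit string in this base fits in an
    # unsigned value of the given width, so no overflow check is needed
    # for the first n digits converted.
    limit = 2**bits - 1
    n = 0
    while base**(n + 1) - 1 <= limit:
        n += 1
    return n

print(max_safe_digits(36))  # 6  (Tim's worst-case bound)
print(max_safe_digits(10))  # 9  (covers most real-life decimal inputs)
```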
For extra credit, make int(string) go faster than preparing your taxes ;-)

BTW, Python as-is can be used to solve many (I'd bet most) of these
problems in the time limit imposed, although it may take some effort,
and it may not be possible without using psyco. A Python triumph I'm
particularly fond of:

    http://spoj.sphere.pl/problems/FAMILY/

The legend at the bottom:

    Warning: large Input/Output data, be careful with certain languages

seems to be a euphemism for "don't even think about using Python" <0.9
wink>. But there's a big difference in this one: it's a _hard_ problem,
requiring graph analysis, delicate computation, greater than
double-precision precision (in the end), and can hugely benefit from
preprocessing a batch of queries to plan out and minimize the number of
operations needed.

Five people have solved it to date (click on "Best Solutions"), and
you'll see that my Python entry is the second-fastest so far, beating 3
C++ entries by 3 excellent C++ programmers. I don't know what they did,
but I suspect I was far more willing to code up an effective but
tedious "plan out and minimize" phase _because_ I was using Python. I
sure didn't beat them on reading the mass quantities of integers from
stdin <wink>.

From hyeshik at gmail.com  Sat Oct 22 07:55:05 2005
From: hyeshik at gmail.com (Hye-Shik Chang)
Date: Sat, 22 Oct 2005 14:55:05 +0900
Subject: [Python-Dev] LXR site for Python CVS
Message-ID: <4f0b69dc0510212255t61185aa5x4e4b8e253e0c2573@mail.gmail.com>

Hi, I just set up a LXR instance for Python CVS for my personal use:

    http://pxr.openlook.org/pxr/

If you find it useful, feel free to use the site. :) The source files
will be updated twice a day.

Hye-Shik

From mwh at python.net  Sat Oct 22 10:38:46 2005
From: mwh at python.net (Michael Hudson)
Date: Sat, 22 Oct 2005 09:38:46 +0100
Subject: [Python-Dev] int(string)
In-Reply-To: <1f7befae0510211952x5eb2000bicdf3c1a80a3f5749@mail.gmail.com>
	(Tim Peters's message of "Fri, 21 Oct 2005 22:52:36 -0400")
References: <1f7befae0510211952x5eb2000bicdf3c1a80a3f5749@mail.gmail.com>
Message-ID: <2mfyqu6j55.fsf@starship.python.net>

Tim Peters writes:

> Turns out it's _not_ input speed that's the problem here, and not even
> mainly the speed of integer mod: the bulk of the time is spent in
> int(string) (and, yes, that's also far more important to the problem
> Neal was looking at than list.append time). If you can even track all
> the levels of C function calls that ends up invoking <wink>, you find
> yourself in PyOS_strtoul(), which is a nifty all-purpose routine that
> accepts inputs in bases 2 thru 36, can auto-detect base, and does
> platform-independent overflow checking at the cost of a division per
> digit. All those are features, but it makes for sloooow conversion.
>
> I assume it's the overflow-checking that's the major time sink, and
> it's not correct anyway: it does the check slightly differently for
> base 10 than for any other base, explained only in the checkin comment
> for rev 2.13, 8 years ago:
>
> For base 10, cast unsigned long to long before testing overflow.
> This prevents 4294967296 from being an acceptable way to spell zero!
>
> So what are the odds that base 10 was the _only_ base that had a "bad
> input" case for the overflow-check method used? If you thought
> "slim", you were right ;-) Here are other bad cases, under all Python
> versions to date (on a 32-bit box; if sizeof(long) == 8, there are
> different bad cases):
>
> int('102002022201221111211', 3) = 0
[...]

Eek!

> Now fixing that is easy: the problem comes from being too clever,

Surprise!

> doing both a multiply and an addition before checking for overflow.
> Check each operation on its own and it would be bulletproof, without
> special-casing. But that might be even slower (it would remove the
> branch special-casing 10, but add a cheap integer addition overflow
> check with its own branch).
>
> The challenge (should you decide to accept it <wink>) is to replace
> the overflow-checking with something both correct _and_ much faster
> than doing n integer divisions for an n-character input. For example,
> 36**6 < 2**32-1, so whenever the input has no more than 6 digits
> overflow is impossible regardless of base and regardless of platform.
> That's simple and exploitable. For extra credit, make int(string) go
> faster than preparing your taxes ;-)

So, you're suggesting dividing the input up into known non-overflowing
chunks and using the normal Python operations to combine those chunks,
relying on them overflowing to longs as needed?

All of the examples you posted should have returned longs anyway,
right? I guess the change to automatically overflowing to longs has led
to some code that shows its history more than one would like.

Cheers,
mwh

--
I think if we have the choice, I'd rather we didn't explicitly put
flaws in the reST syntax for the sole purpose of not insulting the
almighty. -- /will on the doc-sig

From ncoghlan at gmail.com  Sat Oct 22 10:54:13 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 22 Oct 2005 18:54:13 +1000
Subject: [Python-Dev] DRAFT: python-dev Summary for 2005-09-01 through 2005-09-16
In-Reply-To:
References: <3A5D2942-E0C3-4768-94D5-E43B914BB80B@ihug.co.nz>
Message-ID: <4359FE35.9010405@gmail.com>

Guido van Rossum wrote:
> On 10/21/05, Tony Meyer wrote:
>> This is over a month late, sorry, but here it is (Steve did his
>> threads ages ago; I've fallen really behind).
>
> Better late than never! These summaries are awesome.

I certainly find them to be a very useful reminder of list threads that
got overwhelmed by other discussions.

I'm still trying to close out the naming issues for PEP 343, but I hope
to get back to the "Template.format" method idea eventually (along with
an idea inspired by the discussion of the module level functions in the
're' module - how about providing similar module level functions in the
string module that correspond to the methods of Template objects?).

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia
---------------------------------------------------------------
http://boredomandlaziness.blogspot.com

From skip at pobox.com  Sat Oct 22 13:48:17 2005
From: skip at pobox.com (skip@pobox.com)
Date: Sat, 22 Oct 2005 06:48:17 -0500
Subject: [Python-Dev] Comparing date+time w/ just time
Message-ID: <17242.9985.574217.23379@montanaro.dyndns.org>

With significant input from Fred I made some changes to xmlrpclib a
couple months ago to better integrate datetime objects into xmlrpclib.
That raised some problems because I neglected to add support for
comparing datetime objects with xmlrpclib.DateTime objects. (The
problem showed up in MoinMoin.) I've been working on that recently
(adding rich comparison methods to DateTime while retaining __cmp__ for
backward compatibility), and have second thoughts about one of the
original changes.

I tried to support datetime, date and time objects. My problems are
with support for time objects. Marshalling datetimes as
xmlrpclib.DateTime objects is no problem (though you lose fractions of
a second). Marshalling dates is reasonable if you treat the time as
00:00:00. I decided to marshal datetime.time objects by fixing the day
portion of the xmlrpclib.DateTime object as today's date. That's the
suspect part.

When I went back recently to add better comparison support, I decided
to compare xmlrpclib.DateTime objects with time objects by simply
comparing the HH:MM:SS part of the DateTime with the time object.
That's making me a bit queasy now.
datetime.time(hour=23) would compare equal to any DateTime with its
time equal to 11PM. Under the rule, "in the face of ambiguity, refuse
the temptation to guess", I'm inclined to dump support for marshalling
and comparison of time objects altogether. Do others agree that was a
bad idea?

Thx,

Skip

From guido at python.org  Sat Oct 22 15:58:16 2005
From: guido at python.org (Guido van Rossum)
Date: Sat, 22 Oct 2005 06:58:16 -0700
Subject: [Python-Dev] Comparing date+time w/ just time
In-Reply-To: <17242.9985.574217.23379@montanaro.dyndns.org>
References: <17242.9985.574217.23379@montanaro.dyndns.org>
Message-ID:

On 10/22/05, skip at pobox.com wrote:
> With significant input from Fred I made some changes to xmlrpclib a couple
> months ago to better integrate datetime objects into xmlrpclib. That raised
> some problems because I neglected to add support for comparing datetime
> objects with xmlrpclib.DateTime objects. (The problem showed up in
> MoinMoin.) I've been working on that recently (adding rich comparison
> methods to DateTime while retaining __cmp__ for backward compatibility), and
> have second thoughts about one of the original changes.
>
> I tried to support datetime, date and time objects. My problems are with
> support for time objects. Marshalling datetimes as xmlrpclib.DateTime
> objects is no problem (though you lose fractions of a second). Marshalling
> dates is reasonable if you treat the time as 00:00:00. I decided to marshal
> datetime.time objects by fixing the day portion of the xmlrpclib.DateTime
> object as today's date. That's the suspect part.
>
> When I went back recently to add better comparison support, I decided to
> compare xmlrpclib.DateTime objects with time objects by simply comparing the
> HH:MM:SS part of the DateTime with the time object. That's making me a bit
> queazy now. datetime.time(hour=23) would compare equal to any DateTime with
> its time equal to 11PM.
Under the rule, "in the face of ambiguity, refuse
> the temptation to guess", I'm inclined to dump support for marshalling and
> comparison of time objects altogether. Do others agree that was a bad idea?

Agreed.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From ncoghlan at iinet.net.au  Sat Oct 22 15:58:48 2005
From: ncoghlan at iinet.net.au (Nick Coghlan)
Date: Sat, 22 Oct 2005 23:58:48 +1000
Subject: [Python-Dev] Proposed resolutions for open PEP 343 issues
Message-ID: <435A4598.3060403@iinet.net.au>

I'm still looking for more feedback on the issues raised in the last
update of PEP 343. There hasn't been much direct feedback so far, but
I've rephrased and suggested resolutions for the outstanding issues
based on what feedback I have received, and my own thoughts over the
last week or so.

For those simply skimming, my proposed issue resolutions are:

1. Use the slot name "__context__" instead of "__with__"
2. Reserve the builtin name "context" for future use as described below
3a. Give generator-iterators a native context that invokes self.close()
3b. Use "contextmanager" as a builtin decorator to get generator-contexts
4. Special case the __context__ slot to avoid the need to decorate it

For those that disagree with the proposed resolutions above, or simply
would like more details, here's the reasoning:

1. Should the slot be named "__with__" or "__context__"?

Guido raised this as a side comment during the discussion of PJE's task
variables pre-PEP, and it's a fair question. The closest analogous slot
method ("__iter__") is named after the protocol it relates to, rather
than the associated statement/expression keyword (that is, the method
isn't called "__for__"). The next closest analogous slot is one that
doesn't actually exist yet - the proposed "boolean" protocol. This
again uses the name of a protocol rather than the associated keyword
(that is, the name "__bool__" was suggested rather than "__if__").

At the moment, PEP 343 makes the opposite choice - it uses the keyword,
rather than the protocol name (that is, it uses "__with__" instead of
using "__context__"). That inconsistency would be a bad thing, in my
opinion, and I propose that the slot should instead be named
"__context__".

2. If the slot is called "__context__" what should a "context" builtin do?

Again, considering existing slot names, a slot with a given name is
generally invoked by the builtin type or function with the same name.
This is true of the builtin types, and also true of iter, cmp and pow.
getattr, setattr and delattr get in on the act as well. So, to be
consistent, one would expect a "context" builtin to be able to be used
such that "context(x)" invoked "x.__context__()". Such a method might
special-case certain types, or have a two-argument form that accepted
an "enter" function and an "exit" function, but using it to mark a
generator that is to be used as a context manager (as currently
suggested in PEP 343) would not be appropriate.

I don't mind either way whether or not a "context" builtin is actually
included for Python 2.5. However, even if it isn't included, the name
should be reserved for that purpose (that is, we shouldn't be using it
to name a generator decorator or a module).

3. How should generators behave when used as contexts?

With PEP 342 accepted, generators pose a problem, because they have two
possible uses as contexts. The first is for a generator that is
intended to be used as an actual iterator. This case is a case of
resource management - ensuring the close method is invoked on the
generator-iterator when done with it (i.e., similar to the proposed
native context for files).

PEP 343 proposes a second use case for generators - to write custom
context managers. In this case, the __enter__ method steps the
generator once, and the __exit__ method finishes the job.
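
The decorator-based spelling described above is, apart from the final
name, what eventually shipped as contextlib.contextmanager in Python
2.5; a minimal example of the two-phase behaviour (code before the
yield runs during __enter__, code after it during __exit__):

```python
from contextlib import contextmanager

@contextmanager
def tag(name, out):
    out.append("<%s>" % name)       # runs during __enter__
    try:
        yield out                   # value bound by "with ... as"
    finally:
        out.append("</%s>" % name)  # runs during __exit__, even on error

buf = []
with tag("b", buf) as body:
    body.append("text")
print(buf)  # ['<b>', 'text', '</b>']
```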

I propose that we give generator-iterator objects resource management
behaviour by default (i.e., __context__ and __enter__ methods that just
"return self", and an __exit__ method that invokes "self.close()"). The
"contextmanager" builtin decorator from previous drafts of the PEP
(called simply "context" in the current draft) can then be used to get
the custom context manager behaviour.

I previously thought giving generators a native context caused problems
with getting silent failures when the "contextmanager" decorator was
inadvertently omitted. This is still technically true - the "with"
statement itself won't raise a TypeError because the generator is a
legal context. However, with this bug, the context manager won't be
getting entered *at all* (it gets closed without its next() method ever
being called). Even the most cursory testing of the generator-context
function should be able to tell whether the generator-context is being
entered or not.

The main alternative (having yet-another-decorator to give generators
"auto-close" functionality) would be possible, but the additional
builtin clutter would be getting to the point where it started to
concern me. And given that "yield" inside "try/finally" is now always
legal, I consider it reasonable that using a generator in a "with"
statement would also always be legal. Further, if type_new special
cases __context__ as suggested below, then the context behaviour of
generators used to define "__iter__" and "__context__" slots will
always be appropriate.

4. Should the __context__ slot be special-cased in type_new?

Currently, type_new special cases the "__new__" slot and automatically
applies the staticmethod decorator when it finds a function occupying
that slot in the class attribute dictionary. I propose that type_new
also special case the situation where the "__context__" slot is
occupied by a generator function, and automatically apply the
"contextmanager" decorator.
This looks much nicer when using a generator to write a __context__
function, and also avoids the situation where the decorator is omitted,
and the object becomes legal to use directly in with statements but
doesn't actually do the right thing.

Regards,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia
---------------------------------------------------------------
http://boredomandlaziness.blogspot.com

From jim at zope.com  Sat Oct 22 16:21:28 2005
From: jim at zope.com (Jim Fulton)
Date: Sat, 22 Oct 2005 10:21:28 -0400
Subject: [Python-Dev] Comparing date+time w/ just time
In-Reply-To: <17242.9985.574217.23379@montanaro.dyndns.org>
References: <17242.9985.574217.23379@montanaro.dyndns.org>
Message-ID: <435A4AE8.6010708@zope.com>

skip at pobox.com wrote:
> With significant input from Fred I made some changes to xmlrpclib a couple
> months ago to better integrate datetime objects into xmlrpclib. That raised
> some problems because I neglected to add support for comparing datetime
> objects with xmlrpclib.DateTime objects. (The problem showed up in
> MoinMoin.) I've been working on that recently (adding rich comparison
> methods to DateTime while retaining __cmp__ for backward compatibility), and
> have second thoughts about one of the original changes.
>
> I tried to support datetime, date and time objects. My problems are with
> support for time objects. Marshalling datetimes as xmlrpclib.DateTime
> objects is no problem (though you lose fractions of a second). Marshalling
> dates is reasonable if you treat the time as 00:00:00.

I don't think that is reasonable at all. I would normally expect a date
to represent the whole day, not a particular, unspecified time. Other
people may have other expectations, but xmlrpclib should not assume a
particular interpretation.

> I decided to marshal
> datetime.time objects by fixing the day portion of the xmlrpclib.DateTime
> object as today's date. That's the suspect part.

Very very suspect.
:)

> When I went back recently to add better comparison support, I decided to
> compare xmlrpclib.DateTime objects with time objects by simply comparing the
> HH:MM:SS part of the DateTime with the time object. That's making me a bit
> queazy now. datetime.time(hour=23) would compare equal to any DateTime with
> its time equal to 11PM. Under the rule, "in the face of ambiguity, refuse
> the temptation to guess", I'm inclined to dump support for marshalling and
> comparison of time objects altogether. Do others agree that was a bad idea?

I agree that it was a bad idea and that you should not try to marshal
time objects or compare time objects with DateTime objects.

Similarly, I strongly recommend that you also stop trying to marshal
date objects or compare date objects to DateTime objects. After all, if
the datetime module doesn't allow comparison of date and datetime, why
should you try to compare date and DateTime?

Jim

--
Jim Fulton mailto:jim at zope.com Python Powered!
CTO (540) 361-1714 http://www.python.org
Zope Corporation http://www.zope.com http://www.zope.org

From tim.peters at gmail.com  Sat Oct 22 18:13:53 2005
From: tim.peters at gmail.com (Tim Peters)
Date: Sat, 22 Oct 2005 12:13:53 -0400
Subject: [Python-Dev] int(string)
In-Reply-To: <2mfyqu6j55.fsf@starship.python.net>
References: <1f7befae0510211952x5eb2000bicdf3c1a80a3f5749@mail.gmail.com>
	<2mfyqu6j55.fsf@starship.python.net>
Message-ID: <1f7befae0510220913te4f0975k95932d0515b6ce59@mail.gmail.com>

[Tim]
...
>> int('102002022201221111211', 3) = 0

I should have added that all those examples simply used 2**32 as input,
expressed as a string in the input base. They're not the only failing
cases; e.g., this is also obviously wrong:

    >>> int('102002022201221111212', 3)
    1

...
>> The challenge (should you decide to accept it <wink>) is to replace
>> the overflow-checking with something both correct _and_ much faster
>> than doing n integer divisions for an n-character input.
For example,
>> 36**6 < 2**32-1, so whenever the input has no more than 6 digits
>> overflow is impossible regardless of base and regardless of platform.
>> That's simple and exploitable. For extra credit, make int(string) go
>> faster than preparing your taxes ;-)

[Michael Hudson]
> So, you're suggesting dividing the input up into known non-overflowing
> chunks and using the normal Python operations to combine those chunks,
> relying on them overflowing to longs as needed?

Possibly. I want int(str), for the comparatively short decimal strings
most apps convert most of the time, to be much faster too. The
_simplest_ thing one could do with the observation is add a
number-of-digits counter to PyOS_strtoul's loop, skip the overflow
check entirely for the first six digits converted, and for every digit
(if any) after the sixth do "obviously correct" overflow checking. That
would save min(len(s), 6) integer divisions per call, and would
probably be a real speed win for most apps that do a lot of
int(string). Slightly more ambitious would be to use a different
constant per base; e.g., for base 10 overflow is impossible if there
are no more than 9 digits, and exploiting that would buy that
int(decimal_str) would almost never need to do an integer division in
most apps.

The strategy you suggest could, if implemented carefully, speed all
int(string) and long(string) operations, except for long(string, base)
where base is a power of 2 (the latter case is highly optimized
already, in longobject.c's long_from_binary_base). Speeding
long(string) for non-power-of-2 bases is tricky. It benefits already
from the internal muladd1() routine, which does the "multiply by the
base and add in the next digit" step in one gulp, mutating the C
representation of a long directly. That's a very efficient loop in
part because it _knows_ the base fits in a single "Python long digit".
Combining larger chunks _could_ be faster, but the multiplication problem gets harder if base**chunk_size exceeds a single Python long digit. So there are a world of possible complications here. I'd be delighted to see "just" correct overflow checking plus a major speed boost for int(decimal_string) where the result does fit in a 32-bit unsigned int (which I'm sure accounts for the vast bulk of dynamic real-life int(string) invocations). > All of the examples you posted should have returned longs anyway, right? On a 32-bit box, yes. Regardless of box, all of the original examples should return 2**32. The one at the top of this message should return 2**32+1. > I guess the change to automatically overflowing to longs has led to > some code that shows its history more than one would like. Well, these particular cases were always broken -- they always returned 0. The difference is that in modern Pythons they should return the right answer, while in older Pythons they should have raised OverflowError. From rhamph at gmail.com Sat Oct 22 18:49:38 2005 From: rhamph at gmail.com (Adam Olsen) Date: Sat, 22 Oct 2005 10:49:38 -0600 Subject: [Python-Dev] int(string) In-Reply-To: <2mfyqu6j55.fsf@starship.python.net> References: <1f7befae0510211952x5eb2000bicdf3c1a80a3f5749@mail.gmail.com> <2mfyqu6j55.fsf@starship.python.net> Message-ID: > Tim Peters writes: > > > Turns out it's _not_ input speed that's the problem here, and not even > > mainly the speed of integer mod: the bulk of the time is spent in > > int(string) (and, yes, that's also far more important to the problem > > Neal was looking at than list.append time). If you can even track all > > the levels of C function calls that ends up invoking , you find > > yourself in PyOS_strtoul(), which is a nifty all-purpose routine that > > accepts inputs in bases 2 thru 36, can auto-detect base, and does > > platform-independent overflow checking at the cost of a division per > > digit. 
All those are features, but it makes for sloooow conversion. > > > I assume it's the overflow-checking that's the major time sink, Are you sure? https://sourceforge.net/tracker/index.php?func=detail&aid=1334979&group_id=5470&atid=305470 That patch removes the division from the loop (and fixes the bugs), but gives only a small increase in speed. -- Adam Olsen, aka Rhamphoryncus From guido at python.org Sat Oct 22 19:22:56 2005 From: guido at python.org (Guido van Rossum) Date: Sat, 22 Oct 2005 10:22:56 -0700 Subject: [Python-Dev] Proposed resolutions for open PEP 343 issues In-Reply-To: <435A4598.3060403@iinet.net.au> References: <435A4598.3060403@iinet.net.au> Message-ID: On 10/22/05, Nick Coghlan wrote: > I'm still looking for more feedback on the issues raised in the last update of > PEP 343. There hasn't been much direct feedback so far, but I've rephrased and > suggested resolutions for the outstanding issues based on what feedback I have > received, and my own thoughts over the last week or so. Thanks for bringing this up again. It's been at the back of my mind, but hasn't had much of a chance to come to the front lately...

> For those simply skimming, my proposed issue resolutions are:
>
> 1. Use the slot name "__context__" instead of "__with__"

+1

> 2. Reserve the builtin name "context" for future use as described below

+0.5. I don't think we'll need that built-in, but I do think that the term "context" is too overloaded to start using it for anything in particular.

> 3a. Give generator-iterators a native context that invokes self.close()

I'll have to think about this one more, and I don't have time for that right now.

> 3b. Use "contextmanager" as a builtin decorator to get generator-contexts

+1

> 4. Special case the __context__ slot to avoid the need to decorate it

-1. I expect that we'll also see generator *functions* (not methods) as context managers. The functions need the decorator. For consistency the methods should also be decorated explicitly.
For example, while I'm now okay (at the +0.5 level) with having files automatically behave like context managers, one could still write an explicit context manager 'opening':

@contextmanager
def opening(filename):
    f = open(filename)
    try:
        yield f
    finally:
        f.close()

Compare to

class FileLike:
    def __init__(self, ...):
        ...
    def close(self):
        ...
    @contextmanager
    def __context__(self):
        try:
            yield self
        finally:
            self.close()

-- --Guido van Rossum (home page: http://www.python.org/~guido/) From fdrake at acm.org Sat Oct 22 19:26:35 2005 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Sat, 22 Oct 2005 13:26:35 -0400 Subject: [Python-Dev] Comparing date+time w/ just time In-Reply-To: <17242.9985.574217.23379@montanaro.dyndns.org> References: <17242.9985.574217.23379@montanaro.dyndns.org> Message-ID: <200510221326.36225.fdrake@acm.org> On Saturday 22 October 2005 07:48, skip at pobox.com wrote: > ..., I'm inclined to dump support > for marshalling and comparison of time objects altogether. Do others > agree that was a bad idea? Very much. As Jim notes, supporting date objects is more than a little questionable as well. Dates and times, separate from a date-time, are completely unsupported by the bare XML-RPC protocol. Applications must determine what they mean and how to encode them in XML-RPC separately if they need to do so. -Fred -- Fred L. Drake, Jr. From tim.peters at gmail.com Sat Oct 22 19:38:11 2005 From: tim.peters at gmail.com (Tim Peters) Date: Sat, 22 Oct 2005 13:38:11 -0400 Subject: [Python-Dev] int(string) In-Reply-To: References: <1f7befae0510211952x5eb2000bicdf3c1a80a3f5749@mail.gmail.com> <2mfyqu6j55.fsf@starship.python.net> Message-ID: <1f7befae0510221038u4bd75ca5l67dc930ae16d24b7@mail.gmail.com> [Tim] >> I assume it's the overflow-checking that's the major time sink, [Adam Olsen] > Are you sure? No -- that's what "assume" means <0.7 wink>. For example, there's a long chain of function calls involved in int(string) too.
> > > That patch removes the division from the loop (and fixes the bugs), > but gives only a small increase in speed. As measured how? Platform, compiler, input, etc? Is the "ULONG_MAX / base" part compiled to inline code or to a call to a library routine (e.g., if the latter, it could be that a dividend with "the sign bit set" is extraordinarily expensive for unsigned division -- depends on the pair in use)? If so, a small static table could avoid all runtime division. If not, note that the number of divisions hasn't actually changed for 1-character input. Etc. In any case, I agree it _should_ fix the bugs (although it also needs new tests to verify that). From rhamph at gmail.com Sat Oct 22 20:03:45 2005 From: rhamph at gmail.com (Adam Olsen) Date: Sat, 22 Oct 2005 12:03:45 -0600 Subject: [Python-Dev] int(string) In-Reply-To: <1f7befae0510221038u4bd75ca5l67dc930ae16d24b7@mail.gmail.com> References: <1f7befae0510211952x5eb2000bicdf3c1a80a3f5749@mail.gmail.com> <2mfyqu6j55.fsf@starship.python.net> <1f7befae0510221038u4bd75ca5l67dc930ae16d24b7@mail.gmail.com> Message-ID: On 10/22/05, Tim Peters wrote: > [Tim] > >> I assume it's the overflow-checking that's the major time sink, > > [Adam Olsen] > > Are you sure? > > No -- that's what "assume" means <0.7 wink>. For example, there's a > long chain of function calls involved in int(string) too. > > > > > > > That patch removes the division from the loop (and fixes the bugs), > > but gives only a small increase in speed. > > As measured how? Platform, compiler, input, etc? Is the "ULONG_MAX / > base" part compiled to inline code or to a call to a library routine > (e.g., if the latter, it could be that a dividend with "the sign bit > set" is extraordinarily expensive for unsigned division -- depends on > the pair in use)? If so, a small static table could > avoid all runtime division. If not, note that the number of divisions > hasn't actually changed for 1-character input. Etc. 
AMD Athlon 2500+, Linux 2.6.13, GCC 4.0.2

rhamph at factor:~/src/Python-2.4.1$ python2.4 -m timeit 'int("999999999")'
1000000 loops, best of 3: 0.834 usec per loop
rhamph at factor:~/src/Python-2.4.1$ ./python -m timeit 'int("999999999")'
1000000 loops, best of 3: 0.801 usec per loop
rhamph at factor:~/src/Python-2.4.1$ python2.4 -m timeit 'int("9")'
1000000 loops, best of 3: 0.709 usec per loop
rhamph at factor:~/src/Python-2.4.1$ ./python -m timeit 'int("9")'
1000000 loops, best of 3: 0.717 usec per loop

Originally I just tried the longer string so I hadn't noticed that the smaller string was slightly slower. Oh well, caveat emptor. -- Adam Olsen, aka Rhamphoryncus From reinhold-birkenfeld-nospam at wolke7.net Sat Oct 22 22:51:24 2005 From: reinhold-birkenfeld-nospam at wolke7.net (Reinhold Birkenfeld) Date: Sat, 22 Oct 2005 22:51:24 +0200 Subject: [Python-Dev] Definining properties - a use case for class decorators? In-Reply-To: <4edc17eb0510200035u370b57f9ub1d66b4e99d1be62@mail.gmail.com> References: <43544CC1.5050204@canterbury.ac.nz> <1129601328.9405.13.camel@geddy.wooz.org> <4edc17eb0510190151v6a078f54n2135b8d8386c796a@mail.gmail.com> <5.1.1.6.0.20051019124840.01fa73b8@mail.telecommunity.com> <17238.40158.735826.504410@montanaro.dyndns.org> <4edc17eb0510200035u370b57f9ub1d66b4e99d1be62@mail.gmail.com> Message-ID: Michele Simionato wrote: > As others explained, the syntax would not work for functions (and it is > not intended to). > A possible use case I had in mind is to define inlined modules to be > used as bunches > of attributes. For instance, I could define a module as
>
> module m():
>     a = 1
>     b = 2
>
> where 'module' would be the following function:
>
> def module(name, args, dic):
>     mod = types.ModuleType(name, dic.get('__doc__'))
>     for k in dic: setattr(mod, k, dic[k])
>     return mod

Wow. This looks like an almighty tool. We can have modules, interfaces, classes, properties and the like with this. Guess a PEP would be nice.
Reinhold From guido at python.org Sat Oct 22 23:14:34 2005 From: guido at python.org (Guido van Rossum) Date: Sat, 22 Oct 2005 14:14:34 -0700 Subject: [Python-Dev] Proposed resolutions for open PEP 343 issues In-Reply-To: References: <435A4598.3060403@iinet.net.au> Message-ID: Here's another argument against automatically decorating __context__. What if I want to have a class with a __context__ method that returns a custom context manager that *doesn't* involve applying @contextmanager to a generator? While technically this is possible with your proposal (since such a method wouldn't be a generator), it's exceedingly subtle for the human reader. I'd much rather see the @contextmanager decorator to emphasize the difference. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From skip at pobox.com Sat Oct 22 23:28:00 2005 From: skip at pobox.com (skip@pobox.com) Date: Sat, 22 Oct 2005 16:28:00 -0500 Subject: [Python-Dev] Comparing date+time w/ just time In-Reply-To: <435A4AE8.6010708@zope.com> References: <17242.9985.574217.23379@montanaro.dyndns.org> <435A4AE8.6010708@zope.com> Message-ID: <17242.44768.957645.357325@montanaro.dyndns.org> Based on feedback from Jim and Fred, I took out date and time object marshalling and comparison. (Actually, you can still compare an xmlrpclib.DateTime object with a datetime.date object, because DateTime objects can be compared with anything that has a timetuple method.) There's a patch at http://python.org/sf/1330538 I went ahead and assigned it to Fred since he's worked with that code fairly recently. Skip From bokr at oz.net Sun Oct 23 01:49:53 2005 From: bokr at oz.net (Bengt Richter) Date: Sat, 22 Oct 2005 16:49:53 -0700 Subject: [Python-Dev] Divorcing str and unicode (no more implicit conversions). Message-ID: <5.0.2.1.1.20051022120117.02b58da0@mail.oz.net> Please bear with me for a few paragraphs ;-) One aspect of str-type strings is the efficiency afforded when all the encoding really is ascii. 
If the internal encoding were e.g. fixed utf-16le for strings, maybe with today's computers it would still be efficient enough for most actual string purposes (excluding the current use of str-strings as byte sequences). I.e., you'd still have to identify what was "strings" (of characters) and what was really byte sequences with no implied or explicit encoding or character semantics. Ok, let's make that distinction explicit: Call one kind of string a byte sequence and the other a character sequence (representation being a separate issue). A unicode object is of course the prime _general_ representation of a character sequence in Python, but all the names in python source code (that become NAME tokens) are UIAM also character sequences, and representable by a byte sequence interpreted according to ascii encoding. For the sake of discussion, suppose we had another _character_ sequence type that was the moral equivalent of unicode except for internal representation, namely a str subclass with an encoding attribute specifying the encoding that you _could_ use to decode the str bytes part to get unicode (which you wouldn't do except when necessary). We could call it class charstr(str): ... and have charstr().bytes be the str part and charstr().encoding specify the encoding part. In all the contexts where we have obvious encoding information, we can then generate a charstr instead of a str. E.g., if the source of module_a has

# -*- coding: latin1 -*-
cs = 'über-cool'

then

type(cs) # =>
cs.bytes # => '\xfcber-cool'
cs.encoding # => 'latin-1'

and print cs would act like print cs.bytes.decode(cs.encoding) -- or I guess sys.stdout.write(cs.bytes.decode(cs.encoding).encode(sys.stdout.encoding)) followed by sys.stdout.write('\n'.decode('ascii').encode(sys.stdout.encoding)) for the newline of the print.
Now if module_b has

# -*- coding: utf8 -*-
cs = 'über-cool'

and we interactively import module_a, module_b and then

print module_a.cs + ' =?= ' + module_b.cs

what could happen ideally vs. what we have currently? UIAM, currently we would just get the three str byte sequences concatenated to make '\xfcber-cool =?= \xc3\xbcber-cool' and that would be printed as whatever that comes out as without conversion when seen by the output according to sys.stdout.encoding. But if those cs instances had been charstr instances, the coding cookie encoding information would have been preserved, and the interactive print could have evaluated the string expression -- given cs.decode() as sugar for (cs.bytes.decode(cs.encoding or globals().get('__encoding__') or __import__('sys').getdefaultencoding())) -- as module_a.cs.decode() + ' =?= '.decode() + module_b.cs.decode() if pairwise terms differ in encoding as they might all here. If the interactive session source were e.g. latin-1, like module_a, then module_a.cs + ' =?= ' would not require an encoding change, because the ' =?= ' would be a charstr instance with encoding == 'latin-1', and so the result would still be latin-1 that far. But with module_b.cs being utf8, the next addition would cause the .decode() promotions to unicode. In a console window, the ' =?= '.encoding might be 'cp437' or such, and the first addition would then cause promotion (since module_a.cs.encoding != 'cp437'). I have sneaked in run-time access to individual modules' encodings by assuming that the encoding cookie could be compiled in as an explicit global __encoding__ variable for any given module (what to have as __encoding__ for built-in modules could vary for various purposes).
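[Editor's note] A rough sketch of this charstr idea, translated into modern terms as bytes plus a remembered encoding, decoded only when two encodings disagree. The as_text method and the constructor signature are inventions for illustration, not part of Bengt's proposal:

```python
class charstr(bytes):
    """Byte string that remembers the encoding of the source it came from."""
    def __new__(cls, data, encoding='ascii'):
        self = super().__new__(cls, data)
        self.encoding = encoding
        return self

    def as_text(self):
        # The decode you "wouldn't do except when necessary".
        return self.decode(self.encoding)

    def __add__(self, other):
        if isinstance(other, charstr) and other.encoding == self.encoding:
            # Same encoding: concatenation can stay in byte form.
            return charstr(bytes(self) + bytes(other), self.encoding)
        # Encodings differ: promote both sides to real text.
        return self.as_text() + other.as_text()

# One literal from a latin-1 module, one from a utf-8 module:
a = charstr('über-cool'.encode('latin-1'), 'latin-1')
b = charstr('über-cool'.encode('utf-8'), 'utf-8')
```

Here a + a stays a charstr in latin-1, while a + b promotes both operands to text, which is the behaviour Bengt describes for the mixed-encoding print.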
ISTM this could have use in situations where an encoding assumption is necessary and currently 'ascii' is not as good a guess as one could make, though I suspect if string literals became charstr strings instead of str strings, many if not most of those situations would disappear (I'm saying this because ATM I can't think of an 'ascii'-guess situation that wouldn't go away ;-) If there were a charchr() version of chr() that would result in a charstr instead of a str, IWT one would want an easy-sugar default encoding assumption, probably based on the same as one would assume for '%c' % num in a given module source -- which presumably would be '%c'.encoding, where '%c' assumes the encoding of the module source, normally recorded in __encoding__. So charchr(n) would act like chr(n).decode().encode(''.encoding) -- or more reasonably charstr(chr(n)), which would be short for charstr(chr(n), globals().get('__encoding__') or __import__('sys').getdefaultencoding()) Or some efficient equivalent ;-) Using strings in dicts requires hashing to find key comparison candidates and comparison to check for key equivalence. This would seem to point to some kind of normalized hashing, but not necessarily normalized key representation. Some is apparently happening, since >>> hash('a') == hash(unicode('a')) True I don't know what would be worth the trouble to optimize string key usage where strings are really all of one encoding vs totally general use vs a heavily biased mix. Or even if it could be done without unreasonable complexity. Maybe a dict could be given an option to hash all its keys as unicode vs whatever it does now. But having a charstr subtype of str would improve the "implicit" conversions to unicode IMO. Anyway, I wanted to throw in my .02USD re the implicit conversions, taking the view that much of the implicitness could be based on reliable inferences from source encodings of string literals or from their effects as format strings. 
Regards, Bengt Richter [not a normal subscriber to python-dev, so I'll have to google for any responses] From raymond.hettinger at verizon.net Sun Oct 23 07:30:17 2005 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Sun, 23 Oct 2005 01:30:17 -0400 Subject: [Python-Dev] AST reverts PEP 342 implementation and IDLE starts working again Message-ID: <000201c5d792$daa57240$79fbcc97@oemcomputer> FWIW, a few months ago, I reported that File New or File Open in IDLE would crash Python as a result of the check-in implementing PEP 342. Now, with AST checked-in, IDLE has started working again. Given the reconfirmation, I recommend that the 342 patch be regarded as suspect and not be restored until the fault is found and repaired. Raymond From pje at telecommunity.com Sun Oct 23 07:53:59 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Sun, 23 Oct 2005 01:53:59 -0400 Subject: [Python-Dev] AST reverts PEP 342 implementation and IDLE starts working again In-Reply-To: <000201c5d792$daa57240$79fbcc97@oemcomputer> Message-ID: <5.1.1.6.0.20051023014942.01f9f8b8@mail.telecommunity.com> At 01:30 AM 10/23/2005 -0400, Raymond Hettinger wrote: >FWIW, a few months ago, I reported that File New or File Open in IDLE >would crash Python as a result of the check-in implementing PEP 342. >Now, with AST checked-in, IDLE has started working again. Given the >reconfirmation, I recommend that the 342 patch be regarded as suspect >and not be restored until the fault is found and repaired. PEP 342 is actually implemented in the HEAD. See: http://mail.python.org/pipermail/python-dev/2005-October/057477.html So, your observation actually means that the bug, if any, was somewhere else, or was inadvertently fixed or hidden by the AST branch merge. 
From ncoghlan at gmail.com Sun Oct 23 11:35:56 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 23 Oct 2005 19:35:56 +1000 Subject: [Python-Dev] Proposed resolutions for open PEP 343 issues In-Reply-To: References: <435A4598.3060403@iinet.net.au> Message-ID: <435B597C.6040300@gmail.com> Guido van Rossum wrote: > Here's another argument against automatically decorating __context__. > > What if I want to have a class with a __context__ method that returns > a custom context manager that *doesn't* involve applying > @contextmanager to a generator? > > While technically this is possible with your proposal (since such a > method wouldn't be a generator), it's exceedingly subtle for the human > reader. I'd much rather see the @contextmanager decorator to emphasize > the difference. Being able to easily pull a native context manager out and turn it into an independent context manager just by changing its name is also a big plus. For that matter, consider a class that had a "normal" context manager (its context slot), and an alternative context manager (defined as a separate method). The fact that one had the contextmanager decorator and the other one didn't would be rather confusing. So you've convinced me that auto-decoration is not the right thing to do. 
Those that really don't like decorating a slot can always write it as:

class UndecoratedSlot(object):
    @contextmanager
    def native_context(self):
        print "Entering native context"
        yield
        print "Exiting native context cleanly"
    __context__ = native_context

Or:

class UndecoratedSlot(object):
    def __context__(self):
        return self.native_context()
    @contextmanager
    def native_context(self):
        print "Entering native context"
        yield
        print "Exiting native context cleanly"

However, I'm still concerned about the fact that the following class has a context manager that doesn't actually work:

class Broken(object):
    def __context__(self):
        print "This never gets executed"
        yield
        print "Neither does this"

So how about if type_new simply raises a TypeError if it finds a generator-iterator function in the __context__ slot? Regards, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.blogspot.com From ncoghlan at gmail.com Sun Oct 23 11:52:46 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 23 Oct 2005 19:52:46 +1000 Subject: [Python-Dev] Definining properties - a use case for class decorators? In-Reply-To: References: <43544CC1.5050204@canterbury.ac.nz> <1129601328.9405.13.camel@geddy.wooz.org> <4edc17eb0510190151v6a078f54n2135b8d8386c796a@mail.gmail.com> <5.1.1.6.0.20051019124840.01fa73b8@mail.telecommunity.com> <17238.40158.735826.504410@montanaro.dyndns.org> <4edc17eb0510200035u370b57f9ub1d66b4e99d1be62@mail.gmail.com> Message-ID: <435B5D6E.80101@gmail.com> Reinhold Birkenfeld wrote: > Michele Simionato wrote: >> As others explained, the syntax would not work for functions (and it is >> not intended to). >> A possible use case I had in mind is to define inlined modules to be used as bunches >> of attributes.
For instance, I could define a module as >> >> module m(): >> a = 1 >> b = 2 >> >> where 'module' would be the following function: >> >> def module(name, args, dic): >> mod = types.ModuleType(name, dic.get('__doc__')) >> for k in dic: setattr(mod, k, dic[k]) >> return mod > > Wow. This looks like an almighty tool. We can have modules, interfaces, > classes and properties all the like with this. > > Guess a PEP would be nice. Very nice indeed. I'd be more supportive if it was defined as a new statement such as "create" with the syntax: create TYPE NAME(ARGS): BLOCK The result would be roughly equivalent to: kwds = {} exec BLOCK in kwds NAME = TYPE(NAME, ARGS, kwds) Such that the existing 'class' statement is equivalent to: create __metaclass__ NAME(ARGS): BLOCK Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.blogspot.com From raymond.hettinger at verizon.net Sun Oct 23 13:27:31 2005 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Sun, 23 Oct 2005 07:27:31 -0400 Subject: [Python-Dev] AST reverts PEP 342 implementation and IDLE starts working again In-Reply-To: <5.1.1.6.0.20051023014942.01f9f8b8@mail.telecommunity.com> Message-ID: <000601c5d7c4$c2ace600$79fbcc97@oemcomputer> [Phillip J. Eby] > your observation actually means that the bug, if any, was somewhere > else, or was inadvertently fixed or hidden by the AST branch merge. What a nice side benefit :-) Raymond From martin at v.loewis.de Sun Oct 23 15:53:04 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 23 Oct 2005 15:53:04 +0200 Subject: [Python-Dev] New codecs checked in In-Reply-To: <435965E4.5050207@egenix.com> References: <435965E4.5050207@egenix.com> Message-ID: <435B95C0.9060005@v.loewis.de> M.-A. 
Lemburg wrote: > I've checked in a whole bunch of newly generated codecs > which now make use of the faster charmap decoding variant added > by Walter a short while ago. > > Please let me know if you find any problems. I think we should work on eliminating the decoding_map variables. There are some codecs which rely on them being present in other codecs (e.g. koi8_u.py is based on koi8_r.py); however, this could be updated to use, say

decoding_table = codecs.update_decoding_map(koi8_r.decoding_table, {
    0x00a4: 0x0454, # CYRILLIC SMALL LETTER UKRAINIAN IE
    0x00a6: 0x0456, # CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I
    0x00a7: 0x0457, # CYRILLIC SMALL LETTER YI (UKRAINIAN)
    0x00ad: 0x0491, # CYRILLIC SMALL LETTER UKRAINIAN GHE WITH UPTURN
    0x00b4: 0x0404, # CYRILLIC CAPITAL LETTER UKRAINIAN IE
    0x00b6: 0x0406, # CYRILLIC CAPITAL LETTER BYELORUSSIAN-UKRAINIAN I
    0x00b7: 0x0407, # CYRILLIC CAPITAL LETTER YI (UKRAINIAN)
    0x00bd: 0x0490, # CYRILLIC CAPITAL LETTER UKRAINIAN GHE WITH UPTURN
})

With all these cross-references gone, the decoding_maps could also go. Regards, Martin From giovanniangeli at iquattrocastelli.it Sun Oct 23 18:12:42 2005 From: giovanniangeli at iquattrocastelli.it (giovanniangeli@iquattrocastelli.it) Date: Sun, 23 Oct 2005 18:12:42 +0200 (CEST) Subject: [Python-Dev] cross compiling python for embedded systems Message-ID: <32987.82.58.25.23.1130083962.squirrel@www.iquattrocastelli.it> Is this the right place to ask: How could I build the python interpreter for an embedded linux target system (arm9 based), cross-compiling on a linux PC host? thanks, Giovanni Angeli.
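[Editor's note] Martin's table derivation above could be expressed with a small helper along these lines. codecs.update_decoding_map did not exist at the time ("could be updated to use, say"), so the function name and the dummy base table below are stand-ins for illustration:

```python
def update_decoding_table(base_table, overrides):
    """Return a new 256-entry decoding table: base_table with the
    byte positions listed in overrides remapped to new code points."""
    cells = list(base_table)
    for byte, codepoint in overrides.items():
        cells[byte] = chr(codepoint)
    return ''.join(cells)

# Dummy stand-in for koi8_r.decoding_table (a 256-character unicode string):
koi8_r_table = ''.join(chr(i) for i in range(256))

# koi8_u differs from koi8_r in only a handful of positions:
koi8_u_table = update_decoding_table(koi8_r_table, {
    0x00a4: 0x0454,  # CYRILLIC SMALL LETTER UKRAINIAN IE
    0x00a6: 0x0456,  # CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I
    0x00a7: 0x0457,  # CYRILLIC SMALL LETTER YI (UKRAINIAN)
    0x00ad: 0x0491,  # CYRILLIC SMALL LETTER UKRAINIAN GHE WITH UPTURN
})
```

Every position not named in the overrides is inherited from the base table, which is what lets the decoding_map cross-references disappear.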
From guido at python.org Sun Oct 23 18:19:40 2005 From: guido at python.org (Guido van Rossum) Date: Sun, 23 Oct 2005 09:19:40 -0700 Subject: [Python-Dev] Proposed resolutions for open PEP 343 issues In-Reply-To: <435B597C.6040300@gmail.com> References: <435A4598.3060403@iinet.net.au> <435B597C.6040300@gmail.com> Message-ID: On 10/23/05, Nick Coghlan wrote: > However, I'm still concerned about the fact that the following class has a > context manager that doesn't actually work:
>
> class Broken(object):
>     def __context__(self):
>         print "This never gets executed"
>         yield
>         print "Neither does this"

That's only because of your proposal to endow generators with a default __context__ manager. Drop that idea and you're golden. (As long as nobody snuck the proposal back in to let the with-statement silently ignore objects that don't have a __context__ method -- that was rejected long ago.) In my previous mail I said I had to think about that one more -- well, I have, and I'm now -1 on it. Very few generators (that aren't used as context managers) will need the immediate explicit close() call, and it will happen eventually when they are GC'ed anyway. Too much magic is bad for your health. > So how about if type_new simply raises a TypeError if it finds a > generator-iterator function in the __context__ slot? No. type should not bother with understanding what the class is trying to do. __new__ is only special because it is part of the machinery that type itself invokes in order to create a new class. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From pje at telecommunity.com Sun Oct 23 18:51:27 2005 From: pje at telecommunity.com (Phillip J.
Eby) Date: Sun, 23 Oct 2005 12:51:27 -0400 Subject: [Python-Dev] Proposed resolutions for open PEP 343 issues In-Reply-To: References: <435B597C.6040300@gmail.com> <435A4598.3060403@iinet.net.au> <435B597C.6040300@gmail.com> Message-ID: <5.1.1.6.0.20051023124842.01af9078@mail.telecommunity.com> At 09:19 AM 10/23/2005 -0700, Guido van Rossum wrote: >On 10/23/05, Nick Coghlan wrote: > > However, I'm still concerned about the fact that the following class has a > > context manager that doesn't actually work: > > > > class Broken(object): > > def __context__(self): > > print "This never gets executed" > > yield > > print "Neither does this" > >That's only because of your proposal to endow generators with a >default __context__ manager. Drop that idea and you're golden. > >(As long as nobody snuck the proposal back in to let the >with-statement silently ignore objects that don't have a __context__ >method -- that was rejected long ago on.) Actually, you've just pointed out a new complication introduced by having __context__. The return value of __context__ is supposed to have an __enter__ and an __exit__. Is it a type error if it doesn't? How do we handle that, exactly? That is, assuming generators don't have enter/exit/context methods, then the above code is broken because its __context__ returns an object without enter/exit, sort of like an __iter__ that returns something without a 'next()'. From martin at v.loewis.de Sun Oct 23 19:03:56 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 23 Oct 2005 19:03:56 +0200 Subject: [Python-Dev] Freezing the CVS on Oct 26 for SVN switchover Message-ID: <435BC27C.1010503@v.loewis.de> I'd like to start the subversion switchover this coming Wednesday, with a total commit freeze at 16:00 GMT. If you have larger changes to commit that you would like to commit before the switchover, but after that date, please let me know. 
At that point, I will set the repository to read-only (through a commitinfo hook), and request that SF rolls a tarfile. I will then notify you when the Subversion repository is online. If you have sandboxes with modifications, it might be good to cvs diff -u them now. I plan to keep the CVS up for a short while after the switchover (about a month); after that point, you will need to get the CVS tarball and retarget your sandbox to perform diffs. I'm not aware of a procedure to convert a CVS sandbox into an SVN one, so you will have to recheckout all your sandboxes after the switch. Regards, Martin From jepler at unpythonic.net Sun Oct 23 19:22:32 2005 From: jepler at unpythonic.net (jepler@unpythonic.net) Date: Sun, 23 Oct 2005 12:22:32 -0500 Subject: [Python-Dev] cross compiling python for embedded systems In-Reply-To: <32987.82.58.25.23.1130083962.squirrel@www.iquattrocastelli.it> References: <32987.82.58.25.23.1130083962.squirrel@www.iquattrocastelli.it> Message-ID: <20051023172232.GA11117@unpythonic.net> There's a patch on sourceforge for cross compiling. I haven't used it personally. http://sourceforge.net/tracker/index.php?func=detail&aid=1006238&group_id=5470&atid=305470 Jeff -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 189 bytes Desc: not available Url : http://mail.python.org/pipermail/python-dev/attachments/20051023/9935c4e7/attachment.pgp From martin at v.loewis.de Sun Oct 23 19:22:37 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 23 Oct 2005 19:22:37 +0200 Subject: [Python-Dev] cross compiling python for embedded systems In-Reply-To: <32987.82.58.25.23.1130083962.squirrel@www.iquattrocastelli.it> References: <32987.82.58.25.23.1130083962.squirrel@www.iquattrocastelli.it> Message-ID: <435BC6DD.3030900@v.loewis.de> giovanniangeli at iquattrocastelli.it wrote: > How could I build the python interpreter for an embedded linux target system > (arm9 based), cross-compiling on a linux PC host? No. news:comp.lang.python (aka: mailto:python-list at python.org) would be the right list. This would be the right list for the question "I made this and that modification to get it cross-compile, can somebody please review them?" Regards, Martin From p.f.moore at gmail.com Sun Oct 23 21:15:15 2005 From: p.f.moore at gmail.com (Paul Moore) Date: Sun, 23 Oct 2005 20:15:15 +0100 Subject: [Python-Dev] Proposed resolutions for open PEP 343 issues In-Reply-To: <5.1.1.6.0.20051023124842.01af9078@mail.telecommunity.com> References: <435A4598.3060403@iinet.net.au> <435B597C.6040300@gmail.com> <5.1.1.6.0.20051023124842.01af9078@mail.telecommunity.com> Message-ID: <79990c6b0510231215tb93f1c0nb58821a36f033a5f@mail.gmail.com> On 10/23/05, Phillip J. Eby wrote: > Actually, you've just pointed out a new complication introduced by having > __context__. The return value of __context__ is supposed to have an > __enter__ and an __exit__. Is it a type error if it doesn't? How do we > handle that, exactly? 
> > That is, assuming generators don't have enter/exit/context methods, then > the above code is broken because its __context__ returns an object without > enter/exit, sort of like an __iter__ that returns something without a 'next()'. I would have thought that the parallel with __iter__ would be the right way to go: >>> class C: ... def __iter__(self): ... return 12 ... >>> c = C() >>> iter(c) Traceback (most recent call last): File "<stdin>", line 1, in ? TypeError: __iter__ returned non-iterator of type 'int' >>> So, when you try calling __context__ in a with statement (or I guess in a context() builtin if one were to be added), raise a TypeError if the resulting object doesn't have __enter__ and __exit__ methods. (Or maybe just if it has neither - I can't recall if the methods are optional, but certainly having neither is wrong). Paul. From mwh at python.net Sun Oct 23 22:08:30 2005 From: mwh at python.net (Michael Hudson) Date: Sun, 23 Oct 2005 21:08:30 +0100 Subject: [Python-Dev] Freezing the CVS on Oct 26 for SVN switchover In-Reply-To: <435BC27C.1010503@v.loewis.de> ( =?iso-8859-1?q?Martin_v._L=F6wis's_message_of?= "Sun, 23 Oct 2005 19:03:56 +0200") References: <435BC27C.1010503@v.loewis.de> Message-ID: <2mbr1g6loh.fsf@starship.python.net> "Martin v. Löwis" writes: > I'd like to start the subversion switchover this coming Wednesday, > with a total commit freeze at 16:00 GMT. Yay! Thanks again for doing this. Cheers, mwh -- [Perl] combines all the worst aspects of C and Lisp: a billion different sublanguages in one monolithic executable. It combines the power of C with the readability of PostScript. -- Jamie Zawinski From jason.orendorff at gmail.com Mon Oct 24 00:10:28 2005 From: jason.orendorff at gmail.com (Jason Orendorff) Date: Sun, 23 Oct 2005 18:10:28 -0400 Subject: [Python-Dev] Divorcing str and unicode (no more implicit conversions).
In-Reply-To: <5.0.2.1.1.20051022120117.02b58da0@mail.oz.net> References: <5.0.2.1.1.20051022120117.02b58da0@mail.oz.net> Message-ID: -1 on keeping the source encoding of string literals. Python should definitely decode them at compile time. -1 on decoding implicitly "as needed". This causes decoding to happen late, in unpredictable places. Decodes can fail; they should happen as early and as close to the data source as possible. -j From barry at python.org Mon Oct 24 00:43:50 2005 From: barry at python.org (Barry Warsaw) Date: Sun, 23 Oct 2005 18:43:50 -0400 Subject: [Python-Dev] PEP 351, the freeze protocol Message-ID: <1130107429.11268.40.camel@geddy.wooz.org> I've had this PEP laying around for quite a few months. It was inspired by some code we'd written which wanted to be able to get immutable versions of arbitrary objects. I've finally finished the PEP, uploaded a sample patch (albeit a bit incomplete), and I'm posting it here to see if there is any interest. http://www.python.org/peps/pep-0351.html Cheers, -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/python-dev/attachments/20051023/3d621ed8/attachment.pgp From guido at python.org Mon Oct 24 01:58:48 2005 From: guido at python.org (Guido van Rossum) Date: Sun, 23 Oct 2005 16:58:48 -0700 Subject: [Python-Dev] Proposed resolutions for open PEP 343 issues In-Reply-To: <5.1.1.6.0.20051023124842.01af9078@mail.telecommunity.com> References: <435A4598.3060403@iinet.net.au> <435B597C.6040300@gmail.com> <5.1.1.6.0.20051023124842.01af9078@mail.telecommunity.com> Message-ID: On 10/23/05, Phillip J. Eby wrote: > Actually, you've just pointed out a new complication introduced by having > __context__. The return value of __context__ is supposed to have an > __enter__ and an __exit__. Is it a type error if it doesn't? 
How do we > handle that, exactly? Of course it's an error! The translation in the PEP should make that quite clear (there's no testing for whether __context__, __enter__ and/or __exit__ exist before they are called). It would be an AttributeError. > That is, assuming generators don't have enter/exit/context methods, then > the above code is broken because its __context__ returns an object without > enter/exit, sort of like an __iter__ that returns something without a 'next()'. Right. That was my point. Nick's worried about undecorated __context__ because he wants to endow generators with a different default __context__. I say no to both proposals and the worries cancel each other out. EIBTI. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From rhamph at gmail.com Mon Oct 24 02:01:44 2005 From: rhamph at gmail.com (Adam Olsen) Date: Sun, 23 Oct 2005 18:01:44 -0600 Subject: [Python-Dev] PEP 351, the freeze protocol In-Reply-To: <1130107429.11268.40.camel@geddy.wooz.org> References: <1130107429.11268.40.camel@geddy.wooz.org> Message-ID: On 10/23/05, Barry Warsaw wrote: > I've had this PEP laying around for quite a few months. It was inspired > by some code we'd written which wanted to be able to get immutable > versions of arbitrary objects. I've finally finished the PEP, uploaded > a sample patch (albeit a bit incomplete), and I'm posting it here to see > if there is any interest. > > http://www.python.org/peps/pep-0351.html My sandboxes need freezing for some stuff and ultimately freezable user classes will be desirable, but for performance reasons I prefer freezing in place. Not much overlap with PEP 351 really. -- Adam Olsen, aka Rhamphoryncus From bob at redivi.com Mon Oct 24 02:24:05 2005 From: bob at redivi.com (Bob Ippolito) Date: Sun, 23 Oct 2005 17:24:05 -0700 Subject: [Python-Dev] Divorcing str and unicode (no more implicit conversions).
In-Reply-To: References: <5.0.2.1.1.20051022120117.02b58da0@mail.oz.net> Message-ID: <7E14C004-99C3-4382-BA76-FE2F731B4CE5@redivi.com> On Oct 23, 2005, at 3:10 PM, Jason Orendorff wrote: > -1 on decoding implicitly "as needed". This causes decoding to happen > late, in unpredictable places. Decodes can fail; they should happen > as early and as close to the data source as possible. That's not necessarily true... Some codecs can't fail, like latin1. I think the main use case for this is to speed up usage of text in these sorts of formats anyway. -bob From srichter at cosmos.phy.tufts.edu Mon Oct 24 02:52:27 2005 From: srichter at cosmos.phy.tufts.edu (Stephan Richter) Date: Sun, 23 Oct 2005 20:52:27 -0400 Subject: [Python-Dev] Divorcing str and unicode (no more implicit conversions). In-Reply-To: References: <5.0.2.1.1.20051022120117.02b58da0@mail.oz.net> Message-ID: <200510232052.27775.srichter@cosmos.phy.tufts.edu> On Sunday 23 October 2005 18:10, Jason Orendorff wrote: > -1 on keeping the source encoding of string literals. Python should > definitely decode them at compile time. > > -1 on decoding implicitly "as needed". This causes decoding to happen > late, in unpredictable places. Decodes can fail; they should happen > as early and as close to the data source as possible. +1. We have followed this last practice throughout Zope 3 successfully. In our case, the publisher framework (in other words the output-protocol-specific layer) is responsible for the decoding and encoding of input and output streams, respectively. We have been pretty much free of any encoding/decoding troubles since. Having our application only use unicode internally was one of the best decisions we have made. Regards, Stephan -- Stephan Richter CBU Physics & Chemistry (B.S.) / Tufts Physics (Ph.D.
student) Web2k - Web Software Design, Development and Training From guido at python.org Mon Oct 24 03:06:00 2005 From: guido at python.org (Guido van Rossum) Date: Sun, 23 Oct 2005 18:06:00 -0700 Subject: [Python-Dev] Divorcing str and unicode (no more implicit conversions). In-Reply-To: <5.0.2.1.1.20051022120117.02b58da0@mail.oz.net> References: <5.0.2.1.1.20051022120117.02b58da0@mail.oz.net> Message-ID: Folks, please focus on what Python 3000 should do. I'm thinking about making all character strings Unicode (possibly with different internal representations a la NSString in Apple's Objective C) and introduce a separate mutable bytes array data type. But I could use some validation or feedback on this idea from actual practitioners. I don't want to see proposals to mess with the str/unicode semantics in Python 2.x. Let's leave the Python 2.x str/unicode semantics alone until Python 3000 -- we don't need multiple transitions. (Although we could add the mutable bytes array type sooner.) -- --Guido van Rossum (home page: http://www.python.org/~guido/) From bob at redivi.com Mon Oct 24 03:31:12 2005 From: bob at redivi.com (Bob Ippolito) Date: Sun, 23 Oct 2005 18:31:12 -0700 Subject: [Python-Dev] Divorcing str and unicode (no more implicit conversions). In-Reply-To: References: <5.0.2.1.1.20051022120117.02b58da0@mail.oz.net> Message-ID: <9BAB15CE-D7BE-4080-9A21-3ED4EE01B0AE@redivi.com> On Oct 23, 2005, at 6:06 PM, Guido van Rossum wrote: > Folks, please focus on what Python 3000 should do. > > I'm thinking about making all character strings Unicode (possibly with > different internal representations a la NSString in Apple's Objective > C) and introduce a separate mutable bytes array data type. But I could > use some validation or feedback on this idea from actual > practitioners. > > I don't want to see proposals to mess with the str/unicode semantics > in Python 2.x.
Let's leave the Python 2.x str/unicode semantics alone > until Python 3000 -- we don't need multiple transitions. (Although we > could add the mutable bytes array type sooner.) +1, this is precisely what I'd like to see. -bob From pje at telecommunity.com Mon Oct 24 04:23:40 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Sun, 23 Oct 2005 22:23:40 -0400 Subject: [Python-Dev] Divorcing str and unicode (no more implicit conversions). In-Reply-To: References: <5.0.2.1.1.20051022120117.02b58da0@mail.oz.net> <5.0.2.1.1.20051022120117.02b58da0@mail.oz.net> Message-ID: <5.1.1.6.0.20051023221433.02ab8200@mail.telecommunity.com> At 06:06 PM 10/23/2005 -0700, Guido van Rossum wrote: >Folks, please focus on what Python 3000 should do. > >I'm thinking about making all character strings Unicode (possibly with >different internal representations a la NSString in Apple's Objective >C) and introduce a separate mutable bytes array data type. But I could >use some validation or feedback on this idea from actual >practitioners. +1. Chandler has been going through quite an upheaval to get its unicode handling together. Having a bytes type would be great, as long as there was support for files and sockets to produce bytes instead of strings (unless an encoding was specified). I'm tempted to say it would be even better if there was a command line option that could be used to force all binary opens to result in bytes, and require all text opens to specify an encoding. The Chandler i18n project lead would jump for joy if we had a way to keep "legacy" strings out of the system, apart from ASCII string constants found in code. It would then be okay not to drop support for the implicit conversions; if you can't get strings on input, then conversion's not really an issue. Anyway, I think all of the things I'd like to see can be done without breakage in 2.5.
For Chandler at least, we'd be willing to go with a command-line option that's more strict, in order to be able to ensure that plugin developers can't accidentally put 8-bit strings in somewhere, just by opening a file. From jcarlson at uci.edu Mon Oct 24 04:29:11 2005 From: jcarlson at uci.edu (Josiah Carlson) Date: Sun, 23 Oct 2005 19:29:11 -0700 Subject: [Python-Dev] Definining properties - a use case for class decorators? In-Reply-To: <435B5D6E.80101@gmail.com> References: <435B5D6E.80101@gmail.com> Message-ID: <20051023121124.38D4.JCARLSON@uci.edu> Nick Coghlan wrote: > > Reinhold Birkenfeld wrote: > > Michele Simionato wrote: > >> As others explained, the syntax would not work for functions (and it is > >> not intended to). > >> A possible use case I had in mind is to define inlined modules to be > >> used as bunches > >> of attributes. For instance, I could define a module as > >> > >> module m(): > >> a = 1 > >> b = 2 > >> > >> where 'module' would be the following function: > >> > >> def module(name, args, dic): > >> mod = types.ModuleType(name, dic.get('__doc__')) > >> for k in dic: setattr(mod, k, dic[k]) > >> return mod > > > > Wow. This looks like an almighty tool. We can have modules, interfaces, > > classes and properties all the like with this. > > > > Guess a PEP would be nice. > > Very nice indeed. I'd be more supportive if it was defined as a new statement > such as "create" with the syntax: > > create TYPE NAME(ARGS): > BLOCK > > The result would be roughly equivalent to: > > kwds = {} > exec BLOCK in kwds > NAME = TYPE(NAME, ARGS, kwds) And is equivalent to the class/metaclass abuse... #support code def BlockMetaclassFactory(constructor): class BlockMetaclass(type): def __new__(cls, name, bases, dct): return constructor(name, bases, dct) return BlockMetaclass #non-syntax syntax class NAME(ARGS): __metaclass__ = BlockMetaclassFactory(TYPE) BLOCK Or even...
def BlockClassFactory(constructor): class BlockClass: __metaclass__ = BlockMetaclassFactory(constructor) return BlockClass class NAME(BlockClassFactory(TYPE), ARGS): BLOCK To be used with properties, one could use a wrapper and class definition... def _Property(names, bases, dct): return property(**dct) Property = BlockClassFactory(_Property) class foo(object): class x(Property): ... With minor work, it would be easy to define a subclassable Property which could handle some basic styles: write once, default value, etc. I am unconvinced that a block syntax is necessary or desirable for this case. With the proper support classes, you can get modules, classes, metaclasses, properties, the previous 'given:' syntax, etc. - Josiah From jcarlson at uci.edu Mon Oct 24 04:50:47 2005 From: jcarlson at uci.edu (Josiah Carlson) Date: Sun, 23 Oct 2005 19:50:47 -0700 Subject: [Python-Dev] PEP 351, the freeze protocol In-Reply-To: <1130107429.11268.40.camel@geddy.wooz.org> References: <1130107429.11268.40.camel@geddy.wooz.org> Message-ID: <20051023194708.38D7.JCARLSON@uci.edu> Barry Warsaw wrote: > I've had this PEP laying around for quite a few months. It was inspired > by some code we'd written which wanted to be able to get immutable > versions of arbitrary objects. I've finally finished the PEP, uploaded > a sample patch (albeit a bit incomplete), and I'm posting it here to see > if there is any interest. > > http://www.python.org/peps/pep-0351.html class xlist(list): def __freeze__(self): return tuple(self) Shouldn't that be: class xlist(list): def __freeze__(self): return tuple(map(freeze, self)) "Should dicts and sets automatically freeze their mutable keys?" Dictionaries don't have mutable keys, but it is my opinion that a container which is frozen should have its contents frozen as well.
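The recursive behaviour described here can be written out as a runnable sketch. The freeze() name follows PEP 351, but the handling of sets and dicts below is an assumption for illustration, not something the PEP specifies:

```python
def freeze(obj):
    # Hypothetical sketch of PEP 351's proposed freeze() builtin,
    # extended so that freezing a container also freezes its contents.
    if hasattr(obj, '__freeze__'):
        return obj.__freeze__()
    if isinstance(obj, list):
        return tuple(freeze(item) for item in obj)
    if isinstance(obj, set):
        return frozenset(freeze(item) for item in obj)
    if isinstance(obj, dict):
        # there is no immutable dict type; a sorted tuple of frozen
        # items is one possible stand-in
        return tuple(sorted((key, freeze(value)) for key, value in obj.items()))
    return obj

assert freeze([1, [2, 3], {4}]) == (1, (2, 3), frozenset({4}))
```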
- Josiah From nyamatongwe at gmail.com Mon Oct 24 05:41:50 2005 From: nyamatongwe at gmail.com (Neil Hodgson) Date: Mon, 24 Oct 2005 13:41:50 +1000 Subject: [Python-Dev] Divorcing str and unicode (no more implicit conversions). In-Reply-To: References: <5.0.2.1.1.20051022120117.02b58da0@mail.oz.net> Message-ID: <50862ebd0510232041j19b6b3achd0aecb072f9db148@mail.gmail.com> Guido van Rossum: > Folks, please focus on what Python 3000 should do. > > I'm thinking about making all character strings Unicode (possibly with > different internal representations a la NSString in Apple's Objective > C) and introduce a separate mutable bytes array data type. But I could > use some validation or feedback on this idea from actual > practitioners. I'd like to more tightly define Unicode strings for Python 3000. Currently, Unicode strings may be implemented with either 2 byte (UCS-2) or 4 byte (UTF-32) elements. Python should allow strings to contain any Unicode character and should be indexable yielding characters rather than half characters. Therefore Python strings should appear to be UTF-32. There could still be multiple implementations (using UTF-16 or UTF-8) to preserve space but all implementations should appear to be the same apart from speed and memory use. Neil From alan.mcintyre at esrgtech.com Mon Oct 24 07:43:26 2005 From: alan.mcintyre at esrgtech.com (Alan McIntyre) Date: Mon, 24 Oct 2005 01:43:26 -0400 Subject: [Python-Dev] int(string) In-Reply-To: <1f7befae0510221038u4bd75ca5l67dc930ae16d24b7@mail.gmail.com> References: <1f7befae0510211952x5eb2000bicdf3c1a80a3f5749@mail.gmail.com> <2mfyqu6j55.fsf@starship.python.net> <1f7befae0510221038u4bd75ca5l67dc930ae16d24b7@mail.gmail.com> Message-ID: <435C747E.9060005@esrgtech.com> Tim Peters wrote: >[Adam Olsen] > >>https://sourceforge.net/tracker/index.php?func=detail&aid=1334979&group_id=5470&atid=305470> >> >>That patch removes the division from the loop (and fixes the bugs), >>but gives only a small increase in speed. 
>> >In any case, I agree it _should_ fix the bugs (although it also needs >new tests to verify that). > I started with Adam's patch and did some additional work on PyOS_strtoul. I ended up with a patch that seems to correctly evaluate the tests that Tim listed in bug #1334662, includes new tests (in test_builtin), passes (almost) all of "make test," and it seems to be somewhat faster (~20%) for a "spoj.sphere.pl"-like test on a ~8MB input file. All the ugly details are here (along with my ugly code): http://sourceforge.net/tracker/index.php?func=detail&aid=1335972&group_id=5470&atid=305470 When running "make test" I get some errors in test_array and test_compile that did not occur in the build from CVS. Given the inputs to long() have '.' characters in them, I assume that these tests really should be failing as implemented, but I haven't dug into them to see what's going on: ====================================================================== ERROR: test_repr (__main__.FloatTest) ---------------------------------------------------------------------- Traceback (most recent call last): File "Lib/test/test_array.py", line 187, in test_repr self.assertEqual(a, eval(repr(a), {"array": array.array})) ValueError: invalid literal for long(): 10000000000.0 ====================================================================== ERROR: test_repr (__main__.DoubleTest) ---------------------------------------------------------------------- Traceback (most recent call last): File "Lib/test/test_array.py", line 187, in test_repr self.assertEqual(a, eval(repr(a), {"array": array.array})) ValueError: invalid literal for long(): 10000000000.0 ---------------------------------------------------------------------- test test_compile crashed -- exceptions.ValueError: invalid literal for long(): 90000000000000. 
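Those failures show the patched parser being handed float literals such as 10000000000.0; a correct int()/long() must keep rejecting them, so the patch rather than the tests is at fault. A quick sanity check of the expected behaviour (written for modern Python, as a sketch of the 2.x semantics under discussion):

```python
# int("...") must reject anything that is not a pure integer literal;
# the trailing ".0" of a float repr is exactly such a case.
try:
    int("10000000000.0")
    raised = False
except ValueError:
    raised = True

assert raised                                       # float literals are invalid
assert int("10000000000") == 10000000000            # plain integer literals parse
assert int("ff", 16) == 255                         # base parameter still honoured
assert int(float("10000000000.0")) == 10000000000   # explicit conversion works
```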
From martin at v.loewis.de Mon Oct 24 08:28:14 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 24 Oct 2005 08:28:14 +0200 Subject: [Python-Dev] Divorcing str and unicode (no more implicit conversions). In-Reply-To: <50862ebd0510232041j19b6b3achd0aecb072f9db148@mail.gmail.com> References: <5.0.2.1.1.20051022120117.02b58da0@mail.oz.net> <50862ebd0510232041j19b6b3achd0aecb072f9db148@mail.gmail.com> Message-ID: <435C7EFE.6060504@v.loewis.de> Neil Hodgson wrote: > I'd like to more tightly define Unicode strings for Python 3000. > Currently, Unicode strings may be implemented with either 2 byte > (UCS-2) or 4 byte (UTF-32) elements. Python should allow strings to > contain any Unicode character and should be indexable yielding > characters rather than half characters. Therefore Python strings > should appear to be UTF-32. There could still be multiple > implementations (using UTF-16 or UTF-8) to preserve space but all > implementations should appear to be the same apart from speed and > memory use. That's very tricky. If you have multiple implementations, you make usage at the C API difficult. If you make it either UTF-8 or UTF-32, you make PythonWin difficult. If you make it UTF-16, you make indexing difficult. Regards, Martin From martin at v.loewis.de Mon Oct 24 08:30:52 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 24 Oct 2005 08:30:52 +0200 Subject: [Python-Dev] Divorcing str and unicode (no more implicit conversions). In-Reply-To: <5.1.1.6.0.20051023221433.02ab8200@mail.telecommunity.com> References: <5.0.2.1.1.20051022120117.02b58da0@mail.oz.net> <5.0.2.1.1.20051022120117.02b58da0@mail.oz.net> <5.1.1.6.0.20051023221433.02ab8200@mail.telecommunity.com> Message-ID: <435C7F9C.1050602@v.loewis.de> Phillip J. 
Eby wrote: > I'm tempted to say it would be even better if there was a command line > option that could be used to force all binary opens to result in bytes, and > require all text opens to specify an encoding. For Python 3000? -1. There shouldn't be command line switches that have that much importance. For Python 2.x? Well, we are not supposed to discuss this. Regards, Martin From nyamatongwe at gmail.com Mon Oct 24 09:24:23 2005 From: nyamatongwe at gmail.com (Neil Hodgson) Date: Mon, 24 Oct 2005 17:24:23 +1000 Subject: [Python-Dev] Divorcing str and unicode (no more implicit conversions). In-Reply-To: <435C7EFE.6060504@v.loewis.de> References: <5.0.2.1.1.20051022120117.02b58da0@mail.oz.net> <50862ebd0510232041j19b6b3achd0aecb072f9db148@mail.gmail.com> <435C7EFE.6060504@v.loewis.de> Message-ID: <50862ebd0510240024o2f339a0bwbe25bee0639dd229@mail.gmail.com> Martin v. Löwis: > That's very tricky. If you have multiple implementations, you make > usage at the C API difficult. If you make it either UTF-8 or UTF-32, > you make PythonWin difficult. If you make it UTF-16, you make indexing > difficult. For Windows, the code will get a little uglier, needing to perform an allocation/encoding and deallocation more often than at present but I don't think there will be a speed degradation as Windows is currently performing a conversion from 8 bit to UTF-16 inside many system calls. To minimize the cost of allocation, Python could copy Windows in keeping a small number of commonly sized preallocated buffers handy. For indexing UTF-16, a flag could be set to show if the string is all in the base plane and if not, an index could be constructed when and if needed. It'd be good to get some feel for what proportion of string operations performed require indexing. Many, such as startswith, split, and concatenation don't require indexing.
The proportion of operations that use indexing to scan strings would also be interesting as adding a (currentIndex, currentOffset) cursor to string objects would be another approach. Neil From michele.simionato at gmail.com Mon Oct 24 09:30:02 2005 From: michele.simionato at gmail.com (Michele Simionato) Date: Mon, 24 Oct 2005 07:30:02 +0000 Subject: [Python-Dev] Definining properties - a use case for class decorators? In-Reply-To: <435B5D6E.80101@gmail.com> References: <1129601328.9405.13.camel@geddy.wooz.org> <4edc17eb0510190151v6a078f54n2135b8d8386c796a@mail.gmail.com> <5.1.1.6.0.20051019124840.01fa73b8@mail.telecommunity.com> <17238.40158.735826.504410@montanaro.dyndns.org> <4edc17eb0510200035u370b57f9ub1d66b4e99d1be62@mail.gmail.com> <435B5D6E.80101@gmail.com> Message-ID: <4edc17eb0510240030x74523a8fl1a9389b5054c061b@mail.gmail.com> On 10/23/05, Nick Coghlan wrote: > Very nice indeed. I'd be more supportive if it was defined as a new statement > such as "create" with the syntax: > > create TYPE NAME(ARGS): > BLOCK I like it, but it would require a new keyword. Alternatively, one could abuse 'def': def TYPE NAME(ARGS): BLOCK but then people would likely be confused as Skip was, earlier in this thread, so I guess 'def' is not an option. IMHO a new keyword could be justified for such a powerful feature, but only Guido's opinion counts on these matters ;) Anyway I expected people to criticize the proposal as too powerful and dangerously close to Lisp macros.
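The 'module' helper from the original proposal already works today as a plain function, without any new syntax; a runnable sketch (the make_module name is illustrative):

```python
import types

def make_module(name, attrs):
    # Build a module object from a plain dict of attributes, mirroring
    # what the proposed "module m(): ..." block statement would do.
    mod = types.ModuleType(name, attrs.get('__doc__'))
    for key, value in attrs.items():
        setattr(mod, key, value)
    return mod

m = make_module('m', {'a': 1, 'b': 2})
assert m.a == 1 and m.b == 2
```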
Michele Simionato From fredrik at pythonware.com Mon Oct 24 09:41:39 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Mon, 24 Oct 2005 09:41:39 +0200 Subject: [Python-Dev] int(string) References: <1f7befae0510211952x5eb2000bicdf3c1a80a3f5749@mail.gmail.com> <2mfyqu6j55.fsf@starship.python.net> <1f7befae0510221038u4bd75ca5l67dc930ae16d24b7@mail.gmail.com> <435C747E.9060005@esrgtech.com> Message-ID: Alan McIntyre wrote: > When running "make test" I get some errors in test_array and > test_compile that did not occur in the build from CVS. Given the inputs > to long() have '.' characters in them, I assume that these tests really > should be failing as implemented, but I haven't dug into them to see > what's going on: > > ====================================================================== > ERROR: test_repr (__main__.FloatTest) > ---------------------------------------------------------------------- > Traceback (most recent call last): > File "Lib/test/test_array.py", line 187, in test_repr > self.assertEqual(a, eval(repr(a), {"array": array.array})) > ValueError: invalid literal for long(): 10000000000.0 > > ====================================================================== > ERROR: test_repr (__main__.DoubleTest) > ---------------------------------------------------------------------- > Traceback (most recent call last): > File "Lib/test/test_array.py", line 187, in test_repr > self.assertEqual(a, eval(repr(a), {"array": array.array})) > ValueError: invalid literal for long(): 10000000000.0 I don't have the latest cvs, but in my copy of test_array, the input to those two eval calls are array('f', [-42.0, 0.0, 42.0, 100000.0, -10000000000.0, -42.0, 0.0, 42.0, 100000.0, -10000000000.0]) and array('d', [-42.0, 0.0, 42.0, 100000.0, -10000000000.0, -42.0, 0.0, 42.0, 100000.0, -10000000000.0]) respectively. if either of those gives "invalid literal for long", something's seriously broken. does a plain a = -10000000000.0 still work on your machine? 
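The roundtrip that test_array exercises is easy to check directly; with a working float parser it must succeed:

```python
from array import array

# repr() of an array must eval() back to an equal array, which requires
# the float literals inside it (including -10000000000.0) to parse
a = array('d', [-42.0, 0.0, 42.0, 100000.0, -10000000000.0])
b = eval(repr(a), {"array": array})
assert a == b
```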
From jcarlson at uci.edu Mon Oct 24 10:19:23 2005 From: jcarlson at uci.edu (Josiah Carlson) Date: Mon, 24 Oct 2005 01:19:23 -0700 Subject: [Python-Dev] Definining properties - a use case for class decorators? In-Reply-To: <4edc17eb0510240030x74523a8fl1a9389b5054c061b@mail.gmail.com> References: <435B5D6E.80101@gmail.com> <4edc17eb0510240030x74523a8fl1a9389b5054c061b@mail.gmail.com> Message-ID: <20051024011400.38DA.JCARLSON@uci.edu> Michele Simionato wrote: > > On 10/23/05, Nick Coghlan wrote: > > Very nice indeed. I'd be more supportive if it was defined as a new statement > > such as "create" with the syntax: > > > > create TYPE NAME(ARGS): > > BLOCK > > I like it, but it would require a new keyword. Alternatively, one > could abuse 'def': > > def TYPE NAME(ARGS): > BLOCK > > but then people would likely be confused as Skip was, earlier in this thread, > so I guess 'def' is a not an option. > > IMHO a new keyword could be justified for such a powerful feature, > but only Guido's opinion counts on this matters ;) > > Anyway I expected people to criticize the proposal as too powerful and > dangerously close to Lisp macros. I would criticise it for being dangerously close to worthless. With the minor support code that I (and others) have offered, no new syntax is necessary. You can get the same semantics with... class NAME(_(TYPE), ARGS): BLOCK And a suitably defined _. Remember, not every X line function should be made a builtin or syntax. - Josiah From michele.simionato at gmail.com Mon Oct 24 10:33:18 2005 From: michele.simionato at gmail.com (Michele Simionato) Date: Mon, 24 Oct 2005 08:33:18 +0000 Subject: [Python-Dev] Definining properties - a use case for class decorators? 
In-Reply-To: <20051024011400.38DA.JCARLSON@uci.edu> References: <435B5D6E.80101@gmail.com> <4edc17eb0510240030x74523a8fl1a9389b5054c061b@mail.gmail.com> <20051024011400.38DA.JCARLSON@uci.edu> Message-ID: <4edc17eb0510240133v26cc056em47dee26b901460b9@mail.gmail.com> On 10/24/05, Josiah Carlson wrote: > I would criticise it for being dangerously close to worthless. With the > minor support code that I (and others) have offered, no new syntax is > necessary. > > You can get the same semantics with... > > class NAME(_(TYPE), ARGS): > BLOCK > > And a suitably defined _. Remember, not every X line function should be > made a builtin or syntax. > > - Josiah Could you re-read my original message, please? Sugar is *everything* in this case. If the functionality is to be implemented via a __metaclass__ hook, then it should be considered a hack that nobody in his right mind should use. OTOH, if there is a specific syntax for it, then it means that the usage has the benediction of the BDFL. This would be a HUGE change. For instance, I would never abuse metaclasses for that, whereas I would freely use a 'create' statement. Michele Simionato From mal at egenix.com Mon Oct 24 10:40:28 2005 From: mal at egenix.com (M.-A. Lemburg) Date: Mon, 24 Oct 2005 10:40:28 +0200 Subject: [Python-Dev] Divorcing str and unicode (no more implicit conversions). In-Reply-To: <50862ebd0510232041j19b6b3achd0aecb072f9db148@mail.gmail.com> References: <5.0.2.1.1.20051022120117.02b58da0@mail.oz.net> <50862ebd0510232041j19b6b3achd0aecb072f9db148@mail.gmail.com> Message-ID: <435C9DFC.8020501@egenix.com> Neil Hodgson wrote: > Guido van Rossum: > > >>Folks, please focus on what Python 3000 should do. >> >>I'm thinking about making all character strings Unicode (possibly with >>different internal representations a la NSString in Apple's Objective >>C) and introduce a separate mutable bytes array data type. But I could >>use some validation or feedback on this idea from actual >>practitioners.
> > > I'd like to more tightly define Unicode strings for Python 3000. > Currently, Unicode strings may be implemented with either 2 byte > (UCS-2) or 4 byte (UTF-32) elements. Python should allow strings to > contain any Unicode character and should be indexable yielding > characters rather than half characters. Therefore Python strings > should appear to be UTF-32. There could still be multiple > implementations (using UTF-16 or UTF-8) to preserve space but all > implementations should appear to be the same apart from speed and > memory use. There seems to be a general misunderstanding here: even if you have UCS4 storage, it is still possible to slice a Unicode string in a way which breaks correct rendering. Unicode has the concept of combining code points, e.g. you can store an "é" (e with an accent) as "e" + a combining acute accent. Now if you slice off the accent, you'll break the character that you encoded using combining code points. Note that combining code points are rather common in encodings of Asian scripts, so this is not an artificial example. Some time ago I proposed a new module called unicodeindex to help with indexing. It would solve most of the indexing issues you run into when dealing with Unicode. I've attached it to this email for reference. More on the used terms: http://www.egenix.com/files/python/EuroPython2002-Python-and-Unicode.pdf http://www.egenix.com/files/python/LSM2005-Developing-Unicode-aware-applications-in-Python.pdf -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Oct 24 2005) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed...
Name: pep-unicodeindex.txt Url : http://mail.python.org/pipermail/python-dev/attachments/20051024/dacea951/pep-unicodeindex.txt From walter at livinglogic.de Mon Oct 24 11:00:42 2005 From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=) Date: Mon, 24 Oct 2005 11:00:42 +0200 Subject: [Python-Dev] New codecs checked in In-Reply-To: <435B95C0.9060005@v.loewis.de> References: <435965E4.5050207@egenix.com> <435B95C0.9060005@v.loewis.de> Message-ID: <435CA2BA.7050900@livinglogic.de> Martin v. Löwis wrote: > M.-A. Lemburg wrote: > >>I've checked in a whole bunch of newly generated codecs >>which now make use of the faster charmap decoding variant added >>by Walter a short while ago. >> >>Please let me know if you find any problems. > > I think we should work on eliminating the decoding_map variables. > There are some codecs which rely on them being present in other codecs > (e.g. koi8_u.py is based on koi8_r.py); however, this could be updated > to use, say > > decoding_table = codecs.update_decoding_map(koi8_r.decoding_table, { > 0x00a4: 0x0454, # CYRILLIC SMALL LETTER UKRAINIAN IE > 0x00a6: 0x0456, # CYRILLIC SMALL LETTER > BYELORUSSIAN-UKRAINIAN I > 0x00a7: 0x0457, # CYRILLIC SMALL LETTER YI (UKRAINIAN) > 0x00ad: 0x0491, # CYRILLIC SMALL LETTER UKRAINIAN GHE > WITH UPTURN > 0x00b4: 0x0404, # CYRILLIC CAPITAL LETTER UKRAINIAN IE > 0x00b6: 0x0406, # CYRILLIC CAPITAL LETTER > BYELORUSSIAN-UKRAINIAN I > 0x00b7: 0x0407, # CYRILLIC CAPITAL LETTER YI (UKRAINIAN) > 0x00bd: 0x0490, # CYRILLIC CAPITAL LETTER UKRAINIAN GHE > WITH UPTURN > }) > > With all these cross-references gone, the decoding_maps could also go. Why should koi8_u.py be defined in terms of koi8_r.py anyway? Why not put a complete decoding_table into koi8_u.py? I'd like to suggest a small cosmetic change: gencodec.py should output byte values with two hexdigits instead of four. This makes it easier to see what is a byte value and what is a codepoint. And it would make grepping for stuff simpler. I.e.
change: decoding_map.update({ 0x0080: 0x0402, # CYRILLIC CAPITAL LETTER DJE to decoding_map.update({ 0x80: 0x0402, # CYRILLIC CAPITAL LETTER DJE and decoding_table = ( u'\x00' # 0x0000 -> NULL to decoding_table = ( u'\x00' # 0x00 -> U+0000 NULL and encoding_map = { 0x0000: 0x0000, # NULL to encoding_map = { 0x0000: 0x00, # NULL From mal at egenix.com Mon Oct 24 11:25:27 2005 From: mal at egenix.com (M.-A. Lemburg) Date: Mon, 24 Oct 2005 11:25:27 +0200 Subject: [Python-Dev] New codecs checked in In-Reply-To: <435CA2BA.7050900@livinglogic.de> References: <435965E4.5050207@egenix.com> <435B95C0.9060005@v.loewis.de> <435CA2BA.7050900@livinglogic.de> Message-ID: <435CA887.3020900@egenix.com> Walter D?rwald wrote: > Martin v. L?wis wrote: > >> M.-A. Lemburg wrote: >> >>> I've checked in a whole bunch of newly generated codecs >>> which now make use of the faster charmap decoding variant added >>> by Walter a short while ago. >>> >>> Please let me know if you find any problems. >> >> >> I think we should work on eliminating the decoding_map variables. >> There are some codecs which rely on them being present in other codecs >> (e.g. koi8_u.py is based on koi8_r.py); however, this could be updated >> to use, say >> >> decoding_table = codecs.update_decoding_map(koi8_r.decoding_table, { >> 0x00a4: 0x0454, # CYRILLIC SMALL LETTER UKRAINIAN IE >> 0x00a6: 0x0456, # CYRILLIC SMALL LETTER >> BYELORUSSIAN-UKRAINIAN I >> 0x00a7: 0x0457, # CYRILLIC SMALL LETTER YI (UKRAINIAN) >> 0x00ad: 0x0491, # CYRILLIC SMALL LETTER UKRAINIAN GHE >> WITH UPTURN >> 0x00b4: 0x0404, # CYRILLIC CAPITAL LETTER UKRAINIAN IE >> 0x00b6: 0x0406, # CYRILLIC CAPITAL LETTER >> BYELORUSSIAN-UKRAINIAN I >> 0x00b7: 0x0407, # CYRILLIC CAPITAL LETTER YI (UKRAINIAN) >> 0x00bd: 0x0490, # CYRILLIC CAPITAL LETTER UKRAINIAN GHE >> WITH UPTURN >> }) >> >> With all these cross-references gone, the decoding_maps could also go. 
I just left them in because I thought they wouldn't do any harm and might be useful in some applications. Removing them where not directly needed by the codec would not be a problem. > Why should koi_u.py be defined in terms of koi8_r.py anyway? Why not put > a complete decoding_table into koi8_u.py? KOI8-U is not available as mapping on ftp.unicode.org and I only recreated codecs from the mapping files available there. > I'd like to suggest a small cosmetic change: gencodec.py should output > byte values with two hexdigits instead of four. This makes it easier to > see what is a byte values and what is a codepoint. And it would make > grepping for stuff simpler. True. I'll rerun the creation with the above changes sometime this week. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Oct 24 2005) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From ncoghlan at gmail.com Mon Oct 24 11:35:19 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 24 Oct 2005 19:35:19 +1000 Subject: [Python-Dev] Definining properties - a use case for class decorators? In-Reply-To: <20051024011400.38DA.JCARLSON@uci.edu> References: <435B5D6E.80101@gmail.com> <4edc17eb0510240030x74523a8fl1a9389b5054c061b@mail.gmail.com> <20051024011400.38DA.JCARLSON@uci.edu> Message-ID: <435CAAD7.7010909@gmail.com> Josiah Carlson wrote: > You can get the same semantics with... > > class NAME(_(TYPE), ARGS): > BLOCK > > And a suitably defined _. Remember, not every X line function should be > made a builtin or syntax. 
And this would be an extremely fragile hack that is entirely dependent on the
murky rules regarding how Python chooses the metaclass for the newly created
class. Ensuring that the metaclass of the class returned by "_" was always
the one chosen would be tricky at best and impossible at worst.

Even if it *could* be done, I'd never want to see a hack like that in
production code I had anything to do with. And while writing it with
"__metaclass__" has precisely the correct semantics, that simply isn't as
readable as a new block statement would be, nor is it as readable as the
current major alternatives (e.g., defining and invoking a factory function).

An alternative to a completely new function would be to simply allow the
metaclass to be defined up front, rather than inside the body of the class
statement:

  class @TYPE NAME(ARGS):
      BLOCK

For example:

  class @Property x():
      def get(self):
          return self._x
      def set(self, value):
          self._x = value
      def delete(self):
          del self._x

(I put the metaclass after the keyword, because, unlike a function decorator,
the metaclass is invoked *before* the class is created, and because you're
only allowed one explicit metaclass)

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
 http://boredomandlaziness.blogspot.com

From mal at egenix.com  Mon Oct 24 11:43:09 2005
From: mal at egenix.com (M.-A. Lemburg)
Date: Mon, 24 Oct 2005 11:43:09 +0200
Subject: [Python-Dev] Divorcing str and unicode (no more implicit conversions).
In-Reply-To: <5.0.2.1.1.20051022120117.02b58da0@mail.oz.net>
References: <5.0.2.1.1.20051022120117.02b58da0@mail.oz.net>
Message-ID: <435CACAD.9070106@egenix.com>

Bengt Richter wrote:
> Please bear with me for a few paragraphs ;-)

Please note that source code encoding doesn't really have anything to do
with the way the interpreter executes the program - it's merely a way to
tell the parser how to convert string literals (currently only the Unicode
ones) into constant Unicode objects within the program text. It's also a
nice way to let other people know what kind of encoding you used to write
your comments ;-)

Nothing more. Once a module is compiled, there's no distinction between a
module using the latin-1 source code encoding or one using the utf-8
encoding.

Thanks,
-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Oct 24 2005)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________

::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! ::::

From ncoghlan at gmail.com  Mon Oct 24 11:14:32 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 24 Oct 2005 19:14:32 +1000
Subject: [Python-Dev] PEP 351, the freeze protocol
In-Reply-To: <1130107429.11268.40.camel@geddy.wooz.org>
References: <1130107429.11268.40.camel@geddy.wooz.org>
Message-ID: <435CA5F8.7010607@gmail.com>

Barry Warsaw wrote:
> I've had this PEP laying around for quite a few months. It was inspired
> by some code we'd written which wanted to be able to get immutable
> versions of arbitrary objects. I've finally finished the PEP, uploaded
> a sample patch (albeit a bit incomplete), and I'm posting it here to see
> if there is any interest.
>
> http://www.python.org/peps/pep-0351.html

I think it's definitely worth considering.
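[The behaviour being discussed can be sketched roughly as follows - a
modern-Python approximation of the PEP 351 idea, not the PEP's actual
implementation; the recursive handling of containers and the use of a
plain tuple in place of the PEP's imdict type are assumptions made purely
for illustration.]

```python
def freeze(obj):
    """Return an immutable equivalent of obj (sketch of the PEP 351 idea).

    Illustrative only: recursively converts the common mutable containers
    to their immutable counterparts and passes hashable objects through.
    The real proposal works via a __freeze__ hook instead.
    """
    if isinstance(obj, list):
        return tuple(freeze(item) for item in obj)
    if isinstance(obj, set):
        return frozenset(freeze(item) for item in obj)
    if isinstance(obj, dict):
        # The PEP proposes an immutable dict (imdict); a sorted tuple of
        # frozen items stands in for it here.
        return tuple(sorted((freeze(k), freeze(v)) for k, v in obj.items()))
    hash(obj)  # raises TypeError for unhashable (unfreezable) objects
    return obj

frozen = freeze({1: [2, 3, 4]})
```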
It may also reduce the need for "x" and "frozenx" builtin pairs. We already
have "set" and "frozenset", and the various "bytes" ideas that have been
kicked around have generally considered the need for a "frozenbytes" as
well. If freeze was available, then "freeze(x(*args))" might serve as a
replacement for any builtin "frozen" variants.

I think having dicts and sets automatically invoke freeze would be a
mistake, because at least one of the following two cases would behave
unexpectedly:

  d = {}
  l = []
  d[l] = "Oops!"
  d[l]  # Raises KeyError if freeze() isn't also invoked in __getitem__

  d = {}
  l = []
  d[l] = "Oops!"
  l.append(1)
  d[l]  # Raises KeyError regardless

Oh, and the PEP's xdict example is even more broken than the PEP implies,
because two imdicts which compare equal (same contents) may not hash equal
(different id's).

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
 http://boredomandlaziness.blogspot.com

From radeex at gmail.com  Mon Oct 24 11:54:20 2005
From: radeex at gmail.com (Christopher Armstrong)
Date: Mon, 24 Oct 2005 20:54:20 +1100
Subject: [Python-Dev] PEP 351, the freeze protocol
In-Reply-To: <20051023194708.38D7.JCARLSON@uci.edu>
References: <1130107429.11268.40.camel@geddy.wooz.org>
	<20051023194708.38D7.JCARLSON@uci.edu>
Message-ID: <60ed19d40510240254h7e077a74hf719abcf6e5a4ad@mail.gmail.com>

On 10/24/05, Josiah Carlson wrote:
> "Should dicts and sets automatically freeze their mutable keys?"
>
> Dictionaries don't have mutable keys,

Since when?

  class Foo:
      def __init__(self):
          self.x = 1

  f = Foo()
  d = {f: 1}
  f.x = 2

Maybe you meant something else? I can't think of any way in which
"dictionaries don't have mutable keys" is true. The only rule about
dictionary keys that I know of is that they need to be hashable and
need to be comparable with the equality operator.
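[Christopher's point can be made concrete: a user-defined key whose hash
depends on mutable state silently becomes unfindable once mutated. The
class below is invented for illustration; the "False" results reflect
CPython's open-addressing dict and are not a language guarantee.]

```python
class Key:
    """A deliberately bad dict key: its hash tracks mutable state."""
    def __init__(self, x):
        self.x = x
    def __hash__(self):
        return hash(self.x)
    def __eq__(self, other):
        return isinstance(other, Key) and self.x == other.x

k = Key(1)
d = {k: "value"}
k.x = 2  # mutate the key after insertion

# The entry is now unreachable from either direction: the mutated key
# hashes into the wrong bucket, and a fresh Key(1) hashes into the right
# bucket but no longer compares equal to the stored (mutated) key.
lost_via_mutated = k in d       # False in CPython
lost_via_fresh = Key(1) in d    # False as well
```

The entry still occupies a slot (len(d) is still 1); it is simply
unreachable, which is exactly the failure mode freeze() is meant to prevent.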
-- 
  Twisted   |  Christopher Armstrong: International Man of Twistery
   Radix    |    -- http://radix.twistedmatrix.com
    |       |  Release Manager, Twisted Project
  \\\V///   |    -- http://twistedmatrix.com
   |o O|    |
w----v----w-+

From jcarlson at uci.edu  Mon Oct 24 12:09:07 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Mon, 24 Oct 2005 03:09:07 -0700
Subject: [Python-Dev] Definining properties - a use case for class decorators?
In-Reply-To: <435CAAD7.7010909@gmail.com>
References: <20051024011400.38DA.JCARLSON@uci.edu> <435CAAD7.7010909@gmail.com>
Message-ID: <20051024025358.38E5.JCARLSON@uci.edu>

Nick Coghlan wrote:
>
> Josiah Carlson wrote:
> > You can get the same semantics with...
> >
> > class NAME(_(TYPE), ARGS):
> >     BLOCK
> >
> > And a suitably defined _. Remember, not every X line function should be
> > made a builtin or syntax.
>
> And this would be an extremely fragile hack that is entirely dependent on
> the murky rules regarding how Python chooses the metaclass for the newly
> created class. Ensuring that the metaclass of the class returned by "_"
> was always the one chosen would be tricky at best and impossible at worst.

The rules for which metaclass is used are listed in the metaclass
documentation. I personally never claimed it was perfect, and neither is
this one...

  class NAME(_(TYPE, ARGS)):
      BLOCK

But it does solve the problem without needing syntax (and fixes any
possible metaclass order choices).

> Even if it *could* be done, I'd never want to see a hack like that in
> production code I had anything to do with.

That's perfectly reasonable.
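[A "suitably defined _" of the kind Josiah describes can be sketched like
this - illustrative only, written with the modern metaclass machinery
rather than the 2.x __metaclass__ attribute, and with invented helper
names throughout.]

```python
def _(func):
    """Return a base class whose subclass bodies are handed to func.

    A 'class' statement inheriting from _(func) never produces a class:
    the metaclass intercepts creation and returns func(namespace), where
    namespace is the dict of names defined in the class body.
    """
    class BlockMeta(type):
        def __new__(mcs, name, bases, namespace):
            if not bases:  # creating the helper base class itself
                return super().__new__(mcs, name, bases, namespace)
            return func(namespace)
    return BlockMeta("Block", (), {})

def make_property(ns):
    # Pull the accessor functions out of the captured class body.
    return property(ns.get("get"), ns.get("set"), ns.get("delete"))

class Example:
    class x(_(make_property)):
        def get(self):
            return self._x
        def set(self, value):
            self._x = value

obj = Example()
obj.x = 42
```

Here "class x(...)" binds x to an ordinary property object, which is the
fragility Nick objects to: the statement no longer creates a class at all.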
> (I put the metaclass after the keyword, because, unlike a function
> decorator, the metaclass is invoked *before* the class is created, and
> because you're only allowed one explicit metaclass)

Perhaps, but because the metaclass can return anything (in this case, it
returns a property), being able to modify the object that is created may be
desirable...at which point, we may as well get class decorators for the
built-in chaining.

- Josiah

From jcarlson at uci.edu  Mon Oct 24 12:15:17 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Mon, 24 Oct 2005 03:15:17 -0700
Subject: [Python-Dev] PEP 351, the freeze protocol
In-Reply-To: <60ed19d40510240254h7e077a74hf719abcf6e5a4ad@mail.gmail.com>
References: <20051023194708.38D7.JCARLSON@uci.edu>
	<60ed19d40510240254h7e077a74hf719abcf6e5a4ad@mail.gmail.com>
Message-ID: <20051024030952.38E8.JCARLSON@uci.edu>

Christopher Armstrong wrote:
>
> On 10/24/05, Josiah Carlson wrote:
> > "Should dicts and sets automatically freeze their mutable keys?"
> >
> > Dictionaries don't have mutable keys,
>
> Since when?
>
> Maybe you meant something else? I can't think of any way in which
> "dictionaries don't have mutable keys" is true. The only rule about
> dictionary keys that I know of is that they need to be hashable and
> need to be comparable with the equality operator.

Good point, I forgot about user-defined classes (I rarely use them as keys
myself; it's all too easy to make a mutable whose hash is dependent on
mutable contents, and an object which you can only find if you have the
exact object is not quite as useful as what I generally need). I will,
however, stand by, "a container which is frozen should have its contents
frozen as well."
- Josiah From walter at livinglogic.de Mon Oct 24 12:17:31 2005 From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=) Date: Mon, 24 Oct 2005 12:17:31 +0200 Subject: [Python-Dev] New codecs checked in In-Reply-To: <435CA887.3020900@egenix.com> References: <435965E4.5050207@egenix.com> <435B95C0.9060005@v.loewis.de> <435CA2BA.7050900@livinglogic.de> <435CA887.3020900@egenix.com> Message-ID: <435CB4BB.6070009@livinglogic.de> M.-A. Lemburg wrote: > Walter D?rwald wrote: > >>Martin v. L?wis wrote: >> >>>M.-A. Lemburg wrote: >>> >>>>I've checked in a whole bunch of newly generated codecs >>>>which now make use of the faster charmap decoding variant added >>>>by Walter a short while ago. >>>> >>>>Please let me know if you find any problems. >>> >>>I think we should work on eliminating the decoding_map variables. >>>There are some codecs which rely on them being present in other codecs >>>(e.g. koi8_u.py is based on koi8_r.py); however, this could be updated >>>to use, say >>> >>>decoding_table = codecs.update_decoding_map(koi8_r.decoding_table, { >>> 0x00a4: 0x0454, # CYRILLIC SMALL LETTER UKRAINIAN IE >>> 0x00a6: 0x0456, # CYRILLIC SMALL LETTER >>>BYELORUSSIAN-UKRAINIAN I >>> 0x00a7: 0x0457, # CYRILLIC SMALL LETTER YI (UKRAINIAN) >>> 0x00ad: 0x0491, # CYRILLIC SMALL LETTER UKRAINIAN GHE >>>WITH UPTURN >>> 0x00b4: 0x0404, # CYRILLIC CAPITAL LETTER UKRAINIAN IE >>> 0x00b6: 0x0406, # CYRILLIC CAPITAL LETTER >>>BYELORUSSIAN-UKRAINIAN I >>> 0x00b7: 0x0407, # CYRILLIC CAPITAL LETTER YI (UKRAINIAN) >>> 0x00bd: 0x0490, # CYRILLIC CAPITAL LETTER UKRAINIAN GHE >>>WITH UPTURN >>>}) >>> >>>With all these cross-references gone, the decoding_maps could also go. > > I just left them in because I thought they wouldn't do any harm > and might be useful in some applications. > > Removing them where not directly needed by the codec would not > be a problem. Recreating them is quite simple via dict(enumerate(decoding_table)) so I think we should remove them. 
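[The relationship Walter refers to is easy to demonstrate: decoding_table
is a 256-character string indexed by byte value, and the dict-style
decoding_map can be recreated from it exactly as he says. The identity
table below is a toy stand-in, not one of the real codec tables.]

```python
# Toy decoding table: a 256-character string, one character per byte value.
# Real codec tables map most bytes to other code points; the identity
# mapping is used here purely for illustration.
decoding_table = "".join(chr(i) for i in range(256))

# Recreating the old decoding_map, as suggested. Note that this maps byte
# values to one-character strings; the historical maps used integer code
# points, so a faithful conversion would apply ord() to each value.
decoding_map = dict(enumerate(decoding_table))

assert decoding_map[0x41] == "A"
assert {k: ord(v) for k, v in decoding_map.items()}[0x41] == 0x41
```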
>>Why should koi8_u.py be defined in terms of koi8_r.py anyway? Why not put
>>a complete decoding_table into koi8_u.py?
>
> KOI8-U is not available as mapping on ftp.unicode.org and
> I only recreated codecs from the mapping files available
> there.

OK, so we'd need something that creates a new decoding table from an old
one + changes, i.e. something like:

def update_decoding_table(table, new):
    table = list(table)
    for (key, value) in new.iteritems():
        table[key] = unichr(value)
    return u"".join(table)

>>I'd like to suggest a small cosmetic change: gencodec.py should output
>>byte values with two hexdigits instead of four. This makes it easier to
>>see what is a byte value and what is a codepoint. And it would make
>>grepping for stuff simpler.
>
> True.
>
> I'll rerun the creation with the above changes sometime this
> week.

Great, thanks!

Bye,
   Walter Dörwald

From mwh at python.net  Mon Oct 24 12:24:34 2005
From: mwh at python.net (Michael Hudson)
Date: Mon, 24 Oct 2005 11:24:34 +0100
Subject: [Python-Dev] Definining properties - a use case for class decorators?
In-Reply-To: <435CAAD7.7010909@gmail.com> (Nick Coghlan's message of "Mon, 24
	Oct 2005 19:35:19 +1000")
References: <435B5D6E.80101@gmail.com>
	<4edc17eb0510240030x74523a8fl1a9389b5054c061b@mail.gmail.com>
	<20051024011400.38DA.JCARLSON@uci.edu> <435CAAD7.7010909@gmail.com>
Message-ID: <2m4q776wm5.fsf@starship.python.net>

Nick Coghlan writes:

> Josiah Carlson wrote:
>> You can get the same semantics with...
>>
>> class NAME(_(TYPE), ARGS):
>>     BLOCK
>>
>> And a suitably defined _. Remember, not every X line function should be
>> made a builtin or syntax.
>
> And this would be an extremely fragile hack that is entirely
> dependent on the murky rules regarding how Python chooses the
> metaclass for the newly created class.

Uh, not really. In the presence of base classes it's always "the type
of the first base".
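[Michael's rule is easy to verify interactively - shown here with the
modern metaclass spelling rather than the 2.x __metaclass__ attribute;
the class names are invented for illustration.]

```python
class Meta(type):
    """A do-nothing metaclass, used only to observe which one gets picked."""

class Base(metaclass=Meta):
    pass

# No explicit metaclass here: Python derives it from the bases, so
# Derived ends up being an instance of Meta, the type of its (first) base.
class Derived(Base):
    pass

assert type(Derived) is Meta
```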
The reason it might not seem this simple is that most metaclasses end
up calling type.__new__ at some point and this function does more
complicated things (such as checking for metaclass conflict and
deferring to the most specific metaclass).

Not sure what the context is here, but I have to butt in when I see
people complicating things which aren't actually that complicated...

Cheers,
mwh

-- 
  There's an aura of unholy black magic about CLISP.  It works, but
  I have no idea how it does it.  I suspect there's a goat involved
  somewhere.                     -- Johann Hibschman, comp.lang.scheme

From jcarlson at uci.edu  Mon Oct 24 12:54:04 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Mon, 24 Oct 2005 03:54:04 -0700
Subject: [Python-Dev] PEP 351, the freeze protocol
In-Reply-To: <435CA5F8.7010607@gmail.com>
References: <1130107429.11268.40.camel@geddy.wooz.org>
	<435CA5F8.7010607@gmail.com>
Message-ID: <20051024034957.38EB.JCARLSON@uci.edu>

Nick Coghlan wrote:
> I think having dicts and sets automatically invoke freeze would be a
> mistake, because at least one of the following two cases would behave
> unexpectedly:

I'm pretty sure that the PEP was only asking if one would freeze the
contents of dicts IF the dict was being frozen. That is, which of the
following should be the case:

freeze({1:[2,3,4]}) -> {1:[2,3,4]}
freeze({1:[2,3,4]}) -> xdict(1=(2,3,4))

- Josiah

From jcarlson at uci.edu  Mon Oct 24 12:54:55 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Mon, 24 Oct 2005 03:54:55 -0700
Subject: [Python-Dev] Definining properties - a use case for class decorators?
In-Reply-To: <4edc17eb0510240133v26cc056em47dee26b901460b9@mail.gmail.com>
References: <20051024011400.38DA.JCARLSON@uci.edu>
	<4edc17eb0510240133v26cc056em47dee26b901460b9@mail.gmail.com>
Message-ID: <20051024020601.38DF.JCARLSON@uci.edu>

Michele Simionato wrote:
>
> On 10/24/05, Josiah Carlson wrote:
> > I would criticise it for being dangerously close to worthless.
With the > > minor support code that I (and others) have offered, no new syntax is > > necessary. > > > > You can get the same semantics with... > > > > class NAME(_(TYPE), ARGS): > > BLOCK > > > > And a suitably defined _. Remember, not every X line function should be > > made a builtin or syntax. > > > > - Josiah > > Could you re-read my original message, please? Sugar is *everything* > in this case. If the functionality is to be implemented via a __metaclass__ > hook, then it should be considered a hack that nobody in his right mind > should use. OTOH, if there is a specific syntax for it, then it means > this the usage > has the benediction of the BDFL. This would be a HUGE change. > For instance, I would never abuse metaclasses for that, whereas I > would freely use a 'create' statement. Metaclass abuse? Oh, I'm sorry, I thought that the point of metaclasses were to offer a way to make "magic" happen in a somewhat pragmatic manner, you know, through metaprogramming. I would call this particular use a practical application of standard Python semantics. Pardon me while I attempt to re-parse your above statement... "If there is a specific syntax for [passing a temporary namespace to a callable, created by some sort of block mechanism], then [using it for property creation] has the benediction of the BDFL". What I'm trying to say is that it already has a no-syntax syntax. It uses the "magic" of metaclasses, but one can make that "magic" as explicit as necessary. class NAME(PassNamespaceFromClassBlock(fcn=TYPE, args=ARGS)): BLOCK Personally, I've not seen the desire to pass temporary namespaces to functions until recently, so whether or not people will use it for property creation, or any other way that people would find interesting and/or useful, is at least a bit of prediction. Maybe people will prefer to use property('get_foo', 'set_foo', 'del_foo'), who knows? But you know what? 
Regardless of what people want, they can use metaclasses right now to
create properties, where they would have to wait until Python 2.5 comes
out before they could use this proposed 'create' statement.

- Josiah

From ronaldoussoren at mac.com  Mon Oct 24 13:13:45 2005
From: ronaldoussoren at mac.com (Ronald Oussoren)
Date: Mon, 24 Oct 2005 13:13:45 +0200
Subject: [Python-Dev] Definining properties - a use case for class decorators?
In-Reply-To: <20051024020601.38DF.JCARLSON@uci.edu>
References: <20051024011400.38DA.JCARLSON@uci.edu>
	<4edc17eb0510240133v26cc056em47dee26b901460b9@mail.gmail.com>
	<20051024020601.38DF.JCARLSON@uci.edu>
Message-ID: <85DC3622-F572-4E00-92A0-3781B7AC7EB0@mac.com>

On 24-okt-2005, at 12:54, Josiah Carlson wrote:
>
> Metaclass abuse? Oh, I'm sorry, I thought that the point of metaclasses
> were to offer a way to make "magic" happen in a somewhat pragmatic
> manner, you know, through metaprogramming. I would call this particular
> use a practical application of standard Python semantics.

I'd say using a class statement to define a property is metaclass abuse,
as would anything that wouldn't define something class-like. The same is
true for other constructs; using a decorator to define something that is
not a callable would IMHO also be abuse.

That said, I don't really have an opinion on the 'create' statement
proposal yet. It does seem to have a very limited field of use. I'm quite
happy with using property as it is; property('get_foo', 'set_foo') would
take away most if not all of the remaining problems.

Ronald

From mal at egenix.com  Mon Oct 24 13:23:14 2005
From: mal at egenix.com (M.-A.
Lemburg) Date: Mon, 24 Oct 2005 13:23:14 +0200 Subject: [Python-Dev] KOI8_U (New codecs checked in) In-Reply-To: <435CB4BB.6070009@livinglogic.de> References: <435965E4.5050207@egenix.com> <435B95C0.9060005@v.loewis.de> <435CA2BA.7050900@livinglogic.de> <435CA887.3020900@egenix.com> <435CB4BB.6070009@livinglogic.de> Message-ID: <435CC422.8070600@egenix.com> Walter D?rwald wrote: >>> Why should koi_u.py be defined in terms of koi8_r.py anyway? Why not put >>> a complete decoding_table into koi8_u.py? >> >> >> KOI8-U is not available as mapping on ftp.unicode.org and >> I only recreated codecs from the mapping files available >> there. > > > OK, so we'd need something that creates a new decoding table from an old > one + changes, i.e. something like: > > def update_decoding_table(table, new): > table = list[table] > for (key, value) in new.iteritems(): > table[key] = unichr(value) > return u"".join(table) Actually, I'd rather have some official mapping files for these. Perhaps we could get someone to upload a mapping file for KOI8_U to the Unicode site ?! The mapping is defined in RFC2319: http://www.faqs.org/rfcs/rfc2319.html I've put Alexander Yeremenko, the coordinator of the KOI8-U group on CC. Thanks, -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Oct 24 2005) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From mal at egenix.com Mon Oct 24 13:37:05 2005 From: mal at egenix.com (M.-A. 
Lemburg)
Date: Mon, 24 Oct 2005 13:37:05 +0200
Subject: [Python-Dev] KOI8_U (New codecs checked in)
In-Reply-To: <435CC422.8070600@egenix.com>
References: <435965E4.5050207@egenix.com> <435B95C0.9060005@v.loewis.de>
	<435CA2BA.7050900@livinglogic.de> <435CA887.3020900@egenix.com>
	<435CB4BB.6070009@livinglogic.de> <435CC422.8070600@egenix.com>
Message-ID: <435CC761.8000006@egenix.com>

M.-A. Lemburg wrote:
> Walter Dörwald wrote:
>
>>>>Why should koi8_u.py be defined in terms of koi8_r.py anyway? Why not put
>>>>a complete decoding_table into koi8_u.py?
>>>
>>>KOI8-U is not available as mapping on ftp.unicode.org and
>>>I only recreated codecs from the mapping files available
>>>there.
>>
>>OK, so we'd need something that creates a new decoding table from an old
>>one + changes, i.e. something like:
>>
>>def update_decoding_table(table, new):
>>    table = list(table)
>>    for (key, value) in new.iteritems():
>>        table[key] = unichr(value)
>>    return u"".join(table)
>
> Actually, I'd rather have some official mapping files
> for these.
>
> Perhaps we could get someone to upload a mapping file
> for KOI8_U to the Unicode site ?!
>
> The mapping is defined in RFC2319:
>
>     http://www.faqs.org/rfcs/rfc2319.html
>
> I've put Alexander Yeremenko, the coordinator of
> the KOI8-U group on CC.

Hmm, that email address bounces. I've now put Maxim on CC:
Maxim Dzumanenko

Here's a mapping file for KOI8-U - please check whether
it's correct.

Thanks,
-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Oct 24 2005)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________

::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! ::::

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: KOI8-U.TXT
Url: http://mail.python.org/pipermail/python-dev/attachments/20051024/362a6093/KOI8-U-0001.asc

From michele.simionato at gmail.com  Mon Oct 24 13:41:18 2005
From: michele.simionato at gmail.com (Michele Simionato)
Date: Mon, 24 Oct 2005 11:41:18 +0000
Subject: [Python-Dev] Definining properties - a use case for class decorators?
In-Reply-To: <85DC3622-F572-4E00-92A0-3781B7AC7EB0@mac.com>
References: <20051024011400.38DA.JCARLSON@uci.edu>
	<4edc17eb0510240133v26cc056em47dee26b901460b9@mail.gmail.com>
	<20051024020601.38DF.JCARLSON@uci.edu>
	<85DC3622-F572-4E00-92A0-3781B7AC7EB0@mac.com>
Message-ID: <4edc17eb0510240441n7f02cfbcvdc8bdd171f0b4cf6@mail.gmail.com>

On 10/24/05, Ronald Oussoren wrote:
> I'd say using a class statement to define a property is metaclass
> abuse, as would anything that wouldn't define something class-like.
> The same is true for other constructs; using a decorator to define
> something that is not a callable would IMHO also be abuse.

+1

> That said, I don't really have an opinion on the 'create' statement
> proposal yet. It does seem to have a very limited field of use.

This is definitely not true. The 'create' statement would have lots of
applications. Off the top of my head I can think of 'create' applied to:

- bunches;
- modules;
- interfaces;
- properties;
- usage in frameworks, for instance providing sugar for Object-Relational
  mappers, or for making templates (i.e. a create HTMLPage);
- building custom minilanguages;
- ...

This is why I see a 'create' statement as a frighteningly powerful
addition to the language.

        Michele Simionato

From mal at egenix.com  Mon Oct 24 13:43:52 2005
From: mal at egenix.com (M.-A.
Lemburg)
Date: Mon, 24 Oct 2005 13:43:52 +0200
Subject: [Python-Dev] KOI8_U (New codecs checked in)
In-Reply-To: <435CC761.8000006@egenix.com>
References: <435965E4.5050207@egenix.com> <435B95C0.9060005@v.loewis.de>
	<435CA2BA.7050900@livinglogic.de> <435CA887.3020900@egenix.com>
	<435CB4BB.6070009@livinglogic.de> <435CC422.8070600@egenix.com>
	<435CC761.8000006@egenix.com>
Message-ID: <435CC8F8.1070102@egenix.com>

M.-A. Lemburg wrote:
> Here's a mapping file for KOI8-U - please check whether
> it's correct.

I missed one code point change: 0xB4. Here's the updated version which
matches the codec we currently have in Python 2.3 and 2.4.

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Oct 24 2005)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________

::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! ::::

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: KOI8-U.TXT
Url: http://mail.python.org/pipermail/python-dev/attachments/20051024/4b51ccf8/KOI8-U.pot

From ncoghlan at gmail.com  Mon Oct 24 13:53:00 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 24 Oct 2005 21:53:00 +1000
Subject: [Python-Dev] Proposed resolutions for open PEP 343 issues
In-Reply-To: References: <435A4598.3060403@iinet.net.au>
	<435B597C.6040300@gmail.com>
	<5.1.1.6.0.20051023124842.01af9078@mail.telecommunity.com>
Message-ID: <435CCB1C.4030108@gmail.com>

Guido van Rossum wrote:
> Right. That was my point. Nick's worried about undecorated __context__
> because he wants to endow generators with a different default
> __context__. I say no to both proposals and the worries cancel each
> other out. EIBTI.

Works for me. That makes the resolutions for the posted issues: 1.
The slot name "__context__" will be used instead of "__with__" 2. The builtin name "context" is currently offlimits due to its ambiguity 3a. generator-iterators do NOT have a native context 3b. Use "contextmanager" as a builtin decorator to get generator-contexts 4. The __context__ slot will NOT be special cased I'll add those into the PEP and reference this thread after Martin is done with the SVN migration. However, those resolutions bring up the following issues: 5 a. What exception is raised when EXPR does not have a __context__ method? b. What about when the returned object is missing __enter__ or __exit__? I suggest raising TypeError in both cases, for symmetry with for loops. The slot check is made in C code, so I don't see any difficulty in raising TypeError instead of AttributeError if the relevant slots aren't filled. 6 a. Should a generic "closing" context manager be provided? b. If yes, should it be a builtin or in a "contexttools" module? I'm not too worried about this one for the moment, and it could easily be left out of the PEP itself. Of the sample managers, it seems the most universally useful, though. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.blogspot.com From mal at egenix.com Mon Oct 24 14:09:10 2005 From: mal at egenix.com (M.-A. Lemburg) Date: Mon, 24 Oct 2005 14:09:10 +0200 Subject: [Python-Dev] New codecs checked in In-Reply-To: <435CB4BB.6070009@livinglogic.de> References: <435965E4.5050207@egenix.com> <435B95C0.9060005@v.loewis.de> <435CA2BA.7050900@livinglogic.de> <435CA887.3020900@egenix.com> <435CB4BB.6070009@livinglogic.de> Message-ID: <435CCEE6.6020005@egenix.com> Walter D?rwald wrote: >>>I'd like to suggest a small cosmetic change: gencodec.py should output >>>byte values with two hexdigits instead of four. This makes it easier to >>>see what is a byte values and what is a codepoint. 
And it would make >>>grepping for stuff simpler. >> >>True. >> >>I'll rerun the creation with the above changes sometime this >>week. > > > Great, thanks! Done. I had to create three custom mapping files for cp1140, koi8-u and tis-620. If you want more non-standard charmap codecs converted, please send me the mapping files in the Unicode standard format for these files. Thanks, -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Oct 24 2005) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From paolo_veronelli at libero.it Mon Oct 24 17:09:53 2005 From: paolo_veronelli at libero.it (Paolino) Date: Mon, 24 Oct 2005 17:09:53 +0200 Subject: [Python-Dev] PEP 351, the freeze protocol In-Reply-To: <1130107429.11268.40.camel@geddy.wooz.org> References: <1130107429.11268.40.camel@geddy.wooz.org> Message-ID: <435CF941.6070104@libero.it> I'm not sure I understood completely the idea but deriving freeze function from hash gives hash a wider importance. Is __hash__=id inside a class enough to use a set (sets.Set before 2.5) derived class instance as a key to a mapping? Sure I missed the point. Regards Paolino From guido at python.org Mon Oct 24 16:47:47 2005 From: guido at python.org (Guido van Rossum) Date: Mon, 24 Oct 2005 07:47:47 -0700 Subject: [Python-Dev] Proposed resolutions for open PEP 343 issues In-Reply-To: <435CCB1C.4030108@gmail.com> References: <435A4598.3060403@iinet.net.au> <435B597C.6040300@gmail.com> <5.1.1.6.0.20051023124842.01af9078@mail.telecommunity.com> <435CCB1C.4030108@gmail.com> Message-ID: On 10/24/05, Nick Coghlan wrote: > That makes the resolutions for the posted issues: > > 1. 
The slot name "__context__" will be used instead of "__with__" > 2. The builtin name "context" is currently offlimits due to its ambiguity > 3a. generator-iterators do NOT have a native context > 3b. Use "contextmanager" as a builtin decorator to get generator-contexts > 4. The __context__ slot will NOT be special cased +1 > I'll add those into the PEP and reference this thread after Martin is done > with the SVN migration. > > However, those resolutions bring up the following issues: > > 5 a. What exception is raised when EXPR does not have a __context__ method? > b. What about when the returned object is missing __enter__ or __exit__? > I suggest raising TypeError in both cases, for symmetry with for loops. > The slot check is made in C code, so I don't see any difficulty in raising > TypeError instead of AttributeError if the relevant slots aren't filled. Why are you so keen on TypeError? I find AttributeError totally appropriate. I don't see symmetry with for-loops as a valuable property here. AttributeError and TypeError are often interchangeable anyway. > 6 a. Should a generic "closing" context manager be provided? No. Let's provide the minimal mechanisms FIRST. > b. If yes, should it be a builtin or in a "contexttools" module? > I'm not too worried about this one for the moment, and it could easily be > left out of the PEP itself. Of the sample managers, it seems the most > universally useful, though. Let's leave some examples just be examples. I think I'm leaning towards adding __context__ to locks (all types defined in thread or threading, including condition variables), files, and decimal.Context, and leave it at that.
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From gary at modernsongs.com Mon Oct 24 17:04:52 2005 From: gary at modernsongs.com (Gary Poster) Date: Mon, 24 Oct 2005 11:04:52 -0400 Subject: [Python-Dev] PEP 351, the freeze protocol In-Reply-To: <1130107429.11268.40.camel@geddy.wooz.org> References: <1130107429.11268.40.camel@geddy.wooz.org> Message-ID: <7F3FCEC2-CD0D-4EAC-9C2B-77ECB4A0B73B@modernsongs.com> On Oct 23, 2005, at 6:43 PM, Barry Warsaw wrote: > I've had this PEP laying around for quite a few months. It was > inspired > by some code we'd written which wanted to be able to get immutable > versions of arbitrary objects. I've finally finished the PEP, > uploaded > a sample patch (albeit a bit incomplete), and I'm posting it here > to see > if there is any interest. > > http://www.python.org/peps/pep-0351.html I like this. I'd like it better if it integrated with the adapter PEP, so that the freezing mechanism for a given type could be pluggable, and could be provided even if the original object did not contemplate it. I don't know where the adapter PEP stands: skimming through the (most recent?) thread in January didn't give me a clear idea. As another poster mentioned, in-place freezing is also of interest to me (and why I read the PEP initially), but as also mentioned that's probably unrelated to your PEP. Gary From Alan.McIntyre at esrgtech.com Mon Oct 24 17:27:08 2005 From: Alan.McIntyre at esrgtech.com (Alan McIntyre) Date: Mon, 24 Oct 2005 11:27:08 -0400 Subject: [Python-Dev] int(string) In-Reply-To: References: <1f7befae0510211952x5eb2000bicdf3c1a80a3f5749@mail.gmail.com> <2mfyqu6j55.fsf@starship.python.net> <1f7befae0510221038u4bd75ca5l67dc930ae16d24b7@mail.gmail.com> <435C747E.9060005@esrgtech.com> Message-ID: <435CFD4C.1090200@esrgtech.com> Fredrik Lundh wrote: >does a plain > > a = -10000000000.0 > >still work on your machine? > D'oh - I seriously broke something, then, because it didn't.
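A side note on why such literals are tricky: the parser treats the minus sign as a separate unary operation applied to a positive literal, so a patch handling literal conversion has to cope with the positive value first. A sketch of the parse, using the modern ast module (which postdates this thread) purely for illustration:

```python
import ast

# Parse the exact statement from Fredrik's test.  The value on the right
# is a unary minus applied to a *positive* float constant: the literal
# itself is 10000000000.0, and the negation is a separate operation.
tree = ast.parse("a = -10000000000.0")
node = tree.body[0].value

assert isinstance(node, ast.UnaryOp)
assert isinstance(node.op, ast.USub)
assert node.operand.value == 10000000000.0
```

Any change to how bare literals are converted therefore has to keep the positive form working on its own, independent of the sign in front of it.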
funny_falcon commented on the patch in SF and suggested a change that took care of that. I've uploaded the corrected version of the patch, which now passes all the tests. Alan From raymond.hettinger at verizon.net Mon Oct 24 17:49:35 2005 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Mon, 24 Oct 2005 11:49:35 -0400 Subject: [Python-Dev] PEP 351, the freeze protocol In-Reply-To: <1130107429.11268.40.camel@geddy.wooz.org> Message-ID: <003201c5d8b2$88c64280$e135c797@oemcomputer> [Barry Warsaw] > I've had this PEP laying around for quite a few months. It was inspired > by some code we'd written which wanted to be able to get immutable > versions of arbitrary objects. * FWIW, the _as_immutable() protocol was dropped from sets.py for a reason. User reports indicated that it was never helpful in practice. It added complexity and confusion without producing offsetting benefits. * AFAICT, there are no use cases for freezing arbitrary objects when the object types are restricted to just lists and sets but not dicts, arrays, or other containers. Even if the range of supported types were expanded, what applications could use this? Most apps cannot support generic substitution of lists and sets -- they have too few methods in common -- they are almost never interchangeable. * I'm concerned that generic freezing leads to poor design and hard-to-find bugs. One class of bugs results from conflating ordered and unordered collections as lookup keys. It is difficult to assess program correctness when the ordered/unordered distinction has been abstracted away. A second class of errors can arise when the original object mutates and gets out-of-sync with its frozen counterpart. 
* For a rare app needing mutable lookup keys, a simple recipe would suffice:

    freeze_pairs = [(list, tuple), (set, frozenset)]

    def freeze(obj):
        try:
            hash(obj)
        except TypeError:
            for sourcetype, desttype in freeze_pairs:
                if isinstance(obj, sourcetype):
                    return desttype(obj)
            raise
        else:
            return obj

Unlike the PEP, the recipe works with older pythons and is trivially easy to extend to include other containers. * The name "freeze" is problematic because it suggests an in-place change. Instead, the proposed mechanism creates a new object. In contrast, explicit conversions like tuple(l) or frozenset(s) are obvious about their running time, space consumed, and new object identity. Overall, I'm -1 on the PEP. Like a bad C macro, the proposed abstraction hides too much. We lose critical distinctions of ordered vs unordered, mutable vs immutable, new objects vs in-place change, etc. Without compelling use cases, the mechanism smells like a hyper-generalization. Raymond From janssen at parc.com Mon Oct 24 19:06:27 2005 From: janssen at parc.com (Bill Janssen) Date: Mon, 24 Oct 2005 10:06:27 PDT Subject: [Python-Dev] Divorcing str and unicode (no more implicit conversions). In-Reply-To: Your message of "Sun, 23 Oct 2005 19:23:40 PDT." <5.1.1.6.0.20051023221433.02ab8200@mail.telecommunity.com> Message-ID: <05Oct24.100629pdt."58617"@synergy1.parc.xerox.com> > >I'm thinking about making all character strings Unicode (possibly with > >different internal representations a la NSString in Apple's Objective > >C) and introduce a separate mutable bytes array data type. But I could > >use some validation or feedback on this idea from actual > >practitioners. +1 from me, too. > I'm tempted to say it would be even better if there was a command line > option that could be used to force all binary opens to result in bytes, and > require all text opens to specify an encoding. I like this idea, too.
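The policy Bill endorses here (binary opens yield raw bytes, text opens must name an encoding) can be sketched with today's codecs module; the helper names are hypothetical, not a proposed API:

```python
import codecs

def open_binary(path):
    # No encoding involved: the caller gets raw bytes back.
    return open(path, "rb")

def open_text(path, encoding):
    # Refuse to guess: a text open must state its encoding explicitly.
    if encoding is None:
        raise TypeError("text opens must specify an encoding")
    return codecs.open(path, "r", encoding=encoding)
```

With helpers like these, a program can never accidentally decode a file with a default encoding; the choice is always visible at the call site.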
Presumably plain "open(FILENAME, MODE)" would then result in a binary open (no encoding specified), which I've wanted for a long time (and which makes sense). But it is a change. Bill From janssen at parc.com Mon Oct 24 19:07:40 2005 From: janssen at parc.com (Bill Janssen) Date: Mon, 24 Oct 2005 10:07:40 PDT Subject: [Python-Dev] Divorcing str and unicode (no more implicit conversions). In-Reply-To: Your message of "Sun, 23 Oct 2005 20:41:50 PDT." <50862ebd0510232041j19b6b3achd0aecb072f9db148@mail.gmail.com> Message-ID: <05Oct24.100744pdt."58617"@synergy1.parc.xerox.com> > Python should allow strings to > contain any Unicode character and should be indexable yielding > characters rather than half characters. Therefore Python strings > should appear to be UTF-32. +1. Bill From phil at riverbankcomputing.co.uk Mon Oct 24 19:18:41 2005 From: phil at riverbankcomputing.co.uk (Phil Thompson) Date: Mon, 24 Oct 2005 18:18:41 +0100 Subject: [Python-Dev] Inconsistent Use of Buffer Interface in stringobject.c Message-ID: <200510241818.41145.phil@riverbankcomputing.co.uk> I'm implementing a string-like object in an extension module and trying to make it as interoperable with the standard string object as possible. To do this I'm implementing the relevant slots and the buffer interface. For most things this is fine, but there are a small number of methods in stringobject.c that don't use the buffer interface - and I don't understand why. Specifically... string_contains() doesn't which means that... MyString("foo") in "foobar" ...doesn't work. s.join(sequence) only allows sequence to contain string or unicode objects. s.strip([chars]) only allows chars to be a string or unicode object. Same for lstrip() and rstrip(). s.ljust(width[, fillchar]) only allows fillchar to be a string object (not even a unicode object). Same for rjust() and center(). Other methods happily allow types that support the buffer interface as well as string and unicode objects. 
I'm happy to submit a patch - I just wanted to make sure that this behaviour wasn't intentional for some reason. Thanks, Phil From guido at python.org Mon Oct 24 20:06:43 2005 From: guido at python.org (Guido van Rossum) Date: Mon, 24 Oct 2005 11:06:43 -0700 Subject: [Python-Dev] Inconsistent Use of Buffer Interface in stringobject.c In-Reply-To: <200510241818.41145.phil@riverbankcomputing.co.uk> References: <200510241818.41145.phil@riverbankcomputing.co.uk> Message-ID: On 10/24/05, Phil Thompson wrote: > I'm implementing a string-like object in an extension module and trying to > make it as interoperable with the standard string object as possible. To do > this I'm implementing the relevant slots and the buffer interface. For most > things this is fine, but there are a small number of methods in > stringobject.c that don't use the buffer interface - and I don't understand > why. > > Specifically... > > string_contains() doesn't which means that... > > MyString("foo") in "foobar" > > ...doesn't work. > > s.join(sequence) only allows sequence to contain string or unicode objects. > > s.strip([chars]) only allows chars to be a string or unicode object. Same for > lstrip() and rstrip(). > > s.ljust(width[, fillchar]) only allows fillchar to be a string object (not > even a unicode object). Same for rjust() and center(). > > Other methods happily allow types that support the buffer interface as well as > string and unicode objects. > > I'm happy to submit a patch - I just wanted to make sure that this behaviour > wasn't intentional for some reason. A concern I'd have with fixing this is that Unicode objects also support the buffer API. In any situation where either str or unicode is accepted I'd be reluctant to guess whether a buffer object was meant to be str-like or Unicode-like. I think this covers all the cases you mention here. 
We need to support this better in Python 3000; but I'm not sure you can do much better in Python 2.x; subclassing from str is unlikely to work for you because then too many places are going to assume the internal representation is also the same as for str. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From fredrik at pythonware.com Mon Oct 24 20:19:05 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Mon, 24 Oct 2005 20:19:05 +0200 Subject: [Python-Dev] Inconsistent Use of Buffer Interface in stringobject.c References: <200510241818.41145.phil@riverbankcomputing.co.uk> Message-ID: Guido van Rossum wrote: > A concern I'd have with fixing this is that Unicode objects also > support the buffer API. In any situation where either str or unicode > is accepted I'd be reluctant to guess whether a buffer object was > meant to be str-like or Unicode-like. I think this covers all the > cases you mention here. iirc, SRE solves that by comparing the length of the sequence with the number of bytes in the buffer. if length == bytes, it's an 8-bit string; if length*sizeof(Py_UNICODE) == bytes, it's a Unicode string. From mal at egenix.com Mon Oct 24 20:32:22 2005 From: mal at egenix.com (M.-A. Lemburg) Date: Mon, 24 Oct 2005 20:32:22 +0200 Subject: [Python-Dev] Inconsistent Use of Buffer Interface in stringobject.c In-Reply-To: References: <200510241818.41145.phil@riverbankcomputing.co.uk> Message-ID: <435D28B6.9010806@egenix.com> Guido van Rossum wrote: > On 10/24/05, Phil Thompson wrote: > >>I'm implementing a string-like object in an extension module and trying to >>make it as interoperable with the standard string object as possible. To do >>this I'm implementing the relevant slots and the buffer interface. For most >>things this is fine, but there are a small number of methods in >>stringobject.c that don't use the buffer interface - and I don't understand >>why. >> >>Specifically... >> >>string_contains() doesn't which means that...
>> >> MyString("foo") in "foobar" >> >>...doesn't work. >> >>s.join(sequence) only allows sequence to contain string or unicode objects. >> >>s.strip([chars]) only allows chars to be a string or unicode object. Same for >>lstrip() and rstrip(). >> >>s.ljust(width[, fillchar]) only allows fillchar to be a string object (not >>even a unicode object). Same for rjust() and center(). >> >>Other methods happily allow types that support the buffer interface as well as >>string and unicode objects. >> >>I'm happy to submit a patch - I just wanted to make sure that this behaviour >>wasn't intentional for some reason. > > > A concern I'd have with fixing this is that Unicode objects also > support the buffer API. In any situation where either str or unicode > is accepted I'd be reluctant to guess whether a buffer object was > meant to be str-like or Unicode-like. I think this covers all the > cases you mention here. This situation is a little better than that: the buffer interface has a slot called getcharbuffer which is what the string methods use in case they find that a string argument is not of type str or unicode. A few don't, but I guess we could fix this. str.split(), .[lr]strip() all support the getcharbuffer interface. str.join() currently doesn't. The Unicode object also leaves out a few cases, among those the ones you mentioned. If it's better for inter-op, I guess we should make an effort and let all of them support the getcharbuffer interface. > We need to support this better in Python 3000; but I'm not sure you > can do much better in Python 2.x; subclassing from str is unlikely to > work for you because then too many places are going to assume the > internal representation is also the same as for str. As first step, I'd suggest to implement the getcharbuffer slot. That will already go a long way. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Oct 24 2005) >>> Python/Zope Consulting and Support ...
http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From guido at python.org Mon Oct 24 20:39:21 2005 From: guido at python.org (Guido van Rossum) Date: Mon, 24 Oct 2005 11:39:21 -0700 Subject: [Python-Dev] Inconsistent Use of Buffer Interface in stringobject.c In-Reply-To: <435D28B6.9010806@egenix.com> References: <200510241818.41145.phil@riverbankcomputing.co.uk> <435D28B6.9010806@egenix.com> Message-ID: On 10/24/05, M.-A. Lemburg wrote: > Guido van Rossum wrote: > > A concern I'd have with fixing this is that Unicode objects also > > support the buffer API. In any situation where either str or unicode > > is accepted I'd be reluctant to guess whether a buffer object was > > meant to be str-like or Unicode-like. I think this covers all the > > cases you mention here. > > This situation is a little better than that: the buffer > interface has a slot called getcharbuffer which is what > the string methods use in case they find that a string > argument is not of type str or unicode. I stand corrected! > As first step, I'd suggest to implement the getcharbuffer > slot. That will already go a long way. Phil, if anything still doesn't work after doing what Marc-Andre says, those would be good candidates for fixes!
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From phil at riverbankcomputing.co.uk Mon Oct 24 20:56:26 2005 From: phil at riverbankcomputing.co.uk (Phil Thompson) Date: Mon, 24 Oct 2005 19:56:26 +0100 Subject: [Python-Dev] Inconsistent Use of Buffer Interface in stringobject.c In-Reply-To: References: <200510241818.41145.phil@riverbankcomputing.co.uk> <435D28B6.9010806@egenix.com> Message-ID: <200510241956.26257.phil@riverbankcomputing.co.uk> On Monday 24 October 2005 7:39 pm, Guido van Rossum wrote: > On 10/24/05, M.-A. Lemburg wrote: > > Guido van Rossum wrote: > > > A concern I'd have with fixing this is that Unicode objects also > > > support the buffer API. In any situation where either str or unicode > > > is accepted I'd be reluctant to guess whether a buffer object was > > > meant to be str-like or Unicode-like. I think this covers all the > > > cases you mention here. > > > > This situation is a little better than that: the buffer > > interface has a slot called getcharbuffer which is what > > the string methods use in case they find that a string > > argument is not of type str or unicode. > > I stand corrected! > > > As first step, I'd suggest to implement the gatcharbuffer > > slot. That will already go a long way. > > Phil, if anything still doesn't work after doing what Marc-Andre says, > those would be good candidates for fixes! I have implemented getcharbuffer - I was highlighting those methods where the getcharbuffer implementation was ignored. I'll put a patch together. Phil From martin at v.loewis.de Mon Oct 24 22:44:38 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 24 Oct 2005 22:44:38 +0200 Subject: [Python-Dev] Divorcing str and unicode (no more implicit conversions). 
In-Reply-To: <50862ebd0510240024o2f339a0bwbe25bee0639dd229@mail.gmail.com> References: <5.0.2.1.1.20051022120117.02b58da0@mail.oz.net> <50862ebd0510232041j19b6b3achd0aecb072f9db148@mail.gmail.com> <435C7EFE.6060504@v.loewis.de> <50862ebd0510240024o2f339a0bwbe25bee0639dd229@mail.gmail.com> Message-ID: <435D47B6.4010703@v.loewis.de> Neil Hodgson wrote: > For Windows, the code will get a little uglier, needing to perform > an allocation/encoding and deallocation more often than at present but > I don't think there will be a speed degradation as Windows is > currently performing a conversion from 8 bit to UTF-16 inside many > system calls. [...] > > For indexing UTF-16, a flag could be set to show if the string is > all in the base plane and if not, an index could be constructed when > and if needed. There are many design alternatives: one option would be to support *three* internal representations in a single type, generating the others from the one representation existing as needed. The default, initial representation might be UTF-8, with UCS-4 only being generated when indexing occurs, and UCS-2 only being generated when the API requires it. On concatenation, always concatenate just one representation: either one that is already present in both operands, else UTF-8. > It'd be good to get some feel for what proportion of > string operations performed require indexing. Many, such as > startswith, split, and concatenation don't require indexing. The > proportion of operations that use indexing to scan strings would also > be interesting as adding a (currentIndex, currentOffset) cursor to > string objects would be another approach. Indeed. My guess is that indexing is more common than you think, especially when iterating over the string. Of course, iteration could also operate on UTF-8, if you introduced string iterator objects.
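Martin's three-representation idea can be caricatured in a few lines of Python. This is a toy sketch only: the class name and details are invented here, and the real design question concerns C-level storage.

```python
class LazyText:
    """Toy string type: canonical UTF-8, other forms built on demand."""

    def __init__(self, utf8):
        self._utf8 = utf8      # canonical representation, always present
        self._chars = None     # stand-in for a UCS-4 array, built lazily

    def __bytes__(self):
        return self._utf8      # cheap: the UTF-8 form is already stored

    def _materialize(self):
        # The first indexed access pays the O(n) decoding cost, once.
        if self._chars is None:
            self._chars = self._utf8.decode("utf-8")
        return self._chars

    def __getitem__(self, i):
        return self._materialize()[i]

    def __add__(self, other):
        # Concatenate just one representation (here: always UTF-8).
        return LazyText(self._utf8 + other._utf8)
```

Code that never indexes never pays for the second representation, which is the point of the lazy scheme; the cost shows up only on the operations that genuinely need character positions.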
Regards, Martin From martin at v.loewis.de Mon Oct 24 22:49:05 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 24 Oct 2005 22:49:05 +0200 Subject: [Python-Dev] New codecs checked in In-Reply-To: <435CA2BA.7050900@livinglogic.de> References: <435965E4.5050207@egenix.com> <435B95C0.9060005@v.loewis.de> <435CA2BA.7050900@livinglogic.de> Message-ID: <435D48C1.9080003@v.loewis.de> Walter Dürwald wrote: > Why should koi_u.py be defined in terms of koi8_r.py anyway? Why not put > a complete decoding_table into koi8_u.py? Not sure. Unfortunately, the tables being used as source are not part of the Python source, so nobody except MAL can faithfully regenerate them. If they were part of the Python source, explicitly adding one for KOI8-U would certainly be feasible. > I.e. change: > > decoding_map.update({ > 0x0080: 0x0402, # CYRILLIC CAPITAL LETTER DJE Hmm. I was suggesting to remove decoding_map completely, in which case neither the current form nor your suggested cosmetic change would survive. > to > > decoding_table = ( > u'\x00' # 0x00 -> U+0000 NULL Using U+XXXX in comments to denote the codepoints is a good idea, anyway. Regards, Martin From martin at v.loewis.de Mon Oct 24 22:55:07 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 24 Oct 2005 22:55:07 +0200 Subject: [Python-Dev] New codecs checked in In-Reply-To: <435CA887.3020900@egenix.com> References: <435965E4.5050207@egenix.com> <435B95C0.9060005@v.loewis.de> <435CA2BA.7050900@livinglogic.de> <435CA887.3020900@egenix.com> Message-ID: <435D4A2B.1090505@v.loewis.de> M.-A. Lemburg wrote: > I just left them in because I thought they wouldn't do any harm > and might be useful in some applications. > > Removing them where not directly needed by the codec would not > be a problem. I think memory usage caused is measurable (I estimated 4KiB per dictionary).
More importantly, people apparently currently change the dictionaries we provide and expect the codecs to automatically pick up the modified mappings. It would be better if the breakage is explicit (i.e. they get an AttributeError on the variable) instead of implicit (their changes to the mapping simply have no effect anymore). > KOI8-U is not available as mapping on ftp.unicode.org and > I only recreated codecs from the mapping files available > there. I think we should come up with mapping tables for the additional codecs as well, and maintain them in the CVS. This also applies to things like rot13. > I'll rerun the creation with the above changes sometime this > week. I hope I can finish my encoding routine shortly, which again results in changes to the codecs (replacing the encoding dictionaries with other lookup tables). Regards, Martin From martin at v.loewis.de Mon Oct 24 22:59:00 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 24 Oct 2005 22:59:00 +0200 Subject: [Python-Dev] New codecs checked in In-Reply-To: <435CCEE6.6020005@egenix.com> References: <435965E4.5050207@egenix.com> <435B95C0.9060005@v.loewis.de> <435CA2BA.7050900@livinglogic.de> <435CA887.3020900@egenix.com> <435CB4BB.6070009@livinglogic.de> <435CCEE6.6020005@egenix.com> Message-ID: <435D4B14.7060008@v.loewis.de> M.-A. Lemburg wrote: > I had to create three custom mapping files for cp1140, koi8-u > and tis-620. Can you please publish the files you have used somewhere? They best go into the Python CVS. Regards, Martin From martin at v.loewis.de Mon Oct 24 23:06:38 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 24 Oct 2005 23:06:38 +0200 Subject: [Python-Dev] Divorcing str and unicode (no more implicit conversions). 
In-Reply-To: <435C9DFC.8020501@egenix.com> References: <5.0.2.1.1.20051022120117.02b58da0@mail.oz.net> <50862ebd0510232041j19b6b3achd0aecb072f9db148@mail.gmail.com> <435C9DFC.8020501@egenix.com> Message-ID: <435D4CDE.8020907@v.loewis.de> M.-A. Lemburg wrote: > There seems to be a general misunderstanding here: even if you > have UCS4 storage, it is still possible to slice a Unicode > string in a way which makes rendering it correctly. [impossible?] > Unicode has the concept of combining code points, e.g. you can > store an "é" (e with an accent) as "e" + "'". Now if you slice > off the accent, you'll break the character that you encoded > using combining code points. While this is all true, I agree with Neil that it should do whatever it does consistently across implementations, i.e. len("\U00010000") should always give the same result, and I think this result should always be 1. How to best implement this efficiently is an entirely different question, as is the question whether you can render arbitrary substrings in a meaningful way. Regards, Martin From solipsis at pitrou.net Mon Oct 24 23:22:23 2005 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 24 Oct 2005 23:22:23 +0200 Subject: [Python-Dev] Divorcing str and unicode (no more implicit conversions). In-Reply-To: <435D47B6.4010703@v.loewis.de> References: <5.0.2.1.1.20051022120117.02b58da0@mail.oz.net> <50862ebd0510232041j19b6b3achd0aecb072f9db148@mail.gmail.com> <435C7EFE.6060504@v.loewis.de> <50862ebd0510240024o2f339a0bwbe25bee0639dd229@mail.gmail.com> <435D47B6.4010703@v.loewis.de> Message-ID: <1130188943.8137.13.camel@fsol> > There are many design alternatives: one option would be to support > *three* internal representations in a single type, generating the > others from the one representation existing as needed. The default, initial > representation might be UTF-8, with UCS-4 only being generated when > indexing occurs, and UCS-2 only being generated when the API requires > it.
On concatenation, always concatenate just one representation: either > one that is already present in both operands, else UTF-8. Wouldn't it be simpler to use: - one-byte representation if every character <= 0xFF - two-byte representation if every character <= 0xFFFF - four-byte representation otherwise Then combining several strings means using the larger representation as a result (*). In practice, most use cases will not involve the four-byte representation. (*) a heuristic can be invented so that, when producing a smaller string (by stripping/slicing/etc.), it will "sometimes" check whether a narrower representation is possible. For example: store the length of the string when the last check occurred, and do a new check when the length falls below half that value. Regards Antoine. From guido at python.org Mon Oct 24 23:31:18 2005 From: guido at python.org (Guido van Rossum) Date: Mon, 24 Oct 2005 14:31:18 -0700 Subject: [Python-Dev] Divorcing str and unicode (no more implicit conversions). In-Reply-To: <435D47B6.4010703@v.loewis.de> References: <5.0.2.1.1.20051022120117.02b58da0@mail.oz.net> <50862ebd0510232041j19b6b3achd0aecb072f9db148@mail.gmail.com> <435C7EFE.6060504@v.loewis.de> <50862ebd0510240024o2f339a0bwbe25bee0639dd229@mail.gmail.com> <435D47B6.4010703@v.loewis.de> Message-ID: On 10/24/05, "Martin v. Löwis" wrote: > Indeed. My guess is that indexing is more common than you think, > especially when iterating over the string. Of course, iteration > could also operate on UTF-8, if you introduced string iterator > objects. Python's slice-and-dice model pretty much ensures that indexing is common. Almost everything is ultimately represented as indices: regex search results have the index in the API, find()/index() return indices, many operations take a start and/or end index. As long as that's the case, indexing better be fast. Changing the APIs would be much work, although perhaps not impossible for Python 3000.
For example, Raymond Hettinger's partition() API doesn't refer to indices at all, and can replace many uses of find() or index(). Still, the mere existence of __getitem__ and __getslice__ on strings makes it necessary to implement them efficiently. How realistic would it be to drop them? What should replace them? Some kind of abstract pointers-into-strings perhaps, but that seems much more complex. The trick seems to be to support both simple programs manipulating short strings (where indexing is probably the easiest API to understand, and the additional copying is unlikely to cause performance problems), as well as programs manipulating very large buffers containing text and doing sophisticated string processing on them. Perhaps we could provide a different kind of API to support the latter, perhaps based on a mutable character buffer data type without direct indexing? -- --Guido van Rossum (home page: http://www.python.org/~guido/) From martin at v.loewis.de Tue Oct 25 00:21:06 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 25 Oct 2005 00:21:06 +0200 Subject: [Python-Dev] Divorcing str and unicode (no more implicit conversions). In-Reply-To: References: <5.0.2.1.1.20051022120117.02b58da0@mail.oz.net> <50862ebd0510232041j19b6b3achd0aecb072f9db148@mail.gmail.com> <435C7EFE.6060504@v.loewis.de> <50862ebd0510240024o2f339a0bwbe25bee0639dd229@mail.gmail.com> <435D47B6.4010703@v.loewis.de> Message-ID: <435D5E52.8080003@v.loewis.de> Guido van Rossum wrote: > Changing the APIs would be much work, although perhaps not impossible > for Python 3000. For example, Raymond Hettinger's partition() API > doesn't refer to indices at all, and can replace many uses of find() > or index(). I think Neil's proposal is not to make them go away, but to implement them less efficiently. For example, if the internal representation is UTF-8, indexing requires linear time, as opposed to constant time.
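The linear cost Martin describes is easy to see in a sketch of UTF-8 indexing (illustrative Python, not a proposed implementation): finding character i means walking the bytes from the start, since each character occupies one to four bytes.

```python
def utf8_index(data, i):
    """Return character i of UTF-8 encoded `data` -- in O(n), not O(1)."""
    count = -1
    for pos, b in enumerate(data):
        if b & 0xC0 != 0x80:            # lead byte: a new character starts here
            count += 1
            if count == i:
                end = pos + 1           # extend over any continuation bytes
                while end < len(data) and data[end] & 0xC0 == 0x80:
                    end += 1
                return data[pos:end].decode("utf-8")
    raise IndexError(i)
```

A fixed-width encoding replaces this scan with a single multiplication, which is why the storage-width question and the indexing-cost question are really the same question.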
If the internal representation is UTF-16, and you have a flag to indicate whether there are any surrogates on the string, indexing is constant if the flag is false, else linear. > Perhaps we could provide a different kind of API to support the > latter, perhaps based on a mutable character buffer data type without > direct indexing? There are different design goals conflicting here: - some think: "all my data is ASCII, so I want to only use one byte per character". - others think: "all my data goes to the Windows API, so I want to use 2 bytes per character". - yet others think: "I want all of Unicode, with proper, efficient indexing, so I want four bytes per char". It's not so much a matter of API as a matter of internal representation. The API doesn't have to change (except for the very low-level C API that directly exposes Py_UNICODE*, perhaps). Regards, Martin From guido at python.org Tue Oct 25 00:47:22 2005 From: guido at python.org (Guido van Rossum) Date: Mon, 24 Oct 2005 15:47:22 -0700 Subject: [Python-Dev] Divorcing str and unicode (no more implicit conversions). In-Reply-To: <435D5E52.8080003@v.loewis.de> References: <5.0.2.1.1.20051022120117.02b58da0@mail.oz.net> <50862ebd0510232041j19b6b3achd0aecb072f9db148@mail.gmail.com> <435C7EFE.6060504@v.loewis.de> <50862ebd0510240024o2f339a0bwbe25bee0639dd229@mail.gmail.com> <435D47B6.4010703@v.loewis.de> <435D5E52.8080003@v.loewis.de> Message-ID: On 10/24/05, "Martin v. Löwis" wrote: > Guido van Rossum wrote: > > Changing the APIs would be much work, although perhaps not impossible > > for Python 3000. For example, Raymond Hettinger's partition() API > > doesn't refer to indices at all, and can replace many uses of find() > > or index(). > > I think Neil's proposal is not to make them go away, but to implement > them less efficiently. For example, if the internal representation > is UTF-8, indexing requires linear time, as opposed to constant time.
> If the internal representation is UTF-16, and you have a flag to > indicate whether there are any surrogates on the string, indexing > is constant if the flag is false, else linear. I understand all that. My point is that it's a bad idea to offer an indexing operation that isn't O(1). > > Perhaps we could provide a different kind of API to support the > > latter, perhaps based on a mutable character buffer data type without > > direct indexing? > > There are different design goals conflicting here: > - some think: "all my data is ASCII, so I want to only use one > byte per character". > - others think: "all my data goes to the Windows API, so I want > to use 2 bytes per character". > - yet others think: "I want all of Unicode, with proper, efficient > indexing, so I want four bytes per char". I doubt the last one though. Probably they really don't want efficient indexing, they want to perform higher-level operations that currently are only possible using efficient indexing or slicing. With the right API, perhaps they could work just as efficiently with an internal representation of UTF-8. > It's not so much a matter of API as a matter of internal > representation. The API doesn't have to change (except for the > very low-level C API that directly exposes Py_UNICODE*, perhaps). I think the API should reflect the representation *to some extent*, namely it shouldn't claim to have operations that are typically thought of as O(1) that can only be implemented as O(n). An internal representation of UTF-8 might make everyone happy except heavy Windows users; but it requires changes to the API so people won't be writing Python 2.x-style string slinging code. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From martin at v.loewis.de Tue Oct 25 00:59:27 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 25 Oct 2005 00:59:27 +0200 Subject: [Python-Dev] Divorcing str and unicode (no more implicit conversions).
In-Reply-To: <1130188943.8137.13.camel@fsol> References: <5.0.2.1.1.20051022120117.02b58da0@mail.oz.net> <50862ebd0510232041j19b6b3achd0aecb072f9db148@mail.gmail.com> <435C7EFE.6060504@v.loewis.de> <50862ebd0510240024o2f339a0bwbe25bee0639dd229@mail.gmail.com> <435D47B6.4010703@v.loewis.de> <1130188943.8137.13.camel@fsol> Message-ID: <435D674F.3060209@v.loewis.de> Antoine Pitrou wrote: >>There are many design alternatives: > > Wouldn't it be simpler to use: > - one-byte representation if every character <= 0xFF > - two-byte representation if every character <= 0xFFFF > - four-byte representation otherwise As I said: there are many alternatives. This one has the disadvantage of requiring a copy every time you pass the string to a Win32 function (which expects UTF-16). Whether or not this is a significant disadvantage, I don't know. In any case, a multi-representation implementation has the disadvantage of making the C API more difficult to use, in particular for writing codecs. On encoding, it is difficult to fetch the individual characters which you need for the lookup table; on decoding, it is difficult to know in advance what representation to use (unless you know there is an upper bound on the decoded character ordinals). Regards, Martin From nyamatongwe at gmail.com Tue Oct 25 01:13:51 2005 From: nyamatongwe at gmail.com (Neil Hodgson) Date: Tue, 25 Oct 2005 09:13:51 +1000 Subject: [Python-Dev] Divorcing str and unicode (no more implicit conversions). In-Reply-To: <435C9DFC.8020501@egenix.com> References: <5.0.2.1.1.20051022120117.02b58da0@mail.oz.net> <50862ebd0510232041j19b6b3achd0aecb072f9db148@mail.gmail.com> <435C9DFC.8020501@egenix.com> Message-ID: <50862ebd0510241613w2b5da91cqbcaf4f6157ae338e@mail.gmail.com> M.-A. Lemburg: > Unicode has the concept of combining code points, e.g. you can > store an "é" (e with an accent) as "e" + "'". Now if you slice > off the accent, you'll break the character that you encoded > using combining code points. > ...
> next_(u, index) -> integer > > Returns the Unicode object index for the start of the next element > found after u[index] or -1 in case no next element > of this type exists. Should entity breakage be further discouraged by returning a slice here rather than an object index? Something like: i = first_grapheme(u) x = 0 while x < width and u[i] != "\n": x, _ = draw(u[i], (x, y)) i = next_grapheme(u, i) Neil From janssen at parc.com Tue Oct 25 01:50:58 2005 From: janssen at parc.com (Bill Janssen) Date: Mon, 24 Oct 2005 16:50:58 PDT Subject: [Python-Dev] Divorcing str and unicode (no more implicit conversions). In-Reply-To: Your message of "Mon, 24 Oct 2005 15:47:22 PDT." Message-ID: <05Oct24.165105pdt."58617"@synergy1.parc.xerox.com> > > - yet others think: "I want all of Unicode, with proper, efficient > > indexing, so I want four bytes per char". > > I doubt the last one though. Probably they really don't want efficient > indexing, they want to perform higher-level operations that currently > are only possible using efficient indexing or slicing. With the right > API, perhaps they could work just as efficiently with an internal > representation of UTF-8. I just got mail this morning from a researcher who wants exactly what Martin described, and wondered why the default MacPython 2.4.2 didn't provide it by default. :-) Bill From guido at python.org Tue Oct 25 02:04:35 2005 From: guido at python.org (Guido van Rossum) Date: Mon, 24 Oct 2005 17:04:35 -0700 Subject: [Python-Dev] Divorcing str and unicode (no more implicit conversions). In-Reply-To: <3481071664344267578@unknownmsgid> References: <3481071664344267578@unknownmsgid> Message-ID: On 10/24/05, Bill Janssen wrote: > > > - yet others think: "I want all of Unicode, with proper, efficient > > > indexing, so I want four bytes per char". > > > > I doubt the last one though.
Probably they really don't want efficient > > indexing, they want to perform higher-level operations that currently > > are only possible using efficient indexing or slicing. With the right > > API, perhaps they could work just as efficiently with an internal > > representation of UTF-8. > > I just got mail this morning from a researcher who wants exactly what > Martin described, and wondered why the default MacPython 2.4.2 didn't > provide it by default. :-) Oh, I don't doubt that they want it. But often they don't *need* it, and the higher-level goal they are trying to accomplish can be dealt with better in a different way. (Sort of my response to people asking for static typing in Python as well. :-) Did they tell you what they were trying to do that MacPython 2.4.2 wouldn't let them, beyond "represent a large Unicode string as an array of 4-byte integers"? -- --Guido van Rossum (home page: http://www.python.org/~guido/) From greg.ewing at canterbury.ac.nz Tue Oct 25 02:39:58 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 25 Oct 2005 13:39:58 +1300 Subject: [Python-Dev] Divorcing str and unicode (no more implicit conversions). In-Reply-To: References: <5.0.2.1.1.20051022120117.02b58da0@mail.oz.net> <50862ebd0510232041j19b6b3achd0aecb072f9db148@mail.gmail.com> <435C7EFE.6060504@v.loewis.de> <50862ebd0510240024o2f339a0bwbe25bee0639dd229@mail.gmail.com> <435D47B6.4010703@v.loewis.de> <435D5E52.8080003@v.loewis.de> Message-ID: <435D7EDE.7040307@canterbury.ac.nz> Guido van Rossum wrote: > I think the API should reflect the representation *to some extent*, > namely it shouldn't claim to have operations that are typically > thought of as O(1) that can only be implemented as O(n). Maybe a compromise could be reached by using a btree of chunks or something, so indexing is O(log n). Not as good as O(1) but a lot better than O(n).
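A minimal sketch of the chunk-based indexing idea, in current Python. The ChunkedText class, its fixed chunk size, and the flat list-plus-binary-search layout are all invented for illustration; a real implementation would use a balanced tree of chunks so that edits stay cheap, but the lookup cost is the same O(log n) in the number of chunks:

```python
from bisect import bisect_right

class ChunkedText:
    """Store text as a sequence of chunks; index via binary search.

    A flat stand-in for the btree-of-chunks idea: finding the chunk
    that holds position i costs O(log n) instead of O(1).
    """
    def __init__(self, text, chunk_size=4):
        self.chunks = [text[i:i + chunk_size]
                       for i in range(0, len(text), chunk_size)]
        # cumulative end offset of each chunk, searched with bisect
        self.offsets, total = [], 0
        for chunk in self.chunks:
            total += len(chunk)
            self.offsets.append(total)

    def __getitem__(self, index):
        chunk_no = bisect_right(self.offsets, index)
        start = self.offsets[chunk_no] - len(self.chunks[chunk_no])
        return self.chunks[chunk_no][index - start]

t = ChunkedText("hello world", chunk_size=4)
assert t[0] == "h" and t[6] == "w"
```

With variable-width encodings such as UTF-8 the chunks would hold whole code points and the offsets would count characters rather than bytes, so the per-chunk scan stays bounded by the chunk size.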
-- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg.ewing at canterbury.ac.nz +--------------------------------------+ From greg.ewing at canterbury.ac.nz Tue Oct 25 02:40:02 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 25 Oct 2005 13:40:02 +1300 Subject: [Python-Dev] Divorcing str and unicode (no more implicit conversions). In-Reply-To: References: <5.0.2.1.1.20051022120117.02b58da0@mail.oz.net> <50862ebd0510232041j19b6b3achd0aecb072f9db148@mail.gmail.com> <435C7EFE.6060504@v.loewis.de> <50862ebd0510240024o2f339a0bwbe25bee0639dd229@mail.gmail.com> <435D47B6.4010703@v.loewis.de> Message-ID: <435D7EE2.6080907@canterbury.ac.nz> Guido van Rossum wrote: > Python's slice-and-dice model pretty much ensures that indexing is > common. Almost everything is ultimately represented as indices: regex > search results have the index in the API, find()/index() return > indices, many operations take a start and/or end index. Maybe the idea of string views should be reconsidered in light of this. It's been criticised on the grounds that its use could keep large strings alive longer than needed, but if operations that currently return indices instead returned string views, this wouldn't be any more of a concern than it is now, especially if there is an easy way to explicitly materialise the view as an independent string when wanted. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. 
| greg.ewing at canterbury.ac.nz +--------------------------------------+ From janssen at parc.com Tue Oct 25 04:39:42 2005 From: janssen at parc.com (Bill Janssen) Date: Mon, 24 Oct 2005 19:39:42 PDT Subject: [Python-Dev] Divorcing str and unicode (no more implicit conversions). In-Reply-To: Your message of "Mon, 24 Oct 2005 17:04:35 PDT." Message-ID: <05Oct24.193946pdt."58617"@synergy1.parc.xerox.com> Guido writes: > Oh, I don't doubt that they want it. But often they don't *need* it, > and the higher-level goal they are trying to accomplish can be dealt > with better in a different way. (Sort of my response to people asking > for static typing in Python as well. :-) I suppose that's true. But what if they're not smart enough to figure out that better, different, way? I doubt you intend Python to be sort of the Rubik's cube of programming... And no, he didn't say why he wanted the ability to "represent a Unicode string as an array of 4-byte integers". Though I know he's doing something with the Deseret Alphabet, translating some early work on American Indian culture that was transcribed in that character set. Bill From simon at arrowtheory.com Tue Oct 25 05:36:26 2005 From: simon at arrowtheory.com (Simon Burton) Date: Tue, 25 Oct 2005 13:36:26 +1000 Subject: [Python-Dev] AST branch is in? In-Reply-To: References: <200510211202.12015.anthony@interlink.com.au> Message-ID: <20051025133626.286990fd.simon@arrowtheory.com> On Fri, 21 Oct 2005 18:32:22 +0000 (UTC) nas at arctrix.com (Neil Schemenauer) wrote: > > Does it just allow us to do new and interesting manipulations of > > the code during compilation? > > Well, that's a pretty big deal, IMHO. For example, adding > pychecker-like functionality should be straight forward now. I also > hope some of the namespace optimizations get explored (e.g. PEP > 267). Is there a python interface ? Simon. -- Simon Burton, B.Sc. Licensed PO Box 8066 ANU Canberra 2601 Australia Ph. 
61 02 6249 6940 http://arrowtheory.com From mal at egenix.com Tue Oct 25 10:38:14 2005 From: mal at egenix.com (M.-A. Lemburg) Date: Tue, 25 Oct 2005 10:38:14 +0200 Subject: [Python-Dev] Divorcing str and unicode (no more implicit conversions). In-Reply-To: <50862ebd0510241613w2b5da91cqbcaf4f6157ae338e@mail.gmail.com> References: <5.0.2.1.1.20051022120117.02b58da0@mail.oz.net> <50862ebd0510232041j19b6b3achd0aecb072f9db148@mail.gmail.com> <435C9DFC.8020501@egenix.com> <50862ebd0510241613w2b5da91cqbcaf4f6157ae338e@mail.gmail.com> Message-ID: <435DEEF6.5020603@egenix.com> Neil Hodgson wrote: > M.-A. Lemburg: > > >>Unicode has the concept of combining code points, e.g. you can >>store an "é" (e with an accent) as "e" + "'". Now if you slice >>off the accent, you'll break the character that you encoded >>using combining code points. >>... >> next_(u, index) -> integer >> >> Returns the Unicode object index for the start of the next element >> found after u[index] or -1 in case no next element >> of this type exists. > > > Should entity breakage be further discouraged by returning a slice > here rather than an object index? You mean a slice that slices out the next element? > Something like: > > i = first_grapheme(u) > x = 0 > while x < width and u[i] != "\n": > x, _ = draw(u[i], (x, y)) > i = next_grapheme(u, i) This sounds a lot like you'd want iterators for the various index types. Should be possible to implement on top of the proposed APIs, e.g. itergraphemes(u), itercodepoints(u), etc. Note that what most people refer to as "character" is a grapheme in Unicode speak. Given that interpretation, "breaking" Unicode "characters" is something you won't ever work around by using larger code units such as UCS4-compatible ones. Furthermore, you should also note that surrogates (two code units encoding one code point) are part of Unicode life.
While you don't need them when storing Unicode in UCS4 code units, they can still be part of the Unicode data and the programmer has to be aware of these. I personally don't think that slicing Unicode is such a big issue. If you know what you are doing, things tend not to break - which is true for pretty much everything you do in programming ;-) -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Oct 25 2005) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From nas at arctrix.com Tue Oct 25 10:53:07 2005 From: nas at arctrix.com (Neil Schemenauer) Date: Tue, 25 Oct 2005 08:53:07 +0000 (UTC) Subject: [Python-Dev] AST branch is in? References: <200510211202.12015.anthony@interlink.com.au> <20051025133626.286990fd.simon@arrowtheory.com> Message-ID: Simon Burton wrote: > Is there a python interface ? Not yet. Neil From mal at egenix.com Tue Oct 25 11:09:52 2005 From: mal at egenix.com (M.-A. Lemburg) Date: Tue, 25 Oct 2005 11:09:52 +0200 Subject: [Python-Dev] New codecs checked in In-Reply-To: <435D4B14.7060008@v.loewis.de> References: <435965E4.5050207@egenix.com> <435B95C0.9060005@v.loewis.de> <435CA2BA.7050900@livinglogic.de> <435CA887.3020900@egenix.com> <435CB4BB.6070009@livinglogic.de> <435CCEE6.6020005@egenix.com> <435D4B14.7060008@v.loewis.de> Message-ID: <435DF660.1010800@egenix.com> Martin v. Löwis wrote: > M.-A. Lemburg wrote: > > >>I had to create three custom mapping files for cp1140, koi8-u >>and tis-620. > > > Can you please publish the files you have used somewhere? They > best go into the Python CVS. Sure; I'll check in the whole build machinery I'm using for this.
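A toy illustration of the kind of generated lookup-table codec under discussion, runnable in current Python. The four-entry decoding_table below is invented for this sketch; the real generated modules in Lib/encodings/ use full 256-character tables built from the unicode.org mapping files, and the charmap_* helpers used here are long-standing but undocumented codecs-module internals:

```python
import codecs

# A toy decoding table: byte value -> Unicode character. A real
# generated codec uses a 256-character string in the same role.
decoding_table = (
    '\u0000'   # 0x00 -> NULL
    'A'        # 0x01 -> LATIN CAPITAL LETTER A
    '\u00fc'   # 0x02 -> LATIN SMALL LETTER U WITH DIAERESIS
    '\ufffe'   # 0x03 -> undefined in this charset
)
# The encoding direction is derived from the decoding table.
encoding_table = codecs.charmap_build(decoding_table)

text, consumed = codecs.charmap_decode(b'\x01\x02', 'strict', decoding_table)
assert text == 'A\u00fc' and consumed == 2

raw, consumed = codecs.charmap_encode('A\u00fc', 'strict', encoding_table)
assert raw == b'\x01\x02'
```

Because the table is a plain string constant, a generated codec module pays no per-import dictionary construction cost, which is the point made about .pyc load time above.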
-- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Oct 25 2005) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From mal at egenix.com Tue Oct 25 11:17:58 2005 From: mal at egenix.com (M.-A. Lemburg) Date: Tue, 25 Oct 2005 11:17:58 +0200 Subject: [Python-Dev] New codecs checked in In-Reply-To: <435D4A2B.1090505@v.loewis.de> References: <435965E4.5050207@egenix.com> <435B95C0.9060005@v.loewis.de> <435CA2BA.7050900@livinglogic.de> <435CA887.3020900@egenix.com> <435D4A2B.1090505@v.loewis.de> Message-ID: <435DF846.5050602@egenix.com> Martin v. Löwis wrote: > M.-A. Lemburg wrote: > >>I just left them in because I thought they wouldn't do any harm >>and might be useful in some applications. >> >>Removing them where not directly needed by the codec would not >>be a problem. > > > I think the memory usage this causes is measurable (I estimated 4KiB per > dictionary). More importantly, people apparently currently change > the dictionaries we provide and expect the codecs to automatically > pick up the modified mappings. It would be better if the breakage > is explicit (i.e. they get an AttributeError on the variable) instead > of implicit (their changes to the mapping simply have no effect > anymore). Agreed. I've already checked in the changes, BTW. >>KOI8-U is not available as mapping on ftp.unicode.org and >>I only recreated codecs from the mapping files available >>there. > > > I think we should come up with mapping tables for the additional > codecs as well, and maintain them in the CVS. This also applies > to things like rot13. Agreed. >>I'll rerun the creation with the above changes sometime this week.
> > > I hope I can finish my encoding routine shortly, which again > results in changes to the codecs (replacing the encoding dictionaries > with other lookup tables). Having seen the decode tables written as a long Unicode string, I think that this may indeed also be a good solution for encoding - the major improvement here is that the parser and compiler will do the work of creating the table. At module load time, the .pyc file will only contain a long string which is very fast to create and load (unlike dictionaries which are set up dynamically at load time). In general, it's better to do all the work up-front when creating the codecs, rather than having run-time code repeat these tasks over and over again. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Oct 25 2005) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From ncoghlan at gmail.com Tue Oct 25 11:26:58 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 25 Oct 2005 19:26:58 +1000 Subject: [Python-Dev] PEP 351, the freeze protocol In-Reply-To: <20051024034957.38EB.JCARLSON@uci.edu> References: <1130107429.11268.40.camel@geddy.wooz.org> <435CA5F8.7010607@gmail.com> <20051024034957.38EB.JCARLSON@uci.edu> Message-ID: <435DFA62.4090802@gmail.com> Josiah Carlson wrote: > Nick Coghlan wrote: >> I think having dicts and sets automatically invoke freeze would be a mistake, >> because at least one of the following two cases would behave unexpectedly: > > I'm pretty sure that the PEP was only asking if one would freeze the > contents of dicts IF the dict was being frozen.
> > That is, which of the following should be the case: > freeze({1:[2,3,4]}) -> {1:[2,3,4]} > freeze({1:[2,3,4]}) -> xdict(1=(2,3,4)) I believe the choices you intended are: freeze({1:[2,3,4]}) -> imdict(1=[2,3,4]) freeze({1:[2,3,4]}) -> imdict(1=(2,3,4)) Regardless, that question makes a lot more sense (and looking at the PEP again, I realised I simply read it wrong the first time). For containers where equality depends on the contents of the container (i.e., all the builtin ones), I don't see how it is possible to implement a sensible hash function without freezing the contents as well - otherwise your immutable isn't particularly immutable. Consider what would happen if list "__freeze__" simply returned a tuple version of itself - you have a __freeze__ method which returns a potentially unhashable object! Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.blogspot.com From amk at amk.ca Tue Oct 25 11:58:49 2005 From: amk at amk.ca (A.M. Kuchling) Date: Tue, 25 Oct 2005 05:58:49 -0400 Subject: [Python-Dev] AST branch is in? In-Reply-To: <20051025133626.286990fd.simon@arrowtheory.com> References: <200510211202.12015.anthony@interlink.com.au> <20051025133626.286990fd.simon@arrowtheory.com> Message-ID: <20051025095849.GA4930@rogue.amk.ca> On Tue, Oct 25, 2005 at 01:36:26PM +1000, Simon Burton wrote: > Is there a python interface ? Not yet, as far as I know. FYI, all: please see the following weblog entry for a description of the AST branch: http://www.amk.ca/diary/2005/10/the_ast_branch_lands_1 If I got anything wrong, please offer corrections in the comments for that post. --amk From amk at amk.ca Tue Oct 25 12:13:20 2005 From: amk at amk.ca (A.M. 
Kuchling) Date: Tue, 25 Oct 2005 06:13:20 -0400 Subject: [Python-Dev] Reminder: PyCon 2006 submissions due in a week Message-ID: <20051025101320.GC4930@rogue.amk.ca> The submission deadline for PyCon 2006 is now a week away. PyCon 2006 will be in Dallas, Texas, February 24-26 2006. For 2006, I'd like to see more tutorial-style talks on the program. This means that your talk doesn't have to be about something entirely new; you can show how to use a particular language feature or standard library module, examine some aspect of a Python implementation, or compare the available libraries in an application domain. For example, the most popular talk at PyCon 2005 was Michelle Levesque's PyWeboff, which compared various web development tools. The next most popular (ignoring a few keynotes and the lightning talks) were Alex Martelli's talks on iterators & generators, and on OOP. Partly that's because it's Alex, of course, but I think attendees want help in deciding which tools are good/helpful/safe to use. If you need an idea, http://wiki.python.org/moin/PyCon2005/Feedback lists some topics that 2005's attendees were interested in. CFP: http://www.python.org/pycon/2006/cfp Proposal submission site: http://submit.python.org/ --amk From mal at egenix.com Tue Oct 25 12:18:50 2005 From: mal at egenix.com (M.-A. Lemburg) Date: Tue, 25 Oct 2005 12:18:50 +0200 Subject: [Python-Dev] Divorcing str and unicode (no more implicit conversions). In-Reply-To: <5.0.2.1.1.20051024133833.02b81cb0@mail.oz.net> References: <5.0.2.1.1.20051022120117.02b58da0@mail.oz.net> <5.0.2.1.1.20051022120117.02b58da0@mail.oz.net> <5.0.2.1.1.20051024133833.02b81cb0@mail.oz.net> Message-ID: <435E068A.70409@egenix.com> Bengt Richter wrote: > At 11:43 2005-10-24 +0200, M.-A.
Lemburg wrote: > >>Bengt Richter wrote: >> >>>Please bear with me for a few paragraphs ;-) >> >>Please note that source code encoding doesn't really have >>anything to do with the way the interpreter executes the >>program - it's merely a way to tell the parser how to >>convert string literals (currently only the Unicode ones) >>into constant Unicode objects within the program text. >>It's also a nice way to let other people know what kind of >>encoding you used to write your comments ;-) >> >>Nothing more. > > I think somehow I didn't make things clear, sorry ;-) > As I tried to show in the example of module_a.cs vs module_b.cs, > the source encoding currently results in two different str-type > strings representing the source _character_ sequence, which is the > _same_ in both cases. I don't follow you here. The source code encoding is only applied to Unicode literals (you are using string literals in your example). String literals are passed through as-is. Whether or not your editor will use the source code encoding marker is really up to your editor and not within the scope of Python. If you open the two module files in Emacs, you'll see identical renderings of the string literals. With other editors, you may have to explicitly tell the editor which encoding to assume. Ditto for shell printouts.
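A small demonstration of this point, written with modern bytes-literal syntax rather than the 2.x str literals used in the thread (the byte values are the ones from the module_a/module_b example): the same character spelled under two source encodings decodes to the same text, but the raw byte strings themselves differ:

```python
# 'ü' as raw bytes under the two source encodings from the example:
latin1_bytes = b'\xfcber-cool'      # 'über-cool' encoded in latin-1
utf8_bytes = b'\xc3\xbcber-cool'    # 'über-cool' encoded in utf-8

# Decoded with the declared encoding, both yield the same text...
assert latin1_bytes.decode('latin-1') == utf8_bytes.decode('utf-8')

# ...but as byte strings they differ, which is why non-Unicode
# literals depend on the bytes that happen to be in the file.
assert latin1_bytes != utf8_bytes
```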
> To make it more clear, try the following little > program (untested except on NT4 with > Python 2.4b1 (#56, Nov 3 2004, 01:47:27) > [GCC 3.2.3 (mingw special 20030504-1)] on win32 ;-): > > ----< t_srcenc.py >-------------------------------- > import os > def test(): > open('module_a.py','wb').write( > "# -*- coding: latin-1 -*-" + os.linesep + > "cs = '\xfcber-cool'" + os.linesep) > open('module_b.py','wb').write( > "# -*- coding: utf-8 -*-" + os.linesep + > "cs = '\xc3\xbcber-cool'" + os.linesep) > # show that we have two modules differing only in encoding: > print ''.join(line.decode('latin-1') for line in open('module_a.py')) > print ''.join(line.decode('utf-8') for line in open('module_b.py')) > # see how results are affected: > import module_a, module_b > print module_a.cs + ' =?= ' + module_b.cs > print module_a.cs.decode('latin-1') + ' =?= ' + module_b.cs.decode('utf-8') > > if __name__ == '__main__': > test() > --------------------------------------------------- > The result copied from NT4 console to clipboard and pasted into eudora: > __________________________________________________________ > > [17:39] C:\pywk\python-dev>py24 t_srcenc.py > # -*- coding: latin-1 -*- > cs = 'über-cool' > > # -*- coding: utf-8 -*- > cs = 'über-cool' > > nber-cool =?= ++ber-cool > über-cool =?= über-cool > __________________________________________________________ > (I'd say NT did the best it could, rendering the copied cp437 > superscript n as the 'n' above, and the '++' coming from the > cp437 box characters corresponding to the '\xc3\xbc'. Not sure > how it will show on your screen, but try the program to see ;-) > >>Once a module is compiled, there's no distinction between >>a module using the latin-1 source code encoding or one using >>the utf-8 encoding. > > ISTM module_a.cs and module_b.cs can readily be distinguished after > compilation, whereas the sources displayed according to their declared > encodings as above (or as e.g.
different editors using different native > encoding might) cannot (other than the encoding cookie itself) ;-) > Perhaps you meant something else? What your editor displays to you is not within the scope of Python, e.g. if you open the files in Emacs you'll see something different than in Notepad. I guess that's the price you have to pay for being able to write programs that can include Unicode literals using the complete range of possible Unicode characters without having to revert to escapes. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Oct 25 2005) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From fredrik at pythonware.com Tue Oct 25 12:26:28 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Tue, 25 Oct 2005 12:26:28 +0200 Subject: [Python-Dev] Divorcing str and unicode (no more implicit conversions). References: <5.0.2.1.1.20051022120117.02b58da0@mail.oz.net><5.0.2.1.1.20051022120117.02b58da0@mail.oz.net><5.0.2.1.1.20051024133833.02b81cb0@mail.oz.net> <435E068A.70409@egenix.com> Message-ID: M.-A. Lemburg wrote: > I don't follow you here. The source code encoding > is only applied to Unicode literals (you are using string > literals in your example). String literals are passed > through as-is. however, for Python 3000, it would be nice if the source-code encoding applied to the *entire* file (XML-style), rather than just unicode string literals and (hopefully) comments and docstrings. From mal at egenix.com Tue Oct 25 13:31:50 2005 From: mal at egenix.com (M.-A. Lemburg) Date: Tue, 25 Oct 2005 13:31:50 +0200 Subject: [Python-Dev] Divorcing str and unicode (no more implicit conversions).
In-Reply-To: References: <5.0.2.1.1.20051022120117.02b58da0@mail.oz.net><5.0.2.1.1.20051022120117.02b58da0@mail.oz.net><5.0.2.1.1.20051024133833.02b81cb0@mail.oz.net> <435E068A.70409@egenix.com> Message-ID: <435E17A6.2000404@egenix.com> Fredrik Lundh wrote: > M.-A. Lemburg wrote: > >>I don't follow you here. The source code encoding >>is only applied to Unicode literals (you are using string >>literals in your example). String literals are passed >>through as-is. > > however, for Python 3000, it would be nice if the source-code encoding applied > to the *entire* file (XML-style), rather than just unicode string literals and > (hopefully) comments and docstrings. Actually, the encoding is applied to the complete source file: the file is transcoded into UTF-8 and then parsed by the Python parser. Unicode literals are then decoded from the UTF-8 into Unicode. String literals are transcoded back into the source code encoding, thus making the (rather long due to technical constraints) round-trip source code encoding -> Unicode -> UTF-8 -> Unicode -> source code encoding. Python 3k should have a fully Unicode based parser to reduce this additional transcoding overhead. Since Py3k will only have Unicode literals, the problems with string literals will go away all by themselves :-) -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Oct 25 2005) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From mal at egenix.com Tue Oct 25 14:11:28 2005 From: mal at egenix.com (M.-A.
Lemburg) Date: Tue, 25 Oct 2005 14:11:28 +0200 Subject: [Python-Dev] New codecs checked in In-Reply-To: <435DF660.1010800@egenix.com> References: <435965E4.5050207@egenix.com> <435B95C0.9060005@v.loewis.de> <435CA2BA.7050900@livinglogic.de> <435CA887.3020900@egenix.com> <435CB4BB.6070009@livinglogic.de> <435CCEE6.6020005@egenix.com> <435D4B14.7060008@v.loewis.de> <435DF660.1010800@egenix.com> Message-ID: <435E20F0.5040905@egenix.com> M.-A. Lemburg wrote: > Martin v. Löwis wrote: > >>M.-A. Lemburg wrote: >> >> >> >>>I had to create three custom mapping files for cp1140, koi8-u >>>and tis-620. >> >> >>Can you please publish the files you have used somewhere? They >>best go into the Python CVS. > > > Sure; I'll check in the whole build machinery I'm using for this. Done. In order to rebuild the codecs, cd Tools/unicode; make then check the codecs in the created build/ subdir (e.g. using comparecodecs.py) and copy them over to the Lib/encodings/ directory. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Oct 25 2005) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free !
:::: From ncoghlan at gmail.com Tue Oct 25 16:27:49 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 26 Oct 2005 00:27:49 +1000 Subject: [Python-Dev] Proposed resolutions for open PEP 343 issues In-Reply-To: References: <435A4598.3060403@iinet.net.au> <435B597C.6040300@gmail.com> <5.1.1.6.0.20051023124842.01af9078@mail.telecommunity.com> <435CCB1C.4030108@gmail.com> Message-ID: <435E40E5.7070703@gmail.com> Almost there - this is the only issue I have left on my list :) Guido van Rossum wrote: > On 10/24/05, Nick Coghlan wrote: >> However, those resolutions bring up the following issues: >> >> 5 a. What exception is raised when EXPR does not have a __context__ method? >> b. What about when the returned object is missing __enter__ or __exit__? >> I suggest raising TypeError in both cases, for symmetry with for loops. >> The slot check is made in C code, so I don't see any difficulty in raising >> TypeError instead of AttributeError if the relevant slots aren't filled. > > Why are you so keen on TypeError? I find AttributeError totally > appropriate. I don't see symmetry with for-loops as a valuable > property here. AttributeError and TypeError are often interchangeable > anyway. The reason I'm keen on TypeError is because 'abstract.c' uses it consistently when it fails to find a method to support a requested protocol. None of the abstract object methods currently raise AttributeError, and this property is fairly visible at the Python level because the abstract API's are used to implement many of the bytecodes and various builtin functions. Both for loops and the iter function, for example, get their current exception behaviour from PyObject_GetIter and PyIter_Next. Having had a look at mwh's patch, however, I've realised that going that way would only be possible if there were dedicated bytecodes for GET_CONTEXT, ENTER_CONTEXT and EXIT_CONTEXT (similar to the dedicated GET_ITER and FOR_ITER). 
Leaving the exception as AttributeError means that level of bytecode hacking isn't necessary (mwh's patch just emits a fairly normal try/finally statement, although it still modifies the bytecode to include LOAD_EXIT_ARGS). So, the inconsistency with other syntactic protocols still bothers me, but I can live with AttributeError if you don't want to add three new bytecodes just to support PEP 343. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.blogspot.com From dave at boost-consulting.com Tue Oct 25 17:21:56 2005 From: dave at boost-consulting.com (David Abrahams) Date: Tue, 25 Oct 2005 11:21:56 -0400 Subject: [Python-Dev] MinGW and libpython24.a Message-ID: Is the instruction at http://www.python.org/dev/doc/devel/inst/tweak-flags.html#SECTION000622000000000000000 still relevant? I am not 100% certain I didn't make one myself, but it looks to me as though my Windows Python 2.4.1 distro came with a libpython24.a. I am asking here because it seems only the person who prepares the installer would know. If this is true, in which version was it introduced? Thanks, -- Dave Abrahams Boost Consulting www.boost-consulting.com From guido at python.org Tue Oct 25 18:14:29 2005 From: guido at python.org (Guido van Rossum) Date: Tue, 25 Oct 2005 09:14:29 -0700 Subject: [Python-Dev] Proposed resolutions for open PEP 343 issues In-Reply-To: <435E40E5.7070703@gmail.com> References: <435A4598.3060403@iinet.net.au> <435B597C.6040300@gmail.com> <5.1.1.6.0.20051023124842.01af9078@mail.telecommunity.com> <435CCB1C.4030108@gmail.com> <435E40E5.7070703@gmail.com> Message-ID: On 10/25/05, Nick Coghlan wrote: > Almost there - this is the only issue I have left on my list :) [,,,] > > Why are you so keen on TypeError? I find AttributeError totally > > appropriate. I don't see symmetry with for-loops as a valuable > > property here. 
AttributeError and TypeError are often interchangeable > > anyway. > > The reason I'm keen on TypeError is because 'abstract.c' uses it consistently > when it fails to find a method to support a requested protocol. Hm. abstract.c well predates the new type system. Slots and methods weren't really unified back then, so TypeError made obvious sense at the time. But with the new unified types/classes, those TypeErrors are really just delayed (or precomputed? depends on your POV) AttributeErrors. > None of the abstract object methods currently raise AttributeError, and this > property is fairly visible at the Python level because the abstract API's are > used to implement many of the bytecodes and various builtin functions. Both > for loops and the iter function, for example, get their current exception > behaviour from PyObject_GetIter and PyIter_Next. > > Having had a look at mwh's patch, however, I've realised that going that way > would only be possible if there were dedicated bytecodes for GET_CONTEXT, > ENTER_CONTEXT and EXIT_CONTEXT (similar to the dedicated GET_ITER and FOR_ITER). > > Leaving the exception as AttributeError means that level of bytecode hacking > isn't necessary (mwh's patch just emits a fairly normal try/finally statement, > although it still modifies the bytecode to include LOAD_EXIT_ARGS). Let's definitely not introduce new bytecodes just so we can raise a different exception. > So, the inconsistency with other syntactic protocols still bothers me, but I > can live with AttributeError if you don't want to add three new bytecodes just > to support PEP 343. I think the consistency you are seeking is a mirage. The TypeErrors stem from the pre-computation of the slot population, not from some requirements to raise TypeError for failing to implement some required built-in protocol. I wouldn't hold it against other implementations of Python if they raised AttributeError in more situations. It is true though that AttributeError is somewhat special. 
There are lots of places (perhaps too many?) where an operation is defined using something like "if the object has attribute __foo__, use it, otherwise use some other approach". Some operations explicitly check for AttributeError in their attribute check, and let a different exception bubble up the stack. Presumably this is done so that a bug in somebody's __getattr__ implementation doesn't get masked by the "otherwise use some other approach" branch. But this is relatively rare; most calls to PyObject_GetAttr just clear the error if they have a different approach available. In any case, I don't see any of this as supporting the position that TypeError is somehow more appropriate. An AttributeError complaining about a missing __enter__, __exit__ or __context__ method sounds just fine. (Oh, and please don't go checking for the existence of __exit__ before calling __enter__. That kind of bug is found with even the most cursory testing.) -- --Guido van Rossum (home page: http://www.python.org/~guido/) From martin at v.loewis.de Tue Oct 25 20:12:13 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 25 Oct 2005 20:12:13 +0200 Subject: [Python-Dev] New codecs checked in In-Reply-To: <435E20F0.5040905@egenix.com> References: <435965E4.5050207@egenix.com> <435B95C0.9060005@v.loewis.de> <435CA2BA.7050900@livinglogic.de> <435CA887.3020900@egenix.com> <435CB4BB.6070009@livinglogic.de> <435CCEE6.6020005@egenix.com> <435D4B14.7060008@v.loewis.de> <435DF660.1010800@egenix.com> <435E20F0.5040905@egenix.com> Message-ID: <435E757D.4030408@v.loewis.de> M.-A. Lemburg wrote: > Done. > > In order to rebuild the codecs, cd Tools/unicode; make > then check the codecs in the created build/ subdir (e.g. > using comparecodecs.py) and copy them over to the > Lib/encodings/ directory. Thanks! 
Martin From martin at v.loewis.de Tue Oct 25 20:16:45 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 25 Oct 2005 20:16:45 +0200 Subject: [Python-Dev] MinGW and libpython24.a In-Reply-To: References: Message-ID: <435E768D.2000401@v.loewis.de> David Abrahams wrote: > Is the instruction at > http://www.python.org/dev/doc/devel/inst/tweak-flags.html#SECTION000622000000000000000 > still relevant? I am not 100% certain I didn't make one myself, but > it looks to me as though my Windows Python 2.4.1 distro came with a > libpython24.a. I am asking here because it seems only the person who > prepares the installer would know. That impression might be incorrect: I can tell you when I started including libpython24.a, but I have no clue whether the instructions you refer to are correct - I don't use the file myself at all. > If this is true, in which version was it introduced? It was introduced in 1.20/1.16.2.4 of Tools/msi/msi.py in response to patch #1088716; this in turn was first used to release r241c1. Regards, Martin From martin at v.loewis.de Tue Oct 25 20:18:24 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 25 Oct 2005 20:18:24 +0200 Subject: [Python-Dev] Divorcing str and unicode (no more implicit conversions). In-Reply-To: <05Oct24.165105pdt."58617"@synergy1.parc.xerox.com> References: <05Oct24.165105pdt."58617"@synergy1.parc.xerox.com> Message-ID: <435E76F0.70001@v.loewis.de> Bill Janssen wrote: > I just got mail this morning from a researcher who wants exactly what > Martin described, and wondered why the default MacPython 2.4.2 didn't > provide it by default. :-) If all he wants is to represent Deseret, he can do so in a 16-bit Unicode type, too: Python supports UTF-16. 
Regards, Martin From martin at v.loewis.de Tue Oct 25 20:21:08 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 25 Oct 2005 20:21:08 +0200 Subject: [Python-Dev] Divorcing str and unicode (no more implicit conversions). In-Reply-To: References: <5.0.2.1.1.20051022120117.02b58da0@mail.oz.net><5.0.2.1.1.20051022120117.02b58da0@mail.oz.net><5.0.2.1.1.20051024133833.02b81cb0@mail.oz.net> <435E068A.70409@egenix.com> Message-ID: <435E7794.2030505@v.loewis.de> Fredrik Lundh wrote: > however, for Python 3000, it would be nice if the source-code encoding applied > to the *entire* file (XML-style), rather than just unicode string literals and (hope- > fully) comments and docstrings. As MAL explains, the encoding currently does apply to the entire file. However, because of the Python syntax, you are restricted to ASCII in many places, such as keywords, number literals, and (unfortunately) identifiers. Lifting the restriction on identifiers is on my agenda. Regards, Martin From jcarlson at uci.edu Tue Oct 25 21:04:33 2005 From: jcarlson at uci.edu (Josiah Carlson) Date: Tue, 25 Oct 2005 12:04:33 -0700 Subject: [Python-Dev] PEP 351, the freeze protocol In-Reply-To: <435DFA62.4090802@gmail.com> References: <20051024034957.38EB.JCARLSON@uci.edu> <435DFA62.4090802@gmail.com> Message-ID: <20051025120225.3924.JCARLSON@uci.edu> Nick Coghlan wrote: > > Josiah Carlson wrote: > > Nick Coghlan wrote: > >> I think having dicts and sets automatically invoke freeze would be a mistake, > >> because at least one of the following two cases would behave unexpectedly: > > > > I'm pretty sure that the PEP was only asking if one would freeze the > > contents of dicts IF the dict was being frozen.
> > > > That is, which of the following should be the case: > > freeze({1:[2,3,4]}) -> {1:[2,3,4]} > > freeze({1:[2,3,4]}) -> xdict(1=(2,3,4)) > > I believe the choices you intended are: > freeze({1:[2,3,4]}) -> imdict(1=[2,3,4]) > freeze({1:[2,3,4]}) -> imdict(1=(2,3,4)) > > Regardless, that question makes a lot more sense (and looking at the PEP > again, I realised I simply read it wrong the first time). > > For containers where equality depends on the contents of the container (i.e., > all the builtin ones), I don't see how it is possible to implement a sensible > hash function without freezing the contents as well - otherwise your immutable > isn't particularly immutable. > > Consider what would happen if list "__freeze__" simply returned a tuple > version of itself - you have a __freeze__ method which returns a potentially > unhashable object! I agree completely, hence my original statement on 10/23: "it is of my opinion that a container which is frozen should have its contents frozen as well." - Josiah From dave at boost-consulting.com Tue Oct 25 21:06:29 2005 From: dave at boost-consulting.com (David Abrahams) Date: Tue, 25 Oct 2005 15:06:29 -0400 Subject: [Python-Dev] MinGW and libpython24.a In-Reply-To: <435E768D.2000401@v.loewis.de> (Martin v. =?iso-8859-1?Q?L=F6?= =?iso-8859-1?Q?wis's?= message of "Tue, 25 Oct 2005 20:16:45 +0200") References: <435E768D.2000401@v.loewis.de> Message-ID: "Martin v. L?wis" writes: > David Abrahams wrote: >> Is the instruction at >> http://www.python.org/dev/doc/devel/inst/tweak-flags.html#SECTION000622000000000000000 >> still relevant? I am not 100% certain I didn't make one myself, but >> it looks to me as though my Windows Python 2.4.1 distro came with a >> libpython24.a. I am asking here because it seems only the person who >> prepares the installer would know. 
> > That impression might be incorrect: I can tell you when I started > including libpython24.a, but I have no clue whether the instructions > you refer to are correct - I don't use the file myself at all. > >> If this is true, in which version was it introduced? > > It was introduced in 1.20/1.16.2.4 of Tools/msi/msi.py in response to > patch #1088716; this in turn was first used to release r241c1. Thanks! -- Dave Abrahams Boost Consulting www.boost-consulting.com From jcarlson at uci.edu Tue Oct 25 21:17:10 2005 From: jcarlson at uci.edu (Josiah Carlson) Date: Tue, 25 Oct 2005 12:17:10 -0700 Subject: [Python-Dev] Divorcing str and unicode (no more implicitconversions). In-Reply-To: <435E7794.2030505@v.loewis.de> References: <435E7794.2030505@v.loewis.de> Message-ID: <20051025120919.3927.JCARLSON@uci.edu> "Martin v. L?wis" wrote: > > Fredrik Lundh wrote: > > however, for Python 3000, it would be nice if the source-code encoding applied > > to the *entire* file (XML-style), rather than just unicode string literals and (hope- > > fully) comments and docstrings. > > As MAL explains, the encoding currently does apply to the entire file. > However, because of the Python syntax, you are restricted to ASCII > in many places, such as keywords, number literals, and (unfortunately) > identifiers. Lifting the restriction on identifiers is on my agenda. It seems that removing this restriction may cause serious issues, at least in the case when using cyrillic characters in names. See recent security issues in regards to web addresses in web browsers for the confusion (and/or name errors) that could result in their use. While I agree in principle that people should be able to use the entirety of one's own natural language in writing software in programming languages, I think that it is an ugly can of worms that perhaps shouldn't be opened. 
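On Josiah's closing question about arabic/indic numeric literals: CPython's int() already consults the Unicode decimal-digit property when converting strings, even though numeric *literals* in source code remain ASCII-only. A small illustrative sketch of current CPython behaviour (the variable name is made up):

```python
import unicodedata

# ARABIC-INDIC DIGIT FOUR (U+0664) followed by ARABIC-INDIC DIGIT TWO (U+0662)
arabic_42 = "\u0664\u0662"

# Each character carries a decimal value in the Unicode character database.
print([unicodedata.decimal(ch) for ch in arabic_42])  # [4, 2]

# int() accepts any string of Unicode decimal digits, even though the same
# characters are rejected inside a source-code numeric literal.
print(int(arabic_42))  # 42
```

So "support it in int()/float()" is not hypothetical; the open question in the thread is only about literals and identifiers.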
- Josiah From eric.nieuwland at xs4all.nl Tue Oct 25 22:05:18 2005 From: eric.nieuwland at xs4all.nl (Eric Nieuwland) Date: Tue, 25 Oct 2005 22:05:18 +0200 Subject: [Python-Dev] Proposed resolutions for open PEP 343 issues In-Reply-To: References: <435A4598.3060403@iinet.net.au> <435B597C.6040300@gmail.com> <5.1.1.6.0.20051023124842.01af9078@mail.telecommunity.com> <435CCB1C.4030108@gmail.com> <435E40E5.7070703@gmail.com> Message-ID: <103348c72cae040229895842cb3c0cdc@xs4all.nl> Guido van Rossum wrote: > It is true though that AttributeError is somewhat special. There are > lots of places (perhaps too many?) where an operation is defined using > something like "if the object has attribute __foo__, use it, otherwise > use some other approach". Some operations explicitly check for > AttributeError in their attribute check, and let a different exception > bubble up the stack. Presumably this is done so that a bug in > somebody's __getattr__ implementation doesn't get masked by the > "otherwise use some other approach" branch. But this is relatively > rare; most calls to PyObject_GetAttr just clear the error if they have > a different approach available. In any case, I don't see any of this > as supporting the position that TypeError is somehow more appropriate. > An AttributeError complaining about a missing __enter__, __exit__ or > __context__ method sounds just fine. (Oh, and please don't go checking > for the existence of __exit__ before calling __enter__. That kind of > bug is found with even the most cursory testing.) Hmmm... Would it be reasonable to introduce a ProtocolError exception? 
--eric From guido at python.org Tue Oct 25 22:11:23 2005 From: guido at python.org (Guido van Rossum) Date: Tue, 25 Oct 2005 13:11:23 -0700 Subject: [Python-Dev] Proposed resolutions for open PEP 343 issues In-Reply-To: <103348c72cae040229895842cb3c0cdc@xs4all.nl> References: <435A4598.3060403@iinet.net.au> <435B597C.6040300@gmail.com> <5.1.1.6.0.20051023124842.01af9078@mail.telecommunity.com> <435CCB1C.4030108@gmail.com> <435E40E5.7070703@gmail.com> <103348c72cae040229895842cb3c0cdc@xs4all.nl> Message-ID: On 10/25/05, Eric Nieuwland wrote: > Hmmm... Would it be reasonable to introduce a ProtocolError exception? And which perceived problem would that solve? The problem of Nick & Guido disagreeing in public? -- --Guido van Rossum (home page: http://www.python.org/~guido/) From eric.nieuwland at xs4all.nl Tue Oct 25 22:22:33 2005 From: eric.nieuwland at xs4all.nl (Eric Nieuwland) Date: Tue, 25 Oct 2005 22:22:33 +0200 Subject: [Python-Dev] Proposed resolutions for open PEP 343 issues In-Reply-To: References: <435A4598.3060403@iinet.net.au> <435B597C.6040300@gmail.com> <5.1.1.6.0.20051023124842.01af9078@mail.telecommunity.com> <435CCB1C.4030108@gmail.com> <435E40E5.7070703@gmail.com> <103348c72cae040229895842cb3c0cdc@xs4all.nl> Message-ID: <0c1f3f308e9c605ef4687689c860913e@xs4all.nl> Guido van Rossum wrote: > On 10/25/05, Eric Nieuwland wrote: >> Hmmm... Would it be reasonable to introduce a ProtocolError exception? > > And which perceived problem would that solve? The problem of Nick & > Guido disagreeing in public? ;-) No, that will go on in other fields, I guess. It was meant to be a bit more informative about what is wrong. 
ProtocolError: lacks __enter__ or __exit__ --eric From guido at python.org Tue Oct 25 22:35:14 2005 From: guido at python.org (Guido van Rossum) Date: Tue, 25 Oct 2005 13:35:14 -0700 Subject: [Python-Dev] Proposed resolutions for open PEP 343 issues In-Reply-To: <0c1f3f308e9c605ef4687689c860913e@xs4all.nl> References: <435A4598.3060403@iinet.net.au> <5.1.1.6.0.20051023124842.01af9078@mail.telecommunity.com> <435CCB1C.4030108@gmail.com> <435E40E5.7070703@gmail.com> <103348c72cae040229895842cb3c0cdc@xs4all.nl> <0c1f3f308e9c605ef4687689c860913e@xs4all.nl> Message-ID: [Eric "are all your pets called Eric?" Nieuwland] > >> Hmmm... Would it be reasonable to introduce a ProtocolError exception? [Guido] > > And which perceived problem would that solve? [Eric] > It was meant to be a bit more informative about what is wrong. > > ProtocolError: lacks __enter__ or __exit__ That's exactly what I'm trying to avoid. :) I find "AttributeError: __exit__" just as informative. In either case, if you know what __exit__ means, you'll know what you did wrong. And if you don't know what it means, you'll have to look it up anyway. And searching for ProtocolError doesn't do you any good -- you'll have to learn about what __exit__ is and where it is required. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From p.f.moore at gmail.com Tue Oct 25 22:40:30 2005 From: p.f.moore at gmail.com (Paul Moore) Date: Tue, 25 Oct 2005 21:40:30 +0100 Subject: [Python-Dev] PEP 343 - multiple context managers in one statement Message-ID: <79990c6b0510251340s6adb7fbcpac5247886a171c3f@mail.gmail.com> I have a deep suspicion that this has been done to death already, but my searching ability isn't up to finding the reference. So I'll simply ask the question, and not offer a long discussion: Has the option of letting the with statement admit multiple context managers been considered (and presumably rejected)? 
I'm thinking of with expr1, expr2, expr3: # whatever In some ways, this doesn't even need an extension to the PEP - giving tuples suitable __enter__ and __exit__ methods would do it. Or, I suppose a user-defined manager which combined a list of others: class combining: def __init__(self, *mgrs): self.mgrs = mgrs def __with__(self): return self def __enter__(self): return tuple(mgr.__enter__() for mgr in self.mgrs) def __exit__(self, type, value, tb): # first in, last out for mgr in reversed(self.mgrs): mgr.__exit__(type, value, tb) Would that be worth using as an example in the PEP? Sorry - it got a bit long anyway... Paul. PS The signature of __with__ in example 4 in the PEP is wrong - it has an incorrect "lock" parameter. From fwierzbicki at gmail.com Tue Oct 25 22:48:05 2005 From: fwierzbicki at gmail.com (Frank Wierzbicki) Date: Tue, 25 Oct 2005 16:48:05 -0400 Subject: [Python-Dev] AST branch is in? In-Reply-To: References: <200510211202.12015.anthony@interlink.com.au> Message-ID: <4dab5f760510251348u7b90f7b4if71624f702a54ea2@mail.gmail.com> On 10/20/05, Neal Norwitz wrote: > > The Grammar is (was at one point at least) shared between Jython and > would allow more tools to be able to share infrastructure. The idea > is to eventually be able to have [JP]ython output the same AST to > tools. Hello Python-dev, My name is Frank Wierzbicki and I'm working on the Jython project. Does anyone on this list know more about the history of this Grammar sharing between the two projects? I've heard about some Grammar sharing between Jython and Python, and I've noticed that (most of) the jython code in /org/python/parser/ast is commented "Autogenerated AST node". I would definitely like to look at (eventually) coordinating with this effort. I've cross-posted to the Jython-dev list in case someone there has some insight. Thanks, Frank -------------- next part -------------- An HTML attachment was scrubbed...
URL: http://mail.python.org/pipermail/python-dev/attachments/20051025/71dbda4a/attachment.html From guido at python.org Tue Oct 25 22:57:02 2005 From: guido at python.org (Guido van Rossum) Date: Tue, 25 Oct 2005 13:57:02 -0700 Subject: [Python-Dev] AST branch is in? In-Reply-To: <4dab5f760510251348u7b90f7b4if71624f702a54ea2@mail.gmail.com> References: <200510211202.12015.anthony@interlink.com.au> <4dab5f760510251348u7b90f7b4if71624f702a54ea2@mail.gmail.com> Message-ID: On 10/25/05, Frank Wierzbicki wrote: > My name is Frank Wierzbicki and I'm working on the Jython project. Does > anyone on this list know more about the history of this Grammar sharing > between the two projects? I've heard about some Grammar sharing between > Jython and Python, and I've noticed that (most of) the jython code in > /org/python/parser/ast is commented "Autogenerated AST node". I would > definitely like to look at (eventually) coordinating with this effort. > > I've cross-posted to the Jython-dev list in case someone there has some > insight. Your best bet is to track down Jim Hugunin and see if he remembers. He's jimhug at microsoft.com or jim at hugunin.net. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From janssen at parc.com Tue Oct 25 23:02:26 2005 From: janssen at parc.com (Bill Janssen) Date: Tue, 25 Oct 2005 14:02:26 PDT Subject: [Python-Dev] Divorcing str and unicode (no more implicit conversions). In-Reply-To: Your message of "Tue, 25 Oct 2005 11:18:24 PDT." <435E76F0.70001@v.loewis.de> Message-ID: <05Oct25.140234pdt."58617"@synergy1.parc.xerox.com> I think he was more interested in the invariant Martin proposed, that len("\U00010000") should always be the same and should always be 1. Bill From guido at python.org Tue Oct 25 23:04:09 2005 From: guido at python.org (Guido van Rossum) Date: Tue, 25 Oct 2005 14:04:09 -0700 Subject: [Python-Dev] Divorcing str and unicode (no more implicit conversions). 
In-Reply-To: <-6904385769530255077@unknownmsgid> References: <435E76F0.70001@v.loewis.de> <-6904385769530255077@unknownmsgid> Message-ID: On 10/25/05, Bill Janssen wrote: > I think he was more interested in the invariant Martin proposed, that > > len("\U00010000") > > should always be the same and should always be 1. Yes but why? What does this invariant do for him? -- --Guido van Rossum (home page: http://www.python.org/~guido/) From pedronis at strakt.com Tue Oct 25 23:08:43 2005 From: pedronis at strakt.com (Samuele Pedroni) Date: Tue, 25 Oct 2005 23:08:43 +0200 Subject: [Python-Dev] [Jython-dev] Re: AST branch is in? In-Reply-To: <4dab5f760510251348u7b90f7b4if71624f702a54ea2@mail.gmail.com> References: <200510211202.12015.anthony@interlink.com.au> <4dab5f760510251348u7b90f7b4if71624f702a54ea2@mail.gmail.com> Message-ID: <435E9EDB.90308@strakt.com> Frank Wierzbicki wrote: > On 10/20/05, *Neal Norwitz* > wrote: > > The Grammar is (was at one point at least) shared between Jython and > would allow more tools to be able to share infrastructure. The idea > is to eventually be able to have [JP]ython output the same AST to > tools. > > > Hello Python-dev, > > My name is Frank Wierzbicki and I'm working on the Jython project. Does > anyone on this list know more about the history of this Grammar sharing > between the two projects? I've heard about some Grammar sharing between > Jython and Python, and I've noticed that (most of) the jython code in > /org/python/parser/ast is commented "Autogenerated AST node". I would > definitely like to look at (eventually) coordinating with this effort. > > I've cross-posted to the Jython-dev list in case someone there has some > insight. 
as far as I understand now Python trunk contains some generated AST representation C code created through the asdl_c.py script from an updated Python.asdl, these files live in http://cvs.sourceforge.net/viewcvs.py/python/python/dist/src/Parser/ a parallel asdl_java.py existed in Python CVS sandbox (where the AST effort started) and was updated the last time the Jython own AST classes were generated with at the time version of Python.asdl (this was done by me if I remember correctly at some point in Jython 2.2 evolution, I think when the PyDev guys wanted a more up-to-date Jython parser to reuse): http://cvs.sourceforge.net/viewcvs.py/*checkout*/python/python/nondist/sandbox/ast/asdl_java.py?content-type=text%2Fplain&rev=1.7 basically the new Python.asdl needs to be used, the asdl_java.py may be updated and our compiler changed as necessary. regards. From martin at v.loewis.de Tue Oct 25 23:05:28 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 25 Oct 2005 23:05:28 +0200 Subject: [Python-Dev] Divorcing str and unicode (no more implicit conversions). In-Reply-To: <20051025120919.3927.JCARLSON@uci.edu> References: <435E7794.2030505@v.loewis.de> <20051025120919.3927.JCARLSON@uci.edu> Message-ID: <435E9E18.7090502@v.loewis.de> Josiah Carlson wrote: > It seems that removing this restriction may cause serious issues, at > least in the case when using cyrillic characters in names. See recent > security issues in regards to web addresses in web browsers for the > confusion (and/or name errors) that could result in their use. That impression is deceiving. We are talking about source code here; people type in identifiers explicitly rather than receiving them through linking, and they scope identifiers (by module or object). If somebody manages to get look-alike identifiers into your Python libraries, you have bigger problems than these look-alikes: anybody capable of doing so could just as well replace the real thing in the first place.
As always in computer security: define your threat model before reasoning about the risks. Regards, Martin From pedronis at strakt.com Tue Oct 25 23:11:32 2005 From: pedronis at strakt.com (Samuele Pedroni) Date: Tue, 25 Oct 2005 23:11:32 +0200 Subject: [Python-Dev] AST branch is in? In-Reply-To: References: <200510211202.12015.anthony@interlink.com.au> <4dab5f760510251348u7b90f7b4if71624f702a54ea2@mail.gmail.com> Message-ID: <435E9F84.2010205@strakt.com> Guido van Rossum wrote: > On 10/25/05, Frank Wierzbicki wrote: > >> My name is Frank Wierzbicki and I'm working on the Jython project. Does >>anyone on this list know more about the history of this Grammar sharing >>between the two projects? I've heard about some Grammar sharing between >>Jython and Python, and I've noticed that (most of) the jython code in >>/org/python/parser/ast is commented "Autogenerated AST node". I would >>definitely like to look at (eventually) coordinating with this effort. >> >> I've cross-posted to the Jython-dev list in case someone there has some >>insight. > > > Your best bet is to track down Jim Hugunin and see if he remembers. > He's jimhug at microsoft.com or jim at hugunin.net. > no. this is all after Jim, its indeed a derived effort from the CPython own AST effort, just that we started using it quite a while ago. This is all after Jim was not involved with Jython anymore, Finn Bock started this. From guido at python.org Tue Oct 25 23:13:38 2005 From: guido at python.org (Guido van Rossum) Date: Tue, 25 Oct 2005 14:13:38 -0700 Subject: [Python-Dev] AST branch is in? In-Reply-To: <435E9F84.2010205@strakt.com> References: <200510211202.12015.anthony@interlink.com.au> <4dab5f760510251348u7b90f7b4if71624f702a54ea2@mail.gmail.com> <435E9F84.2010205@strakt.com> Message-ID: On 10/25/05, Samuele Pedroni wrote: > > Your best bet is to track down Jim Hugunin and see if he remembers. > > He's jimhug at microsoft.com or jim at hugunin.net. > no. 
this is all after Jim, its indeed a derived effort from the CPython > own AST effort, just that we started using it quite a while ago. > This is all after Jim was not involved with Jython anymore, Finn Bock > started this. Oops! Sorry for the misinformation. Shows how much I know. :( -- --Guido van Rossum (home page: http://www.python.org/~guido/) From martin at v.loewis.de Tue Oct 25 23:21:43 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 25 Oct 2005 23:21:43 +0200 Subject: [Python-Dev] Divorcing str and unicode (no more implicit conversions). In-Reply-To: References: <435E76F0.70001@v.loewis.de> <-6904385769530255077@unknownmsgid> Message-ID: <435EA1E7.9020006@v.loewis.de> Guido van Rossum wrote: > Yes but why? What does this invariant do for him? I don't know about this person, but there are a few things that don't work properly in UTF-16 mode: - the Unicode character database fails to lookup things. u"\U0001D670".isupper() gives false, but should give true (since it denotes MATHEMATICAL MONOSPACE CAPITAL A). It gives true in UCS-4 mode - As a result, normalization on these doesn't work, either. It should normalize to "LATIN CAPITAL LETTER A" under NFKC, but doesn't. - regular expressions only have limited support. In particular, adding non-BMP characters to character classes is not possible. [\U0001D670] will match any character that is either \uD835 or \uDE70, whereas it only matches MATHEMATICAL MONOSPACE CAPITAL A in UCS-4 mode. There might be more limitations, but those are the ones that come to mind easily. While I could imagine fixing the first two with some effort, the third one is really tricky (unless you would accept a "wide" representation of a character class even if the Unicode representation is only narrow). 
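Martin's first two points are easy to demonstrate on a build where a non-BMP character is a single code point (UCS-4 mode; any modern CPython behaves this way). A sketch of the behaviour he describes as correct:

```python
import unicodedata

ch = "\U0001D670"  # MATHEMATICAL MONOSPACE CAPITAL A, a non-BMP character

# One code point, not a UTF-16 surrogate pair.
print(len(ch))  # 1

# The character database resolves the real character, so case predicates work.
print(ch.isupper())  # True
print(unicodedata.name(ch))  # MATHEMATICAL MONOSPACE CAPITAL A

# NFKC normalization folds it to LATIN CAPITAL LETTER A.
print(unicodedata.normalize("NFKC", ch))  # A
```

On a narrow (UTF-16) build of that era, len(ch) would be 2 and these operations would see only the individual surrogates, which is exactly the failure mode Martin lists.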
Regards, Martin From jcarlson at uci.edu Tue Oct 25 23:40:06 2005 From: jcarlson at uci.edu (Josiah Carlson) Date: Tue, 25 Oct 2005 14:40:06 -0700 Subject: [Python-Dev] Divorcing str and unicode (no more implicitconversions). In-Reply-To: <435E9E18.7090502@v.loewis.de> References: <20051025120919.3927.JCARLSON@uci.edu> <435E9E18.7090502@v.loewis.de> Message-ID: <20051025142245.3930.JCARLSON@uci.edu> "Martin v. L?wis" wrote: > > Josiah Carlson wrote: > > It seems that removing this restriction may cause serious issues, at > > least in the case when using cyrillic characters in names. See recent > > security issues in regards to web addresses in web browsers for the > > confusion (and/or name errors) that could result in their use. > > That impression is deceiving. We are talking about source code here; > people type in identifiers explicitly rather than receiving them > through linking, and they scope identifiers (by module or object). > > If somebody manages to get look-alike identifiers into your Python > libraries, you have bigger problems than these look-alikes: anybody > capable of doing so could just as well replace the real thing in > the first place. > > As always in computer security: define your threat model before > reasoning about the risks. I should have been more explicit. I did not mean to imply that I was concerned about the security implications of inserting arbitrary identifiers in Python (I was mentioning the web browser case for an example of how such characters have been confusing previously), I am concerned about confusion involved with using: Greek Capital: Alpha, Beta, Epsilon, Zeta, Eta, Iota, Kappa, Mu, Nu, Omicron, Rho, and Tau. Cyrillic Capital: Dze, Je, A, Ve, Ie, Em, En, O, Er, Es, Te, Ha, ... And how users could say, "name error? But I typed in window.draw(PEN) as I was told to, and it didn't work!" Identically drawn glyphs are a problem, and pretending that they aren't a problem, doesn't make it so. 
Right now, all possible name glyphs are visually distinct, which would not be the case if any unicode character could be used as a name (except for numerals). Speaking of which, would we then be offering support for arabic/indic numeric literals, and/or support it in int()/float()? Ideally I would like to say yes, but I could see the confusion if such were allowed. - Josiah From phil at riverbankcomputing.co.uk Tue Oct 25 23:45:02 2005 From: phil at riverbankcomputing.co.uk (Phil Thompson) Date: Tue, 25 Oct 2005 22:45:02 +0100 Subject: [Python-Dev] Inconsistent Use of Buffer Interface in stringobject.c In-Reply-To: References: <200510241818.41145.phil@riverbankcomputing.co.uk> <435D28B6.9010806@egenix.com> Message-ID: <200510252245.02871.phil@riverbankcomputing.co.uk> On Monday 24 October 2005 7:39 pm, Guido van Rossum wrote: > On 10/24/05, M.-A. Lemburg wrote: > > Guido van Rossum wrote: > > > A concern I'd have with fixing this is that Unicode objects also > > > support the buffer API. In any situation where either str or unicode > > > is accepted I'd be reluctant to guess whether a buffer object was > > > meant to be str-like or Unicode-like. I think this covers all the > > > cases you mention here. > > > > This situation is a little better than that: the buffer > > interface has a slot called getcharbuffer which is what > > the string methods use in case they find that a string > > argument is not of type str or unicode. > > I stand corrected! > > > As first step, I'd suggest to implement the getcharbuffer > > slot. That will already go a long way. > > Phil, if anything still doesn't work after doing what Marc-Andre says, > those would be good candidates for fixes! The patch is now on SF, #1337876. Phil From mal at egenix.com Tue Oct 25 23:47:23 2005 From: mal at egenix.com (M.-A. Lemburg) Date: Tue, 25 Oct 2005 23:47:23 +0200 Subject: [Python-Dev] Divorcing str and unicode (no more implicit conversions).
In-Reply-To: <20051025120919.3927.JCARLSON@uci.edu> References: <435E7794.2030505@v.loewis.de> <20051025120919.3927.JCARLSON@uci.edu> Message-ID: <435EA7EB.90100@egenix.com> Josiah Carlson wrote: > "Martin v. L?wis" wrote: > >>Fredrik Lundh wrote: >> >>>however, for Python 3000, it would be nice if the source-code encoding applied >>>to the *entire* file (XML-style), rather than just unicode string literals and (hope- >>>fully) comments and docstrings. >> >>As MAL explains, the encoding currently does apply to the entire file. >>However, because of the Python syntax, you are restricted to ASCII >>in many places, such as keywords, number literals, and (unfortunately) >>identifiers. Lifting the restriction on identifiers is on my agenda. > > > It seems that removing this restriction may cause serious issues, at > least in the case when using cyrillic characters in names. See recent > security issues in regards to web addresses in web browsers for the > confusion (and/or name errors) that could result in their use. > > While I agree in principle that people should be able to use the > entirety of one's own natural language in writing software in > programming languages, I think that it is an ugly can of worms that > perhaps shouldn't be opened. I agree with Josiah. A few years ago we had a discussion about this on python-dev and agreed to stick with ASCII identifiers for Python. I still think that's the right way to go. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Oct 25 2005) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! 
:::: From guido at python.org Tue Oct 25 23:47:42 2005 From: guido at python.org (Guido van Rossum) Date: Tue, 25 Oct 2005 14:47:42 -0700 Subject: [Python-Dev] Divorcing str and unicode (no more implicitconversions). In-Reply-To: <20051025142245.3930.JCARLSON@uci.edu> References: <20051025120919.3927.JCARLSON@uci.edu> <435E9E18.7090502@v.loewis.de> <20051025142245.3930.JCARLSON@uci.edu> Message-ID: On 10/25/05, Josiah Carlson wrote: > Identically drawn glyphs are a problem, and pretending that they aren't > a problem, doesn't make it so. Right now, all possible name glyphs are > visually distinct, which would not be the case if any unicode character > could be used as a name (except for numerals). Speaking of which, would > we then be offering support for arabic/indic numeric literals, and/or > support it in int()/float()? Ideally I would like to say yes, but I > could see the confusion if such were allowed. This problem isn't new. There are plenty of fonts where 1 and l are hard to distinguish, or l and I for that matter, or O and 0. Yes, we need better tools to diagnose this. No, we shouldn't let this stop us from adding such a feature if it is otherwise a good feature. I'm not so sure about this for other reasons -- it hampers code sharing, and as soon as you add right-to-left character sets to the mix (or top-to-bottom, for that matter), displaying source code is going to be near impossible for most tools (since the keywords and standard library module names will still be in the Latin alphabet). This actually seems a killer even for allowing Unicode in comments, which I'd otherwise favor. What do Unicode-aware apps generally do with right-to-left characters? -- --Guido van Rossum (home page: http://www.python.org/~guido/) From martin at v.loewis.de Tue Oct 25 23:55:27 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 25 Oct 2005 23:55:27 +0200 Subject: [Python-Dev] Divorcing str and unicode (no more implicitconversions). 
In-Reply-To: <20051025142245.3930.JCARLSON@uci.edu> References: <20051025120919.3927.JCARLSON@uci.edu> <435E9E18.7090502@v.loewis.de> <20051025142245.3930.JCARLSON@uci.edu> Message-ID: <435EA9CF.6060305@v.loewis.de> Josiah Carlson wrote: > And how users could say, "name error? But I typed in window.draw(PEN) as > I was told to, and it didn't work!" Ah, so the "serious issues" you are talking about are not security issues, but usability issues. I don't think extending the range of acceptable characters will cause any additional confusion. Users are already getting "surprising" NameErrors/AttributeErrors in the following cases: - they just misspell the identifier, and then, when the error message is printed, fail to recognize the difference, as they read over the typo just like they read over it when mistyping it in the first place. - they run into confusions with different things having the same names in different contexts. For example, they wonder why they get TypeError for passing the wrong number of arguments to a function, when the call matches exactly what the source code in front of them tells them - only that they were calling a different function which just happened to have the same name. In the light of these common mistakes, your example with an identifier named PEN, where the "P" might be a cyrillic letter or the E a greek one is just made up: For window.draw, people will readily understand that they are supposed to use Latin letters. More generally, they will know what script to use just from looking at the identifier. > Identically drawn glyphs are a problem, and pretending that they aren't > a problem, doesn't make it so. Right now, all possible name glyphs are > visually distinct Not at all: Just compare Fool and Foo1 (and perhaps FooI) In the font in which I'm typing this, these are slightly different - but there are fonts in which the difference is really difficult to recognize. 
> Speaking of which, would > we then be offering support for arabic/indic numeric literals, and/or > support it in int()/float()? No. None of the Arabic users have ever requested such a feature, so it would be stupid to provide it. We provide extended identifiers not for the fun of it, but because users are requesting them. Regards, Martin From ncoghlan at gmail.com Tue Oct 25 23:55:15 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 26 Oct 2005 07:55:15 +1000 Subject: [Python-Dev] Proposed resolutions for open PEP 343 issues In-Reply-To: References: <435A4598.3060403@iinet.net.au> <435B597C.6040300@gmail.com> <5.1.1.6.0.20051023124842.01af9078@mail.telecommunity.com> <435CCB1C.4030108@gmail.com> <435E40E5.7070703@gmail.com> Message-ID: <435EA9C3.9030600@gmail.com> Guido van Rossum wrote: > On 10/25/05, Nick Coghlan wrote: >> Almost there - this is the only issue I have left on my list :) > [,,,] >>> Why are you so keen on TypeError? I find AttributeError totally >>> appropriate. I don't see symmetry with for-loops as a valuable >>> property here. AttributeError and TypeError are often interchangeable >>> anyway. >> The reason I'm keen on TypeError is because 'abstract.c' uses it consistently >> when it fails to find a method to support a requested protocol. > > Hm. abstract.c well predates the new type system. Slots and methods > weren't really unified back then, so TypeError made obvious sense at > the time. Ah, I hadn't considered that, because I never made significant use of any Python versions before 2.2. Maybe there's a design principle in there somewhere: Failed duck-typing -> AttributeError (or TypeError for complex checks) Failed instance or subtype check -> TypeError Most of the functions in abstract.c handle complex protocols, so a simple attribute error wouldn't convey the necessary meaning. The context protocol, on the other hand, is fairly simple, and an AttributeError tells you everything you really need to know. Cheers, Nick. 
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.blogspot.com From martin at v.loewis.de Tue Oct 25 23:56:06 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 25 Oct 2005 23:56:06 +0200 Subject: [Python-Dev] Divorcing str and unicode (no more implicitconversions). In-Reply-To: <435EA7EB.90100@egenix.com> References: <435E7794.2030505@v.loewis.de> <20051025120919.3927.JCARLSON@uci.edu> <435EA7EB.90100@egenix.com> Message-ID: <435EA9F6.3030702@v.loewis.de> M.-A. Lemburg wrote: > A few years ago we had a discussion about this on python-dev > and agreed to stick with ASCII identifiers for Python. I still > think that's the right way to go. I don't think there ever was such an agreement. Regards, Martin From guido at python.org Wed Oct 26 00:05:25 2005 From: guido at python.org (Guido van Rossum) Date: Tue, 25 Oct 2005 15:05:25 -0700 Subject: [Python-Dev] Proposed resolutions for open PEP 343 issues In-Reply-To: <435EA9C3.9030600@gmail.com> References: <435A4598.3060403@iinet.net.au> <435B597C.6040300@gmail.com> <5.1.1.6.0.20051023124842.01af9078@mail.telecommunity.com> <435CCB1C.4030108@gmail.com> <435E40E5.7070703@gmail.com> <435EA9C3.9030600@gmail.com> Message-ID: On 10/25/05, Nick Coghlan wrote: > Maybe there's a design principle in there somewhere: > > Failed duck-typing -> AttributeError (or TypeError for complex checks) > Failed instance or subtype check -> TypeError Doesn't convince me. If there are principles at work here (and not just coincidences), they are (a) don't lightly replace an exception by another, and (b) don't raise AttributeError; the getattr operation raise it for you. (a) says that we should let the AttributeError bubble up in the case of the with-statement; (b) explains why you see TypeError when a slot isn't filled. 
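[A minimal editorial sketch of principle (b) above, not from the thread; the class name is invented. Simply looking up the missing slot is what raises the informative error — no explicit `raise AttributeError` is needed anywhere:]

```python
# Hypothetical context-manager-like object that forgot __exit__.
class NoExit:
    def __enter__(self):
        return self

try:
    NoExit().__exit__  # the getattr operation raises for us
except AttributeError as exc:
    message = str(exc)

print(message)  # e.g. "'NoExit' object has no attribute '__exit__'"
```

The message names `__exit__` directly, which is all the reader of the traceback really needs.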
> Most of the functions in abstract.c handle complex protocols, so a simple > attribute error wouldn't convey the necessary meaning. The context protocol, > on the other hand, is fairly simple, and an AttributeError tells you > everything you really need to know. That's what I've been saying all the time. :-) -- --Guido van Rossum (home page: http://www.python.org/~guido/) From martin at v.loewis.de Wed Oct 26 00:11:36 2005 From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=) Date: Wed, 26 Oct 2005 00:11:36 +0200 Subject: [Python-Dev] Divorcing str and unicode (no more implicitconversions). In-Reply-To: References: <20051025120919.3927.JCARLSON@uci.edu> <435E9E18.7090502@v.loewis.de> <20051025142245.3930.JCARLSON@uci.edu> Message-ID: <435EAD98.6000401@v.loewis.de> Guido van Rossum wrote: > This actually seems a killer even for allowing Unicode in comments, > which I'd otherwise favor. What do Unicode-aware apps generally do > with right-to-left characters? The Unicode standard has an elaborate definition of what should happen. There are many rules to it, but essentially, there is the notion of a "primary" direction, which then is toggled based on the directionality of each character (unicodedata.bidirectional). There are also formatting characters which toggle the direction. This aspect of rendering is often not implemented, though. Web browsers do it correctly, see http://he.wikipedia.org/wiki/Python where all text should come out right-adjusted, yet the Latin fragments are still left to right (such as "Guido van Rossum") Integrating it into this text looks like this: פייתון (Python). GUI frameworks sometimes do it correctly, sometimes don't; most notably, Tk has no good support for RTL text.
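[Editorial sketch, not part of the original message: the per-character direction classes that drive those rules are exposed by `unicodedata.bidirectional`, using a Hebrew letter as the right-to-left sample:]

```python
import unicodedata

# Direction class per character: 'L' = left-to-right, 'R' = right-to-left,
# 'EN' = European number. A bidi renderer starts from these classes.
print(unicodedata.bidirectional('A'))       # -> L
print(unicodedata.bidirectional('\u05d0'))  # -> R  (HEBREW LETTER ALEF)
print(unicodedata.bidirectional('1'))       # -> EN
```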
Regards, Martin From ncoghlan at gmail.com Wed Oct 26 00:20:50 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 26 Oct 2005 08:20:50 +1000 Subject: [Python-Dev] PEP 343 - multiple context managers in one statement In-Reply-To: <79990c6b0510251340s6adb7fbcpac5247886a171c3f@mail.gmail.com> References: <79990c6b0510251340s6adb7fbcpac5247886a171c3f@mail.gmail.com> Message-ID: <435EAFC2.6020305@gmail.com> Paul Moore wrote: > I have a deep suspicion that this has been done to death already, but > my searching ability isn't up to finding the reference. So I'll simply > ask the question, and not offer a long discussion: > > Has the option of letting the with statement admit multiple context > managers been considered (and presumably rejected)? > > I'm thinking of > > with expr1, expr2, expr3: > # whatever Not rejected - deliberately left as a future option (this is the reason why the RHS of an as clause has to be parenthesised if you want tuple unpacking). > In some ways, this doesn't even need an extension to the PEP - giving > tuples suitable __enter__ and __exit__ methods would do it. Or, I > suppose a user-defined manager which combined a list of others:
>
> class combining:
>     def __init__(self, *mgrs):
>         self.mgrs = mgrs
>     def __with__(self):
>         return self
>     def __enter__(self):
>         return tuple(mgr.__enter__() for mgr in self.mgrs)
>     def __exit__(self, type, value, tb):
>         # first in, last out
>         for mgr in reversed(self.mgrs):
>             mgr.__exit__(type, value, tb)
>
> Would that be worth using as an example in the PEP? The issue with that implementation is that the semantics are wrong - it doesn't actually mirror *nested* with statements. If one of the later __enter__ methods, or one of the first-executed __exit__ methods throws an exception, there are a lot of __exit__ methods that get skipped.
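[A quick editorial demonstration of that failure mode; the `Tracker` class and its names are invented for illustration. Entering all managers eagerly means a failing `__enter__` skips cleanup for everything entered before it:]

```python
# Hypothetical manager that records its enter/exit calls.
class Tracker:
    log = []
    def __init__(self, name, fail_on_enter=False):
        self.name = name
        self.fail_on_enter = fail_on_enter
    def __enter__(self):
        if self.fail_on_enter:
            raise RuntimeError(self.name)
        Tracker.log.append(("enter", self.name))
        return self
    def __exit__(self, *exc):
        Tracker.log.append(("exit", self.name))

def naive_enter_all(mgrs):
    # mirrors the 'combining' sketch: enter everything, no cleanup on error
    return tuple(m.__enter__() for m in mgrs)

mgrs = [Tracker("a"), Tracker("b", fail_on_enter=True)]
try:
    naive_enter_all(mgrs)
except RuntimeError:
    pass

# "a" was entered but never exited -- its __exit__ was skipped
print(Tracker.log)  # [('enter', 'a')]
```

Real nested with statements would have run `a.__exit__` while unwinding.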
Getting it right is more complicated (and this probably still has mistakes):

import sys
from collections import deque

class nested(object):
    def __init__(self, *mgrs):
        self.mgrs = mgrs
        self.entered = None
    def __context__(self):
        return self
    def __enter__(self):
        self.entered = deque()
        vars = []
        try:
            for mgr in self.mgrs:
                var = mgr.__enter__()
                self.entered.appendleft(mgr)
                vars.append(var)
        except:
            self.__exit__(*sys.exc_info())
            raise
        return vars
    def __exit__(self, *exc_info):
        # first in, last out
        # Behave like nested with statements
        ex = exc_info
        for mgr in self.entered:
            try:
                mgr.__exit__(*ex)
            except:
                ex = sys.exc_info()
        if ex is not exc_info:
            raise ex[0], ex[1], ex[2]

> PS The signature of __with__ in example 4 in the PEP is wrong - it has > an incorrect "lock" parameter. Thanks - I'll fix that when I incorporate the resolutions of the open issues (which will be post the SVN migration). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.blogspot.com From jcarlson at uci.edu Wed Oct 26 01:59:51 2005 From: jcarlson at uci.edu (Josiah Carlson) Date: Tue, 25 Oct 2005 16:59:51 -0700 Subject: [Python-Dev] Divorcing str and unicode (no more implicitconversions). In-Reply-To: <435EA9CF.6060305@v.loewis.de> References: <20051025142245.3930.JCARLSON@uci.edu> <435EA9CF.6060305@v.loewis.de> Message-ID: <20051025164015.3942.JCARLSON@uci.edu> "Martin v. Löwis" wrote: > > Josiah Carlson wrote: > > And how users could say, "name error? But I typed in window.draw(PEN) as > > I was told to, and it didn't work!" > > Ah, so the "serious issues" you are talking about are not security > issues, but usability issues.
Indeed, it was a misunderstanding, as the email stated: I did not mean to imply that I was concerned about the security implications of inserting arbitrary identifiers in Python (I was mentioning the web browser case for an example of how such characters have been confusing previously), I am concerned about confusion involved with using: [glyphs which are identical] > I don't think extending the range of acceptable characters will > cause any additional confusion. Users are already getting "surprising" > NameErrors/AttributeErrors in the following cases: > - they just misspell the identifier, and then, when the error message > is printed, fail to recognize the difference, as they read over the > typo just like they read over it when mistyping it in the first place. In this case it's not just a misreading, the characters look identical! When is an 'E' not an 'E'? When it is an Epsilon or Ie. Saying what characters will or will not be used as identifiers, when those characters are keys on a keyboard of a specific type, is pretty presumptuous. > - they run into confusions with different things having the same names > in different contexts. For example, they wonder why they get TypeError > for passing the wrong number of arguments to a function, when the > call matches exactly what the source code in front of them tells > them - only that they were calling a different function which just > happened to have the same name. Right, and users should be reading the documentation for the functions and methods they are calling. > In the light of these common mistakes, your example with an identifier > named PEN, where the "P" might be a cyrillic letter or the E a greek one > is just made up: For window.draw, people will readily understand that > they are supposed to use Latin letters. More generally, they will know > what script to use just from looking at the identifier. 
Sure, that example was made up, but there are words which have been stolen from various languages by english, and you are discounting the case of single-letter temporary variables. Saying what will and won't happen over the course of using unicode identifiers is quite the prediction. > > Identically drawn glyphs are a problem, and pretending that they aren't > > a problem, doesn't make it so. Right now, all possible name glyphs are > > visually distinct > > Not at all: Just compare Fool and Foo1 (and perhaps FooI) > > In the font in which I'm typing this, these are slightly different - but > there are fonts in which the difference is really difficult to > recognize. Indeed, they are similar, but_ different_ in my font as well. The trick is that the glyphs are not different in the case of certain greek or cyrillic letters. They don't just /look/ similar they /are identical/. - Josiah From guido at python.org Wed Oct 26 02:18:37 2005 From: guido at python.org (Guido van Rossum) Date: Tue, 25 Oct 2005 17:18:37 -0700 Subject: [Python-Dev] Divorcing str and unicode (no more implicitconversions). In-Reply-To: <20051025164015.3942.JCARLSON@uci.edu> References: <20051025142245.3930.JCARLSON@uci.edu> <435EA9CF.6060305@v.loewis.de> <20051025164015.3942.JCARLSON@uci.edu> Message-ID: On 10/25/05, Josiah Carlson wrote: > Indeed, they are similar, but_ different_ in my font as well. The trick > is that the glyphs are not different in the case of certain greek or > cyrillic letters. They don't just /look/ similar they /are identical/. Well, in the font I'm using to read this email, I and l are /identical/. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From jcarlson at uci.edu Wed Oct 26 02:35:37 2005 From: jcarlson at uci.edu (Josiah Carlson) Date: Tue, 25 Oct 2005 17:35:37 -0700 Subject: [Python-Dev] Divorcing str and unicode (no more implicitconversions). 
In-Reply-To: References: <20051025164015.3942.JCARLSON@uci.edu> Message-ID: <20051025173215.3951.JCARLSON@uci.edu> Guido van Rossum wrote: > > On 10/25/05, Josiah Carlson wrote: > > Indeed, they are similar, but_ different_ in my font as well. The trick > > is that the glyphs are not different in the case of certain greek or > > cyrillic letters. They don't just /look/ similar they /are identical/. > > Well, in the font I'm using to read this email, I and l are /identical/. In all fonts I've seen, E/Epsilon/Ie are /always identical/. - Josiah From nyamatongwe at gmail.com Wed Oct 26 02:49:53 2005 From: nyamatongwe at gmail.com (Neil Hodgson) Date: Wed, 26 Oct 2005 10:49:53 +1000 Subject: [Python-Dev] Divorcing str and unicode (no more implicitconversions). In-Reply-To: <435EAD98.6000401@v.loewis.de> References: <20051025120919.3927.JCARLSON@uci.edu> <435E9E18.7090502@v.loewis.de> <20051025142245.3930.JCARLSON@uci.edu> <435EAD98.6000401@v.loewis.de> Message-ID: <50862ebd0510251749y2b8137cy672145f04994e854@mail.gmail.com> Martin v. L?wis: > This aspect of rendering is often not implemented, though. Web browsers > do it correctly, see > ... > GUI frameworks sometimes do it correctly, sometimes don't; most > notably, Tk has no good support for RTL text. Scintilla does a rough job with this. RTL text is displayed correctly as the underlying platform libraries (Windows or GTK+/Pango) handle this aspect when called to draw text. However editing is not performed correctly with the caret not being placed correctly within RTL text and other visual glitches. There is interest in the area and even a funding proposal this week. Neil From greg.ewing at canterbury.ac.nz Wed Oct 26 02:51:40 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 26 Oct 2005 13:51:40 +1300 Subject: [Python-Dev] Divorcing str and unicode (no more implicitconversions). 
In-Reply-To: <435EA9CF.6060305@v.loewis.de> References: <20051025120919.3927.JCARLSON@uci.edu> <435E9E18.7090502@v.loewis.de> <20051025142245.3930.JCARLSON@uci.edu> <435EA9CF.6060305@v.loewis.de> Message-ID: <435ED31C.3010800@canterbury.ac.nz> Martin v. L?wis wrote: > For window.draw, people will readily understand that > they are supposed to use Latin letters. More generally, they will know > what script to use just from looking at the identifier. Would it help if an identifier were required to be made up of letters from the same alphabet, e.g. all Latin or all Greek or all Cyrillic, but not a mixture. Then you'd get an immediate error if you accidentally slipped in a letter from the wrong alphabet. Greg From anthony at interlink.com.au Wed Oct 26 05:25:19 2005 From: anthony at interlink.com.au (Anthony Baxter) Date: Wed, 26 Oct 2005 13:25:19 +1000 Subject: [Python-Dev] make testall hanging on HEAD? Message-ID: <200510261325.19363.anthony@interlink.com.au> At the moment, I see make testall hanging in test_timeout. In addition, test_curses is leaving the tty in a hosed state: test_crypt test_csv test_curses test_datetime test_dbm test_decimal test_decorators test_deque test_descr This is on Ubuntu Breezy, [GCC 4.0.2 20050808 (prerelease) (Ubuntu 4.0.1-4ubuntu9)] on linux2 Anyone else see this? -- Anthony Baxter It's never too late to have a happy childhood. From jepler at unpythonic.net Wed Oct 26 06:47:13 2005 From: jepler at unpythonic.net (jepler@unpythonic.net) Date: Tue, 25 Oct 2005 23:47:13 -0500 Subject: [Python-Dev] make testall hanging on HEAD? In-Reply-To: <200510261325.19363.anthony@interlink.com.au> References: <200510261325.19363.anthony@interlink.com.au> Message-ID: <20051026044713.GA1460@unpythonic.net> ditto on the "curses" problem, but test_timeout completed just fine, at least the first time around. 
fedora core 4, x86_64 [GCC 4.0.1 20050727 (Red Hat 4.0.1-5)] on linux2 Jeff -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: not available Url : http://mail.python.org/pipermail/python-dev/attachments/20051025/36f54361/attachment-0001.pgp From nyamatongwe at gmail.com Wed Oct 26 07:49:39 2005 From: nyamatongwe at gmail.com (Neil Hodgson) Date: Wed, 26 Oct 2005 15:49:39 +1000 Subject: [Python-Dev] Divorcing str and unicode (no more implicit conversions). In-Reply-To: <435DEEF6.5020603@egenix.com> References: <5.0.2.1.1.20051022120117.02b58da0@mail.oz.net> <50862ebd0510232041j19b6b3achd0aecb072f9db148@mail.gmail.com> <435C9DFC.8020501@egenix.com> <50862ebd0510241613w2b5da91cqbcaf4f6157ae338e@mail.gmail.com> <435DEEF6.5020603@egenix.com> Message-ID: <50862ebd0510252249r24a73350l2c692a0e4c45d2fc@mail.gmail.com> M.-A. Lemburg: > You mean a slice that slices out the next ? Yes. > This sounds a lot like you'd want iterators for the various > index types. Should be possible to implement on top of the > proposed APIs, e.g. itergraphemes(u), itercodepoints(u), etc. Iterators may be helpful, but can also be too restrictive when the processing is not completely iterative, such as peeking ahead or looking behind to wrap at a word boundary in the display example. It was more that there may be less scope for error if there was a move away from indexes to slices. The PEP provides ways to specify what you want to examine or modify but it looks to me like returning indexes will see code repetition or additional variables with an increase in fragility. > Note that what most people refer to as "character" is a > grapheme in Unicode speak. A grapheme-oriented string type may be worthwhile although you'd probably have to choose a particular normalisation form to ease processing.
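[Editorial sketch making the normalisation point concrete, not from the original message: one user-perceived character (grapheme) can be one code point or two, depending on the form chosen:]

```python
import unicodedata

s = 'e\u0301'                       # 'e' + COMBINING ACUTE ACCENT: one grapheme
nfc = unicodedata.normalize('NFC', s)

print(len(s), len(nfc))             # 2 1 -- same grapheme, different lengths
print(nfc == '\u00e9')              # True: precomposed LATIN SMALL LETTER E WITH ACUTE
```

A grapheme-oriented type that did not pin down a normalisation form would see these two spellings as different strings.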
> Given that interpretation, > "breaking" Unicode "characters" is something you won't > ever work around with by using larger code units such > as UCS4 compatible ones. I still think we can reduce the scope for errors. > Furthermore, you should also note that surrogates (two > code units encoding one code point) are part of Unicode > life. While you don't need them when storing Unicode > in UCS4 code units, they can still be part of the > Unicode data and the programmer has to be aware of > these. Many programmers can and will ignore surrogates. One day that may bite them but we can't close off text processing to those who have no idea of what surrogates are, or directional marks, or that sorting is locale dependent, or have no understanding of the difference between NFC and NFKD normalization forms. Neil From martin at v.loewis.de Wed Oct 26 08:09:33 2005 From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=) Date: Wed, 26 Oct 2005 08:09:33 +0200 Subject: [Python-Dev] Divorcing str and unicode (no more implicitconversions). In-Reply-To: <20051025164015.3942.JCARLSON@uci.edu> References: <20051025142245.3930.JCARLSON@uci.edu> <435EA9CF.6060305@v.loewis.de> <20051025164015.3942.JCARLSON@uci.edu> Message-ID: <435F1D9D.8060001@v.loewis.de> Josiah Carlson wrote: > In this case it's not just a misreading, the characters look identical! > When is an 'E' not an 'E'? When it is an Epsilon or Ie. Saying what > characters will or will not be used as identifiers, when those > characters are keys on a keyboard of a specific type, is pretty > presumptuous. Why is that rude and disrespectful? I'm certainly respecting developers who want to use their scripts for identifiers, or else I would not have suggested that they could do so. However, from the experience with my own language, and the three or so foreign languages I know, I can tell you that people would normally don't mix identifiers of different scripts. 
> Sure, that example was made up, but there are words which have been > stolen from various languages by english, and you are discounting the > case of single-letter temporary variables. Saying what will and won't > happen over the course of using unicode identifiers is quite the > prediction. Sure, people can make mistakes. They get an error, and then will need to find the cause of the problem. Sometimes, this will be easy, and sometimes, it will not. > Indeed, they are similar, but_ different_ in my font as well. The trick > is that the glyphs are not different in the case of certain greek or > cyrillic letters. They don't just /look/ similar they /are identical/. This string: "EΕ" is the LATIN CAPITAL LETTER E, followed by the GREEK CAPITAL LETTER EPSILON. In the font my email composer uses, the E is slightly larger than the Epsilon - so there /is/ a visual difference. But even if there isn't: if this was a frequent problem, the name error could include an alternative representation (say, with Unicode ordinals for non-ASCII characters) which would give an easy visual clue. I still doubt that this is a frequent problem, and I don't see any better grounds for claiming that it is than for claiming that it is not. Regards, Martin From eric.nieuwland at xs4all.nl Wed Oct 26 08:17:01 2005 From: eric.nieuwland at xs4all.nl (Eric Nieuwland) Date: Wed, 26 Oct 2005 08:17:01 +0200 Subject: [Python-Dev] Proposed resolutions for open PEP 343 issues In-Reply-To: References: <435A4598.3060403@iinet.net.au> <5.1.1.6.0.20051023124842.01af9078@mail.telecommunity.com> <435CCB1C.4030108@gmail.com> <435E40E5.7070703@gmail.com> <103348c72cae040229895842cb3c0cdc@xs4all.nl> <0c1f3f308e9c605ef4687689c860913e@xs4all.nl> Message-ID: Guido van Rossum wrote: > [Eric "are all your pets called Eric?" Nieuwland] >>>> Hmmm... Would it be reasonable to introduce a ProtocolError >>>> exception? > > [Guido] >>> And which perceived problem would that solve?
> > [Eric] >> It was meant to be a bit more informative about what is wrong. >> >> ProtocolError: lacks __enter__ or __exit__ > > That's exactly what I'm trying to avoid. :) > > I find "AttributeError: __exit__" just as informative. In either case, > if you know what __exit__ means, you'll know what you did wrong. And > if you don't know what it means, you'll have to look it up anyway. And > searching for ProtocolError doesn't do you any good -- you'll have to > learn about what __exit__ is and where it is required. I see. Then why don't we unify *Error into Error? Just read the message and know what it means. And we could then drop the burden of exception classes and only use the message. A sense of deja-vu comes over me somehow ;-) From martin at v.loewis.de Wed Oct 26 08:22:41 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 26 Oct 2005 08:22:41 +0200 Subject: [Python-Dev] Divorcing str and unicode (no more implicitconversions). In-Reply-To: <435ED31C.3010800@canterbury.ac.nz> References: <20051025120919.3927.JCARLSON@uci.edu> <435E9E18.7090502@v.loewis.de> <20051025142245.3930.JCARLSON@uci.edu> <435EA9CF.6060305@v.loewis.de> <435ED31C.3010800@canterbury.ac.nz> Message-ID: <435F20B1.8080803@v.loewis.de> Greg Ewing wrote: > Would it help if an identifier were required to be > made up of letters from the same alphabet, e.g. all > Latin or all Greek or all Cyrillic, but not a mixture. > Then you'd get an immediate error if you accidentally > slipped in a letter from the wrong alphabet. Not in the literal sense: you certainly want to allow "latin" digits in, say, a cyrillic identifier.See http://www.unicode.org/reports/tr31/ for what the Unicode consortium recommends to do. In addition to the strict specification, they envision usage guidelines. This seems Pythonic: just because you could potentially shoot yourself in the foot doesn't mean it should be banned from the language. 
IOW, whether it would help largely depends on whether the problem is real in the first place. Just because you *can* come up with look-alike identifiers doesn't mean that people will use them, or that they will mistake the scripts (except for deliberately doing so, of course). Regards, Martin From stephen at xemacs.org Wed Oct 26 08:40:55 2005 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Wed, 26 Oct 2005 15:40:55 +0900 Subject: [Python-Dev] Divorcing str and unicode (no more implicitconversions). In-Reply-To: <20051025164015.3942.JCARLSON@uci.edu> (Josiah Carlson's message of "Tue, 25 Oct 2005 16:59:51 -0700") References: <20051025142245.3930.JCARLSON@uci.edu> <435EA9CF.6060305@v.loewis.de> <20051025164015.3942.JCARLSON@uci.edu> Message-ID: <871x28vkzs.fsf@tleepslib.sk.tsukuba.ac.jp> >>>>> "Josiah" == Josiah Carlson writes: Josiah> Indeed, they are similar, but_ different_ in my font as Josiah> well. The trick is that the glyphs are not different in Josiah> the case of certain greek or cyrillic letters. They don't Josiah> just /look/ similar they /are identical/. But these problems are going to arise in _any_ multilingual context; it's not at all specific to identifiers. It's just that computers lexing identifiers are kinda picky about those things compared to humans. I think you can reasonably classify it as a new breed of typo, and develop UIs to deal with it in that way. To handle cases where glyphs are (nearly) identical, UIs that visually flag "foreign" characters, at least in contexts where cross-block punning is unacceptable, will be developed, and users will learn to pay attention to those flags. -- School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp University of Tsukuba Tennodai 1-1-1 Tsukuba 305-8573 JAPAN Ask not how you can "do" free software business; ask what your business can "do for" free software. 
From walter at livinglogic.de Wed Oct 26 09:31:47 2005 From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=) Date: Wed, 26 Oct 2005 09:31:47 +0200 Subject: [Python-Dev] Divorcing str and unicode (no more implicitconversions). In-Reply-To: <20051025142245.3930.JCARLSON@uci.edu> References: <20051025120919.3927.JCARLSON@uci.edu> <435E9E18.7090502@v.loewis.de> <20051025142245.3930.JCARLSON@uci.edu> Message-ID: Am 25.10.2005 um 23:40 schrieb Josiah Carlson: > [...] > Identically drawn glyphs are a problem, and pretending that they > aren't > a problem, doesn't make it so. Right now, all possible name glyphs > are > visually distinct, which would not be the case if any unicode > character > could be used as a name (except for numerals). Speaking of which, > would > we then be offering support for arabic/indic numeric literals, and/or > support it in int()/float()? It's already supported in int() and float() >>> int(u"\u136c\u2082") 42 >>> float(u"\u0664\u09e8") 42.0 But not as literals: # -*- coding: unicode-escape -*- print \u136c\u2082 This gives (on the Mac): File "encoding.py", line 3 print ?? ^ SyntaxError: invalid syntax > [...] Bye, Walter D?rwald From mal at egenix.com Wed Oct 26 11:50:01 2005 From: mal at egenix.com (M.-A. Lemburg) Date: Wed, 26 Oct 2005 11:50:01 +0200 Subject: [Python-Dev] Divorcing str and unicode (no more implicitconversions). In-Reply-To: <435EA9F6.3030702@v.loewis.de> References: <435E7794.2030505@v.loewis.de> <20051025120919.3927.JCARLSON@uci.edu> <435EA7EB.90100@egenix.com> <435EA9F6.3030702@v.loewis.de> Message-ID: <435F5149.4060804@egenix.com> Martin v. L?wis wrote: > M.-A. Lemburg wrote: > >>A few years ago we had a discussion about this on python-dev >>and agreed to stick with ASCII identifiers for Python. I still >>think that's the right way to go. > > I don't think there ever was such an agreement. 
You even argued against having non-ASCII identifiers: http://mail.python.org/pipermail/python-list/2002-May/102936.html and I agree with you on most of the points you make in that posting: * Unicode identifiers are going to introduce massive code breakage - just think of all the tools people use to manipulate Python code today; I'm quite sure that most of it will fail in one way or another if you present it with Unicode literals such as in "zähler += 1". * People don't seem very interested in using Unicode identifiers, e.g. http://mail.python.org/pipermail/i18n-sig/2001-February/000828.html most of the few who did comment, said they'd rather have ASCII identifiers, e.g. http://mail.python.org/pipermail/python-list/2002-May/104050.html Do you really think that it will help with code readability if programmers are allowed to use native scripts for their identifiers ? I think this goes beyond just visual aspects of being able to distinguish graphemes: If you are told to debug a program written by say a Japanese programmer using Japanese identifiers you are going to have a really hard time. Integrating such code into other applications will be even harder, since you'd be forced to use his Japanese class names in your application. This doesn't only introduce problems with being able to enter the Japanese identifiers, it will also cause your application to suddenly contain identifiers in Japanese even though that's not your native script. I think source code encodings provide an ideal way to have comments written in native scripts - and people use that a lot. However, keeping the program code itself in plain ASCII makes it far more readable and reusable across locales. Something that's important in this globalized world. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Oct 26 2005) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... 
http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From paolo.veronelli at gmail.com Mon Oct 24 17:30:36 2005 From: paolo.veronelli at gmail.com (Paolino) Date: Mon, 24 Oct 2005 17:30:36 +0200 Subject: [Python-Dev] PEP 351, the freeze protocol In-Reply-To: <435CF941.6070104@libero.it> References: <1130107429.11268.40.camel@geddy.wooz.org> <435CF941.6070104@libero.it> Message-ID: <435CFE1C.9050809@gmail.com> Paolino wrote: > Is __hash__=id inside a class enough to use a set (sets.Set before 2.5) > derived class instance as a key to a mapping? It is __hash__=lambda self:id(self) that is terribly slow, it needs a faster way to state that, to let them be useful as keys to a mapping, as all set operations will pipe into the mechanism. In my application that function is eating time like hell, and will keep on doing it even with the PEP proposed. OT probably. Regards Paolino From bokr at oz.net Tue Oct 25 02:56:40 2005 From: bokr at oz.net (Bengt Richter) Date: Mon, 24 Oct 2005 17:56:40 -0700 Subject: [Python-Dev] Divorcing str and unicode (no more implicit conversions). In-Reply-To: <435CACAD.9070106@egenix.com> References: <5.0.2.1.1.20051022120117.02b58da0@mail.oz.net> <5.0.2.1.1.20051022120117.02b58da0@mail.oz.net> Message-ID: <5.0.2.1.1.20051024133833.02b81cb0@mail.oz.net> At 11:43 2005-10-24 +0200, M.-A. Lemburg wrote: >Bengt Richter wrote: >> Please bear with me for a few paragraphs ;-) > >Please note that source code encoding doesn't really have >anything to do with the way the interpreter executes the >program - it's merely a way to tell the parser how to >convert string literals (currently only the Unicode ones) >into constant Unicode objects within the program text. >It's also a nice way to let other people know what kind of >encoding you used to write your comments ;-) > >Nothing more. 
I think somehow I didn't make things clear, sorry ;-) As I tried to show in the example of module_a.cs vs module_b.cs, the source encoding currently results in two different str-type strings representing the source _character_ sequence, which is the _same_ in both cases. To make it more clear, try the following little program (untested except on NT4 with Python 2.4b1 (#56, Nov 3 2004, 01:47:27) [GCC 3.2.3 (mingw special 20030504-1)] on win32 ;-): ----< t_srcenc.py >-------------------------------- import os def test(): open('module_a.py','wb').write( "# -*- coding: latin-1 -*-" + os.linesep + "cs = '\xfcber-cool'" + os.linesep) open('module_b.py','wb').write( "# -*- coding: utf-8 -*-" + os.linesep + "cs = '\xc3\xbcber-cool'" + os.linesep) # show that we have two modules differing only in encoding: print ''.join(line.decode('latin-1') for line in open('module_a.py')) print ''.join(line.decode('utf-8') for line in open('module_b.py')) # see how results are affected: import module_a, module_b print module_a.cs + ' =?= ' + module_b.cs print module_a.cs.decode('latin-1') + ' =?= ' + module_b.cs.decode('utf-8') if __name__ == '__main__': test() --------------------------------------------------- The result copied from NT4 console to clipboard and pasted into Eudora: __________________________________________________________ [17:39] C:\pywk\python-dev>py24 t_srcenc.py # -*- coding: latin-1 -*- cs = 'über-cool' # -*- coding: utf-8 -*- cs = 'über-cool' nber-cool =?= ++ber-cool über-cool =?= über-cool __________________________________________________________ (I'd say NT did the best it could, rendering the copied cp437 superscript n as the 'n' above, and the '++' coming from the cp437 box characters corresponding to the '\xc3\xbc'. Not sure how it will show on your screen, but try the program to see ;-) >Once a module is compiled, there's no distinction between >a module using the latin-1 source code encoding or one using >the utf-8 encoding. 
ISTM module_a.cs and module_b.cs can readily be distinguished after compilation, whereas the sources displayed according to their declared encodings as above (or as e.g. different editors using different native encoding might) cannot (other than the encoding cookie itself) ;-) Perhaps you meant something else? >Thanks, You're welcome. Regards, Bengt Richter From lucky1010_studies at yahoo.co.in Tue Oct 25 15:09:10 2005 From: lucky1010_studies at yahoo.co.in (Lucky Wankhede) Date: Tue, 25 Oct 2005 14:09:10 +0100 (BST) Subject: [Python-Dev] "? operator in python" Message-ID: <20051025130910.65658.qmail@web8602.mail.in.yahoo.com> Dear sir, I'm a student of the Computer Science Dept., University Of Pune (M.S.) (India). Sir, I have found that Python is about to have the feature of a "?" operator, the same as in the C language. Sir, not only I but our whole Dept. is waiting for it. Kindly provide me with the information on which version of Python will have that feature and when it is about to be released. Considering your best of sympathetic consideration. Hoping for early response. Thank You. Mr. Lucky R. Wankhede M.C.A., Ist, __________________________________________________________ Yahoo! India Matrimony: Find your partner now. Go to http://yahoo.shaadi.com From lucky1010_studies at yahoo.co.in Tue Oct 25 15:12:26 2005 From: lucky1010_studies at yahoo.co.in (Lucky Wankhede) Date: Tue, 25 Oct 2005 14:12:26 +0100 (BST) Subject: [Python-Dev] "? operator in python" Message-ID: <20051025131226.42537.qmail@web8603.mail.in.yahoo.com> Dear sir, I'm a student of the Computer Science Dept., University Of Pune (M.S.) (India). We are learning Python as a course for our semester, and found it not only useful but a heart-touching language. Sir, I have found that Python is going to have a new feature, the "?" operator, the same as in the C language. Kindly provide me with the information on which version of Python will have that feature and when it is about to be released. 
Considering your best of sympathetic consideration. Hoping for early response. Thank You. Mr. Lucky R. Wankhede M.C.A., Ist, Dept. Of Comp. Science, University of Pune, India. __________________________________________________________ Yahoo! India Matrimony: Find your partner now. Go to http://yahoo.shaadi.com From p.f.moore at gmail.com Wed Oct 26 14:59:34 2005 From: p.f.moore at gmail.com (Paul Moore) Date: Wed, 26 Oct 2005 13:59:34 +0100 Subject: [Python-Dev] PEP 343 - multiple context managers in one statement In-Reply-To: <435EAFC2.6020305@gmail.com> References: <79990c6b0510251340s6adb7fbcpac5247886a171c3f@mail.gmail.com> <435EAFC2.6020305@gmail.com> Message-ID: <79990c6b0510260559k1aaf7b40o3598001a7d79fcaa@mail.gmail.com> On 10/25/05, Nick Coghlan wrote: > Paul Moore wrote: [...] > > Has the option of letting the with statement admit multiple context > > managers been considered (and presumably rejected)? [...] > Not rejected - deliberately left as a future option (this is the reason why > the RHS of an as clause has to be parenthesised if you want tuple unpacking). Thanks. I now see that note in the PEP - apologies for missing it in the first instance. [...] > The issue with that implementation is that the semantics are wrong - it > doesn't actually mirror *nested* with statements. If one of the later > __enter__ methods, or one of the first-executed __exit__ methods throws an > exception, there are a lot of __exit__ methods that get skipped. > > Getting it right is more complicated (and this probably still has mistakes): Bah. You're right, of course (about it being more complicated - I can't see any mistakes :-)) I'd argue that precisely because a naive approach gets it wrong, having your version as an example in the PEP (and possibly the documentation, much like the itertools module has a recipes section) is that much more useful. Anyway, thanks for the help. Paul. 
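[For reference: the "mirror nested with statements" semantics Nick describes above — a failing later __enter__ must still run every earlier __exit__, innermost first — can be sketched with contextlib.ExitStack, which post-dates this thread. The manager names and log list are invented for illustration:]

```python
from contextlib import ExitStack, contextmanager

log = []

@contextmanager
def manager(name, fail_on_enter=False):
    # Record enters and exits so the unwinding order is visible.
    log.append(('enter', name))
    if fail_on_enter:
        raise RuntimeError(name)
    try:
        yield name
    finally:
        log.append(('exit', name))

# Equivalent to nested with statements: when c's __enter__ raises,
# b and a are still exited, in reverse order of entry.
try:
    with ExitStack() as stack:
        for cm in (manager('a'), manager('b'),
                   manager('c', fail_on_enter=True)):
            stack.enter_context(cm)
except RuntimeError:
    pass

print(log)
# [('enter', 'a'), ('enter', 'b'), ('enter', 'c'), ('exit', 'b'), ('exit', 'a')]
```

Note that 'c' never reaches its yield, so it gets no exit entry — exactly the behaviour a naive loop over managers would get wrong by skipping the exits of 'a' and 'b' too.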
From mcherm at mcherm.com Wed Oct 26 17:32:46 2005 From: mcherm at mcherm.com (Michael Chermside) Date: Wed, 26 Oct 2005 08:32:46 -0700 Subject: [Python-Dev] Proposed resolutions for open PEP 343 issues Message-ID: <20051026083246.6tg0gu2gafms8gkc@login.werra.lunarpages.com> Guido writes: > I find "AttributeError: __exit__" just as informative. Eric Nieuwland responds: > I see. Then why don't we unify *Error into Error? > Just read the message and know what it means. > And we could then drop the burden of exception classes and only use the > message. > A sense of deja-vu comes over me somehow ;-) The answer (and there _IS_ an answer) is that using different exception types allows the user some flexibility in CATCHING the exceptions. The discussion you have been following obscures that point somewhat because there's little meaningful difference between TypeError and AttributeError (at least in well-written code that doesn't have unnecessary typechecks in it). If there were a significant difference between TypeError and AttributeError then Nick and Guido would have immediately chosen the appropriate error type based on functionality rather than style, and there wouldn't have been any need for discussion. Oh yeah, and you can also put extra info into an exception object besides just the error message. (We don't do that as often as we should... it's a powerful technique.) -- Michael Chermside From jimjjewett at gmail.com Wed Oct 26 18:16:14 2005 From: jimjjewett at gmail.com (Jim Jewett) Date: Wed, 26 Oct 2005 12:16:14 -0400 Subject: [Python-Dev] Divorcing str and unicode (no more implicitconversions). Message-ID: Greg Ewing asked: > Would it help if an identifier were required to be > made up of letters from the same alphabet, e.g. all > Latin or all Greek or all Cyrillic, but not a mixture. Probably, yes, though there could still be problems mixing within a program. FWIW, the Opera web browser is already using a similar solution. 
Domain names are limited to Latin-1 *unless* the top-level registrar has a policy to prevent spoofing. -jJ From guido at python.org Wed Oct 26 18:21:55 2005 From: guido at python.org (Guido van Rossum) Date: Wed, 26 Oct 2005 09:21:55 -0700 Subject: [Python-Dev] "? operator in python" In-Reply-To: <20051025131226.42537.qmail@web8603.mail.in.yahoo.com> References: <20051025131226.42537.qmail@web8603.mail.in.yahoo.com> Message-ID: Dear Lucky, You are correct. Python 2.5 will have a conditional operator. The syntax will be different from C; it will look like this: (EXPR1 if TEST else EXPR2) (which is the equivalent of TEST?EXPR1:EXPR2 in C). For more information, see PEP 308 (http://www.python.org/peps/pep-0308.html). Python 2.5 will be released some time next year; we hope to have alphas available in the 2nd quarter. That's about as firm as we can currently be about the release date. Enjoy, --Guido van Rossum On 10/25/05, Lucky Wankhede wrote: > > > Dear sir, > > I'm a student of the Computer Science Dept., > University Of Pune (M.S.) (India). We are learning > Python as a course for our semester, and found it not > only useful but a heart-touching language. > > Sir, I have found that Python is going > to have a new feature, the "?" operator, the same as > in the C language. > > > Kindly provide me with the information on > which version of Python will have that > feature and when it is about to be released. > Considering your best of sympathetic > consideration. Hoping for early response. > > > > Thank You. > > > Mr. Lucky R. Wankhede > M.C.A., Ist, > Dept. Of Comp. > Science, > University of Pune, > India. 
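[A small runnable illustration of the PEP 308 syntax described above (the variable names are invented; works on Python 2.5 and later):]

```python
# PEP 308 conditional expression: EXPR1 if TEST else EXPR2
x = 5
parity = "odd" if x % 2 else "even"
print(parity)  # odd

# Equivalent to the C expression  x % 2 ? "odd" : "even".
# Only the selected branch is evaluated, so the division below
# is safe even when x could be zero:
safe = 1.0 / x if x != 0 else 0.0
print(safe)  # 0.2
```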
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From martin at v.loewis.de Wed Oct 26 19:02:22 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 26 Oct 2005 19:02:22 +0200 Subject: [Python-Dev] CVS is read-only Message-ID: <435FB69E.8090501@v.loewis.de> I just switched the repository to read-only mode, and removed the test subversion installation. I'll let you know when the conversion is complete. Regards, Martin From martin at v.loewis.de Wed Oct 26 19:32:52 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 26 Oct 2005 19:32:52 +0200 Subject: [Python-Dev] Divorcing str and unicode (no more implicitconversions). In-Reply-To: <435F5149.4060804@egenix.com> References: <435E7794.2030505@v.loewis.de> <20051025120919.3927.JCARLSON@uci.edu> <435EA7EB.90100@egenix.com> <435EA9F6.3030702@v.loewis.de> <435F5149.4060804@egenix.com> Message-ID: <435FBDC4.8030300@v.loewis.de> M.-A. Lemburg wrote: > You even argued against having non-ASCII identifiers: > > http://mail.python.org/pipermail/python-list/2002-May/102936.html I see :-) It seems I have changed my mind since then (which apparently predates PEP 263). One issue I apparently was worried about was the plan to use native-encoding byte strings for the identifiers; this I didn't like at all. > * Unicode identifiers are going to introduce massive > code breakage - just think of all the tools people use > to manipulate Python code today; I'm quite sure that > most of it will fail in one way or another if you present > it with Unicode literals such as in "zähler += 1". True. Today, I think I would be willing to accept the code breakage: these tools had quite some time to update themselves to PEP 263 (even though not all of them have done so yet); also, usage of the feature would only spread gradually. 
A failure to support the feature in the Python proper would be treated as a bug by us; how tool providers deal with the feature would be their choice. > * People don't seem very interested in using Unicode > identifiers, e.g. > > http://mail.python.org/pipermail/i18n-sig/2001-February/000828.html True. However, I also suspect that lack of tool support contributes to that. For the specific case of Java, there is no notion of source encoding, which makes Unicode identifiers really tedious to use. If it were really easy to use, I assume people would actually use it - at least in some of the contexts, like teaching, where Python is also widely used. > Do you really think that it will help with code readability > if programmers are allowed to use native scripts for their > identifiers ? Yes, I do - for some groups of users. Of course, code sharing would be more difficult, and there certainly should be a policy to use only ASCII in the standard library. But within local groups, users would find understanding code easier if they knew what the identifiers actually meant. > If you are told to debug a program > written by say a Japanese programmer using Japanese identifiers > you are going to have a really hard time. Integrating such > code into other applications will be even harder, since you'd > be forced to use his Japanese class names in your application. Certainly, yes. There is a trade-off: you can make it easier for some people to read and write code if they can use their native script; at the same time, it would be harder for others to read and modify it. It's a policy decision whether you use English identifiers or not - it shouldn't be a technical decision (as it currently is). > I think source code encodings provide an ideal way to > have comments written in native scripts - and people > use that a lot. However, keeping the program code itself > in plain ASCII makes it far more readable and reusable > across locales. 
Something that's important in this globalized world. Certainly. However, some programs don't need to live in a globalized world - e.g. if they are homework in a school. Within a locale, using native scripts would make the program more readable. Regards, Martin From jcarlson at uci.edu Wed Oct 26 20:33:14 2005 From: jcarlson at uci.edu (Josiah Carlson) Date: Wed, 26 Oct 2005 11:33:14 -0700 Subject: [Python-Dev] Divorcing str and unicode (no more implicitconversions). In-Reply-To: <435FBDC4.8030300@v.loewis.de> References: <435F5149.4060804@egenix.com> <435FBDC4.8030300@v.loewis.de> Message-ID: <20051026105934.3977.JCARLSON@uci.edu> "Martin v. Löwis" wrote: > > M.-A. Lemburg wrote: > > You even argued against having non-ASCII identifiers: > > > > http://mail.python.org/pipermail/python-list/2002-May/102936.html > > > > Do you really think that it will help with code readability > > if programmers are allowed to use native scripts for their > > identifiers ? > > Yes, I do - for some groups of users. Of course, code sharing > would be more difficult, and there certainly should be a policy > to use only ASCII in the standard library. But within local > groups, users would find understanding code easier if they > knew what the identifiers actually meant. According to wikipedia (http://en.wikipedia.org/wiki/Latin_alphabet), various languages have adopted a transliteration of their language and/or former alphabets into latin. They don't purport to know all of the reasons why, and I'm not going to speculate. Whether or not more languages start using the latin alphabet is a good question. Basing judgement on history and likely globalization, it is only a matter of time before basically all languages have a transcription into the latin alphabet that is taught to all (unless China takes over the world). 
- Josiah From jcarlson at uci.edu Wed Oct 26 20:40:14 2005 From: jcarlson at uci.edu (Josiah Carlson) Date: Wed, 26 Oct 2005 11:40:14 -0700 Subject: [Python-Dev] Divorcing str and unicode (no more implicitconversions). In-Reply-To: <435F1D9D.8060001@v.loewis.de> References: <20051025164015.3942.JCARLSON@uci.edu> <435F1D9D.8060001@v.loewis.de> Message-ID: <20051026002535.395E.JCARLSON@uci.edu> "Martin v. Löwis" wrote: > > Josiah Carlson wrote: > > In this case it's not just a misreading, the characters look identical! > > When is an 'E' not an 'E'? When it is an Epsilon or Ie. Saying what > > characters will or will not be used as identifiers, when those > > characters are keys on a keyboard of a specific type, is pretty > > presumptuous. > > Why is that rude and disrespectful? I'm certainly respecting developers > who want to use their scripts for identifiers, or else I would not have > suggested that they could do so. I never said rude, I said presumptuous. "Going beyond what is right or proper; excessively forward." (according to dictionary.com, the OED has a similar definition). I was trying to say that in stating that users wouldn't be using keys on their keyboard in their natural language when also using English characters, that you were assuming a bit about their usage patterns that you perhaps shouldn't. I certainly could also be presumptuous in stating that users may very well mix certain languages, but it seems to be more likely given keywords and the standard library using the latin alphabet. > > Indeed, they are similar, but _different_ in my font as well. The trick > > is that the glyphs are not different in the case of certain greek or > > cyrillic letters. They don't just /look/ similar, they /are identical/. > > This string: "EΕ" is the LATIN CAPITAL LETTER E, followed by the GREEK > CAPITAL LETTER EPSILON. In the font my email composer uses, the E is > slightly larger than the Epsilon - so there /is/ a visual difference. 
My email client doesn't handle unicode, but a quick check by swapping fonts in a word processor shows that at least on my platform, all three are the same glyph (same size, shape, ...) for all fixed-width fonts. If a platform distinguishes all three, then one should consider one's platform lucky. Not all platforms and/or preferred fonts of users are. > But even if there isn't: if this was a frequent problem, the name > error could include an alternative representation (say, with Unicode > ordinals for non-ASCII characters) which would give an easy visual > clue. It would offer a great cue, but I'm not sure if it is possible. I think that it sounds like an ugly discussion of stdout/err encodings and exception handling machinery that I don't want to be a part of. > I still doubt that this is a frequent problem, and I don't see any > better grounds for claiming that it is than for claiming that it > is not. Whether or not it is frequent will depend on the prevalence of desire to use those characters. While I don't think that such uses will be as common as using 'klass' when passing a class, I do think that it will result in more than a few sf bug reports. I also share Marc-Andre Lemburg's concerns about the understandability of code written in Kanji, Hebrew, Arabic, etc., at least for those who have not memorized the entirety of those alphabets. - Josiah From martin at v.loewis.de Wed Oct 26 21:39:31 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 26 Oct 2005 21:39:31 +0200 Subject: [Python-Dev] Divorcing str and unicode (no more implicitconversions). 
In-Reply-To: <20051026105934.3977.JCARLSON@uci.edu> References: <435F5149.4060804@egenix.com> <435FBDC4.8030300@v.loewis.de> <20051026105934.3977.JCARLSON@uci.edu> Message-ID: <435FDB73.5080703@v.loewis.de> Josiah Carlson wrote: > According to wikipedia (http://en.wikipedia.org/wiki/Latin_alphabet), > various languages have adopted a transliteration of their language > and/or former alphabets into latin. They don't purport to know all of > the reasons why, and I'm not going to speculate. > > Whether or not more languages start using the latin alphabet is a good > question. Basing judgement on history and likely globalization, it is > only a matter of time before basically all languages have a > transcription into the latin alphabet that is taught to all (unless > China takes over the world). That is a very U.S. centric view. I don't share it, but I think it is pointless to argue against it. Regards, Martin From ejones at uwaterloo.ca Thu Oct 27 02:02:54 2005 From: ejones at uwaterloo.ca (Evan Jones) Date: Wed, 26 Oct 2005 20:02:54 -0400 Subject: [Python-Dev] Parser and Runtime: Divorced! Message-ID: <03b7f74aebe5c6249a8bb00ac17d1952@uwaterloo.ca> After a few hours of tedious and frustrating hacking I've managed to separate the Python abstract syntax tree parser from the rest of Python itself. This could be useful for people who may wish to build Python tools without Python, or tools in C/C++. In the process of doing this, I came across a comment mentioning that it would be desirable to separate the parser. Is there any interest in doing this? I now have a vague idea about how to do this. Of course, there is no point in making changes like this unless there is some tangible benefit. I will make my ugly hack available once I have polished it a little bit more. It involved hacking header files to provide a "re-implementation" of the pieces of Python that the parser needs (PyObject, PyString, and PyInt). 
It likely is a bit buggy, and it doesn't support all the types (notably, it is missing support for Unicode, Longs, Floats, and Complex), but it works well enough to get the AST from a few simple strings, which is what I wanted. Evan Jones -- Evan Jones http://evanjones.ca/ From greg.ewing at canterbury.ac.nz Thu Oct 27 04:14:04 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 27 Oct 2005 15:14:04 +1300 Subject: [Python-Dev] Divorcing str and unicode (no more implicitconversions). In-Reply-To: <435F5149.4060804@egenix.com> References: <435E7794.2030505@v.loewis.de> <20051025120919.3927.JCARLSON@uci.edu> <435EA7EB.90100@egenix.com> <435EA9F6.3030702@v.loewis.de> <435F5149.4060804@egenix.com> Message-ID: <436037EC.7050308@canterbury.ac.nz> M.-A. Lemburg wrote: > If you are told to debug a program > written by say a Japanese programmer using Japanese identifiers > you are going to have a really hard time. Or you could look upon it as an opportunity to broaden your mental horizons by learning some Japanese. :-) -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg.ewing at canterbury.ac.nz +--------------------------------------+ From greg.ewing at canterbury.ac.nz Thu Oct 27 04:14:13 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 27 Oct 2005 15:14:13 +1300 Subject: [Python-Dev] Divorcing str and unicode (no more implicitconversions). In-Reply-To: <435F20B1.8080803@v.loewis.de> References: <20051025120919.3927.JCARLSON@uci.edu> <435E9E18.7090502@v.loewis.de> <20051025142245.3930.JCARLSON@uci.edu> <435EA9CF.6060305@v.loewis.de> <435ED31C.3010800@canterbury.ac.nz> <435F20B1.8080803@v.loewis.de> Message-ID: <436037F5.8050501@canterbury.ac.nz> Martin v. Löwis wrote: > Not in the literal sense: you certainly want to allow > "latin" digits in, say, a cyrillic identifier. 
Yes, by "alphabet" I really only meant the letters, although you might want to apply the same idea to clusters of digits within an identifier, depending on how potentially confusable the various sets of digits are -- I'm not familiar enough with alternative digit sets to know whether that would be a problem. > Just because > you *can* come up with look-alike identifiers doesn't > mean that people will use them, or that they will mistake > the scripts (except for deliberately doing so, of > course). I still think this is a much worse potential problem than that of "l" vs "1", etc. It's reasonable to adopt the practice of never using "l" as a single letter identifier, for example. But it would be unreasonable to ban the use of "E" as an identifier on the grounds that someone somewhere might confuse it with a capital epsilon. An alternative would be to identify such confusable letters in the various alphabets and define them to be equivalent. And beyond the issue of alphabets there's also the question of whether accented characters should be considered distinct. I can see quite a few holy flame wars erupting over that... -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg.ewing at canterbury.ac.nz +--------------------------------------+ From jcarlson at uci.edu Thu Oct 27 08:23:39 2005 From: jcarlson at uci.edu (Josiah Carlson) Date: Wed, 26 Oct 2005 23:23:39 -0700 Subject: [Python-Dev] Divorcing str and unicode (no more implicitconversions). In-Reply-To: <435FDB73.5080703@v.loewis.de> References: <20051026105934.3977.JCARLSON@uci.edu> <435FDB73.5080703@v.loewis.de> Message-ID: <20051026232149.399A.JCARLSON@uci.edu> "Martin v. 
Löwis" wrote: > Josiah Carlson wrote: > > According to wikipedia (http://en.wikipedia.org/wiki/Latin_alphabet), > > various languages have adopted a transliteration of their language > > and/or former alphabets into latin. They don't purport to know all of > > the reasons why, and I'm not going to speculate. > > > > Whether or not more languages start using the latin alphabet is a good > > question. Basing judgement on history and likely globalization, it is > > only a matter of time before basically all languages have a > > transcription into the latin alphabet that is taught to all (unless > > China takes over the world). > > That is a very U.S. centric view. I don't share it, but I think it is > pointless to argue against it. I should have included a ;). Whether or not in the future all languages use the latin alphabet should have little to do with whether Python chooses to support non-latin identifiers in the forthcoming 2.5 or later releases. - Josiah From mal at egenix.com Thu Oct 27 11:09:04 2005 From: mal at egenix.com (M.-A. Lemburg) Date: Thu, 27 Oct 2005 11:09:04 +0200 Subject: [Python-Dev] Divorcing str and unicode (no more implicitconversions). In-Reply-To: <435FBDC4.8030300@v.loewis.de> References: <435E7794.2030505@v.loewis.de> <20051025120919.3927.JCARLSON@uci.edu> <435EA7EB.90100@egenix.com> <435EA9F6.3030702@v.loewis.de> <435F5149.4060804@egenix.com> <435FBDC4.8030300@v.loewis.de> Message-ID: <43609930.5030907@egenix.com> Martin v. Löwis wrote: > M.-A. Lemburg wrote: > >>You even argued against having non-ASCII identifiers: >> >>http://mail.python.org/pipermail/python-list/2002-May/102936.html > > > I see :-) It seems I have changed my mind since then (which > apparently predates PEP 263). > > One issue I apparently was worried about was the plan to use > native-encoding byte strings for the identifiers; this I didn't > like at all. 
> > >>* Unicode identifiers are going to introduce massive >>code breakage - just think of all the tools people use >>to manipulate Python code today; I'm quite sure that >>most of it will fail in one way or another if you present >>it with Unicode literals such as in "zähler += 1". > > True. Today, I think I would be willing to accept the > code breakage: these tools had quite some time to update > themselves to PEP 263 (even though not all of them have > done so yet); also, usage of the feature would only spread > gradually. A failure to support the feature in the Python > proper would be treated as a bug by us; how tool providers > deal with the feature would be their choice. I was thinking of introspection and debugging tools. These would then see Unicode objects in the namespace dictionaries and this will likely break a lot of code - much for the same reason you see code breakage now if you let Unicode objects enter the Python standard lib without warning :-) >>* People don't seem very interested in using Unicode >>identifiers, e.g. >> >> http://mail.python.org/pipermail/i18n-sig/2001-February/000828.html > > > True. However, I also suspect that lack of tool support > contributes to that. For the specific case of Java, > there is no notion of source encoding, which makes Unicode > identifiers really tedious to use. > > If it were really easy to use, I assume people would actually > use it - at least in some of the contexts, like teaching, > where Python is also widely used. Well, that has two sides: Of course, you'll always find some people that will like a certain feature. The question is what effects does it have on the rest of us. Python has always put some constraints on programmers to raise code readability, e.g. white space awareness. Giving them Unicode identifiers sounds like a step backwards in this context. Note that I'm not talking about comments, string literal contents, etc. - only the programming logic, i.e. keywords and identifiers. 
>>Do you really think that it will help with code readability >>if programmers are allowed to use native scripts for their >>identifiers ? > > > Yes, I do - for some groups of users. Of course, code sharing > would be more difficult, and there certainly should be a policy > to use only ASCII in the standard library. But within local > groups, users would find understanding code easier if they > knew what the identifiers actually meant. Hmm, but why do you think they wouldn't understand the meaning of ASCII versions of the identifiers ? Note that using ASCII doesn't necessarily mean that you have to use English as basis for the naming schemes of identifiers. >>If you are told to debug a program >>written by say a Japanese programmer using Japanese identifiers >>you are going to have a really hard time. Integrating such >>code into other applications will be even harder, since you'd >>be forced to use his Japanese class names in your application. > > > Certainly, yes. There is a trade-off: you can make it easier > for some people to read and write code if they can use their > native script; at the same time, it would be harder for others > to read and modify it. > > It's a policy decision whether you use English identifiers or > not - it shouldn't be a technical decision (as it currently > is). See above: ASCII != English. Most scripts have a transliteration into ASCII - simply because that's the global standard for scripts. >>I think source code encodings provide an ideal way to >>have comments written in native scripts - and people >>use that a lot. However, keeping the program code itself >>in plain ASCII makes it far more readable and reusable >>across locales. Something that's important in this >>globalized world. > > > Certainly. However, some programs don't need to live in > a globalized world - e.g. if they are homework in a school. > Within a locale, using native scripts would make the program > more readable. True. 
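[Editorial note: the point that ASCII does not imply English can be made concrete. A small hypothetical sketch — the function and its German names are invented for illustration:]

```python
# German naming written in plain ASCII: readable to a German-speaking
# team, yet every tool that assumes ASCII identifiers still works.
def zinsen_berechnen(betrag, zinssatz, jahre):
    """Endbetrag nach Zinseszins (final amount after compound interest)."""
    endbetrag = betrag * (1 + zinssatz) ** jahre
    return endbetrag

print(round(zinsen_berechnen(1000, 0.05, 2), 2))  # -> 1102.5
```

This is the transliteration route: the naming scheme stays local to the team's language while the character repertoire stays portable.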
-- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Oct 27 2005) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From mal at egenix.com Thu Oct 27 11:25:22 2005 From: mal at egenix.com (M.-A. Lemburg) Date: Thu, 27 Oct 2005 11:25:22 +0200 Subject: [Python-Dev] Divorcing str and unicode (no more implicitconversions). In-Reply-To: <436037EC.7050308@canterbury.ac.nz> References: <435E7794.2030505@v.loewis.de> <20051025120919.3927.JCARLSON@uci.edu> <435EA7EB.90100@egenix.com> <435EA9F6.3030702@v.loewis.de> <435F5149.4060804@egenix.com> <436037EC.7050308@canterbury.ac.nz> Message-ID: <43609D02.8080209@egenix.com> Greg Ewing wrote: > M.-A. Lemburg wrote: > > >>If you are told to debug a program >>written by say a Japanese programmer using Japanese identifiers >>you are going to have a really hard time. > > > Or you could look upon it as an opportunity to > broaden your mental horizons by learning some > Japanese. :-) I just took Japanese as an example of a language and script that I don't know anything about. I would actually love to learn some Japanese, but simply don't have the time for learning it. Anyway, I could just as well have chosen Tibetan, Thai or Limbu scripts (which all look very nice, BTW): http://www.unicode.org/charts/ Perhaps this is not as bad after all - I just don't think that it will help code readability in the long run. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Oct 27 2005) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... 
http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From martin at v.loewis.de Thu Oct 27 12:35:13 2005 From: martin at v.loewis.de (martin@v.loewis.de) Date: Thu, 27 Oct 2005 12:35:13 +0200 Subject: [Python-Dev] Conversion to Subversion is complete Message-ID: <1130409313.4360ad6139518@www.domainfactory-webmail.de> The Python source code repository is now converted to subversion; please feel free to start checking out new sandboxes. For a few days, this installation probably still needs to be considered in testing. If there are no serious problems found by next Monday, I would consider conversion of the data complete. The CVS repository will be kept available read-only for a while longer, so you can easily forward any patches you may have. Most of you are probably interested in checking out one of these folders: svn+ssh://pythondev at svn.python.org/python/trunk svn+ssh://pythondev at svn.python.org/python/branches/release24-maint svn+ssh://pythondev at svn.python.org/peps The anonymous read-only equivalents of these are http://svn.python.org/projects/python/trunk http://svn.python.org/projects/python/branches/release24-maint http://svn.python.org/projects/peps As mentioned before, in addition to "plain" http/WebDAV, viewcvs is available at http://svn.python.org/view/ There are some more things left to be done, such as updating the developer documentation. I'll start working on that soon, but contributions are welcome. 
Regards, Martin From skip at pobox.com Thu Oct 27 12:49:35 2005 From: skip at pobox.com (skip@pobox.com) Date: Thu, 27 Oct 2005 05:49:35 -0500 Subject: [Python-Dev] Conversion to Subversion is complete In-Reply-To: <1130409313.4360ad6139518@www.domainfactory-webmail.de> References: <1130409313.4360ad6139518@www.domainfactory-webmail.de> Message-ID: <17248.45247.676631.388117@montanaro.dyndns.org> >>>>> "martin" == martin writes: martin> The Python source code repository is now converted to martin> subversion; please feel free to start checking out new martin> sandboxes. Excellent... Thanks for all the effort. Skip From jeremy at alum.mit.edu Thu Oct 27 14:23:45 2005 From: jeremy at alum.mit.edu (Jeremy Hylton) Date: Thu, 27 Oct 2005 08:23:45 -0400 Subject: [Python-Dev] Freezing the CVS on Oct 26 for SVN switchover In-Reply-To: <2mbr1g6loh.fsf@starship.python.net> References: <435BC27C.1010503@v.loewis.de> <2mbr1g6loh.fsf@starship.python.net> Message-ID: Can anyone point an old CVS/Perforce-Luddite at instructions for how to use the new SVN repository? Jeremy On 10/23/05, Michael Hudson wrote: > "Martin v. Löwis" writes: > > > I'd like to start the subversion switchover this coming Wednesday, > > with a total commit freeze at 16:00 GMT. > > Yay! Thanks again for doing this. > > Cheers, > mwh > > -- > [Perl] combines all the worst aspects of C and Lisp: a billion > different sublanguages in one monolithic executable. It combines > the power of C with the readability of PostScript. 
-- Jamie Zawinski > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/jeremy%40alum.mit.edu > From dave at boost-consulting.com Thu Oct 27 14:26:36 2005 From: dave at boost-consulting.com (David Abrahams) Date: Thu, 27 Oct 2005 08:26:36 -0400 Subject: [Python-Dev] [Docs] MinGW and libpython24.a References: <435E768D.2000401@v.loewis.de> Message-ID: David Abrahams writes: > "Martin v. L?wis" writes: > >> David Abrahams wrote: >>> Is the instruction at >>> http://www.python.org/dev/doc/devel/inst/tweak-flags.html#SECTION000622000000000000000 >>> still relevant? I am not 100% certain I didn't make one myself, but >>> it looks to me as though my Windows Python 2.4.1 distro came with a >>> libpython24.a. I am asking here because it seems only the person who >>> prepares the installer would know. >> >> That impression might be incorrect: I can tell you when I started >> including libpython24.a, but I have no clue whether the instructions >> you refer to are correct - I don't use the file myself at all. >> >>> If this is true, in which version was it introduced? >> >> It was introduced in 1.20/1.16.2.4 of Tools/msi/msi.py in response to >> patch #1088716; this in turn was first used to release r241c1. > > Thanks! As it turns out, MinGW also implemented, in version 3.0.0 (with binutils-2.13.90-20030111-1), features which make the creation of libpython24.a unnecessary. So whoever maintains this doc might want to note that you only need that step if you are using a version of Python prior to 2.4.1 with a MinGW prior to 3.0.0 (with binutils-2.13.90-20030111-1). 
Regards -- Dave Abrahams Boost Consulting www.boost-consulting.com From walter at livinglogic.de Thu Oct 27 14:41:07 2005 From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=) Date: Thu, 27 Oct 2005 14:41:07 +0200 Subject: [Python-Dev] Conversion to Subversion is complete In-Reply-To: <1130409313.4360ad6139518@www.domainfactory-webmail.de> References: <1130409313.4360ad6139518@www.domainfactory-webmail.de> Message-ID: <4360CAE3.4090408@livinglogic.de> martin at v.loewis.de wrote: > The Python source code repository is now converted to subversion; > [...] Thanks for doing this. BTW, will there be daily tarballs, like the one available from: http://cvs.perl.org/snapshots/python/python/python-latest.tar.gz Bye, Walter Dörwald From eric.nieuwland at xs4all.nl Thu Oct 27 14:45:54 2005 From: eric.nieuwland at xs4all.nl (Eric Nieuwland) Date: Thu, 27 Oct 2005 14:45:54 +0200 Subject: [Python-Dev] Proposed resolutions for open PEP 343 issues In-Reply-To: <20051026083246.6tg0gu2gafms8gkc@login.werra.lunarpages.com> References: <20051026083246.6tg0gu2gafms8gkc@login.werra.lunarpages.com> Message-ID: <0a6f9d0d448835d658af6e4be85bd954@xs4all.nl> Michael Chermside wrote: > Guido writes: >> I find "AttributeError: __exit__" just as informative. > > Eric Nieuwland responds: >> I see. Then why don't we unify *Error into Error? >> Just read the message and know what it means. >> And we could then drop the burden of exception classes and only use >> the >> message. >> A sense of deja-vu comes over me somehow ;-) > > The answer (and there _IS_ an answer) is that using different exception > types allows the user some flexibility in CATCHING the exceptions. The > discussion you have been following obscures that point somewhat because > there's little meaningful difference between TypeError and > AttributeError (at least in well-written code that doesn't have > unnecessary typechecks in it). Yep. 
I too would like to have 'SOME flexibility in catching the exceptions' meaning I'd like to be able to catch TypeErrors and AttributeErrors while not catching what I call ProtocolErrors. The simple reason is that in most of my apps TypeErrors and AttributeErrors will depend on the runtime situation, while ProtocolErrors will mostly be static. So I'll debug for ProtocolErrors and I'll handle runtime stuff. > If there were a significant difference between TypeError and > AttributeError then Nick and Guido would have immediately chosen the > appropriate error type based on functionality rather than style, and > there wouldn't have been any need for discussion. I got that already. To me it means one of them may be a candidate for removal/redefinition. > Oh yeah, and you can also put extra info into an exception object > besides just the error message. (We don't do that as often as we > should... it's a powerful technique.) Perhaps that needs some propaganda then. I won't dare to suggest syntactic sugar ;-) --eric From skip at pobox.com Thu Oct 27 14:54:59 2005 From: skip at pobox.com (skip@pobox.com) Date: Thu, 27 Oct 2005 07:54:59 -0500 Subject: [Python-Dev] Freezing the CVS on Oct 26 for SVN switchover In-Reply-To: References: <435BC27C.1010503@v.loewis.de> <2mbr1g6loh.fsf@starship.python.net> Message-ID: <17248.52771.225830.484931@montanaro.dyndns.org> Jeremy> Can anyone point an old CVS/Perforce-Luddite at instructions for Jeremy> how to use the new SVN repository? Jeremy, I'd never used Subversion until Barry grabbed the python.org web maintainers by our collective ears and dragged us to the table with the kool aid. As it turns out, the svn flavored kool aid tastes about the same as the cvs flavor (svn {commit,up,diff} == cvs {commit,up,diff}), though there are some slight aftertastes you have to get used to (e.g., revision numbers are for the entire branch, not just a single file). 
That said, the best place to start is probably the Subversion book, available in both online and dead tree versions: http://svnbook.red-bean.com/ Appendix A of that book is "Subversion for CVS Users". Probably worth a quick skim and a browser bookmark. Though there's no svn/cvs cheatsheet there, you may also find isolated tidbits in the Subversion FAQ: http://subversion.tigris.org/faq.html Just grep around for "cvs". Skip From wl at flexis.de Thu Oct 27 15:15:54 2005 From: wl at flexis.de (Wolfgang Langner) Date: Thu, 27 Oct 2005 15:15:54 +0200 Subject: [Python-Dev] Conversion to Subversion is complete In-Reply-To: <17248.45247.676631.388117@montanaro.dyndns.org> References: <1130409313.4360ad6139518@www.domainfactory-webmail.de> <17248.45247.676631.388117@montanaro.dyndns.org> Message-ID: Hello, skip at pobox.com wrote: > martin> The Python source code repository is now converted to > martin> subversion; please feel free to start checking out new > martin> sandboxes. > > Excellent... Thanks for all the effort. Good work. I checked the http and viewcvs access and all worked. But why is an old subversion used ? (Powered by Subversion version 1.1.4) bye by Wolfgang From mwh at python.net Thu Oct 27 15:57:19 2005 From: mwh at python.net (Michael Hudson) Date: Thu, 27 Oct 2005 14:57:19 +0100 Subject: [Python-Dev] Conversion to Subversion is complete In-Reply-To: <1130409313.4360ad6139518@www.domainfactory-webmail.de> (martin@v.loewis.de's message of "Thu, 27 Oct 2005 12:35:13 +0200") References: <1130409313.4360ad6139518@www.domainfactory-webmail.de> Message-ID: <2m64rj5agw.fsf@starship.python.net> martin at v.loewis.de writes: > The Python source code repository is now converted to subversion; > please feel free to start checking out new sandboxes. For a few > days, this installation probably still needs to be considered in > testing. If there are no serious problems found by next Monday, > I would consider conversion of the data complete. 
The CVS repository > will be kept available read-only for a while longer, so you can > easily forward any patches you may have. Woo! Do checkins to svn.python.org go to the python-checkins list already? Cheers, mwh -- How do I keep people from reading my Perl code? Oh wait. Ha ha! -- from Twisted.Quotes From jim at zope.com Thu Oct 27 16:03:08 2005 From: jim at zope.com (Jim Fulton) Date: Thu, 27 Oct 2005 10:03:08 -0400 Subject: [Python-Dev] Freezing the CVS on Oct 26 for SVN switchover In-Reply-To: References: <435BC27C.1010503@v.loewis.de> <2mbr1g6loh.fsf@starship.python.net> Message-ID: <4360DE1C.3010602@zope.com> Jeremy Hylton wrote: > Can anyone point an old CVS/Perforce-Luddite at instructions for how > to use the new SVN repository? And can you remind us where to send our public keys? :) Jim -- Jim Fulton mailto:jim at zope.com Python Powered! CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From guido at python.org Thu Oct 27 17:32:04 2005 From: guido at python.org (Guido van Rossum) Date: Thu, 27 Oct 2005 08:32:04 -0700 Subject: [Python-Dev] Conversion to Subversion is complete In-Reply-To: <1130409313.4360ad6139518@www.domainfactory-webmail.de> References: <1130409313.4360ad6139518@www.domainfactory-webmail.de> Message-ID: On 10/27/05, martin at v.loewis.de wrote: > The Python source code repository is now converted to subversion; > please feel free to start checking out new sandboxes. Woo hoo! Thanks for all the hard work and good thinking, Martin. > Most of you are probably interested in checking out one of these > folders: > > svn+ssh://pythondev at svn.python.org/python/trunk > svn+ssh://pythondev at svn.python.org/python/branches/release24-maint > svn+ssh://pythondev at svn.python.org/peps This doesn't work for me. I'm sure the problem is on my end, but my svn skills are too rusty to figure it out. 
I get this: $ svn checkout svn+ssh://pythondev at svn.python.org/peps Permission denied (publickey,keyboard-interactive). svn: Connection closed unexpectedly $svn --version svn, version 1.2.0 (r14790) compiled Jun 13 2005, 18:51:32 Copyright (C) 2000-2005 CollabNet. Subversion is open source software, see http://subversion.tigris.org/ This product includes software developed by CollabNet (http://www.Collab.Net/). The following repository access (RA) modules are available: * ra_dav : Module for accessing a repository via WebDAV (DeltaV) protocol. - handles 'http' scheme - handles 'https' scheme * ra_svn : Module for accessing a repository using the svn network protocol. - handles 'svn' scheme * ra_local : Module for accessing a repository on local disk. - handles 'file' scheme $ I can ssh to svn.python.org just fine, with no password (it says it's dinsdale). I can checkout the read-only versions just fine. I can work with the pydotorg svn repository just fine (checked something in last week). -- --Guido van Rossum (home page: http://www.python.org/~guido/) From skip at pobox.com Thu Oct 27 18:07:16 2005 From: skip at pobox.com (skip@pobox.com) Date: Thu, 27 Oct 2005 11:07:16 -0500 Subject: [Python-Dev] Freezing the CVS on Oct 26 for SVN switchover In-Reply-To: <4360DE1C.3010602@zope.com> References: <435BC27C.1010503@v.loewis.de> <2mbr1g6loh.fsf@starship.python.net> <4360DE1C.3010602@zope.com> Message-ID: <17248.64308.936680.578655@montanaro.dyndns.org> Jim> And can you remind us where to send our public keys? :) Jim, Send your keys to pydotorg at python.org. Unless you specify otherwise, your login will probably be "jim.fulton". 
Skip From martin at v.loewis.de Thu Oct 27 19:18:01 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Thu, 27 Oct 2005 19:18:01 +0200 Subject: [Python-Dev] Conversion to Subversion is complete In-Reply-To: <4360CAE3.4090408@livinglogic.de> References: <1130409313.4360ad6139518@www.domainfactory-webmail.de> <4360CAE3.4090408@livinglogic.de> Message-ID: <43610BC9.1040508@v.loewis.de> Walter Dörwald wrote: > Thanks for doing this. > > BTW, will there be daily tarballs, like the one available from: > http://cvs.perl.org/snapshots/python/python/python-latest.tar.gz Will be, yes (I'm saddened that you refer to this location, and not http://www.dcl.hpi.uni-potsdam.de/home/loewis/python.tgz :-) I'm planning to provide them at http://svn.python.org/snapshots. Regards, Martin From martin at v.loewis.de Thu Oct 27 19:19:50 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Thu, 27 Oct 2005 19:19:50 +0200 Subject: [Python-Dev] Freezing the CVS on Oct 26 for SVN switchover In-Reply-To: <17248.52771.225830.484931@montanaro.dyndns.org> References: <435BC27C.1010503@v.loewis.de> <2mbr1g6loh.fsf@starship.python.net> <17248.52771.225830.484931@montanaro.dyndns.org> Message-ID: <43610C36.2030500@v.loewis.de> skip at pobox.com wrote: > Though there's no svn/cvs cheatsheet there, you may also find isolated > tidbits in the Subversion FAQ: > > http://subversion.tigris.org/faq.html > > Just grep around for "cvs". 
In addition, you might want to read http://www.python.org/dev/svn.html Regards, Martin From martin at v.loewis.de Thu Oct 27 19:20:53 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Thu, 27 Oct 2005 19:20:53 +0200 Subject: [Python-Dev] Freezing the CVS on Oct 26 for SVN switchover In-Reply-To: <4360DE1C.3010602@zope.com> References: <435BC27C.1010503@v.loewis.de> <2mbr1g6loh.fsf@starship.python.net> <4360DE1C.3010602@zope.com> Message-ID: <43610C75.1020908@v.loewis.de> Jim Fulton wrote: >> Can anyone point an old CVS/Perforce-Luddite at instructions for how >> to use the new SVN repository? > > > And can you remind us where to send our public keys? :) pydotorg at python.org should work; you will get a confirmation when they are installed. Regards, Martin From ndbecker2 at gmail.com Thu Oct 27 19:27:53 2005 From: ndbecker2 at gmail.com (Neal Becker) Date: Thu, 27 Oct 2005 13:27:53 -0400 Subject: [Python-Dev] Help with inotify Message-ID: I'm trying to make a module to support inotify (linux). I put together a module using boost::python. Problem is, inotify uses a file descriptor. If I call python os.fdopen on it, I get an error: Python 2.4.1 (#1, May 16 2005, 15:15:14) [GCC 4.0.0 20050512 (Red Hat 4.0.0-5)] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> from inotify import * >>> import os >>> i=inotify() >>> i.fileno() 4 >>> os.fdopen (i.fileno()) Traceback (most recent call last): File "", line 1, in ? IOError: [Errno 21] Is a directory Any ideas? I'd rather not have to trace through python if I could avoid it (I don't even have source installed here). From kbk at shore.net Thu Oct 27 19:35:30 2005 From: kbk at shore.net (Kurt B. 
Kaiser) Date: Thu, 27 Oct 2005 13:35:30 -0400 (EDT) Subject: [Python-Dev] Weekly Python Patch/Bug Summary Message-ID: <200510271735.j9RHZUHG005080@bayview.thirdcreek.com> Patch / Bug Summary ___________________ Patches : 360 open (+16) / 2956 closed ( +1) / 3316 total (+17) Bugs : 893 open (+10) / 5353 closed (+12) / 6246 total (+22) RFE : 199 open ( -2) / 189 closed ( +2) / 388 total ( +0) New / Reopened Patches ______________________ Patch for (Doc) #1255218 (2005-10-17) http://python.org/sf/1328526 opened by Peter van Kampen Patch for (Doc) #1261659 (2005-10-17) http://python.org/sf/1328566 opened by Peter van Kampen pclose raises spurious exception on win32 (2005-10-17) http://python.org/sf/1328851 opened by Guido van Rossum datetime/xmlrpclib.DateTime comparison (2005-10-18) http://python.org/sf/1330538 opened by Skip Montanaro tarfile.py: fix for 1330039 (2005-10-19) CLOSED http://python.org/sf/1331635 opened by Lars Gust?bel Allow use of non-latin1 chars in interactive shell (2005-10-21) http://python.org/sf/1333679 opened by Noam Raphael Fix for int(string, base) wrong answers (2005-10-22) http://python.org/sf/1334979 opened by Adam Olsen Patch to implement PEP 351 (2005-10-23) http://python.org/sf/1335812 opened by Barry A. Warsaw Fix for int(string, base) wrong answers (take 2) (2005-10-24) http://python.org/sf/1335972 opened by Alan McIntyre remove 4 ints from PyFrameObject (2005-10-24) http://python.org/sf/1337051 opened by Neal Norwitz Elemental Security contribution - parsexml.py (2005-10-25) http://python.org/sf/1337648 opened by Guido van Rossum Elemental Security contribution - pgen2 package (2005-10-25) http://python.org/sf/1337696 opened by Guido van Rossum fileinput patch for bug #1336582 (2005-10-25) http://python.org/sf/1337756 opened by A. 
Murat EREN Inconsistent use of buffer interface in string and unicode (2005-10-25) http://python.org/sf/1337876 opened by Phil Thompson tarfile.py: fix for bug #1336623 (2005-10-26) http://python.org/sf/1338314 opened by Lars Gust?bel cross compile and mingw support (2005-10-27) http://python.org/sf/1339673 opened by Jan Nieuwenhuizen Patches Closed ______________ tarfile.py: fix for 1330039 (2005-10-19) http://python.org/sf/1331635 closed by nnorwitz New / Reopened Bugs ___________________ HTTPResponse instance has no attribute 'fileno' (2005-10-16) http://python.org/sf/1327971 opened by Kevin Dwyer __getslice__ taking priority over __getitem__ (2005-10-17) http://python.org/sf/1328278 opened by Josh Marshall os-process.html (2005-10-17) CLOSED http://python.org/sf/1328915 opened by Noah Spurrier Empty Generator doesn't evaluate as False (2005-10-17) CLOSED http://python.org/sf/1328959 opened by Christian H?ltje tarfile.add() produces hard links instead of normal files (2005-10-18) CLOSED http://python.org/sf/1330039 opened by Martin Pitt utf 7 codec broken (2005-10-19) CLOSED http://python.org/sf/1331062 opened by Ralf Schmitt string_subscript doesn't check for failed PyMem_Malloc (2005-10-19) CLOSED http://python.org/sf/1331563 opened by Adam Olsen Incorrect use of -L/usr/lib/termcap (2005-10-19) http://python.org/sf/1332732 opened by Robert M. 
Zigweid Inaccurate footnote 1 in Lib ref, sect 2.3.6.4 (2005-10-20) CLOSED http://python.org/sf/1332780 opened by Andy BSD DB test failures for BSD DB 3.2 (2005-10-19) http://python.org/sf/1332852 opened by Neal Norwitz Fatal Python error: Interpreter not initialized (2005-10-20) http://python.org/sf/1332869 opened by Andrew Mitchell BSD DB test failures for BSD DB 4.1 (2005-10-19) http://python.org/sf/1332873 opened by Neal Norwitz Bugs of the new AST compiler (2005-10-21) http://python.org/sf/1333982 opened by Armin Rigo int(string, base) wrong answers (2005-10-22) http://python.org/sf/1334662 opened by Tim Peters Python 2.4.2 doesn't build with "--without-threads" (2005-10-22) http://python.org/sf/1335054 opened by Gunter Ohrner fileinput device or resource busy error (2005-10-24) http://python.org/sf/1336582 opened by A. Murat EREN tarfile can't extract some tar archives.. (2005-10-24) http://python.org/sf/1336623 opened by A. Murat EREN Python.h should include system headers properly [POSIX] (2005-10-25) http://python.org/sf/1337400 opened by Dimitri Papadopoulos IDLE, F5 ? wrong external file content on error. (2005-10-26) http://python.org/sf/1337987 opened by MvGulik doctest mishandles exceptions raised within generators (2005-10-26) http://python.org/sf/1337990 opened by Tim Wegener Memory keeping (2005-10-26) http://python.org/sf/1338264 opened by sin CVS webbrowser.py (1.40) bugs (2005-10-26) http://python.org/sf/1338995 opened by Greg Couch shelve.Shelf.__del__ throws exceptions (2005-10-26) http://python.org/sf/1339007 opened by Geoffrey T. 
Dairiki Threading misbehavior with lambdas (2005-10-27) CLOSED http://python.org/sf/1339045 opened by Maciek Fijalkowski Bugs Closed ___________ wrong TypeError traceback in generator expressions (2005-10-14) http://python.org/sf/1327110 closed by mwh os-process.html (2005-10-17) http://python.org/sf/1328915 closed by nnorwitz Empty Generator doesn't evaluate as False (2005-10-17) http://python.org/sf/1328959 closed by rhettinger tarfile.add() produces hard links instead of normal files (2005-10-18) http://python.org/sf/1330039 closed by nnorwitz utf 7 codec broken (2005-10-19) http://python.org/sf/1331062 closed by lemburg string_subscript doesn't check for failed PyMem_Malloc (2005-10-19) http://python.org/sf/1331563 closed by nnorwitz Inaccurate footnote 1 in Lib ref, sect 2.3.6.4 (2005-10-20) http://python.org/sf/1332780 closed by birkenfeld Argument genexp corner case (2005-03-21) http://python.org/sf/1167751 closed by nnorwitz Encodings iso8859_1 and latin_1 are redundant (2005-08-12) http://python.org/sf/1257525 closed by lemburg ISO8859-9 broken (2005-10-11) http://python.org/sf/1324237 closed by lemburg mac_roman codec missing "apple" codepoint (2005-10-04) http://python.org/sf/1313051 closed by lemburg line numbers off by 1 in dis (2005-07-28) http://python.org/sf/1246473 closed by nascheme Threading misbehavior with lambdas (2005-10-27) http://python.org/sf/1339045 deleted by fijal RFE Closed __________ python scratchpad (IDLE) (2005-10-14) http://python.org/sf/1326830 closed by kbk datetime.replace could take a dict (2005-09-20) http://python.org/sf/1296581 closed by birkenfeld From martin at v.loewis.de Thu Oct 27 19:38:32 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Thu, 27 Oct 2005 19:38:32 +0200 Subject: [Python-Dev] Conversion to Subversion is complete In-Reply-To: References: <1130409313.4360ad6139518@www.domainfactory-webmail.de> Message-ID: <43611098.3000401@v.loewis.de> Guido van Rossum wrote: > Woo hoo! 
Thanks for all the hard work and good thinking, Martin. My pleasure! >>svn+ssh://pythondev at svn.python.org/python/trunk >>svn+ssh://pythondev at svn.python.org/python/branches/release24-maint >>svn+ssh://pythondev at svn.python.org/peps > > > This doesn't work for me. I'm sure the problem is on my end, but my > svn skills are too rusty to figure it out. It's actually not: you missed the pythondev@ part. To access the repository, your SSH key must be added to pythondev's authorized_keys file; it previously wasn't. I have now added your key .comcast.net to the file; I did not add guido at eric, as SSH1 is not supported. Please try again. The list of committers is (now) at http://www.python.org/dev/committers Anybody not on the list who wishes to (and had access to the CVS) please send your key; if you have access to dinsdale, just let us know and we copy your key. Regards, Martin From fdrake at acm.org Thu Oct 27 19:40:04 2005 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Thu, 27 Oct 2005 13:40:04 -0400 Subject: [Python-Dev] Freezing the CVS on Oct 26 for SVN switchover In-Reply-To: <17248.64308.936680.578655@montanaro.dyndns.org> References: <435BC27C.1010503@v.loewis.de> <4360DE1C.3010602@zope.com> <17248.64308.936680.578655@montanaro.dyndns.org> Message-ID: <200510271340.04668.fdrake@acm.org> On Thursday 27 October 2005 12:07, skip at pobox.com wrote: > Send your keys to pydotorg at python.org. Unless you specify otherwise, your > login will probably be "jim.fulton". Mail to pydotorg doesn't allow posting from non-members; I watch for notifications for owner on that list and try to approve as quickly as possible, but it's a manual process just to get the mail through. We should probably have a dedicated address for this, or tell people to send them to webmaster. -Fred -- Fred L. Drake, Jr. 
From martin at v.loewis.de Thu Oct 27 19:56:34 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Thu, 27 Oct 2005 19:56:34 +0200 Subject: [Python-Dev] Conversion to Subversion is complete In-Reply-To: References: <1130409313.4360ad6139518@www.domainfactory-webmail.de> <17248.45247.676631.388117@montanaro.dyndns.org> Message-ID: <436114D2.4090803@v.loewis.de> Wolfgang Langner wrote: > But why is an old subversion used ? > (Powered by Subversion version 1.1.4) That's the one Debian provides. We don't build our own, but use Debian packages for everything. Also, subversion 1.1 is not old: it was released on Oct 4, 2004; 1.1.4 is less than a year old. Regards, Martin From martin at v.loewis.de Thu Oct 27 19:57:27 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Thu, 27 Oct 2005 19:57:27 +0200 Subject: [Python-Dev] Conversion to Subversion is complete In-Reply-To: <2m64rj5agw.fsf@starship.python.net> References: <1130409313.4360ad6139518@www.domainfactory-webmail.de> <2m64rj5agw.fsf@starship.python.net> Message-ID: <43611507.8090606@v.loewis.de> Michael Hudson wrote: > Do checkins to svn.python.org go to the python-checkins list already? They do indeed - you should have received one commit message by now (me testing whether committing works, on PEP 347). Regards, Martin From martin at v.loewis.de Thu Oct 27 19:59:06 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Thu, 27 Oct 2005 19:59:06 +0200 Subject: [Python-Dev] [Docs] MinGW and libpython24.a In-Reply-To: References: <435E768D.2000401@v.loewis.de> Message-ID: <4361156A.7090101@v.loewis.de> David Abrahams wrote: > As it turns out, MinGW also implemented, in version 3.0.0 (with > binutils-2.13.90-20030111-1), features which make the creation of > libpython24.a unnecessary. 
So whoever maintains this doc might want > to note that you only need that step if you are using a version of > Python prior to 2.4.1 with a MinGW prior to 3.0.0 (with > binutils-2.13.90-20030111-1). Can you please provide a patch to the documentation? None of the regular documentation maintainers would know what exactly to write; this is all user-contributed. Regards, Martin From martin at v.loewis.de Thu Oct 27 20:01:40 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Thu, 27 Oct 2005 20:01:40 +0200 Subject: [Python-Dev] Help with inotify In-Reply-To: References: Message-ID: <43611604.4080404@v.loewis.de> Neal Becker wrote: > Any ideas? I'd rather not have to trace through python if I could avoid it > (I don't even have source installed here). Use strace, then. Find out what precise system call gives you this error. If this is not enough clue, post the relevant fragment of the trace output. Usage would be strace -o muell python test_notify.py (look into the file muell afterwards) Regards, Martin From martin at v.loewis.de Thu Oct 27 20:06:43 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Thu, 27 Oct 2005 20:06:43 +0200 Subject: [Python-Dev] Divorcing str and unicode (no more implicitconversions). In-Reply-To: <436037F5.8050501@canterbury.ac.nz> References: <20051025120919.3927.JCARLSON@uci.edu> <435E9E18.7090502@v.loewis.de> <20051025142245.3930.JCARLSON@uci.edu> <435EA9CF.6060305@v.loewis.de> <435ED31C.3010800@canterbury.ac.nz> <435F20B1.8080803@v.loewis.de> <436037F5.8050501@canterbury.ac.nz> Message-ID: <43611733.3060606@v.loewis.de> Greg Ewing wrote: > I still think this is a much worse potential problem > than that of "l" vs "1", etc. It's reasonable to > adopt the practice of never using "l" as a single > letter identifier, for example. But it would be > unreasonable to ban the use of "E" as an identifier > on the grounds that someone somewhere might confuse > it with a capital epsilon. 
As a style guide, people should use single-letter identifiers only for local variables. If they follow the guideline, it should be easy to tell whether such an identifier is Latin or Greek (if everything else in the function is Latin, the E likely is as well). > An alternative would be to identify such confusable > letters in the various alphabets and define them > to be equivalent. pylint could check for such things (although I very much doubt it would have any hits in the next 10 years). > And beyond the issue of alphabets there's also the > question of whether accented characters should be > considered distinct. I can see quite a few holy > flame wars erupting over that... For that, there is the Unicode TR that precisely defines how this should be done. People should then have their wars with the Unicode consortium. Regards, Martin From martin at v.loewis.de Thu Oct 27 20:16:19 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Thu, 27 Oct 2005 20:16:19 +0200 Subject: [Python-Dev] Freezing the CVS on Oct 26 for SVN switchover In-Reply-To: <200510271340.04668.fdrake@acm.org> References: <435BC27C.1010503@v.loewis.de> <4360DE1C.3010602@zope.com> <17248.64308.936680.578655@montanaro.dyndns.org> <200510271340.04668.fdrake@acm.org> Message-ID: <43611973.9030100@v.loewis.de> Fred L. Drake, Jr. wrote: > Mail to pydotorg doesn't allow posting from non-members; I watch for > notifications for owner on that list and try to approve as quickly as > possible, but it's a manual process just to get the mail through. Ah, didn't know this. > We should probably have a dedicated address for this, or tell people to send > them to webmaster. I think I would request a separate address; I don't think I want to get all webmaster email. That address should probably include webmaster, though. 
Regards, Martin From ndbecker2 at gmail.com Thu Oct 27 20:17:03 2005 From: ndbecker2 at gmail.com (Neal Becker) Date: Thu, 27 Oct 2005 14:17:03 -0400 Subject: [Python-Dev] Help with inotify References: <43611604.4080404@v.loewis.de> Message-ID: "Martin v. Löwis" wrote: > Neal Becker wrote: >> Any ideas? I'd rather not have to trace through python if I could avoid >> it (I don't even have source installed here). > > Use strace, then. Find out what precise system call gives you this > error. If this is not enough clue, post the relevant fragment of the > trace output. Usage would be > > strace -o muell python test_notify.py > (look into the file muell afterwards) > Yes, tried that- learned nothing. I suspect what's happening is that python's fdopen is using some stat call to determine whether the file descriptor refers to a directory, and is getting an answer that the inotify fd does. Don't know what to do about it. Can I build a python file object in "C" from the fd? Here's strace. The write of '4' is where my code writes the value of fileno() to stdout, which is '4', which is correct - notice that open("test-inotify.py") returned '3': ... open("test-inotify.py", O_RDONLY) = 3 write(2, " File \"test-inotify.py\", line 6"..., 39) = 39 fstat(3, {st_mode=S_IFREG|0664, st_size=87, ...}) = 0 mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2aaaadc13000 read(3, "from inotify import *\nimport os\n"..., 4096) = 87 write(2, " ", 4) = 4 write(2, "os.fdopen (i.fileno())\n", 23) = 23 close(3) = 0 munmap(0x2aaaadc13000, 4096) = 0 write(2, "IOError", 7) = 7 write(2, ": ", 2) = 2 write(2, "[Errno 21] Is a directory", 25) = 25 From fdrake at acm.org Thu Oct 27 20:23:55 2005 From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Thu, 27 Oct 2005 14:23:55 -0400 Subject: [Python-Dev] Freezing the CVS on Oct 26 for SVN switchover In-Reply-To: <43611973.9030100@v.loewis.de> References: <435BC27C.1010503@v.loewis.de> <200510271340.04668.fdrake@acm.org> <43611973.9030100@v.loewis.de> Message-ID: <200510271423.55919.fdrake@acm.org> On Thursday 27 October 2005 14:16, Martin v. Löwis wrote: > I think I would request a separate address; I don't think I want to get > all webmaster email. I like the idea of a separate address as well. > That address should probably include webmaster, though. Are you suggesting that the key-deposit address be routed to the webmaster crew? Most of the webmasters don't have the access needed to deposit keys. -Fred -- Fred L. Drake, Jr. From dave at boost-consulting.com Thu Oct 27 20:37:23 2005 From: dave at boost-consulting.com (David Abrahams) Date: Thu, 27 Oct 2005 14:37:23 -0400 Subject: [Python-Dev] [Docs] MinGW and libpython24.a In-Reply-To: <4361156A.7090101@v.loewis.de> (Martin v. =?iso-8859-1?Q?L=F6?= =?iso-8859-1?Q?wis's?= message of "Thu, 27 Oct 2005 19:59:06 +0200") References: <435E768D.2000401@v.loewis.de> <4361156A.7090101@v.loewis.de> Message-ID: "Martin v. Löwis" writes: > David Abrahams wrote: >> As it turns out, MinGW also implemented, in version 3.0.0 (with >> binutils-2.13.90-20030111-1), features which make the creation of >> libpython24.a unnecessary. So whoever maintains this doc might want >> to note that you only need that step if you are using a version of >> Python prior to 2.4.1 with a MinGW prior to 3.0.0 (with >> binutils-2.13.90-20030111-1). > > Can you please provide a patch to the documentation? None of the > regular documentation maintainers would know what exactly to write; > this is all user-contributed. This isn't rocket science.
Or maybe it is; if adding These instructions only apply if you're using a version of Python prior to 2.4.1 with a MinGW prior to 3.0.0 (with binutils-2.13.90-20030111-1) is not acceptable then no patch I could submit would be acceptable, because I don't know how to do better either. -- Dave Abrahams Boost Consulting www.boost-consulting.com From martin at v.loewis.de Thu Oct 27 20:59:13 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Thu, 27 Oct 2005 20:59:13 +0200 Subject: [Python-Dev] [Docs] MinGW and libpython24.a In-Reply-To: References: <435E768D.2000401@v.loewis.de> <4361156A.7090101@v.loewis.de> Message-ID: <43612381.3070300@v.loewis.de> David Abrahams wrote: > This isn't rocket science. Or maybe it is; if adding > > These instructions only apply if you're using a version of Python > prior to 2.4.1 with a MinGW prior to 3.0.0 (with > binutils-2.13.90-20030111-1) > > is not acceptable then no patch I could submit would be acceptable, > because I don't know how to do better either. Thanks, committed as revision 41338: http://svn.python.org/projects/python/trunk/Doc/inst/inst.tex I wasn't sure whether to place this text at the beginning or the end (i.e. whether all instructions of this section are incorrect or only part of it); I put it at the beginning. Regards, Martin From walter at livinglogic.de Thu Oct 27 21:02:21 2005 From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=) Date: Thu, 27 Oct 2005 21:02:21 +0200 Subject: [Python-Dev] Conversion to Subversion is complete In-Reply-To: <43610BC9.1040508@v.loewis.de> References: <1130409313.4360ad6139518@www.domainfactory-webmail.de> <4360CAE3.4090408@livinglogic.de> <43610BC9.1040508@v.loewis.de> Message-ID: <606B4D81-32DB-4125-B449-C023A1A61014@livinglogic.de> On 27.10.2005 at 19:18, Martin v. Löwis wrote: > Walter Dörwald wrote: > >> Thanks for doing this.
>> BTW, will there be daily tarballs, like the one available from: >> http://cvs.perl.org/snapshots/python/python/python-latest.tar.gz >> > > Will be, yes (I'm saddened that you refer to this location, and not > http://www.dcl.hpi.uni-potsdam.de/home/loewis/python.tgz :-) I didn't know that, although I probably should, the links are on the official page at http://www.python.org/dev/. ;) BTW, http://www.dcl.hpi.uni-potsdam.de/home/loewis/python.tgz is just 45 bytes. > I'm planning to provide them at http://svn.python.org/snapshots. Great! BTW, ViewCVS seems to be missing the stylesheet. http://svn.python.org/view/*docroot*/styles.css gives an exception complaining about "No such file or directory: '/etc/viewcvs/doc/styles.css'" Bye, Walter Dörwald From martin at v.loewis.de Thu Oct 27 21:05:36 2005 From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=) Date: Thu, 27 Oct 2005 21:05:36 +0200 Subject: [Python-Dev] Help with inotify In-Reply-To: References: <43611604.4080404@v.loewis.de> Message-ID: <43612500.4040403@v.loewis.de> Neal Becker wrote: > Yes, tried that- learned nothing. Please go back further in the trace file. There must be a return value of -1 (EISDIR) somewhere in the file, try to locate that. > Here's strace. The write of '4' is where my code writes the value of > fileno() to stdout, which is '4', which is correct - notice that > open("test-inotify.py") returned '3': The fragment you quote only refers to the part where it tries to format the traceback. The value '4' is never written, instead, it writes 4 spaces (the second argument is the bytes, the third is the number of bytes). Regards, Martin From ndbecker2 at gmail.com Thu Oct 27 21:17:29 2005 From: ndbecker2 at gmail.com (Neal Becker) Date: Thu, 27 Oct 2005 15:17:29 -0400 Subject: [Python-Dev] Help with inotify References: <43611604.4080404@v.loewis.de> <43612500.4040403@v.loewis.de> Message-ID: "Martin v. Löwis" wrote: > Neal Becker wrote: >> Yes, tried that- learned nothing.
> > Please go back further in the trace file. There must be a return > value of -1 (EISDIR) somewhere in the file, try to locate that. > >> Here's strace. The write of '4' is where my code writes the value of >> fileno() to stdout, which is '4', which is correct - notice that >> open("test-inotify.py") returned '3': > > The fragment you quote only refers to the part where it tries to > format the traceback. The value '4' is never written, instead, > it writes 4 spaces (the second argument is the bytes, the third > is the number of bytes). > This 1st line is the syscall for inotify: SYS_253(0, 0x7fffff88f0f0, 0x2aaaadda3f00, 0x2aaaaab4611b, 0x7) = 4 close(3) = 0 futex(0x502530, FUTEX_WAKE, 1) = 0 futex(0x502530, FUTEX_WAKE, 1) = 0 fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 3), ...}) = 0 mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2aaaadc12000 write(1, "4\n", 2) = 2 fcntl(4, F_GETFL) = 0 (flags O_RDONLY) fstat(4, {st_mode=S_IFDIR|0600, st_size=0, ...}) = 0 mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2aaaadc13000 lseek(4, 0, SEEK_CUR) = -1 ESPIPE (Illegal seek) fstat(4, {st_mode=S_IFDIR|0600, st_size=0, ...}) = 0 close(4) = 0 munmap(0x2aaaadc13000, 4096) = 0 write(2, "Traceback (most recent call last"..., 35) = 35 open("test-inotify.py", O_RDONLY) = 3 write(2, " File \"test-inotify.py\", line 6"..., 39) = 39 ... From skip at pobox.com Thu Oct 27 23:01:43 2005 From: skip at pobox.com (skip@pobox.com) Date: Thu, 27 Oct 2005 16:01:43 -0500 Subject: [Python-Dev] Freezing the CVS on Oct 26 for SVN switchover In-Reply-To: <200510271423.55919.fdrake@acm.org> References: <435BC27C.1010503@v.loewis.de> <200510271340.04668.fdrake@acm.org> <43611973.9030100@v.loewis.de> <200510271423.55919.fdrake@acm.org> Message-ID: <17249.16439.870339.133847@montanaro.dyndns.org> Fred> Are you suggesting that the key-deposit address be routed to the Fred> webmaster crew? 
Most of the webmasters don't have the access Fred> needed to deposit keys. In fact, many of us on the pydotorg list don't have ssh access either. I suspect the number of useful recipients is no more than five (Martin, Barry, Anthony, Sean, maybe one or two others). Skip From martin at v.loewis.de Thu Oct 27 23:02:40 2005 From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=) Date: Thu, 27 Oct 2005 23:02:40 +0200 Subject: [Python-Dev] Help with inotify In-Reply-To: References: <43611604.4080404@v.loewis.de> <43612500.4040403@v.loewis.de> Message-ID: <43614070.2030007@v.loewis.de> Neal Becker wrote: > SYS_253(0, 0x7fffff88f0f0, 0x2aaaadda3f00, 0x2aaaaab4611b, 0x7) = 4 > close(3) = 0 > futex(0x502530, FUTEX_WAKE, 1) = 0 > futex(0x502530, FUTEX_WAKE, 1) = 0 > fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 3), ...}) = 0 > mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = > 0x2aaaadc12000 > write(1, "4\n", 2) = 2 > fcntl(4, F_GETFL) = 0 (flags O_RDONLY) > fstat(4, {st_mode=S_IFDIR|0600, st_size=0, ...}) = 0 > mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = > 0x2aaaadc13000 > lseek(4, 0, SEEK_CUR) = -1 ESPIPE (Illegal seek) > fstat(4, {st_mode=S_IFDIR|0600, st_size=0, ...}) = 0 > close(4) = 0 > munmap(0x2aaaadc13000, 4096) = 0 > write(2, "Traceback (most recent call last"..., 35) = 35 I see. Python is making up the EISDIR, looking at the stat result. In Objects/fileobject.c:dircheck generates the EISDIR error, which apparently comes from posix_fdopen, PyFile_FromFile, fill_file_fields. Python simply does not support file objects which stat(2) as directories. 
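[Editor's note: one way around the limitation just described, and roughly where this thread ends up, is to skip os.fdopen entirely and wrap the raw descriptor in a small object that exposes read() and fileno(). The sketch below uses modern Python and illustrative names (FDReader is not the inotify binding under discussion), demonstrated with a pipe since any raw fd behaves the same:]

```python
import os
import select


class FDReader(object):
    """Minimal file-like wrapper over a raw fd, avoiding os.fdopen's
    stat()-based directory check.  Illustrative name, not a real API."""

    def __init__(self, fd):
        self._fd = fd

    def fileno(self):
        # Having fileno() is all select.select()/poll() require.
        return self._fd

    def read(self, bufsize=4096):
        return os.read(self._fd, bufsize)

    def close(self):
        os.close(self._fd)


# Demonstration with a pipe standing in for an inotify fd:
r, w = os.pipe()
os.write(w, b"hello")
reader = FDReader(r)
data = reader.read()

# select accepts the wrapper directly, because it only calls fileno():
os.write(w, b"!")
ready, _, _ = select.select([reader], [], [], 0)
```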
Regards, Martin From martin at v.loewis.de Fri Oct 28 00:12:30 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Fri, 28 Oct 2005 00:12:30 +0200 Subject: [Python-Dev] Conversion to Subversion is complete In-Reply-To: <606B4D81-32DB-4125-B449-C023A1A61014@livinglogic.de> References: <1130409313.4360ad6139518@www.domainfactory-webmail.de> <4360CAE3.4090408@livinglogic.de> <43610BC9.1040508@v.loewis.de> <606B4D81-32DB-4125-B449-C023A1A61014@livinglogic.de> Message-ID: <436150CE.8000305@v.loewis.de> Walter Dörwald wrote: > BTW, ViewCVS seems to be missing the stylesheet. http:// > svn.python.org/view/*docroot*/styles.css gives an exception > complaining about "No such file or directory: '/etc/viewcvs/doc/ > styles.css'" Thanks, fixed. I already wondered why I was supposed to create a /viewcvs Alias in the apache configuration... Regards, Martin From ndbecker2 at gmail.com Fri Oct 28 01:32:23 2005 From: ndbecker2 at gmail.com (Neal Becker) Date: Thu, 27 Oct 2005 19:32:23 -0400 Subject: [Python-Dev] Help with inotify References: <43611604.4080404@v.loewis.de> <43612500.4040403@v.loewis.de> <43614070.2030007@v.loewis.de> Message-ID: "Martin v. Löwis" wrote: > I see. Python is making up the EISDIR, looking at the stat result. > In Objects/fileobject.c:dircheck generates the EISDIR error, which > apparently comes from posix_fdopen, PyFile_FromFile, > fill_file_fields. > > Python simply does not support file objects which stat(2) as directories. > OK, does python have a C API that would allow me to create a python file object from my C (C++) code? Then instead of using python's fdopen I could just do it myself.
From bcannon at gmail.com Fri Oct 28 01:46:58 2005 From: bcannon at gmail.com (Brett Cannon) Date: Thu, 27 Oct 2005 16:46:58 -0700 Subject: [Python-Dev] Conversion to Subversion is complete In-Reply-To: <1130409313.4360ad6139518@www.domainfactory-webmail.de> References: <1130409313.4360ad6139518@www.domainfactory-webmail.de> Message-ID: On 10/27/05, martin at v.loewis.de wrote: [SNIP] > Most of you are probably interested in checking out one of these > folders: > > svn+ssh://pythondev at svn.python.org/python/trunk > svn+ssh://pythondev at svn.python.org/python/branches/release24-maint > svn+ssh://pythondev at svn.python.org/peps > Why the entire 'peps' directory and not just the trunk like with 'python'? It looks like no tags or branches have ever been created for the PEPs and thus are not really needed. I am also curious as to what you would have me check out for the sandbox; whole directory or just the trunk? -Brett From bob at redivi.com Fri Oct 28 01:49:02 2005 From: bob at redivi.com (Bob Ippolito) Date: Thu, 27 Oct 2005 16:49:02 -0700 Subject: [Python-Dev] Help with inotify In-Reply-To: References: <43611604.4080404@v.loewis.de> <43612500.4040403@v.loewis.de> <43614070.2030007@v.loewis.de> Message-ID: <70FDA54D-3D84-4502-BE83-061BF4DBC101@redivi.com> On Oct 27, 2005, at 4:32 PM, Neal Becker wrote: > "Martin v. Löwis" wrote: > >> I see. Python is making up the EISDIR, looking at the stat result. >> In Objects/fileobject.c:dircheck generates the EISDIR error, which >> apparently comes from posix_fdopen, PyFile_FromFile, >> fill_file_fields. >> >> Python simply does not support file objects which stat(2) as >> directories. >> >> > > OK, does python have a C API that would allow me to create a python > file > object from my C (C++) code? Then instead of using python's fdopen > I could > just do it myself. Why do you need a file object for something that is not a file anyway?
select.select doesn't require file objects for example, just objects that have a fileno() method. -bob From ndbecker2 at gmail.com Fri Oct 28 01:58:05 2005 From: ndbecker2 at gmail.com (Neal Becker) Date: Thu, 27 Oct 2005 19:58:05 -0400 Subject: [Python-Dev] Help with inotify References: <43611604.4080404@v.loewis.de> <43612500.4040403@v.loewis.de> <43614070.2030007@v.loewis.de> <70FDA54D-3D84-4502-BE83-061BF4DBC101@redivi.com> Message-ID: Bob Ippolito wrote: > > On Oct 27, 2005, at 4:32 PM, Neal Becker wrote: > >> "Martin v. L?wis" wrote: >> >>> I see. Python is making up the EISDIR, looking at the stat result. >>> In Objects/fileobject.c:dircheck generates the EISDIR error, which >>> apparently comes from posix_fdopen, PyFile_FromFile, >>> fill_file_fields. >>> >>> Python simply does not support file objects which stat(2) as >>> directories. >>> >>> >> >> OK, does python have a C API that would allow me to create a python >> file >> object from my C (C++) code? Then instead of using python's fdopen >> I could >> just do it myself. > > Why do you need a file object for something that is not a file > anyway? select.select doesn't require file objects for example, just > objects that have a fileno() method. > Yes, that's a good point - the reason is I didn't want to restrict the interface to only work with select. Maybe I should rethink the interface. From bob at redivi.com Fri Oct 28 02:07:35 2005 From: bob at redivi.com (Bob Ippolito) Date: Thu, 27 Oct 2005 17:07:35 -0700 Subject: [Python-Dev] Help with inotify In-Reply-To: References: <43611604.4080404@v.loewis.de> <43612500.4040403@v.loewis.de> <43614070.2030007@v.loewis.de> <70FDA54D-3D84-4502-BE83-061BF4DBC101@redivi.com> Message-ID: <2B6C4ED6-4E9E-4154-AFEC-982E7CDEC182@redivi.com> On Oct 27, 2005, at 4:58 PM, Neal Becker wrote: > Bob Ippolito wrote: > > >> >> On Oct 27, 2005, at 4:32 PM, Neal Becker wrote: >> >> >>> "Martin v. L?wis" wrote: >>> >>> >>>> I see. 
Python is making up the EISDIR, looking at the stat result. >>>> In Objects/fileobject.c:dircheck generates the EISDIR error, which >>>> apparently comes from posix_fdopen, PyFile_FromFile, >>>> fill_file_fields. >>>> >>>> Python simply does not support file objects which stat(2) as >>>> directories. >>>> >>>> >>>> >>> >>> OK, does python have a C API that would allow me to create a python >>> file >>> object from my C (C++) code? Then instead of using python's fdopen >>> I could >>> just do it myself. >>> >> >> Why do you need a file object for something that is not a file >> anyway? select.select doesn't require file objects for example, just >> objects that have a fileno() method. >> >> > Yes, that's a good point - the reason is I didn't want to restrict the > interface to only work with select. Maybe I should rethink the > interface. Well what would the interface do if you had a file object? Are you supposed to be able to read/write/seek/tell/etc.? I don't understand why you're trying to do what you're doing. select.select was just an example, select.poll's register/unregister takes any object with a fileno also. Note that socket isn't a file and it has a fileno also. Since what you have isn't a file, chances are returning a file object is a bug not a feature. -bob From nyamatongwe at gmail.com Fri Oct 28 02:21:16 2005 From: nyamatongwe at gmail.com (Neil Hodgson) Date: Fri, 28 Oct 2005 10:21:16 +1000 Subject: [Python-Dev] Divorcing str and unicode (no more implicitconversions). In-Reply-To: <20051026105934.3977.JCARLSON@uci.edu> References: <435F5149.4060804@egenix.com> <435FBDC4.8030300@v.loewis.de> <20051026105934.3977.JCARLSON@uci.edu> Message-ID: <50862ebd0510271721i77c1ebb4x6bcb39a4756c3a99@mail.gmail.com> Josiah Carlson: > According to wikipedia (http://en.wikipedia.org/wiki/Latin_alphabet), > various languages have adopted a transliteration of their language > and/or former alphabets into latin. 
They don't purport to know all of > the reasons why, and I'm not going to speculate. I used to work on software written by Japanese and English speakers at Fujitsu with most developers being Japanese. The rules were that comments could be in Japanese but identifiers were only allowed to contain ASCII characters. Most variable names were poorly chosen with s, p, q, fla (boolean=flag) and flafla being popular. When I asked some Japanese coders why they didn't use Japanese words expressed in ASCII (Romaji), their response was that it was a really weird idea. This is anecdotal but it appears to me that transliterations are not commonly used apart from learning languages and some minimal help for foreigners such as including transliterated names on railway station name boards. Neil From ndbecker2 at gmail.com Fri Oct 28 02:33:12 2005 From: ndbecker2 at gmail.com (Neal Becker) Date: Thu, 27 Oct 2005 20:33:12 -0400 Subject: [Python-Dev] Help with inotify References: <43611604.4080404@v.loewis.de> <43612500.4040403@v.loewis.de> <43614070.2030007@v.loewis.de> <70FDA54D-3D84-4502-BE83-061BF4DBC101@redivi.com> <2B6C4ED6-4E9E-4154-AFEC-982E7CDEC182@redivi.com> Message-ID: Bob Ippolito wrote: > > On Oct 27, 2005, at 4:58 PM, Neal Becker wrote: > >> Bob Ippolito wrote: >> >> >>> >>> On Oct 27, 2005, at 4:32 PM, Neal Becker wrote: >>> >>> >>>> "Martin v. L?wis" wrote: >>>> >>>> >>>>> I see. Python is making up the EISDIR, looking at the stat result. >>>>> In Objects/fileobject.c:dircheck generates the EISDIR error, which >>>>> apparently comes from posix_fdopen, PyFile_FromFile, >>>>> fill_file_fields. >>>>> >>>>> Python simply does not support file objects which stat(2) as >>>>> directories. >>>>> >>>>> >>>>> >>>> >>>> OK, does python have a C API that would allow me to create a python >>>> file >>>> object from my C (C++) code? Then instead of using python's fdopen >>>> I could >>>> just do it myself. 
>>>> >>> >>> Why do you need a file object for something that is not a file >>> anyway? select.select doesn't require file objects for example, just >>> objects that have a fileno() method. >>> >>> >> Yes, that's a good point - the reason is I didn't want to restrict the >> interface to only work with select. Maybe I should rethink the >> interface. > > Well what would the interface do if you had a file object? Are you > supposed to be able to read/write/seek/tell/etc.? I don't understand > why you're trying to do what you're doing. select.select was just an > example, select.poll's register/unregister takes any object with a > fileno also. > Yes, you are supposed to be able to read and get information. However, I have implemented fileno for it, so you can use select.select on it if you just want to wait for something to happen - which is probably all that's really needed. I also implemented select as a method of my inotify object, in case you prefer that. Here's an excerpt from documentation/filesystems/inotify.txt: ----------------- Events are provided in the form of an inotify_event structure that is read(2) from a given inotify instance. The filename is of dynamic length and follows the struct. It is of size len. The filename is padded with null bytes to ensure proper alignment. This padding is reflected in len. You can slurp multiple events by passing a large buffer, for example size_t len = read (fd, buf, BUF_LEN); Where "buf" is a pointer to an array of "inotify_event" structures at least BUF_LEN bytes in size. The above example will return as many events as are available and fit in BUF_LEN. Each inotify instance fd is also select()- and poll()-able. 
----------------- From bcannon at gmail.com Fri Oct 28 02:42:42 2005 From: bcannon at gmail.com (Brett Cannon) Date: Thu, 27 Oct 2005 17:42:42 -0700 Subject: [Python-Dev] Conversion to Subversion is complete In-Reply-To: <1130409313.4360ad6139518@www.domainfactory-webmail.de> References: <1130409313.4360ad6139518@www.domainfactory-webmail.de> Message-ID: I have started a svn section in the dev FAQ (http://www.python.org/dev/devfaq.html) pertaining to checking out a project from the repository and other stuff discussed so far. If something is not clear or people feel a step is missing, let me know. I will remove the CVS section once Martin has tossed the CVS repository on SF. -Brett On 10/27/05, martin at v.loewis.de wrote: > The Python source code repository is now converted to subversion; > please feel free to start checking out new sandboxes. For a few > days, this installation probably still needs to be considered in > testing. If there are no serious problems found by next Monday, > I would consider conversion of the data complete. The CVS repository > will be kept available read-only for a while longer, so you can > easily forward any patches you may have. > > Most of you are probably interested in checking out one of these > folders: > > svn+ssh://pythondev at svn.python.org/python/trunk > svn+ssh://pythondev at svn.python.org/python/branches/release24-maint > svn+ssh://pythondev at svn.python.org/peps > > The anonymous read-only equivalents of these are > > http://svn.python.org/projects/python/trunk > http://svn.python.org/projects/python/branches/release24-maint > http://svn.python.org/projects/peps > > As mentioned before, in addition to "plain" http/WebDAV, > viewcvs is available at > > http://svn.python.org/view/ > > There are some more things left to be done, such as updating > the developer documentation. I'll start working on that soon, > but contributions are welcome. 
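[Editor's note: returning to the inotify thread — the inotify_event layout quoted earlier from Documentation/filesystems/inotify.txt (an int wd, then uint32 mask, cookie, and len, followed by a NUL-padded name of len bytes) can be unpacked with Python's struct module. A sketch, exercised here against a hand-built buffer rather than a real inotify descriptor:]

```python
import struct

# struct inotify_event { int wd; uint32_t mask; uint32_t cookie;
#                        uint32_t len; char name[]; }  (native byte order)
EVENT_HEADER = struct.Struct("iIII")


def parse_events(buf):
    """Yield (wd, mask, cookie, name) tuples from a buffer as returned
    by read(2) on an inotify fd; multiple events may be packed in one read."""
    offset = 0
    while offset + EVENT_HEADER.size <= len(buf):
        wd, mask, cookie, name_len = EVENT_HEADER.unpack_from(buf, offset)
        offset += EVENT_HEADER.size
        # name is padded with NUL bytes up to name_len; strip the padding
        name = buf[offset:offset + name_len].rstrip(b"\0")
        offset += name_len
        yield wd, mask, cookie, name


# Hand-built buffer standing in for a real read() from an inotify fd:
IN_CREATE = 0x100  # mask value from <sys/inotify.h>
demo = EVENT_HEADER.pack(1, IN_CREATE, 0, 16) + b"newfile.txt".ljust(16, b"\0")
events = list(parse_events(demo))
```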
> > Regards, > Martin > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/brett%40python.org > From tim.peters at gmail.com Fri Oct 28 03:27:18 2005 From: tim.peters at gmail.com (Tim Peters) Date: Thu, 27 Oct 2005 21:27:18 -0400 Subject: [Python-Dev] Conversion to Subversion is complete In-Reply-To: References: <1130409313.4360ad6139518@www.domainfactory-webmail.de> Message-ID: <1f7befae0510271827t5bef2009l5af731679e38acfd@mail.gmail.com> [Brett Cannon] > I have started a svn section in the dev FAQ > (http://www.python.org/dev/devfaq.html) pertaining to checking out a > project from the repository and other stuff discussed so far. If > something is not clear or people feel a step is missing, let me know. Thanks, Brett! I'm just starting this trek, in slow motion, and that was a real help. From skip at pobox.com Fri Oct 28 04:53:26 2005 From: skip at pobox.com (skip@pobox.com) Date: Thu, 27 Oct 2005 21:53:26 -0500 Subject: [Python-Dev] Conversion to Subversion is complete In-Reply-To: References: <1130409313.4360ad6139518@www.domainfactory-webmail.de> Message-ID: <17249.37542.578327.785926@montanaro.dyndns.org> Brett> I have started a svn section in the dev FAQ Brett> (http://www.python.org/dev/devfaq.html) pertaining to checking Brett> out a project from the repository and other stuff discussed so Brett> far. If something is not clear or people feel a step is missing, Brett> let me know. We're starting to look at how much information we can push over to the Wiki. Any pages where multiple people might contribute, especially if they are not the typical website maintainers, seem like good Wiki candidates to me. That goes double for anything FAQ-ish.
Skip From bcannon at gmail.com Fri Oct 28 05:03:54 2005 From: bcannon at gmail.com (Brett Cannon) Date: Thu, 27 Oct 2005 20:03:54 -0700 Subject: [Python-Dev] Conversion to Subversion is complete In-Reply-To: <17249.37542.578327.785926@montanaro.dyndns.org> References: <1130409313.4360ad6139518@www.domainfactory-webmail.de> <17249.37542.578327.785926@montanaro.dyndns.org> Message-ID: On 10/27/05, skip at pobox.com wrote: > > Brett> I have started a svn section in the dev FAQ > Brett> (http://www.python.org/dev/devfaq.html) pertaining to checking > Brett> out a project from the repository and other stuff discussed so > Brett> far. If something is not clear or people feel a step is missing, > Brett> let me know. > > We're starting to look at how much information we can push over to the Wiki. > Any pages where multiple people might contribute, especially if they are not > the typical website maintainers, seems to me like good Wiki candidates to > me. That goes double for anything FAQ-ish. > I guess, but I just don't like wikis personally so I have no inclination to make the conversion. If someone wants to make the conversion over to the wiki and keep it up that's fine, but I have no problem keeping the dev FAQ updated like I have for CVS in the past. -Brett From bcannon at gmail.com Fri Oct 28 05:15:38 2005 From: bcannon at gmail.com (Brett Cannon) Date: Thu, 27 Oct 2005 20:15:38 -0700 Subject: [Python-Dev] PEP 352: Required Superclass for Exceptions Message-ID: Well, I am at it again, but this time Guido is a co-conspirator. We wrote a PEP that introduces BaseException and moves KeyboardInterrupt and SystemExit. Even if you followed the discussion for PEP 348 you should read the PEP since I am sure there will be something that someone doesn't like, such as the transition plan or how I didn't use British English throughout. 
=) Anyway, as soon as the cron job posts the PEP to the web site (already checked into the new svn repository) have a read and start expounding about how wonderful it is and that there is no qualms with it whatsoever. =) -Brett From fdrake at acm.org Fri Oct 28 05:53:26 2005 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Thu, 27 Oct 2005 23:53:26 -0400 Subject: [Python-Dev] Conversion to Subversion is complete In-Reply-To: References: <1130409313.4360ad6139518@www.domainfactory-webmail.de> <17249.37542.578327.785926@montanaro.dyndns.org> Message-ID: <200510272353.27064.fdrake@acm.org> On Thursday 27 October 2005 23:03, Brett Cannon wrote: > I guess, but I just don't like wikis personally so I have no > inclination to make the conversion. If someone wants to make the > conversion over to the wiki and keep it up that's fine, but I have no > problem keeping the dev FAQ updated like I have for CVS in the past. And I'm sure we all appreciate your efforts! I certainly do. Regarding using the wiki... I have mixed feelings. Wikis are really, really good for some things. Anything that's "how-to" based on technology (how to use SVN, CVS, etc.) seems like a reasonable candidate, because we get the advantages of peer review. For things that describe policy, I don't think that's so great. For policy (how to use SVN for Python development, because we have certain rules), I think we want to maintain strict editorial control. -Fred -- Fred L. Drake, Jr. From martin at v.loewis.de Fri Oct 28 07:15:36 2005 From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=) Date: Fri, 28 Oct 2005 07:15:36 +0200 Subject: [Python-Dev] Conversion to Subversion is complete In-Reply-To: References: <1130409313.4360ad6139518@www.domainfactory-webmail.de> Message-ID: <4361B3F8.4000304@v.loewis.de> Brett Cannon wrote: > Why the entire 'peps' directory and not just the trunk like with > 'python'? 
It looks like no tags or branches have ever been created > for the PEPs and thus are not really needed. Right. > I am also curious as to what you would have me check out for the > sandbox; whole directory or just the trunk? You would usually only check out the trunk (unless you want to work on a branch, of course). Regards, Martin From stephen at xemacs.org Fri Oct 28 09:44:42 2005 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Fri, 28 Oct 2005 16:44:42 +0900 Subject: [Python-Dev] Divorcing str and unicode (no more implicitconversions). In-Reply-To: <50862ebd0510271721i77c1ebb4x6bcb39a4756c3a99@mail.gmail.com> (Neil Hodgson's message of "Fri, 28 Oct 2005 10:21:16 +1000") References: <435F5149.4060804@egenix.com> <435FBDC4.8030300@v.loewis.de> <20051026105934.3977.JCARLSON@uci.edu> <50862ebd0510271721i77c1ebb4x6bcb39a4756c3a99@mail.gmail.com> Message-ID: <87irvikrv9.fsf@tleepslib.sk.tsukuba.ac.jp> >>>>> "Neil" == Neil Hodgson writes: Neil> Most variable names were poorly chosen with s, p, q, fla Neil> (boolean=flag) and flafla being popular. When I asked some Neil> Japanese coders why they didn't use Japanese words expressed Neil> in ASCII (Romaji), their response was that it was a really Neil> weird idea. That may be due to the fact that two-ideograph words will often have a dozen homonyms, and sometimes several dozen. I sometimes use kanji in not-for-general-distribution Emacs LISP code when 2 kanji will give as expressive an identifier as 10 or 15 ASCII characters. Neil> This is anecdotal but it appears to me that transliterations Neil> are not commonly used apart from learning languages In everyday usage, they're used a lot for identifier-like purposes like corporate logos. 
The only large corpuses of Japanese-oriented Japanese-authored code I'm familiar with are the input methods Wnn, Canna, and SKK, and these invariably use transliterated Japanese grammatical terms for parser components[1], although there are perfectly good equivalents in English, at least (I think they may actually be standardized by the Ministry of Education). There's also an Emacs library, edict.el, that uses _mixed_ ASCII-hiragana-kanji identifiers. (ISTR that was done just to prove a point---the person who wrote it was an American, I believe---definitely not Japanese.) Footnotes: [1] Japanese does not require word delimiters, so input methods must have grammatical knowledge to choose among large numbers of homonyms. -- School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp University of Tsukuba Tennodai 1-1-1 Tsukuba 305-8573 JAPAN Ask not how you can "do" free software business; ask what your business can "do for" free software. From theller at python.net Fri Oct 28 09:53:31 2005 From: theller at python.net (Thomas Heller) Date: Fri, 28 Oct 2005 09:53:31 +0200 Subject: [Python-Dev] Conversion to Subversion is complete References: <1130409313.4360ad6139518@www.domainfactory-webmail.de> Message-ID: <8xwe6ps4.fsf@python.net> martin at v.loewis.de writes: > The Python source code repository is now converted to subversion; > please feel free to start checking out new sandboxes. For a few > days, this installation probably still needs to be considered in > testing. If there are no serious problems found by next Monday, > I would consider conversion of the data complete. The CVS repository > will be kept available read-only for a while longer, so you can > easily forward any patches you may have. 
> > Most of you are probably interested in checking out one of these > folders: > > svn+ssh://pythondev at svn.python.org/python/trunk > svn+ssh://pythondev at svn.python.org/python/branches/release24-maint > svn+ssh://pythondev at svn.python.org/peps > Works out of the box for me, thanks, Martin (but we have debugged this before). Can anyone recommend an XEmacs svn plugin to use - I've tried psvn.el from http://www.xsteve.at/prg/emacs/psvn.el which seems to work? Thomas From mwh at python.net Fri Oct 28 10:34:30 2005 From: mwh at python.net (Michael Hudson) Date: Fri, 28 Oct 2005 09:34:30 +0100 Subject: [Python-Dev] Conversion to Subversion is complete In-Reply-To: <8xwe6ps4.fsf@python.net> (Thomas Heller's message of "Fri, 28 Oct 2005 09:53:31 +0200") References: <1130409313.4360ad6139518@www.domainfactory-webmail.de> <8xwe6ps4.fsf@python.net> Message-ID: <2mbr1a3uqx.fsf@starship.python.net> Thomas Heller writes: > Can anyone recommend an XEmacs svn plugin to use - I've tried psvn.el > from http://www.xsteve.at/prg/emacs/psvn.el which seems to work? I've heard http://www.xsteve.at/prg/emacs/psvn.el works :) I also have vc-svn.el installed (I think it's from the subversion source, but it might be part of newer emacs distributions). Cheers, mwh -- INEFFICIENT CAPITALIST YOUR OPULENT TOILET WILL BE YOUR UNDOING -- from Twisted.Quotes From theller at python.net Fri Oct 28 11:54:08 2005 From: theller at python.net (Thomas Heller) Date: Fri, 28 Oct 2005 11:54:08 +0200 Subject: [Python-Dev] Conversion to Subversion is complete References: <1130409313.4360ad6139518@www.domainfactory-webmail.de> <8xwe6ps4.fsf@python.net> <2mbr1a3uqx.fsf@starship.python.net> Message-ID: <3bmm6k73.fsf@python.net> Michael Hudson writes: > Thomas Heller writes: > >> Can anyone recommend an XEmacs svn plugin to use - I've tried psvn.el >> from http://www.xsteve.at/prg/emacs/psvn.el which seems to work? 
> > I've heard http://www.xsteve.at/prg/emacs/psvn.el works :) > > I also have vc-svn.el installed (I think it's from the subversion > source, but it might be part of newer emacs distributions). I've heard that vc-svn.el does NOT work with Xemacs (note the X), but haven't tried it myself. Thomas From orent at hishome.net Fri Oct 28 12:20:38 2005 From: orent at hishome.net (Oren Tirosh) Date: Fri, 28 Oct 2005 12:20:38 +0200 Subject: [Python-Dev] Divorcing str and unicode (no more implicitconversions). In-Reply-To: <50862ebd0510271721i77c1ebb4x6bcb39a4756c3a99@mail.gmail.com> References: <435F5149.4060804@egenix.com> <435FBDC4.8030300@v.loewis.de> <20051026105934.3977.JCARLSON@uci.edu> <50862ebd0510271721i77c1ebb4x6bcb39a4756c3a99@mail.gmail.com> Message-ID: <7168d65a0510280320j75a51f1btd71a109cd5d604f5@mail.gmail.com> On 10/28/05, Neil Hodgson wrote: > I used to work on software written by Japanese and English speakers > at Fujitsu with most developers being Japanese. The rules were that > comments could be in Japanese but identifiers were only allowed to > contain ASCII characters. Most variable names were poorly chosen with > s, p, q, fla (boolean=flag) and flafla being popular. When I asked > some Japanese coders why they didn't use Japanese words expressed in > ASCII (Romaji), their response was that it was a really weird idea. > > This is anecdotal but it appears to me that transliterations are > not commonly used apart from learning languages and some minimal help > for foreigners such as including transliterated names on railway > station name boards. Israeli programmers generally use English identifiers but transliterations are common for local business terminology: types of financial instruments, tax and insurance terminology, employee benefit plans etc. Yes, it looks weird, but it would be rather pointless to try to translate them. 
Even native English speakers would find it difficult to recognize the translations because they are used to using them as loan words. Only transliteration (or possibly the use of non-ASCII identifiers) would make sense in this situation and I do not think it is unique to Israel. BTW, I heard about a Cobol shop that had an explicit policy of using only transliterated identifiers. This resulted in a much smaller chance of hitting one of Cobol's numerous reserved words. Thankfully, this is not an issue in Python... Oren From ncoghlan at gmail.com Fri Oct 28 13:12:42 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 28 Oct 2005 21:12:42 +1000 Subject: [Python-Dev] PEP 352: Required Superclass for Exceptions In-Reply-To: References: Message-ID: <436207AA.20506@gmail.com> Brett Cannon wrote: > Anyway, as soon as the cron job posts the PEP to the web site (already > checked into the new svn repository) have a read and start expounding > about how wonderful it is and that there is no qualms with it > whatsoever. =) You mean aside from the implementation of __getitem__ being broken in BaseException*? ;) Aside from that, I actually do have one real problem and one observation. The problem: The value of ex.args The PEP as written significantly changes the semantics of ex.args - instead of being an empty tuple when no arguments are provided, it is instead a singleton tuple containing the empty string. A backwards compatible definition of BaseException.__init__ would be: def __init__(self, *args): self.args = args self.message = '' if not args else args[0] The observation: The value of ex.message Under PEP 352 the concept of allowing "return x" to be used in a generator to mean "raise StopIteration(x)" would actually align quite well. A bare "return", however, would need to be changed to translate to "raise StopIteration(None)" rather than its current "raise StopIteration" in order to get the correct value (None) into ex.message. Cheers, Nick. 
* (self.args[0] is self.message due to the way __init__ is written, but __getitem__ assumes self.message isn't in self.args) -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.blogspot.com From ncoghlan at gmail.com Fri Oct 28 13:28:54 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 28 Oct 2005 21:28:54 +1000 Subject: [Python-Dev] Freezing the CVS on Oct 26 for SVN switchover In-Reply-To: <200510271423.55919.fdrake@acm.org> References: <435BC27C.1010503@v.loewis.de> <200510271340.04668.fdrake@acm.org> <43611973.9030100@v.loewis.de> <200510271423.55919.fdrake@acm.org> Message-ID: <43620B76.8060308@gmail.com> Fred L. Drake, Jr. wrote: > On Thursday 27 October 2005 14:16, Martin v. Löwis wrote: > > I think I would request a separate address; I don't think I want to get > > all webmaster email. > > I like the idea of a separate address as well. Perhaps the radically named svnaccess at python.org? Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.blogspot.com From barry at python.org Fri Oct 28 14:11:19 2005 From: barry at python.org (Barry Warsaw) Date: Fri, 28 Oct 2005 08:11:19 -0400 Subject: [Python-Dev] Conversion to Subversion is complete In-Reply-To: <8xwe6ps4.fsf@python.net> References: <1130409313.4360ad6139518@www.domainfactory-webmail.de> <8xwe6ps4.fsf@python.net> Message-ID: <1130501479.5145.43.camel@geddy.wooz.org> On Fri, 2005-10-28 at 03:53, Thomas Heller wrote: > Can anyone recommend an XEmacs svn plugin to use - I've tried psvn.el > from http://www.xsteve.at/prg/emacs/psvn.el which seems to work? Yep, that's the one I use, albeit a few revs back from what's up there now. It's had some performance problems in the past, but is generally pretty good these days.
I've had issues with it bogging down XEmacs /after/ running stat on a very large tree. It seems (seemed?) as though it was still hogging cpu even after the actual back-end svn command was finished. My only other nit is that I wish I could svn stat a few directories at a time. Say I know that only directory A and B are out of date. At the command line I can say "svn stat A B" or "svn commit A B". Can't really do that in psvn.el, but I can understand why that's problematic. I also would love it to be hooked into vc-mode too, for modeline updates and commits of single files. I can understand why those things aren't there yet though. All in all psvn.el works very well, although it's not (for me) a complete replacement for the command line. -Barry From guido at python.org Fri Oct 28 17:22:09 2005 From: guido at python.org (Guido van Rossum) Date: Fri, 28 Oct 2005 08:22:09 -0700 Subject: [Python-Dev] PEP 352: Required Superclass for Exceptions In-Reply-To: <436207AA.20506@gmail.com> References: <436207AA.20506@gmail.com> Message-ID: On 10/28/05, Nick Coghlan wrote: > Brett Cannon wrote: > > Anyway, as soon as the cron job posts the PEP to the web site (already > > checked into the new svn repository) have a read and start expounding > > about how wonderful it is and that there is no qualms with it > > whatsoever. =) > > You mean aside from the implementation of __getitem__ being broken in > BaseException*? ;) Are you clairvoyant?! The cronjob was broken due to the SVN transition and the file wasn't on the site yet. (Now fixed BTW.) Oh, and here's the URL just in case: http://www.python.org/peps/pep-0352.html > Aside from that, I actually do have one real problem and one observation.
> > The problem: The value of ex.args > > The PEP as written significantly changes the semantics of ex.args - instead > of being an empty tuple when no arguments are provided, it is instead a > singleton tuple containing the empty string. > > A backwards compatible definition of BaseException.__init__ would be: > > def __init__(self, *args): > self.args = args > self.message = '' if not args else args[0] But does anyone care? As long as args exists and is a tuple, does it matter that it doesn't match the argument list when the latter was empty? IMO the protocol mostly says that ex.args exists and is a tuple -- the values in there can't be relied upon in pre-2.5-Python. Exceptions that have specific information should store it in a different place, not in ex.args. > The observation: The value of ex.message > > Under PEP 352 the concept of allowing "return x" to be used in a generator > to mean "raise StopIteration(x)" would actually align quite well. A bare > "return", however, would need to be changed to translate to "raise > StopIteration(None)" rather than its current "raise StopIteration" in order to > get the correct value (None) into ex.message. Since ex.message is new, how can you say that it should have the value None? IMO the whole idea is that ex.message should always be a string going forward (although I'm not going to add a typecheck to enforce this). -- --Guido van Rossum (home page: http://www.python.org/~guido/) From bcannon at gmail.com Fri Oct 28 21:12:57 2005 From: bcannon at gmail.com (Brett Cannon) Date: Fri, 28 Oct 2005 12:12:57 -0700 Subject: [Python-Dev] Conversion to Subversion is complete In-Reply-To: <200510272353.27064.fdrake@acm.org> References: <1130409313.4360ad6139518@www.domainfactory-webmail.de> <17249.37542.578327.785926@montanaro.dyndns.org> <200510272353.27064.fdrake@acm.org> Message-ID: On 10/27/05, Fred L. Drake, Jr. 
wrote: > On Thursday 27 October 2005 23:03, Brett Cannon wrote: > > I guess, but I just don't like wikis personally so I have no > > inclination to make the conversion. If someone wants to make the > > conversion over to the wiki and keep it up that's fine, but I have no > > problem keeping the dev FAQ updated like I have for CVS in the past. > > And I'm sure we all appreciate your efforts! I certainly do. > > Regarding using the wiki... I have mixed feelings. Wikis are really, really > good for some things. Anything that's "how-to" based on technology (how to > use SVN, CVS, etc.) seems like a reasonable candidate, because we get the > advantages of peer review. > > For things that describe policy, I don't think that's so great. For policy > (how to use SVN for Python development, because we have certain rules), I > think we want to maintain strict editorial control. > I like that explanation more than mine. =) So I am just going to keep the FAQ up then. If there is anything at http://www.python.org/dev/svn.html people feel should be moved over to the FAQ that has not occurred yet, let me know. Please have personal experience, though, with what you want added so as to make sure the information is relevant (e.g., Tim suffering through getting an SSH 2 key for Windows and what is exactly needed, complete with screenshots =) . -Brett From bcannon at gmail.com Fri Oct 28 21:29:35 2005 From: bcannon at gmail.com (Brett Cannon) Date: Fri, 28 Oct 2005 12:29:35 -0700 Subject: [Python-Dev] PEP 352: Required Superclass for Exceptions In-Reply-To: References: <436207AA.20506@gmail.com> Message-ID: On 10/28/05, Guido van Rossum wrote: > On 10/28/05, Nick Coghlan wrote: > > Brett Cannon wrote: > > > Anyway, as soon as the cron job posts the PEP to the web site (already > > > checked into the new svn repository) have a read and start expounding > > > about how wonderful it is and that there is no qualms with it > > > whatsoever. 
=) > > > > You mean aside from the implementation of __getitem__ being broken in > > BaseException*? ;) > > Are you clairvoyant?! The cronjob wass broken due to the SVN > transition and the file wasn't on the site yet. (Now fixed BTW.) Oh, > and here's the URL just in case: > http://www.python.org/peps/pep-0352.html > Nick got the python-checkins email and then read the PEP from the repository (or at least that is what I assume since that is how Neal managed to catch the PEP literally in under 5 minutes after checkin). > > Aside from that, I actually do have one real problem and one observation. > > > > The problem: The value of ex.args > > > > The PEP as written significantly changes the semantics of ex.args - instead > > of being an empty tuple when no arguments are provided, it is instead a > > singleton tuple containing the empty string. > > > > A backwards compatible definition of BaseException.__init__ would be: > > > > def __init__(self, *args): > > self.args = args > > self.message = '' if not args else args[0] > > But does anyone care? As long as args exists and is a tuple, does it > matter that it doesn't match the argument list when the latter was > empty? IMO the protocol mostly says that ex.args exists and is a tuple > -- the values in there can't be relied upon in pre-2.5-Python. > Exceptions that have specific information should store it in a > different place, not in ex.args. > Looking at http://docs.python.org/lib/module-exceptions.html , it looks like Guido is right. All it ever says is that it is a tuple and that any passed-in arguments go into 'args'; nothing about its default value if no arguments are passed in. But I personally have no qualms changing it if people want it, so -0 from me on making it more backwards-compatible. > > The observation: The value of ex.message > > > > Under PEP 352 the concept of allowing "return x" to be used in a generator > > to mean "raise StopIteration(x)" would actually align quite well. 
A bare > > "return", however, would need to be changed to translate to "raise > > StopIteration(None)" rather than its current "raise StopIteration" in order to > > get the correct value (None) into ex.message. > > Since ex.message is new, how can you say that it should have the value > None? IMO the whole idea is that ex.message should always be a string > going forward (although I'm not going to add a typecheck to enforce > this). > My feeling exactly on 'message'. -Brett From raymond.hettinger at verizon.net Fri Oct 28 22:16:00 2005 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Fri, 28 Oct 2005 16:16:00 -0400 Subject: [Python-Dev] PEP 352 Transition Plan In-Reply-To: <20051028193558.77BDF1E407C@bag.python.org> Message-ID: <007c01c5dbfc$6a46e600$b62dc797@oemcomputer> I don't follow why the PEP deprecates catching a category of exceptions in a different release than it deprecates raising them. Why would a release allow catching something that cannot be raised? I must be missing something here. Raymond From guido at python.org Fri Oct 28 22:29:20 2005 From: guido at python.org (Guido van Rossum) Date: Fri, 28 Oct 2005 13:29:20 -0700 Subject: [Python-Dev] PEP 352 Transition Plan In-Reply-To: <007c01c5dbfc$6a46e600$b62dc797@oemcomputer> References: <20051028193558.77BDF1E407C@bag.python.org> <007c01c5dbfc$6a46e600$b62dc797@oemcomputer> Message-ID: On 10/28/05, Raymond Hettinger wrote: > I don't follow why the PEP deprecates catching a category of exceptions > in a different release than it deprecates raising them. Why would a > release allow catching something that cannot be raised? I must be > missing something here. So conforming code can catch exceptions raised by not-yet conforming code. 
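The ordering Guido describes (deprecate raising first, deprecate catching only later) can be sketched concretely. The following is an editorial illustration in modern Python, not code from the PEP; `LegacyError` and `old_library_call` are invented names standing in for a not-yet-conforming exception and the code that still raises it:

```python
import warnings

class LegacyError(Exception):
    """Hypothetical stand-in for a not-yet-conforming exception."""

def old_library_call():
    # Not-yet-conforming code: raising is deprecated but still works,
    # so it emits a warning rather than failing outright.
    warnings.warn("raising LegacyError is deprecated", DeprecationWarning)
    raise LegacyError("from old code")

# Conforming code: catching is deliberately NOT deprecated yet, so this
# handler keeps working throughout the transition window.
with warnings.catch_warnings():
    warnings.simplefilter("ignore", DeprecationWarning)
    try:
        old_library_call()
    except LegacyError as exc:
        caught = str(exc)

print(caught)  # → from old code
```

The point of the staggering is visible here: the `except` clause must remain legal for as long as any library on the caller's path might still raise the old form.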
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From rhamph at gmail.com Fri Oct 28 22:32:40 2005 From: rhamph at gmail.com (Adam Olsen) Date: Fri, 28 Oct 2005 14:32:40 -0600 Subject: [Python-Dev] PEP 352 Transition Plan In-Reply-To: <007c01c5dbfc$6a46e600$b62dc797@oemcomputer> References: <20051028193558.77BDF1E407C@bag.python.org> <007c01c5dbfc$6a46e600$b62dc797@oemcomputer> Message-ID: On 10/28/05, Raymond Hettinger wrote: > I don't follow why the PEP deprecates catching a category of exceptions > in a different release than it deprecates raising them. Why would a > release allow catching something that cannot be raised? I must be > missing something here. Presumably because they CAN still be raised; attempting to do so provokes a warning, not an error. It also facilitates upgrading from old versions of Python. You can work to eliminate cases where the exceptions are raised while still handling them if they do get raised. -- Adam Olsen, aka Rhamphoryncus From raymond.hettinger at verizon.net Fri Oct 28 22:44:53 2005 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Fri, 28 Oct 2005 16:44:53 -0400 Subject: [Python-Dev] PEP 352 Transition Plan In-Reply-To: Message-ID: <007d01c5dc00$738da2e0$b62dc797@oemcomputer> > > Why would a > > release allow catching something that cannot be raised? I must be > > missing something here. > > So conforming code can catch exceptions raised by not-yet conforming code. That makes sense. What was the rationale for pushing the deprecation of __getitem__ and args back to Py2.8? Is there a disadvantage to doing it earlier? On the flip side, is there any reason it has to be done at all prior to Py3.0? That change seems orthogonal to the rest of the proposal and has its own pluses and minuses (simplification on the plus-side and code-breakage on the minus-side).
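For readers following along, the two APIs Raymond asks about can be contrasted in a short editorial sketch (modern `except ... as` syntax; the Python 2 `e[0]` form that `Exception.__getitem__` enabled appears only in a comment):

```python
try:
    raise TypeError("inner detail", "extra context")
except TypeError as e:
    # Python 2's Exception.__getitem__ allowed indexing the exception
    # directly as e[0]; the args tuple carries the same information.
    first = e.args[0]
    rest = e.args[1:]

print(first, rest)  # → inner detail ('extra context',)
```

Dropping `__getitem__` therefore loses no information, only the indexing shorthand.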
FWIW, the args tuple does have a legitimate use case as one solution to the problem of exception chaining (keeping the old info intact, but adding new info as an extra field): try: raise TypeError('inner detail') except TypeError, e: args = e.args + ('outer context',) raise TypeError(*args) Raymond From martin at v.loewis.de Sat Oct 29 00:07:23 2005 From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=) Date: Sat, 29 Oct 2005 00:07:23 +0200 Subject: [Python-Dev] Help with inotify In-Reply-To: References: <43611604.4080404@v.loewis.de> <43612500.4040403@v.loewis.de> <43614070.2030007@v.loewis.de> Message-ID: <4362A11B.8090305@v.loewis.de> Neal Becker wrote: > OK, does python have a C API that would allow me to create a python file > object from my C (C++) code? Then instead of using python's fdopen I could > just do it myself. I don't know - you will have to read the python source to find out (this is actually not a pythondev question anymore). Regards, Martin From martin at v.loewis.de Sat Oct 29 00:14:53 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 29 Oct 2005 00:14:53 +0200 Subject: [Python-Dev] Conversion to Subversion is complete In-Reply-To: References: <1130409313.4360ad6139518@www.domainfactory-webmail.de> Message-ID: <4362A2DD.5090704@v.loewis.de> Brett Cannon wrote: > I have started a svn section in the dev FAQ > (http://www.python.org/dev/devfaq.html) pertaining to checking out a > project from the repository and other stuff discussed so far. If > something is not clear or people feel a step is missing, let me know. One thing that should be carried over from svn.html is how to set up Putty on Windows. The issue is that subversion will look for an ssh binary in its path, and if there is none, it fails. Saying [tunnels] ssh="c:/program files/putty/plink.exe" -T in subversion's config file does the trick (see svn.html). If you use a different SSH client, you need to adjust the configuration accordingly.
FYI, -T specifies to not allocate a terminal. plink has the nice feature of giving GUI feedback if there is no terminal for interactive feedback (such as whether the remote key is trusted). This makes it useful for TortoiseSVN. Regards, Martin From martin at v.loewis.de Sat Oct 29 00:17:48 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 29 Oct 2005 00:17:48 +0200 Subject: [Python-Dev] Conversion to Subversion is complete In-Reply-To: <4361B3F8.4000304@v.loewis.de> References: <1130409313.4360ad6139518@www.domainfactory-webmail.de> <4361B3F8.4000304@v.loewis.de> Message-ID: <4362A38C.9050400@v.loewis.de> Martin v. Löwis wrote: >>I am also curious as to what you would have me check out for the >>sandbox; whole directory or just the trunk? > > > You would usually only check out the trunk (unless you want to work > on a branch, of course). Actually, you would probably check out a sandbox subdirectory, such as http://svn.python.org/projects/sandbox/trunk/decimal/ (say). We don't have a policy for making tags or branches for single directories only; I would suggest that either "tags/decimal-1.0" or "tags/decimal/1.0" would be acceptable (depending on how frequently you anticipate making tags, perhaps). Regards, Martin From martin at v.loewis.de Sat Oct 29 00:21:03 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 29 Oct 2005 00:21:03 +0200 Subject: [Python-Dev] Divorcing str and unicode (no more implicitconversions).
In-Reply-To: <50862ebd0510271721i77c1ebb4x6bcb39a4756c3a99@mail.gmail.com> References: <435F5149.4060804@egenix.com> <435FBDC4.8030300@v.loewis.de> <20051026105934.3977.JCARLSON@uci.edu> <50862ebd0510271721i77c1ebb4x6bcb39a4756c3a99@mail.gmail.com> Message-ID: <4362A44F.9010506@v.loewis.de> Neil Hodgson wrote: > This is anecdotal but it appears to me that transliterations are > not commonly used apart from learning languages and some minimal help > for foreigners such as including transliterated names on railway > station name boards. That would be my guess also. Transliteration is clearly common for Latin-based languages (French, German, Spanish, say), but I doubt non-Latin scripts are that often transliterated (even if conventions exist). Regards, Martin From bcannon at gmail.com Sat Oct 29 00:52:48 2005 From: bcannon at gmail.com (Brett Cannon) Date: Fri, 28 Oct 2005 15:52:48 -0700 Subject: [Python-Dev] PEP 352 Transition Plan In-Reply-To: <007d01c5dc00$738da2e0$b62dc797@oemcomputer> References: <007d01c5dc00$738da2e0$b62dc797@oemcomputer> Message-ID: On 10/28/05, Raymond Hettinger wrote: > > > Why would a > > > release allow catching something that cannot be raised? I must be > > > missing something here. > > > > So conforming code can catch exceptions raised by not-yet conforming > code. > > That makes sense. > > What was the rationale for pushing the deprecation of __getitem__ and > args back to Py2.8? Is the there a disadvantage for doing it earlier? > On the flip side, is there any reason it has to be done at all prior to > Py3.0? That change seems orthogonal to the rest of the proposal and has > its own pluses and minuses (simplification on the plus-side and > code-breakage on the minus-side). > I thought that there was no exact rush on their removal. And I suspect the later versions of the 2.x branch will be used to help ease transition to Python 3, so I figured pushing it to 2.8 seemed like a good idea. 
I could even push it all the way to 2.9 if people prefer. > FWIW, the args tuple does have a legitimate use case as one solution to > the problem of exception chaining (keeping the old info intact, but > adding new info as an extra field): > > try: > raise TypeError('inner detail') > except TypeError, e: > args = e.args + ('outer context',) > raise TypeError(*args) > Interesting point, but I think that chaining should have more concrete support ala PEP 344 or some other mechanism. I think most people agree that exception chaining is important enough to have better support than some implied way of a causing exception to be passed along. Perhaps something more along the lines of: try: raise TypeError("inner detail") except TypeError, e: raise TypeError("outer detail", cause=e) where BaseException then has a 'cause' attribute that is set to None by default or some specific object that is passed in as the second argument to the constructor. -Brett From tim.peters at gmail.com Sat Oct 29 03:29:09 2005 From: tim.peters at gmail.com (Tim Peters) Date: Fri, 28 Oct 2005 21:29:09 -0400 Subject: [Python-Dev] Freezing the CVS on Oct 26 for SVN switchover In-Reply-To: <43610C36.2030500@v.loewis.de> References: <435BC27C.1010503@v.loewis.de> <2mbr1g6loh.fsf@starship.python.net> <17248.52771.225830.484931@montanaro.dyndns.org> <43610C36.2030500@v.loewis.de> Message-ID: <1f7befae0510281829n20ae2936pbc9f923da807bf6a@mail.gmail.com> [skip at pobox.com] >> Though there's no svn/cvs cheatsheet there, you may also find isolated >> tidbits in the Subversion FAQ: >> >> http://subversion.tigris.org/faq.html >> >> Just grep around for "cvs". [Martin v. Löwis] > In addition, you might want to read > > http://www.python.org/dev/svn.html Excellent suggestions! I have a few to pass on: 1. CVS uses "update" for all sorts of things. SVN has different commands for several of the use cases CVS's update conflates: - Updating to the current server state.
"svn update" does that, and SVN's update isn't useful for anything other than that. - Finding out what's changed in your sandbox. Use "svn status" for that. Bonus: in return for creating zillions of admin files, "svn status" is a local operation (no network access required). Do "svn status -u" to get, in addition, a listing of files that _would_ change if you were to do "svn update". - Merging. Use "svn merge" for that. This includes the case of reverting a checkin, in which case just reverse the revision numbers: svn merge URL -rNEW:OLD where NEW is the revision number of the checkin you want to revert, and OLD is typically NEW-1. Very nice: this reverts _all_ changes made in revision NEW, no matter how many files were involved. 2. Every checkin conceptually creates a new version of the entire repository, uniquely identified by its revision number. This is very powerful, but subtle, and CVS has nothing like it. A glimpse of its power was given just above, talking about the ease of reverting an entire checkin in one easy gulp. 3. You're working on a trunk sandbox and discover it's going to take longer than you hoped. Now you wish you had created a branch. This is actually dead easy: create a new branch of the trunk. "svn switch" your sandbox to that new branch; this leaves your local changes alone, which is key. "svn commit" -- you're done! There's now a branch on the server matching your fiddled local state. 4. Making a branch or tag goes very fast under SVN. Because branches and tags are just conventionally-named directories, you can delete them (like any other directory) when you're done with them. These conspire to make simple applications of branches much more pleasant than under CVS. 5. CVS uses text mode for files by default. SVN uses binary mode. The latter is safer, but creates endless low-level snags for x-platform development.
I encourage Python developers to include this gibberish in their SVN config file: """ [auto-props] # Setting eol-style to native on all files is a trick: if svn # believes a new file is binary, it won't honor the eol-style # auto-prop. However, svn considers the request to set eol-style # to be an error then, and if adding multiple files with one # svn "add" cmd, svn will stop adding files after the first # such error. A future release of svn will probably consider # this to be a warning instead (and continue adding files). * = svn:eol-style=native *.c = svn:keywords=Id *.h = svn:keywords=Id *.py = svn:keywords=Id """ Then SVN will set the necessary svn:eol-style property to "native" on new text files you commit. I've never yet seen it tag a file inappropriately using this trick, but it's guaranteed to screw up _all_ text files without something like this (unless you have the patience and discipline to manually set eol-style=native on all new text files you add). From ncoghlan at gmail.com Sat Oct 29 03:52:04 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 29 Oct 2005 11:52:04 +1000 Subject: [Python-Dev] PEP 352: Required Superclass for Exceptions In-Reply-To: References: <436207AA.20506@gmail.com> Message-ID: <4362D5C4.4080206@gmail.com> Brett Cannon wrote: > On 10/28/05, Guido van Rossum wrote: > Nick got the python-checkins email and then read the PEP from the > repository (or at least that is what I assume since that is how Neal > managed to catch the PEP literally in under 5 minutes after checkin). Actually, when you first check a PEP in, the diff includes the entire text of the PEP - so I just read the python-checkins email :) >> But does anyone care? As long as args exists and is a tuple, does it >> matter that it doesn't match the argument list when the latter was >> empty? IMO the protocol mostly says that ex.args exists and is a tuple >> -- the values in there can't be relied upon in pre-2.5-Python. 
>> Exceptions that have specific information should store it in a >> different place, not in ex.args. > > Looking at http://docs.python.org/lib/module-exceptions.html , it > looks like Guido is right. All it ever says is that it is a tuple and > that any passed-in arguments go into 'args'; nothing about its default > value if no arguments are passed in. > > But I personally have no qualms changing it if people want it, so -0 > from me on making it more backwards-compatible. I agree changing the behaviour is highly unlikely to cause any serious problems (mainly because anyone *caring* about the contents of args is rare), the current behaviour is relatively undocumented, and the PEP now proposes deprecating ex.args immediately, so Guido's well within his rights if he wants to change the behaviour. I was merely commenting from the 'it's an unnecessary change to existing behaviour' angle, since the backwards compatible version gives the same behaviour for the new ex.message API as the version in the PEP, while leaving the now-deprecated ex.args API behaviour identical to that in Python 2.4. In other words, I'm looking for a *benefit* that comes from the behavioural change, rather than a 'but the current behaviour is undocumented anyway' response. If there's no actual benefit in breaking it, then why break it? :) >>> The observation: The value of ex.message >>> >>> Under PEP 352 the concept of allowing "return x" to be used in a generator >>> to mean "raise StopIteration(x)" would actually align quite well. A bare >>> "return", however, would need to be changed to translate to "raise >>> StopIteration(None)" rather than its current "raise StopIteration" in order to >>> get the correct value (None) into ex.message. >> Since ex.message is new, how can you say that it should have the value >> None? IMO the whole idea is that ex.message should always be a string >> going forward (although I'm not going to add a typecheck to enforce >> this).
>> > > My feeling exactly on 'message'. I'm talking about the specific context of the behaviour of 'return' in generators, not on the behaviour of ex.message in general. For normal exceptions, I agree '' is the correct default. For that specific case of allowing a return value from generators, and using it as the message on the raised StopIteration, *then* it makes sense for "return" to translate to "raise StopIteration(None)", so that generators have the same 'default return value' as normal functions. There's a reason I said it was just an observation - it has no effect on PEP 352 itself, only on a *different* syntax extension that hasn't even been officially suggested in a PEP (only mentioned in passing when discussing PEP 342). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.blogspot.com From ncoghlan at gmail.com Sat Oct 29 04:23:17 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 29 Oct 2005 12:23:17 +1000 Subject: [Python-Dev] PEP 352 Transition Plan In-Reply-To: References: <007d01c5dc00$738da2e0$b62dc797@oemcomputer> Message-ID: <4362DD15.4080606@gmail.com> Brett Cannon wrote: > Interesting point, but I think that chaining should have more concrete > support ala PEP 344 or some other mechanism. I think most people > agree that exception chaining is important enough to have better > support than some implied way of a causing exception to be passed > along. Perhaps something more along the lines of:
>
>     try:
>         raise TypeError("inner detail")
>     except TypeError, e:
>         raise TypeError("outer detail", cause=e)
>
> where BaseException then has a 'cause' attribute that is set to None > by default or some specific object that is passed in as the second > argument to the constructor.
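Brett's quoted proposal above can be sketched as running code. This is purely illustrative: the ``cause`` keyword never existed on the real built-in exceptions of that era, so the sketch hangs it on a hypothetical user-defined class (modern ``except ... as`` syntax used):

```python
class ChainableError(Exception):
    """Illustrative stand-in for the proposed 'cause' attribute."""
    def __init__(self, *args, cause=None):
        Exception.__init__(self, *args)
        self.cause = cause  # None by default, per the proposal

try:
    try:
        raise TypeError("inner detail")
    except TypeError as inner:
        raise ChainableError("outer detail", cause=inner)
except ChainableError as outer:
    # the causing exception rides along on the new one
    assert isinstance(outer.cause, TypeError)
    assert outer.cause.args == ("inner detail",)
```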
Another point in PEP 352's favour is that it makes it far more feasible to implement something like PEP 344 by providing "__traceback__" and "__prev_exc__" attributes on BaseException. The 'raise' statement could then take care of setting them appropriately if it was given an instance of BaseException to raise. Actually, that brings up another question - PEP 352 says it will require objects that "inherit from BaseException". Does that mean that either subtypes or instances of BaseException will be acceptable? Or does it just mean instances? If the latter, how will that affect the multi-argument forms of 'raise'? Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.blogspot.com From ishimoto at gembook.org Sat Oct 29 04:29:23 2005 From: ishimoto at gembook.org (Atsuo Ishimoto) Date: Sat, 29 Oct 2005 11:29:23 +0900 Subject: [Python-Dev] Divorcing str and unicode (no more implicitconversions). In-Reply-To: <4362A44F.9010506@v.loewis.de> References: <50862ebd0510271721i77c1ebb4x6bcb39a4756c3a99@mail.gmail.com> <4362A44F.9010506@v.loewis.de> Message-ID: <20051029110331.D5AA.ISHIMOTO@gembook.org> Hello from Japan, I googled discussions about non-ASCII identifiers in Japanese, but I found no consensus. Major languages such as Java or VB support non-ASCII identifiers, so projects that use non-ASCII identifiers in their programs do exist. Not all Japanese programmers think this is a good idea. Some people enthusiastically prefer Japanese identifiers, but some feel they reduce readability and are hard to type, some worry about tool breakage or encoding problems, etc. It looks like smart people don't like to express their preference for Japanese identifiers, maybe because they think such a style is not cool, or they are afraid such a confession may reveal a lack of English ability. ;) I'm +0.1 for non-ASCII identifiers, although module names should remain ASCII.
ASCII identifiers might be encouraged, but as Martin said, it is very useful for some groups of users. On Sat, 29 Oct 2005 00:21:03 +0200 "Martin v. Löwis" wrote: > Neil Hodgson wrote: > > This is anecdotal but it appears to me that transliterations are > > not commonly used apart from learning languages and some minimal help > > for foreigners such as including transliterated names on railway > > station name boards. > > That would be my guess also. Transliteration is clearly common for > Latin-based languages (French, German, Spanish, say), but I doubt > non-Latin scripts are that often transliterated (even if conventions > exist). > Yes, transliterations are rarely used in daily life in Japan. For programming, I know a lot of projects use a transliterated Japanese style, but I guess they are rather a minority. -------------------------- Atsuo Ishimoto ishimoto at gembook.org Homepage: http://www.gembook.jp From raymond.hettinger at verizon.net Sat Oct 29 04:55:37 2005 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Fri, 28 Oct 2005 22:55:37 -0400 Subject: [Python-Dev] PEP 352 Transition Plan In-Reply-To: <4362DD15.4080606@gmail.com> Message-ID: <009101c5dc34$3de2a1c0$b62dc797@oemcomputer> [Nick Coghlan] > Another point in PEP 352's favour, is that it makes it far more feasible > to > implement something like PEP 344 by providing "__traceback__" and > "__prev_exc__" attributes on BaseException. > > The 'raise' statement could then take care of setting them appropriately > if it > was given an instance of BaseException to raise. IMO, there is no reason to take e.args out of the Py2.x series. Take-aways should be left for Py3.0. The existence of a legitimate use case means that there may be working code in the field that would be broken unnecessarily. Nothing is gained by this breakage. If 344 gets accepted and implemented, that's great. Either way, there is no rationale for chopping this long-standing feature before 3.0.
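Raymond's "working code in the field" concern is easy to illustrate; idioms like the following (shown here with today's ``except ... as`` syntax, as my own sketch rather than any code from the thread) have always relied on e.args:

```python
# A long-standing idiom: recovering the offending key from e.args
try:
    {}["missing-key"]
except KeyError as e:
    assert e.args == ("missing-key",)

# Another one: unpacking an errno-style two-argument exception
try:
    raise OSError(2, "No such file or directory")
except OSError as e:
    errno, msg = e.args
    assert errno == 2
```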
IIRC, that was the whole point of 3.0 -- we could take out old stuff that had been replaced by new and better things; otherwise, we would simply deprecate old-style classes and be done with it in Py2.5 or Py2.6. Raymond From bcannon at gmail.com Sat Oct 29 05:16:02 2005 From: bcannon at gmail.com (Brett Cannon) Date: Fri, 28 Oct 2005 20:16:02 -0700 Subject: [Python-Dev] PEP 352: Required Superclass for Exceptions In-Reply-To: <4362D5C4.4080206@gmail.com> References: <436207AA.20506@gmail.com> <4362D5C4.4080206@gmail.com> Message-ID: On 10/28/05, Nick Coghlan wrote: > Brett Cannon wrote: > > On 10/28/05, Guido van Rossum wrote: > > Nick got the python-checkins email and then read the PEP from the > > repository (or at least that is what I assume since that is how Neal > > managed to catch the PEP literally in under 5 minutes after checkin). > > Actually, when you first check a PEP in, the diff includes the entire text of > the PEP - so I just read the python-checkins email :) > > >> But does anyone care? As long as args exists and is a tuple, does it > >> matter that it doesn't match the argument list when the latter was > >> empty? IMO the protocol mostly says that ex.args exists and is a tuple > >> -- the values in there can't be relied upon in pre-2.5-Python. > >> Exceptions that have specific information should store it in a > >> different place, not in ex.args. > > > > Looking at http://docs.python.org/lib/module-exceptions.html , it > > looks like Guido is right. All it ever says is that it is a tuple and > > that any passed-in arguments go into 'args'; nothing about its default > > value if no arguments are passed in. > > > > But I personally have no qualms changing it if people want it, so -0 > > from me on making it more backwards-compatible. 
> > I agree changing the behaviour is highly unlikely to cause any serious > problems (mainly because anyone *caring* about the contents of args is rare), > the current behaviour is relatively undocumented, and the PEP now proposes > deprecating ex.args immediately, so Guido's well within his rights if he wants > to change the behaviour. > > I was merely commenting from the 'it's an unnecessary change to existing > behaviour' angle, since the backwards compatible version gives the same > behaviour for the new ex.message API as the version in the PEP, while leaving > the now-deprecated ex.args API behaviour identical to that in Python 2.4. > > In other words, I'm looking for a *benefit* that comes from the behavioural > change, rather than a 'but the current behaviour is undocumented anyway' > response. If there's no actual benefit in breaking it, then why break it? :) > The benefit for me was that the code kept the 'message' argument and thus, in my mind, made it much more obvious that 'message' and 'args' are different. But I think I have a much more reasonable solution that lets me keep the 'message' argument explicit. It also let me use the conditional operator to simplify the code more. So I went ahead and made it the more backwards-compatible version. > >>> The observation: The value of ex.message > >>> > >>> Under PEP 352 the concept of allowing "return x" to be used in a generator > >>> to mean "raise StopIteration(x)" would actually align quite well. A bare > >>> "return", however, would need to be changed to translate to "raise > >>> StopIteration(None)" rather than its current "raise StopIteration" in order to > >>> get the correct value (None) into ex.message. > >> Since ex.message is new, how can you say that it should have the value > >> None? IMO the whole idea is that ex.message should always be a string > >> going forward (although I'm not going to add a typecheck to enforce > >> this).
> > I'm talking about the specific context of the behaviour of 'return' in > generators, not on the behaviour of ex.message in general. For normal > exceptions, I agree '' is the correct default. > > For that specific case of allowing a return value from generators, and using > it as the message on the raised StopIteration, *then* it makes sense for > "return" to translate to "raise StopIteration(None)", so that generators have > the same 'default return value' as normal functions. > > There's a reason I said it was just an observation - it has no effect on PEP > 352 itself, only on a *different* syntax extension that hasn't even been > officially suggested in a PEP (only mentioned in passing when discussing PEP 342). > Ah, OK. So you just want to make sure, at the generator level, that the bytecode (or the ceval loop; I'm not sure where the change would need to be made) raises the StopIteration with an explicit 'message' argument of None. Which obviously does not directly affect PEP 352, but should be considered as a possible change. That makes sense to me and I have no trouble with that, but that is partially because I don't have to make that change. =) -Brett From guido at python.org Sat Oct 29 05:22:36 2005 From: guido at python.org (Guido van Rossum) Date: Fri, 28 Oct 2005 20:22:36 -0700 Subject: [Python-Dev] PEP 352: Required Superclass for Exceptions In-Reply-To: <4362D5C4.4080206@gmail.com> References: <436207AA.20506@gmail.com> <4362D5C4.4080206@gmail.com> Message-ID: [Trying to cut this short... We have too many threads for this topic. :-( ] On 10/28/05, Nick Coghlan wrote: [on making args b/w compatible] > I agree changing the behaviour is highly unlikely to cause any serious > problems (mainly because anyone *caring* about the contents of args is rare), > the current behaviour is relatively undocumented, and the PEP now proposes > deprecating ex.args immediately, so Guido's well within his rights if he wants > to change the behaviour.
I take it back. Since the feature will disappear in Python 3.0 and is maintained only for b/w compatibility, we should keep it as b/w compatible as possible. That means it should default to () and always have as its value exactly the positional arguments that were passed. OTOH, I want message to default to "", not to None (even though it will be set to None if you explicitly pass None as the first argument). So the constructor could be like this (until Python 3000):

    def __init__(self, *args):
        self.args = args
        if args:
            self.message = args[0]
        else:
            self.message = ""

I think Nick proposed this before as well, so let's just do this. > I'm talking about the specific context of the behaviour of 'return' in > generators, not on the behaviour of ex.message in general. For normal > exceptions, I agree '' is the correct default. > > For that specific case of allowing a return value from generators, and using > it as the message on the raised StopIteration, *then* it makes sense for > "return" to translate to "raise StopIteration(None)", so that generators have > the same 'default return value' as normal functions. I don't like that (not-even-proposed) feature anyway. I see no use for it; it only gets proposed by people who are irked by the requirement that generators can contain 'return' but not 'return value'. I think that irkedness is unwarranted; 'return' is useful to cause an early exit, but generators don't have a return value so 'return value' is meaningless. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From bcannon at gmail.com Sat Oct 29 05:27:27 2005 From: bcannon at gmail.com (Brett Cannon) Date: Fri, 28 Oct 2005 20:27:27 -0700 Subject: [Python-Dev] PEP 352: Required Superclass for Exceptions In-Reply-To: References: <436207AA.20506@gmail.com> <4362D5C4.4080206@gmail.com> Message-ID: On 10/28/05, Guido van Rossum wrote: > [Trying to cut this short... We have too many threads for this topic.
:-( ] > > On 10/28/05, Nick Coghlan wrote: > [on making args b/w compatible] > > I agree changing the behaviour is highly unlikely to cause any serious > > problems (mainly because anyone *caring* about the contents of args is rare), > > the current behaviour is relatively undocumented, and the PEP now proposes > > deprecating ex.args immediately, so Guido's well within his rights if he wants > > to change the behaviour. > > I take it back. Since the feature will disappear in Python 3.0 and is > maintained only for b/w compatibility, we should keep it as b/w > compatible as possible. That means it should default to () and always > have as its value exactly the positional arguments that were passed. > > OTOH, I want message to default to "", not to None (even though it > will be set to None if you explicitly pass None as the first > argument). So the constructor could be like this (until Python 3000):
>
>     def __init__(self, *args):
>         self.args = args
>         if args:
>             self.message = args[0]
>         else:
>             self.message = ""
>
> I think Nick proposed this before as well, so let's just do this. Yeah, but Nick used the conditional operator and I used that. All checked in. -Brett From bcannon at gmail.com Sat Oct 29 05:37:29 2005 From: bcannon at gmail.com (Brett Cannon) Date: Fri, 28 Oct 2005 20:37:29 -0700 Subject: [Python-Dev] PEP 352 Transition Plan In-Reply-To: <4362DD15.4080606@gmail.com> References: <007d01c5dc00$738da2e0$b62dc797@oemcomputer> <4362DD15.4080606@gmail.com> Message-ID: On 10/28/05, Nick Coghlan wrote: > Brett Cannon wrote: > > Interesting point, but I think that chaining should have more concrete > > support ala PEP 344 or some other mechanism. I think most people > > agree that exception chaining is important enough to have better > > support than some implied way of a causing exception to be passed > > along.
> > Perhaps something more along the lines of:
> >
> >     try:
> >         raise TypeError("inner detail")
> >     except TypeError, e:
> >         raise TypeError("outer detail", cause=e)
> >
> > where BaseException then has a 'cause' attribute that is set to None > > by default or some specific object that is passed in as the second > > argument to the constructor. > > Another point in PEP 352's favour, is that it makes it far more feasible to > implement something like PEP 344 by providing "__traceback__" and > "__prev_exc__" attributes on BaseException. > > The 'raise' statement could then take care of setting them appropriately if it > was given an instance of BaseException to raise. > Yep. This is why having a guaranteed API is so handy for exceptions. And actually PEP 3000 says that exceptions are supposed to gain a traceback attribute. But that can be another PEP if PEP 344 doesn't make it. > Actually, that brings up another question - PEP 352 says it will require > objects that "inherit from BaseException". Does that mean that either subtypes > or instances of BaseException will be acceptable? Or does it just mean > instances? If the latter, how will that affect the multi-argument forms of > 'raise'? > I don't see how a multi-argument 'raise' changes the situation any. ``raise BaseException`` and ``raise BaseException()`` must both be supported which means isinstance() or issubclass() will be used (unless Python 3 bans raising a class or something).
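The acceptance check Brett describes can be sketched as follows. The function name is my own invention, not anything from the PEP; it just shows that both ``raise BaseException`` and ``raise BaseException()`` spellings pass:

```python
def acceptable_to_raise(obj):
    """Hypothetical sketch of the PEP 352 'inherit from BaseException' check."""
    if isinstance(obj, type):                  # raising a class
        return issubclass(obj, BaseException)
    return isinstance(obj, BaseException)      # raising an instance

assert acceptable_to_raise(BaseException)       # class form
assert acceptable_to_raise(BaseException())     # instance form
assert acceptable_to_raise(KeyError("x"))       # subclasses qualify too
assert not acceptable_to_raise("a string")      # string exceptions are out
```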
-Brett From radeex at gmail.com Sat Oct 29 06:43:25 2005 From: radeex at gmail.com (Christopher Armstrong) Date: Sat, 29 Oct 2005 15:43:25 +1100 Subject: [Python-Dev] PEP 352 Transition Plan In-Reply-To: <4362DD15.4080606@gmail.com> References: <007d01c5dc00$738da2e0$b62dc797@oemcomputer> <4362DD15.4080606@gmail.com> Message-ID: <60ed19d40510282143x466fbdf1x5570c1b05c6cd53c@mail.gmail.com> On 10/29/05, Nick Coghlan wrote: > Another point in PEP 352's favour, is that it makes it far more feasible to > implement something like PEP 344 by providing "__traceback__" and > "__prev_exc__" attributes on BaseException. Not sure if I'm fully in-context here, but watch out for __traceback__ and garbage collection, since the traceback objects refer to all the frames. I expect there's a significant amount of code out there that expects Exception instances to be reasonably persistent. At least Twisted does, with its encapsulation of Exceptions for the purposes of asynchrony -- Failure objects. These Failure objects also refer to tracebacks, but we had to be very careful about deleting them fairly quickly because of GC issues. After deletion they simply contain an inert, basically stringified copy of the traceback. On an only semi-related note, at one point I tried making it possible to have longer-lived Traceback objects that could be reraised, but found it very hard to do, at least with my self-imposed requirement of keeping it in an extension module.
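The retention problem Christopher describes can be demonstrated with the ``__traceback__`` attribute that eventually did land in Python 3 (a sketch of the hazard, not Twisted's actual Failure code):

```python
def leaky():
    big = list(range(1000))   # a local that the frame keeps alive
    raise ValueError("boom")

saved = None
try:
    leaky()
except ValueError as e:
    saved = e

# The saved exception references its traceback, which references the
# frames, which reference their locals -- so 'big' is still reachable.
frame = saved.__traceback__.tb_next.tb_frame
assert frame.f_locals["big"][-1] == 999

# Dropping the exception (as Twisted's Failure does, after keeping only
# a stringified copy of the traceback) releases the whole chain for GC.
saved = None
```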
http://mail.python.org/pipermail/python-dev/2005-September/056091.html -- Twisted | Christopher Armstrong: International Man of Twistery Radix | -- http://radix.twistedmatrix.com | Release Manager, Twisted Project \\\V/// | -- http://twistedmatrix.com |o O| | w----v----w-+ From ncoghlan at iinet.net.au Sat Oct 29 08:16:19 2005 From: ncoghlan at iinet.net.au (Nick Coghlan) Date: Sat, 29 Oct 2005 16:16:19 +1000 Subject: [Python-Dev] PEP 343 updated with outcome of recent discussions Message-ID: <436313B3.2030707@iinet.net.au> Once the cron job works its magic, the updated PEP 343 should be available on the website. As far as I am aware, there aren't any more open issues, so it is once again ready for BDFL pronouncement. I also tinkered with the example naming a bit, and added a new example for the "nested" context manager (it turns out there *were* mistakes in the last version I posted here - I had the deque method name wrong, and I wasn't invoking __context__ correctly on the nested contexts). Cheers, Nick. P.S. My availability will be sketchy for the rest of this weekend, then nonexistent until next weekend, so don't be surprised if I don't respond to messages before then. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.blogspot.com From martin at v.loewis.de Sat Oct 29 10:56:58 2005 From: martin at v.loewis.de ("Martin v. Löwis") Date: Sat, 29 Oct 2005 10:56:58 +0200 Subject: [Python-Dev] Divorcing str and unicode (no more implicitconversions). In-Reply-To: <20051029110331.D5AA.ISHIMOTO@gembook.org> References: <50862ebd0510271721i77c1ebb4x6bcb39a4756c3a99@mail.gmail.com> <4362A44F.9010506@v.loewis.de> <20051029110331.D5AA.ISHIMOTO@gembook.org> Message-ID: <4363395A.3040606@v.loewis.de> Atsuo Ishimoto wrote: > I'm +0.1 for non-ASCII identifiers, although module names should remain > ASCII.
> ASCII identifiers might be encouraged, but as Martin said, it is > very useful for some groups of users. Thanks for these data. This mostly reflects my experience with German and French users: some people would like to use non-ASCII identifiers if they could, others argue they never would as a matter of principle. Of course, transliteration is more straight-forward. Regards, Martin From gjc at inescporto.pt Sat Oct 29 13:09:10 2005 From: gjc at inescporto.pt (Gustavo J. A. M. Carneiro) Date: Sat, 29 Oct 2005 12:09:10 +0100 Subject: [Python-Dev] Divorcing str and unicode (no more implicitconversions). In-Reply-To: <4363395A.3040606@v.loewis.de> References: <50862ebd0510271721i77c1ebb4x6bcb39a4756c3a99@mail.gmail.com> <4362A44F.9010506@v.loewis.de> <20051029110331.D5AA.ISHIMOTO@gembook.org> <4363395A.3040606@v.loewis.de> Message-ID: <1130584150.10206.10.camel@localhost.localdomain> On Sat, 2005-10-29 at 10:56 +0200, "Martin v. Löwis" wrote: > Atsuo Ishimoto wrote: > > I'm +0.1 for non-ASCII identifiers, although module names should remain > > ASCII. ASCII identifiers might be encouraged, but as Martin said, it is > > very useful for some groups of users. > > Thanks for these data. This mostly reflects my experience with German > and French users: some people would like to use non-ASCII identifiers > if they could, others argue they never would as a matter of principle. > Of course, transliteration is more straight-forward. Not sure if anyone has made this point already, but unicode identifiers are also useful for math programs. The ability to directly type the math letters, like alpha, omega, etc., would actually make the code more readable, while still understandable by programmers of all nationalities. For instance, you could write:

    Δv = x1 - x0
    if Δv < ε:
        return

Instead of:

    delta_v = x1 - x0
    if delta_v < epsilon:
        return

But anyone that is supposed to understand the code will be able to read the delta and epsilon symbols. Regards. -- Gustavo J. A. M.
Carneiro The universe is always one step beyond logic From solipsis at pitrou.net Sat Oct 29 14:32:22 2005 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sat, 29 Oct 2005 14:32:22 +0200 Subject: [Python-Dev] Divorcing str and unicode (no more implicitconversions). In-Reply-To: <4363395A.3040606@v.loewis.de> References: <50862ebd0510271721i77c1ebb4x6bcb39a4756c3a99@mail.gmail.com> <4362A44F.9010506@v.loewis.de> <20051029110331.D5AA.ISHIMOTO@gembook.org> <4363395A.3040606@v.loewis.de> Message-ID: <1130589142.5945.11.camel@fsol> > Thanks for these data. This mostly reflects my experience with German > and French users: some people would like to use non-ASCII identifiers > if they could, others argue they never would as a matter of principle. > Of course, transliteration is more straight-forward. FWIW, being French, I don't remember hearing any programmer wish (s)he could use non-ASCII identifiers, in any programming language. But arguably transliteration is very straight-forward (although a bit lossy at times ;-)). I think typeability and reproducibility should be weighed carefully. It's nice to have the real letter delta instead of "delta", but how do I type it again on my non-Greek keyboard if I want to keep consistent naming in the program? ASCII is ethnocentric, but it probably can be typed easily with every device in the world. Also, as a matter of fact, if I type an identifier with an accented letter inside, I would like Python to warn me, because it would be a typing error on my part. Maybe this should be an option at the beginning of any source file (like encoding currently). Or is this overkill? From skink at evhr.net Sat Oct 29 14:50:36 2005 From: skink at evhr.net (Fabien Schwob) Date: Sat, 29 Oct 2005 14:50:36 +0200 Subject: [Python-Dev] Divorcing str and unicode (no more implicitconversions).
In-Reply-To: <1130589142.5945.11.camel@fsol> References: <50862ebd0510271721i77c1ebb4x6bcb39a4756c3a99@mail.gmail.com> <4362A44F.9010506@v.loewis.de> <20051029110331.D5AA.ISHIMOTO@gembook.org> <4363395A.3040606@v.loewis.de> <1130589142.5945.11.camel@fsol> Message-ID: <4363701C.80904@evhr.net> > FWIW, being French, I don't remember hearing any programmer wish (s)he > could use non-ASCII identifiers, in any programming language. But > arguably transliteration is very straight-forward (although a bit > lossy at times ;-)). > > I think typeability and reproducibility should be weighed carefully. > It's nice to have the real letter delta instead of "delta", but how do I > type it again on my non-Greek keyboard if I want to keep consistent > naming in the program? > > ASCII is ethnocentric, but it probably can be typed easily with every > device in the world. > > Also, as a matter of fact, if I type an identifier with an accented > letter inside, I would like Python to warn me, because it would be a > typing error on my part. > > Maybe this should be an option at the beginning of any source file (like > encoding currently). Or is this overkill? I'm also French and I must say that I agree with you. In my case, the most important thing is to be able to manage the _data_ in the right encoding. I'm currently trying to implement a little search engine in Python (to improve my skills mainly) and the biggest problem I have to face is how to manage encoding. Some web pages are in French, in German, in English, etc. and it takes me a lot of time to handle this problem correctly. I think it's more useful to be able to manipulate the _data_ simply than to have accents in identifiers. -- Behind every bug, there is a developer, a man who made a mistake. (Well, OK, sometimes several of them work at it together.)
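The encoding juggling Fabien describes boils down to decoding each fetched page with its own declared encoding into one internal text type before indexing. A minimal sketch with invented sample data (modern Python, where the text type is unicode throughout):

```python
# Each fetched page arrives as bytes plus a declared encoding
raw_pages = [
    (b"caf\xe9 fran\xe7ais", "latin-1"),        # a French page
    (b"caf\xc3\xa9 fran\xc3\xa7ais", "utf-8"),  # the same text in UTF-8
]

# Normalize everything to one text type at the boundary
texts = [raw.decode(encoding) for raw, encoding in raw_pages]

# Once decoded, the two pages compare equal and can share one index
assert texts[0] == texts[1]
assert "caf\u00e9" in texts[0]
```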
From martin at v.loewis.de Sat Oct 29 16:48:32 2005 From: martin at v.loewis.de ("Martin v. Löwis") Date: Sat, 29 Oct 2005 16:48:32 +0200 Subject: [Python-Dev] Divorcing str and unicode (no more implicitconversions). In-Reply-To: <1130589142.5945.11.camel@fsol> References: <50862ebd0510271721i77c1ebb4x6bcb39a4756c3a99@mail.gmail.com> <4362A44F.9010506@v.loewis.de> <20051029110331.D5AA.ISHIMOTO@gembook.org> <4363395A.3040606@v.loewis.de> <1130589142.5945.11.camel@fsol> Message-ID: <43638BC0.40108@v.loewis.de> Antoine Pitrou wrote: > FWIW, being French, I don't remember hearing any programmer wish (s)he > could use non-ASCII identifiers, in any programming language. But > arguably transliteration is very straight-forward (although a bit > lossy at times ;-)). My canonical example is François Pinard, who keeps requesting it, saying that local people were surprised they couldn't use accented characters in Python. Perhaps that's because he actually is Quebecian :-) Regards, Martin From phd at mail2.phd.pp.ru Sat Oct 29 18:13:27 2005 From: phd at mail2.phd.pp.ru (Oleg Broytmann) Date: Sat, 29 Oct 2005 20:13:27 +0400 Subject: [Python-Dev] Freezing the CVS on Oct 26 for SVN switchover In-Reply-To: <1f7befae0510281829n20ae2936pbc9f923da807bf6a@mail.gmail.com> References: <435BC27C.1010503@v.loewis.de> <2mbr1g6loh.fsf@starship.python.net> <17248.52771.225830.484931@montanaro.dyndns.org> <43610C36.2030500@v.loewis.de> <1f7befae0510281829n20ae2936pbc9f923da807bf6a@mail.gmail.com> Message-ID: <20051029161327.GB7048@phd.pp.ru> Hello! On Fri, Oct 28, 2005 at 09:29:09PM -0400, Tim Peters wrote: > - Finding out what's changed in your sandbox. Use "svn status" svn diff uses locally saved copies of files. This increases speed by trading disk space for it. It also decreases net traffic; that's important for those who have expensive connections. > 4. Making a branch or tag goes very fast under SVN.
Fast and cheap in terms of space; Subversion uses a kind of symlinks in its internal filesystem. > make simple applications of branches much more pleasant than under CVS. Much more pleasant. I now use more branches than I did with CVS and have fewer conflicts. > * = svn:eol-style=native I would very much like to recommend that developers set the svn:executable property on executable scripts and unset it on non-executable files; thus all those README and NEWS will be tarred with -rw-r--r-- attributes. :) Oleg. -- Oleg Broytmann http://phd.pp.ru/ phd at phd.pp.ru Programmers don't die, they just GOSUB without RETURN. From s.percivall at chello.se Sat Oct 29 18:35:18 2005 From: s.percivall at chello.se (Simon Percivall) Date: Sat, 29 Oct 2005 18:35:18 +0200 Subject: [Python-Dev] Conversion to Subversion is complete In-Reply-To: <43611507.8090606@v.loewis.de> References: <1130409313.4360ad6139518@www.domainfactory-webmail.de> <2m64rj5agw.fsf@starship.python.net> <43611507.8090606@v.loewis.de> Message-ID: On 27 okt 2005, at 19.57, Martin v. Löwis wrote: > Michael Hudson wrote: > >> Do checkins to svn.python.org go to the python-checkins list already? >> > > They do indeed - you should have received one commit message by now > (me testing whether committing works, on PEP 347). Could the subject lines of those messages please be changed to something more informative? Having which files were changed in the subject seems better than having only the new rev and the folders the files are in.
//Simon From martin at v.loewis.de Sat Oct 29 18:44:50 2005 From: martin at v.loewis.de ("Martin v. Löwis") Date: Sat, 29 Oct 2005 18:44:50 +0200 Subject: [Python-Dev] Conversion to Subversion is complete In-Reply-To: References: <1130409313.4360ad6139518@www.domainfactory-webmail.de> <2m64rj5agw.fsf@starship.python.net> <43611507.8090606@v.loewis.de> Message-ID: <4363A702.1050502@v.loewis.de> Simon Percivall wrote: > Could the subject lines of those messages please be changed to something > more informative? Having which files were changed in the subject seems > better than having only the new rev and the folders the files are in. I'm not sure either whether that should be done, or whether it could be done. What do others think? I personally found those long subject lines listing all the changed files very ugly and unreadable. The other question (whether it could be done) is probably answered as "yes", but I have to research what magic precisely is necessary. Regards, Martin From barry at python.org Sat Oct 29 20:34:48 2005 From: barry at python.org (Barry Warsaw) Date: Sat, 29 Oct 2005 14:34:48 -0400 Subject: [Python-Dev] Conversion to Subversion is complete In-Reply-To: <4363A702.1050502@v.loewis.de> References: <1130409313.4360ad6139518@www.domainfactory-webmail.de> <2m64rj5agw.fsf@starship.python.net> <43611507.8090606@v.loewis.de> <4363A702.1050502@v.loewis.de> Message-ID: <1130610888.11892.6.camel@geddy.wooz.org> On Sat, 2005-10-29 at 12:44, "Martin v. Löwis" wrote: > What do others think? I personally found those long subject lines > listing all the changed files very ugly and unreadable. Me too. At work our subject lines contain something like: Subject: [SVN][reponame] checkin of r12345 - dir/containing/changes Note that we send a different commit message for every directory the change happens in, even though it's all one revision.
We like it that way because some people don't care about certain directories and can filter based on that. Inside the body of the email you'll see something like: Author: person Date: when New Revision: r12345 Log: Log message comes next. Definitely best to show up before the diff. diff comes next... FWIW, this format has worked well for us. -Barry From fdrake at acm.org Sat Oct 29 21:04:15 2005 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Sat, 29 Oct 2005 15:04:15 -0400 Subject: [Python-Dev] Freezing the CVS on Oct 26 for SVN switchover In-Reply-To: <1f7befae0510281829n20ae2936pbc9f923da807bf6a@mail.gmail.com> References: <435BC27C.1010503@v.loewis.de> <43610C36.2030500@v.loewis.de> <1f7befae0510281829n20ae2936pbc9f923da807bf6a@mail.gmail.com> Message-ID: <200510291504.15504.fdrake@acm.org> On Friday 28 October 2005 21:29, Tim Peters wrote: > - Finding out what's changed in your sandbox. Use "svn status" > for that. Bonus: in return for creating zillions of admin files, > "svn status" > is a local operation (no network access required). Do "svn status -u" > to get, in addition, a listing of files that _would_ change if you were to > do "svn update". It's worth noting that "svn status -u" does require network access, since it has to check with the repository to see what's been updated there. -Fred -- Fred L. Drake, Jr. From fdrake at acm.org Sat Oct 29 21:50:11 2005 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Sat, 29 Oct 2005 15:50:11 -0400 Subject: [Python-Dev] [Python-checkins] commit of r41352 - in python/trunk: . 
Lib Lib/distutils Lib/distutils/command Lib/encodings In-Reply-To: <20051029194022.470D61E40B4@bag.python.org> References: <20051029194022.470D61E40B4@bag.python.org> Message-ID: <200510291550.12279.fdrake@acm.org> On Saturday 29 October 2005 15:40, martin.v.loewis at python.org wrote: > Author: martin.v.loewis > Date: Sat Oct 29 21:40:21 2005 > New Revision: 41352 > > Modified: > python/trunk/ (props changed) > python/trunk/.cvsignore ... > Add *.pyc to svn:ignore. > Add libpython*.a to .cvsignore and svn:ignore. Shouldn't we simply remove the .cvsignore files? Subversion doesn't use them, so they'll just end up getting out of sync with the svn:ignore properties. -Fred -- Fred L. Drake, Jr. From noamraph at gmail.com Sat Oct 29 22:42:49 2005 From: noamraph at gmail.com (Noam Raphael) Date: Sat, 29 Oct 2005 22:42:49 +0200 Subject: [Python-Dev] PEP 351, the freeze protocol In-Reply-To: <1130107429.11268.40.camel@geddy.wooz.org> References: <1130107429.11268.40.camel@geddy.wooz.org> Message-ID: Hello, I have thought about freezing for some time, and I think that it is a fundamental need - the need to know, sometimes, that objects aren't going to change. This is mostly the need of containers. dicts need to know that the objects that are used as keys aren't going to change, because if they change, their hash value changes, and you end up with a data structure in an inconsistent state. This is the need of sets too, and of heaps, and binary trees, and so on. I want to give another example: my colleagues and I designed something which can be described as an "electronic spreadsheet in Python". We called it a "table". The values in the table are Python objects, and the functions which relate them are written in Python. Then comes the problem: the user has, of course, access to the objects stored in the table. What would happen if he changes them? 
The answer is that the table would be in an inconsistent state, since something which should be the return value of a function is now something else, and there's no way for the table to know about that. The solution is to have a "freeze" protocol. It may be called "frozen" (like frozen(set([1,2,3]))), so that it will be clear that it does not change the object itself. The definition of a frozen object is that its value can't change - that is, if you compare it with another object, you should get the same result as long as the other object hasn't changed. As a rule, only frozen objects should be hashable. I want to give another, different, use case for freezing objects. I once thought about writing a graph package in Python - I mean a graph with vertices and edges. The most obvious way to store a directed graph is as a mapping (dict) from a node to the set of nodes that it points to. Since I want to be able to find also which nodes point to a specific node, I will store another mapping, from a node to the set of nodes that point to it. Now, I want a method of the graph which will return the set of nodes that a given node points to, for example to let me write "if y in graph.adjacent_nodes(x) then". The question is, what will the adjacent_nodes method return? If it returns the set which is a part of the data structure, there is nothing (even no convention!) that will prevent the user from playing with it. This will corrupt the data structure, since the change won't be recorded in the inverse mapping. adjacent_nodes can return a copy of the set, but that's a waste if you only want to check whether an object is a member of the set. I gave this example to say that the "frozen" protocol should (when possible) return an object which doesn't really contain a copy of the data, but rather gives an "image" of the original object. 
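[As an illustration of the kind of API described above, here is a toy graph whose adjacent_nodes() hands out an immutable image of the edge set, using today's frozenset. This copies on every call; the proposed frozen() protocol would amortize that copy. All names here are invented for the sketch, not code from an actual package:]

```python
class DiGraph:
    """Toy directed graph; a sketch of the example above."""

    def __init__(self):
        self._out = {}  # node -> set of nodes it points to
        self._in = {}   # node -> set of nodes pointing to it

    def add_edge(self, x, y):
        self._out.setdefault(x, set()).add(y)
        self._in.setdefault(y, set()).add(x)

    def adjacent_nodes(self, x):
        # Hand out an immutable image of the edge set: the caller can
        # test membership, but cannot corrupt the inverse mapping by
        # mutating the returned object.
        return frozenset(self._out.get(x, ()))

g = DiGraph()
g.add_edge(1, 2)
g.add_edge(1, 3)
assert 2 in g.adjacent_nodes(1)
assert g.adjacent_nodes(1) == frozenset([2, 3])
```

[The cost is one copy per adjacent_nodes() call, which is exactly the waste the "image" idea is meant to avoid.]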
If the original object changes while there are frozen copies of it, the data will be copied, and all the frozen objects will then reference a version of the data that will never change again. This will solve the graph problem nicely - adjacent_nodes would simply return a frozen copy of the set, and a copy operation would happen only in the rare cases when the returned set is being modified. This would also help the container use cases: they may call the frozen() method on objects that should be inserted into the container, and usually the data won't be copied. Some objects can't be created in their final form, but can only be constructed step after step. This means that they must be non-frozen objects. Sometimes they are constructed in order to get into a container. Unless the frozen() method is copy-on-change the way I described, all the data would have to be copied again, just for the commitment that it won't change. I don't mean to frighten, but in principle, this may mean that immutable strings might be introduced, which will allow us to get rid of all the cStringIO workarounds. Immutable strings would be constructed whenever they are needed, at a low performance cost (remember that a frozen copy of a given object has to be constructed only once - once it has been created, the same object can be returned on additional frozen() calls.) Copy-on-change of containers of non-frozen objects requires additional complication: it requires frozen objects to have a way for setting a callback that will be called when the original object was changed. This is because the change makes the container of the original object change, so it must drop its own frozen copy. This needs to happen only once per frozen object, since after a change, all the containers drop their frozen copies. I think this callback is conceptually similar to the weakref callback. Just an example that copy-on-change (at least of containers of frozen objects) is needed: sets. 
It was decided that you can test whether a non-frozen set is a member of a set. I understand that it is done by "temporarily freezing" the set, and that it caused some threading issues. A copy-on-change mechanism might solve it more elegantly. What do you think? Noam From martin at v.loewis.de Sun Oct 30 00:53:53 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 30 Oct 2005 00:53:53 +0200 Subject: [Python-Dev] [Python-checkins] commit of r41352 - in python/trunk: . Lib Lib/distutils Lib/distutils/command Lib/encodings In-Reply-To: <200510291550.12279.fdrake@acm.org> References: <20051029194022.470D61E40B4@bag.python.org> <200510291550.12279.fdrake@acm.org> Message-ID: <4363FD81.10403@v.loewis.de> Fred L. Drake, Jr. wrote: > Shouldn't we simply remove the .cvsignore files? Subversion doesn't use them, > so they'll just end up getting out of sync with the svn:ignore properties. That might be reasonable. I just noticed that it is convenient to do svn propset svn:ignore -F .cvsignore . Without a file, I wouldn't know how to edit the property, so I would probably do svn propget svn:ignore . > ignores vim ignores svn propset svn:ignore -F ignores . rm ignores Regards, Martin From solipsis at pitrou.net Sun Oct 30 01:25:54 2005 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sun, 30 Oct 2005 01:25:54 +0200 Subject: [Python-Dev] svn:ignore In-Reply-To: <4363FD81.10403@v.loewis.de> References: <20051029194022.470D61E40B4@bag.python.org> <200510291550.12279.fdrake@acm.org> <4363FD81.10403@v.loewis.de> Message-ID: <1130628354.5945.24.camel@fsol> Hi, FWIW, I opened a bug report on Subversion some time ago so that patterns like "*.pyc" and "*.pyo" are ignored by default in Subversion. Feel free to add comments or vote for the bug: http://subversion.tigris.org/issues/show_bug.cgi?id=2415 Regards Antoine. 
From noamraph at gmail.com Sun Oct 30 01:32:41 2005 From: noamraph at gmail.com (Noam Raphael) Date: Sun, 30 Oct 2005 01:32:41 +0200 Subject: [Python-Dev] [Python-checkins] commit of r41352 - in python/trunk: . Lib Lib/distutils Lib/distutils/command Lib/encodings In-Reply-To: <4363FD81.10403@v.loewis.de> References: <20051029194022.470D61E40B4@bag.python.org> <200510291550.12279.fdrake@acm.org> <4363FD81.10403@v.loewis.de> Message-ID: > That might be reasonable. I just noticed that it is convenient to do > > svn propset svn:ignore -F .cvsignore . > > Without a file, I wouldn't know how to edit the property, so I would > probably do > > svn propget svn:ignore . > ignores > vim ignores > svn propset svn:ignore -F ignores . > rm ignores > Won't "svn propedit svn:ignore ." do the trick? Noam From pinard at iro.umontreal.ca Sun Oct 30 02:16:11 2005 From: pinard at iro.umontreal.ca (=?iso-8859-1?Q?Fran=E7ois?= Pinard) Date: Sat, 29 Oct 2005 20:16:11 -0400 Subject: [Python-Dev] [Python-checkins] commit of r41352 - in python/trunk: . Lib Lib/distutils Lib/distutils/command Lib/encodings In-Reply-To: <4363FD81.10403@v.loewis.de> References: <20051029194022.470D61E40B4@bag.python.org> <200510291550.12279.fdrake@acm.org> <4363FD81.10403@v.loewis.de> Message-ID: <20051030001611.GA22474@phenix.sram.qc.ca> [Martin von L?wis] >Without a file, I wouldn't know how to edit the property, so I would >probably do >svn propget svn:ignore . > ignores >vim ignores >svn propset svn:ignore -F ignores . >rm ignores You can use `svn propedit' (or `svn pe'). 
-- Fran?ois Pinard http://pinard.progiciels-bpi.ca From jcarlson at uci.edu Sun Oct 30 02:34:12 2005 From: jcarlson at uci.edu (Josiah Carlson) Date: Sat, 29 Oct 2005 17:34:12 -0700 Subject: [Python-Dev] PEP 351, the freeze protocol In-Reply-To: References: <1130107429.11268.40.camel@geddy.wooz.org> Message-ID: <20051029164637.39E1.JCARLSON@uci.edu> Noam Raphael wrote: > > Hello, > > I have thought about freezing for some time, and I think that it is a > fundamental need - the need to know, sometimes, that objects aren't > going to change. I agree with this point. > This is mostly the need of containers. dicts need to know that the > objects that are used as keys aren't going to change, because if they > change, their hash value changes, and you end up with a data structure > in an inconsistent state. This is the need of sets too, and of heaps, > and binary trees, and so on. You are exactly mirroring the sentiments of the PEP. > I want to give another example: I and my colleges designed something > which can be described as an "electronic spreadsheet in Python". We > called it a "table". The values in the table are Python objects, and > the functions which relate them are written in Python. Then comes the > problem: the user has, of course, access to the objects stored in the > table. What would happen if he changes them? The answer is that the > table would be in an inconsistent state, since something which should > be the return value of a function is now something else, and there's > no way for the table to know about that. I respectfully disagree with this point and the rest of your email. Why? For two use-cases, you offer 'tables of values' and 'graphs', as well as a possible solution to the 'problem'; copy on write. In reading your description of a 'table of values', I can't help but be reminded of the wxPython (and wxWidget) wx.Grid and its semantics. 
It offers arbitrary tables of values (whose editors and viewers you can change at will), which offers a mechanism by which you can "listen" to changes that occur to the contents of a cell. I can't help but think that if you offered a protocol by which a user can signal that a cell has been changed, perhaps by writing the value to the table itself (table.SetValue(row, col, value)), every read a deepcopy (or a PEP 351 freeze), etc., that both you and the users of your table would be much happier. As for the graph issue, you've got a bigger problem than users just being able to edit edge lists, users can clear the entire dictionary of vertices (outgoing.clear()). It seems to me that a more reasonable method to handle this particular case is to tell your users "don't modify the dictionaries or the edge lists", and/or store your edge lists as tuples instead of lists or dictionaries, and/or use an immutable dictionary (as offered by Barry in the PEP). There's also this little issue of "copy on write" semantics with Python. Anyone who tells you that "copy on write" is easy, is probably hanging out with the same kind of people who say that "threading is easy". Of course both are easy if you limit your uses to some small subset of interesting interactions, but "copy on write" gets far harder when you start thinking of dictionaries, lists, StringIOs, arrays, and all the possible user-defined classes, which may be mutated beyond obj[key] = value and/or obj.attr = value (some have obj.method() which mutates the object). As such, offering a callback mechanism similar to weak references is probably pretty close to impossible with CPython. One of the reasons why I liked the freeze protocol is that it offered a simple mechanism by which Python could easily offer support, for both new and old objects alike. Want an example? Here's the implementation for array freezing: tuple(a). What about lists? 
tuple(map(freeze, lst)) Freezing may not ultimately be the right solution for everything, but it is a simple solution which handles the majority of cases. Copy on write in Python, on the other hand, is significantly harder to implement, support, and is probably not the right solution for many problems. - Josiah P.S. To reiterate to Barry: map freeze to the contents of containers, otherwise the object can still be modified, and hence is not frozen. From martin at v.loewis.de Sun Oct 30 12:06:15 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 30 Oct 2005 12:06:15 +0100 Subject: [Python-Dev] [Python-checkins] commit of r41352 - in python/trunk: . Lib Lib/distutils Lib/distutils/command Lib/encodings In-Reply-To: References: <20051029194022.470D61E40B4@bag.python.org> <200510291550.12279.fdrake@acm.org> <4363FD81.10403@v.loewis.de> Message-ID: <4364A927.5040209@v.loewis.de> Noam Raphael wrote: > Won't "svn propedit svn:ignore ." do the trick? It certainly would. Thanks for pointing that out. Regards, Martin From skip at pobox.com Sun Oct 30 14:04:22 2005 From: skip at pobox.com (skip@pobox.com) Date: Sun, 30 Oct 2005 07:04:22 -0600 Subject: [Python-Dev] Freezing the CVS on Oct 26 for SVN switchover In-Reply-To: <1f7befae0510281829n20ae2936pbc9f923da807bf6a@mail.gmail.com> References: <435BC27C.1010503@v.loewis.de> <2mbr1g6loh.fsf@starship.python.net> <17248.52771.225830.484931@montanaro.dyndns.org> <43610C36.2030500@v.loewis.de> <1f7befae0510281829n20ae2936pbc9f923da807bf6a@mail.gmail.com> Message-ID: <17252.50390.256221.4882@montanaro.dyndns.org> Tim> Excellent suggestions! I have a few to pass on: ... Tim, Thanks for the tips. As a new svn user myself, I find these helpful. These are precisely the things the Wiki would be good for. They don't prescribe policy. They help people in a general way to migrate from cvs to svn more easily. 
Anyone with cvs and svn experience, but without the ability to check stuff into the pydotorg repository could contribute. Skip From skip at pobox.com Sun Oct 30 14:29:25 2005 From: skip at pobox.com (skip@pobox.com) Date: Sun, 30 Oct 2005 07:29:25 -0600 Subject: [Python-Dev] [Python-checkins] commit of r41352 - in python/trunk: . Lib Lib/distutils Lib/distutils/command Lib/encodings In-Reply-To: <200510291550.12279.fdrake@acm.org> References: <20051029194022.470D61E40B4@bag.python.org> <200510291550.12279.fdrake@acm.org> Message-ID: <17252.51893.314977.306457@montanaro.dyndns.org> Fred> Shouldn't we simply remove the .cvsignore files? Subversion Fred> doesn't use them, so they'll just end up getting out of sync with Fred> the svn:ignore properties. Is there some equivalent? If so, can we convert the .cvsignore files before deleting them? Skip From skip at pobox.com Sun Oct 30 16:36:43 2005 From: skip at pobox.com (skip@pobox.com) Date: Sun, 30 Oct 2005 09:36:43 -0600 Subject: [Python-Dev] svn checksum error Message-ID: <17252.59531.252751.768301@montanaro.dyndns.org> I tried "svn up" to bring my sandbox up-to-date and got this output: % svn up U Include/unicodeobject.h subversion/libsvn_wc/update_editor.c:1609: (apr_err=155017) svn: Checksum mismatch for 'Objects/.svn/text-base/unicodeobject.c.svn-base'; expected: '8611dc5f592e7cbc6070524a1437db9b', actual: '2d28838f2fec366fc58386728a48568e' What's that telling me? 
Thx, Skip From skip at pobox.com Sun Oct 30 16:38:45 2005 From: skip at pobox.com (skip@pobox.com) Date: Sun, 30 Oct 2005 09:38:45 -0600 Subject: [Python-Dev] Freezing the CVS on Oct 26 for SVN switchover In-Reply-To: <17252.50390.256221.4882@montanaro.dyndns.org> References: <435BC27C.1010503@v.loewis.de> <2mbr1g6loh.fsf@starship.python.net> <17248.52771.225830.484931@montanaro.dyndns.org> <43610C36.2030500@v.loewis.de> <1f7befae0510281829n20ae2936pbc9f923da807bf6a@mail.gmail.com> <17252.50390.256221.4882@montanaro.dyndns.org> Message-ID: <17252.59653.792906.582288@montanaro.dyndns.org> Tim> Excellent suggestions! I have a few to pass on: skip> These are precisely the things the Wiki would be good for. I went ahead and used Tim's note as the basis for a page on the wiki: http://wiki.python.org/moin/CvsToSvn It's linked from the PythonDevelopers page (a page of previously dubious necessity). Skip From fredrik at pythonware.com Sun Oct 30 17:58:01 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Sun, 30 Oct 2005 17:58:01 +0100 Subject: [Python-Dev] svn checksum error References: <17252.59531.252751.768301@montanaro.dyndns.org> Message-ID: skip at pobox.com wrote: > I tried "svn up" to bring my sandbox up-to-date and got this output: > > % svn up > U Include/unicodeobject.h > subversion/libsvn_wc/update_editor.c:1609: (apr_err=155017) > svn: Checksum mismatch for 'Objects/.svn/text-base/unicodeobject.c.svn-base'; expected: '8611dc5f592e7cbc6070524a1437db9b', actual: '2d28838f2fec366fc58386728a48568e' > > What's that telling me? "welcome to the wonderful world of subversion error messages" (from what I can tell, the message means that SVN thinks that there might have been some checksum error somewhere, or some other error at a point where subversion thinks it's likely that a checksum was involved; to figure out what's really causing this problem, you probably need a debug build of subversion). 
deleting the offending directory and doing "svn up" is the easiest way to fix this. From martin at v.loewis.de Sun Oct 30 23:03:54 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 30 Oct 2005 23:03:54 +0100 Subject: [Python-Dev] svn:ignore (Was: [Python-checkins] commit of r41352 - in python/trunk: . Lib Lib/distutils Lib/distutils/command Lib/encodings) In-Reply-To: <17252.51893.314977.306457@montanaro.dyndns.org> References: <20051029194022.470D61E40B4@bag.python.org> <200510291550.12279.fdrake@acm.org> <17252.51893.314977.306457@montanaro.dyndns.org> Message-ID: <4365434A.5030808@v.loewis.de> skip at pobox.com wrote: > Fred> Shouldn't we simply remove the .cvsignore files? Subversion > Fred> doesn't use them, so they'll just end up getting out of sync with > Fred> the svn:ignore properties. > > Is there some equivalent? If so, can we convert the .cvsignore files before > deleting them? cvs2svn has already converted them automatically - to svn:ignore properties; try svn propget svn:ignore Doc (assuming . is the current directory). 
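[The .cvsignore-to-svn:ignore conversion described in this thread could also be scripted for other repositories; a sketch, assuming an svn client on PATH, with helper names invented for illustration:]

```python
import os
import subprocess

def conversion_commands(root):
    """Yield the svn commands that copy each .cvsignore in a working
    copy into its directory's svn:ignore property, then remove it.
    Sketch only; run it on a scratch checkout first."""
    for dirpath, dirnames, filenames in os.walk(root):
        if ".svn" in dirnames:
            dirnames.remove(".svn")  # don't descend into admin dirs
        if ".cvsignore" in filenames:
            f = os.path.join(dirpath, ".cvsignore")
            yield ["svn", "propset", "svn:ignore", "-F", f, dirpath]
            yield ["svn", "rm", f]

def convert(root):
    for cmd in conversion_commands(root):
        subprocess.check_call(cmd)
```

[For the Python tree itself this is moot, since cvs2svn already populated svn:ignore during the conversion.]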
I have now deleted all .cvsignore files in the trunk in revision 41357 (yay, giving a single number for a multi-file delete operation feels good :-) Regards, Martin From martin at v.loewis.de Sun Oct 30 23:25:28 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 30 Oct 2005 23:25:28 +0100 Subject: [Python-Dev] Freezing the CVS on Oct 26 for SVN switchover In-Reply-To: <17252.59653.792906.582288@montanaro.dyndns.org> References: <435BC27C.1010503@v.loewis.de> <2mbr1g6loh.fsf@starship.python.net> <17248.52771.225830.484931@montanaro.dyndns.org> <43610C36.2030500@v.loewis.de> <1f7befae0510281829n20ae2936pbc9f923da807bf6a@mail.gmail.com> <17252.50390.256221.4882@montanaro.dyndns.org> <17252.59653.792906.582288@montanaro.dyndns.org> Message-ID: <43654858.9020108@v.loewis.de> skip at pobox.com wrote: > I went ahead and used Tim's note as the basis for a page on the wiki: > > http://wiki.python.org/moin/CvsToSvn > > It's linked from the PythonDevelopers page (a page of previously dubious > necessity). I have pretty much the same reservations against Wikis as Brett does; it seems more productive if people would just use python-dev to ask questions of the "how do I" kind (and probably of the "do I really need to" kind as well). I don't mind somebody collecting this information into whatever place more permanent and accessible than a mailing list archive; I think I would normally add them to the developer FAQ instead of to the Wiki, primarily because I can memorize the location of the FAQ better. 
Regards, Martin From martin at v.loewis.de Sun Oct 30 23:43:51 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 30 Oct 2005 23:43:51 +0100 Subject: [Python-Dev] svn checksum error In-Reply-To: <17252.59531.252751.768301@montanaro.dyndns.org> References: <17252.59531.252751.768301@montanaro.dyndns.org> Message-ID: <43654CA7.8030200@v.loewis.de> skip at pobox.com wrote: > I tried "svn up" to bring my sandbox up-to-date and got this output: > > % svn up > U Include/unicodeobject.h > subversion/libsvn_wc/update_editor.c:1609: (apr_err=155017) > svn: Checksum mismatch for 'Objects/.svn/text-base/unicodeobject.c.svn-base'; expected: '8611dc5f592e7cbc6070524a1437db9b', actual: '2d28838f2fec366fc58386728a48568e' > > What's that telling me? At the shallow level, the message should be clear: there is an actual checksum for a file and an expected checksum, and they differ. They shouldn't differ. Somewhat deeper, this indicates a bug in Subversion. It's not clear to me whether this is a client or a server bug. In the version on svn.python.org, the error message is on line 2846, so I would suspect it is a client bug. The natural question then is: what operating system, what subversion version are you using? Regards, Martin From ejones at uwaterloo.ca Mon Oct 31 00:19:41 2005 From: ejones at uwaterloo.ca (Evan Jones) Date: Sun, 30 Oct 2005 18:19:41 -0500 Subject: [Python-Dev] Parser and Runtime: Divorced! In-Reply-To: <03b7f74aebe5c6249a8bb00ac17d1952@uwaterloo.ca> References: <03b7f74aebe5c6249a8bb00ac17d1952@uwaterloo.ca> Message-ID: <84c355f24dfa73224073d897c38edd44@uwaterloo.ca> On Oct 26, 2005, at 20:02, Evan Jones wrote: > In the process of doing this, I came across a comment mentioning that > it would be desirable to separate the parser. Is there any interest in > doing this? I now have a vague idea about how to do this. Of course, > there is no point in making changes like this unless there is some > tangible benefit. 
I am going to assume that since no one was excited about my post, the answer is: no, there is no interest in separating the parser from the rest of the Python run time. At any rate, if anyone is looking for a standalone C Python parser library, you can get it at the following URL. It includes a "print the tree" example that displays the AST for a specified file. It only supports a subset of the parse tree (assignment, functions, print, return), but it should be obvious how it could be extended. http://evanjones.ca/software/pyparser.html Evan Jones -- Evan Jones http://evanjones.ca/ From noamraph at gmail.com Mon Oct 31 00:35:05 2005 From: noamraph at gmail.com (Noam Raphael) Date: Mon, 31 Oct 2005 01:35:05 +0200 Subject: [Python-Dev] PEP 351, the freeze protocol In-Reply-To: <20051029164637.39E1.JCARLSON@uci.edu> References: <1130107429.11268.40.camel@geddy.wooz.org> <20051029164637.39E1.JCARLSON@uci.edu> Message-ID: Hello, It seems that we both agree that freezing is cool (-; . We disagree on whether a copy-on-write behaviour is desired. Your arguments against copy-on-write are: 1. It's not needed. 2. It's complicated to implement. But first of all, you didn't like my use cases. I want to argue with that. > In reading your description of a 'table of values', I can't help but be > reminded of the wxPython (and wxWidget) wx.Grid and its semantics. It > offers arbitrary tables of values (whose editors and viewers you can > change at will), which offers a mechanism by which you can "listen" to > changes that occur to the contents of a cell. I can't help but think > that if you offered a protocol by which a user can signal that a cell > has been changed, perhaps by writing the value to the table itself > (table.SetValue(row, col, value)), every read a deepcopy (or a PEP 351 > freeze), etc., that both you and the users of your table would be much > happier. Perhaps I didn't make it clear. 
The difference between wxPython's Grid and my table is that in the table, most values are *computed*. This means that there's no point in changing the values themselves. They are also used frequently as set members (I can describe why, but it's a bit complicated.) I want to say that even if sets weren't used, the objects in the table should have been frozen. The fact that sets (and dicts) only allow immutable objects as members/keys is just for protecting the user. They could have declared, "you shouldn't change anything you insert - as long as you don't, we'll function properly." The only reason why you can't compute hash values of mutable objects is that you don't want your user to make mistakes, and make the data structure inconsistent. > As for the graph issue, you've got a bigger problem than users just > being able to edit edge lists, users can clear the entire dictionary of > vertices (outgoing.clear()). It seems to me that a more reasonable > method to handle this particular case is to tell your users "don't > modify the dictionaries or the edge lists", and/or store your edge lists > as tuples instead of lists or dictionaries, and/or use an immutable > dictionary (as offered by Barry in the PEP). As I wrote before, telling my users "don't modify the edge lists" is just like making lists hashable, and telling all Python users, "don't modify lists that are dictionary keys." There's no way to tell the users that - there's no convention for objects which should not be changed. You can write it in the documentation, but who'll bother looking there? I don't think that your other suggestions will work: the data structure of the graph itself can't be made of immutable objects, because of the fact that the graph is a mutable object - you can change it. It can be made of immutable objects, but this means copying all the data every time the graph changes. Now, about copy-on-write: > There's also this little issue of "copy on write" semantics with Python. 
> Anyone who tells you that "copy on write" is easy, is probably hanging > out with the same kind of people who say that "threading is easy". Of > course both are easy if you limit your uses to some small subset of > interesting interactions, but "copy on write" gets far harder when you > start thinking of dictionaries, lists, StringIOs, arrays, and all the > possible user-defined classes, which may be mutated beyond obj[key] = > value and/or obj.attr = value (some have obj.method() which mutates the > object). As such, offering a callback mechanism similar to weak > references is probably pretty close to impossible with CPython. Let's limit ourselves to copy-on-write of objects which do not contain nonfrozen objects. Perhaps it's enough - the table, the graph, and strings, are perfect examples of these. Implementation doesn't seem to complicated to me - whenever the object is about to change, and there is a connected frozen copy, you make a shallow copy of the object, point the frozen copy to it, release the reference to the frozen copy, and continue as usual. That's all. I really think that this kind of copy-on-write is "correct". The temporary freezing of sets in order to check if they are members of other sets is a not-very-nice way of implementing it. This kind of copy-on-write would allow, in principle, for Python strings to become mutable, with almost no speed penalty. It would allow my table, and other containers, to automatically freeze the objects that get into it, without having to trust the user on not changing the objects - and remember that there's no way of *telling* him not to change the objects. Now, the computer scientist in me wants to explain (and think about) freezing containers of nonfrozen objects. What I actually want is that as long as an object doesn't change after it's freezed, the cost of freezing would be nothing - that is, O(1). Think about a mutable string object, which is used in the same way as the current, immutable strings. 
It is constructed once, and then may be used as a key in a dictionary many times. I want to claim that it's a common pattern - create an object, it doesn't matter how, and then use it without changing it. If that is the case, it's obvious that all the frozen() calls would take O(1) each. How can we accomplish this (freezing costs O(1) as long as the object doesn't change) with containers of nonfrozen objects? It seems impossible - no matter what, on the first time the container is freezed, you would have to call frozen() for every object it contains! The answer is that in an amortized analysis, it is still an O(1) operation. The reason is that as long as frozen() takes O(1) (amortized), all those calls to frozen() can be considered a part of the object construction, since they are made only once - on the next call to frozen(), the already-created frozen object would be returned. This analysis is correct as long as the object doesn't change after it's freezed. The problem is that we have to keep the created frozen object as long as the original object stays alive. So we have to know if it has changed. This is where those callbacks get in. As long as what is done with them is correct, there should be no problems. They are used only to disengage the frozen copies from their original objects. The action they should trigger is simply that: def on_contained_object_change(self): self._frozen_copy = None while self._callbacks: self._callbacks.pop()() What's also interesting is that this freezing mechanism can be provided automatically for user-created classes, since those are simply containers of other objects, which behave exactly like dicts, for this matter. It allows everything in Python to be both mutable and hashable, without changing the O() complexity! Wow! Ok, I'm going to sleep now. If you find something wrong with this idea, please tell me. 
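[A minimal sketch of the cached-frozen-copy idea above, for a container of already-frozen items. The names are invented, and the callback plumbing needed for containers of nonfrozen objects is left out:]

```python
class FreezableSet:
    """Mutable set whose frozen() is O(1) while the set is unchanged."""

    def __init__(self, items=()):
        self._data = set(items)
        self._frozen_copy = None  # cached frozen image, shared by callers

    def frozen(self):
        # Repeated calls return the same object until the next write.
        if self._frozen_copy is None:
            self._frozen_copy = frozenset(self._data)
        return self._frozen_copy

    def add(self, item):
        # A write disengages the cached image; the old frozenset holds
        # its own copy of the pre-change data and never mutates.
        self._frozen_copy = None
        self._data.add(item)

s = FreezableSet([1, 2])
f1 = s.frozen()
f2 = s.frozen()
assert f1 is f2        # no copy on repeated frozen() calls
s.add(3)
assert 3 not in f1     # old image is unaffected by later writes
assert 3 in s.frozen()
```

[Here every write after a frozen() call pays for a full copy on the next frozen(); the amortized argument above is that this cost vanishes for the common create-then-never-change pattern.]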
Have a good day, Noam From gustavo at niemeyer.net Mon Oct 31 00:37:57 2005 From: gustavo at niemeyer.net (Gustavo Niemeyer) Date: Sun, 30 Oct 2005 21:37:57 -0200 Subject: [Python-Dev] StreamHandler eating exceptions Message-ID: <20051030233757.GB8344@localhost.localdomain> The StreamHandler available under the logging package is currently catching all exceptions under the emit() method call. In the Handler.handleError() documentation it's mentioned that it's implemented like that because users do not care about errors in the logging system. I'd like to apply the following patch: Index: Lib/logging/__init__.py =================================================================== --- Lib/logging/__init__.py (revision 41357) +++ Lib/logging/__init__.py (working copy) @@ -738,6 +738,8 @@ except UnicodeError: self.stream.write(fs % msg.encode("UTF-8")) self.flush() + except KeyboardInterrupt: + raise except: self.handleError(record) Anyone against the change? -- Gustavo Niemeyer http://niemeyer.net From solipsis at pitrou.net Mon Oct 31 00:46:07 2005 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 31 Oct 2005 00:46:07 +0100 Subject: [Python-Dev] PEP 351, the freeze protocol In-Reply-To: References: <1130107429.11268.40.camel@geddy.wooz.org> <20051029164637.39E1.JCARLSON@uci.edu> Message-ID: <1130715967.6180.77.camel@fsol> > It allows everything in Python to be both mutable and hashable, I don't understand, since it's already the case. Any user-defined object is at the same time mutable and hashable. And if you want the hash value to follow the changes in attribute values, just define an appropriate __hash__ method. Regards Antoine. 
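Gustavo's StreamHandler patch above narrows the bare ``except:`` so that Ctrl-C propagates out of emit(). A self-contained illustration of the difference (the function names here are made up for the demo, not the actual logging code):

```python
# A bare "except:" swallows KeyboardInterrupt; re-raising it first
# (as in the patch) lets the interrupt reach the caller.

def emit_swallowing():
    try:
        raise KeyboardInterrupt
    except:                      # old behaviour: everything is eaten
        pass                     # handleError(record) would run here

def emit_reraising():
    try:
        raise KeyboardInterrupt
    except KeyboardInterrupt:    # patched behaviour
        raise
    except:
        pass

emit_swallowing()                # returns silently -- the Ctrl-C is lost
try:
    emit_reraising()
except KeyboardInterrupt:
    print("KeyboardInterrupt propagated")   # reached with the patch
```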
From skip at pobox.com Mon Oct 31 02:08:22 2005 From: skip at pobox.com (skip@pobox.com) Date: Sun, 30 Oct 2005 19:08:22 -0600 Subject: [Python-Dev] svn checksum error In-Reply-To: <43654CA7.8030200@v.loewis.de> References: <17252.59531.252751.768301@montanaro.dyndns.org> <43654CA7.8030200@v.loewis.de> Message-ID: <17253.28294.538932.570903@montanaro.dyndns.org> Martin> The natural question then is: what operating system, what Martin> subversion version are you using? Sorry, wasn't thinking in terms of svn bugs. I was anticipating some sort of obvious pilot error. I am on Mac OSX 10.3.9, running svn 1.1.3 I built from source back in the May timeframe. Should I upgrade to 1.2.3 as a matter of course? Fredrik> "welcome to the wonderful world of subversion error messages" ... Fredrik> deleting the offending directory and doing "svn up" is the Fredrik> easiest way to fix this. Thanks. I zapped Objects. The next svn up complained about Misc. The next about Lib. After that, the next svn up ran to completion. Skip From solipsis at pitrou.net Mon Oct 31 02:17:50 2005 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 31 Oct 2005 02:17:50 +0100 Subject: [Python-Dev] svn checksum error In-Reply-To: <17253.28294.538932.570903@montanaro.dyndns.org> References: <17252.59531.252751.768301@montanaro.dyndns.org> <43654CA7.8030200@v.loewis.de> <17253.28294.538932.570903@montanaro.dyndns.org> Message-ID: <1130721470.6180.87.camel@fsol> Le dimanche 30 octobre 2005 à 19:08 -0600, skip at pobox.com a écrit : > Sorry, wasn't thinking in terms of svn bugs. I was anticipating some sort > of obvious pilot error. I am on Mac OSX 10.3.9, running svn 1.1.3 I built > from source back in the May timeframe. Should I upgrade to 1.2.3 as a > matter of course? IIRC, the version provided by Fink works fine. No need to compile manually.
From pinard at iro.umontreal.ca Mon Oct 31 03:25:54 2005 From: pinard at iro.umontreal.ca (=?iso-8859-1?Q?Fran=E7ois?= Pinard) Date: Sun, 30 Oct 2005 21:25:54 -0500 Subject: [Python-Dev] Divorcing str and unicode (no more implicit conversions). In-Reply-To: <43638BC0.40108@v.loewis.de> References: <50862ebd0510271721i77c1ebb4x6bcb39a4756c3a99@mail.gmail.com> <4362A44F.9010506@v.loewis.de> <20051029110331.D5AA.ISHIMOTO@gembook.org> <4363395A.3040606@v.loewis.de> <1130589142.5945.11.camel@fsol> <43638BC0.40108@v.loewis.de> Message-ID: <20051031022554.GA20255@alcyon.progiciels-bpi.ca> [Martin von Löwis] > My canonical example is François Pinard, who keeps requesting it, > saying that local people were surprised they couldn't use accented > characters in Python. Perhaps that's because he actually is Quebecian > :-) I presume I should comment a bit on this. People here are not "surprised" they couldn't use accented characters, they are rather saddened, and some hoped that Python would offer that possibility, one of these days. Also given that here, every production program or system has been progressively rewritten in Python, slowly at first, and more aggressively while the confidence was building up, to the point that not much of the non-Python things remain by now. So, all our hopes are concentrated into a single language. All development is done in house by French people. All documentation, external or internal, comments, identifier and function names, everything is in French. Some of the developers here have had a long programming life, while they only barely read English. It is surely a constant frustration, for some of us, having to mangle identifiers by stripping out their necessary diacritics. It does not look good, it does not smell good, and in many cases, mangling identifiers significantly decreases program legibility.
Now, I keep reading strange arguments from people opposing our use of national letters in identifiers, disturbed by the fact that they would have a hard time reading our code or publishing it. Even worse, some want to protect us (and the world) against ourselves, using made-up, irrational arguments, producing false logic out of their own emotions and feelings. They would like us to think, write, and publish in English. Is it some anachronistic colonialism? Quite possible. It surely has some success, as you may find some French people that will only swear in English! :-) For one, in my programming life, I surely chose to write a lot of English code, and I still think English is a good vehicle for planetary communication. However, I like it to be my choice. I always felt very open and collaborative with similarly minded people, and for them, happily rewrote my things from French to English in view of sharing, whenever I saw some mutual advantage to it. I resent when people want to force me into English when I have no real reason to do so. Let me choose to use my own language, as nicely as I can, when working in-shop with people sharing this language with me, for programs that will likely never be published outside anyway. Internationalisation is already granted in our overall view of today's programming, as a way of letting people be comfortable with computers, each in his/her own language. This comfort should extend widely to naming main programming objects (functions, classes, variables, modules) as legibly as possible. Here, I mean legible in an ideal way for the team or the local community, and not necessarily legible to the whole planet. It does not always have to be planetary, you know. For keywords, the need is less stringent, as syntactical constructs are part of a language.
When English is opaque to a programmer, he/she can easily learn that small set of words making up the syntax, understanding their effect, even while not necessarily understanding the real English meaning of those keywords. This is not a real obstacle in practice. It is true that many Python tools are not prepared to handle internationalised identifiers, and it is very unlikely that these tools will get ready before Python opens itself to internationalised identifiers. Let's open Python first; tools will undoubtedly follow. There will be some adaptation period, but after some while, everything will fall into place, things will become smooth again and just natural to everybody, to the point that many of us might remember the current times, and wonder what all that fuss was about. :-) -- François Pinard http://pinard.progiciels-bpi.ca From rhamph at gmail.com Mon Oct 31 04:21:29 2005 From: rhamph at gmail.com (Adam Olsen) Date: Sun, 30 Oct 2005 20:21:29 -0700 Subject: [Python-Dev] Divorcing str and unicode (no more implicit conversions). In-Reply-To: <20051031022554.GA20255@alcyon.progiciels-bpi.ca> References: <50862ebd0510271721i77c1ebb4x6bcb39a4756c3a99@mail.gmail.com> <4362A44F.9010506@v.loewis.de> <20051029110331.D5AA.ISHIMOTO@gembook.org> <4363395A.3040606@v.loewis.de> <1130589142.5945.11.camel@fsol> <43638BC0.40108@v.loewis.de> <20051031022554.GA20255@alcyon.progiciels-bpi.ca> Message-ID: On 10/30/05, François Pinard wrote: > All development is done in house by French people. All documentation, > external or internal, comments, identifier and function names, > everything is in French. Some of the developers here have had a long > programming life, while they only barely read English. It is surely > a constant frustration, for some of us, having to mangle identifiers by > stripping out their necessary diacritics. It does not look good, it > does not smell good, and in many cases, mangling identifiers > significantly decreases program legibility. Hear, hear!
Not all the world uses English, and restricting them to Latin characters simply means it's not readable in any language. It doesn't make it any more readable for those of us who only understand English. +1 on internationalized identifiers. -- Adam Olsen, aka Rhamphoryncus From fdrake at acm.org Mon Oct 31 06:16:18 2005 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Mon, 31 Oct 2005 01:16:18 -0400 Subject: [Python-Dev] StreamHandler eating exceptions In-Reply-To: <20051030233757.GB8344@localhost.localdomain> References: <20051030233757.GB8344@localhost.localdomain> Message-ID: <200510310016.19467.fdrake@acm.org> On Sunday 30 October 2005 18:37, Gustavo Niemeyer wrote: > I'd like to apply the following patch: +1 Might want to include SystemExit as well, though I think that's less likely to be seen in practice. -Fred -- Fred L. Drake, Jr. From jcarlson at uci.edu Mon Oct 31 06:09:17 2005 From: jcarlson at uci.edu (Josiah Carlson) Date: Sun, 30 Oct 2005 22:09:17 -0700 Subject: [Python-Dev] PEP 351, the freeze protocol In-Reply-To: References: <20051029164637.39E1.JCARLSON@uci.edu> Message-ID: <20051030202958.39FD.JCARLSON@uci.edu> Noam Raphael wrote: > > Hello, > > It seems that we both agree that freezing is cool (-; . We disagree on > whether a copy-on-write behaviour is desired. Your arguments against > copy-on-write are: > 1. It's not needed. > 2. It's complicated to implement. > > But first of all, you didn't like my use cases. I want to argue with that. > > > In reading your description of a 'table of values', I can't help but be > > reminded of the wxPython (and wxWidget) wx.Grid and its semantics. It > > offers arbitrary tables of values (whose editors and viewers you can > > change at will), which offers a mechanism by which you can "listen" to > > changes that occur to the contents of a cell.
I can't help but think > > that if you offered a protocol by which a user can signal that a cell > > has been changed, perhaps by writing the value to the table itself > > (table.SetValue(row, col, value)), every read a deepcopy (or a PEP 351 > > freeze), etc., that both you and the users of your table would be much > > happier. > > Perhaps I didn't make it clear. The difference between wxPython's Grid > and my table is that in the table, most values are *computed*. This > means that there's no point in changing the values themselves. They > are also used frequently as set members (I can describe why, but it's > a bit complicated.) Again, user semantics. Tell your users not to modify entries, and/or you can make copies of objects you return. If your users are too daft to read and/or follow directions, then they deserve to have their software not work. Also from the sounds of it, you are storing both source and destination values in the same table...hrm, that sounds quite a bit like a spreadsheet. How does every spreadsheet handle that again? Oh yeah, they only ever store immutables (generally strings which are interpreted). But I suppose since you are (of course) storing mutable objects, you need to work a bit harder...so store mutables, and return immutable copies (which you can cache if you want, and invalidate when your application updates the results...like a wx.Grid update on changed). > > As for the graph issue, you've got a bigger problem than users just > > being able to edit edge lists, users can clear the entire dictionary of > > vertices (outgoing.clear()). It seems to me that a more reasonable > > method to handle this particular case is to tell your users "don't > > modify the dictionaries or the edge lists", and/or store your edge lists > > as tuples instead of lists or dictionaries, and/or use an immutable > > dictionary (as offered by Barry in the PEP). 
> > As I wrote before, telling my users "don't modify the edge lists" is > just like making lists hashable, and telling all Python users, "don't > modify lists that are dictionary keys." There's no way to tell the > users that - there's no convention for objects which should not be > changed. You can write it in the documentation, but who'll bother > looking there? When someone complains that something doesn't work, I tell them to read the documentation. If your users haven't been told to RTFM often enough to actually make it happen, then you need a RTFM-bat. Want to know how you make one? You start wrapping the objects you return which segfaults the process if they change things. When they start asking, tell them it is documented quite clearly "do not to modify objects returned, or else". Then there's the other option, which I provide below. > I don't think that your other suggestions will work: the data > structure of the graph itself can't be made of immutable objects, > because of the fact that the graph is a mutable object - you can > change it. It can be made of immutable objects, but this means copying > all the data every time the graph changes. So let me get this straight: You've got a graph. You want to be able to change the graph, but you don't want your users to accidentally change the graph. Sounds to me like an API problem, not a freeze()/mutable problem. Want an API?

class graph:
    ...
    def IterOutgoing(self, node):
        ...
    def IterIncoming(self, node):
        ...
    def IsAdjacent(self, node1, node2):
        ...
    def IterNodes(self):
        ...
    def AddEdge(self, f_node, t_node):
        ...
    def RemEdge(self, node1, node2):
        ...
    def AddNode(self):
        ...

If you are reasonable in your implementation, all of the above operations can be fast, and you will never have to worry about your users accidentally mucking about with your internal data structures: because you aren't exposing them.
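One plausible fleshing-out of the interface Josiah outlines - the internals and the snapshot-on-iteration behaviour are my guesses, not his:

```python
class Graph:
    """Directed graph with encapsulated internals (sketch of the API above)."""

    def __init__(self):
        self._next_id = 0
        self._out = {}   # node -> set of successor nodes
        self._in = {}    # node -> set of predecessor nodes

    def AddNode(self):
        node = self._next_id
        self._next_id += 1
        self._out[node] = set()
        self._in[node] = set()
        return node

    def AddEdge(self, f_node, t_node):
        self._out[f_node].add(t_node)
        self._in[t_node].add(f_node)

    def RemEdge(self, node1, node2):
        self._out[node1].discard(node2)
        self._in[node2].discard(node1)

    def IsAdjacent(self, node1, node2):
        return node2 in self._out[node1]

    def IterOutgoing(self, node):
        # hand out a snapshot, so callers can't mutate the adjacency sets
        return iter(frozenset(self._out[node]))

    def IterIncoming(self, node):
        return iter(frozenset(self._in[node]))

    def IterNodes(self):
        return iter(tuple(self._out))


g = Graph()
a, b = g.AddNode(), g.AddNode()
g.AddEdge(a, b)
assert g.IsAdjacent(a, b) and not g.IsAdjacent(b, a)
assert list(g.IterOutgoing(a)) == [b]
```

Because the adjacency sets never leave the class, nothing a caller does to the iterated values can corrupt the graph.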
If you are really paranoid, you can take the next step and implement it in Pyrex or C, so that only a malicious user can muck about with internal structures, at which point you stop supporting them. > Now, about copy-on-write: > > > There's also this little issue of "copy on write" semantics with Python. > > Anyone who tells you that "copy on write" is easy, is probably hanging > > out with the same kind of people who say that "threading is easy". Of > > course both are easy if you limit your uses to some small subset of > > interesting interactions, but "copy on write" gets far harder when you > > start thinking of dictionaries, lists, StringIOs, arrays, and all the > > possible user-defined classes, which may be mutated beyond obj[key] = > > value and/or obj.attr = value (some have obj.method() which mutates the > > object). As such, offering a callback mechanism similar to weak > > references is probably pretty close to impossible with CPython. > > Let's limit ourselves to copy-on-write of objects which do not contain > nonfrozen objects. Perhaps it's enough - the table, the graph, and > strings, are perfect examples of these. Implementation doesn't seem too > complicated to me - whenever the object is about to change, and there > is a connected frozen copy, you make a shallow copy of the object, > point the frozen copy to it, release the reference to the frozen copy, > and continue as usual. That's all. What you have written here is fairly unintelligible, but thankfully you clarify yourself...pity it still doesn't work, as I explain below. [snip] > The problem is that we have to keep the created frozen object as long > as the original object stays alive. So we have to know if it has > changed. This is where those callbacks come in. As long as what is done > with them is correct, there should be no problems. They are used only > to disengage the frozen copies from their original objects.
> The action they should trigger is simply this:
>
> def on_contained_object_change(self):
>     self._frozen_copy = None
>     while self._callbacks:
>         self._callbacks.pop()()
>
> What's also interesting is that this freezing mechanism can be
> provided automatically for user-created classes, since those are
> simply containers of other objects, which behave exactly like dicts,
> for this matter.
>
> It allows everything in Python to be both mutable and hashable,
> without changing the O() complexity! Wow!
>
> Ok, I'm going to sleep now. If you find something wrong with this
> idea, please tell me.

Here is where you are wrong.

x = []
for i in xrange(k):
    x.append(range(k))

We now have a simple list of lists, k lists, each of length k. Let's freeze it.

y = frozen(x)

Ok, now we have a recursively frozen list of lists y, implemented however you want, and you've amortized this ONE call to creation time. We'll give y to someone who does whatever he wants to it, and we'll continue on.

z = frozen(x)

Your claim is that due to the cache, the above operation can be done in constant time after you have already frozen x. This is wrong, but we'll get to that. Let us mutate one of the contained lists, and see if this can continue...

x[0][0] = 7

Oh hrm. This invalidates x[0]'s cached frozen object, which would suggest that x's cached frozen object was also invalidated, even though Python objects tend to know nothing about the objects which point to them. Well, that's a rub. In order to /validate/ that an object's cached frozen object is valid, you must check that the contents of your cached frozen object point to the cached frozen objects of your contents. Or in code (for this two-level example)...
def frozen(x):
    if x.frozen_cache:
        for i, j in zip(x.contents, x.frozen_cache):
            if i.frozen_cache is not j:
                x.frozen_cache = None
                x.frozen_cache = frozen(x)
                return x.frozen_cache
    x.frozen_cache = tuple(map(frozen, x.contents))
    return x.frozen_cache

Ouch, for the list of lists example with a total size O(k**2), you need to spend O(k) time. We'll say that n == k**2, so really, for this particular object of size O(n), you need to spend O(sqrt(n)) time verifying. Not quite constant. But wait...in order to verify that every cached frozen object is valid...we should have been checking the contents of x[i] to verify that they were frozen too! Wow, that would take us O(n) time just to verify. And we would need to do that EVERY TIME WE CALLED frozen(x), REGARDLESS OF WHETHER x WAS MUTATED! Wait a second...isn't that just the same as just recursively calling freeze? Yes. Are we actually saving any time? No. What has this idea resulted in? The incorrect belief that caching frozen objects will reduce the computation necessary in freeze(x), and a proposed new attribute on literally every object which points to an immutable copy of itself, generally doubling the amount of memory used. Like I said before, it's not so easy. In order to actually implement the system, every object in an object hierarchy would need to know about its parent, but such cannot work because...

a = range(10)
b = [a]
c = [a]
bf = frozen(b)
cf = frozen(c)
a[0] = 10  # oops!

That last line is the killer. Every object would necessarily need to know about all containers which point to it, and would necessarily need to notify them all that their contents had changed. I personally don't see Python doing that any time soon.
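Josiah's killer example runs as-is with a recursive tuple snapshot standing in for frozen() (the real proposal would cache; this stand-in just makes the staleness concrete):

```python
# Stand-in for frozen(): take a recursive snapshot as nested tuples.
def freeze(obj):
    if isinstance(obj, list):
        return tuple(freeze(item) for item in obj)
    return obj

a = list(range(10))
b = [a]
c = [a]
bf = freeze(b)
cf = freeze(c)
a[0] = 10          # oops! neither b nor c hears about this mutation
assert bf[0][0] == 0 and cf[0][0] == 0   # both snapshots are now stale
assert b[0][0] == 10                     # ...while the live data moved on
```

The list a has no idea that b and c (let alone their cached frozen copies) refer to it, which is exactly the missing back-pointer problem described above.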
Hope you sleep/slept well, - Josiah From martin at v.loewis.de Mon Oct 31 08:55:09 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 31 Oct 2005 08:55:09 +0100 Subject: [Python-Dev] svn checksum error In-Reply-To: <17253.28294.538932.570903@montanaro.dyndns.org> References: <17252.59531.252751.768301@montanaro.dyndns.org> <43654CA7.8030200@v.loewis.de> <17253.28294.538932.570903@montanaro.dyndns.org> Message-ID: <4365CDDD.5060502@v.loewis.de> skip at pobox.com wrote: > Martin> The natural question then is: what operating system, what > Martin> subversion version are you using? > > Sorry, wasn't thinking in terms of svn bugs. I was anticipating some sort > of obvious pilot error. I am on Mac OSX 10.3.9, running svn 1.1.3 I built > from source back in the May timeframe. Should I upgrade to 1.2.3 as a > matter of course? Not sure. The only mention of this specific message was about Linux and multi-threading, in svnserve. Apparently, due to some race condition/pthread locking semantics problems, something could get corrupted. It could be a compiler bug as well, of course. I would try to get some "official" binaries; 1.2.x works just fine with svn.python.org as well. Regards, Martin From vinay_sajip at yahoo.co.uk Mon Oct 31 14:59:53 2005 From: vinay_sajip at yahoo.co.uk (Vinay Sajip) Date: Mon, 31 Oct 2005 13:59:53 +0000 (UTC) Subject: [Python-Dev] StreamHandler eating exceptions References: <20051030233757.GB8344@localhost.localdomain> Message-ID: Gustavo Niemeyer niemeyer.net> writes: > > The StreamHandler available under the logging package is currently > catching all exceptions under the emit() method call. In the > Handler.handleError() documentation it's mentioned that it's > implemented like that because users do not care about errors > in the logging system. > > I'd like to apply the following patch: [patch snipped] > Anyone against the change? > Good idea.
I've checked into svn a patch for both logging/__init__.py and logging/handlers.py which re-raises both KeyboardInterrupt and SystemExit raised during emit(). From guido at python.org Mon Oct 31 16:26:26 2005 From: guido at python.org (Guido van Rossum) Date: Mon, 31 Oct 2005 08:26:26 -0700 Subject: [Python-Dev] StreamHandler eating exceptions In-Reply-To: References: <20051030233757.GB8344@localhost.localdomain> Message-ID: I wonder if, once PEP 352 is accepted, this shouldn't be changed so that there is only one except clause instead of two, and it says "except Exception:". This has roughly the same effect as the proposed (and already applied) patch, but does it in a Python-3000-compatible way. ATM it is less robust because it doesn't catch exceptions that don't derive from Exception -- but in all cases where this particular try/except has saved my butt (yes it has! multiple times! :-), the exception thrown was a standard exception. (Did anybody else notice the synchronicity of this request with the PEP 352 discussion?) On 10/31/05, Vinay Sajip wrote: > Gustavo Niemeyer niemeyer.net> writes: > > > > > The StreamHandler available under the logging package is currently > > catching all exceptions under the emit() method call. In the > > Handler.handleError() documentation it's mentioned that it's > > implemented like that because users do not care about errors > > in the logging system. > > > > I'd like to apply the following patch: > [patch snipped] > > Anyone against the change? > > > > Good idea. I've checked into svn a patch for both logging/__init__.py and > logging/handlers.py which re-raises both KeyboardInterrupt and SystemExit raised > during emit().
> > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/guido%40python.org > -- --Guido van Rossum (home page: http://www.python.org/~guido/) From steve at holdenweb.com Mon Oct 31 16:51:13 2005 From: steve at holdenweb.com (Steve Holden) Date: Mon, 31 Oct 2005 15:51:13 +0000 Subject: [Python-Dev] Divorcing str and unicode (no more implicit conversions). In-Reply-To: References: <50862ebd0510271721i77c1ebb4x6bcb39a4756c3a99@mail.gmail.com> <4362A44F.9010506@v.loewis.de> <20051029110331.D5AA.ISHIMOTO@gembook.org> <4363395A.3040606@v.loewis.de> <1130589142.5945.11.camel@fsol> <43638BC0.40108@v.loewis.de> <20051031022554.GA20255@alcyon.progiciels-bpi.ca> Message-ID: Adam Olsen wrote: > On 10/30/05, François Pinard wrote: > >>All development is done in house by French people. All documentation, >>external or internal, comments, identifier and function names, >>everything is in French. Some of the developers here have had a long >>programming life, while they only barely read English. It is surely >>a constant frustration, for some of us, having to mangle identifiers by >>stripping out their necessary diacritics. It does not look good, it >>does not smell good, and in many cases, mangling identifiers >>significantly decreases program legibility. > > > Hear, hear! Not all the world uses English, and restricting them to > Latin characters simply means it's not readable in any language. It > doesn't make it any more readable for those of us who only understand > English. > > +1 on internationalized identifiers.
>
While I agree with the sentiments expressed, I think we should not underestimate the practical problems that moving away from plain ASCII would bring. Therefore, if such steps are really going to be considered, I would really like to see them introduced in such a way that no breakage occurs for existing users, even the parochial ones who feel they (and their programs) don't need to understand anything but ASCII. If this means starting out with the features conditionally compiled, despite the added cost of the #ifdefs that would thereby be engendered, I think that would be a good idea. We can fix their programs by making Unicode the default string type, but it will take much longer to fix their thinking. regards Steve -- Steve Holden +44 150 684 7255 +1 800 494 3119 Holden Web LLC www.holdenweb.com PyCon TX 2006 www.python.org/pycon/ From steve at holdenweb.com Mon Oct 31 16:53:25 2005 From: steve at holdenweb.com (Steve Holden) Date: Mon, 31 Oct 2005 15:53:25 +0000 Subject: [Python-Dev] PEP 351, the freeze protocol In-Reply-To: <20051030202958.39FD.JCARLSON@uci.edu> References: <20051029164637.39E1.JCARLSON@uci.edu> <20051030202958.39FD.JCARLSON@uci.edu> Message-ID: Josiah Carlson wrote: [...] >>Perhaps I didn't make it clear. The difference between wxPython's Grid >>and my table is that in the table, most values are *computed*. This >>means that there's no point in changing the values themselves. They >>are also used frequently as set members (I can describe why, but it's >>a bit complicated.) > > > Again, user semantics. Tell your users not to modify entries, and/or > you can make copies of objects you return. If your users are too daft > to read and/or follow directions, then they deserve to have their > software not work. > That sounds like a "get out of jail free" card for Microsoft and many other software vendors ...
regards Steve -- Steve Holden +44 150 684 7255 +1 800 494 3119 Holden Web LLC www.holdenweb.com PyCon TX 2006 www.python.org/pycon/ From orent at hishome.net Mon Oct 31 18:28:26 2005 From: orent at hishome.net (Oren Tirosh) Date: Mon, 31 Oct 2005 19:28:26 +0200 Subject: [Python-Dev] PEP 351, the freeze protocol In-Reply-To: <1130715967.6180.77.camel@fsol> References: <1130107429.11268.40.camel@geddy.wooz.org> <20051029164637.39E1.JCARLSON@uci.edu> <1130715967.6180.77.camel@fsol> Message-ID: <7168d65a0510310928y178faddav5b0551c4ed8dac60@mail.gmail.com> On 10/31/05, Antoine Pitrou wrote: > > > It allows everything in Python to be both mutable and hashable, > > I don't understand, since it's already the case. Any user-defined object > is at the same time mutable and hashable. By default, user-defined objects are equal iff they are the same object, regardless of their content. This makes mutability a non-issue. If you want to allow different objects be equal you need to implement a consistent equality operator (commutative, etc), a consistent hash function and ensure that any attributes affecting equality or hash value are immutable. If you fail to meet any of these requirements and put such objects in dictionaries or sets it will result in undefined behavior that may change between Python versions and implementations. Oren From martin at v.loewis.de Mon Oct 31 19:21:17 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 31 Oct 2005 19:21:17 +0100 Subject: [Python-Dev] i18n identifiers (was: Divorcing str and unicode (no more implicit conversions). 
In-Reply-To: References: <50862ebd0510271721i77c1ebb4x6bcb39a4756c3a99@mail.gmail.com> <4362A44F.9010506@v.loewis.de> <20051029110331.D5AA.ISHIMOTO@gembook.org> <4363395A.3040606@v.loewis.de> <1130589142.5945.11.camel@fsol> <43638BC0.40108@v.loewis.de> <20051031022554.GA20255@alcyon.progiciels-bpi.ca> Message-ID: <4366609D.4010205@v.loewis.de> Steve Holden wrote: > Therefore, if such steps are really going to be considered, I would > really like to see them introduced in such a way that no breakage occurs > for existing users, even the parochial ones who feel they (and their > programs) don't need to understand anything but ASCII. It is straight-forward to make this feature completely backwards compatible. Syntactically, it is a pure extension: existing code will continue to work unmodified, and will continue to have the same meaning. With the feature, you will be able to write code that previously produced SyntaxErrors. Semantically, the only potential incompatibility is that you might find Unicode strings in __dict__. If purely-ASCII identifiers are going to be represented by byte strings (as they are now), no change in meaning for existing code is anticipated. So it is not necessary to make the feature conditional to preserve compatibility. Regards, Martin From mal at egenix.com Mon Oct 31 19:51:43 2005 From: mal at egenix.com (M.-A. Lemburg) Date: Mon, 31 Oct 2005 19:51:43 +0100 Subject: [Python-Dev] i18n identifiers In-Reply-To: <4366609D.4010205@v.loewis.de> References: <50862ebd0510271721i77c1ebb4x6bcb39a4756c3a99@mail.gmail.com> <4362A44F.9010506@v.loewis.de> <20051029110331.D5AA.ISHIMOTO@gembook.org> <4363395A.3040606@v.loewis.de> <1130589142.5945.11.camel@fsol> <43638BC0.40108@v.loewis.de> <20051031022554.GA20255@alcyon.progiciels-bpi.ca> <4366609D.4010205@v.loewis.de> Message-ID: <436667BF.4050903@egenix.com> Martin v. 
Löwis wrote: > Steve Holden wrote: > >>Therefore, if such steps are really going to be considered, I would >>really like to see them introduced in such a way that no breakage occurs >>for existing users, even the parochial ones who feel they (and their >>programs) don't need to understand anything but ASCII. > > > It is straight-forward to make this feature completely backwards > compatible. Syntactically, it is a pure extension: existing code > will continue to work unmodified, and will continue to have the > same meaning. With the feature, you will be able to write code > that previously produced SyntaxErrors. > > Semantically, the only potential incompatibility is that you > might find Unicode strings in __dict__. If purely-ASCII identifiers > are going to be represented by byte strings (as they are now), > no change in meaning for existing code is anticipated. > > So it is not necessary to make the feature conditional to preserve > compatibility. If people are really all enthusiastic about such a feature, then it should happen in Python3k when the parser is rewritten to work on Unicode natively. Note that if you start with this now, a single module in your application using Unicode identifiers could potentially break the application: simply by the fact that stack frames, tracebacks and module globals would now contain Unicode. Any processing done on the identifiers, like e.g. error formatting would then have to deal with Unicode objects (due to the automatic conversion). -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Oct 31 2005) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free !
::::

From blais at furius.ca  Mon Oct 31 20:13:05 2005
From: blais at furius.ca (Martin Blais)
Date: Mon, 31 Oct 2005 14:13:05 -0500
Subject: [Python-Dev] a different kind of reduce...
Message-ID: <8393fff0510311113p63bc194ak88580f84a25b1a1a@mail.gmail.com>

Hi

I find myself occasionally doing this:

   ... = dirname(dirname(dirname(p)))

I'm always--literally every time--looking for a more functional form,
something that would be like this:

   # apply dirname() 3 times on its results, initializing with p
   ... = repapply(dirname, 3, p)

There is a way to hack something like that with reduce, but it's not
pretty--it involves creating a temporary list and a lambda function:

   ... = reduce(lambda x, y: dirname(x), [p] + [None] * 3)

Just wondering, does anybody know how to do this nicely? Is there an
easy form that allows me to do this?

cheers,

From guido at python.org  Mon Oct 31 20:24:02 2005
From: guido at python.org (Guido van Rossum)
Date: Mon, 31 Oct 2005 12:24:02 -0700
Subject: [Python-Dev] PEP 352 Transition Plan
In-Reply-To: 
References: <007d01c5dc00$738da2e0$b62dc797@oemcomputer>
	<4362DD15.4080606@gmail.com>
Message-ID: 

I've made a final pass over PEP 352, mostly fixing the __str__,
__unicode__ and __repr__ methods to behave more reasonably. I'm all for
accepting it now. Does anybody see any last-minute show-stopping
problems with it?

As always, http://python.org/peps/pep-0352.html

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From noamraph at gmail.com  Mon Oct 31 20:27:30 2005
From: noamraph at gmail.com (Noam Raphael)
Date: Mon, 31 Oct 2005 21:27:30 +0200
Subject: [Python-Dev] PEP 351, the freeze protocol
In-Reply-To: <20051030202958.39FD.JCARLSON@uci.edu>
References: <20051029164637.39E1.JCARLSON@uci.edu>
	<20051030202958.39FD.JCARLSON@uci.edu>
Message-ID: 

Hello,

I have slept quite well, and talked about it with a few people, and I
still think I'm right.

About the users-changing-my-internal-data issue:

> Again, user semantics. Tell your users not to modify entries, and/or
> you can make copies of objects you return. If your users are too daft
> to read and/or follow directions, then they deserve to have their
> software not work.
...
> When someone complains that something doesn't work, I tell them to read
> the documentation. If your users haven't been told to RTFM often enough
> to actually make it happen, then you need a RTFM-bat. Want to know how
> you make one? You start wrapping the objects you return which segfaults
> the process if they change things. When they start asking, tell them it
> is documented quite clearly "do not modify objects returned, or else".
> Then there's the other option, which I provide below.

I disagree. I read the manual when I don't know what something does. If
I can guess what it does (and this is where conventions are good), I
don't read the manual. And let's say I ought to read the complete manual
for every method that I use, and that I deserve a death punishment (or a
segmentation fault) if I don't. But let's say that I wrote some
software, without reading the manual, and it worked. I have gone to work
on other things, and suddenly a bug arises. When the poor guy who needs
to fix it goes over the code, everything looks absolutely correct.
Should he also read the complete manuals of every library that I used,
in order to fix that bug? And remember that in this case, the object
could have traveled between several places (including functions in other
libraries) before it was changed, and the original data structure starts
behaving weirdly.

You suggest two ways for solving the problem. The first is by copying my
mutable objects to immutable copies:

> Also from the sounds of it, you are storing both source and destination
> values in the same table...hrm, that sounds quite a bit like a
> spreadsheet. How does every spreadsheet handle that again? Oh yeah,
> they only ever store immutables (generally strings which are interpreted).
> But I suppose since you are (of course) storing mutable objects, you
> need to work a bit harder...so store mutables, and return immutable
> copies (which you can cache if you want, and invalidate when your
> application updates the results...like a wx.Grid update on changed).

This is basically ok. It's just that in my solution, for many objects
it's not necessary to make a complete copy just to prevent changing the
value: making frozen copies of objects which can't reference nonfrozen
objects (sets, for example) costs O(1), thanks to the copy-on-write.

Now, about the graph:

> So let me get this straight: You've got a graph. You want to be able to
> change the graph, but you don't want your users to accidentally change
> the graph. Sounds to me like an API problem, not a freeze()/mutable
> problem. Want an API?
> 
> class graph:
>     ...
>     def IterOutgoing(self, node):
>         ...
>     def IterIncoming(self, node):
>         ...
>     def IsAdjacent(self, node1, node2):
>         ...
>     def IterNodes(self):
>         ...
>     def AddEdge(self, f_node, t_node):
>         ...
>     def RemEdge(self, node1, node2):
>         ...
>     def AddNode(self):
>         ...
> 
> If you are reasonable in your implementation, all of the above
> operations can be fast, and you will never have to worry about your
> users accidentally mucking about with your internal data structures:
> because you aren't exposing them. If you are really paranoid, you can
> take the next step and implement it in Pyrex or C, so that only a
> malicious user can muck about with internal structures, at which point
> you stop supporting them.

This will work. It's simply... well, not very beautiful. I have to
invent a lot of names, and my users need to remember them all. If I give
them a frozen set with all the vertices that a vertex points to (which
is an absolutely reasonable API), they will know how to deal with it
without learning a lot of method names, thanks to the fact that they are
already familiar with sets, and that a lot of thought has gone into the
set interface.
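[Noam's frozen-set interface can be sketched today without any new protocol, by handing out frozenset snapshots of the internal adjacency sets. This is a hypothetical illustration; the class and method names are invented for the sketch and are not taken from either poster's code:]

```python
class Graph:
    """Directed graph whose neighbour queries return frozenset snapshots."""

    def __init__(self):
        self._out = {}  # node -> set of successor nodes

    def add_node(self, node):
        self._out.setdefault(node, set())

    def add_edge(self, frm, to):
        self.add_node(frm)
        self.add_node(to)
        self._out[frm].add(to)

    def outgoing(self, node):
        # Users get the familiar set interface, but the snapshot is
        # immutable, so they cannot corrupt the graph's internals.
        return frozenset(self._out[node])

g = Graph()
g.add_edge(1, 2)
g.add_edge(1, 3)
assert g.outgoing(1) == {2, 3}
```

[The cost is one copy per query, which is exactly the trade-off under debate: Noam's frozen()-with-copy-on-write would avoid the copy in the common case, while Josiah's answer is to cache the snapshot and invalidate it on mutation.]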
> > Now, about copy-on-write:
...
> What you have written here is fairly unintelligible, but thankfully you
> clarify yourself...pity it still doesn't work, I explain below.

I'm sorry if I am sometimes hard to understand. English is not my mother
tongue, and it degrades as the hour gets later - and sometimes, things
are hard to explain. If I don't explain myself, please say so and I'll
try again. This is an excellent example - I wrote about callbacks, and
went to sleep. Let me try to explain again how it *does* work.

The frozen() function, and the __frozen__ protocol, would get another
optional argument - an object to be notified when the *nonfrozen* object
has changed. It may be called at most once - only on the first change to
the object, and only if the object which requested to be notified is
still alive. I now introduce a second protocol, which I will call
__changed__. Objects would be notified by calling their __changed__
method.

You say that every frozen() call takes O(n), because it needs to verify
that objects haven't changed since the last call:

> Oh hrm. This invalidates x[0]'s cached frozen object, which would
> suggest that x's cached frozen object was also invalidated, even though
> Python objects tend to know nothing about the objects which point to
> them. Well, that's a rub. In order to /validate/ that an object's
> cached object is valid, you must validate that the contents of your
> cached frozen object points to the cached frozen objects of your
> contents. Or in code (for this two-level example)...
> 
> def frozen(x):
>     if x.frozen_cache:
>         for i,j in zip(x.contents, x.frozen_cache):
>             if i.frozen_cache is not j:
>                 x.frozen_cache = None
>                 x.frozen_cache = frozen(x)
>                 return x.frozen_cache
>     x.frozen_cache = tuple(map(frozen, x.contents))
>     return x.frozen_cache

This is not so. When a list creates its frozen copy, it gives itself to
all those frozen() calls. This means that it will be notified if one of
its members is changed. In that case, it has to do two simple actions:
1. release the reference to its frozen copy, so that subsequent freezes
of the list would create a new frozen copy, and 2. notify any object
which froze the list and requested notification about the change.

This frees us of any validation code. If we haven't been notified about
a change, there was no change, and the frozen copy is valid.

In case you ask, the cost of notification is O(1), amortized. The reason
is that every frozen() call can cause at most one callback in the
future.

> Like I said before, it's not so easy. In order to actually implement
> the system, every object in an object hierarchy would need to know
> about its parent, but such cannot work because...
> 
> a = range(10)
> b = [a]
> c = [a]
> bf = frozen(b)
> cf = frozen(c)
> a[0] = 10 #oops!
> 
> That last line is the killer. Every object would necessarily need to
> know about all containers which point to it, and would necessarily need
> to notify them all that their contents had changed. I personally don't
> see Python doing that any time soon.

This is not the case. Every object has to know only about the objects
which created frozen copies of it and requested notifications. Actually,
the object itself doesn't have to store anything. I thought about it,
and you can create a module for handling those change-callbacks. It
would store only weak references to objects. It would have two
functions:

def register_reference(x, y):
    '''Register the fact that if object x changes, it means that
    object y changes too.'''

def changed(x):
    '''Notify all objects that change with x that they are changed.'''

When an object is changed, this module would call the __changed__ method
of all the objects that have a reference to it, and haven't changed
since the connection was created.

I hope I have clarified my idea. Tell me if you still think I'm wrong.

> Hope you sleep/slept well,
> - Josiah

Thanks! indeed, a good sleep is something very important.
Sleep well too (when the time comes, of course),
Noam

From aahz at pythoncraft.com  Mon Oct 31 20:40:31 2005
From: aahz at pythoncraft.com (Aahz)
Date: Mon, 31 Oct 2005 11:40:31 -0800
Subject: [Python-Dev] a different kind of reduce...
In-Reply-To: <8393fff0510311113p63bc194ak88580f84a25b1a1a@mail.gmail.com>
References: <8393fff0510311113p63bc194ak88580f84a25b1a1a@mail.gmail.com>
Message-ID: <20051031194031.GA4397@panix.com>

On Mon, Oct 31, 2005, Martin Blais wrote:
>
> There is a way to hack something like that with reduce, but it's not
> pretty--it involves creating a temporary list and a lambda function:
>
>    ... = reduce(lambda x, y: dirname(x), [p] + [None] * 3)
>
> Just wondering, does anybody know how to do this nicely? Is there an
> easy form that allows me to do this?

This should go on comp.lang.python. Thanks.
-- 
Aahz (aahz at pythoncraft.com)           <*>         http://www.pythoncraft.com/

"If you think it's expensive to hire a professional to do the job, wait
until you hire an amateur."  --Red Adair

From jcarlson at uci.edu  Mon Oct 31 20:28:20 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Mon, 31 Oct 2005 12:28:20 -0700
Subject: [Python-Dev] PEP 351, the freeze protocol
In-Reply-To: 
References: <20051030202958.39FD.JCARLSON@uci.edu>
Message-ID: <20051031122308.3A0F.JCARLSON@uci.edu>

Steve Holden wrote:
>
> Josiah Carlson wrote:
> [...]
> >>Perhaps I didn't make it clear. The difference between wxPython's Grid
> >>and my table is that in the table, most values are *computed*. This
> >>means that there's no point in changing the values themselves. They
> >>are also used frequently as set members (I can describe why, but it's
> >>a bit complicated.)
> >
> > Again, user semantics. Tell your users not to modify entries, and/or
> > you can make copies of objects you return. If your users are too daft
> > to read and/or follow directions, then they deserve to have their
> > software not work.
>
> That sounds like a "get out of jail free" card for Microsoft and many
> other software vendors ...

If/when vendors are COMPLETE in their specifications and documentation,
they can have that card, but being that even when they specify such
behaviors they are woefully incomplete, there is not a card to be found,
and I stand by my opinion.

- Josiah

From jcarlson at uci.edu  Mon Oct 31 21:05:16 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Mon, 31 Oct 2005 13:05:16 -0700
Subject: [Python-Dev] PEP 351, the freeze protocol
In-Reply-To: 
References: <20051030202958.39FD.JCARLSON@uci.edu>
Message-ID: <20051031120205.3A0C.JCARLSON@uci.edu>

Noam Raphael wrote:
> Hello,
>
> I have slept quite well, and talked about it with a few people, and I
> still think I'm right.

And I'm going to point out why you are wrong.

> About the users-changing-my-internal-data issue:
>
> > Again, user semantics. Tell your users not to modify entries, and/or
> > you can make copies of objects you return. If your users are too daft
> > to read and/or follow directions, then they deserve to have their
> > software not work.
> ...
> > When someone complains that something doesn't work, I tell them to read
> > the documentation. If your users haven't been told to RTFM often enough
> > to actually make it happen, then you need a RTFM-bat. Want to know how
> > you make one? You start wrapping the objects you return which segfaults
> > the process if they change things. When they start asking, tell them it
> > is documented quite clearly "do not modify objects returned, or else".
> > Then there's the other option, which I provide below.
>
> I disagree. I read the manual when I don't know what something does.
> If I can guess what it does (and this is where conventions are good),
> I don't read the manual. And let's say I ought to read the complete
> manual for every method that I use, and that I deserve a death
> punishment (or a segmentation fault) if I don't. But let's say that I
> wrote some software, without reading the manual, and it worked. I have
> gone to work on other things, and suddenly a bug arises. When the poor
> guy who needs to fix it goes over the code, everything looks
> absolutely correct. Should he also read the complete manuals of every
> library that I used, in order to fix that bug? And remember that in
> this case, the object could have traveled between several places
> (including functions in other libraries), before it was changed, and
> the original data structure starts behaving weirdly.

You can have a printout before it dies: "I'm crashing your program
because something attempted to modify a data structure (here's the
traceback), and you were told not to."

Then again, you can even raise an exception when people try to change
the object, as imdict does, as tuples do, etc.

> You suggest two ways for solving the problem. The first is by copying
> my mutable objects to immutable copies:

And by caching those results, then invalidating them when they are
updated by your application. This is the same as what you would like to
do, except that I do not rely on copy-on-write semantics, which aren't
any faster than freeze+cache by your application.

[snip graph API example]

> This will work. It's simply... well, not very beautiful. I have to
> invent a lot of names, and my users need to remember them all. If I
> give them a frozen set with all the vertices that a vertex points to
> (which is an absolutely reasonable API), they will know how to deal
> with it without learning a lot of method names, thanks to the fact
> that they are already familiar with sets, and that a lot of thought
> has gone into the set interface.

I never claimed it was beautiful, I claimed it would work. And it does.
There are 7 methods, which you can reduce if you play the special method
game:

RemEdge -> __delitem__((node, node))
RemNode -> __delitem__(node)  #forgot this method before
IterNodes -> __iter__()
IterOutgoing,IterIncoming -> IterAdjacent(node)

In any case, whether you choose to use freeze, or use a different API,
this particular problem is solvable without copy-on-write semantics.

> > > Now, about copy-on-write:
> ...
> > What you have written here is fairly unintelligible, but thankfully you
> > clarify yourself...pity it still doesn't work, I explain below.
>
> I'm sorry if I am sometimes hard to understand. English is not my
> mother tongue, and it degrades as the hour gets later - and sometimes,
> things are hard to explain. If I don't explain myself, please say so
> and I'll try again. This is an excellent example - I wrote about
> callbacks, and went to sleep. Let me try to explain again how it
> *does* work.

Thank you for the clarification (btw, your English is far better than
any of the foreign languages I've been "taught" over the years).

> This is not so. When a list creates its frozen copy, it gives itself
> to all those frozen() calls. This means that it will be notified if
> one of its members is changed. In that case, it has to do two simple
> actions: 1. release the reference to its frozen copy, so that
> subsequent freezes of the list would create a new frozen copy, and: 2.
> notify about the change any object which froze the list and requested
> notification.
>
> This frees us of any validation code. If we haven't been notified
> about a change, there was no change, and the frozen copy is valid.
>
> In case you ask, the cost of notification is O(1), amortized. The
> reason is that every frozen() call can cause at most one callback in
> the future.

Even without validation, there are examples that force a high number of
calls, which are not O(1), amortized or otherwise.

> > Like I said before, it's not so easy. In order to actually implement
> > the system, every object in an object hierarchy would need to know about
> > its parent, but such cannot work because...
> >
> > a = range(10)
> > b = [a]
> > c = [a]
> > bf = frozen(b)
> > cf = frozen(c)
> > a[0] = 10 #oops!
> >
> > That last line is the killer. Every object would necessarily need to
> > know about all containers which point to it, and would necessarily need
> > to notify them all that their contents had changed. I personally don't
> > see Python doing that any time soon.
>
> This is not the case. Every object has to know only about the objects
> which created frozen copies of it, and requested notifications.
> Actually, the object itself doesn't have to store anything. I thought
> about it, and you can create a module for handling those
> change-callbacks. It would store only weak references to objects. It
> would have two functions:
>
> def register_reference(x, y):
>     '''Register the fact that if object x changes, it means that
>     object y changes too.'''
>
> def changed(x):
>     '''Notify all objects that change with x that they are changed.'''
>
> When an object is changed, this module would call the __changed__
> method of all the objects that have a reference to it, and haven't
> changed since the connection was created.

Callbacks work, in that you can notify parents, but they still don't
allow O(1) amortization.

a = [[] for i in xrange(k)]
b = [list(a) for i in xrange(k)]
del a
c = freeze(b)
b[0][0].append(1)

That append call requires that b and every list in b be notified. That
takes O(k) time, because you have to notify the k lists which point to
b[0][0]. Let's freeze it again! Oh wait, that takes O(k**2) time for
that second freeze, because you have to recreate the tuple version of b,
as well as the tuple version of everything in b. Cycling through
modifications and freezing ends up taking time equivalent to simply
re-creating the entire frozen version from scratch every time you
freeze.
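[Josiah's k-by-k example is easier to follow with a concrete freeze() in
hand. The sketch below is a deliberately naive recursive freeze - no
caching and no __changed__ callbacks, which are exactly the contested
machinery - but it makes the cost visible: every one of the k*k slots
must be walked again after a single append:]

```python
def freeze(obj):
    # Naive recursive freeze: lists become tuples; everything else is
    # assumed to be already immutable.  (Illustration only - not the
    # cached, callback-based scheme being debated.)
    if isinstance(obj, list):
        return tuple(freeze(x) for x in obj)
    return obj

k = 50
a = [[] for i in range(k)]          # k distinct inner lists
b = [list(a) for i in range(k)]     # k outer lists, all sharing the inner ones
del a
c = freeze(b)                       # walks all k*k slots
b[0][0].append(1)                   # one mutation of a shared inner list
assert freeze(b) != c               # seeing it costs another full O(k**2) walk
```

[Because b[0][0] is shared by every outer list, that single append
changes all k rows at once - which is why any notification scheme must
fan out to O(k) containers here.]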
Now, the actual time analysis on repeated freezings and such gets ugly.
There are actually O(k) objects, which take up O(k**2) space. When you
modify object b[i][j] (which has just been frozen), you get O(k)
callbacks, and when you call freeze(b), it actually results in O(k**2)
time to re-copy the O(k**2) pointers to the O(k) objects. It should be
obvious that this IS NOT AMORTIZABLE to original object creation time.

> I hope I have clarified my idea. Tell me if you still think I'm wrong.

You have clarified it, but it is still wrong. I stand by 'it is not easy
to get right', and would further claim, "I doubt it is possible to make
it fast."

Good day,
- Josiah

From noamraph at gmail.com  Mon Oct 31 23:25:53 2005
From: noamraph at gmail.com (Noam Raphael)
Date: Tue, 1 Nov 2005 00:25:53 +0200
Subject: [Python-Dev] PEP 351, the freeze protocol
In-Reply-To: <20051031120205.3A0C.JCARLSON@uci.edu>
References: <20051030202958.39FD.JCARLSON@uci.edu>
	<20051031120205.3A0C.JCARLSON@uci.edu>
Message-ID: 

On 10/31/05, Josiah Carlson wrote:
...
> And I'm going to point out why you are wrong.

I still don't think so. I think that I need to reclarify what I mean.

> > About the users-changing-my-internal-data issue:
...
> You can have a printout before it dies: "I'm crashing your program
> because something attempted to modify a data structure (here's the
> traceback), and you were told not to."
>
> Then again, you can even raise an exception when people try to change
> the object, as imdict does, as tuples do, etc.

Both solutions would solve the problem, but would require me to wrap the
built-in set with something which doesn't allow changes. This is a lot
of work - but it's quite similar to what my solution would actually do,
in a single built-in function.

> > You suggest two ways for solving the problem. The first is by copying
> > my mutable objects to immutable copies:
>
> And by caching those results, then invalidating them when they are
> updated by your application. This is the same as what you would like to
> do, except that I do not rely on copy-on-write semantics, which aren't
> any faster than freeze+cache by your application.

This isn't correct - freezing a set won't require a single copy to be
performed, as long as the frozen copy isn't saved after the original is
changed. Copy+cache always requires one copy.

...
> I never claimed it was beautiful, I claimed it would work. And it does.
> There are 7 methods, which you can reduce if you play the special
> method game:
>
> RemEdge -> __delitem__((node, node))
> RemNode -> __delitem__(node)  #forgot this method before
> IterNodes -> __iter__()
> IterOutgoing,IterIncoming -> IterAdjacent(node)

I just wanted to say that this game is of course a lot of fun, but it
doesn't simplify the interface.

> In any case, whether you choose to use freeze, or use a different API,
> this particular problem is solvable without copy-on-write semantics.

Right. But I think that a significant simplification of the API is a
nice bonus for my solution. And about those copy-on-write semantics - it
still has to be shown how complex they really are. Remember that we are
talking about frozen-copy-on-write, which I think would simplify matters
considerably - for example, there are at most two instances sharing the
same data, since the frozen copy can be returned again and again.

> > > > Now, about copy-on-write:
> > ...
> Thank you for the clarification (btw, your English is far better than
> any of the foreign languages I've been "taught" over the years).

Thanks! It seems that if you are forced to use a language from time to
time it improves a little...

...
> Even without validation, there are examples that force a high number of
> calls, which are not O(1), amortized or otherwise.
> [Snip - a very interesting example]
>
> Now, the actual time analysis on repeated freezings and such gets ugly.
> There are actually O(k) objects, which take up O(k**2) space. When you
> modify object b[i][j] (which has just been frozen), you get O(k)
> callbacks, and when you call freeze(b), it actually results in O(k**2)
> time to re-copy the O(k**2) pointers to the O(k) objects. It should be
> obvious that this IS NOT AMORTIZABLE to original object creation time.

That's absolutely right. My amortized analysis is correct only if you
limit yourself to cases in which the original object doesn't change
after a frozen() call was made. In that case, it's ok to count the
O(k**2) copy with the O(k**2) object creation, because it's made only
once.

Why is it ok to analyze only that limited case? I am suggesting a change
in Python: that every object you would like would be mutable, and would
support the frozen() protocol. When you evaluate my suggestion, you need
to take a program, and measure its performance in the current Python and
in a Python which implements my suggestion. This means that the program
should work also on the current Python. In that case, my assumption is
true - you won't change objects after you have frozen them, simply
because these objects (strings which are used as dict keys, for example)
can't be changed at all in the current Python implementation!

I will write it in another way: I am proposing a change that will make
Python objects, including strings, mutable, and gives you other
advantages as well. I claim that it won't make existing Python programs
run slower in O() terms. It would allow you to do many things that you
can't do today; some of them would be fast, like editing a string, and
some of them would be less fast - for example, repeatedly changing an
object and freezing it.

I think that the performance penalty may be rather small - remember that
in programs which do not change strings, there would never be a need to
copy the string data at all.
And since I think that usually most of the dict lookups are for method
or function names, there would almost never be a need to construct a new
object on dict lookup, because you search for the same names again and
again, and a new object is created only on the first frozen() call. You
might even gain performance, because s += x would be faster.

...
> You have clarified it, but it is still wrong. I stand by 'it is not
> easy to get right', and would further claim, "I doubt it is possible to
> make it fast."

It would not be very easy to implement, of course, but I hope that it
won't be very hard either, since the basic idea is quite simple. Do you
still doubt the possibility of making it fast, given my (correct)
definition of fast? And if it's possible (which I think it is), it would
allow us to get rid of inconvenient immutable objects, and it would let
us put everything into a set. Isn't that nice?

> Good day,
> - Josiah

The same to you,
Noam

From p.f.moore at gmail.com  Mon Oct 31 23:29:30 2005
From: p.f.moore at gmail.com (Paul Moore)
Date: Mon, 31 Oct 2005 22:29:30 +0000
Subject: [Python-Dev] a different kind of reduce...
In-Reply-To: <8393fff0510311113p63bc194ak88580f84a25b1a1a@mail.gmail.com>
References: <8393fff0510311113p63bc194ak88580f84a25b1a1a@mail.gmail.com>
Message-ID: <79990c6b0510311429u2ac8f1dcw193f3dd2fd25f3b1@mail.gmail.com>

On 10/31/05, Martin Blais wrote:
> I'm always--literally every time--looking for a more functional form,
> something that would be like this:
>
>    # apply dirname() 3 times on its results, initializing with p
>    ... = repapply(dirname, 3, p)
[...]
> Just wondering, does anybody know how to do this nicely? Is there an
> easy form that allows me to do this?

FWIW, something like this works:

>>> def fpow(f, n):
...     def res(*args, **kw):
...         nn = n
...         while nn > 0:
...             args = [f(*args, **kw)]
...             kw = {}
...             nn -= 1
...         return args[0]
...     return res
...
>>> fn = r'a\b\c\d\e\f\g'
>>> d3 = fpow(os.path.dirname, 3)
>>> d3(fn)
'a\\b\\c\\d'

You can vary this a bit - the handling of keyword arguments is an
obvious place where I've picked a very arbitrary approach - but you get
the idea.

This *may* be a candidate for addition to the new "functional" module,
but I'd be surprised if it got added without proving itself "in the
wild" first. More likely, it should go in a local "utilities" module.

Paul.

From noamraph at gmail.com  Mon Oct 31 23:55:34 2005
From: noamraph at gmail.com (Noam Raphael)
Date: Tue, 1 Nov 2005 00:55:34 +0200
Subject: [Python-Dev] PEP 351, the freeze protocol
In-Reply-To: 
References: <20051030202958.39FD.JCARLSON@uci.edu>
	<20051031120205.3A0C.JCARLSON@uci.edu>
Message-ID: 

I thought about something -

> I think that the performance penalty may be rather small - remember
> that in programs which do not change strings, there would never be a
> need to copy the string data at all. And since I think that usually
> most of the dict lookups are for method or function names, there would
> almost never be a need to construct a new object on dict lookup,
> because you search for the same names again and again, and a new
> object is created only on the first frozen() call. You might even gain
> performance, because s += x would be faster.

Name lookups can take virtually the same time they take now - method
names can be saved from the beginning as frozen strings, so finding them
in a dict will take just another bit test - is the object frozen -
before doing exactly what is done now. Remember, the strings we are
familiar with are simply frozen strings...
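[Noam's closing point - that method-name lookup keys are "frozen" once
and then reused - mirrors something CPython already does for ordinary
strings: identifier-like string literals are interned, giving one
canonical immutable object per name. A small illustration, using the
modern sys.intern spelling (in the 2.x of this thread it was the
builtin intern()); the string values here are invented for the sketch:]

```python
import sys

a = 'frozen_method_name'                 # literal: interned by CPython
b = ''.join(['frozen_', 'method_name'])  # built at run time: a distinct object

assert a == b and a is not b             # equal values, different objects
assert sys.intern(b) is sys.intern(a)    # interning yields one canonical object

d = {a: 42}
assert d[b] == 42                        # dict lookup needs only equality
```

[The cheap "bit test" Noam imagines plays the same role: the common
lookup path stays fast because the canonical key objects never change.]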