From tim.one@comcast.net Thu May 1 03:13:46 2003 From: tim.one@comcast.net (Tim Peters) Date: Wed, 30 Apr 2003 22:13:46 -0400 Subject: [Python-Dev] Dictionary tuning In-Reply-To: <001101c30f2a$216954a0$b1b3958d@oemcomputer> Message-ID: [Raymond Hettinger] > ... > I worked on similar approaches last month and found them wanting. > The concept was that a 64-byte cache line held 5.3 dict entries and > that probing those was much less expensive than making a random > probe into memory outside of the cache. > > The first thing I learned was that the random probes were necessary > to reduce collisions. Checking the adjacent space is like a single > step of linear chaining; it increases the number of collisions. Yes, I believe that any regularity will. > That would be fine if the cost were offset by decreased memory > access time; however, for small dicts, the whole dict is already > in cache and having more collisions degrades performance > with no compensating gain. > > The next bright idea was to have a separate lookup function for > small dicts and for larger dictionaries. I set the large dict lookup > to search adjacent entries. The good news is that an artificial > test of big dicts showed a substantial improvement (around 25%). > The bad news is that real programs were worse off than before. You should qualify that to "some real programs", or perhaps "all real programs I've tried". On the other side, I have real programs that access large dicts in random order, so if you tossed those into your mix, a 25% gain on those would more than wipe out the 1-2% losses you saw elsewhere. > A day of investigation showed the cause. The artificial test > accessed keys randomly and showed the anticipated benefit. However, > real programs access some keys more frequently than others > (I believe Zipf's law applies). Some real programs do, and, for all I know, most real programs. It's not the case that all real programs do.
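[Editorial aside: the "random probe" sequence being discussed can be modeled in pure Python. This is a simplified sketch of the perturb loop in dictobject.c -- the function name and table size here are illustrative, and the real code also handles free/dummy slots and key comparison:]

```python
def probe_slots(h, mask, nprobes=8):
    """Yield the first nprobes table slots Python's dict visits for
    hash value h -- a simplified model of the loop in dictobject.c."""
    i = h & mask
    perturb = h
    for _ in range(nprobes):
        yield i & mask
        i = (i << 2) + i + perturb + 1   # i.e. i = i*5 + perturb + 1
        perturb >>= 5                    # PERTURB_SHIFT

# In an 8-slot table (mask 7), hash 10 probes slots 2, 5, 2, 3, 0, 1, 6, 7 --
# a scattered walk rather than a linear scan of adjacent slots.
print(list(probe_slots(10, 7)))
```

Because the high bits of the hash feed in through perturb, keys that land in the same first slot soon take different paths, which is the regularity-avoidance Tim alludes to.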
The counterexamples that sprang instantly to my mind are those using dicts to accumulate stats for testing random number generators. Those have predictable access patterns only when the RNG they're testing sucks. > Those keys *and* their collision chains are likely already in the cache. > So, big dicts had the same limitation as small dicts: You always lose > when you accept more collisions in return for exploiting cache locality. Apart from that "always" ain't always so, I really like that as a summary! > The conclusion was clear: the best way to gain performance > was to have fewer collisions in the first place. Hence, I > resumed experiments on sparsification. How many collisions are you seeing? For int-keyed dicts, all experiments I ran said Python's dicts collided less than a perfectly random hash table would collide (the reason for that is explained in dictobject.c: int-keyed dicts tend to use a contiguous range of ints as keys). For string-keyed dicts, extensive experiments said collision behavior was indistinguishable from a perfectly random hash table. I never cared enough about other kinds of keys to time 'em, at least not since systematic flaws were fixed in the tuple and float hash functions (e.g., the tuple hash function used to xor the tuple's elements' hash codes, so that all permutations of a given tuple had the same hash code; that's necessary for unordered sets, but tuples are ordered).

>> If someone wants to experiment with that in lookdict_string(), stick a new
>>
>>     ++i;
>>
>> before the for loop, and move the existing
>>
>>     i = (i << 2) + i + perturb + 1;
>>
>> to the bottom of that loop. Likewise for lookdict().

> PyStone gains 1%.
> PyBench loses 1%.
> timecell gains 2% (spreadsheet benchmark)
> timemat loses 2% (pure Python matrix package benchmark)
> timepuzzle loses 1% (class-based graph traverser)

You'll forgive me if I'm skeptical: they're such small differences that, if I saw them, I'd consider them to be a wash -- in the noise. What kind of platform are you running on that has timing behavior repeatable enough to believe 1-2% blips? > P.S. There is one other way to improve cache behavior > but it involves touching code throughout dictobject.c. Heh -- that wouldn't even be considered a minor nuisance to the truly obsessed. > Move the entry values into a separate array from the > key/hash pairs. That way, you get 8 entries per cache line. What use case would that help? You said before that small dicts are all in cache anyway, so it wouldn't help those. The jumps in large dicts are so extreme that it doesn't seem to matter if the cache line size du jour holds 1 slot or 100. To the contrary, at the end of the large dict lookup, it sounds like it would incur an additional cache miss to fetch the value after the key was found (since that value would no longer ever ride along with the key and hashcode). I can think of a different reason for considering this: sets have no use for the value slot, and wouldn't need to allocate space for 'em. > P.P.S. One other idea is to use a different search pattern > for small dictionaries. Store entries in a self-organizing list > with no holes. Dummy fields aren't needed, which saves > a test in the linear search loop. When an entry is found, > move it one closer to the head of the list so that the most > common entries get found instantly. I don't see how more than just one can be found instantly; if "instantly" here means "in no more than a few tries", that's usually true of dicts too -- but is still an odd meaning for "instantly". > Since there are no holes, all eight cells can be used instead of the > current maximum of five.
> Like the current arrangement, the whole small dict fits into just two cache lines.

Neil Schemenauer suggested that a few years ago, but I don't recall it going anywhere. I don't know how to predict for which apps it would be faster. If people are keen to optimize very small search tables, think about schemes that avoid the expense of hashing too.

From tim.one@comcast.net Thu May 1 03:36:22 2003 From: tim.one@comcast.net (Tim Peters) Date: Wed, 30 Apr 2003 22:36:22 -0400 Subject: [Python-Dev] Dictionary tuning In-Reply-To: <002301c30f55$245394c0$125ffea9@oemcomputer> Message-ID: [Raymond Hettinger] > ... > I'm going to write up an informational PEP to summarize the > results of research to date. I'd suggest instead a text file checked into the Objects directory, akin to the existing listsort.txt -- it's only of interest to the small fraction of hardcore developers with an optimizing bent. > After the first draft, I'm sure the other experimenters will each have > lessons to share. In addition, I'll attach a benchmarking suite and > dictionary simulator (fully instrumented). That way, future generations > can reproduce the results and pick up where we left off. They probably won't, though. The kind of people attracted to this kind of micro-level fiddling revel in recreating this kind of stuff themselves. For example, you didn't look hard enough to find the sequence of dict simulators Christian posted last time he got obsessed with this. On the chance that they might, a plain text file-- or a Wiki page! --is easier to update than a PEP over time. The benchmarking suite should also be checked in, and should be very welcome. Perhaps it's time for a "benchmark" subdirectory under Lib/test? It doesn't make much sense even now that pystone and sortperf live directly in the test directory. > I've decided that this new process should have a name, > something pithy, yet magical sounding, so it shall be > dubbed SCIENCE. LOL!
But I'm afraid it's not real science unless you first write grant proposals, and pay a Big Name to agree to be named as Principal Investigator. I'll write Uncle Don a letter on your behalf.

From tim_one@email.msn.com Thu May 1 04:50:32 2003 From: tim_one@email.msn.com (Tim Peters) Date: Wed, 30 Apr 2003 23:50:32 -0400 Subject: [Python-Dev] New thread death in test_bsddb3 In-Reply-To: Message-ID: [Thomas Heller] > ... > So is the policy now that it is no longer *allowed* to create another > thread state, while in previous versions there wasn't any choice, > because there existed no way to get the existing one? You can still create all the thread states you like; the new check is in PyThreadState_Swap(), not in PyThreadState_New(). There was always a choice, but previously Python provided no *help* in keeping track of whether a thread already had a thread state associated with it. That didn't stop careful apps from providing their own mechanisms to do so. About policy, yes, it appears to be so now, else Mark wouldn't be raising a fatal error. I view it as having always been the policy (from a good-faith reading of the previous code), just a policy that was too expensive for Python to enforce. There are many policies like that, such as not passing goofy arguments to macros, and not letting references leak. Python doesn't currently enforce them because it's currently too expensive to enforce them. Over time that can change. > IMO a fatal error is very harsh, especially as there's no problem > continuing execution - exactly what happens in a release build. There may or may not be a problem with continued execution -- if you've associated more than one living thread state with a thread, your app may very well be fatally confused in a way that's very difficult to diagnose without this error. Clearly, I like having fatal errors for dubious things in debug builds. Debug builds are supposed to help you debug.
If the fatal error here drives you insane, and you don't want to repair the app code, you're welcome to change #if defined(Py_DEBUG) to #if 0 in your debug build. > Not that I am misunderstood: I very much appreciate the work Mark has > done, and look forward to using it to its fullest extent. In what way is this error a genuine burden to you? The only time I've seen it trigger is in the Berkeley database wrapper, where it pointed out a fine opportunity to simplify some obscure hand-rolled callback tomfoolery -- and pointed out that the thread in question did in fact already have a thread state. Whether that was correct in all cases is something I don't know -- and don't have to worry about anymore, since the new code reuses the thread state the thread already had. The lack of errors in a debug run assures me that's in good shape now.

From drifty@alum.berkeley.edu Thu May 1 05:38:39 2003 From: drifty@alum.berkeley.edu (Brett Cannon) Date: Wed, 30 Apr 2003 21:38:39 -0700 (PDT) Subject: [Python-Dev] Dictionary tuning In-Reply-To: References: Message-ID: [Tim Peters] > The benchmarking suite should also be checked in, and should be very > welcome. Perhaps it's time for a "benchmark" subdirectory under Lib/test? > It doesn't make much sense even now that pystone and sortperf live directly > in the test directory. > Works for me. Can we perhaps decide whether we want to do this in the near future? I am going to be writing up module docs for the test package and if we are going to end up moving them I would like to get this written into the docs the first time through. -Brett

From martin@v.loewis.de Thu May 1 05:10:24 2003 From: martin@v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=) Date: 01 May 2003 06:10:24 +0200 Subject: [Python-Dev] Initialization hook for extenders In-Reply-To: <3EB04B03.887CDF7B@llnl.gov> References: <3EB04B03.887CDF7B@llnl.gov> Message-ID: "Patrick J.
Miller" writes: > I actually want this to do some MPI initialization to set up a > single user prompt with broadcast which has to run after > Py_Initialize() but before the import of readline. -1. It is easy enough to copy the code of Py_Main, and customize it for special requirements. The next user may want to have a hook to put additional command line options into Py_Main; YAGNI. Regards, Martin

From tim_one@email.msn.com Thu May 1 07:13:51 2003 From: tim_one@email.msn.com (Tim Peters) Date: Thu, 1 May 2003 02:13:51 -0400 Subject: [Python-Dev] Re: heaps In-Reply-To: Message-ID: [Raymond Hettinger] >> I'm quite pleased with the version already in CVS. It is a small >> masterpiece of exposition, sophistication, simplicity, and speed. >> A class based interface is not necessary for every algorithm. [David Eppstein] > It has some elegance, but omits basic operations that are necessary for > many heap-based algorithms and are not provided by this interface. I think Raymond was telling you it isn't intended to be "an interface", rather it's quite up-front about being a collection of functions that operate directly on a Python list, implementing a heap in a very straightforward way, and deliberately not trying to hide any of that. IOW, it's a concrete data type, not an abstract one. I asked, and it doesn't feel like apologizing for being what it is. That's not to say Python couldn't benefit from providing an abstract heap API too, and umpteen different implementations specialized to different kinds of heap applications. It is saying that heapq isn't trying to be that, so pointing out that it isn't falls kinda flat.
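[Editorial aside: the "concrete data type" point is easy to see in the kind of job heapq does well -- an N-best list kept as a bounded min-heap. A sketch; the helper name n_best is an invention for illustration, not part of the module:]

```python
import heapq

def n_best(iterable, n):
    """Return the n largest items, largest first, using a min-heap of
    size n: the worst of the current winners always sits at heap[0]."""
    heap = []
    for item in iterable:
        if len(heap) < n:
            heapq.heappush(heap, item)
        elif item > heap[0]:
            heapq.heapreplace(heap, item)  # pop the loser, push the new item
    return sorted(heap, reverse=True)

print(n_best([5, 1, 9, 3, 7, 8], 3))  # [9, 8, 7]
```

The heap stays a plain Python list throughout, which is exactly the concreteness being defended here; later Python versions grew heapq.nlargest() for this same idiom.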
> Specifically, the three algorithms that use heaps in my upper-division > undergraduate algorithms classes are heapsort (for which heapq works > fine, but you would generally want to use L.sort() instead), Dijkstra's > algorithm (and its relatives such as A* and Prim), which needs the > ability to decrease keys, and event-queue-based plane sweep algorithms > (e.g. for finding all crossing pairs in a set of line segments) which > need the ability to delete items from other than the top. Then some of those will want a different implementation of a heap. The algorithms in heapq are still suitable for many heap applications, such as maintaining an N-best list (like retaining only the 10 best-scoring items in a long sequence), and A* on a search tree (when there's only one path to a node, decrease-key isn't needed; A* on a graph is harder). > To see how important the lack of these operations is, I decided to > compare two implementations of Dijkstra's algorithm. I don't think anyone claimed-- or would claim --that a heapq is suitable for all heap purposes. > The priority-dict implementation from > http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/119466 takes as > input a graph, coded as nested dicts {vertex: {neighbor: edge length}}. > This is a variation of a graph coding suggested in one of Guido's essays > that, as Raymond suggests, avoids using a separate class based interface.
>
> Here's a simplification of my dictionary-based Dijkstra implementation:
>
> def Dijkstra(G,start,end=None):
>     D = {}                      # dictionary of final distances
>     P = {}                      # dictionary of predecessors
>     Q = priorityDictionary()    # est.dist. of non-final vert.
>     Q[start] = 0
>     for v in Q:
>         D[v] = Q[v]
>         for w in G[v]:
>             vwLength = D[v] + G[v][w]
>             if w not in D and (w not in Q or vwLength < Q[w]):
>                 Q[w] = vwLength
>                 P[w] = v
>     return (D,P)
>
> Here's a translation of the same implementation to heapq (untested since I'm not running 2.3). Since there is no decrease in heapq, nor any way to find and remove old keys,

A heapq *is* a list, so you could loop over the list to find an old object. I wouldn't recommend that in general, but it's easy, and if the need is rare then the advertised fact that a heapq is a plain list can be very convenient. Deleting an object from "the interior" still isn't supported directly, of course. It's possible to do so efficiently with this implementation of a heap, but since it doesn't support an efficient way to find an old object to begin with, there seemed little point to providing an efficient delete-by-index function. Here's one such:

    import heapq

    def delete_obj_at_index(heap, pos):
        lastelt = heap.pop()
        if pos >= len(heap):
            return
        # The rest is a lightly fiddled variant of heapq._siftup.
        endpos = len(heap)
        # Bubble up the smaller child until hitting a leaf.
        childpos = 2*pos + 1    # leftmost child position
        while childpos < endpos:
            # Set childpos to index of smaller child.
            rightpos = childpos + 1
            if rightpos < endpos and heap[rightpos] <= heap[childpos]:
                childpos = rightpos
            # Move the smaller child up.
            heap[pos] = heap[childpos]
            pos = childpos
            childpos = 2*pos + 1
        # The leaf at pos is empty now.  Put lastelt there, and bubble
        # it up to its final resting place (by sifting its parents down).
        heap[pos] = lastelt
        heapq._siftdown(heap, 0, pos)

> I changed the algorithm to add new tuples for each new key, leaving the old tuples in place until they bubble up to the top of the heap.
>
> def Dijkstra(G,start,end=None):
>     D = {}                    # dictionary of final distances
>     P = {}                    # dictionary of predecessors
>     Q = [(0,None,start)]      # heap of (est.dist., pred., vert.)
>     while Q:
>         dist,pred,v = heappop(Q)
>         if v in D:
>             continue          # tuple outdated by decrease-key, ignore
>         D[v] = dist
>         P[v] = pred
>         for w in G[v]:
>             heappush(Q, (D[v] + G[v][w], v, w))
>     return (D,P)
>
> My analysis of the differences between the two implementations: > > - The heapq version is slightly complicated (the two lines > if...continue) by the need to explicitly ignore tuples with outdated > priorities. This need for inserting low-level data structure > maintenance code into higher-level algorithms is intrinsic to using > heapq, since its data is not structured in a way that can support > efficient decrease key operations. It surprised me that you tried using heapq at all for this algorithm. I was also surprised that you succeeded <0.9 wink>. > - Since the heap version had no way to determine when a new key was > smaller than an old one, the heapq implementation needed two separate > data structures to maintain predecessors (middle elements of tuples for > items in queue, dictionary P for items already removed from queue). In > the dictionary implementation, both types of items stored their > predecessors in P, so there was no need to transfer this information > from one structure to another. > > - The dictionary version is slightly complicated by the need to look up > old heap keys and compare them with the new ones instead of just > blasting new tuples onto the heap. So despite the more-flexible heap > structure of the dictionary implementation, the overall code complexity > of both implementations ends up being about the same. > > - Heapq forced me to build tuples of keys and items, while the > dictionary based heap did not have the same object-creation overhead > (unless it's hidden inside the creation of dictionary entries). Rest easy, it's not. > On the other hand, since I was already building tuples, it was > convenient to also store predecessors in them instead of in some > other structure. > > - The heapq version uses significantly more storage than the dictionary: > proportional to the number of edges instead of the number of vertices. > > - The changes I made to Dijkstra's algorithm in order to use heapq might > not have been obvious to a non-expert; more generally I think this lack > of flexibility would make it more difficult to use heapq for > cookbook-type implementation of textbook algorithms. Depends on the specific algorithms in question, of course. No single heap implementation is the best choice for all algorithms, and heapq would be misleading people if, e.g., it did offer a decrease_key function -- it doesn't support an efficient way to do that, and it doesn't pretend to. > - In Dijkstra's algorithm, it was easy to identify and ignore outdated > heap entries, sidestepping the inability to decrease keys. I'm not > convinced that this would be as easy in other applications of heaps. All that is explaining why this specific implementation of a heap isn't suited to the task at hand. I don't believe that was at issue, though. An implementation of a heap that is suited for this task may well be less suited for other tasks. > - One of the reasons to separate data structures from the algorithms > that use them is that the data structures can be replaced by ones with > equivalent behavior, without changing any of the algorithm code. The > heapq Dijkstra implementation is forced to include code based on the > internal details of heapq (specifically, the line initializing the heap > to be a one element list), making it less flexible for some uses. > The usual reason one might want to replace a data structure is for > efficiency, but there are others: for instance, I teach various > algorithms classes and might want to use an implementation of Dijkstra's > algorithm as a testbed for learning about different priority queue data > structures. I could do that with the dictionary-based implementation > (since it shows nothing of the heap details) but not the heapq one. You can wrap any interface you like around heapq (that's very easy to do in Python), but it won't change that heapq's implementation is poorly suited to this application. priorityDictionary looks like an especially nice API for this specific algorithm, but, e.g., impossible to use directly for maintaining an N-best queue (priorityDictionary doesn't support multiple values with the same priority, right? if we're trying to find the 10,000 poorest people in America, counting only one as dead broke would be too Republican for some people's tastes). OTOH, heapq is easy and efficient for *that* class of heap application. > Overall, while heapq was usable for implementing Dijkstra, I think it > has significant shortcomings that could be avoided by a more > well-thought-out interface that provided a little more functionality and > a little clearer separation between interface and implementation. heapq isn't trying to separate them at all -- quite the contrary! It's much like the bisect module that way. They find very good uses in practice. I should note that I objected to heapq at the start, because there are several important heap implementation techniques, and just one doesn't fit everyone all the time. My objection went away when Guido pointed out how much like bisect it is: since it doesn't pretend one whit to generality or opaqueness, it can't be taken as promising more than it actually does, nor can it interfere with someone (so inclined) defining a general heap API: it's not even a class, just a handful of functions. Useful, too, just as it is. A general heap API would be nice, but it wouldn't have much (possibly nothing) to do with heapq.
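[Editorial aside: the heapq translation quoted in this exchange was posted untested; for readers who want to run the lazy-deletion pattern, here is a self-contained version. The nested-dict graph encoding follows the thread; the function name and the small example graph are illustrative:]

```python
import heapq

def dijkstra(G, start):
    """Dijkstra with lazy deletion: instead of decreasing keys, push a
    fresh tuple and skip any popped entry whose vertex is already final.

    G is a nested dict {vertex: {neighbor: edge_length}}.
    """
    D = {}                    # final distances
    P = {}                    # predecessors
    Q = [(0, None, start)]    # heap of (est. dist., pred., vertex)
    while Q:
        dist, pred, v = heapq.heappop(Q)
        if v in D:
            continue          # outdated tuple, ignore
        D[v] = dist
        P[v] = pred
        for w, length in G[v].items():
            heapq.heappush(Q, (dist + length, v, w))
    return D, P

G = {'a': {'b': 1, 'c': 4}, 'b': {'c': 2}, 'c': {}}
print(dijkstra(G, 'a'))  # ({'a': 0, 'b': 1, 'c': 3}, {'a': None, 'b': 'a', 'c': 'b'})
```

As the analysis above notes, the heap can hold one tuple per edge rather than one per vertex, which is the storage cost of skipping decrease-key.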
From eppstein@ics.uci.edu Thu May 1 07:36:17 2003 From: eppstein@ics.uci.edu (David Eppstein) Date: Wed, 30 Apr 2003 23:36:17 -0700 Subject: [Python-Dev] Re: heaps In-Reply-To: References: Message-ID: <5841710.1051745776@[10.0.1.2]> On 5/1/03 2:13 AM -0400 Tim Peters wrote: > It surprised me that you tried using heapq at all for this algorithm. I > was also surprised that you succeeded <0.9 wink>. Wink noted, but it surprised me too, a little. I had thought decrease key was a necessary part of the algorithm, not something that could be finessed like that. > You can wrap any interface you like around heapq (that's very easy to do > in Python), but it won't change that heapq's implementation is poorly > suited to this application. priorityDictionary looks like an especially > nice API for this specific algorithm, but, e.g., impossible to use > directly for maintaining an N-best queue (priorityDictionary doesn't > support multiple values with the same priority, right? if we're trying > to find the 10,000 poorest people in America, counting only one as dead > broke would be too Republican for some peoples' tastes ). OTOH, > heapq is easy and efficient for *that* class of heap application. I agree with your main points (heapq's inability to handle certain priority queue applications doesn't mean it's useless, and its implementation-specific API helps avoid fooling programmers into thinking it's any more than what it is). But I am confused at this example. Surely it's just as easy to store (income,identity) tuples in either data structure. If you mean, you want to find the 10k smallest income values (rather than the people having those incomes), then it may be that a better data structure would be a simple list L in which the value of L[i] is the count of people with income i. -- David Eppstein http://www.ics.uci.edu/~eppstein/ Univ. 
of California, Irvine, School of Information & Computer Science

From guido@python.org Thu May 1 14:28:24 2003 From: guido@python.org (Guido van Rossum) Date: Thu, 01 May 2003 09:28:24 -0400 Subject: [Python-Dev] Dictionary tuning In-Reply-To: Your message of "Wed, 30 Apr 2003 21:38:39 PDT." References: Message-ID: <200305011328.h41DSPq05585@odiug.zope.com> [Tim] > > The benchmarking suite should also be checked in, and should be > > very welcome. Perhaps it's time for a "benchmark" subdirectory > > under Lib/test? It doesn't make much sense even now that pystone > > and sortperf live directly in the test directory. > Works for me. Can we perhaps decide whether we want to do this in the > near future? I am going to be writing up module docs for the test package > and if we are going to end up moving them I would like to get this > written into the docs the first time through. > > -Brett Should the benchmarks directory be part of the distribution, or should it be in the nondist part of the CVS tree? --Guido van Rossum (home page: http://www.python.org/~guido/)

From mwh@python.net Thu May 1 15:07:37 2003 From: mwh@python.net (Michael Hudson) Date: Thu, 01 May 2003 15:07:37 +0100 Subject: [Python-Dev] Dictionary tuning In-Reply-To: <200305011328.h41DSPq05585@odiug.zope.com> (Guido van Rossum's message of "Thu, 01 May 2003 09:28:24 -0400") References: <200305011328.h41DSPq05585@odiug.zope.com> Message-ID: <2m65ouahg6.fsf@starship.python.net> Guido van Rossum writes: > [Tim] >> > The benchmarking suite should also be checked in, and should be >> > very welcome. Perhaps it's time for a "benchmark" subdirectory >> > under Lib/test? It doesn't make much sense even now that pystone >> > and sortperf live directly in the test directory. > >> Works for me. Can we perhaps decide whether we want to do this in the >> near future?
I am going to be writing up module docs for the test package >> and if we are going to end up moving them I would like to get this >> written into the docs the first time through. >> >> -Brett > > Should the benchmarks directory be part of the distribution, or should > it be in the nondist part of the CVS tree? I can't think why you'd want it in nondist, unless they depend on huge input files or something. Cheers, M. -- Those who have deviant punctuation desires should take care of their own perverted needs. -- Erik Naggum, comp.lang.lisp

From guido@python.org Thu May 1 15:18:50 2003 From: guido@python.org (Guido van Rossum) Date: Thu, 01 May 2003 10:18:50 -0400 Subject: [Python-Dev] Dictionary tuning In-Reply-To: Your message of "Thu, 01 May 2003 15:07:37 BST." <2m65ouahg6.fsf@starship.python.net> References: <200305011328.h41DSPq05585@odiug.zope.com> <2m65ouahg6.fsf@starship.python.net> Message-ID: <200305011418.h41EIoF07682@odiug.zope.com> > > Should the benchmarks directory be part of the distribution, or should > > it be in the nondist part of the CVS tree? > > I can't think why you'd want it in nondist, unless they depend on huge > input files or something. OK. --Guido van Rossum (home page: http://www.python.org/~guido/)

From aahz@pythoncraft.com Thu May 1 15:20:35 2003 From: aahz@pythoncraft.com (Aahz) Date: Thu, 1 May 2003 10:20:35 -0400 Subject: [Python-Dev] Dictionary tuning In-Reply-To: <200305011328.h41DSPq05585@odiug.zope.com> References: <200305011328.h41DSPq05585@odiug.zope.com> Message-ID: <20030501142034.GA28364@panix.com> On Thu, May 01, 2003, Guido van Rossum wrote: > > Should the benchmarks directory be part of the distribution, or should > it be in the nondist part of the CVS tree? Given the constant number of arguments in c.l.py about speed, I'd keep it in the distribution unless/until it gets large.
-- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "In many ways, it's a dull language, borrowing solid old concepts from many other languages & styles: boring syntax, unsurprising semantics, few automatic coercions, etc etc. But that's one of the things I like about it." --Tim Peters on Python, 16 Sep 93

From tjreedy@udel.edu Thu May 1 16:27:34 2003 From: tjreedy@udel.edu (Terry Reedy) Date: Thu, 1 May 2003 11:27:34 -0400 Subject: [Python-Dev] Re: Dictionary tuning References: <200305011328.h41DSPq05585@odiug.zope.com> <2m65ouahg6.fsf@starship.python.net> Message-ID: From my curious user viewpoint ... "Michael Hudson" wrote in message news:2m65ouahg6.fsf@starship.python.net... > Guido van Rossum writes: > > > [Tim] > >> > The benchmarking suite should also be checked in, and should be > >> > very welcome. Perhaps it's time for a "benchmark" subdirectory > >> > under Lib/test? It doesn't make much sense even now that pystone > >> > and sortperf live directly in the test directory. + 1 on a separate subdirectory (there are two others already) to make these easier to find (or ignore). > >> Works for me. Can we perhaps decide whether we want to do this in the > >> near future? I am going to be writing up module docs for the test package > >> and if we are going to end up moving them I would like to get this > >> written into the docs the first time through. + 1 on doing so by 2.3 final if not before > > Should the benchmarks directory be part of the distribution, or should > > it be in the nondist part of the CVS tree? > > I can't think why you'd want it in nondist, unless they depend on huge > input files or something. + 1 on keeping these with the standard distribution. Sortperf.py is a great example of random + systematic corner case testing extended to something more complicated than binary ops. Besides that, I expect to actually use it, with minor mods, sometime later this year. I am more than happy to give it the 4K bytes it uses. Terry J.
Reedy

From patmiller@llnl.gov Thu May 1 16:23:07 2003 From: patmiller@llnl.gov (Patrick J. Miller) Date: Thu, 01 May 2003 08:23:07 -0700 Subject: [Python-Dev] Initialization hook for extenders References: <3EB04B03.887CDF7B@llnl.gov> Message-ID: <3EB13BDB.808E749@llnl.gov> "Martin v. Löwis" wrote: > -1. It is easy enough to copy the code of Py_Main, and customize it > for special requirements. The next user may want to have a hook to put > additional command line options into Py_Main; YAGNI. It's not easy. Not if you simply want to link against an installed Python. Nor so if you want to build against 2.1, 2.2, and 2.3 ... libraries. There are subtle changes that bite you in the ass if you don't physically copy the right source forward. We did copy forward main.c, but found that every time we updated Python, we had to "rehack" main to make sure we had all the options and flags and initialization straight. I think the hook is extremely cheap, very short, looks almost exactly like Py_AtExit() and solves the problem directly. Pat -- Patrick Miller | (925) 423-0309 | http://www.llnl.gov/CASC/people/pmiller You can discover more about a person in an hour of play than in a year of discussion. -- Plato, philosopher (427-347 BCE)

From glyph@twistedmatrix.com Thu May 1 16:59:41 2003 From: glyph@twistedmatrix.com (Glyph Lefkowitz) Date: Thu, 1 May 2003 10:59:41 -0500 Subject: [Python-Dev] Re: Python-Dev digest, Vol 1 #3221 - 4 msgs In-Reply-To: <200304282102.h3SL2rW18842@odiug.zope.com> Message-ID: -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Monday, April 28, 2003, at 04:02 PM, Guido van Rossum wrote: >> Why is the Python development team introducing bugs into Python and >> then expecting the user community to fix things that used to work? > > I resent your rhetoric, Glyph.
Had you read the rest of this thread, > you would have seen that the performance regression only happens for > sending data at maximum speed over the loopback device, and is > negligible when receiving e.g. data over a LAN. You would also have > seen that I have already suggested two different simple fixes. I apologize. I did not seriously mean this as an indictment of the entire Python development team or process. I would have responded to this effect sooner, but I've been swamped with work. >> I could understand not wanting to put a lot of effort into >> correcting obscure or difficult-to-find performance problems that >> only a few people care about, but the obvious thing to do in this >> case is simply to change the default behavior. > > It can and will be fixed. I just don't have the time to fix it > myself. I noticed your comment about the checkin. Thanks to the dev team for fixing it so promptly. >> I think this should be in the release notes for 2.3. "Python is 10% >> faster, unless you use sockets, in which case it is much, much slower. >> Do the following in order to regain lost performance and retain the >> same semantics:" > > That is total bullshit, Glyph, and you know it. Please pardon the exaggeration. I forget that sarcasm does not come across as well on e-mail as it does on IRC. I appreciate that the performance drop wasn't really that serious. On a more positive note, looking at performance numbers got us thinking about increasing performance in Twisted. Anthony Baxter has been very helpful with profiling information, Itamar's already written some benchmarking tests, and I finished up a logging infrastructure that is more amenable to metrics gathering last night. (It's also less completely awful than the one we had before and should hook up to the new logging.py gracefully.)
We already have an always-on multi-platform regression test suite for Twisted (not the snake farm): http://www.twistedmatrix.com/users/warner.twistd/ If we get this reporting some performance numbers as well, it would be pretty easy to turn it into a regression/performance test for Python by tweaking a few variables -- probably, just 'cvs update; make' in the Python directory instead of the Twisted one. Is there interest in seeing these kinds of numbers generated regularly? What kind of numbers would be interesting on the Python side? -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.1 (Darwin) iD8DBQE+sUSIvVGR4uSOE2wRAmJDAJ9dRfcX8zPYUvExUtvpxTpQlg2GhwCfde5B C7bsGc8YSwp5aN1vJ6BSiGU= =/c5y -----END PGP SIGNATURE----- From martin@v.loewis.de Thu May 1 16:46:05 2003 From: martin@v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Thu, 01 May 2003 17:46:05 +0200 Subject: [Python-Dev] Initialization hook for extenders In-Reply-To: <3EB13BDB.808E749@llnl.gov> References: <3EB04B03.887CDF7B@llnl.gov> <3EB13BDB.808E749@llnl.gov> Message-ID: <3EB1413D.7080604@v.loewis.de> Patrick J. Miller wrote: > It's not easy. > > Not if you simply want to link against an installed Python. Why not? Just don't call the function Py_Main. > Nor so if you want to build against 2.1 2.2 and 2.3 ... libraries. Again, I can't see a reason why that is. > There are subtle changes that bite you in the ass if you don't > physically copy the right source forward. For example? > We did copy forward main.c, but found that every time we updated > Python, we had to "rehack" main to make sure we had all the options > and flags and initialization straight. That is not necessary. What would be the problem if you just left your function as it was in Python 2.1? > I think the hook is extremely cheap, very short, looks almost exactly > like Py_AtExit() and solves the problem directly. Unfortunately, the problem is one that almost nobody ever has, and supporting that API adds a maintenance burden. 
It is better if the maintenance burden is on your side than on the Python core. If you think you really need this, write a PEP, ask the community, and wait for BDFL pronouncement. I'm still -1. Regards, Martin From patmiller@llnl.gov Thu May 1 17:15:09 2003 From: patmiller@llnl.gov (Patrick J. Miller) Date: Thu, 01 May 2003 09:15:09 -0700 Subject: [Python-Dev] Initialization hook for extenders References: <3EB04B03.887CDF7B@llnl.gov> <3EB13BDB.808E749@llnl.gov> <3EB1413D.7080604@v.loewis.de> Message-ID: <3EB1480D.EE379EC6@llnl.gov> Martin, Sorry you disagree. I think that the issue is still important and other pieces of the API already point in this direction. For instance, there is no need to have PyImport_AppendInittab because you can hack config.c (which you can get from $prefix/lib/pythonx.x/config/config.c) and in fact many people did exactly that, but it made for a messy extension until the API call made it clean and direct. You don't need Py_AtExit() because you can call through to atexit.register() to put the function in. The list goes on... I still think that Py_AtInit() is clean, symmetric with Py_AtExit(), and solves a big problem for extenders who wish to address localization from within C (as opposed to sitecustomize.py). This is a 10 line patch with 0 runtime impact that requires no maintenance to move forward with new versions. If it were more than that, I could better understand your objections. Hope that I can get you to at least vote 0 instead of -1. Cheers, Pat -- Patrick Miller | (925) 423-0309 | http://www.llnl.gov/CASC/people/pmiller You can never solve a problem on the level on which it was created. -- Albert Einstein, physicist, Nobel laureate (1879-1955) From guido@python.org Thu May 1 17:32:31 2003 From: guido@python.org (Guido van Rossum) Date: Thu, 01 May 2003 12:32:31 -0400 Subject: [Python-Dev] Re: Python-Dev digest, Vol 1 #3221 - 4 msgs In-Reply-To: Your message of "Thu, 01 May 2003 10:59:41 CDT."
References: Message-ID: <200305011632.h41GWVu08466@odiug.zope.com> Apologies accepted, Glyph. --Guido van Rossum (home page: http://www.python.org/~guido/) From drifty@alum.berkeley.edu Fri May 2 00:22:34 2003 From: drifty@alum.berkeley.edu (Brett Cannon) Date: Thu, 1 May 2003 16:22:34 -0700 (PDT) Subject: [Python-Dev] python-dev Summary for 2003-04-16 through 2003-04-30 Message-ID: Yes, I am actually getting the rough draft out the day after its coverage ends. Perk of being done with grad school apps. =) You guys have until Sunday (busy Friday and Saturday) to show me why I should try proof-reading one of these days. =) And I did leave the one thread out that Guido asked not to be spread around so I guess this summary is not as "complete" as previous ones. -------------------------------- +++++++++++++++++++++++++++++++++++++++++++++++++++++ python-dev Summary for 2003-04-16 through 2003-04-30 +++++++++++++++++++++++++++++++++++++++++++++++++++++ This is a summary of traffic on the `python-dev mailing list`_ from April 16, 2003 through April 30, 2003. It is intended to inform the wider Python community of on-going developments on the list and to have an archived summary of each thread started on the list. To comment on anything mentioned here, just post to python-list@python.org or `comp.lang.python`_ with a subject line mentioning what you are discussing. All python-dev members are interested in seeing ideas discussed by the community, so don't hesitate to take a stance on something. And if all of this really interests you then get involved and join `python-dev`_! This is the sixteenth summary written by Brett Cannon (writing history the way I see fit =). All summaries are archived at http://www.python.org/dev/summary/ . Please note that this summary is written using reStructuredText_ which can be found at http://docutils.sf.net/rst.html .
Any unfamiliar punctuation is probably markup for reST_ (otherwise it is probably regular expression syntax or a typo =); you can safely ignore it, although I suggest learning reST; it's simple and is accepted for `PEP markup`__. Also, because of the wonders of programs that like to reformat text, I cannot guarantee you will be able to run the text version of this summary through Docutils_ as-is unless it is from the original text file. __ http://www.python.org/peps/pep-0012.html .. _python-dev: http://www.python.org/dev/ .. _python-dev mailing list: http://mail.python.org/mailman/listinfo/python-dev .. _comp.lang.python: http://groups.google.com/groups?q=comp.lang.python .. _Docutils: http://docutils.sf.net/ .. _reST: .. _reStructuredText: http://docutils.sf.net/rst.html .. contents:: .. _last summary: http://www.python.org/dev/summary/2003-04-01_2003-04-15.html ======================= Summary Announcements ======================= So no one responded to my question last time about whether anyone cared if I stopped linking to files in the Python CVS online through ViewCVS. So silence equals whatever answer makes my life easier, so I won't link to files anymore. .. _ViewCVS: http://viewcvs.sf.net/ ================== `2.3b1 release`__ ================== __ http://mail.python.org/pipermail/python-dev/2003-April/034682.html Splinter threads: - `Masks in getargs.c `__ - `CALL_ATTR patch `__ - `Built-in functions as methods `__ - `Tagging the tree `__ - `RELEASED: Python 2.3b1 `__ Guido announced he wanted to get `Python 2.3b1`_ out the door by Friday, April 25 (which he did). He also said if something urgently needed to get in before then to set the priority on the item to 7. The rule for betas is that you can apply bug fixes (that is the point of the releases).
New unit tests can also be added as long as the entire regression testing suite passes with them in there; since this is a beta, any bugs found should be patched along with adding the tests. This led to some patches coming up that some people would like to see get into b1. One was Thomas Heller's patch at http://www.python.org/sf/595026 which adds new argument masks for PyArg_ParseTuple(). Thomas' patch adds two new masks ('k' and 'K') and modifies some others so that their range checking (if they kept any) was more reasonable. This is when Jack Jansen chimed in saying that he didn't notice any mask that worked between 2.2 and 2.3 that converts 32 bit values without throwing a fit. Basically the changes to the 'h' mask left all of the Mac modules broken. The change was backed out, though, and the issue was solved. Martin v. Löwis wanted to get IDNA (International Domain Name Addressing) in (which he did). UnixWare was (and as of this writing still is) broken. It's being worked on, though, by Tim Rice. The CALL_ATTR patch that Thomas Wouters and I worked on at PyCon came up. We were trying to come up with an opcode to replace the common ``LOAD_ATTR; CALL_FUNCTION`` opcode pair that happens whenever you call a method. The hope was to short-circuit pushing the method object onto the stack, since it gets popped off immediately by CALL_FUNCTION. Initially the patch only worked for classic classes but Thomas has since cleaned it up and added support for new-style classes. To help out Thomas, Guido gave an overview of new-style classes and how descriptors work. Basically a descriptor is what exists in a class' __dict__ and "primarily affects instance attribute lookup". When the attribute lookup finds the descriptor it wants, it calls its __get__ method (the tp_descrget slot for C types). The lookup then "binds" this to the instance; this is what separates a bound method from a function since functions are also descriptors.
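The "binding" step can be seen from pure Python (a minimal sketch using modern syntax rather than the 2.3-era C-level view; the class and names here are invented for illustration):

```python
class C:
    def greet(self, name):
        return "hello, " + name

obj = C()

# Plain functions are descriptors: fetching one through __get__
# performs the "binding", yielding a bound method.
func = C.__dict__["greet"]       # the raw function object
bound = func.__get__(obj, C)     # what instance attribute lookup does
assert bound("world") == obj.greet("world") == "hello, world"
```

This is exactly why ``obj.greet`` gives you a bound method while ``C.__dict__["greet"]`` gives you a plain function.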
Properties are just descriptors whose __get__ calls whatever you passed for the fget argument. Class attribute lookup also calls __get__, but the instance argument is None (or NULL for C code). __set__ is called on a descriptor for instance attribute assignment but not for class attribute assignment. Guido clarified this later with an example: given a descriptor f, calling ``f.__get__(obj)`` returns a function g which acts like a curried function (read the Python Cookbook if you don't know what currying_ is). Now when you call ``g(arg1, ...)`` you are basically doing ``f(obj, arg1, ...)``; so this all turns into ``f.__get__(obj)(arg1, ...)``. The problem with the CALL_ATTR patch is that it is turning out to have zero benefit beyond a nicer opcode for a common operation once the code for working with new-style classes is in. This could be from cache misses because of the increased size of the interpreter loop or just too many branches to possibly take. As of now the patch is still on SF and has not been applied. .. _Python 2.3b1: http://www.python.org/2.3/ .. _currying: http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/52549 ========================= `Super and properties`__ ========================= __ http://mail.python.org/pipermail/python-dev/2003-April/034338.html This thread was initially covered in the `last summary`_. Guido ended up explaining why super() does not work for properties. super() does not stop on the first hit of finding something when that "something" is a data descriptor; it ignores it and just keeps on looking. Now super() does this so that it doesn't end up looking like something it isn't. Think of the case of __class__; if super() returned what the object's __class__ returned, it would cause super to look like something it isn't.
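For concreteness, here is a sketch of the case under discussion, using a property (a data descriptor) in an inheritance chain. It runs on modern Pythons, where super() does find data descriptors, and uses decorator syntax that did not yet exist in 2.3; the class names are invented for illustration:

```python
class Base:
    @property
    def label(self):
        return "base"

class Derived(Base):
    @property
    def label(self):
        # Once super() stops skipping data descriptors, it finds
        # Base's property here instead of continuing up the MRO.
        return "derived wraps " + super().label

assert Derived().label == "derived wraps base"
```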
Guido figured people wouldn't want to override data descriptors anyway, so this made sense. But now there is a use case for this, so Guido is changing this for Python 2.3 so that data descriptors are properly hit in the inheritance chain by super(). ===================== `Final PEP 311 run`__ ===================== __ http://mail.python.org/pipermail/python-dev/2003-April/034705.html Mark Hammond's `PEP 311`_ has now been implemented! What Mark has done is implement two functions in C: PyGILState_Ensure() and PyGILState_Restore(). Call the first one to get control of the GIL, without having to know its current state, to allow you to use the Python API safely. The second releases the GIL when you are done making calls out to Python. This is a much simpler interface than what was previously needed when you did not need a very fancy threading interface with Python and just needed to hold the GIL. As always, read the PEP to get the full details. .. _PEP 311: http://www.python.org/peps/pep-0311.html =============================================== `summing a bunch of numbers (or "whatevers")`__ =============================================== __ http://mail.python.org/pipermail/python-dev/2003-April/034767.html Splinter threads: - `stats.py `__ - `''.join() again `__ How would you sum a list of numbers? Traditionally there have been two answers. One is to use the operator module and 'reduce' in the idiomatic ``reduce(operator.add, list_of_numbers)``. The other is to do a simple loop::

    running_sum = 0
    for num in list_of_numbers:
        running_sum += num

A common complaint against the 'reduce' solution is that it is just ugly.
People don't like the loop solution because it is long for such a simple operation. And a knock against both is that new users of Python do not necessarily think of either solution initially. So, what to do? Well, Alex Martelli to the rescue. Alex proposed adding a new built-in, 'sum', that would take a list of numbers and return the sum of those numbers. Originally Alex also wanted to special-case the handling of a list of strings so as to prevent having to tell new Python programmers that ``"".join(list_of_strings)`` is the best way to concatenate a bunch of strings and that looping over them is *really* bad (the amount of copying done in the loop kills performance). But this special-casing was shot down because it seemed rather magical, and the join idiom can still be taught to beginners easily enough ('reduce' tends to require an understanding of functional programming). But the function still got added for numbers. So, as of Python 2.3b1, there is a built-in named 'sum' that has the parameter list "sum(list_of_numbers [, start=0]) -> sum of the numbers in list_of_numbers". The 'start' parameter allows you to specify the initial value for your running sum. And since this is a function with a very specific use it is the fastest way you can sum a list of numbers. The question of adding a statistics module came up during this discussion. The thought was presented to come up with a good, basic stats module to have in the stdlib. The argument against this was that there are already several good stats modules out there, so why bother including one with Python? It would cause some overshadowing of any 3rd-party stats modules. Eventually the "nays" had it and the idea was dropped. And for all his work Alex got CVS commit privileges. Python, the gift that keeps on giving you more responsibility.
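The idioms being compared can be put side by side (a small sketch; note that in 2.3 ``reduce`` was a built-in, whereas in Python 3 it lives in ``functools``):

```python
import operator
from functools import reduce  # a built-in back in 2.3

list_of_numbers = [3, 1, 4, 1, 5]

# Old idiom #1: reduce with operator.add.
via_reduce = reduce(operator.add, list_of_numbers)

# Old idiom #2: an explicit loop.
running_sum = 0
for num in list_of_numbers:
    running_sum += num

# The 2.3 built-in, with the optional 'start' seed value.
assert sum(list_of_numbers) == via_reduce == running_sum == 14
assert sum(list_of_numbers, 100) == 114
```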
=) ================================== `When is it okay to cvs remove?`__ ================================== __ http://mail.python.org/pipermail/python-dev/2003-April/035011.html Related threads: - `Rules of a beta release? `__ Being probably the most inexperienced person with CVS commit privileges on Python, I am continuing with my newbie questions in terms of applying patches to the CVS tree (and since I control the Summary I am going to document the answers I get here so I don't have to write them down somewhere else =). This time I asked about when it was appropriate to use ``cvs remove``, specifically if it was reasonable if a file was completely rewritten. The answer was to not bother with it unless you are actually removing the file forever; don't bother if you are just rewriting the file. Also, don't bother with changing the version number when doing a complete rewrite; just make sure to mention in the CVS commit message that it is a rewrite. I also learned that the basic guideline to follow in terms of whether a patch should be put up on SF_ or just committed directly is that if you are unsure about the usefulness or correctness then you should post it on SF. But if you don't think there is anyone who can answer it on SF it will just languish there for eternity. Also learned the rules of a beta release. Basically no changes that would cause someone's code to not work the same way as when the beta was released can be checked in. New tests are okay, though. .. _SF: http://www.sf.net/ ========= Quickies ========= `3-way result of PyObject_IsTrue() considered PITA`__ Raymond Hettinger discovered that PyObject_IsTrue() was documented as never returning an error, which is not how the function behaves. Raymond fixed the docs to match the code. ..
__: http://mail.python.org/pipermail/python-dev/2003-April/034658.html `Python dies upon printing UNICODE using UTF-8`__ Windows NT 4's support of UTF-8 is broken. .. __: http://mail.python.org/pipermail/python-dev/2003-April/034666.html `shellwords`__ Gustavo Niemeyer asked if there was any chance of getting shellwords_ into the stdlib so as to be able to have POSIX command line word parsing. The basic response was that shlex_ should be enhanced to do what Gustavo wanted. He has since written `patch #722686`_ that implements the features he wanted. It was also discovered that distutils.util.split_quoted comes close. If someone wants to document Distutils utilities it would be greatly appreciated. .. __: http://mail.python.org/pipermail/python-dev/2003-April/034670.html .. _shellwords: http://www.crazy-compilers.com/py-lib/shellwords.html .. _shlex: http://www.python.org/dev/doc/devel/lib/module-shlex.html .. _patch #722686: http://www.python.org/sf/722686 `Changes to gettext.py for Python 2.3`__ This thread was originally covered in the `last summary`_. Barry Warsaw and Martin v. Löwis discussed the gettext_ module and whether there should be a way to coerce strings to other encodings. They ended up agreeing on defaulting to Unicode for storing the strings and having .gettext() coerce to an 8-bit string while .ugettext() returns the original Unicode string. .. __: http://mail.python.org/pipermail/python-dev/2003-April/034511.html .. _gettext: http://www.python.org/dev/doc/devel/lib/module-gettext.html `Stackless 3.0 alpha 1 at blinding speed`__ Christian Tismer has done it again; he improved Stackless_ and has now merged the abilities of Stackless 1 with 2, which has led to 3a. .. __: http://mail.python.org/pipermail/python-dev/2003-April/034708.html .. _Stackless: http://www.stackless.com `Build errors under RH9`__ Python was not building under Red Hat 9, but Martin v. Löwis checked in a fix. ..
__: http://mail.python.org/pipermail/python-dev/2003-April/034724.html `Wrappers and keywords`__ Matt LeBlanc asked why there wasn't a nice syntax for doing properties, staticmethods, and classmethods. The answer is that it was felt more important to get the ability to use those new descriptors out there instead of letting a syntax debate hold them up. .. __: http://mail.python.org/pipermail/python-dev/2003-April/034715.html `Startup overhead due to codec usage`__ MA Lemburg and Martin v. Löwis discussed startup time taken up by seeing what encoding is used by the local filesystem. .. __: http://mail.python.org/pipermail/python-dev/2003-April/034742.html `test_pwd failing`__ Initially covered in the `last summary`_. test_grp was failing for the same reasons test_pwd was failing. It has been fixed. .. __: http://mail.python.org/pipermail/python-dev/2003-April/034626.html `Evil setattr hack`__ Don't mess with an instance's __dict__ directly; we will let you, but if you get burned it's your own fault. .. __: http://mail.python.org/pipermail/python-dev/2003-April/034633.html `heapq`__ Splinter threads: - `FIFO data structure? `__ - `heaps `__ The idea of turning the heapq_ module into a class came up, and later led to the idea of having a more proper FIFO (First In, First Out) data structure. Both ideas were shot down. The reason for this was that the stdlib does not need to try to grow every single possible data structure in programming. Guido's design philosophy is to have a few very powerful data structures that other ones can be built off of. This is why the bisect_ and heapq modules just work on standard lists instead of defining a new class. Queue_ is an exception, but it is designed to mediate messages between threads instead of being a general implementation of a queue. .. __: http://mail.python.org/pipermail/python-dev/2003-April/034768.html .. _heapq: http://www.python.org/dev/doc/devel/lib/module-heapq.html ..
_bisect: http://www.python.org/dev/doc/devel/lib/module-bisect.html .. _Queue: http://www.python.org/dev/doc/devel/lib/module-Queue.html `New re failures on Windows`__ Splinter threads: - `sre vs gcc `__ The re_ module was failing after some changes were made to it. The pain of it all was that it was failing only on certain platforms running gcc_. Initial attempts were to make it "just work", but then it was stressed that it is more important to find the offending C code and figure out why gcc on certain platforms was compiling bad assembly. .. __: http://mail.python.org/pipermail/python-dev/2003-April/034776.html .. _re: http://www.python.org/dev/doc/devel/lib/module-re.html .. _gcc: http://gcc.gnu.org/ `os.path.walk() lacks 'depth first' option`__ Someone requested that os.path.walk support depth-first walking. The request was deemed not important enough to bother implementing, but Tim Peters did implement a new function named os.walk that is a much improved replacement for os.path.walk. .. __: http://mail.python.org/pipermail/python-dev/2003-April/034792.html `Weekly Python Bug/Patch Summary`__ Skip Montanaro's weekly reminder that there is work to be done! Summary for week 2 can be found `here `__. .. __: http://mail.python.org/pipermail/python-dev/2003-April/034797.html `Hook Extension Module Import?`__ Want to do something that requires a special import hook in C? Then override the __import__ built-in with what you need. .. __: http://mail.python.org/pipermail/python-dev/2003-April/034804.html `Bug/feature/patch policy for optparse.py`__ Greg Ward asked if it would be okay to keep the official version of optparse_ at http://optik.sf.net/ . Guido said sure. The justification for this is that Greg wants Optik to be available to people for use in earlier versions of Python. .. __: http://mail.python.org/pipermail/python-dev/2003-April/034833.html ..
_optparse: http://www.python.org/dev/doc/devel/lib/module-optparse.html `LynxOS4 dynamic loading with dlopen() and -ldl`__ LynxOS4 does not like dynamic linking. .. __: http://mail.python.org/pipermail/python-dev/2003-April/034878.html `Embedded python on Win2K, import failures`__ I don't like Windows. And no, this has nothing to do with this single email that is a short continuation of one covered in the `last summary`_. .. __: http://mail.python.org/pipermail/python-dev/2003-April/034506.html `New thread death in test_bsddb3`__ After Mark Hammond's new thread code got checked in, the bsddb module broke. Mark went in, though, and using the wonders of the C preprocessor and NEW_PYGILSTATE_API_EXISTS, fixed the code to use the new PyGILState API as covered in `PEP 311`_ when possible and to use the old solution when needed. .. __: http://mail.python.org/pipermail/python-dev/2003-April/034901.html `Magic number needs upgrade`__ Guido noticed that the PYC magic number needed to be incremented to handle Raymond Hettinger's new bytecode optimizations. But then Guido questioned the need of Raymond's changes. Basically Raymond's changes didn't speed anything up but cleaned up the emitted bytecode. Guido didn't like the idea of adding more code without an actual speed improvement. Since neither this code nor any of the other proposed speedup changes (CALL_ATTR and caching attribute lookup results) are panning out, Guido questioned why Raymond's should get in. Guido suggested rewriting the interpreter from scratch since all new changes seem to be breaking some delicate balance that has developed in it. He also thought about putting effort into other things like psyco_. Eventually Raymond's changes were backed out. .. __: http://mail.python.org/pipermail/python-dev/2003-April/034905.html .. _psyco: http://psyco.sf.net/ `draft PEP: Trace and Profile Support for Threads`__ Jeremy Hylton has a draft PEP on how to add hooks for profile and trace code in threads.
.. __: http://mail.python.org/pipermail/python-dev/2003-April/034909.html `Data Descriptors on module objects`__ Never going to happen. .. __: http://mail.python.org/pipermail/python-dev/2003-April/035025.html `Metatype conflict among bases?`__ "The metaclass [of a class] must be a subclass of the metaclass of all the bases" of that class. .. __: http://mail.python.org/pipermail/python-dev/2003-April/034910.html `okay to beef up tests on the maintenance branch?`__ Answer: yes! .. __: http://mail.python.org/pipermail/python-dev/2003-April/034939.html `Cryptographic stuff for 2.3`__ AM Kuchling wanted to add an implementation of the AES_ encryption algorithm to the stdlib. After a long discussion the idea was shot down because having crypto that strong in the stdlib would cause export issues for Python. .. __: http://mail.python.org/pipermail/python-dev/2003-April/034957.html .. _AES: http://csrc.nist.gov/encryption/aes/ `vacation`__ Neal Norwitz is on vacation from April 26 till May 6. He pointed out some nagging errors coming up from the `Snake Farm`_ that could use some work. .. __: http://mail.python.org/pipermail/python-dev/2003-April/034942.html .. _Snake Farm: http://www.lysator.liu.se/xenofarm/python/latest.html `test_getargs2 failures`__ Not anymore. .. __: http://mail.python.org/pipermail/python-dev/2003-April/034944.html `Democracy`__ Guido pointed out a paper on democracy (in the ancient Athenian sense) and the organization of groups at http://www.acm.org/ubiquity/interviews/b_manville_1.html that was interesting. It sparked some discussion on proper comparisons to open source projects and such. .. __: http://mail.python.org/pipermail/python-dev/2003-April/034946.html `Updating PEP 246 for type/class unification, 2.2+, etc.`__ Phillip Eby proposed some changes to `PEP 246`_. .. __: http://mail.python.org/pipermail/python-dev/2003-April/034955.html ..
_PEP 246: http://www.python.org/peps/pep-0246.html `why is test_socketserver in expected skips?`__ Skip Montanaro noticed that socketserver was listed as an expected test to be skipped on all platforms sans os2emx even though it works on all platforms with networking (basically all of them). So it was removed from the expected skip list. Skip also tweaked test_support.requires to always pass when the caller is __main__. .. __: http://mail.python.org/pipermail/python-dev/2003-April/034973.html `netrc.py`__ Bram Moolenaar, author of the `greatest editor in the world`_ and AAP_, requested a change to netrc_ that got implemented. .. __: http://mail.python.org/pipermail/python-dev/2003-April/034983.html .. _greatest editor in the world: http://www.vim.org/ .. _AAP: http://www.a-a-p.org/ .. _netrc: http://www.python.org/dev/doc/devel/lib/module-netrc.html `PyRun_* functions`__ They take FILE* arguments and it is going to stay that way. Just make sure the files are opened with the same library as being built against. .. __: http://mail.python.org/pipermail/python-dev/2003-April/034990.html `Python Developers`__ Related threads: - `Getting mouse position interms of canvas unit. `__ - `2.3b1, and object() `__ Posted to the wrong email list. .. __: http://mail.python.org/pipermail/python-dev/2003-April/034969.html `New test failure on Windows`__ re_ was failing but got fixed. .. __: http://mail.python.org/pipermail/python-dev/2003-April/035009.html `More new Windos test failures`__ Just before `Python 2.3b1`_ got pushed out the door, some last-minute test failures cropped up (some of them were my fault). But they got fixed. .. __: http://mail.python.org/pipermail/python-dev/2003-April/035047.html `should sre.Scanner be exposed through re and documented?`__ re.Scanner shall remain undocumented. .. __: http://mail.python.org/pipermail/python-dev/2003-April/035066.html `LynxOS4 port: need pre-ncurses curses!`__ The LynxOS port is hoping curses will go away. ..
__: http://mail.python.org/pipermail/python-dev/2003-April/035052.html `test_s?re merge`__ test_re and test_sre have been merged and moved over to unittest_ thanks to Skip Montanaro. .. __: http://mail.python.org/pipermail/python-dev/2003-April/035067.html .. _unittest: http://www.python.org/dev/doc/devel/lib/module-unittest.html `test_ossaudiodev hanging again`__ Some people are still having issues with ossaudiodev tests hanging. .. __: http://mail.python.org/pipermail/python-dev/2003-April/035056.html `bz2 module fails to compile on Solaris 8`__ The joys of being cross-platform. .. __: http://mail.python.org/pipermail/python-dev/2003-April/035068.html `test_logging hangs on Solaris 8`__ Splinter threads: - `test_logging hangs on OS X `__ - `test_logging hangs on Solaris 8 (and 9) `__ The joys of threading and trying to avoid deadlock. A fix has been checked in that seems to fix this on OS X; don't know about Solaris yet. .. __: http://mail.python.org/pipermail/python-dev/2003-April/035065.html `Python 2.3b1 documentation`__ Fred L. Drake, Jr. posted the documentation for Python 2.3b1. .. __: http://mail.python.org/pipermail/python-dev/2003-April/035064.html `Accepted PEPs?`__ Splinter threads: - `Reminder to PEP authors `__ - `proposed amendments to PEP 1 `__ The status of some PEPs got updated along with some proposed changes to `PEP 1`_. .. __: http://mail.python.org/pipermail/python-dev/2003-April/035104.html .. _PEP 1: http://www.python.org/peps/pep-0001.html `Problems w/ Python 2.2-maint and Redhat 9`__ Dealing with some issues of Python 2.2-maint and linking against a dbm. .. __: http://mail.python.org/pipermail/python-dev/2003-April/035120.html `Why doesn't the uu module give you the filename?`__ Someone wanted the uu_ module to let you know what the name of the encoded file is. Was told to post a patch. .. __: http://mail.python.org/pipermail/python-dev/2003-April/035129.html ..
_uu: http://www.python.org/dev/doc/devel/lib/module-uu.html `Antigen found CorruptedCompressedUuencodeFile virus`__ The joys of having to watch out for viruses in emails and getting false positives. .. __: http://mail.python.org/pipermail/python-dev/2003-April/035130.html `Python 2.3b1 has 20% slower networking?`__ Splinter threads: - `Python-Dev digest, Vol 1 #3221 - 4 msgs `__ Networking throughput in a loop did not reach as high a maximum as before. It has been fixed, though. .. __: http://mail.python.org/pipermail/python-dev/2003-April/035132.html `cvs socketmodule.c and IPV6 disabled`__ Discovered some code that couldn't compile because a test for a specific C function was not specific enough. .. __: http://mail.python.org/pipermail/python-dev/2003-April/035146.html `Introduction :)`__ Someone else with the first name of Brett introduced themselves to the list (Brett Kelly). You can tell us apart because I am taller. =) .. __: http://mail.python.org/pipermail/python-dev/2003-April/035162.html `Dictionary tuning`__ Splinter threads: - `Dictionary tuning upto 100,000 entries `__ Raymond Hettinger did a bunch of attempted tuning of dictionary accesses and came up with one solution that managed to be beneficial for large dictionaries and not detrimental for small ones. He basically just caused dictionary sizes to grow by a factor of 4 instead of 2 so as to lower the number of collisions. The objection that came up was that some dictionaries would be larger than they were previously. It looks like it will be applied, but Raymond's notes on everything will most likely end up as a text file in Python. .. __: http://mail.python.org/pipermail/python-dev/2003-April/035151.html `Thoughts on -O`__ It was suggested to change what the -O and -OO command-line switches do since at this moment they don't do much (Guido has even suggested eliminating -O). But the discussion has been partially put on hold until development for Python 2.4 starts. ..
__: http://mail.python.org/pipermail/python-dev/2003-April/035165.html `Initialization hook for extenders`__ It has been suggested to add a Py_AtInit() hook to Python to be symmetric with Py_AtExit(). The debate over this is still going. .. __: http://mail.python.org/pipermail/python-dev/2003-April/035226.html From Raymond Hettinger: NOTES ON OPTIMIZING DICTIONARIES ================================ Principal Use Cases for Dictionaries ------------------------------------ Passing keyword arguments Typically, one read and one write for 1 to 3 elements. Occurs frequently in normal Python code. Class method lookup Dictionaries vary in size with 8 to 16 elements being common. Usually written once with many lookups. When base classes are used, there are many failed lookups followed by a lookup in a base class. Instance attribute lookup and Global variables Dictionaries vary in size. 4 to 10 elements are common. Both reads and writes are common. Builtins Frequent reads. Almost never written. Size: 126 interned strings (as of Py2.3b1). A few keys are accessed much more frequently than others. Uniquification Dictionaries of any size. Bulk of work is in creation. Repeated writes to a smaller set of keys. Single read of each key. * Removing duplicates from a sequence. dict.fromkeys(seqn).keys() * Counting elements in a sequence. for e in seqn: d[e] = d.get(e, 0) + 1 * Accumulating items in a dictionary of lists. for k, v in itemseqn: d.setdefault(k, []).append(v) Membership Testing Dictionaries of any size. Created once and then rarely changed. Single write to each key. Many calls to __contains__() or has_key(). Similar access patterns occur with replacement dictionaries such as with the % formatting operator. Data Layout (assuming a 32-bit box with 64 bytes per cache line) ---------------------------------------------------------------- Small dicts (8 entries) are attached to the dictobject structure and the whole group nearly fills two consecutive cache lines.
Larger dicts use the first half of the dictobject structure (one cache line) and a separate, contiguous block of entries (at 12 bytes each for a total of 5.333 entries per cache line). Tunable Dictionary Parameters ----------------------------- * PyDict_MINSIZE. Currently set to 8. Must be a power of two. New dicts have to zero-out every cell. Each additional 8 entries consumes 1.5 cache lines. Increasing it improves the sparseness of small dictionaries but costs time to read in the additional cache lines if they are not already in cache. That case is common when keyword arguments are passed. * Maximum dictionary load in PyDict_SetItem. Currently set to 2/3. Increasing this ratio makes dictionaries more dense, resulting in more collisions. Decreasing it improves sparseness at the expense of spreading entries over more cache lines and at the cost of total memory consumed. The load test occurs in highly time-sensitive code. Efforts to make the test more complex (for example, varying the load for different sizes) have degraded performance. * Growth rate upon hitting maximum load. Currently set to *2. Raising this to *4 results in half the number of resizes, less effort to resize, better sparseness for some (but not all) dict sizes, and potentially double memory consumption depending on the size of the dictionary. Setting it to *4 eliminated every other resize step. Tune-ups should be measured across a broad range of applications and use cases. A change to any parameter will help in some situations and hurt in others. The key is to find settings that help the most common cases and do the least damage to the less common cases. Results will vary dramatically depending on the exact number of keys, whether the keys are all strings, whether reads or writes dominate, and the exact hash values of the keys (some sets of values have fewer collisions than others). Any one test or benchmark is likely to prove misleading.
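[Editor's note: the *2-versus-*4 growth tradeoff above is easy to model. The sketch below is a toy simulation of the stated policy (resize whenever the load would exceed the 2/3 maximum), not CPython's actual dictresize() code; the function name is invented for illustration.]

```python
def count_resizes(n_keys, growth=2, minsize=8, max_load=2.0 / 3.0):
    """Count table resizes while inserting n_keys distinct keys.

    Toy model of the policy described above: whenever the number of
    used slots would exceed max_load of the table size, grow the
    table by the given growth factor.  Illustration only -- not
    CPython's real resize logic.
    """
    size, resizes = minsize, 0
    for used in range(1, n_keys + 1):
        if used > size * max_load:
            while used > size * max_load:
                size *= growth
            resizes += 1
    return resizes
```

Under this model, filling 100,000 keys triggers 15 resizes with *2 growth but only 8 with *4 growth, consistent with the "half the number of resizes" observation.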
Results of Cache Locality Experiments ------------------------------------- When an entry is retrieved from memory, 4.333 adjacent entries are also retrieved into a cache line. Since accessing items in cache is *much* cheaper than a cache miss, an enticing idea is to probe the adjacent entries as a first step in collision resolution. Unfortunately, the introduction of any regularity into collision searches results in more collisions than the current random chaining approach. Exploiting cache locality at the expense of additional collisions fails to pay off when the entries are already loaded in cache (the expense is paid with no compensating benefit). This occurs in small dictionaries, where the whole dictionary fits into a pair of cache lines. It also occurs frequently in large dictionaries, which have a common access pattern where some keys are accessed much more frequently than others. The more popular entries *and* their collision chains tend to remain in cache. To exploit cache locality, change the collision resolution section in lookdict() and lookdict_string(). Set i^=1 at the top of the loop and move the i = (i << 2) + i + perturb + 1 to an unrolled version of the loop. This optimization strategy can be leveraged in several ways: * If the dictionary is kept sparse (through the tunable parameters), then the occurrence of additional collisions is lessened. * If lookdict() and lookdict_string() are specialized for small dicts and for large dicts, then the versions for large dicts can be given the alternate search without increasing the collisions in small dicts, which already have the maximum benefit of cache locality. * If the use case for the dictionary is known to have a random key access pattern (as opposed to a more common pattern with a Zipf's law distribution), then there will be more benefit for large dictionaries because any given key is no more likely than another to already be in cache.
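[Editor's note: the "random chaining" referred to above is the perturb-based recurrence from dictobject.c. Here is a pure-Python sketch of that probe sequence (assumes a non-negative hash value; slot masking follows the C code's i & mask idiom).]

```python
PERTURB_SHIFT = 5  # same constant as in dictobject.c

def probe_sequence(h, mask, count=8):
    """Yield the first `count` slots that lookdict() would probe for a
    key whose (non-negative) hash is h, in a table of size mask+1.

    Pure-Python sketch of the C collision-resolution loop: the slot
    is always i & mask, while i itself grows via i = 5*i + perturb + 1
    and perturb shifts toward zero.
    """
    i = h & mask
    perturb = h
    for _ in range(count):
        yield i & mask
        i = (i << 2) + i + perturb + 1   # i = i*5 + perturb + 1
        perturb >>= PERTURB_SHIFT
```

Once perturb shifts down to zero, the recurrence degenerates to i = 5*i + 1 (mod 2**k), which cycles through every slot, so a probe chain can never get stuck: for hash 0 in an 8-slot table the sequence is 0, 1, 6, 7, 4, 5, 2, 3 -- all eight slots.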
Optimizing the Search of Small Dictionaries ------------------------------------------- If lookdict() and lookdict_string() are specialized for smaller dictionaries, then a custom search approach can be implemented that exploits the small search space and cache locality. * The simplest example is a linear search of contiguous entries. This is simple to implement, guaranteed to terminate rapidly, and precludes the need to check for dummy entries. * A more advanced example is a self-organizing search so that the most frequently accessed entries get probed first. The organization adapts if the access pattern changes over time. * Also, small dictionaries may be made more dense, perhaps filling all eight cells to take the maximum advantage of two cache lines. Strategy Pattern ---------------- Consider allowing the user to set the tunable parameters or to select a particular search method. Since some dictionary use cases have known sizes and access patterns, the user may be able to provide useful hints. 1) For example, if membership testing or lookup dominates runtime and memory is not at a premium, the user may benefit from setting the maximum load ratio at 5% or 10% instead of the usual 66.7%. This will sharply curtail the number of collisions. 2) Dictionary creation time can be shortened in cases where the ultimate size of the dictionary is known in advance. The dictionary can be pre-sized so that *no* resize operations are required during creation. Not only does this save resizes, but the key insertion will go more quickly because the first half of the keys will be inserted into a more sparse environment than before. The preconditions for this strategy arise whenever a dictionary is created from a key or item sequence of known length. 3) If the key space is large and the access pattern is known to be random, then search strategies exploiting cache locality can be fruitful. The preconditions for this strategy arise in simulations and numerical analysis. 
4) If the keys are fixed and the access pattern strongly favors some of the keys, then the entries can be stored consecutively and accessed with a linear search. This exploits knowledge of the data, cache locality, and a simplified search routine. It also eliminates the need to test for dummy entries on each probe. The preconditions for this strategy arise in symbol tables and in the builtin dictionary. From martin@v.loewis.de Fri May 2 07:40:21 2003 From: martin@v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=) Date: 02 May 2003 08:40:21 +0200 Subject: [Python-Dev] python-dev Summary for 2003-04-16 through 2003-04-30 In-Reply-To: References: Message-ID: Brett Cannon writes: > IDNA (International Domain Name Addressing) Funnily, the "A" is for "in applications" (as opposed to "in the nameserver"/"on the wire"). Explaining the acronym as "internationalized domain names" should be sufficient. Regards, Martin From Anthony Baxter Fri May 2 09:20:17 2003 From: Anthony Baxter (Anthony Baxter) Date: Fri, 02 May 2003 18:20:17 +1000 Subject: [Python-Dev] _socket efficiencies ideas In-Reply-To: <200304091441.h39EfnU25347@odiug.zope.com> Message-ID: <200305020820.h428KIU24126@localhost.localdomain> >>> Guido van Rossum wrote > Hey, I just figured it out. The old socket module (Python 2.1 and > before) *did* special-case \d+\.\d+\.\d+\.\d+! This code was somehow > lost when the IPv6 support was added. I propose to put it back in, at > least for IPv4 (AF_INET). Patch anyone? https://sourceforge.net/tracker/index.php?func=detail&aid=731209&group_id=5470&atid=305470 Unfortunately the code still goes through the idna encoding module - this is some overhead that it would be nice to avoid for all-numeric addresses. 
Anthony From martin@v.loewis.de Fri May 2 09:58:23 2003 From: martin@v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Fri, 02 May 2003 10:58:23 +0200 Subject: [Python-Dev] _socket efficiencies ideas In-Reply-To: <200305020820.h428KIU24126@localhost.localdomain> References: <200305020820.h428KIU24126@localhost.localdomain> Message-ID: <3EB2332F.70900@v.loewis.de> Anthony Baxter wrote: > https://sourceforge.net/tracker/index.php?func=detail&aid=731209&group_id=5470&atid=305470 > > Unfortunately the code still goes through the idna encoding module - this > is some overhead that it would be nice to avoid for all-numeric addresses. That happens only if the argument is a Unicode string, no? Regards, Martin From Anthony Baxter Fri May 2 10:19:48 2003 From: Anthony Baxter (Anthony Baxter) Date: Fri, 02 May 2003 19:19:48 +1000 Subject: [Python-Dev] _socket efficiencies ideas In-Reply-To: <3EB2332F.70900@v.loewis.de> Message-ID: <200305020919.h429Jmp24632@localhost.localdomain> >>> =?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?= wrote > > Unfortunately the code still goes through the idna encoding module - this > > is some overhead that it would be nice to avoid for all-numeric addresses. > > That happens only if the argument is a Unicode string, no? Ah. That could be the case - I think I'm loading the address from an XML file in the test case I used... will fix that. Anthony From martin@v.loewis.de Fri May 2 10:55:42 2003 From: martin@v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Fri, 02 May 2003 11:55:42 +0200 Subject: [Python-Dev] _socket efficiencies ideas In-Reply-To: <200305020919.h429Jmp24632@localhost.localdomain> References: <200305020919.h429Jmp24632@localhost.localdomain> Message-ID: <3EB2409E.8000403@v.loewis.de> Anthony Baxter wrote: > Ah. That could be the case - I think I'm loading the address from an > XML file in the test case I used... will fix that. 
If you mean "I'll fix the test case to not use XML anymore" - that might be reasonable. If you mean "I'll fix the test case to convert the Unicode arguments to byte strings before passing them to the socket module", I suggest that this should not be needed: the IDNA codec should complete quickly if the Unicode string is ASCII only (perhaps not as fast as converting the string to ASCII beforehand, but not significantly slower). Regards, Martin From Jack.Jansen@cwi.nl Fri May 2 13:45:34 2003 From: Jack.Jansen@cwi.nl (Jack Jansen) Date: Fri, 2 May 2003 14:45:34 +0200 Subject: [Python-Dev] Demos and Tools in binary distributions Message-ID: There's a suggestion over on pythonmac-sig that I add the Demos and Tools directories to a binary installer for MacPython for OSX. For MacPython-OS9 I've always included these, as the OS9 installed tree was really the same layout as the source tree. But I don't really know where I should put them for OSX. How is this handled in binary installers for other platforms? I.e. if you install Python on Windows, do you get Demos and Tools? Where? And if you install an RPM or something similar on Linux? -- Jack Jansen, , http://www.cwi.nl/~jack If I can't dance I don't want to be part of your revolution -- Emma Goldman From thomas@xs4all.net Fri May 2 14:02:14 2003 From: thomas@xs4all.net (Thomas Wouters) Date: Fri, 2 May 2003 15:02:14 +0200 Subject: [Python-Dev] Demos and Tools in binary distributions In-Reply-To: References: Message-ID: <20030502130214.GG26254@xs4all.nl> On Fri, May 02, 2003 at 02:45:34PM +0200, Jack Jansen wrote: > How is this handled in binary installers for other platforms? I.e. if > you install Python on Windows, do you get Demos and Tools? Where? And > if you install an RPM or something similar on Linux? The Debian packages include Demo and Tools in /usr/share/doc/python/examples/; this is practically mandated by the Debian policy ;) -- Thomas Wouters Hi! I'm a .signature virus! 
copy me into your .signature file to help me spread! From guido@python.org Fri May 2 16:01:16 2003 From: guido@python.org (Guido van Rossum) Date: Fri, 02 May 2003 11:01:16 -0400 Subject: [Python-Dev] Demos and Tools in binary distributions In-Reply-To: "Your message of Fri, 02 May 2003 14:45:34 +0200." References: Message-ID: <200305021501.h42F1Ga02666@pcp02138704pcs.reston01.va.comcast.net> > There's a suggestion over on pythonmac-sig that I add the Demos and > Tools directories to a binary installer for MacPython for OSX. For > MacPython-OS9 I've always included these, as the OS9 installed tree > was really the same layout as the source tree. But I don't really > know where I should put them for OSX. > > How is this handled in binary installers for other platforms? I.e. if > you install Python on Windows, do you get Demos and Tools? Where? And > if you install an RPM or something similar on Linux? On Windows, you get a small selection of tools (i18n, idle, pynche, scripts, versioncheck and webchecker) but no demos, alas. --Guido van Rossum (home page: http://www.python.org/~guido/) From mwh@python.net Fri May 2 16:03:52 2003 From: mwh@python.net (Michael Hudson) Date: Fri, 02 May 2003 16:03:52 +0100 Subject: [Python-Dev] Demos and Tools in binary distributions In-Reply-To: (Jack Jansen's message of "Fri, 2 May 2003 14:45:34 +0200") References: Message-ID: <2mof2l8k6f.fsf@starship.python.net> Jack Jansen writes: > There's a suggestion over on pythonmac-sig that I add the Demos and > Tools directories to a binary installer for MacPython for OSX. For > MacPython-OS9 I've always included these, as the OS9 installed tree > was really the same layout as the source tree. But I don't really > know where I should put them for OSX. Surely this is more a question about OSX than Python? I.e. the examples should go where the user expects them. /Developer/Examples/Python? Of course, not everyone who installs Python will have the dev tools... Cheers, M. 
-- Need to Know is usually an interesting UK digest of things that happened last week or might happen next week. [...] This week, nothing happened, and we don't care. -- NTK Now, 2000-12-29, http://www.ntk.net/ From logistix@cathoderaymission.net Fri May 2 16:12:32 2003 From: logistix@cathoderaymission.net (logistix) Date: Fri, 2 May 2003 10:12:32 -0500 (CDT) Subject: [Python-Dev] Demos and Tools in binary distributions In-Reply-To: <2mof2l8k6f.fsf@starship.python.net> Message-ID: On Fri, 2 May 2003, Michael Hudson wrote: > Jack Jansen writes: > > > There's a suggestion over on pythonmac-sig that I add the Demos and > > Tools directories to a binary installer for MacPython for OSX. For > > MacPython-OS9 I've always included these, as the OS9 installed tree > > was really the same layout as the source tree. But I don't really > > know where I should put them for OSX. > > Surely this is more a question about OSX than Python? I.e. the > examples should go where the user expects them. > /Developer/Examples/Python? Of course, not everyone who installs > Python will have the dev tools... > > Cheers, > M. > Are there currently any make targets for 'tools' and 'demos'? Adding them might be a way to gently influence where they get installed when all the different distros build their packages. From skip@pobox.com Fri May 2 16:30:53 2003 From: skip@pobox.com (Skip Montanaro) Date: Fri, 2 May 2003 10:30:53 -0500 Subject: [Python-Dev] updated notes about building bsddb185 module Message-ID: <16050.36653.443229.45811@montanaro.dyndns.org> Folks, A recent thread on c.l.py about the old bsddb module and new bsddb package convinced me to add more verbiage about building the old version. If you have a moment, please take a look at http://www.python.org/2.3/highlights.html and/or README at the top of the source tree. (Search for "bsddb".) I modified them to include a brief note about building the bsddb185 module and making it appear as the default when people "import bsddb".
Feedback appreciated. Thanks, Skip From barry@python.org Fri May 2 16:44:39 2003 From: barry@python.org (Barry Warsaw) Date: 02 May 2003 11:44:39 -0400 Subject: [Python-Dev] updated notes about building bsddb185 module In-Reply-To: <16050.36653.443229.45811@montanaro.dyndns.org> References: <16050.36653.443229.45811@montanaro.dyndns.org> Message-ID: <1051890279.29805.0.camel@barry> On Fri, 2003-05-02 at 11:30, Skip Montanaro wrote: > Folks, > > An recent thread on c.l.py about the old bsddb module and new bsddb package > convinced me to add more verbiage about building the old version. If you > have a moment, please take a look at > > http://www.python.org/2.3/highlights.html > > and/or README at the top of the source tree. (Search for "bsddb".) I > modified them to include a brief note about building the bsddb185 module and > making it appear as the default when people "import bsddb". Without actually trying the recipe, the instructions seem reasonable. -Barry From dberlin@dberlin.org Fri May 2 16:58:03 2003 From: dberlin@dberlin.org (Daniel Berlin) Date: Fri, 2 May 2003 11:58:03 -0400 Subject: [Python-Dev] 2.3 broke email date parsing Message-ID: Parsing dates in emails is broken in 2.3 compared to 2.2.2. Changing parsedate_tz back to what it was in 2.2.2 fixes it. I'm not sure who or why this change was made, but it clearly doesn't handle cases it used to: (oldparseaddr is the 2.3 version with the patch at the bottom applied, which reverts it to what it was in 2.2.2) >>> import _parseaddr >>> _parseaddr.parsedate_tz("3 Mar 2001 02:04:50 -0000") >>> import oldparseaddr >>> oldparseaddr.parsedate_tz("3 Mar 2001 02:04:50 -0000") (2001, 3, 3, 2, 4, 50, 0, 0, 0, 0) >>> The problem is obvious from looking at the new code: The old version would only care if it actually found something it needed to delete. The new version assumes there *must* be a comma in the date if there is no dayname, and if there isn't, returns nothing. 
I wanted to know if this was a mistake, or done on purpose. If it's a mistake, I'll submit a patch to sourceforge to fix it.

Index: _parseaddr.py
===================================================================
RCS file: /cvsroot/python/python/dist/src/Lib/email/_parseaddr.py,v
retrieving revision 1.5
diff -u -3 -p -r1.5 _parseaddr.py
--- _parseaddr.py	17 Mar 2003 18:35:42 -0000	1.5
+++ _parseaddr.py	2 May 2003 15:42:30 -0000
@@ -49,14 +49,9 @@ def parsedate_tz(data):
     data = data.split()
     # The FWS after the comma after the day-of-week is optional, so search and
     # adjust for this.
-    if data[0].endswith(',') or data[0].lower() in _daynames:
+    if data[0][-1] in (',', '.') or data[0].lower() in _daynames:
         # There's a dayname here. Skip it
         del data[0]
-    else:
-        i = data[0].rfind(',')
-        if i < 0:
-            return None
-        data[0] = data[0][i+1:]
     if len(data) == 3:  # RFC 850 date, deprecated
         stuff = data[0].split('-')
         if len(stuff) == 3:

From just@letterror.com Fri May 2 17:20:40 2003 From: just@letterror.com (Just van Rossum) Date: Fri, 2 May 2003 18:20:40 +0200 Subject: [Python-Dev] Demos and Tools in binary distributions In-Reply-To: <2mof2l8k6f.fsf@starship.python.net> Message-ID: Michael Hudson wrote: > Surely this is more a question about OSX than Python? I.e. the > examples should go where the user expects them. > /Developer/Examples/Python? Of course, not everyone who installs > Python will have the dev tools... Actually, I didn't know until recently that 3rd party stuff sometimes gets installed there (eg. the PyObjC doco). I would actually expect it in /Applications/MacPython-2.3/..., as that's where the apps get installed. I guess /Developer/... would make sense if the Python apps got installed in /Developer/Applications/, which they don't.
Just From theller@python.net Fri May 2 17:35:36 2003 From: theller@python.net (Thomas Heller) Date: 02 May 2003 18:35:36 +0200 Subject: [Python-Dev] New thread death in test_bsddb3 In-Reply-To: References: Message-ID: "Tim Peters" writes: > [Thomas Heller] > > ... > > So is the policy now that it is no longer *allowed* to create another > > thread state, while in previous versions there wasn't any choice, > > because there existed no way to get the existing one? > > You can still create all the thread states you like; the new check is in > PyThreadState_Swap(), not in PyThreadState_New(). So you can create them, but are not allowed to use them? (Should there be a smiley here, or not, I'm not sure) > > There was always a choice, but previously Python provided no *help* in > keeping track of whether a thread already had a thread state associated with > it. That didn't stop careful apps from providing their own mechanisms to do > so. > > About policy, yes, it appears to be so now, else Mark wouldn't be raising a > fatal error . I view it as having always been the policy (from a > good-faith reading of the previous code), just a policy that was too > expensive for Python to enforce. There are many policies like that, such as > not passing goofy arguments to macros, and not letting references leak. > Python doesn't currently enforce them because it's currently too expensive > to enforce them. Over time that can change. I'm confused: what *is* the policy now? And: Has the policy *changed*, or was it simply not checked before? Since I don't know the policy, I can only guess if the fatal error is appropriate or not. If it is, there should be a 'recipe' what to do (even if it is 'use the approach outlined in PEP311'). If it is not, the error should be removed (IMO). > Clearly, I like having fatal errors for dubious things in debug builds. > Debug builds are supposed to help you debug. 
If the fatal error here drives > you insane, and you don't want to repair the app code, No, not at all. Thanks, Thomas From martin@v.loewis.de Fri May 2 18:01:51 2003 From: martin@v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Fri, 02 May 2003 19:01:51 +0200 Subject: [Python-Dev] Re: [Pydotorg] updated notes about building bsddb185 module In-Reply-To: <16050.36653.443229.45811@montanaro.dyndns.org> References: <16050.36653.443229.45811@montanaro.dyndns.org> Message-ID: <3EB2A47F.8000706@v.loewis.de> Skip Montanaro wrote: > Feedback appreciated. I think we need to build bsddb185 automatically under certain conditions. I have encouraged a user to submit a patch in that direction. Regards, Martin From skip@pobox.com Fri May 2 18:34:40 2003 From: skip@pobox.com (Skip Montanaro) Date: Fri, 2 May 2003 12:34:40 -0500 Subject: [Python-Dev] Re: [Pydotorg] updated notes about building bsddb185 module In-Reply-To: <3EB2A47F.8000706@v.loewis.de> References: <16050.36653.443229.45811@montanaro.dyndns.org> <3EB2A47F.8000706@v.loewis.de> Message-ID: <16050.44080.588636.503705@montanaro.dyndns.org> Martin> Skip Montanaro wrote: >> Feedback appreciated. Martin> I think we need to build bsddb185 automatically under certain Martin> conditions. I have encouraged a user to submit a patch in that Martin> direction. I suppose that's an alternative; however, it is complicated by a couple of issues: * The bsddb185 module would have to be built as bsddb (not a big deal in and of itself). * The current bsddb package directory would have to be renamed or not installed to avoid name clashes. I don't think there's a precedent for the second issue. The make install target installs everything in Lib. I think the decision about whether the package or the module gets installed would be made in setup.py. The coupling between the two increases the complexity of the process. I smell an ugly hack in the offing.
Skip From tim.one@comcast.net Fri May 2 18:55:05 2003 From: tim.one@comcast.net (Tim Peters) Date: Fri, 02 May 2003 13:55:05 -0400 Subject: [Python-Dev] New thread death in test_bsddb3 In-Reply-To: Message-ID: [Thomas Heller] >>> ... >>> So is the policy now that it is no longer *allowed* to create another >>> thread state, while in previous versions there wasn't any choice, >>> because there existed no way to get the existing one? [Tim] >> You can still create all the thread states you like; the new check is >> in PyThreadState_Swap(), not in PyThreadState_New(). [Thomas] > So you can create them, Yes. > but are not allowed to use them? Currently, no more than one at a time per thread. The API doesn't appear to preclude using multiple thread states with a single thread if the right dances are performed. Offhand I don't know why someone would want to, but people want to do a lot of silly things . > (Should there be a smiley here, or not, I'm not sure) No. > ... > I'm confused: what *is* the policy now? > And: Has the policy *changed*, or was it simply not checked before? I already gave you my best guesses about those (no, yes). > Since I don't know the policy, I can only guess if the fatal error is > appropriate or not. Ditto (yes). > If it is, there should be a 'recipe' what to do (even if it is 'use the > approach outlined in PEP311'). Additions to NEWS and the PEP would be fine by me. > If it is not, the error should be removed (IMO). Sure. From tim.one@comcast.net Fri May 2 20:28:41 2003 From: tim.one@comcast.net (Tim Peters) Date: Fri, 02 May 2003 15:28:41 -0400 Subject: [Python-Dev] Draft of dictnotes.txt [Please Comment] In-Reply-To: <000901c3106b$0d549d20$125ffea9@oemcomputer> Message-ID: [Raymond Hettinger] > NOTES ON OPTIMIZING DICTIONARIES > ================================ > ... Very nice! Please check it in. 
From tim.one@comcast.net Fri May 2 20:59:40 2003 From: tim.one@comcast.net (Tim Peters) Date: Fri, 02 May 2003 15:59:40 -0400 Subject: [Python-Dev] python-dev Summary for 2003-04-16 through 2003-04-30 In-Reply-To: Message-ID: [Brett Cannon] > ... > But the function still got added for numbers. So, as of Python 2.3b1, > there is a built-in named 'sum' that has the parameter list > "sum(list_of_numbers [, start=0]) -> sum of the numbers in > list_of_numbers". The 'start' parameter allows you to specify where to > start in the list for your running sum. list_of_numbers is really any iterable producing numbers. All the numbers are added ("start" doesn't affect that), as if via

    def sum(seq, start=0):
        result = start
        for x in seq:
            result += x
        return result

The best use for start is if you're summing a sequence of number-like arguments that can't be added to the integer 0 (datetime.timedelta is an example). > ... > Python, the gift that keeps on giving you more responsibility. =) Speaking of which, your PSF dues for April are overdue. > ... > `os.path.walk() lacks 'depth first' option`__ > Someone requested that os.path.walk support depth-first walking. This was a terminology confusion: os.path.walk() always did depth-first walking, and so does the new os.walk(). The missing bit was an option to control whether directories are delivered in preorder ("top down") or postorder ("bottom up") during the depth-first walk. > The request was deemed not important enough to bother implementing, A topdown flag is implemented in os.walk(). From Jack.Jansen@oratrix.com Fri May 2 22:52:08 2003 From: Jack.Jansen@oratrix.com (Jack Jansen) Date: Fri, 2 May 2003 23:52:08 +0200 Subject: [Python-Dev] Demos and Tools in binary distributions In-Reply-To: Message-ID: <51F6BD90-7CE8-11D7-A7DC-000A27B19B96@oratrix.com> On vrijdag, mei 2, 2003, at 18:20 Europe/Amsterdam, Just van Rossum wrote: > Michael Hudson wrote: > >> Surely this is more a question about OSX than Python? I.e.
the >> examples should go where the user expects them. >> /Developer/Examples/Python? Of course, not everyone who installs >> Python will have the dev tools... > > Actually, I didn't know until recently that 3rd party stuff sometimes > gets installed there (eg. the PyObjC doco). I would actually expect it > in /Applications/MacPython-2.3/..., as that's where the apps get > installed. I guess /Developer/... would make sense if the Python apps > got installed in /Developer/Applications/, which they don't. I'm also tempted to go with /Applications/MacPython-2.3/Demo and .../Tools. That is what a lot of Mac applications do. It has a slight problem, though: it would look unintuitive to a pure-unix user. But as there isn't a standard location for this on unix anyway: who cares. A slightly more serious problem is that the READMEs in Tools and Demo aren't really meant for the 100%-novice, and a prominent location at the top of the /Applications/MacPython-2.3 folder will make it almost-100%-certain that these files are going to be among the first they read. I could put Demo and Tools one level deeper (in an Extras folder?) and provide a readme there explaining that these demos and tools are for all Pythons on all platforms, so may not work and/or may not be intelligible in the first place. -- - Jack Jansen http://www.cwi.nl/~jack - - If I can't dance I don't want to be part of your revolution -- Emma Goldman - From just@letterror.com Fri May 2 23:08:48 2003 From: just@letterror.com (Just van Rossum) Date: Sat, 3 May 2003 00:08:48 +0200 Subject: [Python-Dev] Demos and Tools in binary distributions In-Reply-To: <51F6BD90-7CE8-11D7-A7DC-000A27B19B96@oratrix.com> Message-ID: Jack Jansen wrote: > I could put Demo and Tools one level deeper (in an Extras folder?) > and provide a readme there explaining that these demos and tools are > for all Pythons on all platforms, so may not work and/or may not be > intelligible in the first place.
+1 Just From martin@v.loewis.de Sat May 3 00:39:51 2003 From: martin@v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 03 May 2003 01:39:51 +0200 Subject: [Python-Dev] New thread death in test_bsddb3 In-Reply-To: References: Message-ID: <3EB301C7.5000508@v.loewis.de> Tim Peters wrote: > Currently, no more than one at a time per thread. The API doesn't appear to > preclude using multiple thread states with a single thread if the right > dances are performed. Offhand I don't know why someone would want to, but > people want to do a lot of silly things . There are many good reasons; here is one scenario: Application A calls embedded Python. It creates thread state T1 to do so. Python calls library L1, which releases GIL. L1 calls L2. L2 calls back into Python. To do so, it allocates a new thread state, and acquires the GIL. All in one thread. L2 has no idea that A has already allocated a thread state for this thread. With the new API, L2 does not need any longer to create a thread state. However, in older Python releases, this was necessary, so libraries do such things. It is unfortunate that these libraries now break, and I wish the new API would not be enforced so strictly yet. > I already gave you my best guesses about those (no, yes). I think your guess is wrong: In the past, it was often *necessary* to have multiple thread states allocated for a single thread. There was simply no other option. So it can't be that this was not allowed. Regards, Martin From skip@pobox.com Sat May 3 00:49:31 2003 From: skip@pobox.com (Skip Montanaro) Date: Fri, 2 May 2003 18:49:31 -0500 Subject: [Python-Dev] removing csv directory from nondist/sandbox - how? Message-ID: <16051.1035.821998.148196@montanaro.dyndns.org> Taking a cue from Raymond's sandbox/itertools cleanup, I cvs removed the contents of sandbox/csv just now. How do I get rid of the sandbox/csv directory itself? I see that the itertools directory remains as well, even though I executed "cvs -dP ." 
from the sandbox directory. Skip From martin@v.loewis.de Sat May 3 00:32:24 2003 From: martin@v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 03 May 2003 01:32:24 +0200 Subject: [Python-Dev] Re: [Pydotorg] updated notes about building bsddb185 module In-Reply-To: <16050.44080.588636.503705@montanaro.dyndns.org> References: <16050.36653.443229.45811@montanaro.dyndns.org> <3EB2A47F.8000706@v.loewis.de> <16050.44080.588636.503705@montanaro.dyndns.org> Message-ID: <3EB30008.4010603@v.loewis.de> Skip Montanaro wrote: [building bsddb185] > I suppose that's an alternative, however, it is complicated by a couple > issues: > > * The bsddb185 module would have to be built as bsddb (not a big deal in > and of itself). Why is that? I propose to build the bsddb185 module as bsddb185. It does not support being built as bsddb[module]. > * The current bsddb package directory would have to be renamed or not > installed to avoid name clashes. I suggest no such thing, and I agree that this would not be desirable. Regards, Martin From skip@pobox.com Sat May 3 01:11:53 2003 From: skip@pobox.com (Skip Montanaro) Date: Fri, 2 May 2003 19:11:53 -0500 Subject: [Python-Dev] Re: [Pydotorg] updated notes about building bsddb185 module In-Reply-To: <3EB30008.4010603@v.loewis.de> References: <16050.36653.443229.45811@montanaro.dyndns.org> <3EB2A47F.8000706@v.loewis.de> <16050.44080.588636.503705@montanaro.dyndns.org> <3EB30008.4010603@v.loewis.de> Message-ID: <16051.2377.270099.748537@montanaro.dyndns.org> Skip> I suppose that's an alternative, however, it is complicated by a Skip> couple issues: Skip> Skip> * The bsddb185 module would have to be built as bsddb (not a big Skip> deal in and of itself). Martin> Why is that? I propose to build the bsddb185 module as Martin> bsddb185. It does not support being built as bsddb[module]. Skip> * The current bsddb package directory would have to be renamed or Skip> not installed to avoid name clashes. 
Martin> I suggest no such thing, and I agree that this would not be Martin> desirable. My apologies, Martin. I guess I misunderstood what you suggested. (I suspect Nick Vargish may have as well.) My interpretation of his complaint is that he doesn't have a functioning bsddb module and wants the old module back. He wants to be able to install Python and have "bsddb" be the module. As currently constituted, I think Modules/bsddbmodule.c can only be built as "bsddb185" because of the symbols in the file. How can Nick build that as "bsddb"? Furthermore, how can you guarantee that the bsddb package directory won't be found before the bsddb module during a module search (short, perhaps of statically linking the module into the interpreter)? Skip From pje@telecommunity.com Sat May 3 01:29:12 2003 From: pje@telecommunity.com (Phillip J. Eby) Date: Fri, 02 May 2003 20:29:12 -0400 Subject: [Python-Dev] removing csv directory from nondist/sandbox - how? In-Reply-To: <16051.1035.821998.148196@montanaro.dyndns.org> Message-ID: <5.1.0.14.0.20030502202821.02563020@mail.telecommunity.com> At 06:49 PM 5/2/03 -0500, Skip Montanaro wrote: >Taking a cue from Raymond's sandbox/itertools cleanup, I cvs removed the >contents of sandbox/csv just now. How do I get rid of the sandbox/csv >directory itself? I see that the itertools directory remains as well, even >though I executed "cvs -dP ." from the sandbox directory. You can't remove directories from a CVS server unless you have direct access to it. And if you remove the directory, its history goes with it. 
From martin@v.loewis.de Sat May 3 01:25:03 2003 From: martin@v.loewis.de ("Martin v. Löwis") Date: Sat, 03 May 2003 02:25:03 +0200 Subject: [Python-Dev] Re: [Pydotorg] updated notes about building bsddb185 module In-Reply-To: <16051.2377.270099.748537@montanaro.dyndns.org> References: <16050.36653.443229.45811@montanaro.dyndns.org> <3EB2A47F.8000706@v.loewis.de> <16050.44080.588636.503705@montanaro.dyndns.org> <3EB30008.4010603@v.loewis.de> <16051.2377.270099.748537@montanaro.dyndns.org> Message-ID: <3EB30C5F.90801@v.loewis.de> Skip Montanaro wrote: > My apologies, Martin. I guess I misunderstood what you suggested. (I > suspect Nick Vargish may have as well.) My interpretation of his complaint > is that he doesn't have a functioning bsddb module and wants the old module > back. That's the larger of his complaints. There is also a subcomplaint: Building the new bsddb185 module is not automatic, so he has to give explicit instructions to his admins. > He wants to be able to install Python and have "bsddb" be the module. He would want it that way. However, he could also accept importing bsddb185 as bsddb. He cannot accept having to edit Modules/Setup, and he cannot accept building Sleepycat [34].x. > As currently constituted, I think Modules/bsddbmodule.c can only be built as > "bsddb185" because of the symbols in the file. How can Nick build that as > "bsddb"? He can't. He can build it as bsddb185. However, his complaint is that setup.py doesn't do that for him. > Furthermore, how can you guarantee that the bsddb package > directory won't be found before the bsddb module during a module search > (short, perhaps of statically linking the module into the interpreter)? I don't think the module should be bsddb; I renamed the init function on purpose. All I'm suggesting is that it be automatically built by setup.py. People can accept changing their Python code. They cannot accept having to ask more favours from their sysadmins.
Regards, Martin From andymac@bullseye.apana.org.au Fri May 2 23:45:27 2003 From: andymac@bullseye.apana.org.au (Andrew MacIntyre) Date: Sat, 3 May 2003 09:45:27 +1100 (edt) Subject: [Python-Dev] Re: [Pydotorg] updated notes about building bsddb185 module In-Reply-To: <16050.44080.588636.503705@montanaro.dyndns.org> Message-ID: On Fri, 2 May 2003, Skip Montanaro wrote: > Martin> Skip Montanaro wrote: > >> Feedback appreciated. > > Martin> I think we need to build bsddb185 automatically under certain > Martin> conditions. I have encouraged a user to submit a patch in that > Martin> direction. > > I suppose that's an alternative, however, it is complicated by a couple > issues: > > * The bsddb185 module would have to be built as bsddb (not a big deal in > and of itself). > > * The current bsddb package directory would have to be renamed or not > installed to avoid name clashes. > > I don't think there's a precedent for the second issue. The make install > target installs everything in Lib. I think the decision about whether the > package or the module gets installed would be made in setup.py. The > coupling between the two increases the complexity of the process. I smell > an ugly hack in the offing. Could you not have the following? - build bsddb if the Sleepycat libraries are found; - build bsddb185 if the DB 1.85 libraries can be found; - where bsddb is imported, try importing bsddb, and if that fails try importing bsddb185 as bsddb (or as * inside the bsddb pkg). -- Andrew I MacIntyre "These thoughts are mine alone..."
E-mail: andymac@bullseye.apana.org.au | Snail: PO Box 370 andymac@pcug.org.au | Belconnen ACT 2616 Web: http://www.andymac.org/ | Australia From barry@python.org Sat May 3 03:02:45 2003 From: barry@python.org (Barry Warsaw) Date: 02 May 2003 22:02:45 -0400 Subject: [Python-Dev] Re: [Pydotorg] updated notes about building bsddb185 module In-Reply-To: <3EB30008.4010603@v.loewis.de> References: <16050.36653.443229.45811@montanaro.dyndns.org> <3EB2A47F.8000706@v.loewis.de> <16050.44080.588636.503705@montanaro.dyndns.org> <3EB30008.4010603@v.loewis.de> Message-ID: <1051927365.4302.3.camel@anthem> On Fri, 2003-05-02 at 19:32, "Martin v. Löwis" wrote: > [building bsddb185] > > I suppose that's an alternative, however, it is complicated by a couple > > issues: > > > > * The bsddb185 module would have to be built as bsddb (not a big deal in > > and of itself). > > Why is that? I propose to build the bsddb185 module as bsddb185. It does > not support being built as bsddb[module]. > > > * The current bsddb package directory would have to be renamed or not > > installed to avoid name clashes. > > I suggest no such thing, and I agree that this would not be desirable. I totally agree with Martin. Make bsddb185 explicit and do not masquerade it as bsddb by default. -Barry From barry@python.org Sat May 3 03:04:17 2003 From: barry@python.org (Barry Warsaw) Date: 02 May 2003 22:04:17 -0400 Subject: [Python-Dev] removing csv directory from nondist/sandbox - how? In-Reply-To: <16051.1035.821998.148196@montanaro.dyndns.org> References: <16051.1035.821998.148196@montanaro.dyndns.org> Message-ID: <1051927457.4302.5.camel@anthem> On Fri, 2003-05-02 at 19:49, Skip Montanaro wrote: > Taking a cue from Raymond's sandbox/itertools cleanup, I cvs removed the > contents of sandbox/csv just now. How do I get rid of the sandbox/csv > directory itself? I see that the itertools directory remains as well, even > though I executed "cvs -dP ." from the sandbox directory.
Check to make sure you don't have any dot-files left in the directory. -P should definitely zap it if there's nothing in there. You really don't want to remove the directory from the repository (for a number of reasons). -Barry From tim.one@comcast.net Sat May 3 03:49:04 2003 From: tim.one@comcast.net (Tim Peters) Date: Fri, 02 May 2003 22:49:04 -0400 Subject: [Python-Dev] New thread death in test_bsddb3 In-Reply-To: <3EB301C7.5000508@v.loewis.de> Message-ID: [Martin v. Löwis] > There are many good reasons; here is one scenario: > > Application A calls embedded Python. It creates thread state T1 to do > so. Python calls library L1, which releases the GIL. L1 calls L2. L2 calls > back into Python. To do so, it allocates a new thread state, and > acquires the GIL. All in one thread. > > L2 has no idea that A has already allocated a thread state for this > thread. With the new API, L2 no longer needs to create a thread > state. However, in older Python releases, this was necessary, so > libraries do such things. I understand that some people did this (we've bumped into two so far, right?), but don't agree it was necessary: the thrust of Mark's new code is to make this easy to do in a uniform way, but people could (and did) build their own layers of TLS-based Python wrappers before this (Mark is one of them; a former employer of mine is another). AFAIK, though, these were cases where multiple libraries agreed to cooperate. I don't really care anymore, since there's a standard way to do this now. > It is unfortunate that these libraries now break, and I wish the new > API would not be enforced so strictly yet. If it were enforced in a release build I'd agree, but it isn't -- a release build enforces nothing new here, and I want to be punched in the groin when a debug build spots dubious practice. >> I already gave you my best guesses about those (no, yes).
> I think your guess is wrong: In the past, it was often *necessary* to > have multiple thread states allocated for a single thread. There was > simply no other option. So it can't be that this was not allowed. It's a new world now -- let's get on with it. Fighting for the right to retain lame code (judged by current stds, whether or not it was lame before) isn't a cause I'll sign up for, and especially not when it's in an extremely error-prone area of the C API, and certainly not when it's so easy to repair too. But if you're determined to let slop slide in the debug build, check in a change to stop the warning -- it's not important enough to me to keep arguing about it. I don't think you'd be doing anyone a real favor, and I'll leave it at that. From martin@v.loewis.de Sat May 3 02:52:41 2003 From: martin@v.loewis.de ("Martin v. Löwis") Date: Sat, 03 May 2003 03:52:41 +0200 Subject: [Python-Dev] Re: [Pydotorg] updated notes about building bsddb185 module In-Reply-To: References: Message-ID: <3EB320E9.4020409@v.loewis.de> Andrew MacIntyre wrote: > Could you not have the following? > - build bsddb if the Sleepycat libraries are found; That is happening now. > - build bsddb185 if the DB 1.85 libraries can be found; That is what I'm proposing. Volunteers should step forward. > - where bsddb is imported, try importing bsddb, and if that > fails try importing bsddb185 as bsddb (or as * inside the bsddb pkg). I'm strongly opposed to that. Users of bsddb185 need to make an explicit choice that they want to use that library. Otherwise, we would have to deal with the bug reports resulting from the brokenness of the library forever. Regards, Martin From tim.one@comcast.net Sat May 3 04:16:35 2003 From: tim.one@comcast.net (Tim Peters) Date: Fri, 02 May 2003 23:16:35 -0400 Subject: [Python-Dev] removing csv directory from nondist/sandbox - how?
In-Reply-To: <16051.1035.821998.148196@montanaro.dyndns.org> Message-ID: [Skip Montanaro] > Taking a cue from Raymond's sandbox/itertools cleanup, I cvs removed > the contents of sandbox/csv just now. How do I get rid of the > sandbox/csv directory itself? I see that the itertools directory > remains as well, even though I executed "cvs -dP ." from the sandbox > directory. -P won't remove a directory if there's any file remaining in the directory that wasn't checked in. This includes dot files (as Barry said), .rej files left behind by old rejected patches, temp scripts or output files you may have created, or a build directory created by setup.py. I had to get rid of all of those before CVS deleted my csv directory (normally I just do deltree (rm -rf) on a dead directory, and CVS won't recreate it then, but I did it by hand this time just to verify how -P works). From noah@noah.org Sat May 3 11:36:53 2003 From: noah@noah.org (Noah Spurrier) Date: Sat, 03 May 2003 03:36:53 -0700 Subject: [Python-Dev] posixmodule.c patch to support forkpty (patch against posixmodule.c Revision 2.241.2.1) Message-ID: <3EB39BC5.50702@noah.org> This is a multi-part message in MIME format. --------------020003030503000503080704 Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit Hi, I have been taking a hard look at Python 2.3b and support for pseudo-ttys seems to be much better. It looks like os.openpty() was updated to provide support for a wider range of pseudo ttys. Unfortunately os.forkpty() was not also updated. I am attaching a patch that allows os.forkpty() to run on the same platforms that os.openpty supports. In other words, os.forkpty() will use os.fork() and os.openpty() for platforms that don't already have forkpty(). Note that since pty module calls os.forkpty this patch will also allow pty.fork() to work properly on more platforms. Most importantly to me, this patch will allow os.forkpty() to work with Solaris. 
This patch was diffed against posixmodule.c Revision 2.241.2.1 Python 2.3b. This patch moves most of the logic out of the posix_openpty() C function into a function that can be shared by both posix_openpty() and posix_forkpty(). Although the posix_openpty() logic was moved it was unchanged. I think I kept the code neat despite all the messy #if's that always accompany pty code. I am also attaching a test script, test_forkpty.py (based on test_openpty.py), that tests the basic ability to fork and read and write a pty. I am testing it with my Pexpect module which makes heavy use of the pty module. With the patch Pexpect passes all my unit tests on Solaris. Pexpect has been tested on Linux, OpenBSD, Solaris, and Cygwin. I'm looking for an OS X server to test with. Yours, Noah --------------020003030503000503080704 Content-Type: text/plain; name="test_forkpty.py" Content-Transfer-Encoding: 7bit Content-Disposition: inline; filename="test_forkpty.py" #!/usr/bin/env python2.3 import os, sys, time verbose = 1 try: if verbose: print "Calling os.forkpty()" pid, fd = os.forkpty() if verbose: print "(pid, fd) = (%d, %d)"%(pid, fd) except AttributeError: raise TestSkipped, "No forkpty() available." if pid == 0: # child print "I am not a robot!" 
sys.stdout.flush(0) else: time.sleep(1) print "The robot says: ", os.read(fd,100) os.close(fd) --------------020003030503000503080704 Content-Type: text/plain; name="posixmodule.c.patch" Content-Transfer-Encoding: 7bit Content-Disposition: inline; filename="posixmodule.c.patch" *** posixmodule.c Tue Apr 22 22:39:17 2003 --- new.posixmodule.c Sat May 3 06:11:04 2003 *************** *** 2597,2685 **** #endif /* defined(HAVE_OPENPTY) || defined(HAVE_FORKPTY) || defined(HAVE_DEV_PTMX */ #if defined(HAVE_OPENPTY) || defined(HAVE__GETPTY) || defined(HAVE_DEV_PTMX) - PyDoc_STRVAR(posix_openpty__doc__, - "openpty() -> (master_fd, slave_fd)\n\n\ - Open a pseudo-terminal, returning open fd's for both master and slave end.\n"); - static PyObject * ! posix_openpty(PyObject *self, PyObject *noargs) { ! int master_fd, slave_fd; #ifndef HAVE_OPENPTY ! char * slave_name; #endif #if defined(HAVE_DEV_PTMX) && !defined(HAVE_OPENPTY) && !defined(HAVE__GETPTY) ! PyOS_sighandler_t sig_saved; #ifdef sun ! extern char *ptsname(); #endif #endif #ifdef HAVE_OPENPTY ! if (openpty(&master_fd, &slave_fd, NULL, NULL, NULL) != 0) ! return posix_error(); #elif defined(HAVE__GETPTY) ! slave_name = _getpty(&master_fd, O_RDWR, 0666, 0); ! if (slave_name == NULL) ! return posix_error(); ! slave_fd = open(slave_name, O_RDWR); ! if (slave_fd < 0) ! return posix_error(); #else ! master_fd = open(DEV_PTY_FILE, O_RDWR | O_NOCTTY); /* open master */ ! if (master_fd < 0) ! return posix_error(); ! sig_saved = signal(SIGCHLD, SIG_DFL); ! /* change permission of slave */ ! if (grantpt(master_fd) < 0) { ! signal(SIGCHLD, sig_saved); ! return posix_error(); ! } ! /* unlock slave */ ! if (unlockpt(master_fd) < 0) { ! signal(SIGCHLD, sig_saved); ! return posix_error(); ! } ! signal(SIGCHLD, sig_saved); ! slave_name = ptsname(master_fd); /* get name of slave */ ! if (slave_name == NULL) ! return posix_error(); ! slave_fd = open(slave_name, O_RDWR | O_NOCTTY); /* open slave */ ! if (slave_fd < 0) ! 
return posix_error(); #if !defined(__CYGWIN__) && !defined(HAVE_DEV_PTC) ! ioctl(slave_fd, I_PUSH, "ptem"); /* push ptem */ ! ioctl(slave_fd, I_PUSH, "ldterm"); /* push ldterm */ #ifndef __hpux ! ioctl(slave_fd, I_PUSH, "ttcompat"); /* push ttcompat */ #endif /* __hpux */ #endif /* HAVE_CYGWIN */ #endif /* HAVE_OPENPTY */ ! return Py_BuildValue("(ii)", master_fd, slave_fd); } #endif /* defined(HAVE_OPENPTY) || defined(HAVE__GETPTY) || defined(HAVE_DEV_PTMX) */ ! #ifdef HAVE_FORKPTY PyDoc_STRVAR(posix_forkpty__doc__, "forkpty() -> (pid, master_fd)\n\n\ Fork a new process with a new pseudo-terminal as controlling tty.\n\n\ Like fork(), return 0 as pid to child process, and PID of child to parent.\n\ To both, return fd of newly opened pseudo-terminal.\n"); - static PyObject * posix_forkpty(PyObject *self, PyObject *noargs) { ! int master_fd, pid; ! pid = forkpty(&master_fd, NULL, NULL, NULL); ! if (pid == -1) ! return posix_error(); ! if (pid == 0) ! PyOS_AfterFork(); ! return Py_BuildValue("(ii)", pid, master_fd); } #endif --- 2597,2784 ---- #endif /* defined(HAVE_OPENPTY) || defined(HAVE_FORKPTY) || defined(HAVE_DEV_PTMX */ #if defined(HAVE_OPENPTY) || defined(HAVE__GETPTY) || defined(HAVE_DEV_PTMX) static PyObject * ! __shared_openpty (int * out_master_fd, int * out_slave_fd) { ! int master_fd, slave_fd; #ifndef HAVE_OPENPTY ! char * slave_name; #endif #if defined(HAVE_DEV_PTMX) && !defined(HAVE_OPENPTY) && !defined(HAVE__GETPTY) ! PyOS_sighandler_t sig_saved; #ifdef sun ! extern char *ptsname(); #endif #endif #ifdef HAVE_OPENPTY ! if (openpty(&master_fd, &slave_fd, NULL, NULL, NULL) != 0) ! return posix_error(); #elif defined(HAVE__GETPTY) ! slave_name = _getpty(&master_fd, O_RDWR, 0666, 0); ! if (slave_name == NULL) ! return posix_error(); ! slave_fd = open(slave_name, O_RDWR); ! if (slave_fd < 0) ! return posix_error(); #else ! master_fd = open(DEV_PTY_FILE, O_RDWR | O_NOCTTY); ! if (master_fd < 0){ ! return posix_error(); ! } ! 
sig_saved = signal(SIGCHLD, SIG_DFL); ! /* change permission of slave */ ! if (grantpt(master_fd) < 0) { ! signal(SIGCHLD, sig_saved); ! return posix_error(); ! } ! /* unlock slave */ ! if (unlockpt(master_fd) < 0) { ! signal(SIGCHLD, sig_saved); ! return posix_error(); ! } ! signal(SIGCHLD, sig_saved); ! slave_name = ptsname(master_fd); ! if (slave_name == NULL){ ! return posix_error(); ! } ! slave_fd = open(slave_name, O_RDWR | O_NOCTTY); ! if (slave_fd < 0){ ! return posix_error(); ! } #if !defined(__CYGWIN__) && !defined(HAVE_DEV_PTC) ! ioctl(slave_fd, I_PUSH, "ptem"); /* push ptem */ ! ioctl(slave_fd, I_PUSH, "ldterm"); /* push ldterm */ #ifndef __hpux ! ioctl(slave_fd, I_PUSH, "ttcompat"); /* push ttcompat */ #endif /* __hpux */ #endif /* HAVE_CYGWIN */ #endif /* HAVE_OPENPTY */ ! *out_master_fd = master_fd; ! *out_slave_fd = slave_fd; ! return Py_BuildValue("(ii)", master_fd, slave_fd); ! } + PyDoc_STRVAR(posix_openpty__doc__, + "openpty() -> (master_fd, slave_fd)\n\n\ + Open a pseudo-terminal, returning open fd's for both master and slave end.\n"); + static PyObject * + posix_openpty(PyObject *self, PyObject *noargs) + { + int master_fd; + int slave_fd; + + return __shared_openpty (& master_fd, & slave_fd); } #endif /* defined(HAVE_OPENPTY) || defined(HAVE__GETPTY) || defined(HAVE_DEV_PTMX) */ ! /* Use forkpty if available. For platform that don't have it I try to define it. */ ! #if defined(HAVE_FORKPTY) || (defined(HAVE_FORK) && (defined(HAVE_OPENPTY) || defined(HAVE__GETPTY) || defined(HAVE_DEV_PTMX))) PyDoc_STRVAR(posix_forkpty__doc__, "forkpty() -> (pid, master_fd)\n\n\ Fork a new process with a new pseudo-terminal as controlling tty.\n\n\ Like fork(), return 0 as pid to child process, and PID of child to parent.\n\ To both, return fd of newly opened pseudo-terminal.\n"); static PyObject * posix_forkpty(PyObject *self, PyObject *noargs) { ! #ifdef HAVE_FORKPTY /* The easy one */ ! int master_fd, pid; ! pid = forkpty(&master_fd, NULL, NULL, NULL); ! 
#else /* The hard one */ ! int master_fd, pid; ! int slave_fd; ! char * slave_name; ! int fd; ! ! __shared_openpty (& master_fd, & slave_fd); ! if (master_fd < 0 || slave_fd < 0) ! { ! return posix_error(); ! } ! slave_name = ptsname(master_fd); ! pid = fork(); ! switch (pid) { ! case -1: ! return posix_error(); ! case 0: /* Child */ ! ! #ifdef TIOCNOTTY ! /* Explicitly close the old controlling terminal. ! Some platforms require an explicit detach of the current controlling tty ! before we close stdin, stdout, stderr. ! OpenBSD says that this is obsolete, but doesn't hurt. */ ! fd = open("/dev/tty", O_RDWR | O_NOCTTY); ! if (fd >= 0) { ! (void) ioctl(fd, TIOCNOTTY, (char *)0); ! close(fd); ! } ! #endif /* TIOCNOTTY */ ! ! /* The setsid() system call will place the process into its own session ! which has the effect of disassociating it from the controlling terminal. ! This is known to be true for OpenBSD. ! */ ! if (setsid() < 0){ ! return posix_error(); ! } ! ! ! /* Verify that we are disconnected from the controlling tty. */ ! fd = open("/dev/tty", O_RDWR | O_NOCTTY); ! if (fd >= 0) { ! close(fd); ! return posix_error(); ! } ! ! #ifdef TIOCSCTTY ! /* Make the pseudo terminal the controlling terminal for this process ! (the process must not currently have a controlling terminal). ! */ ! if (ioctl(slave_fd, TIOCSCTTY, (char *)0) < 0){ ! return posix_error(); ! } ! #endif /* TIOCSCTTY */ ! ! /* Verify that we can open to the slave pty file. */ ! fd = open(slave_name, O_RDWR); ! if (fd < 0){ ! return posix_error(); ! } ! else ! close(fd); ! ! /* Verify that we now have a controlling tty. */ ! fd = open("/dev/tty", O_WRONLY); ! if (fd < 0){ ! return posix_error(); ! } ! else { ! close(fd); ! } ! ! (void) close(master_fd); ! (void) dup2(slave_fd, 0); ! (void) dup2(slave_fd, 1); ! (void) dup2(slave_fd, 2); ! if (slave_fd > 2) ! (void) close(slave_fd); ! pid = 0; ! break; ! default: ! /* PARENT */ ! (void) close(slave_fd); ! } ! #endif ! ! if (pid == -1) ! 
return posix_error(); ! if (pid == 0) ! PyOS_AfterFork(); ! return Py_BuildValue("(ii)", pid, master_fd); } #endif *************** *** 6994,7000 **** #if defined(HAVE_OPENPTY) || defined(HAVE__GETPTY) || defined(HAVE_DEV_PTMX) {"openpty", posix_openpty, METH_NOARGS, posix_openpty__doc__}, #endif /* HAVE_OPENPTY || HAVE__GETPTY || HAVE_DEV_PTMX */ ! #ifdef HAVE_FORKPTY {"forkpty", posix_forkpty, METH_NOARGS, posix_forkpty__doc__}, #endif /* HAVE_FORKPTY */ #ifdef HAVE_GETEGID --- 7093,7099 ---- #if defined(HAVE_OPENPTY) || defined(HAVE__GETPTY) || defined(HAVE_DEV_PTMX) {"openpty", posix_openpty, METH_NOARGS, posix_openpty__doc__}, #endif /* HAVE_OPENPTY || HAVE__GETPTY || HAVE_DEV_PTMX */ ! #if defined(HAVE_FORKPTY) || (defined(HAVE_FORK) && (defined(HAVE_OPENPTY) || defined(HAVE__GETPTY) || defined(HAVE_DEV_PTMX))) {"forkpty", posix_forkpty, METH_NOARGS, posix_forkpty__doc__}, #endif /* HAVE_FORKPTY */ #ifdef HAVE_GETEGID --------------020003030503000503080704-- From martin@v.loewis.de Sat May 3 13:23:09 2003 From: martin@v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=) Date: 03 May 2003 14:23:09 +0200 Subject: [Python-Dev] posixmodule.c patch to support forkpty (patch against posixmodule.c Revision 2.241.2.1) In-Reply-To: <3EB39BC5.50702@noah.org> References: <3EB39BC5.50702@noah.org> Message-ID: Noah Spurrier writes: > I am attaching a patch Please see http://www.python.org/dev/devfaq.html#a2 Please don't post patches to python-dev. > This patch was diffed against posixmodule.c Revision 2.241.2.1 Python 2.3b. Please generate patches against the mainline, not against branches. Kind regards, Martin From noah@noah.org Sat May 3 15:10:12 2003 From: noah@noah.org (Noah Spurrier) Date: Sat, 03 May 2003 07:10:12 -0700 Subject: [Python-Dev] posixmodule.c patch to support forkpty (patch against posixmodule.c Revision 2.241.2.1) In-Reply-To: References: <3EB39BC5.50702@noah.org> Message-ID: <3EB3CDC4.4020306@noah.org> Sorry... my first patch :-) Yours, Noah Martin v. 
Löwis wrote: > Noah Spurrier writes: > >>I am attaching a patch > > Please see > > http://www.python.org/dev/devfaq.html#a2 From skip@pobox.com Sat May 3 15:22:48 2003 From: skip@pobox.com (Skip Montanaro) Date: Sat, 3 May 2003 09:22:48 -0500 Subject: [Python-Dev] Re: [Pydotorg] updated notes about building bsddb185 module In-Reply-To: <1051927365.4302.3.camel@anthem> References: <16050.36653.443229.45811@montanaro.dyndns.org> <3EB2A47F.8000706@v.loewis.de> <16050.44080.588636.503705@montanaro.dyndns.org> <3EB30008.4010603@v.loewis.de> <1051927365.4302.3.camel@anthem> Message-ID: <16051.53432.301308.205335@montanaro.dyndns.org> Barry> I totally agree with Martin. Make bsddb185 explicit and do not Barry> masquerade it as bsddb by default. Okay, that's fine with me. Skip From skip@pobox.com Sat May 3 15:25:28 2003 From: skip@pobox.com (Skip Montanaro) Date: Sat, 3 May 2003 09:25:28 -0500 Subject: [Python-Dev] Re: [Pydotorg] updated notes about building bsddb185 module In-Reply-To: <1051927365.4302.3.camel@anthem> References: <16050.36653.443229.45811@montanaro.dyndns.org> <3EB2A47F.8000706@v.loewis.de> <16050.44080.588636.503705@montanaro.dyndns.org> <3EB30008.4010603@v.loewis.de> <1051927365.4302.3.camel@anthem> Message-ID: <16051.53592.704262.929675@montanaro.dyndns.org> Barry> I totally agree with Martin. Make bsddb185 explicit and do not Barry> masquerade it as bsddb by default. Skip> Okay, that's fine with me. How about http://python.org/sf/727137 then? I think dbhash should consider bsddb185 as a possibility. That would make Nick Vargish's anydbm programs keep running I think.
Skip From skip@pobox.com Sat May 3 15:28:52 2003 From: skip@pobox.com (Skip Montanaro) Date: Sat, 3 May 2003 09:28:52 -0500 Subject: [Python-Dev] Re: [Pydotorg] updated notes about building bsddb185 module In-Reply-To: <3EB320E9.4020409@v.loewis.de> References: <3EB320E9.4020409@v.loewis.de> Message-ID: <16051.53796.553202.289905@montanaro.dyndns.org> >> - where bsddb is imported, try importing bsddb, and if that >> fails try importing bsddb185 as bsddb (or as * inside the bsddb pkg). Martin> I'm strongly opposed to that. Users of bsddb185 need to make an Martin> explicit choice that they want to use that library. Otherwise, Martin> we would have to deal with the bug reports resulting from the Martin> brokenness of the library forever. Yeah, but there are places in the core library (like anydbm via dbhash) which import bsddb and are generally going to be out of control of end users. I think those places need to consider bsddb185 as a possibility. I already posted a link to a SF patch. Skip From dave@boost-consulting.com Sat May 3 17:45:10 2003 From: dave@boost-consulting.com (David Abrahams) Date: Sat, 03 May 2003 12:45:10 -0400 Subject: [Python-Dev] Timbot? Message-ID: This has probably already been spotted, but in case it hasn't... I just googled for Timbot and found: http://www.cse.ogi.edu/~mpj/timbot/#Programming -- Dave Abrahams Boost Consulting www.boost-consulting.com From gward@python.net Sat May 3 20:21:31 2003 From: gward@python.net (Greg Ward) Date: Sat, 3 May 2003 15:21:31 -0400 Subject: [Python-Dev] optparse docs need proofreading Message-ID: <20030503192131.GA4689@cthulhu.gerg.ca> So you're sitting around, wondering what to do with your weekend, and worrying that the Python 2.3 documentation is not perfect yet. Well, you could proofread the documentation for optparse (currently section 6.20 of the "lib" manual), which was converted wholesale from reStructuredText to LaTeX, and still bears some scars. 
Both the DVI/PS/PDF output and HTML bear close examination. I'm working on it now, but will undoubtedly miss stuff, so feel free to email any glitches you notice in the latest CVS version to me. Greg -- Greg Ward http://www.gerg.ca/ From tim.one@comcast.net Sun May 4 06:26:09 2003 From: tim.one@comcast.net (Tim Peters) Date: Sun, 04 May 2003 01:26:09 -0400 Subject: [Python-Dev] Re: heaps In-Reply-To: <5841710.1051745776@[10.0.1.2]> Message-ID: [Tim] >> ... >> priorityDictionary looks like an especially nice API for this specific >> algorithm, but, e.g., impossible to use directly for maintaining an N-best >> queue (priorityDictionary doesn't support multiple values with >> the same priority, right? That was wrong: the dict maps items to priorities, and I read it backwards. Sorry! >> if we're trying to find the 10,000 poorest people in America, counting >> only one as dead broke would be too Republican for some people's tastes >> ). OTOH, heapq is easy and efficient for *that* class of heap >> application. [David Eppstein] > I agree with your main points (heapq's inability to handle > certain priority queue applications doesn't mean it's useless, and > its implementation-specific API helps avoid fooling programmers into > thinking it's any more than what it is). But I am confused at this > example. Surely it's just as easy to store (income,identity) tuples in > either data structure. As above, I was inside out. "Just as easy" can't be answered without trying to write actual code, though. Given that heapq and priorityDictionary are both min-heaps, to avoid artificial pain let's look for the people with the N highest incomes instead.
For an N-best queue using heapq, "the natural" thing is to define people like so: class Person: def __init__(self, income): self.income = income def __cmp__(self, other): return cmp(self.income, other.income) and then the N-best calculation is as follows; it's normal in N-best applications that N is much smaller than the number of items being ranked, and you don't want to consume more than O(N) memory (for example, google wants to show you the best-scoring 25 documents of the 6 million matches it found): """ # N-best queue for people with the N largest incomes. import heapq dummy = Person(-1) # effectively an income of -Inf q = [dummy] * N # it's fine to use the same object N times for person in people: if person > q[0]: heapq.heapreplace(q, person) # The result list isn't sorted. result = [person for person in q if person is not dummy] """ I'm not as experienced with priorityDictionary. For heapq, the natural __cmp__ is the one that compares objects' priorities. For priorityDictionary, we can't use that, because Person instances will be used as dict keys, and then two Persons with the same income couldn't be in the queue at the same time. So Person.__cmp__ will have to change in such a way that distinct Persons never compare equal. I also have to make sure that a Person is hashable. I see there's another subtlety, apparent only from reading the implementation code: in the heap maintained alongside the dict, it's actually (priority, object) tuples that get compared. Since I expect to see Persons with equal income, when two such tuples get compared, they'll tie on the priority, and go on to compare the Persons. So I have to be sure too that comparing two Persons is cheap. Pondering all that for a while, it seems best to make sure Person doesn't define __cmp__ or __hash__ at all.
Then instances will get compared by memory address, distinct Persons will never compare equal, comparing Persons is cheap, and hashing by memory address is cheap too:

    class Person:
        def __init__(self, income):
            self.income = income

The N-best code is then:

    """
    q = priorityDictionary()
    for dummy in xrange(N):
        q[Person(-1)] = -1  # have to ensure these are distinct Persons
    for person in people:
        if person.income > q.smallest().income:
            del q[q.smallest()]
            q[person] = person.income
    # The result list is sorted.
    result = [person for person in q if person.income != -1]
    """

Perhaps paradoxically, I had to know essentially everything about how priorityDictionary is implemented to write a correct and efficient algorithm here. That was true of heapq too, of course, but there were fewer subtleties to trip over there, and heapq isn't trying to hide its implementation. BTW, there's a good use of heapq for you: you could use it to maintain the under-the-covers heap inside priorityDictionary! It would save much of the code, and possibly speed it too (heapq's implementation of popping usually requires substantially fewer comparisons than priorityDictionary.smallest uses; this is explained briefly in the comments before _siftup, deferring to Knuth for the gory details). > If you mean, you want to find the 10k smallest income values (rather than > the people having those incomes), then it may be that a better data > structure would be a simple list L in which the value of L[i] is > the count of people with income i. Well, leaving pennies out of it, incomes in the USA span 9 digits, so something taking O(N) memory would still be most attractive.
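[Editorial note: Tim's "BTW" — maintaining priorityDictionary's internal heap with heapq — can be sketched as follows. This is a hypothetical minimal class, not Eppstein's actual priorityDictionary; the id(key) tiebreaker keeps keys from ever being compared, sidestepping the cheap-comparison subtlety discussed above, and stale heap entries are discarded lazily.]

```python
import heapq

class MinPriorityDict(dict):
    """Maps items to priorities; smallest() returns an item of minimum
    priority.  The internal heap is maintained with heapq and cleaned lazily."""
    def __init__(self):
        super().__init__()
        self._heap = []
    def __setitem__(self, key, priority):
        super().__setitem__(key, priority)
        # id(key) breaks priority ties, so key objects are never compared.
        heapq.heappush(self._heap, (priority, id(key), key))
    def smallest(self):
        while True:
            priority, _, key = self._heap[0]
            if self.get(key) == priority:
                return key
            heapq.heappop(self._heap)  # stale: key deleted or re-prioritized

q = MinPriorityDict()
q["alice"] = 3
q["bob"] = 1
q["carol"] = 2
print(q.smallest())  # bob
del q["bob"]         # plain dict deletion; the heap entry goes stale
print(q.smallest())  # carol
```

Each key may leave behind stale heap entries, but each is popped at most once, so the amortized cost stays O(log n) per operation.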
From eppstein@ics.uci.edu Sun May 4 06:46:58 2003 From: eppstein@ics.uci.edu (David Eppstein) Date: Sat, 03 May 2003 22:46:58 -0700 Subject: [Python-Dev] Re: heaps In-Reply-To: References: Message-ID: <17342817.1052002018@[10.0.1.2]> On 5/4/03 1:26 AM -0400 Tim Peters wrote: > it's normal in N-best applications that N is much smaller than the number > of items being ranked, and you don't want to consume more than O(N) > memory (for example, google wants to show you the best-scoring 25 > documents of the 6 million matches it found): Ok, I think you're right, for this sort of thing heapq is better. One could extend my priorityDictionary code to limit memory like this but it would be unnecessary work when the extra features it has over heapq are not used for this sort of algorithm. On the other hand, if you really want to find the n best items in a data stream large enough that you care about using only space O(n), it might also be preferable to take constant amortized time per item rather than the O(log n) that heapq would use, and it's not very difficult nor does it require any fancy data structures. Some time back I needed some Java code for this, haven't had an excuse to port it to Python. In case anyone's interested, it's online at . Looking at it now, it seems more complicated than it needs to be, but maybe that's just the effect of writing in Java instead of Python (I've seen an example of a three-page Java implementation of an algorithm in a textbook that could easily be done in a dozen Python lines). -- David Eppstein http://www.ics.uci.edu/~eppstein/ Univ. 
of California, Irvine, School of Information & Computer Science From eppstein@ics.uci.edu Sun May 4 08:54:21 2003 From: eppstein@ics.uci.edu (David Eppstein) Date: Sun, 04 May 2003 00:54:21 -0700 Subject: [Python-Dev] Re: heaps References: <17342817.1052002018@[10.0.1.2]> Message-ID: In article <17342817.1052002018@[10.0.1.2]>, David Eppstein wrote: > On the other hand, if you really want to find the n best items in a data > stream large enough that you care about using only space O(n), it might > also be preferable to take constant amortized time per item rather than the > O(log n) that heapq would use, and it's not very difficult nor does it > require any fancy data structures. Some time back I needed some Java code > for this, haven't had an excuse to port it to Python. In case anyone's > interested, it's online at > . BTW, the central idea here is to use a random quicksort pivot to shrink the list, when it grows too large. In python, this could be done without randomization as simply as

    def addToNBest(L, x, N):
        L.append(x)
        if len(L) > 2*N:
            L.sort()
            del L[N:]

It's not constant amortized time due to the sort, but that's probably more than made up for due to the speed of compiled sort versus interpreted randomized pivot. -- David Eppstein http://www.ics.uci.edu/~eppstein/ Univ.
of California, Irvine, School of Information & Computer Science From skip@mojam.com Sun May 4 13:00:24 2003 From: skip@mojam.com (Skip Montanaro) Date: Sun, 4 May 2003 07:00:24 -0500 Subject: [Python-Dev] Weekly Python Bug/Patch Summary Message-ID: <200305041200.h44C0OY12616@manatee.mojam.com>

Bug/Patch Summary
-----------------

423 open / 3606 total bugs (+17)
137 open / 2130 total patches (+10)

New Bugs
--------

mmap's resize method resizes the file in win32 but not unix (2003-04-27) http://python.org/sf/728515
Long file names in osa suites (2003-04-27) http://python.org/sf/728574
ConfigurePython gives depreaction warning (2003-04-27) http://python.org/sf/728608
super bug (2003-04-28) http://python.org/sf/729103
building readline module fails on Irix 6.5 (2003-04-28) http://python.org/sf/729236
What's new in Python2.3b1 HTML generation. (2003-04-28) http://python.org/sf/729297
comparing versions - one a float (2003-04-28) http://python.org/sf/729317
rexec not listed as dead (2003-04-29) http://python.org/sf/729817
MacPython-OS9 eats CPU while waiting for I/O (2003-04-29) http://python.org/sf/729871
metaclasses, __getattr__, and special methods (2003-04-29) http://python.org/sf/729913
socketmodule.c: inet_pton() expects 4-byte packed_addr (2003-04-30) http://python.org/sf/730222
Unexpected Changes in list Iterator (2003-04-30) http://python.org/sf/730296
Not detecting AIX_GENUINE_CPLUSPLUS (2003-04-30) http://python.org/sf/730467
Python 2.3 bsddb docs need update (2003-05-01) http://python.org/sf/730938
HTTPRedirectHandler variable out of scope (2003-05-01) http://python.org/sf/730963
urllib2 raises AttributeError on redirect (2003-05-01) http://python.org/sf/731116
test_tarfile writes in Lib/test directory (2003-05-02) http://python.org/sf/731403
Importing anydbm generates exception if _bsddb unavailable (2003-05-02) http://python.org/sf/731501
Pimp needs to be able to update itself (2003-05-02) http://python.org/sf/731626
OSX installer .pkg file permissions (2003-05-02) http://python.org/sf/731631
Package Manager needs Help menu (2003-05-02) http://python.org/sf/731635
IDE "lookup in documentation" doesn't work in interactive wi (2003-05-02) http://python.org/sf/731643
GIL not released around getaddrinfo() (2003-05-02) http://python.org/sf/731644
An extended definition of "non-overlapping" would save time. (2003-05-04) http://python.org/sf/732120
Clarification of "pos" and "endpos" for match objects. (2003-05-04) http://python.org/sf/732124

New Patches
-----------

Fixes for setup.py in Mac/OSX/Docs (2003-04-27) http://python.org/sf/728744
test_timeout updates (2003-04-28) http://python.org/sf/728815
Compiler warning on Solaris 8 (2003-04-28) http://python.org/sf/729305
Dictionary tuning (2003-04-29) http://python.org/sf/729395
Add Py_AtInit() startup hook for extenders (2003-04-30) http://python.org/sf/730473
assert from longobject.c, line 1215 (2003-04-30) http://python.org/sf/730594
RTEMS does not have a popen (2003-04-30) http://python.org/sf/730597
socketmodule inet_ntop built when IPV6 is disabled (2003-04-30) http://python.org/sf/730603
pimp.py has old URL for default database (2003-05-01) http://python.org/sf/731151
redirect fails in urllib2 (2003-05-01) http://python.org/sf/731153
AssertionError when building rpm under RedHat 9.1 (2003-05-02) http://python.org/sf/731328
make threading join() method return a value (2003-05-02) http://python.org/sf/731607
SpawnedGenerator class for threading module (2003-05-02) http://python.org/sf/731701
find correct socklen_t type (2003-05-03) http://python.org/sf/731991
exit status of latex2html "ignored" (2003-05-04) http://python.org/sf/732143

Closed Bugs
-----------

"es#" parser marker leaks memory (2002-01-10) http://python.org/sf/501716
math.fabs documentation is misleading (2003-03-22) http://python.org/sf/708205
Lineno calculation sometimes broken (2003-03-24) http://python.org/sf/708901
Put a reference to print in the Library Reference, please. (2003-04-17) http://python.org/sf/723136
imaplib should convert line endings to be rfc2822 complient (2003-04-18) http://python.org/sf/723962
socketmodule doesn't compile on strict POSIX systems (2003-04-20) http://python.org/sf/724588
SRE bug with capturing groups in alternatives in repeats (2003-04-21) http://python.org/sf/725106
valgrind python fails (2003-04-24) http://python.org/sf/727051
tmpnam problems on windows 2.3b, breaks test.test_os (2003-04-26) http://python.org/sf/728097

Closed Patches
--------------

fix for bug 501716 (2003-02-11) http://python.org/sf/684981
OpenVMS complementary patches (2003-03-23) http://python.org/sf/708495
unchecked return values - compile.c (2003-03-23) http://python.org/sf/708604
Cause pydoc to show data descriptor __doc__ strings (2003-03-29) http://python.org/sf/711902
timeouts for FTP connect (and other supported ops) (2003-04-03) http://python.org/sf/714592
Modules/addrinfo.h patch (2003-04-22) http://python.org/sf/725942
Remove extra line ending in CGI XML-RPC responses (2003-04-25) http://python.org/sf/727805

From m@moshez.org Sun May 4 19:55:44 2003 From: m@moshez.org (Moshe Zadka) Date: 4 May 2003 18:55:44 -0000 Subject: [Python-Dev] Distutils using apply Message-ID: <20030504185544.6010.qmail@green.zadka.com> Hi! I haven't seen this come up yet -- why is distutils still using apply? It causes warnings to be emitted when building packages with Python 2.3 and -Wall, and is altogether unclean. Is this just a matter of checking in a patch? Or submitting one to SF? Or is there a real desire to be compatible to Python 1.5.2? Thanks, Moshe -- Moshe Zadka -- http://moshez.org/ Buffy: I don't like you hanging out with someone that... short. Riley: Yeah, a lot of young people nowadays are experimenting with shortness.
Agile Programming Language -- http://www.python.org/ From goodger@python.org Sun May 4 20:18:04 2003 From: goodger@python.org (David Goodger) Date: Sun, 04 May 2003 15:18:04 -0400 Subject: [Python-Dev] Distutils using apply In-Reply-To: <20030504185544.6010.qmail@green.zadka.com> References: <20030504185544.6010.qmail@green.zadka.com> Message-ID: <3EB5676C.1000900@python.org> Moshe Zadka wrote: > Or is there a real desire to be compatible to Python 1.5.2? PEP 291 lists distutils as requiring 1.5.2 compatibility. -- David Goodger From Raymond Hettinger" After more dictionary sparseness experiments, I've become convinced that the ideal settings are better left up to the user who is in a better position to know:

* anticipated dictionary size
* overall application memory issues
* characteristic access patterns (stores vs. reads vs. deletions vs. iteration)
* when the dictionary is growing, shrinking, or stabilized
* whether many deletions have taken place

I have two competing proposals to expose dictresize():

1) d.resize(minsize=0)

The first approach allows a user to trigger a resize(). This is handy after deletions have taken place and dictionary contents have become stable. It allows the dictionary to be rebuilt without dummy entries. If the minsize factor is specified, then the dictionary will be built to the specified size or larger if needed to achieve a power of two or to accommodate existing entries. That is handy when building a dictionary whose approximate size is known in advance because it eliminates all of the intermediate resizes during construction. For instance, the builtin dictionary can be pre-sized for the 126 entries and it will build more quickly. It is also useful after dictionary contents have stabilized and the user wants improved lookup time at the expense of additional memory and slower iteration time. For instance, the builtin dictionary can be resized to 500 entries making it so sparse that the lookups will typically hit on the first try.
This API requires a little user sophistication because the effects get wiped out during the next automatic resize (when the dict is two-thirds full).

2) d.setsparsity(factor=1)

The second approach does not allow dictionaries to be pre-sized, but the effects do not get wiped out by normal dictionary activity. It is handy when a particular dictionary's lookup/insertion time is more important than iteration time or space considerations. For instance, the builtin dictionary can be set to a sparsity factor of four so that lookups are more rapid. Raymond Hettinger From drifty@alum.berkeley.edu Mon May 5 00:27:16 2003 From: drifty@alum.berkeley.edu (Brett Cannon) Date: Sun, 4 May 2003 16:27:16 -0700 (PDT) Subject: [Python-Dev] Dictionary sparseness In-Reply-To: <001301c31290$fcea25e0$125ffea9@oemcomputer> References: <001301c31290$fcea25e0$125ffea9@oemcomputer> Message-ID: [Raymond Hettinger] > I have two competing proposals to expose dictresize(): > > 1) d.resize(minsize=0) > > The first approach allows a user to trigger a resize(). This is handy > after deletions have taken place and dictionary contents have become > stable. It allows the dictionary to be rebuilt without dummy entries. The issue I see with this is people going overboard with calls to this. I can easily imagine a new Python programmer calling this after every insertion or deletion into the dictionary. I can even see experienced programmers getting trapped into this by coming up with a size and then coding themselves into a corner by trying to maintain the size. I also see people coding a size that is optimal and then changing their code but forgetting to change the value passed to the method, thus negating the perk of having this option set. > 2) d.setsparsity(factor=1) > > The second approach does not allow dictionaries to be pre-sized, > but the effects do not get wiped out by normal dictionary activity. This is more reasonable.
Since it is a factor, it will make sense to beginners who view it as a sliding scale, and it also allows more experienced programmers to set it to where they know they want the performance. And setting the value will more than likely be good no matter how the code is changed since the use of the dictionary will most likely stay consistent. Does either hinder dictionary performance just by introducing the possible functionality? I am -1 on 'resize' and +0, teetering on +1, for setsparsity. I will kick over to +1 if someone else out there with more experience with newbies can say strongly that they don't see them messing up with this option. -Brett P.S.: Thanks, Raymond, for doing all of this work and documenting it so well. From guido@python.org Mon May 5 01:34:33 2003 From: guido@python.org (Guido van Rossum) Date: Sun, 04 May 2003 20:34:33 -0400 Subject: [Python-Dev] Dictionary sparseness In-Reply-To: "Your message of Sun, 04 May 2003 18:59:46 EDT." <001301c31290$fcea25e0$125ffea9@oemcomputer> References: <001301c31290$fcea25e0$125ffea9@oemcomputer> Message-ID: <200305050034.h450YXx23808@pcp02138704pcs.reston01.va.comcast.net>

> After more dictionary sparseness experiments, I've become
> convinced that the ideal settings are better left up to the user
> who is in a better position to know:
>
> * anticipated dictionary size
> * overall application memory issues
> * characteristic access patterns (stores vs. reads vs. deletions
>   vs. iteration)
> * when the dictionary is growing, shrinking, or stablized.
> * whether many deletions have taken place

Hm. Maybe so, but it *is* a feature that there are no user controls over dictionary behavior, based on the observation that for every user who knows enough about the dict implementation to know how to tweak it, there are at least 1000 who don't, and the latter, in their ill-advised quest for more speed, will use the tweakage API to their detriment.
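[Editorial note: to make the sparseness/first-probe discussion concrete, here is a toy open-addressing table. Everything in it is made up for illustration — `toy_hash` and the table code are not CPython's dict implementation; only the probe recurrence mimics the shape of the one in dictobject.c. The point it demonstrates is the one under debate: the same keys in a sparser table yield more first-probe hits.]

```python
def toy_hash(s):
    # Deterministic stand-in for hash(), so runs are repeatable.
    x = 0
    for ch in s:
        x = (x * 31 + ord(ch)) & 0xFFFFFFFF
    return x

def first_probe_hits(keys, size):
    """Insert keys into an open-addressing table of the given power-of-two
    size, then count how many lookups succeed on the very first probe."""
    mask = size - 1
    table = [None] * size
    for k in keys:
        i = toy_hash(k) & mask
        perturb = toy_hash(k)
        while table[i] is not None:          # collision: follow probe sequence
            i = ((i << 2) + i + perturb + 1) & mask
            perturb >>= 5
        table[i] = k
    return sum(1 for k in keys if table[toy_hash(k) & mask] == k)

keys = ["key%d" % n for n in range(100)]
for size in (128, 512, 4096):
    print(size, first_probe_hits(keys, size))
```

Running it shows the first-probe hit count rising as the table gets sparser, which is the effect Raymond's sparsity proposals aim to buy, at the memory and iteration cost Guido weighs against it.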
--Guido van Rossum (home page: http://www.python.org/~guido/) From skip@pobox.com Mon May 5 03:06:26 2003 From: skip@pobox.com (Skip Montanaro) Date: Sun, 4 May 2003 21:06:26 -0500 Subject: [Python-Dev] Distutils using apply In-Reply-To: <3EB5676C.1000900@python.org> References: <20030504185544.6010.qmail@green.zadka.com> <3EB5676C.1000900@python.org> Message-ID: <16053.50978.292901.471132@montanaro.dyndns.org> >> Or is there a real desire to be compatible to Python 1.5.2? David> PEP 291 lists distutils as requiring 1.5.2 compatibility. Then should distutils be suppressing those warnings? Skip From tim.one@comcast.net Mon May 5 03:20:09 2003 From: tim.one@comcast.net (Tim Peters) Date: Sun, 04 May 2003 22:20:09 -0400 Subject: [Python-Dev] Re: heaps In-Reply-To: <17342817.1052002018@[10.0.1.2]> Message-ID: This is a multi-part message in MIME format. --Boundary_(ID_19ec9GZ0WH09Yh6NFIvyew) Content-type: text/plain; charset=iso-8859-1 Content-transfer-encoding: 7BIT [David Eppstein] > Ok, I think you're right, for this sort of thing heapq is better. > One could extend my priorityDictionary code to limit memory like > this but it would be unnecessary work when the extra features it > has over heapq are not used for this sort of algorithm. I don't believe memory usage was an issue here. Take a look at the code again (comments removed):

    """
    q = priorityDictionary()
    for dummy in xrange(N):
        q[Person(-1)] = -1
    for person in people:
        if person.income > q.smallest().income:
            del q[q.smallest()]
            q[person] = person.income
    """

q starts with N entries. Each trip around the loop either leaves the q contents alone, or both removes and adds an entry. So the size of the dict is a loop invariant, len(q) == N. In the cases where it does remove an entry, it always removes the smallest entry, and the entry being added is strictly larger than that, so calling q.smallest() at the start of the next loop trip finds the just-deleted smallest entry still in self.__heap[0], and removes it.
So the internal list may grow to N+1 entries immediately following del q[q.smallest()] but by the time we get to that line again it should be back to N entries again. The reasons I found heapq easier to live with in this specific app had more to do with the subtleties involved in sidestepping potential problems with __hash__, __cmp__, and the speed of tuple comparison when the first tuple elements tie. heapq also supplies a "remove current smallest and replace with a new value" primitive, which happens to be just right for this app (that isn't an accident ):

    """
    dummy = Person(-1)
    q = [dummy] * N
    for person in people:
        if person > q[0]:
            heapq.heapreplace(q, person)
    """

> On the other hand, if you really want to find the n best items in a data > stream large enough that you care about using only space O(n), it might > also be preferable to take constant amortized time per item rather than > the O(log n) that heapq would use, In practice, it's usually much faster than that. Over time, it gets rarer and rarer for person > q[0] to be true (the new person has to be larger than the N-th largest seen so far, and that bar gets raised whenever a new person manages to hurdle it), and the vast majority of sequence elements are disposed of via that single Python statement (the ">" test fails, and we move on to the next element with no heap operations). In the simplest case, if N==1, the incoming data is randomly ordered, and the incoming sequence has M elements, the if-test is true (on average) only ln(M) times (the expected number of left-to-right maxima). The order statistics get more complicated as N increases, of course, but in practice it remains very fast, and doing a heapreplace() on every incoming item is the worst case (achieved if the items come in sorted order; the best case is when they come in reverse-sorted order, in which case min(M, N) heapreplace() operations are done).
Some time back I needed some Java code > for this, haven't had an excuse to port it to Python. In case anyone's > interested, it's online at > . > Looking at it now, it seems more complicated than it needs to be, but > maybe that's just the effect of writing in Java instead of Python > (I've seen an example of a three-page Java implementation of an > algorithm in a textbook that could easily be done in a dozen Python > lines). Cool! I understood the thrust but not the details -- and I agree Java must be making it harder than it should be . > In python, this could be done without randomization as simply as

> def addToNBest(L,x,N):
>     L.append(x)
>     if len(L) > 2*N:
>         L.sort()
>         del L[N:]

> It's not constant amortized time due to the sort, but that's probably > more than made up for due to the speed of compiled sort versus > interpreted randomized pivot. I'll attach a little timing script. addToNBest is done inline there, some low-level tricks were played to speed it, and it was changed to be a max N-best instead of a min N-best. Note that the list sort in 2.3 has a real advantage over Pythons before 2.3 here, because it recognizes (in linear time) that the first half of the list is already in sorted order (on the second & subsequent sorts), and leaves it alone until a final merge step with the other half of the array. The relative speed (compared to the heapq code) varies under 2.3, seeming to depend mostly on M/N. The test case is set up to find the 1000 largest of a million random floats. In that case the sorting method takes about 3.4x longer than the heapq approach. As N gets closer to M, the sorting method eventually wins; when M and N are both a million, the sorting method is 10x faster. For most N-best apps, N is much smaller than M, and the heapq code should be quicker unless the data is already in order.
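[Editorial note: Tim's "usually much faster" claim is easy to check empirically. With N == 1, the guard fires exactly once per left-to-right maximum, and the expected number of those in a random sequence of length M is about ln(M). A quick simulation, not from the thread:]

```python
import math
import random

random.seed(12345)          # arbitrary seed, for repeatability
M, trials = 10000, 200
total = 0
for _ in range(trials):
    best = -1.0
    for x in (random.random() for _ in range(M)):
        if x > best:        # with N == 1 the heap is touched only here
            best = x
            total += 1
average = total / float(trials)
print(average, math.log(M))  # both land in the same ballpark (roughly 9-10)
```

So out of 10000 elements, only about ten ever reach the heap; everything else is rejected by the single comparison, which is why the heapq approach beats its O(M log N) worst case so comfortably on random data.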
--Boundary_(ID_19ec9GZ0WH09Yh6NFIvyew) Content-type: text/plain; name=timeq.py Content-transfer-encoding: 7BIT Content-disposition: attachment; filename=timeq.py

def one(seq, N):
    from heapq import heapreplace
    L = [-1] * N
    for x in seq:
        if x > L[0]:
            heapreplace(L, x)
    L.sort()
    return L

def two(seq, N):
    L = []
    push = L.append
    twoN = 2*N
    for x in seq:
        push(x)
        if len(L) > twoN:
            L.sort()
            del L[:-N]
    L.sort()
    del L[:-N]
    return L

def timeit(seq, N):
    from time import clock as now
    s = now()
    r1 = one(seq, N)
    t = now()
    e1 = t - s
    s = now()
    r2 = two(seq, N)
    t = now()
    e2 = t - s
    print len(seq), N, e1, e2
    assert r1 == r2

def tryone(M, N):
    from random import random
    seq = [random() for dummy in xrange(M)]
    timeit(seq, N)

for i in range(10):
    tryone(1000000, 1000)

--Boundary_(ID_19ec9GZ0WH09Yh6NFIvyew)-- From python@rcn.com Mon May 5 03:22:08 2003 From: python@rcn.com (Raymond Hettinger) Date: Sun, 4 May 2003 22:22:08 -0400 Subject: [Python-Dev] Dictionary sparseness References: <001301c31290$fcea25e0$125ffea9@oemcomputer> <200305050034.h450YXx23808@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <003301c312ad$2113e520$125ffea9@oemcomputer>

> > After more dictionary sparseness experiments, I've become
> > convinced that the ideal settings are better left up to the user
> > who is in a better position to know:
> >
> > * anticipated dictionary size
> > * overall application memory issues
> > * characteristic access patterns (stores vs. reads vs. deletions
> >   vs. iteration)
> > * when the dictionary is growing, shrinking, or stablized.
> > * whether many deletions have taken place
>
> Hm. Maybe so, but it *is* a feature that there are no user controls
> over dictionary behavior, based on the observation that for every user
> who knows enough about the dict implementation to know how to tweak
> it, there are at least 1000 who don't, and the latter, in their
> ill-advised quest for more speed, will use the tweakage API to their
> detriment.
Perhaps there should be safety-belts and kindergarten controls: d.pack(fat=False) --> None. Reclaims deleted entries. If optional fat argument is true, the internal size is doubled resulting in potentially faster lookups at the expense of slower iteration and more memory. This ought to be both safe and simple. Raymond Hettinger P.S. Also, I think it worthwhile to at least transform dictresize() into PyDict_Resize() so that C extensions will have some control. This would make it possible for us to add a single line making the builtin dictionary more sparse and providing a 75% first probe hit rate. From skip@pobox.com Mon May 5 03:24:31 2003 From: skip@pobox.com (Skip Montanaro) Date: Sun, 4 May 2003 21:24:31 -0500 Subject: [Python-Dev] Dictionary sparseness In-Reply-To: <001301c31290$fcea25e0$125ffea9@oemcomputer> References: <001301c31290$fcea25e0$125ffea9@oemcomputer> Message-ID: <16053.52063.690466.272706@montanaro.dyndns.org> Raymond> After more dictionary sparseness experiments, I've become Raymond> convinced that the ideal settings are better left up to the Raymond> user who is in a better position to know: Speaking as a moderately sophisticated Python programmer, I can tell you I wouldn't have the slightest idea what the properties of my applications' dictionary usage is. Unless I'm going to get a major league speedup (like factor of two or greater) tweaking these settings, I don't see that they'd benefit me. Skip From python@rcn.com Mon May 5 03:26:47 2003 From: python@rcn.com (Raymond Hettinger) Date: Sun, 4 May 2003 22:26:47 -0400 Subject: [Python-Dev] Re: heaps References: Message-ID: <003f01c312ad$c7277580$125ffea9@oemcomputer> > The relative speed (compared to the heapq code) varies under 2.3, seeming to > depend mostly on M/N. The test case is set up to find the 1000 largest of a > million random floats. In that case the sorting method takes about 3.4x > longer than the heapq approach. 
> As N gets closer to M, the sorting method eventually wins; when M and N
> are both a million, the sorting method is 10x faster. For most N-best
> apps, N is much smaller than M, and the heapq code should be quicker
> unless the data is already in order.

FWIW, there is a C implementation of heapq at: http://zhar.net/projects/python/ Raymond Hettinger From tim.one@comcast.net Mon May 5 04:00:09 2003 From: tim.one@comcast.net (Tim Peters) Date: Sun, 04 May 2003 23:00:09 -0400 Subject: [Python-Dev] Dictionary sparseness In-Reply-To: <003301c312ad$2113e520$125ffea9@oemcomputer> Message-ID: [Raymond Hettinger] > ... > P.S. Also, I think it worthwhile to at least transform dictresize() > into PyDict_Resize() so that C extensions will have some control. > This would make it possible for us to add a single line making > the builtin dictionary more sparse and providing a 75% first probe > hit rate. The dynamic hit rate is the one that counts, and, e.g., it's not going to speed anything to remove the current lowest-8-but-not-lowest-9-bits collision between 'ArithmeticError' and 'reload' (I've never seen the former used, and the latter is expensive). IOW, measuring the dynamic first-probe hit rate is a prerequisite to selling this idea; a stronger prerequisite is demonstrating actual before-and-after speedups. I agree with Guido that giving people controls they're ill-equipped to understand will do more harm than good. Even when they manage to stumble into a small speedup, that will often become counterproductive over time, as the characteristics of their ever-growing app change, and the Speed Weenie who got the 2% speedup left, or moved on to some other project. Or somebody corrects the option name from 'smalest' to 'smallest', and suddenly the only dict entry that mattered doesn't collide anymore -- but the mystery knob boosting the dict size "because it sped things up" forever more wastes half the space for a reason nobody ever understood.
Or we change Python's string hash to use addition instead of xor to merge in the next character (a change that may actually help a bit -- addition is a little better at scrambling the bits). Etc. it's-python-it's-supposed-to-be-slow-ly y'rs - tim From eppstein@ics.uci.edu Mon May 5 04:26:29 2003 From: eppstein@ics.uci.edu (David Eppstein) Date: Sun, 04 May 2003 20:26:29 -0700 Subject: [Python-Dev] Re: heaps In-Reply-To: References: Message-ID: <20414172.1052079989@[10.0.1.2]> On 5/4/03 10:20 PM -0400 Tim Peters wrote: > In practice, it's usually much faster than that. Over time, it gets rarer > and rarer for > > person > q[0] > > to be true (the new person has to be larger than the N-th largest seen so > far, and that bar gets raised whenever a new person manages to hurdle it), Good point. If any permutation of the input sequence is equally likely, and you're selecting the best k out of n items, the expected number of times you have to hit the data structure in your heapq solution is roughly k ln n, so the total expected time is O(n + k log k log n), with a really small constant factor on the O(n) term. The sorting solution I suggested has total time O(n log k), and even though sorting is built-in and fast it can't compete when k is small. Random pivoting is O(n + k), but with a larger constant factor, so your heapq solution looks like a winner. For fairness, it might be interesting to try another run of your test in which the input sequence is sorted in increasing order rather than random. I.e., replace the random generation of seq by

    seq = range(M)

I'd try it myself, but I'm still running python 2.2 and haven't installed heapq. I'd have to know more about your application to have an idea whether the sorted or randomly-permuted case is more representative. -- David Eppstein http://www.ics.uci.edu/~eppstein/ Univ.
of California, Irvine, School of Information & Computer Science From oren-py-d@hishome.net Mon May 5 06:23:35 2003 From: oren-py-d@hishome.net (Oren Tirosh) Date: Mon, 5 May 2003 01:23:35 -0400 Subject: [Python-Dev] Distutils using apply In-Reply-To: <20030504185544.6010.qmail@green.zadka.com> References: <20030504185544.6010.qmail@green.zadka.com> Message-ID: <20030505052335.GA37311@hishome.net> On Sun, May 04, 2003 at 06:55:44PM -0000, Moshe Zadka wrote: > Hi! > I haven't seen this come up yet -- why is distutils still using apply? > It causes warnings to be emitted when building packages with Python 2.3 > and -Wall, and is altogether unclean. > > Is this just a matter of checking in a patch? Or submitting one to SF? > Or is there a real desire to be compatible to Python 1.5.2? I was wondering if a milder form of deprecation may be appropriate for some features such as the apply builtin:

1. Add a notice in docstring 'not recommended for new code'
2. Move to 'obsolete' or 'backward compatibility' section in manual
3. Do NOT produce a warning (pychecker may still do that)
4. Do NOT plan removal of feature in a specific future release

Oren From martin@v.loewis.de Mon May 5 06:55:56 2003 From: martin@v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=) Date: 05 May 2003 07:55:56 +0200 Subject: [Python-Dev] Distutils using apply In-Reply-To: <16053.50978.292901.471132@montanaro.dyndns.org> References: <20030504185544.6010.qmail@green.zadka.com> <3EB5676C.1000900@python.org> <16053.50978.292901.471132@montanaro.dyndns.org> Message-ID: Skip Montanaro writes: > David> PEP 291 lists distutils as requiring 1.5.2 compatibility. > > Then should distutils be suppressing those warnings? This isn't trivial: the warnings module is not available in Python 1.5.2.
Regards, Martin From m@moshez.org Mon May 5 07:40:51 2003 From: m@moshez.org (Moshe Zadka) Date: 5 May 2003 06:40:51 -0000 Subject: [Python-Dev] Distutils using apply In-Reply-To: References: , <20030504185544.6010.qmail@green.zadka.com> <3EB5676C.1000900@python.org> <16053.50978.292901.471132@montanaro.dyndns.org> Message-ID: <20030505064051.29353.qmail@green.zadka.com> [Trimming CC list] On 05 May 2003, martin@v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=) wrote: > This isn't trivial: the warnings module is not available in Python > 1.5.2. Yes it is (trivial, not in 1.5.2):

    try:
        import warnings
    except ImportError:
        pass
    else:
        ...disable warnings...

Thanks, Moshe -- Moshe Zadka -- http://moshez.org/ Buffy: I don't like you hanging out with someone that... short. Riley: Yeah, a lot of young people nowadays are experimenting with shortness. Agile Programming Language -- http://www.python.org/ From mal@lemburg.com Mon May 5 08:41:05 2003 From: mal@lemburg.com (M.-A. Lemburg) Date: Mon, 05 May 2003 09:41:05 +0200 Subject: [Python-Dev] Distutils using apply In-Reply-To: <20030504185544.6010.qmail@green.zadka.com> References: <20030504185544.6010.qmail@green.zadka.com> Message-ID: <3EB61591.5070204@lemburg.com> Moshe Zadka wrote: > Hi! > I haven't seen this come up yet -- why is distutils still using apply? > It causes warnings to be emitted when building packages with Python 2.3 > and -Wall, and is altogether unclean. Could someone please explain why apply() was marked deprecated ? The only reference I can find is in PEP 290 and that merely reports this "fact". I'm -1 on deprecating apply(). Not only because it introduces yet another incompatibility between Python versions, but also because it is still useful in the context of having a function which mimics a function call, e.g. for map() and other instances where you pass around functions as operators. > Is this just a matter of checking in a patch? Or submitting one to SF?
> Or is there a real desire to be compatible to Python 1.5.2? Yes. It was decided that Python 2.3 will ship with the last version of distutils that is Python 1.5.2 compatible. After that it may drop that compatibility and become Python 2.0 compatible. -- Marc-Andre Lemburg eGenix.com Professional Python Software directly from the Source (#1, May 05 2003) >>> Python/Zope Products & Consulting ... http://www.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ EuroPython 2003, Charleroi, Belgium: 50 days left From python@rcn.com Mon May 5 10:28:51 2003 From: python@rcn.com (Raymond Hettinger) Date: Mon, 5 May 2003 05:28:51 -0400 Subject: [Python-Dev] Dictionary sparseness References: Message-ID: <001501c312e8$bd892420$125ffea9@oemcomputer> > it's-python-it's-supposed-to-be-slow-ly y'rs - tim Oh, now you tell me. I've got about a hundred failed experiments that provide slowdowns ranging from modest to excruciating. Take your pick. My favorite: Eliminating the test for dummy entry re-use ended up hurting every benchmark and completely destroying a couple of them. Raymond From guido@python.org Mon May 5 12:47:39 2003 From: guido@python.org (Guido van Rossum) Date: Mon, 05 May 2003 07:47:39 -0400 Subject: [Python-Dev] Dictionary sparseness In-Reply-To: "Your message of Sun, 04 May 2003 22:22:08 EDT." <003301c312ad$2113e520$125ffea9@oemcomputer> References: <001301c31290$fcea25e0$125ffea9@oemcomputer> <200305050034.h450YXx23808@pcp02138704pcs.reston01.va.comcast.net> <003301c312ad$2113e520$125ffea9@oemcomputer> Message-ID: <200305051147.h45Bldw24692@pcp02138704pcs.reston01.va.comcast.net> > > Hm. 
Maybe so, but it *is* a feature that there are no user controls > > over dictionary behavior, based on the observation that for every user > > who knows enough about the dict implementation to know how to tweak > > it, there are at least 1000 who don't, and the latter, in their > > ill-advised quest for more speed, will use the tweakage API to their > > detriment. > > Perhaps there should be safety-belts and kindergarten controls: > > d.pack(fat=False) --> None. Reclaims deleted entries. > If optional fat argument is true, the internal size is doubled > resulting in potentially faster lookups at the expense of > slower iteration and more memory. > > This ought to be both safe and simple. And a waste of time except in the most rare circumstances. > Raymond Hettinger > > > P.S. Also, I think it worthwhile to at least transform dictresize() > into PyDict_Resize() so that C extensions will have some control. > This would make it possible for us to add a single line making > the builtin dictionary more sparse and providing a 75% first probe > hit rate. And that would give *how much* of a performance improvement of typical applications? Sorry, I really think that you're complexificating APIs here without sufficient gain. I really value the work you've done on figuring out how to improve dicts, but I think you've come to know the code too well to see the other side of the coin. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Mon May 5 13:02:08 2003 From: guido@python.org (Guido van Rossum) Date: Mon, 05 May 2003 08:02:08 -0400 Subject: [Python-Dev] Distutils using apply In-Reply-To: "Your message of Mon, 05 May 2003 09:41:05 +0200." <3EB61591.5070204@lemburg.com> References: <20030504185544.6010.qmail@green.zadka.com> <3EB61591.5070204@lemburg.com> Message-ID: <200305051202.h45C28E24795@pcp02138704pcs.reston01.va.comcast.net> > Could someone please explain why apply() was marked deprecated ? 
Because it's more readable, more efficient, and more flexible to write f(x, y, *t) than apply(f, (x, y) + t). > The only reference I can find is in PEP 290 and that merely > reports this "fact". > > I'm -1 on deprecating apply(). Not only because it introduces yet > another incompatibility between Python versions, but also because it > is still useful in the context of having a function which mimics > a function call, e.g. for map() and other instances where you > pass around functions as operators. Then maybe we should add something like operator.__call__. OTOH, you're lucky that map isn't deprecated yet in favor of list comprehensions; I expect that Python 3.0 won't have map or filter either. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Mon May 5 13:03:58 2003 From: guido@python.org (Guido van Rossum) Date: Mon, 05 May 2003 08:03:58 -0400 Subject: [Python-Dev] Distutils using apply In-Reply-To: "Your message of Mon, 05 May 2003 01:23:35 EDT." <20030505052335.GA37311@hishome.net> References: <20030504185544.6010.qmail@green.zadka.com> <20030505052335.GA37311@hishome.net> Message-ID: <200305051204.h45C3w424811@pcp02138704pcs.reston01.va.comcast.net> > I was wondering if a milder form of deprecation may be appropriate for > some features such as the apply builtin: > > 1. Add a notice in docstring 'not recommended for new code' > 2. Move to 'obsolete' or 'backward compatibility' section in manual > 3. Do NOT produce a warning (pychecker may still do that) > 4. Do NOT plan removal of feature in a specific future release The form of deprecation used for apply() is already very mild (you don't get a warning unless you do -Wall). I don't think Moshe's use case is important enough to care; if Moshe cares, he can easily construct a command line argument or warnings.filterwarnings() call to suppress the warnings he doesn't care about.
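[Editor's note: the per-message filter Guido refers to can be sketched as below. The message pattern is an assumption for illustration; since apply() and its DeprecationWarning are long gone, the sketch raises its own warnings to show the filter working.]

```python
import warnings

# The kind of filter Guido suggests Moshe could install himself:
# ignore DeprecationWarnings whose message mentions apply(),
# while letting all other warnings through.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    warnings.filterwarnings("ignore", message=".*apply.*",
                            category=DeprecationWarning)
    warnings.warn("apply() is deprecated", DeprecationWarning)   # filtered out
    warnings.warn("something else is deprecated", DeprecationWarning)

print(len(caught))  # only the unfiltered warning is recorded
```

The same filter can equally be installed from the command line with `-W "ignore::DeprecationWarning"`, which is the "command line argument" alternative Guido mentions.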
--Guido van Rossum (home page: http://www.python.org/~guido/) From mal@lemburg.com Mon May 5 13:30:30 2003 From: mal@lemburg.com (M.-A. Lemburg) Date: Mon, 05 May 2003 14:30:30 +0200 Subject: [Python-Dev] Distutils using apply In-Reply-To: <200305051202.h45C28E24795@pcp02138704pcs.reston01.va.comcast.net> References: <20030504185544.6010.qmail@green.zadka.com> <3EB61591.5070204@lemburg.com> <200305051202.h45C28E24795@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <3EB65966.6090005@lemburg.com> Guido van Rossum wrote: >>Could someone please explain why apply() was marked deprecated ? > > Because it's more readable, more efficient, and more flexible to write > f(x, y, *t) than apply(f, (x, y) + t). True, but it's in wide use out there, so it shouldn't go until Python 3 is out the door. BTW, shouldn't these deprecations be listed in, e.g., PEP 4 ? There doesn't seem to be a single place to look for deprecated features and APIs (PEP 4 only lists modules). I find it rather troublesome that deprecation seems to be using stealth mode of operation in Python development -- discussions about it rarely surface until someone complains about a warning relating to it. There should be open discussions about whether or not to deprecate functionality. >>The only reference I can find is in PEP 290 and that merely >>reports this "fact". >> >>I'm -1 on deprecating apply(). Not only because it introduces yet >>another incompatibility between Python versions, but also because it >>is still useful in the context of having a function which mimics >>a function call, e.g. for map() and other instances where you >>pass around functions as operators. > > Then maybe we should add something like operator.__call__. Why remove a common API and reinvent it somewhere else ? > OTOH, you're lucky that map isn't deprecated yet in favor of list > comprehensions; I expect that Python 3.0 won't have map or filter either.
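[Editor's note: Guido's "something like operator.__call__" did eventually materialize, as operator.call() in Python 3.11. For older interpreters the equivalent is a two-line helper; the sketch below uses a fallback so it runs anywhere, and shows the "function as operator" use with map() that MAL describes.]

```python
import operator

# operator.call exists on Python 3.11+; the lambda is a stand-in
# with the same behavior for older versions.
call = getattr(operator, "call", None) or (lambda f, *a, **kw: f(*a, **kw))

# Passing a callable "function call" around, e.g. to map():
result = list(map(call, [abs, len, max], [-3, "spam", (1, 9, 2)]))
print(result)  # [3, 4, 9]
```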
-- Marc-Andre Lemburg eGenix.com Professional Python Software directly from the Source (#1, May 05 2003) >>> Python/Zope Products & Consulting ... http://www.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ EuroPython 2003, Charleroi, Belgium: 50 days left From oren-py-d@hishome.net Mon May 5 13:50:07 2003 From: oren-py-d@hishome.net (Oren Tirosh) Date: Mon, 5 May 2003 08:50:07 -0400 Subject: [Python-Dev] Distutils using apply In-Reply-To: <200305051204.h45C3w424811@pcp02138704pcs.reston01.va.comcast.net> References: <20030504185544.6010.qmail@green.zadka.com> <20030505052335.GA37311@hishome.net> <200305051204.h45C3w424811@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <20030505125007.GA20312@hishome.net> On Mon, May 05, 2003 at 08:03:58AM -0400, Guido van Rossum wrote: > > I was wondering if a milder form of deprecation may be appropriate for > > some features such as the apply builtin: > > > > 1. Add a notice in docstring 'not recommended for new code' > > 2. Move to 'obsolete' or 'backward compatibility' section in manual > > 3. Do NOT produce a warning (pychecker may still do that) > > 4. Do NOT plan removal of feature in a specific future release > > The form of deprecation used for apply() is already very mild (you > don't get a warning unless you do -Wall). I don't think Moshe's use > case is important enough to care; if Moshe cares, he can easily > construct a command line argument or warnings.filterwarning() call to > suppress the warnings he doesn't care about. My comment was not specifically about Moshe's use case - it's about the meaning of deprecation in Python. Does it always have to mean "start replacing because it *will* go away" as seems to be implied by PEP 5 or perhaps in some cases it could just mean "please don't use this in new code, okay" ? 
Oren From guido@python.org Mon May 5 14:47:12 2003 From: guido@python.org (Guido van Rossum) Date: Mon, 05 May 2003 09:47:12 -0400 Subject: [Python-Dev] Distutils using apply In-Reply-To: Your message of "Mon, 05 May 2003 14:30:30 +0200." <3EB65966.6090005@lemburg.com> References: <20030504185544.6010.qmail@green.zadka.com> <3EB61591.5070204@lemburg.com> <200305051202.h45C28E24795@pcp02138704pcs.reston01.va.comcast.net> <3EB65966.6090005@lemburg.com> Message-ID: <200305051347.h45DlCp30562@odiug.zope.com> > Guido van Rossum wrote: > >>Could someone please explain why apply() was marked deprecated ? > > > > Becase it's more readable, more efficient, and more flexible to write > > f(x, y, *t) than apply(f, (x, y) + t). > > True, but it's in wide use out there, so it shouldn't go until > Python 3 is out the door. And it won't. But that doesn't mean we can't add a PendingDeprecation warning for it. > BTW, shouldn't these deprecations be listed in e.g PEP 4 ? > > There doesn't seem to be a single place to look for deprecated > features and APIs (PEP 4 only lists modules). That's a problem indeed. > I find it rather troublesome that deprecation seems to be using > stealth mode of operation in Python development -- discussions > about it rarely surface until someone complains about a warning > relating to it. There should be open discussions about whether > or not to deprecate functionality. I believe the discussions are open enough (things like this are never decided at PythonLabs, but always brought out on python-dev). But it's easy to miss these discussions, and the records aren't always clear. > > Then maybe we should add something like operator.__call__. > > Why remove a common API and reinvent it somewhere else ? To reflect its demoted status. 
--Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Mon May 5 14:50:05 2003 From: guido@python.org (Guido van Rossum) Date: Mon, 05 May 2003 09:50:05 -0400 Subject: [Python-Dev] Distutils using apply In-Reply-To: Your message of "Mon, 05 May 2003 08:50:07 EDT." <20030505125007.GA20312@hishome.net> References: <20030504185544.6010.qmail@green.zadka.com> <20030505052335.GA37311@hishome.net> <200305051204.h45C3w424811@pcp02138704pcs.reston01.va.comcast.net> <20030505125007.GA20312@hishome.net> Message-ID: <200305051350.h45Do5c30595@odiug.zope.com> > My comment was not specifically about Moshe's use case - it's about > the meaning of deprecation in Python. > > Does it always have to mean "start replacing because it *will* go > away" as seems to be implied by PEP 5 or perhaps in some cases it > could just mean "please don't use this in new code, okay" ? I think that can be safely left up to the individual programmer, who has a better idea (hopefully) on the life expectancy of his code. We try to give guidance about the urgency of the deprecation e.g. in PEPs or by using the normally-silent PendingDeprecation (which suggests it's not urgent :-). --Guido van Rossum (home page: http://www.python.org/~guido/) From aahz@pythoncraft.com Mon May 5 14:52:01 2003 From: aahz@pythoncraft.com (Aahz) Date: Mon, 5 May 2003 09:52:01 -0400 Subject: [Python-Dev] Dictionary sparseness In-Reply-To: <001501c312e8$bd892420$125ffea9@oemcomputer> References: <001501c312e8$bd892420$125ffea9@oemcomputer> Message-ID: <20030505135201.GA14870@panix.com> How about this: when we create read-only dicts, you add an optional argument that re-packs the dict and optimizes for space or speed. That way, the dict can be analyzed to provide appropriate results. 
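[Editor's note: the "re-pack" Aahz describes never got a public API, but its observable effect is available today by simply rebuilding the dict, which allocates a fresh internal table with no deleted-entry dummies. A small sketch; the exact table sizes are CPython internals, so only the visible contract is checked.]

```python
import sys

# Build a dict, delete most keys (leaving dummy slots in the internal
# table, which CPython does not shrink on deletion), then "pack" it.
d = {i: str(i) for i in range(1000)}
for i in range(990):
    del d[i]

packed = dict(d)  # fresh table sized for the 10 surviving entries

print(packed == d)                                  # same contents
print(sys.getsizeof(packed) <= sys.getsizeof(d))    # never a bigger table
```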
-- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "In many ways, it's a dull language, borrowing solid old concepts from many other languages & styles: boring syntax, unsurprising semantics, few automatic coercions, etc etc. But that's one of the things I like about it." --Tim Peters on Python, 16 Sep 93 From skip@pobox.com Mon May 5 15:34:14 2003 From: skip@pobox.com (Skip Montanaro) Date: Mon, 5 May 2003 09:34:14 -0500 Subject: [Python-Dev] How to test this? Message-ID: <16054.30310.489999.134263@montanaro.dyndns.org> I just added a patch file to . It doesn't include any test cases, since that requires an old db hash v2 file present. Is it okay to check in a dummy file to Lib/test for this purpose? Thanks, Skip From BPettersen@NAREX.com Mon May 5 15:55:02 2003 From: BPettersen@NAREX.com (Bjorn Pettersen) Date: Mon, 5 May 2003 08:55:02 -0600 Subject: [Python-Dev] Windows installer request... Message-ID: <60FB8BB7F0EFC7409B75EEEC13E20192022DE1FD@admin56.narex.com> Would it be possible for the Windows installer to use $SYSTEMDRIVE$ as the default installation drive instead of C:? (On my XP box, C: is my zip-drive, and E: is my SYSTEMDRIVE(*) -- I'm now re-installing :-) If it's considered a good idea, and someone can point me to where the change has to be made, I'd be more than willing to produce a patch... -- bjorn (*) Don't ask, MS wisdom I guess. Oh, and if you don't have a C: drive, all your WinExplorer icons disappear (subst C: a: and it works :-) In any case, I'm not brave enough to try to change it .
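[Editor's note: the value Bjorn wants the installer to use is exposed on NT-family Windows as the SystemDrive environment variable. A sketch, falling back to the hardcoded C: he complains about when the variable is unset (e.g. on non-Windows boxes); the Python23 directory name is a hypothetical stand-in for the installer's default.]

```python
import os

# NT/2000/XP set SystemDrive; fall back to the Wise script's hardcoded C:
system_drive = os.environ.get("SystemDrive", "C:")
default_install_dir = system_drive + r"\Python23"  # hypothetical target name

print(default_install_dir.endswith(r"\Python23"))
```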
From oren-py-d@hishome.net Mon May 5 15:58:06 2003 From: oren-py-d@hishome.net (Oren Tirosh) Date: Mon, 5 May 2003 10:58:06 -0400 Subject: [Python-Dev] Distutils using apply In-Reply-To: <200305051350.h45Do5c30595@odiug.zope.com> References: <20030504185544.6010.qmail@green.zadka.com> <20030505052335.GA37311@hishome.net> <200305051204.h45C3w424811@pcp02138704pcs.reston01.va.comcast.net> <20030505125007.GA20312@hishome.net> <200305051350.h45Do5c30595@odiug.zope.com> Message-ID: <20030505145806.GA46311@hishome.net> On Mon, May 05, 2003 at 09:50:05AM -0400, Guido van Rossum wrote: > > My comment was not specifically about Moshe's use case - it's about > > the meaning of deprecation in Python. > > > > Does it always have to mean "start replacing because it *will* go > > away" as seems to be implied by PEP 5 or perhaps in some cases it > > could just mean "please don't use this in new code, okay" ? > > I think that can be safely left up to the individual programmer, who > has a better idea (hopefully) on the life expectancy of his code. We > try to give guidance about the urgency of the deprecation e.g. in PEPs > or by using the normally-silent PendingDeprecation (which suggests > it's not urgent :-). I'm afraid this is too subtle for me. I'll ask my question a third time, hoping for an answer that a mere mortal can understand: Are all deprecated features on death row or are some of them merely serving a life sentence? Oren "Do not meddle in the affairs of BDFLs, for they are subtle and quick to anger" From guido@python.org Mon May 5 16:10:43 2003 From: guido@python.org (Guido van Rossum) Date: Mon, 05 May 2003 11:10:43 -0400 Subject: [Python-Dev] Distutils using apply In-Reply-To: Your message of "Mon, 05 May 2003 10:58:06 EDT." 
<20030505145806.GA46311@hishome.net> References: <20030504185544.6010.qmail@green.zadka.com> <20030505052335.GA37311@hishome.net> <200305051204.h45C3w424811@pcp02138704pcs.reston01.va.comcast.net> <20030505125007.GA20312@hishome.net> <200305051350.h45Do5c30595@odiug.zope.com> <20030505145806.GA46311@hishome.net> Message-ID: <200305051510.h45FAhY31026@odiug.zope.com> > Are all deprecated features on death row or are some of them merely > serving a life sentence? They are all slated to go away. --Guido van Rossum (home page: http://www.python.org/~guido/) From tim.one@comcast.net Mon May 5 16:14:07 2003 From: tim.one@comcast.net (Tim Peters) Date: Mon, 05 May 2003 11:14:07 -0400 Subject: [Python-Dev] Windows installer request... In-Reply-To: <60FB8BB7F0EFC7409B75EEEC13E20192022DE1FD@admin56.narex.com> Message-ID: [Bjorn Pettersen] > Would it be possible for the windows installer to use $SYSTEMDRIVE$ as > the default installation drive instead of C:? (On my XP box, C: is my > zip-drive, and E: is my SYSTEMDRIVE(*) -- I'm now re-installing :-) Are you saying that the "Select Destination Directory" dialog box doesn't allow you to select your E: drive? Or just that you'd rather not need to select the drive you want? > If it's considered a good idea, and someone can point me to where the > change has to be made, I'd be more than willing to produce a patch... I apparently left this comment in the Wise script: Note from Tim: doesn't seem to be a way to get the true boot drive, the Wizard hardcodes "C". So, AFAIK, there isn't a straightforward way to get Wise 8.14 to suggest a drive other than C:. Perhaps it would work better for you if I removed the Wizard-generated hardcoded "C:" (I don't know which drive Wise would pick then), but since yours is the only complaint about this I've seen, and I have no way to test such a change, I'm very reluctant to fiddle with it. 
From Jack.Jansen@oratrix.com Mon May 5 16:35:55 2003 From: Jack.Jansen@oratrix.com (Jack Jansen) Date: Mon, 5 May 2003 17:35:55 +0200 Subject: [Python-Dev] Dictionary sparseness In-Reply-To: <001301c31290$fcea25e0$125ffea9@oemcomputer> Message-ID: <42A588E4-7F0F-11D7-B65D-003065517236@oratrix.com> I sort-of agree with Guido that any calls to optimize dictionaries may do more good than bad, but I think that if we make the interface sufficiently abstract we may have something that may work. I was thinking of something analogous to madvise(): the user can specify high level access patterns. For Python dictionaries the access patterns would probably be - I'm going to write a lot of stuff - I'm done writing, and from now on I'm mainly going to read - I haven't a clue what I'm going to do Especially the "I'm going to read from now on" could be put to good use, for instance after completing the dictionary of a class. -- - Jack Jansen http://www.cwi.nl/~jack - - If I can't dance I don't want to be part of your revolution -- Emma Goldman - From skip@pobox.com Mon May 5 16:43:55 2003 From: skip@pobox.com (Skip Montanaro) Date: Mon, 5 May 2003 10:43:55 -0500 Subject: [Python-Dev] Dictionary sparseness In-Reply-To: <42A588E4-7F0F-11D7-B65D-003065517236@oratrix.com> References: <001301c31290$fcea25e0$125ffea9@oemcomputer> <42A588E4-7F0F-11D7-B65D-003065517236@oratrix.com> Message-ID: <16054.34491.64051.134832@montanaro.dyndns.org> Jack> I was thinking of something analogous to madvise(): ... Quick, everyone who's used madvise() please raise your hand... I'll bet a beer most people (even on this list) have never put it to good use. We all know Tim probably has just because he's Tim, and apparently Jack has. Anyone else? Guido, have you ever been tempted? 
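[Editor's note: Jack's madvise()-style interface for dicts, with its three access patterns, could look roughly like the toy below. All names are hypothetical; nothing like this exists in CPython. As with madvise(), advice is allowed to be a no-op, which is part of what makes the idea safe to expose.]

```python
class AdvisableDict(dict):
    """A dict accepting the three advisory access patterns Jack lists."""
    WRITE_HEAVY = "write-heavy"
    READ_MOSTLY = "read-mostly"
    UNKNOWN = "unknown"

    def advise(self, pattern):
        if pattern == self.READ_MOSTLY:
            # "Done writing": rebuild so deleted-entry dummies are dropped
            # and subsequent lookups walk shorter collision chains.
            items = list(self.items())
            self.clear()
            self.update(items)
        # WRITE_HEAVY and UNKNOWN are deliberately no-ops here.

d = AdvisableDict((i, i * i) for i in range(100))
for i in range(90):
    del d[i]
d.advise(AdvisableDict.READ_MOSTLY)
print(len(d), d[95])  # 10 9025
```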
Skip From guido@python.org Mon May 5 16:58:44 2003 From: guido@python.org (Guido van Rossum) Date: Mon, 05 May 2003 11:58:44 -0400 Subject: [Python-Dev] Dictionary sparseness In-Reply-To: Your message of "Mon, 05 May 2003 10:43:55 CDT." <16054.34491.64051.134832@montanaro.dyndns.org> References: <001301c31290$fcea25e0$125ffea9@oemcomputer> <42A588E4-7F0F-11D7-B65D-003065517236@oratrix.com> <16054.34491.64051.134832@montanaro.dyndns.org> Message-ID: <200305051558.h45Fwi531325@odiug.zope.com> > Jack> I was thinking of something analogous to madvise(): ... > > Quick, everyone who's used madvise() please raise your hand... I'll bet a > beer most people (even on this list) have never put it to good use. We all > know Tim probably has just because he's Tim, and apparently Jack has. > Anyone else? Guido, have you ever been tempted? What's madvise()? :-) --Guido van Rossum (home page: http://www.python.org/~guido/) From paul@pfdubois.com Mon May 5 17:13:17 2003 From: paul@pfdubois.com (Paul Dubois) Date: Mon, 5 May 2003 09:13:17 -0700 Subject: [Python-Dev] Election of Todd Miller as head of numpy team Message-ID: <000001c31321$3dc958c0$6801a8c0@NICKLEBY> Todd Miller has been elected as the new Head of the Numeric Python development team. I am still an active developer, but it was time to rotate responsibilities. We especially need help with Numeric maintenance while Todd is working on Numarray. Thanks to all of you who helped me during my tenure. Remember, when you see Todd, the expected greeting to the NummieHead is a salute with more than one finger, accompanied by the cry, "Ni Ni Numpy!". See the file DEVELOPERS in the distribution for our "constitution". 
Paul From aleax@aleax.it Mon May 5 17:42:02 2003 From: aleax@aleax.it (Alex Martelli) Date: Mon, 5 May 2003 18:42:02 +0200 Subject: [Python-Dev] Dictionary sparseness In-Reply-To: <16054.34491.64051.134832@montanaro.dyndns.org> References: <001301c31290$fcea25e0$125ffea9@oemcomputer> <42A588E4-7F0F-11D7-B65D-003065517236@oratrix.com> <16054.34491.64051.134832@montanaro.dyndns.org> Message-ID: <200305051842.02937.aleax@aleax.it> On Monday 05 May 2003 05:43 pm, Skip Montanaro wrote: > Jack> I was thinking of something analogous to madvise(): ... > > Quick, everyone who's used madvise() please raise your hand... I'll bet a > beer most people (even on this list) have never put it to good use. We all > know Tim probably has just because he's Tim, and apparently Jack has. I used madvise extensively (and quite successfully) back when I was the senior software consultant responsible for the lower-levels of a variety of Unix-system ports of a line of mechanical CAD products. And I loved and still love the general concept -- let me advise an optimizer (so it can do whatever -- be it a little or a lot -- rather than spend energy trying to guess what in blazes I may be doing:-). Alex From guido@python.org Mon May 5 17:47:06 2003 From: guido@python.org (Guido van Rossum) Date: Mon, 05 May 2003 12:47:06 -0400 Subject: [Python-Dev] Dictionary sparseness In-Reply-To: Your message of "Mon, 05 May 2003 18:42:02 +0200." <200305051842.02937.aleax@aleax.it> References: <001301c31290$fcea25e0$125ffea9@oemcomputer> <42A588E4-7F0F-11D7-B65D-003065517236@oratrix.com> <16054.34491.64051.134832@montanaro.dyndns.org> <200305051842.02937.aleax@aleax.it> Message-ID: <200305051647.h45Gl6N04048@odiug.zope.com> > I used madvise extensively (and quite successfully) back when I was > the senior software consultant responsible for the lower-levels of a > variety of Unix-system ports of a line of mechanical CAD products. 
> And I loved and still love the general concept -- let me advise an > optimizer (so it can do whatever -- be it a little or a lot -- > rather than spend energy trying to guess what in blazes I may be > doing:-). Hm. How do you know that you were successful? I could think of an implementation that's similar to those "press to cross" buttons you see at some intersections, and which seem to have no effect whatsoever on the traffic lights. :-) --Guido van Rossum (home page: http://www.python.org/~guido/) From jeremy@zope.com Mon May 5 18:11:50 2003 From: jeremy@zope.com (Jeremy Hylton) Date: 05 May 2003 13:11:50 -0400 Subject: [Python-Dev] Dictionary sparseness In-Reply-To: <200305051842.02937.aleax@aleax.it> References: <001301c31290$fcea25e0$125ffea9@oemcomputer> <42A588E4-7F0F-11D7-B65D-003065517236@oratrix.com> <16054.34491.64051.134832@montanaro.dyndns.org> <200305051842.02937.aleax@aleax.it> Message-ID: <1052154710.12534.14.camel@slothrop.zope.com> On Mon, 2003-05-05 at 12:42, Alex Martelli wrote: > On Monday 05 May 2003 05:43 pm, Skip Montanaro wrote: > > Jack> I was thinking of something analogous to madvise(): ... > > > > Quick, everyone who's used madvise() please raise your hand... I'll bet a > > beer most people (even on this list) have never put it to good use. We all > > know Tim probably has just because he's Tim, and apparently Jack has. > > I used madvise extensively (and quite successfully) back when I was the > senior software consultant responsible for the lower-levels of a variety of > Unix-system ports of a line of mechanical CAD products. And I loved and > still love the general concept -- let me advise an optimizer (so it can do > whatever -- be it a little or a lot -- rather than spend energy trying to > guess what in blazes I may be doing:-). Have you seen the work on gray-box systems? http://www.cs.wisc.edu/graybox/ The philosophy of this project seems to be "You can observe an awful lot just by watching." (Apologies to Yogi.)
The approach is to learn how a particular service is implemented, e.g. what buffer-replacement algorithm is used, by observing its behavior. Then write an application that exploits that knowledge to drive the system into optimized behavior for the application. No madvise() necessary. I wonder if the same can be done for dicts? My first guess would be no, because the sparseness is a fixed policy. Jeremy From aleax@aleax.it Mon May 5 18:22:53 2003 From: aleax@aleax.it (Alex Martelli) Date: Mon, 5 May 2003 19:22:53 +0200 Subject: [Python-Dev] Dictionary sparseness In-Reply-To: <200305051647.h45Gl6N04048@odiug.zope.com> References: <001301c31290$fcea25e0$125ffea9@oemcomputer> <200305051842.02937.aleax@aleax.it> <200305051647.h45Gl6N04048@odiug.zope.com> Message-ID: <200305051922.53855.aleax@aleax.it> On Monday 05 May 2003 06:47 pm, Guido van Rossum wrote: > > I used madvise extensively (and quite successfully) back when I was > > the senior software consultant responsible for the lower-levels of a > > variety of Unix-system ports of a line of mechanical CAD products. > > And I loved and still love the general concept -- let me advise an > > optimizer (so it can do whatever -- be it a little or a lot -- > > rather than spend energy trying to guess what in blazes I may be > > doing:-). > > Hm. How do you know that you were succesful? I could think of an By measuring applications' performance on important benchmarks (mostly not artificial ones, but rather actual benchmarks used in the past by some customers to help them choose which CAD package to buy -- we treasured those, at that firm, and had built up quite a portfolio of them over the years). 
As CPUs and floating-point units became fast enough, more and more of the speed issues with so-called "CPU intensive" bottlenecks in mechanical-engineering CAD actually became related to memory-access patterns (a phenomenon I had already observed when I worked on IBM multi-CPU mainframes with vector-units, being sold as "supercomputers" but in fact still having complex and deep memory hierarchies -- Cray guys of the time such as Tim no doubt had it easier!-). > implementation that's similar to those "press to cross" buttons you > see at some intersections, and which seem to have no effect whatsoever > on the traffic lights. :-) Yes, there were a few of those, too. That's part of what's cool about an "advise" operation: it IS quite OK to implement it as a no-op, both in the early times when you're moving an existing API to some new platform, AND in (hypothetical:-) late maturity when your optimizer's pattern-detector has become able to outsmart the programmer on a regular basis. C's "register" keyword is a familiar example: it was quite precious in very early compilers with nearly nonexistent optimizers, it was regularly ignored in new compilers for very limited (and particularly register-limited) platforms, and it's invariably ignored now that optimizers have become able to allocate registers better than most programmers. (It should probably have been a #pragma rather than eat up a reserved word, but that's just syntactic-level hindsight:-). Alex From aleax@aleax.it Mon May 5 18:36:20 2003 From: aleax@aleax.it (Alex Martelli) Date: Mon, 5 May 2003 19:36:20 +0200 Subject: [Python-Dev] Dictionary sparseness In-Reply-To: <1052154710.12534.14.camel@slothrop.zope.com> References: <001301c31290$fcea25e0$125ffea9@oemcomputer> <200305051842.02937.aleax@aleax.it> <1052154710.12534.14.camel@slothrop.zope.com> Message-ID: <200305051936.20078.aleax@aleax.it> On Monday 05 May 2003 07:11 pm, Jeremy Hylton wrote: ... > Have you seen the work on gray-box systems? 
> > http://www.cs.wisc.edu/graybox/ > > The philosophy of this project seems to be "You can observe an awful lot > just by watching." (Apologies to Yogi.) The approach is to learn how a > particular service is implemented, e.g. what buffer-replacement > algorithm is used, by observing its behavior. Then write an application > that exploits that knowledge to drive the system into optimized behavior > for the application. No madvise() necessary. Haven't read that URL, but this seems to summarize the way we had to work with Fortran compilers on 3090-VF's back in the late '80s -- no way to explicitly advise the compiler about what and how to vectorize, so, lots of experimentation and tweaking to find out what (expletive deleted) heuristics the GD beast was using, and how to outsmart it and get it to vectorize what *WE* wanted rather than what *IT* thought was good for us. What fun! And of course we got to redo it all over again when a new compiler release came out. No thanks. I've paid my dues and I hope I will *NEVER* again have to work with a system that thinks it's so smart it doesn't need my advisory input -- or at least not on anything that's as performance-crucial as those Fortran programs were (most of my work in IBM Research I did with Rexx -- that's when I learned to love scripting! -- but then and again we did have to crunch really huge batches of numbers). Alex From jeremy@zope.com Mon May 5 18:41:35 2003 From: jeremy@zope.com (Jeremy Hylton) Date: 05 May 2003 13:41:35 -0400 Subject: [Python-Dev] Dictionary sparseness In-Reply-To: <200305051936.20078.aleax@aleax.it> References: <001301c31290$fcea25e0$125ffea9@oemcomputer> <200305051842.02937.aleax@aleax.it> <1052154710.12534.14.camel@slothrop.zope.com> <200305051936.20078.aleax@aleax.it> Message-ID: <1052156494.12531.27.camel@slothrop.zope.com> On Mon, 2003-05-05 at 13:36, Alex Martelli wrote: > No thanks.
I've paid my dues and I hope I will *NEVER* again have to > work with a system that thinks it's so smart it doesn't need my advisory > input -- or at least not on anything that's as performance-crucial as > those Fortran programs were (most of my work in IBM Research in > did with Rexx -- that's when I learned to love scripting! -- but then and > again we did have to crunch really huge batches of numbers). I think the graybox project is assuming that few people will have the luxury of working with a system that accepts useful advisory input. Given that hypothesis, they built a tool for identifying what algorithm is being used so that it can be tweaked appropriately. Jeremy From guido@python.org Mon May 5 18:46:28 2003 From: guido@python.org (Guido van Rossum) Date: Mon, 05 May 2003 13:46:28 -0400 Subject: [Python-Dev] Dictionary sparseness In-Reply-To: Your message of "Mon, 05 May 2003 19:36:20 +0200." <200305051936.20078.aleax@aleax.it> References: <001301c31290$fcea25e0$125ffea9@oemcomputer> <200305051842.02937.aleax@aleax.it> <1052154710.12534.14.camel@slothrop.zope.com> <200305051936.20078.aleax@aleax.it> Message-ID: <200305051746.h45HkS009569@odiug.zope.com> > No thanks. I've paid my dues and I hope I will *NEVER* again have to > work with a system that thinks it's so smart it doesn't need my advisory > input -- or at least not on anything that's as performance-crucial as > those Fortran programs were [...] I severely doubt that any Python apps are as performance-critical as those Fortran programs were. --Guido van Rossum (home page: http://www.python.org/~guido/) From tim.one@comcast.net Mon May 5 18:40:18 2003 From: tim.one@comcast.net (Tim Peters) Date: Mon, 05 May 2003 13:40:18 -0400 Subject: [Python-Dev] Dictionary sparseness In-Reply-To: <1052154710.12534.14.camel@slothrop.zope.com> Message-ID: [Jeremy Hylton] > Have you seen the work on gray-box systems? 
> > http://www.cs.wisc.edu/graybox/ > > The philosophy of this project seems to be "You can observe an awful lot > just by watching." (Apologies to Yogi.) The approach is to learn how a > particular service is implemented, e.g. what buffer-replacement > algorithm is used, by observing its behavior. Then write an application > that exploits that knowledge to drive the system into optimized behavior > for the application. No madvise() necessary. > > I wonder if the same can be done for dicts? My first guess would be no, > because the sparseness is a fixed policy. Well, a dict suffers damaging collisions or it doesn't. If it does, the best thing a user can do is rebuild the dict from scratch, inserting keys by decreasing order of access frequency. Then the most frequently accessed keys come earliest in their collision chains. Collisions simply don't matter for rarely referenced keys. (And, for example, if there *are* any truly damaging collisions in __builtin__.__dict__, I expect this gimmick would remove the damage.) The size of the dict can be forced larger by inserting artificial keys, if a user is insane . It's always been possible to eliminate dummy entries by doing "dict = dict.copy()". Note that because Python exposes the hash function used by dicts, you can write a faithful low-level dict emulator in Python, and deduce what effects a sequence of dict inserts and deletes will have. So, overall, I expect there's more you *could* do to speed dict access (in the handful of bad cases it's not already good enough) yourself than Python could do for you. You'd have to be nuts, though -- or writing papers on gray-box systems. From tim.one@comcast.net Mon May 5 18:54:41 2003 From: tim.one@comcast.net (Tim Peters) Date: Mon, 05 May 2003 13:54:41 -0400 Subject: [Python-Dev] Dictionary sparseness In-Reply-To: <200305051922.53855.aleax@aleax.it> Message-ID: [Alex Martelli] > ... 
> As CPUs and floating-point units became fast enough, more and more of > the speed issues with so-called "CPU intensive" bottlenecks in > mechanical-engineering CAD actually became related to memory-access > patterns (a phenomenon I had already observed when I worked on IBM > multi-CPU mainframes with vector-units, being sold as "supercomputers" > but in fact still having complex and deep memory hierarchies -- Cray guys > of the time such as Tim no doubt had it easier!-). Indeed, Seymour Cray used to say a supercomputer is a machine that transforms a CPU-bound program into an I/O-bound program, and didn't want anything "in between" complicating that view. As a result, optimizing programs to run on Crays was, while still arbitrarily difficult, generally a monotonic process, rarely beset by "mysterious regressions" along the way. Now that gigabyte+ RAM boxes are becoming common, I wonder when someone will figure out that the VM machinery is just slowing them down <0.9 wink>.

From aleax@aleax.it Mon May 5 18:57:25 2003 From: aleax@aleax.it (Alex Martelli) Date: Mon, 5 May 2003 19:57:25 +0200 Subject: [Python-Dev] Dictionary sparseness In-Reply-To: <200305051746.h45HkS009569@odiug.zope.com> References: <001301c31290$fcea25e0$125ffea9@oemcomputer> <200305051936.20078.aleax@aleax.it> <200305051746.h45HkS009569@odiug.zope.com> Message-ID: <200305051957.25403.aleax@aleax.it> On Monday 05 May 2003 07:46 pm, Guido van Rossum wrote: > > No thanks. I've paid my dues and I hope I will *NEVER* again have to > > work with a system that thinks it's so smart it doesn't need my advisory > > input -- or at least not on anything that's as performance-crucial as > > those Fortran programs were [...] > > I severely doubt that any Python apps are as performance-critical as > those Fortran programs were. Yes, this may well be correct.
My only TRUE wish for tuning performance of Python applications is to have SOME ways to measure memory footprints with sensible guesses about where they come from -- THAT is where I might gain hugely (by fighting excessive working sets through selective flushing of caches, freelists, etc). Alex From jeremy@zope.com Mon May 5 19:12:12 2003 From: jeremy@zope.com (Jeremy Hylton) Date: 05 May 2003 14:12:12 -0400 Subject: [Python-Dev] Dictionary sparseness In-Reply-To: <200305051957.25403.aleax@aleax.it> References: <001301c31290$fcea25e0$125ffea9@oemcomputer> <200305051936.20078.aleax@aleax.it> <200305051746.h45HkS009569@odiug.zope.com> <200305051957.25403.aleax@aleax.it> Message-ID: <1052158331.12534.31.camel@slothrop.zope.com> On Mon, 2003-05-05 at 13:57, Alex Martelli wrote: > On Monday 05 May 2003 07:46 pm, Guido van Rossum wrote: > > > No thanks. I've paid my dues and I hope I will *NEVER* again have to > > > work with a system that thinks it's so smart it doesn't need my advisory > > > input -- or at least not on anything that's as performance-crucial as > > > those Fortran programs were [...] > > > > I severely doubt that any Python apps are as performance-critical as > > those Fortran programs were. > > Yes, this may well be correct. My only TRUE wish for tuning performance > of Python applications is to have SOME ways to measure memory > footprints with sensible guesses about where they come from -- THAT > is where I might gain hugely (by fighting excessive working sets through > selective flushing of caches, freelists, etc). Any idea how to actually do this? Jeremy From python@rcn.com Mon May 5 19:12:53 2003 From: python@rcn.com (Raymond Hettinger) Date: Mon, 5 May 2003 14:12:53 -0400 Subject: [Python-Dev] Dictionary sparseness References: Message-ID: <000101c31339$791838c0$125ffea9@oemcomputer> > the best thing a user can do is rebuild the dict from scratch, inserting keys by > decreasing order of access frequency. 
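Tim's earlier remark that Python exposes the hash function, so a faithful low-level dict emulator can be written in Python, pairs naturally with the rebuild trick quoted above. The sketch below is a toy, not CPython's implementation: the probe recurrence is the one described in dictobject.c's comments, the names (`EmulatedDict`, `probe_sequence`) are invented for illustration, and there is no resizing.

```python
PERTURB_SHIFT = 5
MASK64 = (1 << 64) - 1           # emulate the unsigned arithmetic C uses

def probe_sequence(key, mask):
    # Slot sequence assumed from dictobject.c's commentary:
    # j = (5*j + 1 + perturb) % size; perturb >>= PERTURB_SHIFT.
    h = hash(key) & MASK64
    perturb = h
    j = h & mask
    while True:
        yield j
        perturb >>= PERTURB_SHIFT
        j = (5 * j + 1 + perturb) & mask

class EmulatedDict:
    """Toy open-addressing table; reports how many slots a key needs."""
    def __init__(self, size=8):          # size must be a power of two
        self.mask = size - 1
        self.slots = [None] * size       # each slot: None or (key, value)

    def probes(self, key):
        """(number of slots inspected, slot index) for `key`."""
        for n, j in enumerate(probe_sequence(key, self.mask), 1):
            slot = self.slots[j]
            if slot is None or slot[0] == key:
                return n, j

    def __setitem__(self, key, value):   # no resizing: keep it under-full
        _, j = self.probes(key)
        self.slots[j] = (key, value)

# hash(0) == 0 and hash(8) == 8, so both keys contend for slot 0 of 8.
hot_first = EmulatedDict(); hot_first[0] = "hot"; hot_first[8] = "cold"
hot_last = EmulatedDict(); hot_last[8] = "cold"; hot_last[0] = "hot"
assert hot_first.probes(0)[0] == 1   # inserted first: found immediately
assert hot_last.probes(0)[0] == 2    # inserted second: one collision first
```

Inserting the frequently accessed key first shortens its collision chain, which is exactly why rebuilding a dict in decreasing order of access frequency can help.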
Then a periodic resize comes along, re-inserting everything in a different order. > The size of the dict can be forced larger by > inserting artificial keys, if a user is insane. Uh oh: http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/198157 > You'd have to be nuts, though That explains a lot ;) Does the *4 patch (amended to have an upper bound) have a chance? It's automatic, simple, benefits some cases while not harming others. Raymond

From aleax@aleax.it Mon May 5 20:21:28 2003 From: aleax@aleax.it (Alex Martelli) Date: Mon, 5 May 2003 21:21:28 +0200 Subject: [Python-Dev] Dictionary sparseness In-Reply-To: <1052158331.12534.31.camel@slothrop.zope.com> References: <001301c31290$fcea25e0$125ffea9@oemcomputer> <200305051957.25403.aleax@aleax.it> <1052158331.12534.31.camel@slothrop.zope.com> Message-ID: <200305052121.28017.aleax@aleax.it> On Monday 05 May 2003 08:12 pm, Jeremy Hylton wrote: > On Mon, 2003-05-05 at 13:57, Alex Martelli wrote: > > On Monday 05 May 2003 07:46 pm, Guido van Rossum wrote: > > > > No thanks. I've paid my dues and I hope I will *NEVER* again have to > > > > work with a system that thinks it's so smart it doesn't need my > > > > advisory input -- or at least not on anything that's as > > > > performance-crucial as those Fortran programs were [...] > > > > > > I severely doubt that any Python apps are as performance-critical as > > > those Fortran programs were. > > > > Yes, this may well be correct. My only TRUE wish for tuning performance > > of Python applications is to have SOME ways to measure memory > > footprints with sensible guesses about where they come from -- THAT > > is where I might gain hugely (by fighting excessive working sets through > > selective flushing of caches, freelists, etc). > > Any idea how to actually do this? Not really, even though I've been thinking about it for a while -- pymalloc's the only "hook" that comes to mind so far.
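For the record, a hook of exactly the kind Alex asks for did eventually exist: the tracemalloc module (standard library in much later Pythons) attributes live allocations to the source lines that made them. A minimal sketch, offered for contrast rather than as anything available in the interpreter under discussion:

```python
import tracemalloc

tracemalloc.start()                            # begin tracing allocations

data = [bytearray(1000) for _ in range(100)]   # deliberate footprint

snapshot = tracemalloc.take_snapshot()
for stat in snapshot.statistics("lineno")[:3]:
    # Each stat names a file:line plus the total size and count allocated
    # there -- a "sensible guess about where the memory comes from".
    print(stat)
```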
Alex

From skip@pobox.com Mon May 5 20:35:23 2003 From: skip@pobox.com (Skip Montanaro) Date: Mon, 5 May 2003 14:35:23 -0500 Subject: Where'd my memory go? (was Re: [Python-Dev] Dictionary sparseness) In-Reply-To: <200305051957.25403.aleax@aleax.it> References: <001301c31290$fcea25e0$125ffea9@oemcomputer> <200305051936.20078.aleax@aleax.it> <200305051746.h45HkS009569@odiug.zope.com> <200305051957.25403.aleax@aleax.it> Message-ID: <16054.48379.533379.672799@montanaro.dyndns.org>

Alex> My only TRUE wish for tuning performance of Python applications is
Alex> to have SOME ways to measure memory footprints with sensible
Alex> guesses about where they come from

Here's a thought. Debug builds appear to now add a getobjects method to sys. Would it be possible to also add another method to sys (also only available on debug builds) which knows just enough about basic builtin object types to say a little about how much space it's consuming? For example, I could do something like this:

    allocdict = {}
    for o in sys.getobjects(0):
        allocsize = sys.get_object_allocation_size(o)
        # I'm not a fan of {}.setdefault()
        alloc = allocdict.get(type(o), [])
        alloc.append(allocsize)  # or alloc.append((allocsize, o))
        allocdict[type(o)] = alloc

Once the list is traversed you can poke around in allocdict figuring out where your memory went (other than to allocdict itself!). (I was tempted to suggest another method, but I fear that would just spread the mess around. That may also be a viable option though.) Skip

From tim@zope.com Mon May 5 20:34:58 2003 From: tim@zope.com (Tim Peters) Date: Mon, 5 May 2003 15:34:58 -0400 Subject: [Python-Dev] Dictionary sparseness In-Reply-To: <000101c31339$791838c0$125ffea9@oemcomputer> Message-ID:

[Tim]
>> the best thing a user can do is rebuild the dict from scratch,
>> inserting keys by decreasing order of access frequency.

[Raymond Hettinger]
> Then a periodic resize comes along, re-inserting everything
> in a different order.
Sure -- micro-optimizations are always fragile. This kind of thing will be done by someone who's certain the dict is henceforth read-only, and who thinks it's worth the risk and obscurity to get some small speedup. They're probably right at the time they do it, too, and probably wrong over time. Same thing goes for, e.g., an madvise() call claiming a current truth that changes over time. > ... > Does the *4 patch (amended to have an upper bound) have a chance? > It's automatic, simple, benefits some cases while not harming others, It would be nice if more people tried it and added their results to the patch report: http://www.python.org/sf/729395 Right now, we just have Guido's comment saying that he no longer sees the Zope3 startup speedup he thought he saw earlier. Small percentage speedups are like that, alas. The patch is OK by me. From tim.one@comcast.net Mon May 5 20:56:41 2003 From: tim.one@comcast.net (Tim Peters) Date: Mon, 05 May 2003 15:56:41 -0400 Subject: Where'd my memory go? (was Re: [Python-Dev] Dictionary sparseness) In-Reply-To: <16054.48379.533379.672799@montanaro.dyndns.org> Message-ID: [Skip Montanaro] > Here's a thought. Debug builds appear to now add a getobjects method to > sys. Yes, although that isn't new -- it's been there forever (read Misc/SpecialBuilds.txt). > Would it be possible to also add another method to sys (also only > available on debug builds) which knows just enough about basic builtin > object types to say a little about how much space it's consuming? Marc-Andre has something like that in mxTools already (his sizeof() function). Note also the COUNT_ALLOCS special build, which saves info about total # of allocations, deallocations, and highwater mark per type, made available via sys.getcounts(). The nifty thing about COUNT_ALLOCS is that you can enable it in a release build (it doesn't rely on the debug-build changes to the layout of PyObject). 
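Tim's COUNT_ALLOCS description can be turned into a small reporting helper. The function name below is ours, not the stdlib's; sys.getcounts() exists only in a COUNT_ALLOCS special build, so on an ordinary interpreter this degrades gracefully:

```python
import sys

def top_alloc_types(n=10):
    """Top n types by live objects, from COUNT_ALLOCS counters (if any)."""
    if not hasattr(sys, "getcounts"):
        return None                    # not a COUNT_ALLOCS build
    # Each entry is assumed to be (tp_name, tp_allocs, tp_frees,
    # tp_maxalloc), as described in Misc/SpecialBuilds.txt.
    counts = sorted(sys.getcounts(), key=lambda t: t[1] - t[2], reverse=True)
    return [(name, allocs - frees, maxalloc)
            for name, allocs, frees, maxalloc in counts[:n]]

print(top_alloc_types() or "not a COUNT_ALLOCS build")
```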
Stuff all these things miss (even pymalloc, because it isn't asked for the memory) include the immortal and unbounded int freelist, the I&U float FL, and the immortal but bounded frameobject FL. Do, e.g., range(2000000) (as someone did on c.l.py last week), and about 24MB "goes missing" until the program shuts down (it's sitting in the int FL). Note that pymalloc never returns its "arenas" to the system either.

From zooko@zooko.com Mon May 5 21:02:08 2003 From: zooko@zooko.com (Zooko) Date: Mon, 05 May 2003 16:02:08 -0400 Subject: [Python-Dev] Re: heaps In-Reply-To: Message from "Raymond Hettinger" of "Sun, 04 May 2003 22:26:47 EDT." <003f01c312ad$c7277580$125ffea9@oemcomputer> References: <003f01c312ad$c7277580$125ffea9@oemcomputer> Message-ID:

From heapq.py:

"""
Usage:

heap = []            # creates an empty heap
heappush(heap, item) # pushes a new item on the heap
item = heappop(heap) # pops the smallest item from the heap
item = heap[0]       # smallest item on the heap without popping it

...

[It is] possible to view the heap as a regular Python list without surprises: heap[0] is the smallest item, and heap.sort() maintains the heap invariant!
"""

Shouldn't heapq be a subclass of list? Then it would read:

"""
heap = heapq()       # creates an empty heap
heap.push(item)      # pushes a new item on the heap
item = heap.pop()    # pops the smallest item from the heap
item = heap[0]       # smallest item on the heap without popping it
"""

In addition to nicer syntax, this would give you the option to forbid invariant-breaking alterations. Although you could also choose to allow invariant-breaking alterations, just as the current heapq does. One thing I don't know how to implement is:

    # This changes mylist itself into a heapq -- it doesn't make a copy of mylist!
    makeheapq(mylist)

Perhaps this is a limitation of the current object model? Or is there a way to change an object's type at runtime?
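Zooko's sketch runs today as a thin list subclass over the existing heapq functions, and the closing question has a concrete answer: CPython refuses `__class__` assignment on a plain list instance, which is exactly the object-model limitation suspected here. A sketch:

```python
import heapq

class Heap(list):
    """The list subclass sketched above, delegating to heapq."""
    def push(self, item):
        heapq.heappush(self, item)
    def pop(self):
        return heapq.heappop(self)

heap = Heap()
heap.push(3); heap.push(1); heap.push(2)
assert heap[0] == 1            # smallest item without popping
assert heap.pop() == 1

# makeheapq(mylist) without copying is where the object model says no:
mylist = [3, 1, 2]
try:
    mylist.__class__ = Heap    # re-type the object in place?
except TypeError:
    pass                       # rejected for instances of builtin types
```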
Regards, Zooko http://zooko.com/

From agthorr@barsoom.org Mon May 5 21:14:16 2003 From: agthorr@barsoom.org (Agthorr) Date: Mon, 5 May 2003 13:14:16 -0700 Subject: [Python-Dev] Dictionary sparseness In-Reply-To: References: <000101c31339$791838c0$125ffea9@oemcomputer> Message-ID: <20030505201416.GB17384@barsoom.org> On Mon, May 05, 2003 at 03:34:58PM -0400, Tim Peters wrote: > Sure -- micro-optimizations are always fragile. This kind of thing will be > done by someone who's certain the dict is henceforth read-only, and who > thinks it's worth the risk and obscurity to get some small speedup. An alternate optimization would be the addition of an immutable dictionary type to the language, initialized from a mutable dictionary type. Upon creation, this dictionary would optimize itself, in a manner similar to the "gperf" program which creates (nearly) minimal zero-collision hash tables. On the plus side, this would form a nice symmetry with the existing mutable vs immutable types. Also, it would be proof against bit-rot, since either: a) the user changes the mutable dictionary before it is optimized. In this case, the optimizer will simply optimize the new dictionary, or b) the user attempts to modify the immutable dictionary, which will fail with an error. -- Agthorr

From skip@pobox.com Mon May 5 21:24:39 2003 From: skip@pobox.com (Skip Montanaro) Date: Mon, 5 May 2003 15:24:39 -0500 Subject: Where'd my memory go? (was Re: [Python-Dev] Dictionary sparseness) In-Reply-To: References: <16054.48379.533379.672799@montanaro.dyndns.org> Message-ID: <16054.51335.255858.381526@montanaro.dyndns.org>

Tim> Stuff all these things miss (even pymalloc, because it isn't asked
Tim> for the memory) include the immortal and unbounded int freelist,
Tim> the I&U float FL, and the immortal but bounded frameobject FL. Do,
Tim> e.g., range(2000000) (as someone did on c.l.py last week), and
Tim> about 24MB "goes missing" until the program shuts down (it's
Tim> sitting in the int FL).
Note that pymalloc never returns its Tim> "arenas" to the system either. These shortcomings could be remedied by suitable inspection functions added to sys for debug builds. This leads me to wonder, has anyone measured the cost of deleting the int and float free lists when pymalloc is enabled? I wonder how unbearable it would be. Skip From martin@v.loewis.de Mon May 5 21:39:40 2003 From: martin@v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=) Date: 05 May 2003 22:39:40 +0200 Subject: [Python-Dev] How to test this? In-Reply-To: <16054.30310.489999.134263@montanaro.dyndns.org> References: <16054.30310.489999.134263@montanaro.dyndns.org> Message-ID: Skip Montanaro writes: > I just added a patch file to . It > doesn't include any test cases, since that requires an old db hash > v2 file present. Is it okay to check in a dummy file to Lib/test > for this purpose? Make sure you use -kb in the cvs add. Apart from that, it would be fine by me - except that I recall that the file format is endianness-sensitive, so you should make sure that the test passes on machines of both endiannesses before adding the file. Regards, Martin From aleax@aleax.it Mon May 5 21:42:40 2003 From: aleax@aleax.it (Alex Martelli) Date: Mon, 5 May 2003 22:42:40 +0200 Subject: [Python-Dev] Dictionary sparseness In-Reply-To: <20030505201416.GB17384@barsoom.org> References: <000101c31339$791838c0$125ffea9@oemcomputer> <20030505201416.GB17384@barsoom.org> Message-ID: <200305052242.40380.aleax@aleax.it> On Monday 05 May 2003 10:14 pm, Agthorr wrote: > On Mon, May 05, 2003 at 03:34:58PM -0400, Tim Peters wrote: > > Sure -- micro-optimizations are always fragile. This kind of thing will > > be done by someone who's certain the dict is henceforth read-only, and > > who thinks it's worth the risk and obscurity to get some small speedup. 
> > An alternate optimization would be the addition of an immutable > dictionary type to the language, initialized from a mutable dictionary I'd love a read-only dictionary (AND a read-only list) for reasons having little to do with optimization, actually -- ease of use as dict keys and/or set members, plus, occasional help in catching errors (for the latter use it would be wonderful if read-only dictionaries could be actually substituted in place of such things as instance and class dictionaries). Tuples are no substitutes for read-only lists because they lack many useful "read-only" methods of lists (and won't grow them, as the BDFL has abundantly made clear, as he sees tuples as drastically different from lists). Neither, even more clearly, are e.g. tuples of pairs a good substitute for read-only dictionaries. I've played with adding more selective "locking" to dicts but I was unable to do it without a performance hit. If wholesale "RO-ness" can in fact *increase* performance in some cases, so much the better. "RO lists" could probably save a little memory compared to normal ones since they would need no "spare space" for growing. Alex

From martin@v.loewis.de Mon May 5 21:43:17 2003 From: martin@v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=) Date: 05 May 2003 22:43:17 +0200 Subject: [Python-Dev] Windows installer request... In-Reply-To: References: Message-ID: Tim Peters writes: > Are you saying that the "Select Destination Directory" dialog box doesn't > allow you to select your E: drive? Or just that you'd rather not need to > select the drive you want? I second the second; I noticed that Python installed on the "wrong drive" (i.e. the W9x installation) only after installation was complete. I don't know (and can't check at the moment) whether it offered to let me pick e:. It probably did, but I don't know for sure. > So, AFAIK, there isn't a straightforward way to get Wise 8.14 to suggest a > drive other than C:.
Perhaps it would work better for you if I removed the > Wizard-generated hardcoded "C:" (I don't know which drive Wise would pick > then), but since yours is the only complaint about this I've seen, and I > have no way to test such a change, I'm very reluctant to fiddle with it. I have the same complaint, and I'd happily test any updated installer. Regards, Martin

From zooko@zooko.com Mon May 5 21:58:21 2003 From: zooko@zooko.com (Zooko) Date: Mon, 05 May 2003 16:58:21 -0400 Subject: [Python-Dev] Dictionary sparseness In-Reply-To: Message from Alex Martelli of "Mon, 05 May 2003 22:42:40 +0200." <200305052242.40380.aleax@aleax.it> References: <000101c31339$791838c0$125ffea9@oemcomputer> <20030505201416.GB17384@barsoom.org> <200305052242.40380.aleax@aleax.it> Message-ID: Alex Martelli wrote: > > I'd love a read-only dictionary (AND a read-only list) for reasons having > little to do with optimization, actually [...] Me too! It would be very useful for secure Python -- I could pass my list to someone without risking that they mutate my list. Without RO-lists I have to make a copy of my list every time I want to show it to someone. Regards, Zooko http://zooko.com/

From tim.one@comcast.net Mon May 5 22:00:16 2003 From: tim.one@comcast.net (Tim Peters) Date: Mon, 05 May 2003 17:00:16 -0400 Subject: Where'd my memory go? (was Re: [Python-Dev] Dictionary sparseness) In-Reply-To: <16054.51335.255858.381526@montanaro.dyndns.org> Message-ID: [Skip Montanaro] [on assorted freelists] > These shortcomings could be remedied by suitable inspection > functions added to sys for debug builds. If someone cares enough, sure. > This leads me to wonder, has anyone measured the cost of deleting the > int and float free lists when pymalloc is enabled? I wonder how > unbearable it would be. Vladimir did when he was first developing pymalloc, and left the free lists in deliberately. I haven't tried it.
pymalloc is a bit faster since then, but will always have the additional overhead of needing to figure out *which* freelist to look in (pymalloc's free lists are segregated by block size), and, because it recycles empty pools among different block sizes too, the overhead on free of checking for pool emptiness. The int free list is faster in part because it's so damn Narcissistic <0.7 wink>. From skip@pobox.com Mon May 5 22:24:59 2003 From: skip@pobox.com (Skip Montanaro) Date: Mon, 5 May 2003 16:24:59 -0500 Subject: [Python-Dev] How to test this? In-Reply-To: References: <16054.30310.489999.134263@montanaro.dyndns.org> Message-ID: <16054.54955.226043.202262@montanaro.dyndns.org> Martin> Make sure you use -kb in the cvs add. Thanks, I'd forgotten about that. Martin> Apart from that, it would be fine by me - except that I recall Martin> that the file format is endianness-sensitive, so you should make Martin> sure that the test passes on machines of both endiannesses Martin> before adding the file. It appears the database itself accounts for the endianness of the file. I copied my test db file from my Mac to a Linux PC. struct.unpack("=l", f.read(4)) showed different values on the two systems (0x61561 vs 0x61150600) but bsddb185 on both systems could read the file. This is a very nice property of Berkeley DB in general. I copy db files from the spambayes project all the time. rsync(1) sure beats the heck out of dumping and reloading a 20+MB file all the time. Skip From martin@v.loewis.de Mon May 5 23:03:00 2003 From: martin@v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=) Date: 06 May 2003 00:03:00 +0200 Subject: [Python-Dev] How to test this? In-Reply-To: <16054.54955.226043.202262@montanaro.dyndns.org> References: <16054.30310.489999.134263@montanaro.dyndns.org> <16054.54955.226043.202262@montanaro.dyndns.org> Message-ID: Skip Montanaro writes: > It appears the database itself accounts for the endianness of the file. 
I > copied my test db file from my Mac to a Linux PC. struct.unpack("=l", > f.read(4)) showed different values on the two systems (0x61561 vs > 0x61150600) but bsddb185 on both systems could read the file. This is a > very nice property of Berkeley DB in general. That's good to hear. I thought I understood a report on the Subversion mailing list that you can't move databases across endianesses, but that might have been an unrelated issue. Regards, Martin From tim.one@comcast.net Tue May 6 01:33:39 2003 From: tim.one@comcast.net (Tim Peters) Date: Mon, 05 May 2003 20:33:39 -0400 Subject: [Python-Dev] Re: heaps In-Reply-To: <003f01c312ad$c7277580$125ffea9@oemcomputer> Message-ID: [Raymond Hettinger] > FWIW, there is C implementation of heapq at: > http://zhar.net/projects/python/ Cool! I thought the code was remarkably clear, until I realized it never checked for errors (e.g., PyList_Append() can run out of memory, and PyObject_RichCompareBool() can raise any exception). Those would have to be repaired, and doing so would slow it some. If the heapq module survives with the same API for a release or two, it would be a fine candidate to move into C, or maybe Pyrex (fiddly little integer arithmetic interspersed via if/then/else with trivial array indexing aren't Python's strong suits). From tim.one@comcast.net Tue May 6 03:35:28 2003 From: tim.one@comcast.net (Tim Peters) Date: Mon, 05 May 2003 22:35:28 -0400 Subject: [Python-Dev] Re: heaps In-Reply-To: <20414172.1052079989@[10.0.1.2]> Message-ID: [David Eppstein, on the bar-raising behavior of person > q[0] ] > Good point. If any permutation of the input sequence is equally likely, > and you're selecting the best k out of n items, the expected number of > times you have to hit the data structure in your heapq solution > is roughly k ln n, so the total expected time is O(n + k log k log n), > with a really small constant factor on the O(n) term. 
The sorting > solution I suggested has total time O(n log k), and even though sorting > is built-in and fast it can't compete when k is small. Random pivoting > is O(n + k), but with a larger constant factor, so your heapq solution > looks like a winner. In real Python Life, it's the fastest way I know (depending ...). > For fairness, it might be interesting to try another run of your test > in which the input sequence is sorted in increasing order rather > than random. Comparing the worst case of one against the best case of the other isn't my idea of fairness , but sure. On the best-1000 of a million floats test, and sorting the floats first, the heap method ran about 30x slower than on random data, and the sort method ran significantly faster than on random data (a factor of 1.3x faster). OTOH, if I undo my speed tricks and call a function in the sort method (instead of doing it all micro-optimized inline), that slows the sort method by a bit over a factor of 2. > I.e., replace the random generation of seq by > seq = range(M) > I'd try it myself, but I'm still running python 2.2 and haven't > installed heapq. I'd have to know more about your application to > have an idea whether the sorted or randomly-permuted case is more > representative. Of course -- so would I . Here's a surprise: I coded a variant of the quicksort-like partitioning method, at the bottom of this mail. On the largest-1000 of a million random-float case, times were remarkably steady across trials (i.e., using a different set of a million random floats each time): heapq 0.96 seconds sort (micro-optimized) 3.4 seconds KBest (below) 2.6 seconds The KBest code creates new lists with wild abandon. I expect it does better than the sort method anyway because it gets to exploit its own form of "raise the bar" behavior as more elements come in. 
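The heapq method being timed here is the usual "raise the bar" loop (essentially what later grew into heapq.nlargest); a minimal sketch of the pattern:

```python
import heapq

def k_largest(iterable, k):
    """Largest k items: keep a min-heap of the best k seen so far."""
    heap = []
    for x in iterable:
        if len(heap) < k:
            heapq.heappush(heap, x)
        elif x > heap[0]:          # the bar: smallest of the current best k
            heapq.heapreplace(heap, x)
    return sorted(heap, reverse=True)

assert k_largest(range(1000), 3) == [999, 998, 997]
```

Once the heap is full, most items fail the heap[0] comparison and never touch the heap at all -- and ascending sorted input, where every item raises the bar, is this method's worst case, matching the 30x slowdown reported above.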
For example, on the first run, len(buf) exceeded 3000 only 14 times, and the final pivot value each time is used by put() as an "ignore the input unless it's bigger than that" cutoff:

    pivoted w/ 0.247497558554
    pivoted w/ 0.611006884768
    pivoted w/ 0.633565558936
    pivoted w/ 0.80516673256
    pivoted w/ 0.814304890889
    pivoted w/ 0.884660572175
    pivoted w/ 0.89986744075
    pivoted w/ 0.946575251872
    pivoted w/ 0.980386533221
    pivoted w/ 0.983743795382
    pivoted w/ 0.992381911217
    pivoted w/ 0.994243625292
    pivoted w/ 0.99481443021
    pivoted w/ 0.997044443344

The already-sorted case is also a bad case for this method, because then the pivot is never big enough to trigger the early exit in put().

def split(seq, pivot):
    lt, eq, gt = [], [], []
    lta, eqa, gta = lt.append, eq.append, gt.append
    for x in seq:
        c = cmp(x, pivot)
        if c < 0:
            lta(x)
        elif c:
            gta(x)
        else:
            eqa(x)
    return lt, eq, gt

# KBest(k, minusinf) remembers the largest k objects
# from a sequence of objects passed one at a time to
# put().  minusinf must be smaller than any object
# passed to put().  After feeding in all the objects,
# call get() to retrieve a list of the k largest (or
# as many as were passed to put(), if put() was called
# fewer than k times).

class KBest(object):
    __slots__ = 'k', 'buflim', 'buf', 'cutoff'

    def __init__(self, k, minusinf):
        self.k = k
        self.buflim = 3*k
        self.buf = []
        self.cutoff = minusinf

    def put(self, obj):
        if obj <= self.cutoff:
            return
        buf = self.buf
        buf.append(obj)
        if len(buf) <= self.buflim:
            return
        # Reduce len(buf) by at least one, by retaining
        # at least k, and at most len(buf)-1, of the
        # largest objects in buf.
        from random import choice
        sofar = []
        k = self.k
        while len(sofar) < k:
            pivot = choice(buf)
            buf, eq, gt = split(buf, pivot)
            sofar.extend(gt)
            if len(sofar) < k:
                sofar.extend(eq[:k - len(sofar)])
        self.buf = sofar
        self.cutoff = pivot

    def get(self):
        from random import choice
        buf = self.buf
        k = self.k
        if len(buf) <= k:
            return buf
        # Retain only the k largest.
        sofar = []
        needed = k
        while needed:
            pivot = choice(buf)
            lt, eq, gt = split(buf, pivot)
            if len(gt) <= needed:
                sofar.extend(gt)
                needed -= len(gt)
                if needed:
                    takefromeq = min(len(eq), needed)
                    sofar.extend(eq[:takefromeq])
                    needed -= takefromeq
                    # If we still need more, they have to
                    # come out of things < pivot.
                    buf = lt
            else:
                # gt alone is too large.
                buf = gt
        assert len(sofar) == k
        self.buf = sofar
        return sofar

From BPettersen@NAREX.com Tue May 6 05:40:04 2003 From: BPettersen@NAREX.com (Bjorn Pettersen) Date: Mon, 5 May 2003 22:40:04 -0600 Subject: [Python-Dev] Windows installer request... Message-ID: <60FB8BB7F0EFC7409B75EEEC13E20192022DE23A@admin56.narex.com>

> From: Tim Peters [mailto:tim.one@comcast.net]
>
> [Bjorn Pettersen]
> > Would it be possible for the windows installer to use
> > $SYSTEMDRIVE$ as the default installation drive instead
> > of C:? [...]
> Are you saying that the "Select Destination Directory" dialog
> box doesn't allow you to select your E: drive? Or just
> that you'd rather not need to select the drive you want?

Most installers default to the system drive, so I didn't even look the first time. I am able to change it manually.

> > If it's considered a good idea, and someone can point me to
> > where the change has to be made, I'd be more than willing to
> > produce a patch...
>
> I apparently left this comment in the Wise script:
>
>     Note from Tim: doesn't seem to be a way to get the true
>     boot drive, the Wizard hardcodes "C".
>
> So, AFAIK, there isn't a straightforward way to get Wise 8.14
It should be as easy as (platforms that don't have %systemdrive% could only install to C:):

  item: Get Environment Variable
    Variable=OSDRIVE
    Environment=SystemDrive
    Default=C:
  end

However, you might have to do

  item: Get Registry Key Value
    Variable=OSDRIVE
    Key=System\CurrentControlSet\Control\Session Manager\Environment
    Value Name=SystemDrive
    Flags=00000100
    Default=C:
  end

(not sure about the Flags parameter) I couldn't find much documentation, and the example I'm looking at is a little "divided" about which it should use... I think it tries the first one, and falls back on the second(?) (http://ibinstall.defined.net/dl_scripts.htm, script_6016.zip/IBWin32Setup.wse). Also, it looks like you want to use %SYS32% to get to the windows system directory (on WinXP, it's c:\windows\system32, which doesn't seem to be listed anywhere...) I can't figure out how you're building the installer however. If you can point me in the right direction I can test it on my special WinXP, regular WinXP, Win98, Win2k, and maybe WinNT4 (I think we still have one around :-). -- bjorn

From eppstein@ics.uci.edu Tue May 6 07:00:24 2003 From: eppstein@ics.uci.edu (David Eppstein) Date: Mon, 05 May 2003 23:00:24 -0700 Subject: [Python-Dev] Re: heaps References: <20414172.1052079989@[10.0.1.2]> Message-ID: In article , Tim Peters wrote: > > For fairness, it might be interesting to try another run of your test > > in which the input sequence is sorted in increasing order rather > > than random. > > Comparing the worst case of one against the best case of the other isn't my > idea of fairness, but sure. Well, it doesn't seem any fairer to use random data to compare an algorithm with an average time bound that depends on an assumption of randomness in the data...anyway, the point was more to understand the limiting cases.
If one algorithm is usually 3x faster than the other, and is never more than 10x slower, that's better than being usually 3x faster but sometimes 1000x slower, for instance. > > I'd have to know more about your application to > > have an idea whether the sorted or randomly-permuted case is more > > representative. > > Of course -- so would I . My Java KBest code was written to make data subsets for a half-dozen web pages (same data selected according to different criteria). Of these six instances, one is presented the data in roughly ascending order, one in descending order, and the other four are less clear but probably not random. Robustness in the face of this sort of variation is why I prefer any average-case assumptions in my code's performance to depend only on randomness from a random number generator, and not arbitrariness in the actual input. But I'm not sure I'd usually be willing to pay a 3x penalty for that robustness. > Here's a surprise: I coded a variant of the quicksort-like partitioning > method, at the bottom of this mail. On the largest-1000 of a million > random-float case, times were remarkably steady across trials (i.e., using a > different set of a million random floats each time): > > heapq 0.96 seconds > sort (micro-optimized) 3.4 seconds > KBest (below) 2.6 seconds Huh. You're almost convincing me that asymptotic analysis works even in the presence of Python's compiled-vs-interpreted anomalies. The other surprise is that (unlike, say, the sort or heapq versions) your KBest doesn't look significantly more concise than my earlier Java implementation. -- David Eppstein http://www.ics.uci.edu/~eppstein/ Univ. of California, Irvine, School of Information & Computer Science From harri.pasanen@trema.com Tue May 6 09:55:27 2003 From: harri.pasanen@trema.com (Harri Pasanen) Date: Tue, 6 May 2003 10:55:27 +0200 Subject: Where'd my memory go? 
(was Re: [Python-Dev] Dictionary sparseness) In-Reply-To: <16054.48379.533379.672799@montanaro.dyndns.org> References: <001301c31290$fcea25e0$125ffea9@oemcomputer> <200305051957.25403.aleax@aleax.it> <16054.48379.533379.672799@montanaro.dyndns.org> Message-ID: <200305061055.27898.harri.pasanen@trema.com> Speaking of memory consumption, has the memory footprint of Python changed significantly from 2.2 to 2.3? I've been toying with the idea of making a small python ever since I compiled Python 1.0 for an MS-DOS box with 512Kb of memory. I've scanned the Palm Python stuff, but I did not have a clear picture of whether they really did everything possible to make it small, including changing the representation of internal structs, or did they just chop away the complex type, parser, compiler, etc? Regards, Harri From mal@lemburg.com Tue May 6 11:03:15 2003 From: mal@lemburg.com (M.-A. Lemburg) Date: Tue, 06 May 2003 12:03:15 +0200 Subject: Where'd my memory go? (was Re: [Python-Dev] Dictionary sparseness) In-Reply-To: References: Message-ID: <3EB78863.5070105@lemburg.com> Tim Peters wrote: > [Skip Montanaro] > > [on assorted freelists] > >>These shortcomings could be remedied by suitable inspection >>functions added to sys for debug builds. > > If someone cares enough, sure. > >>This leads me to wonder, has anyone measured the cost of deleting the >>int and float free lists when pymalloc is enabled? I wonder how >>unbearable it would be. > > Vladimir did when he was first developing pymalloc, and left the free lists > in deliberately. I haven't tried it. pymalloc is a bit faster since then, > but will always have the additional overhead of needing to figure out > *which* freelist to look in (pymalloc's free lists are segregated by block > size), and, because it recycles empty pools among different block sizes too, > the overhead on free of checking for pool emptiness. The int free list is > faster in part because it's so damn Narcissistic <0.7 wink>.
If someone really cares, I suppose that the garbage collector could do an occasional scan of the int free list and chop off the tail after a certain number of entries. FWIW, Unicode free lists have a cap to limit the number of entries in the list to 1024. -- Marc-Andre Lemburg eGenix.com Professional Python Software directly from the Source (#1, May 06 2003) >>> Python/Zope Products & Consulting ... http://www.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ EuroPython 2003, Charleroi, Belgium: 49 days left From guido@python.org Tue May 6 13:07:54 2003 From: guido@python.org (Guido van Rossum) Date: Tue, 06 May 2003 08:07:54 -0400 Subject: [Python-Dev] Startup time Message-ID: <200305061207.h46C7sj26462@pcp02138704pcs.reston01.va.comcast.net> While Python's general speed has gone up, its startup speed has slowed down! I timed this two different ways. The first way is to run python -c "import time; print time.clock()" On Unix, this prints the CPU time used since the process was created. The second way is to run time python -c pass which shows CPU and real time to complete running the process. I did this on a 633 MHz PC running Red Hat Linux 7.3. The Python builds were standard non-debug builds. I tried with and without the -S option, which is supposed to suppress loading of site.py and hence most startup overhead; it didn't exist in Python 1.3 and 1.4. Results for the first way are pretty inaccurate because it's such a small number and is only measured in 1/100 of a second, yet revealing.
Some times are printed as two values; I didn't do enough runs to compute a careful average, so I'm just showing the range:

Version   CPU Time    CPU Time with -S
1.3       0.00        N/A
1.4       0.00        N/A
1.5.2     0.01        0.00
2.0       0.01-0.02   0.00
2.1       0.01-0.02   0.00
2.2       0.02        0.00
2.3       0.04        0.03-0.04

Now using time:

Version   CPU Time    CPU Time with -S
1.3       0.004       N/A
1.4       0.004       N/A
1.5       0.018       0.006
2.0       0.021       0.006
2.1       0.018       0.004
2.2       0.025       0.004
2.3       0.045       0.045

Note two things: (a) the start time goes up over time, and (b) for Python 2.3, -S doesn't make any difference. Given that we often run very short Python programs, and e.g. Python's popularity for CGI scripts, I find this increase in startup time very worrisome, and worthy of our attention (more than gaining nanoseconds on dict operations or even socket I/O speed). My goal: I'd like Python 2.3(final) to start up at least as fast as Python 2.2, and I'd like the much faster startup time back with -S. I have no time to investigate the cause right now, although I have a suspicion that the problem might be in loading too much of the encoding framework at start time (I recall Marc-Andre and Martin debating this earlier). --Guido van Rossum (home page: http://www.python.org/~guido/)

From mcherm@mcherm.com Tue May 6 13:32:03 2003
From: mcherm@mcherm.com (Michael Chermside)
Date: Tue, 6 May 2003 05:32:03 -0700
Subject: [Python-Dev] Re: heaps
Message-ID: <1052224323.3eb7ab43530a5@mcherm.com>

Zooko writes:
> Shouldn't heapq be a subclass of list? [...]
> One thing I don't know how to implement is:
>
> # This changes mylist itself into a heapq -- it doesn't make a copy of mylist!
> makeheapq(mylist)
>
> Perhaps this is a limitation of the current object model? Or is there a way
> to change an object's type at runtime.

To change an object's CLASS, sure, but its TYPE -- seems impossible to me on the face of it since a different type may have a different C layout.
Now in THIS case there's no need for a different C layout, so perhaps there's some weird trick I don't know, but I wouldn't think so. As to your FIRST point though... the choice seems to be between making heapq a subclass of list or a module for operating on a list. You argue that the syntax will be cleaner, but comparing your examples:

> heap = []
> heappush(heap, item)
> item = heappop(heap)
> item = heap[0]

> heap = heapq()
> heap.push(item)
> item = heap.pop()
> item = heap[0]

I honestly see little meaningful difference. Since (as per earlier discussion) heapq is NOT intended to be an abstract heap data type, I tend to prefer the simpler solution (using a list instead of subclassing). -- Michael Chermside

From tim.one@comcast.net Tue May 6 16:47:46 2003
From: tim.one@comcast.net (Tim Peters)
Date: Tue, 06 May 2003 11:47:46 -0400
Subject: Where'd my memory go? (was Re: [Python-Dev] Dictionary sparseness)
In-Reply-To: <3EB78863.5070105@lemburg.com>
Message-ID:

[M.-A. Lemburg]
> If someone really cares, I suppose that the garbage collector could
> do an occasional scan of the int free list and chop off the tail
> after a certain number of entries.

Int objects aren't allocated individually; malloc() is used to get single "int blocks", which contain room for about 1000 ints at a time, and these blocks are carved up internally by intobject.c. So it isn't possible to reclaim the space for a single int, and "tail" doesn't mean anything useful in this context.

> FWIW, Unicode free lists have a cap to limit the number of entries
> in the list to 1024.

The Unicode freelist is more like the frameobject freelist that way (it is possible to reclaim the space for an individual Unicode string or frame object).
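For readers following the heapq subthread: the list-subclass alternative Michael compares above can be sketched in a few lines. The class name `Heap` and its methods are hypothetical (they are not part of the actual heapq module), and since the heapq functions rearrange their list argument in place, Zooko's "change mylist itself into a heap" falls out of heapify for free:

```python
import heapq

class Heap(list):
    """Hypothetical list subclass that maintains the heap invariant (a sketch)."""

    def __init__(self, iterable=()):
        list.__init__(self, iterable)
        heapq.heapify(self)  # O(n); rearranges this very list, no copy made

    def push(self, item):
        heapq.heappush(self, item)

    def pop(self):
        # Smallest item first, same as heapq.heappop on a plain list.
        return heapq.heappop(self)
```

Because the heapq functions accept any mutable sequence, such an object can still be passed to heappush/heappop directly; the subclass buys only the dotted syntax, which is Michael's point about the two spellings being nearly equivalent.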
From skip@pobox.com Tue May 6 16:55:25 2003 From: skip@pobox.com (Skip Montanaro) Date: Tue, 6 May 2003 10:55:25 -0500 Subject: [Python-Dev] testing with and without pyc files present Message-ID: <16055.56045.277686.400944@montanaro.dyndns.org> The test targets in the Makefile first delete any .py[co] files, then run the test suite twice. I know there must be a reason for this, but isn't there a less sledgehammer-like and more explicit way to test whatever this is trying to test? Skip From guido@python.org Tue May 6 17:04:20 2003 From: guido@python.org (Guido van Rossum) Date: Tue, 06 May 2003 12:04:20 -0400 Subject: [Python-Dev] testing with and without pyc files present In-Reply-To: Your message of "Tue, 06 May 2003 10:55:25 CDT." <16055.56045.277686.400944@montanaro.dyndns.org> References: <16055.56045.277686.400944@montanaro.dyndns.org> Message-ID: <200305061604.h46G4KR25972@odiug.zope.com> > The test targets in the Makefile first delete any .py[co] files, then run > the test suite twice. I know there must be a reason for this, but isn't > there a less sledgehammer-like and more explicit way to test whatever this > is trying to test? In the past, we've had problems where bugs in the marshalling or elsewhere caused bytecode read from .pyc files to behave differently than bytecode generated directly from a .py source file. Sometimes the bytecode read from a .pyc file had the bug, sometimes the directly generated bytecode. This is sometimes a very shy bug needing a lot of sample data. How else would you propose to test this? --Guido van Rossum (home page: http://www.python.org/~guido/) From mal@lemburg.com Tue May 6 17:12:32 2003 From: mal@lemburg.com (M.-A. Lemburg) Date: Tue, 06 May 2003 18:12:32 +0200 Subject: Where'd my memory go? (was Re: [Python-Dev] Dictionary sparseness) In-Reply-To: References: Message-ID: <3EB7DEF0.4020105@lemburg.com> Tim Peters wrote: > [M.-A.
Lemburg] > >>If someone really cares, I suppose that the garbage collector could >>do an occasional scan of the int free list and chop off the tail >>after a certain number of entries. > > Int objects aren't allocated individually; malloc() is used to get single > "int blocks", which contain room for about 1000 ints at a time, and these > blocks are carved up internally by intobject.c. So it isn't possible to > reclaim the space for a single int, so "tail" doesn't mean anything useful > in this context. Hmm, looking at the code it seems that the different blocks are not referencing each other. Wouldn't it be possible to link them together as a list of blocks? This list could then be used for the review operation. >>FWIW, Unicode free lists have a cap to limit the number of entries >>in the list to 1024. > > The Unicode freelist is more like the frameobject freelist that way (it is > possible to reclaim the space for an individual Unicode string or frame > object). Probably :-) Would using the block technique from the int implementation make a difference for the frame objects? I would guess that a typical Python program rarely has more than 100 frames alive at any one time. These could be placed into such a block to make setting them up faster, possibly making Python function calls a tad snappier. -- Marc-Andre Lemburg eGenix.com Professional Python Software directly from the Source (#1, May 06 2003) >>> Python/Zope Products & Consulting ... http://www.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ...
http://python.egenix.com/ ________________________________________________________________________ EuroPython 2003, Charleroi, Belgium: 49 days left From skip@pobox.com Tue May 6 17:16:45 2003 From: skip@pobox.com (Skip Montanaro) Date: Tue, 6 May 2003 11:16:45 -0500 Subject: [Python-Dev] testing with and without pyc files present In-Reply-To: <200305061604.h46G4KR25972@odiug.zope.com> References: <16055.56045.277686.400944@montanaro.dyndns.org> <200305061604.h46G4KR25972@odiug.zope.com> Message-ID: <16055.57325.88910.417060@montanaro.dyndns.org> Guido> Sometimes the bytecode read from a .pyc file had the bug, Guido> sometimes the directly generated bytecode. This is sometimes a Guido> very shy bug needing a lot of sample data. How else would you Guido> propose to test this? I have no idea, but the reason for the two test runs should probably be documented somewhere. I just embellished the comment in Makefile.pre.in which precedes the test targets. Skip From skip@pobox.com Tue May 6 17:20:34 2003 From: skip@pobox.com (Skip Montanaro) Date: Tue, 6 May 2003 11:20:34 -0500 Subject: [Python-Dev] pyconfig.h not regenerated by "config.status --recheck" Message-ID: <16055.57554.364845.689049@montanaro.dyndns.org> I decided to investigate why the resource module wasn't getting built on my Mac today. A quick check showed that build.opt/pyconfig.h didn't include this stanza: /* Define if you have the 'getpagesize' function. */ #define HAVE_GETPAGESIZE 1 although pyconfig.h.in contained this stanza: /* Define if you have the 'getpagesize' function. */ #undef HAVE_GETPAGESIZE The date on pyconfig.h.in was May 5. The date on build.opt/pyconfig.h was Feb 27. Executing ./config.status --recheck in my build.opt tree doesn't regenerate pyconfig.h. I then tried executing ../configure --prefix=/Users/skip/local This generated pyconfig.h. It would thus appear that config.status shouldn't be used by developers.
Apparently one of the other flags it appends to the generated configure command suppresses generation of pyconfig.h (and maybe other files). Skip

From jepler@unpythonic.net Tue May 6 17:21:27 2003
From: jepler@unpythonic.net (Jeff Epler)
Date: Tue, 6 May 2003 11:21:27 -0500
Subject: [Python-Dev] Startup time
In-Reply-To: <200305061207.h46C7sj26462@pcp02138704pcs.reston01.va.comcast.net>
References: <200305061207.h46C7sj26462@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <20030506162127.GC12791@unpythonic.net>

Comparing 2.2 and 2.3, there are a lot of files opened in 2.3 that aren't in 2.2. The comparison is not fully valid because I'm running 2.3 from the compilation directory, while 2.2 is being run from /usr/bin. Results:

# Number of attempts to open a file
# Python-2.3b1 compiled with no special flags
$ strace -e open ./python -S -c pass 2>&1 | wc -l
249
# RedHat 9's /usr/bin/python (based on 2.2.2)
$ strace -e open python -S -c pass 2>&1 | wc -l
9

# Number of attempts to open an existing file
$ strace -e open python -S -c pass 2>&1 | grep -v ENOENT | wc -l
8
$ strace -e open ./python -S -c pass 2>&1 | grep -v ENOENT | wc -l
46

The modules imported in 2.3 are: warnings re sre sre_compile sre_constants sre_parse string copy_reg types linecache os posixpath stat UserDict codecs encodings.__init__ encodings.utf_8

I'm crossing my fingers that the time to reload(m) is similar to the time to import it in the first place, which gives these maybe-helpful stats:

$ for i in warnings re sre sre_compile sre_constants sre_parse string copy_reg types linecache os posixpath stat UserDict codecs encodings.__init__ encodings.utf_8; do echo -n "reload of module $i: "; ./python Lib/timeit.py -s "import $i" "reload($i)"; done
reload of module warnings: 1000 loops, best of 3: 495 usec per loop
reload of module re: 10000 loops, best of 3: 80.3 usec per loop
reload of module sre: 1000 loops, best of 3: 575 usec per loop
reload of module sre_compile: 1000 loops, best of 3: 503 usec per loop
reload of module sre_constants: 1000 loops, best of 3: 380 usec per loop
reload of module sre_parse: 1000 loops, best of 3: 701 usec per loop
reload of module string: 1000 loops, best of 3: 465 usec per loop
reload of module copy_reg: 10000 loops, best of 3: 200 usec per loop
reload of module types: 10000 loops, best of 3: 180 usec per loop
reload of module linecache: 10000 loops, best of 3: 156 usec per loop
reload of module os: 1000 loops, best of 3: 1.53e+03 usec per loop
reload of module posixpath: 1000 loops, best of 3: 403 usec per loop
reload of module stat: 10000 loops, best of 3: 157 usec per loop
reload of module UserDict: 1000 loops, best of 3: 454 usec per loop
reload of module codecs: 1000 loops, best of 3: 852 usec per loop
reload of module encodings.__init__: 1000 loops, best of 3: 244 usec per loop
reload of module encodings.utf_8: 10000 loops, best of 3: 132 usec per loop

These times seem pretty low, but maybe they're accurate. "os" is the worst of the lot (1530us) and the total comes to 7507us (7.5ms). On my system [2.4GHz Pentium4], this is a typical output of 'time' on python:

$ time ./python -S -c pass
real    0m0.249s
user    0m0.020s
sys     0m0.000s

$ time python -S -c pass
real    0m0.043s
user    0m0.010s
sys     0m0.000s

so the time to import these 17 modules does account for 3/4 of the additional user time between 2.2.2 and 2.3. (Do you care about the 200ms increase in "real" time, or just the user time?) I tried compiling 2.3 with profiling, but gprof sees no samples ("Each sample counts as 0.01 seconds. no time accumulated"). I don't have the capability to try oprofile right now either. Jeff

From tim.one@comcast.net Tue May 06 17:30:14 2003
From: tim.one@comcast.net (Tim Peters)
Date: Tue, 06 May 2003 12:30:14 -0400
Subject: Where'd my memory go? (was Re: [Python-Dev] Dictionary sparseness)
In-Reply-To: <3EB7DEF0.4020105@lemburg.com>
Message-ID:

[M.-A.
Lemburg] > Hmm, looking at the code it seems that the different blocks > are not referencing each other. Wouldn't it be possible to link > them together as a list of blocks? This list could then be used > for the review operation. The blocks are linked together; that's what the _intblock.next pointer does. See PyInt_Fini(). > Would using the block technique from the int implementation > make a difference for the frame objects? I would guess that a > typical Python program rarely has more than 100 frames alive > at any one time. These could be placed into such a block to > make setting them up faster, possibly making Python function > calls a tad snappier. frame objects have variable size; int objects have fixed size; variable size objects don't play nice with fixed block sizes. Note that the frame allocation code already tries to reuse whatever initialization it can left over from the frame object it (normally) pulls off the frame free list. From info@nyc-search.com Tue May 6 17:57:56 2003 From: info@nyc-search.com (NYC-SEARCH) Date: Tue, 6 May 2003 12:57:56 -0400 Subject: [Python-Dev] Python Technical Lead, New York, NY - 80-85k Message-ID: <01fd01c313f0$a41abb80$e0bfef18@earthlink.net> Python Technical Lead, New York, NY - 80-85k - IMMEDIATE HIRE http://www.nyc-search.com/jobs/python.html From martin@v.loewis.de Tue May 6 18:35:40 2003 From: martin@v.loewis.de (Martin v. Löwis) Date: 06 May 2003 19:35:40 +0200 Subject: [Python-Dev] Startup time In-Reply-To: <200305061207.h46C7sj26462@pcp02138704pcs.reston01.va.comcast.net> References: <200305061207.h46C7sj26462@pcp02138704pcs.reston01.va.comcast.net> Message-ID: Guido van Rossum writes: > While Python's general speed has gone up, its startup speed has slowed > down! Hear hear!
I always thought you didn't care about startup time at all :-) > I have no time to investigate the cause right now, although I have a > suspicion that the problem might be in loading too much of the > encoding framework at start time (I recall Marc-Andre and Martin > debating this earlier). That would be easy to determine: Just disable the block #if defined(Py_USING_UNICODE) && defined(HAVE_LANGINFO_H) && defined(CODESET) in pythonrun.c, and see whether it changes anything. To my knowledge, this is the only cause of loading encodings during startup on Unix. Regards, Martin From martin@v.loewis.de Tue May 6 18:37:46 2003 From: martin@v.loewis.de (Martin v. Löwis) Date: 06 May 2003 19:37:46 +0200 Subject: [Python-Dev] Startup time In-Reply-To: <20030506162127.GC12791@unpythonic.net> References: <200305061207.h46C7sj26462@pcp02138704pcs.reston01.va.comcast.net> <20030506162127.GC12791@unpythonic.net> Message-ID: Jeff Epler writes: > Comparing 2.2 and 2.3, there are a lot of files opened in 2.3 that > aren't in 2.2. Very interesting. Could you also try to find out the difference in terms of stat calls? > I'm crossing my fingers that the time to reload(m) is similar to the > time to import it in the first place, which gives these maybe-helpful > stats: That is, unfortunately, not the case: reloading a dynamic module is a no-op. Regards, Martin From martin@v.loewis.de Tue May 6 18:39:21 2003 From: martin@v.loewis.de (Martin v. Löwis) Date: 06 May 2003 19:39:21 +0200 Subject: [Python-Dev] pyconfig.h not regenerated by "config.status --recheck" In-Reply-To: <16055.57554.364845.689049@montanaro.dyndns.org> References: <16055.57554.364845.689049@montanaro.dyndns.org> Message-ID: Skip Montanaro writes: > This generated pyconfig.h. It would thus appear that config.status > shouldn't be used by developers.
Apparently one of the other flags it > appends to the generated configure command suppresses generation of > pyconfig.h (and maybe other files). Can you find out whether this is related to the fact that you are building in a separate build directory? Regards, Martin From aleax@aleax.it Tue May 6 18:49:52 2003 From: aleax@aleax.it (Alex Martelli) Date: Tue, 6 May 2003 19:49:52 +0200 Subject: [Python-Dev] Startup time In-Reply-To: <20030506162127.GC12791@unpythonic.net> References: <200305061207.h46C7sj26462@pcp02138704pcs.reston01.va.comcast.net> <20030506162127.GC12791@unpythonic.net> Message-ID: <200305061949.52953.aleax@aleax.it> On Tuesday 06 May 2003 06:21 pm, Jeff Epler wrote: ... > # Number of attempts to open an existing file > $ strace -e open python -S -c pass 2>&1 | grep -v ENOENT | wc -l > 8 > $ strace -e open ./python -S -c pass 2>&1 | grep -v ENOENT | wc -l > 46 Yes, same here (2.2.2 and 2.3 from CVS both built locally with Mdk 9.0). Besides the .py and .pyc for all the modules, there's a few more files that 2.3 is opening and 2.2 isn't: early on: open("/usr/lib/libstdc++.so.5", O_RDONLY) = 3 open("/lib/libgcc_s.so.1", O_RDONLY) = 3 in the midst of the imports (just before encodings/__init__.py): open("/usr/share/locale/locale.alias", O_RDONLY) = 3 open("/usr/share/locale/en_US/LC_CTYPE", O_RDONLY) = 3 Alex From aleax@aleax.it Tue May 6 19:20:42 2003 From: aleax@aleax.it (Alex Martelli) Date: Tue, 6 May 2003 20:20:42 +0200 Subject: [Python-Dev] Startup time In-Reply-To: References: <200305061207.h46C7sj26462@pcp02138704pcs.reston01.va.comcast.net> <20030506162127.GC12791@unpythonic.net> Message-ID: <200305062020.42734.aleax@aleax.it> On Tuesday 06 May 2003 07:37 pm, Martin v. Löwis wrote: > Jeff Epler writes: > > Comparing 2.2 and 2.3, there are a lot of files opened in 2.3 that > > aren't in 2.2. > > Very interesting. Could you also try to find out the difference in > terms of stat calls? 
In general:

[alex@lancelot blm]$ strace -e stat64 python2.2 -S -c pass 2>&1 | wc -l
18
[alex@lancelot blm]$ strace -e stat64 python2.3 -S -c pass 2>&1 | wc -l
71
[alex@lancelot blm]$ strace -e fstat64 python2.2 -S -c pass 2>&1 | wc -l
8
[alex@lancelot blm]$ strace -e fstat64 python2.3 -S -c pass 2>&1 | wc -l
71
[alex@lancelot blm]$

Of the stat64 calls, the found-files only:

[alex@lancelot blm]$ strace -e stat64 python2.2 -S -c pass 2>&1 | grep -v ENOENT | wc -l
4
[alex@lancelot blm]$ strace -e stat64 python2.3 -S -c pass 2>&1 | grep -v ENOENT | wc -l
12

Alex

From guido@python.org Tue May 6 19:26:07 2003
From: guido@python.org (Guido van Rossum)
Date: Tue, 06 May 2003 14:26:07 -0400
Subject: [Python-Dev] MS VC 7 offer
Message-ID: <200305061826.h46IQ7605750@odiug.zope.com>

A month ago at Python UK in Oxford (which was colocated with C and C++ standardization meetings as well as a general C and C++ users conference) I met with some folks from Microsoft's VC development team, including the project lead, Nick Hodapp. I told Nick that Python for Windows was still built using VC 6. He pointed out that the actual compilers (not the GUI) from VC 7 are freely downloadable. More recently, Nick sent me an email offering to donate copies of VC 7 to the "key developers". I count Tim, myself and Mark Hammond among the key developers. Is there anyone else who would count themselves among those? I presume he's offering the pro version, which has a real optimizer, unlike the "standard" version that was kindly donated by Bjorn Pettersen. I can see advantages and disadvantages of moving to VC 7; I'm sure the VC 7 compiler is more standard-compliant and generates faster code, but a disadvantage is that you can't apparently link binaries built with VC 6 to a program built with VC 7, meaning that 3rd party extensions will have to be recompiled with VC 7 as well. I have no idea how many projects this will affect (don't worry about Zope Corp :-).
Maybe we should try to include those 3rd party developers in the deal. (I think Robin Dunn would be affected, wxPython has a Windows distribution.) If you think this is a bad idea or if you would like to qualify for a compiler donation, please follow up! --Guido van Rossum (home page: http://www.python.org/~guido/) From martin@v.loewis.de Tue May 6 19:27:53 2003 From: martin@v.loewis.de (Martin v. Löwis) Date: 06 May 2003 20:27:53 +0200 Subject: [Python-Dev] Startup time In-Reply-To: <200305061949.52953.aleax@aleax.it> References: <200305061207.h46C7sj26462@pcp02138704pcs.reston01.va.comcast.net> <20030506162127.GC12791@unpythonic.net> <200305061949.52953.aleax@aleax.it> Message-ID: Alex Martelli writes: > in the midst of the imports (just before encodings/__init__.py): > open("/usr/share/locale/locale.alias", O_RDONLY) = 3 > open("/usr/share/locale/en_US/LC_CTYPE", O_RDONLY) = 3 That is the effect of nl_langinfo(CODESET). Regards, Martin From jepler@unpythonic.net Tue May 6 19:36:00 2003 From: jepler@unpythonic.net (Jeff Epler) Date: Tue, 6 May 2003 13:36:00 -0500 Subject: [Python-Dev] Startup time In-Reply-To: References: <200305061207.h46C7sj26462@pcp02138704pcs.reston01.va.comcast.net> <20030506162127.GC12791@unpythonic.net> Message-ID: <20030506183600.GA27125@unpythonic.net> On Tue, May 06, 2003 at 07:37:46PM +0200, Martin v. Löwis wrote: > Jeff Epler writes: > > > Comparing 2.2 and 2.3, there are a lot of files opened in 2.3 that > > aren't in 2.2. > > Very interesting. Could you also try to find out the difference in > terms of stat calls? # redhat's 9 2.2.2 $ strace -e stat64 python -S -c pass 2>&1 | wc -l 11 # python.org's 2.3b1 $ strace -e stat64 ./python -S -c pass 2>&1 | wc -l 72 By the way, I was able to account for the wall-time difference I saw due to the fact that my PYTHONPATH contains some directories on NFS, and so the attempted open()s and stat()s of standard modules did take measurable wall time.
With no PYTHONPATH variable set, these are the startup timings I see:

# 2.2.2
real    0m0.005s
user    0m0.000s
sys     0m0.000s

# 2.3b2
real    0m0.044s
user    0m0.020s
sys     0m0.020s

By the way, I wouldn't be too excited about trusting this Python --

./python -c "import random"
Illegal instruction

I wonder what's gone wrong...

(gdb) run -c "import random"
Starting program: /usr/src/Python-2.3b1/python -c "import random"
[New Thread 1074963072 (LWP 28408)]

Program received signal SIGILL, Illegal instruction.
[Switching to Thread 1074963072 (LWP 28408)]
0x08109aa0 in subtype_getsets_full ()
(gdb) where
#0  0x08109aa0 in subtype_getsets_full ()
#1  0x4001c743 in random_new (type=0x4001c738, args=0x4012c02c, kwds=0x0) at /usr/src/Python-2.3b1/Modules/_randommodule.c:439
(gdb) ptype subtype_getsets_full
type = struct PyGetSetDef { [...]

I'm recompiling now to see if it was just a bogon strike.. surely somebody else has tested on redhat9! nope, recompiled and I still have the problem. and I can't get the debugger to stop at the top of random_new either. jeff

From neal@metaslash.com Tue May 6 19:37:21 2003
From: neal@metaslash.com (Neal Norwitz)
Date: Tue, 06 May 2003 14:37:21 -0400
Subject: [Python-Dev] Startup time
In-Reply-To: <200305062020.42734.aleax@aleax.it>
References: <200305061207.h46C7sj26462@pcp02138704pcs.reston01.va.comcast.net> <20030506162127.GC12791@unpythonic.net> <200305062020.42734.aleax@aleax.it>
Message-ID: <20030506183721.GC1340@epoch.metaslash.com>

On Tue, May 06, 2003 at 08:20:42PM +0200, Alex Martelli wrote:
> On Tuesday 06 May 2003 07:37 pm, Martin v. Löwis wrote:
> > Jeff Epler writes:
> > > Comparing 2.2 and 2.3, there are a lot of files opened in 2.3 that
> > > aren't in 2.2.
>
> [alex@lancelot blm]$ strace -e stat64 python2.2 -S -c pass 2>&1 | wc -l
> 18
> [alex@lancelot blm]$ strace -e stat64 python2.3 -S -c pass 2>&1 | wc -l
> 71

I think many of the extra stat/open calls are due to zipimports.
I don't have python23.zip, but it's still looking for a bunch of extra files that can't exist (in python23.zip). Perhaps if the zip file doesn't exist, we can short circuit the remaining calls to open()?

stat64("/home/neal/local/lib/python23.zip/warnings", 0xbfffebc0) = -1 ENOENT (No such file or directory)
open("/home/neal/local/lib/python23.zip/warnings.so", O_RDONLY|O_LARGEFILE) = -1 ENOENT (No such file or directory)
open("/home/neal/local/lib/python23.zip/warningsmodule.so", O_RDONLY|O_LARGEFILE) = -1 ENOENT (No such file or directory)
open("/home/neal/local/lib/python23.zip/warnings.py", O_RDONLY|O_LARGEFILE) = -1 ENOENT (No such file or directory)
open("/home/neal/local/lib/python23.zip/warnings.pyc", O_RDONLY|O_LARGEFILE) = -1 ENOENT (No such file or directory)

Neal

From nas@python.ca Tue May 6 19:41:07 2003
From: nas@python.ca (Neil Schemenauer)
Date: Tue, 6 May 2003 11:41:07 -0700
Subject: [Python-Dev] MS VC 7 offer
In-Reply-To: <200305061826.h46IQ7605750@odiug.zope.com>
References: <200305061826.h46IQ7605750@odiug.zope.com>
Message-ID: <20030506184107.GA21470@glacier.arctrix.com>

Guido van Rossum wrote:
> I can see advantages and disadvantages of moving to VC 7; I'm sure the
> VC 7 compiler is more standard-compliant and generates faster code,
> but a disadvantage is that you can't apparently link binaries built
> with VC 6 to a program built with VC 7, meaning that 3rd party
> extensions will have to be recompiled with VC 7 as well.

Can distutils use (or be made to use) the free command line VC 7 tools? Also, does this affect whether extensions can be compiled by Mingw? It would be nice if people could continue building extensions on Windows using free tools. Neil

From guido@python.org Tue May 6 19:45:50 2003
From: guido@python.org (Guido van Rossum)
Date: Tue, 06 May 2003 14:45:50 -0400
Subject: [Python-Dev] MS VC 7 offer
In-Reply-To: Your message of "Tue, 06 May 2003 11:41:07 PDT."
<20030506184107.GA21470@glacier.arctrix.com> References: <200305061826.h46IQ7605750@odiug.zope.com> <20030506184107.GA21470@glacier.arctrix.com> Message-ID: <200305061845.h46Ijo106044@odiug.zope.com> > > I can see advantages and disadvantages of moving to VC 7; I'm sure the > > VC 7 compiler is more standard-compliant and generates faster code, > > but a disadvantage is that you can't apparently link binaries built > > with VC 6 to a program built with VC 7, meaning that 3rd party > > extensions will have to be recompiled with VC 7 as well. > > Can distutils use (or be made to use) the free command line VC 7 tools? That would be a project, but his implication was that the compilers are usable as command line tools, so I'm confident it can be done. > Also, does this affect whether extensions can be compiled by Mingw? > It would be nice if people could continue building extensions on > Windows using free tools. I know nothing about Mingw. Anyone who does please speak up if this would affect them or not. --Guido van Rossum (home page: http://www.python.org/~guido/) From phil@riverbankcomputing.co.uk Tue May 6 19:48:03 2003 From: phil@riverbankcomputing.co.uk (Phil Thompson) Date: Tue, 6 May 2003 19:48:03 +0100 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <200305061826.h46IQ7605750@odiug.zope.com> References: <200305061826.h46IQ7605750@odiug.zope.com> Message-ID: <200305061948.03757.phil@riverbankcomputing.co.uk> On Tuesday 06 May 2003 7:26 pm, Guido van Rossum wrote: > A month ago at Python UK in Oxford (which was colocated with C and C++ > standardization meetings as well as a general C and C++ users > conference) I met with some folks from Microsoft's VC development > team, including the project lead, Nick Hodapp. I told Nick that > Python for Windows was still built using VC 6. He pointed out that > the actual compilers (not the GUI) from VC 7 are freely downloadable.
> > More recently, Nick sent me an email offering to donate copies of VC 7 > to the "key developers". I count Tim, myself and Mark Hammond among > the key developers. Is there anyone else who would count themselves > among those? > > I presume he's offering the pro version, which has a real optimizer, > unlike the "standard" version that was kindly donated by Bjorn > Pettersen. > > I can see advantages and disadvantages of moving to VC 7; I'm sure the > VC 7 compiler is more standard-compliant and generates faster code, > but a disadvantage is that you can't apparently link binaries built > with VC 6 to a program built with VC 7, meaning that 3rd party > extensions will have to be recompiled with VC 7 as well. I have no > idea how many projects this will affect (don't worry about Zope Corp > > :-). Maybe we should try to include those 3rd party developers in the > > deal. (I think Robin Dunn would be affected, wxPython has a Windows > distribution.) > > If you think this is a bad idea or if you would like to qualify for a > compiler donation, please follow up! How do we get hold of the free VC 7 compilers? Phil From theller@python.net Tue May 6 19:48:12 2003 From: theller@python.net (Thomas Heller) Date: 06 May 2003 20:48:12 +0200 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <20030506184107.GA21470@glacier.arctrix.com> References: <200305061826.h46IQ7605750@odiug.zope.com> <20030506184107.GA21470@glacier.arctrix.com> Message-ID: Neil Schemenauer writes: > Guido van Rossum wrote: > > I can see advantages and disadvantages of moving to VC 7; I'm sure the > > VC 7 compiler is more standard-compliant and generates faster code, > > but a disadvantage is that you can't apparently link binaries built > > with VC 6 to a program built with VC 7, meaning that 3rd party > > extensions will have to be recompiled with VC 7 as well. > > Can distutils use (or be made to use) the free command line VC 7 tools? 
The only problem distutils has is to find the compiler and the environment it needs. Currently it relies on (undocumented) registry entries (for VC6), and there's a patch somewhere on SF for the registry entries for VC7. I like the idea of using VC7 (as much as I dislike the VC7 gui itself). 'Professional' windows developers have VC7 anyway, it's included in MSDN professional. Thomas From logistix@cathoderaymission.net Tue May 6 19:52:32 2003 From: logistix@cathoderaymission.net (logistix) Date: Tue, 6 May 2003 13:52:32 -0500 (CDT) Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <200305061826.h46IQ7605750@odiug.zope.com> Message-ID: On Tue, 6 May 2003, Guido van Rossum wrote: > A month ago at Python UK in Oxford (which was colocated with C and C++ > standardization meetings as well as a general C and C++ users > conference) I met with some folks from Microsoft's VC development > team, including the project lead, Nick Hodapp. I told Nick that > Python for Windows was still built using VC 6. He pointed out that > the actual compilers (not the GUI) from VC 7 are freely downloadable. > > More recently, Nick sent me an email offering to donate copies of VC 7 > to the "key developers". I count Tim, myself and Mark Hammond among > the key developers. Is there anyone else who would count themselves > among those? > > I presume he's offering the pro version, which has a real optimizer, > unlike the "standard" version that was kindly donated by Bjorn > Pettersen. > > I can see advantages and disadvantages of moving to VC 7; I'm sure the > VC 7 compiler is more standard-compliant and generates faster code, > but a disadvantage is that you can't apparently link binaries built > with VC 6 to a program built with VC 7, meaning that 3rd party > extensions will have to be recompiled with VC 7 as well. I have no > idea how many projects this will affect (don't worry about Zope Corp > :-). Maybe we should try to include those 3rd party developers in the > deal. 
(I think Robin Dunn would be affected, wxPython has a Windows > distribution.) > > If you think this is a bad idea or if you would like to qualify for a > compiler donation, please follow up! > > --Guido van Rossum (home page: http://www.python.org/~guido/) > > _______________________________________________ > Python-Dev mailing list > Python-Dev@python.org > http://mail.python.org/mailman/listinfo/python-dev > Visual Studio 2003 came out a few weeks ago. I honestly don't know if it's considered VC8 or just VC7.1 with the same backend compilers. But if you're going to upgrade, you might as well go all the way. Also, I'm assuming 2.3 will still be compiled on 6.0, right? From guido@python.org Tue May 6 19:55:15 2003 From: guido@python.org (Guido van Rossum) Date: Tue, 06 May 2003 14:55:15 -0400 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: Your message of "Tue, 06 May 2003 19:48:03 BST." <200305061948.03757.phil@riverbankcomputing.co.uk> References: <200305061826.h46IQ7605750@odiug.zope.com> <200305061948.03757.phil@riverbankcomputing.co.uk> Message-ID: <200305061855.h46ItFZ06217@odiug.zope.com> > How do we get hold of the free VC 7 compilers? Here's the info Nick sent me: | We offer as part of the .NET Framework SDK each of the compilers that | comprise our Visual Studio tool - including C++. The caveat here is | that we don't yet ship the full CRT or STL with this distribution - | this will be changing. Also, the 64bit C++ compilers ship for free as | part of the Windows Platform SDK. All of this is available on | msdn.microsoft.com. [...] | Here are the links to the SDKs. But so you aren't surprised, these are | NOT low-overhead downloads or installs...
| | .NET Framework 1.1 | | http://www.msdn.microsoft.com/netframework/downloads/howtoget.aspx | | Platform SDK | | http://msdn.microsoft.com/library/default.asp?url=/library/en-us/sdkintro/sdkintro/obtaining_the_complete_sdk.asp --Guido van Rossum (home page: http://www.python.org/~guido/) From tim.one@comcast.net Tue May 6 19:56:52 2003 From: tim.one@comcast.net (Tim Peters) Date: Tue, 06 May 2003 14:56:52 -0400 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <200305061948.03757.phil@riverbankcomputing.co.uk> Message-ID: [Phil Thompson] > How do we get hold of the free VC 7 compilers? Part of the 100+ MB .NET Framework 1.1 SDK: http://www.msdn.microsoft.com/netframework/downloads/howtoget.aspx Note that this requires Win2K minimum. From jepler@unpythonic.net Tue May 6 19:57:50 2003 From: jepler@unpythonic.net (Jeff Epler) Date: Tue, 6 May 2003 13:57:50 -0500 Subject: RedHat 9 _random failure under -pg (was Re: [Python-Dev] Startup time) Message-ID: <20030506185750.GB27125@unpythonic.net> On Tue, May 06, 2003 at 01:36:00PM -0500, Jeff Epler wrote: > (gdb) run -c "import random" > Starting program: /usr/src/Python-2.3b1/python -c "import random" > [New Thread 1074963072 (LWP 28408)] > > Program received signal SIGILL, Illegal instruction. > [Switching to Thread 1074963072 (LWP 28408)] > 0x08109aa0 in subtype_getsets_full () > (gdb) where > #0 0x08109aa0 in subtype_getsets_full () > #1 0x4001c743 in random_new (type=0x4001c738, args=0x4012c02c, kwds=0x0) > at /usr/src/Python-2.3b1/Modules/_randommodule.c:439 > (gdb) ptype subtype_getsets_full > type = struct PyGetSetDef { > [...] gcc is generating plainly bogus code for this simple function random_new: 00001738 <random_new>: 1738: 55 push %ebp 1739: 89 e5 mov %esp,%ebp 173b: 56 push %esi 173c: 53 push %ebx 173d: ff 93 7c 00 00 00 call *0x7c(%ebx) (for those of you who don't read x86 assembly, the first 4 instructions are part of a standard function prologue.
The fifth instruction is a call through a function pointer, but the register's value at this point is undefined. This is not the call to type->tp_alloc(), correct code for that is just below) Well, this may have been a false alarm -- when I removed -pg from OPT in the Makefile, './python -c "import random"' works. So this is a problem only when profiling is enabled. Is this intended to work? In any case, the fact that the disassembly is so plainly bogus tends to imply that this is a gcc bug, not anything that Python can fix. Jeff From just@letterror.com Tue May 6 19:59:02 2003 From: just@letterror.com (Just van Rossum) Date: Tue, 6 May 2003 20:59:02 +0200 Subject: [Python-Dev] Startup time In-Reply-To: <20030506183721.GC1340@epoch.metaslash.com> Message-ID: Neal Norwitz wrote: > I think many of the extra stat/open calls are due to zipimports. > > I don't have python23.zip, but it's still looking for a bunch > of extra files that can't exist (in python23.zip). Perhaps > if the zip file doesn't exist, we can short circuit the remaining > calls to open()? I think we should, although I wouldn't know off hand how to do that. There's still some nice-to-have PEP302 stuff that remains to be implemented, that could actually help solve this problem. Currently there are no real importer objects for the builtin import mechanisms: a value of None for a path item in sys.path_importer_cache means: use the builtin importer. If there _was_ a true builtin importer object, None could mean: no importer can handle this path item, skip it. See also python.org/sf/692884. I hope to be able to work on this before 2.3b2.
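[Editorial note: the importer-cache idea Just describes can be sketched as follows. The `NullImporter` class and `importer_for` helper are hypothetical illustrations of the scheme, not part of Python 2.3's actual machinery.]

```python
import os

class NullImporter(object):
    """Hypothetical 'no importer' object: caching one of these for a
    nonexistent sys.path entry lets later imports skip the entry
    without issuing any further stat()/open() calls."""
    def find_module(self, fullname, path=None):
        return None  # this path entry can never satisfy an import

def importer_for(path_item, _cache={}):
    # Sketch of the lookup Just describes: consult the cache first;
    # map entries that don't exist to a NullImporter exactly once,
    # so a missing python23.zip is probed only one time.
    try:
        return _cache[path_item]
    except KeyError:
        if os.path.exists(path_item):
            importer = None  # None still means "use the builtin machinery"
        else:
            importer = NullImporter()
        _cache[path_item] = importer
        return importer
```

Under this scheme a nonexistent zip file on sys.path would cost one stat() total, rather than one stat() plus four open() probes per imported module.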
> stat64("/home/neal/local/lib/python23.zip/warnings", 0xbfffebc0) = -1 ENOENT (No such file or > directory) > open("/home/neal/local/lib/python23.zip/warnings.so", O_RDONLY|O_LARGEFILE) = -1 ENOENT (No > such file or directory) > open("/home/neal/local/lib/python23.zip/warningsmodule.so", O_RDONLY|O_LARGEFILE) = -1 ENOENT > (No such file or directory) > open("/home/neal/local/lib/python23.zip/warnings.py", O_RDONLY|O_LARGEFILE) = -1 ENOENT (No > such file or directory) > open("/home/neal/local/lib/python23.zip/warnings.pyc", O_RDONLY|O_LARGEFILE) = -1 ENOENT (No > such file or directory) You could try editing site.py so it (as it used to) removes path items that don't exist on the file system. Except this probably only helps if you'd do this _before_ os.py is imported, as os.py pulls in quite a few modules. Hm, chicken and egg... Or disable the code that adds the zipfile to sys.path in Modules/getpath.c, and compare the number of stat calls. Just From tim@zope.com Tue May 6 20:00:03 2003 From: tim@zope.com (Tim Peters) Date: Tue, 6 May 2003 15:00:03 -0400 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: Message-ID: [logistix] > ... > Also, I'm assuming 2.3 will still be compiled on 6.0, right? The PythonLabs 2.3 Windows distribution will be compiled with MSVC 6, barring an unbroken chain of miracles. From guido@python.org Tue May 6 20:01:01 2003 From: guido@python.org (Guido van Rossum) Date: Tue, 06 May 2003 15:01:01 -0400 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: Your message of "Tue, 06 May 2003 13:52:32 CDT." References: Message-ID: <200305061901.h46J11306259@odiug.zope.com> > Visual Studio 2003 came out a few weeks ago. I honestly don't know if > it's considered VC8 or just VC7.1 with the same backend compilers. But if > you're going to upgrade, you might as well go all the way. Good question. > Also, I'm assuming 2.3 will still be compiled on 6.0, right?
Hm, I was thinking that 2.3 final could be built using 7.x if Nick can get us the donated copies fast enough. --Guido van Rossum (home page: http://www.python.org/~guido/) From skip@pobox.com Tue May 6 20:12:13 2003 From: skip@pobox.com (Skip Montanaro) Date: Tue, 6 May 2003 14:12:13 -0500 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <200305061901.h46J11306259@odiug.zope.com> References: <200305061901.h46J11306259@odiug.zope.com> Message-ID: <16056.2317.124886.963460@montanaro.dyndns.org> >> Also, I'm assuming 2.3 will still be compiled on 6.0, right? Guido> Hm, I was thinking that 2.3 final could be built using 7.x if Guido> Nick can get us the donated copies fast enough. I can see the downside (next to no experience with 7.x, and perhaps none before the final release). What's the upside? Skip From skip@pobox.com Tue May 6 20:18:24 2003 From: skip@pobox.com (Skip Montanaro) Date: Tue, 6 May 2003 14:18:24 -0500 Subject: [Python-Dev] pyconfig.h not regenerated by "config.status --recheck" In-Reply-To: References: <16055.57554.364845.689049@montanaro.dyndns.org> Message-ID: <16056.2688.72423.251200@montanaro.dyndns.org> >> This generated pyconfig.h. It would thus appear that config.status >> shouldn't be used by developers. Apparently one of the other flags >> it appends to the generated configure command suppresses generation >> of pyconfig.h (and maybe other files). Martin> Can you find out whether this is related to the fact that you Martin> are building in a separate build directory? I just confirmed that it's not related to the separate build directory. When you run config.status --recheck it reruns your latest configure command with the extra flags --no-create and --no-recursion. Without rummaging around in the configure file my guess is the --no-create flag is the culprit. So, a word to the wise: avoid config.status --recheck. 
Skip From tim.one@comcast.net Tue May 6 20:17:28 2003 From: tim.one@comcast.net (Tim Peters) Date: Tue, 06 May 2003 15:17:28 -0400 Subject: [Python-Dev] Windows installer request... In-Reply-To: <60FB8BB7F0EFC7409B75EEEC13E20192022DE23A@admin56.narex.com> Message-ID: [Bjorn Pettersen] > Most installers default to the system drive, so I didn't even look the > first time. I am able to change it manually. > ... > It should be as easy as (platforms that don't have %systemdrive% could > only install to C:): > > item: Get Environment Variable > Variable=OSDRIVE > Environment=SystemDrive > Default=C: > end > > However, you might have to do > > item: Get Registry Key Value > Variable=OSDRIVE > Key=System\CurrentControlSet\Control\Session Manager\Environment > Value Name=SystemDrive > Flags=00000100 > Default=C: > end > > (not sure about the Flags parameter) I couldn't find much documentation, > and the example I'm looking at is a little "divided" about which it > should use... I think it tries the first one, and falls back on the > second(?) (http://ibinstall.defined.net/dl_scripts.htm, > script_6016.zip/IBWin32Setup.wse). > > Also, it looks like you want to use %SYS32% to get to the windows system > directory (on WinXP, it's c:\windows\system32, which doesn't seem to be > listed anywhere...) Enough already: I don't have time to try umpteen different things here, or really even one. What I did do is build an installer *just* removing the hard-coded Wizard-generated "C:" prefix. Martin tried that and said it worked for him. It doesn't hurt me. If it works for you too, I'll commit the change: ftp://ftp.python.org/pub/tmp/experimental.exe Please give that a try. It's an incoherent mix of files, so please use a junk name for the installation directory and program startup group (or simply abort the install after you see whether it suggested a drive you approve of). > I can't figure out how you're building the installer however.
If you can > point me in the right direction I can test it on my special WinXP, > regular WinXP, Win98, Win2k, and maybe WinNT4 (I think we still have one > around :-). .wse files aren't intended to be edited by hand (although we all do, sometimes). Instead, they're input to Wise's commercial GUI, which displays their contents in a nice block-indented, color-coded way. "flags" aren't documented, and the GUI never shows them to you -- they correspond to the on/off status of various checkboxes in various GUI dialogs. We use Wise 8.14 to build the installer. If you have Wise, you open the python20.wse file using it, and click the "Compile" button in the GUI. If you don't have Wise, I suppose you guess what Wise would do if you did have it. From brian@sweetapp.com Tue May 6 20:24:31 2003 From: brian@sweetapp.com (Brian Quinlan) Date: Tue, 06 May 2003 12:24:31 -0700 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <16056.2317.124886.963460@montanaro.dyndns.org> Message-ID: <007e01c31405$1ea52fc0$21795418@dell1700> > I can see the downside (next to no experience with 7.x, and perhaps none > before the final release). What's the upside? It's free and more standards compliant. Cheers, Brian From martin@v.loewis.de Tue May 6 20:21:41 2003 From: martin@v.loewis.de (Martin v. Löwis) Date: 06 May 2003 21:21:41 +0200 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <200305061826.h46IQ7605750@odiug.zope.com> References: <200305061826.h46IQ7605750@odiug.zope.com> Message-ID: Guido van Rossum writes: > More recently, Nick sent me an email offering to donate copies of VC 7 > to the "key developers". I count Tim, myself and Mark Hammond among > the key developers. Is there anyone else who would count themselves > among those? Does he already have the copies, or would he purchase them/donate the money? > If you think this is a bad idea or if you would like to qualify for a > compiler donation, please follow up!
If the money isn't spent yet, I think it would be better spent for copies of VC 7.1 (aka .NET 2003). Reportedly, this compiler fixes a number of bugs of the 7.0 release, i.e. it crashes less frequently. I'm still uncertain what the binary compatibility issues are, but I have reason to assume that 7.0 and 7.1 are binary compatible. Before getting multiple copies of the compiler, you should double check that you can actually produce a Windows installer for that compiler. Notice that there is a particular problem hidden here: You will have to ship the C runtime (MSVCR7.DLL) with the installer. However, Microsoft does *not* give you permission to include the DLL file. Instead, they provide a Windows installer snippet which you must "use" (I believe in the sense of "execute on the target machine"). The installer snippet will check for versions, deal with DLL caches, etc. Microsoft has a procedure for combining installer snippets into full installer files. They acknowledge the existence of other tools that make installable binaries, but mandate that these tools perform the same procedures. So you should check whether your copy of Wise can deal with these issues. If you find it could actually work, I'm +0 on accepting the donation (though I won't need a copy myself). You have to switch sooner or later, anyway, so you might as well switch now instead of later. The advantage I see for Python itself is that IPv6 would now work on Windows. The disadvantage I see is that distutils would need to get updated. If you think that 2.3 won't be built with 7.x anyway, you might as well reject the donation, and hope the donor will still be there to offer VC 7.2/8.0.
Regards, Martin From tim.one@comcast.net Tue May 6 20:20:34 2003 From: tim.one@comcast.net (Tim Peters) Date: Tue, 06 May 2003 15:20:34 -0400 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <200305061901.h46J11306259@odiug.zope.com> Message-ID: [Guido] > Hm, I was thinking that 2.3 final could be built using 7.x if Nick can > get us the donated copies fast enough. As I said, the PythonLabs Windows 2.3 installer will be compiled using MSVC 6, barring an unbroken chain of miracles. From martin@v.loewis.de Tue May 6 20:25:20 2003 From: martin@v.loewis.de (Martin v. Löwis) Date: 06 May 2003 21:25:20 +0200 Subject: RedHat 9 _random failure under -pg (was Re: [Python-Dev] Startup time) In-Reply-To: <20030506185750.GB27125@unpythonic.net> References: <20030506185750.GB27125@unpythonic.net> Message-ID: Jeff Epler writes: > Well, this may have been a false alarm -- when I removed -pg from OPT in > the Makefile, './python -c "import random"' works. So this is a problem > only when profiling is enabled. Is this intended to work? You mean, is the gcc option -pg supposed to work? As a Python developer: How am I supposed to know? As a gcc developer: yes, certainly. > In any case, the fact that the disassembly is so plainly bogus tends to > imply that this is a gcc bug, not anything that Python can fix. That seems to be the case, yes. Python can only work around it, but in this case, the work-around seems trivial.
Regards, Martin From gtalvola@nameconnector.com Tue May 6 20:27:11 2003 From: gtalvola@nameconnector.com (Geoffrey Talvola) Date: Tue, 6 May 2003 15:27:11 -0400 Subject: [Python-Dev] MS VC 7 offer Message-ID: <61957B071FF421419E567A28A45C7FE59AF419@mailbox.nameconnector.com> Guido van Rossum wrote: > I can see advantages and disadvantages of moving to VC 7; I'm sure the > VC 7 compiler is more standard-compliant and generates faster code, > but a disadvantage is that you can't apparently link binaries built > with VC 6 to a program built with VC 7, meaning that 3rd party > extensions will have to be recompiled with VC 7 as well. If that's really true then my vote would be against switching to VC 7. My company uses VC 6 extensively and we have no plans to upgrade to VC 7. Our Python programs make extensive use of .pyd's compiled with VC6, and we also embed the Python interpreter within our C++ programs. It would be _very_ painful for us to upgrade our world to VC7, and if Python switched to VC 7, we'd probably be forced to simply compile our own custom version of Python (and the 3rd-party extension DLLs we use) with VC6. So there's one data point for you... - Geoff From skip@pobox.com Tue May 6 20:31:16 2003 From: skip@pobox.com (Skip Montanaro) Date: Tue, 6 May 2003 14:31:16 -0500 Subject: [Python-Dev] SF CVS offline Message-ID: <16056.3460.431223.466945@montanaro.dyndns.org> It appears SF CVS is offline (as of 2:30PM Central Daylight Time). I noticed this when I was prompted for a CVS password for the first time in ages (and which I can't remember). I went poking around for help and came across this page: http://sourceforge.net/docman/display_doc.php?docid=2352&group_id=1 which says, in part: Project CVS Services: Offline; unplanned maintenance (follow-up from 2003-05-05) in-progress FYI. 
Skip From skip@pobox.com Tue May 6 20:33:22 2003 From: skip@pobox.com (Skip Montanaro) Date: Tue, 6 May 2003 14:33:22 -0500 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <007e01c31405$1ea52fc0$21795418@dell1700> References: <16056.2317.124886.963460@montanaro.dyndns.org> <007e01c31405$1ea52fc0$21795418@dell1700> Message-ID: <16056.3586.553248.689395@montanaro.dyndns.org> >> I can see the downside (next to no experience with 7.x, and perhaps >> none before the final release). What's the upside? Brian> It's free and more standards compliant. Then I suggest we have at least one beta which is built using it. Skip From brian@sweetapp.com Tue May 6 20:53:56 2003 From: brian@sweetapp.com (Brian Quinlan) Date: Tue, 06 May 2003 12:53:56 -0700 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <16056.3586.553248.689395@montanaro.dyndns.org> Message-ID: <007f01c31409$3a5f9670$21795418@dell1700> Brian> It's free and more standards compliant. > > Then I suggest we have at least one beta which is built using it. Don't get me wrong: I think that moving to VC7 for Python 2.3 would be a mistake if VC6 compiled extension modules are not binary compatible. My understanding was that static libraries are not compatible but that dynamic ones are. I spent a few minutes with google but wasn't able to find out. Assuming that VC6 and VC7 are not binary compatible, here are my concerns: 1. 3rd party extension developers will have to switch very quickly to be ready for the 2.3 release 2. Some 3rd party extension developers may have already released binaries for Python 2.3, based on the understanding that there won't be any additional API changes after the first beta (barring a disaster). 3. I believe that the installer normally preserves site-packages when doing an upgrade? If so, the user is going to be left with extension modules that won't work. Cheers, Brian From fdrake@acm.org Tue May 6 20:57:04 2003 From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Tue, 6 May 2003 15:57:04 -0400 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <007f01c31409$3a5f9670$21795418@dell1700> References: <16056.3586.553248.689395@montanaro.dyndns.org> <007f01c31409$3a5f9670$21795418@dell1700> Message-ID: <16056.5008.75631.677019@grendel.zope.com> Brian Quinlan writes: > 1. 3rd party extension developers will have to switch very quickly to be > ready for the 2.3 release A very real issue, to be sure. > 2. Some 3rd party extension developers may have already released > binaries for Python 2.3, based on the understanding that there won't > be any additional API changes after the first beta (barring a > disaster). I'm not convinced that's a huge problem, though it could be an annoyance. > 3. I believe that the installer normally preserves site-packages when > doing an upgrade? If so, the user is going to be left with extension > modules that won't work. Yes, but site-packages is specific to the major.minor version of Python, so it would only bite people going from an alpha/beta to a final release, not from major.minor-1. Is this really an issue? -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From pje@telecommunity.com Tue May 6 20:58:34 2003 From: pje@telecommunity.com (Phillip J. Eby) Date: Tue, 06 May 2003 15:58:34 -0400 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <200305061845.h46Ijo106044@odiug.zope.com> References: <200305061826.h46IQ7605750@odiug.zope.com> <20030506184107.GA21470@glacier.arctrix.com> Message-ID: <5.1.1.6.0.20030506155456.01f9c220@telecommunity.com> At 02:45 PM 5/6/03 -0400, Guido van Rossum wrote: > > Also, does this affect whether extensions can be compiled by Mingw? > > It would be nice if people could continue building extensions on > > Windows using free tools. > >I know nothing about Mingw. Anyone who does please speak up if this >would affect them or not. I build my extensions on Windows 98 with MinGW. I don't know if VC6 vs. VC7 makes a difference or not, since I don't own either one.
I think someone said something about the free VC7 requiring Win2K? That seems to me like a dealbreaker for switching from MinGW to VC7, even if the VC7 is free-as-in-beer. From skip@pobox.com Tue May 6 21:01:51 2003 From: skip@pobox.com (Skip Montanaro) Date: Tue, 6 May 2003 15:01:51 -0500 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <007f01c31409$3a5f9670$21795418@dell1700> References: <16056.3586.553248.689395@montanaro.dyndns.org> <007f01c31409$3a5f9670$21795418@dell1700> Message-ID: <16056.5295.907462.399304@montanaro.dyndns.org> Brian> Assuming that VC6 and VC7 are not binary compatible, here are my Brian> concerns: ... Sounds to me like the switch to VC7 will have to happen with a long lead time, similar to what one might expect if Guido decided to deprecate the sys module. ;-) Skip From aleax@aleax.it Tue May 6 21:12:07 2003 From: aleax@aleax.it (Alex Martelli) Date: Tue, 6 May 2003 22:12:07 +0200 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <007f01c31409$3a5f9670$21795418@dell1700> References: <007f01c31409$3a5f9670$21795418@dell1700> Message-ID: <200305062212.07539.aleax@aleax.it> On Tuesday 06 May 2003 09:53 pm, Brian Quinlan wrote: > Brian> It's free and more standards compliant. > > > Then I suggest we have at least one beta which is built using it. > > Don't get me wrong: I think that moving to VC7 for Python 2.3 would be a > mistake if VC6 compiled extension modules are not binary compatible. My > understanding was that static libraries are not compatible but that > dynamic ones are. I spent a few minutes with google but wasn't able to > find out. When we discussed VC versions (back when we met in Oxford during PythonUK/ACCU), David Abrahams seemed adamant that VC7 and VC6 are indeed compatible -- as he has first-hand experience while I just have horror stories from ex-coworkers I suspect he's likelier to be right. Anyway, I'm CC'ing him since I do suspect he has relevant input and might not be following python-dev right now...
Alex From martin@v.loewis.de Tue May 6 21:22:26 2003 From: martin@v.loewis.de (Martin v. Löwis) Date: Tue, 06 May 2003 22:22:26 +0200 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <007f01c31409$3a5f9670$21795418@dell1700> References: <007f01c31409$3a5f9670$21795418@dell1700> Message-ID: <3EB81982.8070600@v.loewis.de> Brian Quinlan wrote: > Don't get me wrong: I think that moving to VC7 for Python 2.3 would be a > mistake if VC6 compiled extension modules are not binary compatible. My > understanding was that static libraries are not compatible but that > dynamic ones are. I spent a few minutes with google but wasn't able to > find out. Please rest assured that they are definitely incompatible. People have been trying to combine VC7 extension modules with VC6, and got consistent crashes. The crashes occur as you pass FILE* across libraries: Neither C library can deal with FILE* (such as stdout) received from the other library. > 1. 3rd party extension developers will have to switch very quickly to be > > ready for the 2.3 release True. > 2. Some 3rd party extension developers may have already released > binaries for Python 2.3, based on the understanding that there won't > be any additional API changes after the first beta (barring a > disaster). There won't be any. That's an ABI change. > 3. I believe that the installer normally preserves site-packages when > doing an upgrade? If so, the user is going to be left with extension > modules that won't work. Users installing betas should still expect such things. Uninstallation before upgrading to the final release is strongly advised. Regards, Martin From guido@python.org Tue May 6 21:23:18 2003 From: guido@python.org (Guido van Rossum) Date: Tue, 06 May 2003 16:23:18 -0400 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: Your message of "Tue, 06 May 2003 22:12:07 +0200."
<200305062212.07539.aleax@aleax.it> References: <007f01c31409$3a5f9670$21795418@dell1700> <200305062212.07539.aleax@aleax.it> Message-ID: <200305062023.h46KNI907721@odiug.zope.com> I should mention that on re-reading Nick's email, it's clear that he's offering to donate copies of Visual C++ 2003, so that's the latest. I've invited him to respond directly to the comments and questions. In any case, it looks like it may be best to wait until after 2.3 is released, although if there's time I wouldn't mind playing a bit with 2003. (Hmm... if it really doesn't work on Win98 I have a problem.) --Guido van Rossum (home page: http://www.python.org/~guido/) From martin@v.loewis.de Tue May 6 21:26:51 2003 From: martin@v.loewis.de (Martin v. Löwis) Date: Tue, 06 May 2003 22:26:51 +0200 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <200305062212.07539.aleax@aleax.it> References: <007f01c31409$3a5f9670$21795418@dell1700> <200305062212.07539.aleax@aleax.it> Message-ID: <3EB81A8B.9090603@v.loewis.de> Alex Martelli wrote: > When we discussed VC versions (back when we met in Oxford during > PythonUK/ACCU), David Abrahams seemed adamant that VC7 and VC6 > are indeed compatible I doubt he said this in this generality: he surely knows that you cannot mix C++ object files on the object file level between those compilers, as they implement completely different ABIs. For Python, the biggest problem is that you cannot pass FILE* from one C library to the other, because of some stupid locking test in the C library. This does cause crashes when you try to use Python extension modules compiled with the wrong compiler.
Regards, Martin From aleax@aleax.it Tue May 6 21:34:12 2003 From: aleax@aleax.it (Alex Martelli) Date: Tue, 6 May 2003 22:34:12 +0200 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <200305062023.h46KNI907721@odiug.zope.com> References: <007f01c31409$3a5f9670$21795418@dell1700> <200305062212.07539.aleax@aleax.it> <200305062023.h46KNI907721@odiug.zope.com> Message-ID: <200305062234.12363.aleax@aleax.it> On Tuesday 06 May 2003 10:23 pm, Guido van Rossum wrote: > I should mention that on re-reading Nick's email, it's clear that he's > offering to donate copies of Visual C++ 2003, so that's the latest. > I've invited him to respond directly to the comments and questions. > > In any case, it looks like it may be best to wait until after 2.3 is > released, although if there's time I wouldn't mind playing a bit with > 2003. (Hmm... if it really doesn't work on Win98 I have a problem.) Me too -- a BAD one, since I do just about all of my "windows" work these days with win4lin under Linux on my desktop box (cheap, fast, convenient), or on an old Acer Travelmate 345T laptop, and both only support Win98 -- the only "modern" Windows version I have around is in the dualboot of a far-too-heavy Dell laptop which came with Win/XP (so I didn't entirely remove it when installing Linux as the main OS, just shrank it as much as I could in case I ever needed something in it)... It WOULD be deucedly inconvenient to have to install Win/XP and keep it booted just to be able to build Python extension binaries for Windows...!-( Why a command-line compiler shouldn't be able to run on just about any version of its OS really escapes me. 
Maybe a clever move to force us laggards to upgrade whether we want to or not...?-( Alex From skip@pobox.com Tue May 6 21:50:31 2003 From: skip@pobox.com (Skip Montanaro) Date: Tue, 6 May 2003 15:50:31 -0500 Subject: [Python-Dev] bsddb185 module changes checked in Message-ID: <16056.8215.274307.904009@montanaro.dyndns.org> The various bits necessary to implement the "build bsddb185 when appropriate" have been checked in. I'm pretty sure I don't have the best possible test for the existence of a db library, but it will have to do for now. I suspect others can clean it up later during the beta cycle. The current detection code in setup.py should work for Nick on OSF/1 and for platforms which don't require a separate db library. I'd appreciate some extra pounding on this code. Thanks, Skip From tim.one@comcast.net Tue May 6 21:50:18 2003 From: tim.one@comcast.net (Tim Peters) Date: Tue, 06 May 2003 16:50:18 -0400 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <3EB81A8B.9090603@v.loewis.de> Message-ID: [Alex Martelli] > When we discussed VC versions (back when we met in Oxford during > PythonUK/ACCU), David Abrahams seemed adamant that VC7 and VC6 > are indeed compatible [Martin v. Lowis] > I doubt he said this in this generality: he surely knows that you > cannot mix C++ object files on the object file level between those > compilers, as they implement completely different ABIs. > > For Python, the biggest problem is that you cannot pass FILE* from one C > library to the other, because of some stupid locking test in the C > library. This does cause crashes when you try to use Python extension > modules compiled with the wrong compiler. And not the only problem. Review the "PyObject_New vs PyObject_NEW" thread from python-dev in March. This snippet sums it up: [David Abrahams] > Python was compiled with vc6, the rest with vc7. I test this > combination regularly and have never seen a problem. [Tim] You have now .
From brian@sweetapp.com Tue May 6 22:15:35 2003 From: brian@sweetapp.com (Brian Quinlan) Date: Tue, 06 May 2003 14:15:35 -0700 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <3EB81982.8070600@v.loewis.de> Message-ID: <008601c31414$a26d0120$21795418@dell1700> > Please rest assured that they are definitely incompatible. People have > been trying to combine VC7 extension modules with VC6, and got > consistent crashes. The crashes occur as you pass FILE* across > libraries: Neither C library can deal with FILE* (such as stdout) > received from the other library. Wouldn't this only affect extension modules using PyFile_FromFile and PyFile_AsFile? And a little hackery could make those routines generate exceptions if called from an incompatible VC version. > > 2. Some 3rd party extension developers may have already released > > binaries for Python 2.3, based on the understanding that there > > won't be any additional API changes after the first beta (barring > > a disaster). > > There won't be any. That's any ABI change. Isn't the ABI dependent on the API and linker? The API is supposed to be stable at this point. I would imagine that most extension developers would assume that the build environment is also stable at this point. Cheers, Brian From jepler@unpythonic.net Tue May 6 21:57:33 2003 From: jepler@unpythonic.net (Jeff Epler) Date: Tue, 6 May 2003 15:57:33 -0500 Subject: RedHat 9 _random failure under -pg (was Re: [Python-Dev] Startup time) In-Reply-To: References: <20030506185750.GB27125@unpythonic.net> Message-ID: <20030506205733.GE27125@unpythonic.net> On Tue, May 06, 2003 at 09:25:20PM +0200, Martin v. Löwis wrote: > Jeff Epler writes: > > > Well, this may have been false alarm -- when I removed -pg from OPT in > > the Makefile, './python -c "import random"' works. So this is a problem > > only when profiling is enabled. Is this intended to work? > > You mean, is the gcc option -pg supposed to work?
As a Python > developer: How am I supposed to know? As a gcc developer: yes, > certainly. I didn't know you were a gcc developer. In any case, I've distilled this down to a small testcase and was working on preparing a bug report for their gnats database. The testcase is about as simple as it gets: /* compile with -pg -fPIC -O */ typedef struct { void *(*f)(void *, int); } T; void *g(T *t) { return t->f(t, 0); } however, I checked 3.2.3 and this bug is fixed, so I guess I don't need to do that. Jeff From lists@morpheus.demon.co.uk Tue May 6 22:19:31 2003 From: lists@morpheus.demon.co.uk (Paul Moore) Date: Tue, 06 May 2003 22:19:31 +0100 Subject: [Python-Dev] MS VC 7 offer References: <200305061901.h46J11306259@odiug.zope.com> Message-ID: Guido van Rossum writes: >> Visual Studio 2003 came out a few weeks ago. I honestly don't know if >> it's considered VC8 or just VC7.1 with the same backend compilers. But if >> you're going to upgrade, you might as well go all the way. > > Good question. > >> Also, I'm assuming 2.3 will still be compiled on 6.0, right? > > Hm, I was thinking that 2.3 final could be built using 7.x if Nick can > get us the donated copies fast enough. If this means that those of us with VC6, and with no plans/reasons to upgrade can no longer build our own extensions, this would be a disaster. Surely VC7-compiled C programs can be built in such a way as to be link-compatible with VC6-compiled extensions??? (Wait, this is Microsoft...) Please *don't* build 2.3 final with VC7. If you're going to switch, give users more warning, and test builds - I would need at least to find out if I could build extensions against a VC7-compiled Python using mingw... Paul.
-- This signature intentionally left blank From lists@morpheus.demon.co.uk Tue May 6 22:20:32 2003 From: lists@morpheus.demon.co.uk (Paul Moore) Date: Tue, 06 May 2003 22:20:32 +0100 Subject: [Python-Dev] MS VC 7 offer References: <200305061948.03757.phil@riverbankcomputing.co.uk> Message-ID: Tim Peters writes: > [Phil Thompson] >> How do we get hold of the free VC 7 compilers? > > Part of the 100+ MB .NET Framework 1.1 SDK: > > http://www.msdn.microsoft.com/netframework/downloads/howtoget.aspx > > Note that this requires Win2K minimum. Note that these have no optimiser, as I understand it. Paul. -- This signature intentionally left blank From martin@v.loewis.de Tue May 6 22:45:28 2003 From: martin@v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 06 May 2003 23:45:28 +0200 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <008601c31414$a26d0120$21795418@dell1700> References: <008601c31414$a26d0120$21795418@dell1700> Message-ID: <3EB82CF8.4030005@v.loewis.de> Brian Quinlan wrote: > Wouldn't this only affect extension modules using PyFile_FromFile and > PyFile_AsFile? That might be the case. However, notice that there might be other incompatibilities which we might discover by chance only - Microsoft hasn't documented any of this. >>There won't be any. That's any ABI change. > > > Isn't the ABI dependent on the API and linker? And the compiler, and the operating system, and the microprocessor. > The API is supposed to be > stable at this point. I would imagine that most extension developers > would assume that the build environment is also stable at this point. Yes, some are certainly assuming that. Some are sincerely hoping, or even expecting, that Python 2.3 is released with VC7, so that they can embed Python in their VC7-based application without having to recompile it. No matter what the choice is, somebody will be unhappy.
Regards, Martin From dave@boost-consulting.com Tue May 6 23:01:28 2003 From: dave@boost-consulting.com (David Abrahams) Date: Tue, 06 May 2003 18:01:28 -0400 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <3EB81A8B.9090603@v.loewis.de> (Martin v. =?iso-8859-1?q?L=F6wis's?= message of "Tue, 06 May 2003 22:26:51 +0200") References: <007f01c31409$3a5f9670$21795418@dell1700> <200305062212.07539.aleax@aleax.it> <3EB81A8B.9090603@v.loewis.de> Message-ID: "Martin v. Löwis" writes: > Alex Martelli wrote: > >> When we discussed VC versions (back when we met in Oxford during >> PythonUK/ACCU), David Abrahams seemed adamant that VC7 and VC6 >> are indeed compatible > > I doubt he said this in this generality Actually, I did. I may have overstated the case slightly, but not by much. > he surely knows that you cannot mix C++ object files on the object > file level between those compilers, as they implement completely > different ABIs. They implement substantially similar ABIs. Here are the facts in full, glorious/gory detail from a member of Microsoft's compiler team. I quote: The bottom line: the ABI is backwards compatible. We do require using the linker that matches the newest compiler used in a set of .obj files. There were some incompatible name decoration changes (function templates) b/w VC7 and VC7.1. Most people should never notice this one, though I know of at least 1 customer that did. Another name decoration change was made b/w VC6 and VC7, but nobody should notice that change, since they were hitting a broken construct anyway. There was a SP of VC6 that is incompatible with VC7 and other builds of VC6, I forget which exactly, maybe SP4, or maybe it was the processor pack. It only involved pointer to members, but we were layout incompatible. The only other issues I can think of are related to __declspec(align(N)) and __unaligned (IA64 only, really.)
> For Python, the biggest problem is that you cannot pass FILE* from > one C library to the other, because of some stupid locking test in > the C library. This does cause crashes when you try to use Python > extension modules compiled with the wrong compiler. Assuming you are passing availability of FILE*s across the extension module boundary and the extension module author is using the VC7 libraries instead of those that ship with VC6 (using the VC6 libraries with VC7 would be a trick)... then yes. In practice, making sure that resources are only used by the appropriate 'C' library is not too difficult, but requires a level of attention that I wouldn't want to demand of newbies. I certainly build all kinds of Boost.Python extension modules with VC7 and test them without problems using a VC6 build of Python. HTH, -- Dave Abrahams Boost Consulting www.boost-consulting.com From guido@python.org Tue May 6 23:06:11 2003 From: guido@python.org (Guido van Rossum) Date: Tue, 06 May 2003 18:06:11 -0400 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: Your message of "Tue, 06 May 2003 22:19:31 BST." References: <200305061901.h46J11306259@odiug.zope.com> Message-ID: <200305062206.h46M6BP08306@odiug.zope.com> > If this means that those of us with VC6, and with no plans/reasons to > upgrade can no longer build our own extensions, this would be a > disaster. Part of the offer was: | Potentially we can even figure out how to enable anyone to | build Python using the freely downloadable compilers I mentioned | above...
--Guido van Rossum (home page: http://www.python.org/~guido/) From martin@v.loewis.de Tue May 6 23:17:14 2003 From: martin@v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 07 May 2003 00:17:14 +0200 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: References: <200305061901.h46J11306259@odiug.zope.com> Message-ID: <3EB8346A.1000907@v.loewis.de> Paul Moore wrote: > If this means that those of us with VC6, and with no plans/reasons to > upgrade can no longer build our own extensions, this would be a > disaster. Using VC7 would be a disaster for those required to use VC6. Using VC6 is a disaster for those required to use VC7. Somebody will be unhappy. > Surely VC7-compiled C programs can be built in such a way as to be > link-compatible with VC6-compiled extensions??? It probably works in many cases, but it is known to fail in certain cases. Regards, Martin From tim.one@comcast.net Tue May 6 23:17:22 2003 From: tim.one@comcast.net (Tim Peters) Date: Tue, 06 May 2003 18:17:22 -0400 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <3EB82CF8.4030005@v.loewis.de> Message-ID: [Martin v. Lowis] > ... > Some are sincerely hoping, or even expecting, that Python 2.3 is > released with VC7, so that they can embed Python in their VC7-based > application without having to recompile it. > > No matter what the choice is, somebody will be unhappy. OTOH, I don't see anything to stop releasing VC6 and VC7 versions of Python, except for the absence of a volunteer to do it. While the Wise installer is proprietary, there's nothing hidden about what goes into a release, there are several free installers people *could* use instead, and the build process for the 3rd-party components is pretty exhaustively documented. Speaking of which, presumably Tcl/Tk and SSL and etc on Windows should also be compiled under VC7 then.
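[Since the thread keeps coming back to whether an extension's compiler matches the interpreter's, a small illustrative sketch may help; this is not anything proposed in the thread, just one way a packager could detect the mismatch up front. Microsoft compilers embed an `MSC v.NNNN` tag in `sys.version` (1200 for VC6, 1300 for VC7, 1310 for VC7.1), which can be parsed before trying to load a binary extension:]

```python
import re
import sys

# _MSC_VER values as embedded in sys.version by MSVC builds of Python.
# (1200 = VC6, 1300 = VC7 / .NET 2002, 1310 = VC7.1 / .NET 2003)
MSC_NAMES = {1200: "VC6", 1300: "VC7", 1310: "VC7.1"}

def msc_version(version_string):
    """Return the MSC version number found in a sys.version-style
    string, or None for non-MSVC builds (gcc, mingw, etc.)."""
    m = re.search(r"MSC v\.(\d+)", version_string)
    if m:
        return int(m.group(1))
    return None

# On a Windows VC6 build, sys.version looks something like:
#   '2.3b1 (#40, Apr 25 2003, ...) [MSC v.1200 32 bit (Intel)]'
print(msc_version('2.3b1 (#40) [MSC v.1200 32 bit (Intel)]'))  # 1200
print(msc_version(sys.version))  # None unless this interpreter is an MSVC build
```

A distutils-style build script could compare this number against the compiler it is about to invoke and warn on a mismatch, rather than letting users discover the FILE* crashes at runtime.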
From cnetzer@mail.arc.nasa.gov Tue May 6 23:37:30 2003 From: cnetzer@mail.arc.nasa.gov (Chad Netzer) Date: 06 May 2003 15:37:30 -0700 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <200305062206.h46M6BP08306@odiug.zope.com> References: <200305061901.h46J11306259@odiug.zope.com> <200305062206.h46M6BP08306@odiug.zope.com> Message-ID: <1052260650.529.14.camel@sayge.arc.nasa.gov> On Tue, 2003-05-06 at 15:06, Guido van Rossum wrote: > Part of the offer was: > > | Potentially we can even figure out how to enable anyone to > | build Python using the freely downloadable compilers I mentioned > | above... Which would seem to exclude building on Win98 machines (or WinME *snort*, or even Win NT 4). Those platforms still have a huge installed base, and I would assume a not insignificant developer base. Is offering a MSVC6 version along with a more recent compiler version an option? -- Chad Netzer (any opinion expressed is my own and not NASA's or my employer's) From martin@v.loewis.de Tue May 6 23:47:01 2003 From: martin@v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 07 May 2003 00:47:01 +0200 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: References: <007f01c31409$3a5f9670$21795418@dell1700> <200305062212.07539.aleax@aleax.it> <3EB81A8B.9090603@v.loewis.de> Message-ID: <3EB83B65.8070900@v.loewis.de> David Abrahams wrote: > Actually, I did. I may have overstated the case slightly, but not by > much. Hmm. While this is certainly off-topic for python-dev, I'm still curious. So I just did this: 1. Create a library project with VC6. Put a single class into a single translation unit #include struct X:public CObject{ X(); }; 2. Compile this library with vc6. 3. Create an MFC application with VC7. Instantiate X somewhere. Try to link. This gives the error message LINK : fatal error LNK1104: cannot open file 'mfc42d.lib' Sure enough, VC7 does not come with that library. 
So it seems very clear to me that the libraries shipped are incompatible in a way that does not allow mixing object files from different compilers. Did I do something wrong here? Regards, Martin From martin@v.loewis.de Tue May 6 23:50:44 2003 From: martin@v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 07 May 2003 00:50:44 +0200 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: References: Message-ID: <3EB83C44.20706@v.loewis.de> Tim Peters wrote: > Speaking of which, presumably Tcl/Tk and SSL and etc on Windows should also > be compiled under VC7 then. That is certainly the case (not to forget bsddb, zlib, and bzip2). This will require quite some volunteer time. Regards, Martin From dave@boost-consulting.com Wed May 7 00:05:41 2003 From: dave@boost-consulting.com (David Abrahams) Date: Tue, 06 May 2003 19:05:41 -0400 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <3EB83B65.8070900@v.loewis.de> (Martin v. =?iso-8859-1?q?L=F6wis's?= message of "Wed, 07 May 2003 00:47:01 +0200") References: <007f01c31409$3a5f9670$21795418@dell1700> <200305062212.07539.aleax@aleax.it> <3EB81A8B.9090603@v.loewis.de> <3EB83B65.8070900@v.loewis.de> Message-ID: "Martin v. Löwis" writes: > David Abrahams wrote: > >> Actually, I did. I may have overstated the case slightly, but not by >> much. > > Hmm. While this is certainly off-topic for python-dev, I'm still > curious. So I just did this: > > 1. Create a library project with VC6. Put a single class into > a single translation unit > > #include > > struct X:public CObject{ > X(); > }; > > 2. Compile this library with vc6. > > 3. Create an MFC application with VC7. Instantiate X somewhere. > Try to link. This gives the error message > > LINK : fatal error LNK1104: cannot open file 'mfc42d.lib' > > Sure enough, VC7 does not come with that library. > So it seems very clear to me that the libraries shipped are > incompatible in a way that does not allow mixing object files > from different compilers.
> Did I do something wrong here? I normally don't think of the contents (or naming) of a non-standard library like MFC that just happens to ship with the compiler as being something that affects object-code compatibility. *If* you accept the way I see that term, your test doesn't say anything about it. Certainly for any accepted definition of "ABI", it's hard to connect your test with the claim that "they implement completely different ABIs". You could make a reasonable argument that differences in the standard 'C' or C++ library affect object code compatibility; frankly I have avoided that area so I don't know whether there are problems with the 'C' library but I know the C++ library underwent a major overhaul, so I wouldn't place any bets. Regardless, when I say "object code compatibility", I'm talking about what's traditionally thought of as the ABI: the layout of objects, calling convention, mechanics of the runtime, etc., all of which are basically library-independent issues. HTH2, -- Dave Abrahams Boost Consulting www.boost-consulting.com From mhammond@skippinet.com.au Wed May 7 00:06:38 2003 From: mhammond@skippinet.com.au (Mark Hammond) Date: Wed, 7 May 2003 09:06:38 +1000 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <200305061826.h46IQ7605750@odiug.zope.com> Message-ID: <03dc01c31424$26758500$530f8490@eden> [Guido] > I can see advantages and disadvantages of moving to VC 7; I'm sure the > VC 7 compiler is more standard-compliant and generates faster code, > but a disadvantage is that you can't apparently link binaries built > with VC 6 to a program built with VC 7, meaning that 3rd party > extensions will have to be recompiled with VC 7 as well. Actually, I think this need not be true. I have MSVC7, not currently installed, but when it was I did manage to mix and match compilers for Python and extensions without problem. I am happy to play with this, but am short on time for a week or so. Another thing to consider is the "make" environment.
If we don't use DevStudio, then presumably our existing project files will become useless. Not a huge problem, but a real one. MSVC exported makefiles are not designed to be maintained. I'm having good success with autoconf and Python on other projects, but that would raise the barrier to including cygwin in your build environment. Then-just-one-step-from-gcc ly, Mark. From mhammond@skippinet.com.au Wed May 7 00:19:50 2003 From: mhammond@skippinet.com.au (Mark Hammond) Date: Wed, 7 May 2003 09:19:50 +1000 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <3EB83B65.8070900@v.loewis.de> Message-ID: <03e201c31425$feeb3a00$530f8490@eden> > > Actually, I did. I may have overstated the case slightly, > but not by > > much. > > Hmm. While this is certainly off-topic for python-dev, I'm still > curious. So I just did this: What you did is to create a library using a specific version of an "external" library (MFC - shipped with MS as part of MSVC, but as external as any other .lib you may use from anywhere) You then upgrade to a newer version of the library, and attempted to link code built using an earlier one. So this has nothing to do with MSVC as such, only with MFC. It is somewhat similar to trying to use a Python 1.x extension with Python 2.x, or, assuming it was possible, using the same MSVCx with 2 discrete MFC versions. Mark. From phil@riverbankcomputing.co.uk Wed May 7 00:46:06 2003 From: phil@riverbankcomputing.co.uk (Phil Thompson) Date: Wed, 7 May 2003 00:46:06 +0100 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: References: Message-ID: <200305070046.06725.phil@riverbankcomputing.co.uk> On Tuesday 06 May 2003 7:56 pm, Tim Peters wrote: > [Phil Thompson] > > > How do we get hold of the free VC 7 compilers? > > Part of the 100+ MB .NET Framework 1.1 SDK: > > http://www.msdn.microsoft.com/netframework/downloads/howtoget.aspx > > Note that this requires Win2K minimum. Does it generate binaries that will run under Win9x? 
Phil From pje@telecommunity.com Wed May 7 00:52:39 2003 From: pje@telecommunity.com (Phillip J. Eby) Date: Tue, 06 May 2003 19:52:39 -0400 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <03dc01c31424$26758500$530f8490@eden> References: <200305061826.h46IQ7605750@odiug.zope.com> Message-ID: <5.1.1.6.0.20030506194806.01fc05b0@telecommunity.com> At 09:06 AM 5/7/03 +1000, Mark Hammond wrote: >Another thing to consider is the "make" environment. If we don't use >DevStudio, then presumably our existing project files will become useless. >Not a huge problem, but a real one. MSVC exported makefiles are not >designed to be maintained. I'm having good success with autoconf and Python >on other projects, but that would raise the barrier to including cygwin in >your build environment. > >Then-just-one-step-from-gcc ly, Just out of curiosity, what is it that MSVC adds to the picture over gcc anyway? Has anybody ever tried making a MinGW-only build of Python on Windows? From phil@riverbankcomputing.co.uk Wed May 7 00:57:10 2003 From: phil@riverbankcomputing.co.uk (Phil Thompson) Date: Wed, 7 May 2003 00:57:10 +0100 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <007e01c31405$1ea52fc0$21795418@dell1700> References: <007e01c31405$1ea52fc0$21795418@dell1700> Message-ID: <200305070057.10872.phil@riverbankcomputing.co.uk> On Tuesday 06 May 2003 8:24 pm, Brian Quinlan wrote: > > I can see the downside (next to no experience with 7.x, and perhaps > > none > > > before the final release). What's the upside? > > It's free and more standards compliant. This is what I'm struggling with. If it's free, why pay any attention to the offer of a donation of a GUI frontend? (With a certain amount of irony, I don't attach any value to a GUI frontend to a compiler.) If it is free using some Microsoft definition of the word (eg. users have to upgrade to Win2K, or some other "read the small print" reason) then my vote is -1. 
If it is really free then submit a PEP and factor it into the normal review/development process. I don't understand the apparent urgency. Phil From mhammond@skippinet.com.au Wed May 7 01:08:25 2003 From: mhammond@skippinet.com.au (Mark Hammond) Date: Wed, 7 May 2003 10:08:25 +1000 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <5.1.1.6.0.20030506194806.01fc05b0@telecommunity.com> Message-ID: <03f801c3142c$c8603290$530f8490@eden> > Just out of curiosity, what is it that MSVC adds to the > picture over gcc > anyway? Has anybody ever tried making a MinGW-only build of > Python on Windows? Now or then? . "Then" it was the simple matter of no gcc available for Windows. Now, it is a combination of no one driving it, and the simple fact that msvc will almost certainly generate better code and work with almost every library on Windows worth talking to. However, until the "no one driving it" part is solved, the latter, including the impact of mingw, won't be able to be measured. Mark. From gh@ghaering.de Wed May 7 01:31:04 2003 From: gh@ghaering.de (=?ISO-8859-1?Q?Gerhard_H=E4ring?=) Date: Wed, 07 May 2003 02:31:04 +0200 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <5.1.1.6.0.20030506194806.01fc05b0@telecommunity.com> References: <200305061826.h46IQ7605750@odiug.zope.com> <5.1.1.6.0.20030506194806.01fc05b0@telecommunity.com> Message-ID: <3EB853C8.70100@ghaering.de> Phillip J. Eby wrote: > Just out of curiosity, what is it that MSVC adds to the picture over gcc > anyway? Has anybody ever tried making a MinGW-only build of Python on > Windows? I'm working (as time and enthusiasm permits) on making this happen. For this project, I even got commit privileges by the powers that be :-) Getting as far as: C:\src\python\dist\src>python 'import site' failed; use -v for traceback Python 2.3a2+ (#27, Apr 23 2003, 21:13:49) [GCC 3.2.2 (mingw special 20030208-1)] on mingw32_nt-5.11 Type "help", "copyright", "credits" or "license" for more information. >>> isn't much of a problem.
This is a statically linked python.exe built with the autoconf-based build process, msys, mingw and my patches, mostly for posixmodule.c. The difficult part is figuring out the autoconf stuff and distutils, so that the rest of the modules can be built. I didn't get very far on this side, yet :-/ OTOH I'm pretty sure that a mingw build would be much easier if I just wrote my own Makefiles, but that's probably unlikely to ever be merged. At least that was my experience when making PostgreSQL's client code compile with mingw. Their answer was "we don't want to maintain yet another set of proprietary Makefiles", which is a good argument. -- Gerhard From tim.one@comcast.net Wed May 7 01:43:59 2003 From: tim.one@comcast.net (Tim Peters) Date: Tue, 06 May 2003 20:43:59 -0400 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <200305070046.06725.phil@riverbankcomputing.co.uk> Message-ID: [Phil Thompson, on http://www.msdn.microsoft.com/netframework/downloads/howtoget.aspx ] Follow the link, please. I haven't tried it myself, and you've already proved you can read too . From skip@pobox.com Wed May 7 01:52:13 2003 From: skip@pobox.com (Skip Montanaro) Date: Tue, 6 May 2003 19:52:13 -0500 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <3EB853C8.70100@ghaering.de> References: <200305061826.h46IQ7605750@odiug.zope.com> <5.1.1.6.0.20030506194806.01fc05b0@telecommunity.com> <3EB853C8.70100@ghaering.de> Message-ID: <16056.22717.877383.95261@montanaro.dyndns.org> Gerhard> OTOH I'm pretty sure that a mingw build would be much easier if Gerhard> I just wrote my own Makefiles, but that's probably unlikely to Gerhard> ever be merged. At least that was my experience when making Gerhard> PostgreSQL's client code compile with mingw. I suggest you go ahead with whatever is easiest for you. At least you will be able to focus on actually solving the MinGW-related problems. Others can chip in on the autoconf problems.
As a starter perhaps a Makefile.mingw file can be added to the PCBuild directory. At a later date the interim makefile can be removed to the attic. Skip From tim.one@comcast.net Wed May 7 02:17:49 2003 From: tim.one@comcast.net (Tim Peters) Date: Tue, 06 May 2003 21:17:49 -0400 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <200305070057.10872.phil@riverbankcomputing.co.uk> Message-ID: [Phil Thompson] > ... > If it's free, why pay any attention to the offer of a donation of a GUI > frontend? (With a certain amount of irony, I don't attach any > value to a GUI frontend to a compiler.) The GUI isn't just the compiler, it's also the automated dependency analysis, a make system, and a (very good) debugger. > If it is free using some Microsoft definition of the word (eg. > users have to upgrade to Win2K, or some other "read the small print" > reason) then my vote is -1. Guido asked who would want one. You don't, but you don't get to vote that nobody else does either. From BPettersen@NAREX.com Wed May 7 02:21:11 2003 From: BPettersen@NAREX.com (Bjorn Pettersen) Date: Tue, 6 May 2003 19:21:11 -0600 Subject: [Python-Dev] Windows installer request... Message-ID: <60FB8BB7F0EFC7409B75EEEC13E20192022DE2A8@admin56.narex.com> > From: Tim Peters [mailto:tim.one@comcast.net] > > [Bjorn Pettersen] [...] > > item: Get Environment Variable > > Variable=OSDRIVE > > Environment=SystemDrive > > Default=C: > > end [...] > Enough already : I don't have time to try > umpteen different things here, or really even one. Thank you for doing it anyway then . > What I did do is build an installer *just* removing the hard-coded > Wizard-generated "C:" prefix. Martin tried that and said it > worked for him. It doesn't hurt me. If it works for you too, > I'll commit the change: Works like a charm. Tested on Win98, Win2k, WinXP Pro (regular), and my "special" XP. (NT4 seems to have died a silent death, so I couldn't test it there...) > Please give that a try.
It's an incoherent mix of files, so > please use a junk name for the installation directory and program > startup group (or simply abort the install after you see whether > it suggested a drive you approve of). I went all the way through (all files seems to have gone in correctly), and as expected it shadowed my original install of 2.3b1 in the Add/Remove Programs window. Surprisingly however, the original came back after this one was removed. Who'd have thought.. ;-) [.. xx.wse needs the Wise GUI to create an installer..] Thought it might be that way... FWIW, re: the MSVC7 debate, the "Microsoft Development Environment" (DevStudio), comes with five different "Setup and Deployment projects". I've never used any of them, nor Wise (obviously :-), but it could potentially get you out of the loop... . Thanks again! -- bjorn From tim.one@comcast.net Wed May 7 02:23:55 2003 From: tim.one@comcast.net (Tim Peters) Date: Tue, 06 May 2003 21:23:55 -0400 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <3EB83C44.20706@v.loewis.de> Message-ID: [Tim] > Speaking of which, presumably Tcl/Tk and SSL and etc on Windows > should also be compiled under VC7 then. [Martin v. Löwis] > That is certainly the case (not to forget bsddb, zlib, and bzip2). > This will require quite some volunteer time. Amplifying a little, the Python code base required some changes before it would compile under VC 7 (I didn't make these changes, and don't recall any details apart from changes in MS's LONG_INTEGER APIs). There's no reason to believe that other code bases are immune from needing changes too. At present, we don't maintain any patches to any external code base in order to build the Windows release. If we needed to make changes to them for VC 7, that would probably change, and should really be done by the packages' primary (non-Python) maintainers.
From tim.one@comcast.net Wed May 7 02:40:49 2003 From: tim.one@comcast.net (Tim Peters) Date: Tue, 06 May 2003 21:40:49 -0400 Subject: [Python-Dev] Windows installer request... In-Reply-To: <60FB8BB7F0EFC7409B75EEEC13E20192022DE2A8@admin56.narex.com> Message-ID: [Tim] >> Enough already : I don't have time to try >> umpteen different things here, or really even one. [Bjorn Pettersen] > Thank you for doing it anyway then . You're welcome! I took the "C:" out on Monday, when I had just enough spare time to delete one byte, and took the rest out of sleep. > ... > Works like a charm. Tested on Win98, Win2k, WinXP Pro (regular), and my > "special" XP. (NT4 seems to have died a silent death, so I couldn't test > it there...) Thanks! I'll check it in ... Thursday. >> Please give that a try. It's an incoherent mix of files, so >> please use a junk name for the installation directory and program >> startup group (or simply abort the install after you see whether >> it suggested a drive you approve of). > I went all the way through (all files seems to have gone in correctly), > and as expected it shadowed my original install of 2.3b1 in the > Add/Remove Programs window. Surprisingly however, the original came back > after this one was removed. Who'd have thought.. ;-) The rollback features in Wise 8.14-generated installers are pretty good (esp. if you check the "make backups" option when installing). Uninstall/rollback will even restore start menu groups and file associations. I don't trust it enough to recommend it, though (I haven't really beat on it). Something fun to waste time: in the very last "Installation Completed!" install dialog, click "Cancel" instead of "Finish". It will then roll back all the changes it made, leaving things as they were before you started the installer. > ... FWIW, re: the MSVC7 debate, the "Microsoft Development Environment" > (DevStudio), comes with five different "Setup and Deployment projects".
> I've never used any of them, nor Wise (obviously :-), but it could > potentially get you out of the loop... Thanks, but I'm not sure even death has that kind of power. From dave@boost-consulting.com Wed May 7 03:16:25 2003 From: dave@boost-consulting.com (David Abrahams) Date: Tue, 06 May 2003 22:16:25 -0400 Subject: [Python-Dev] Re: MS VC 7 offer References: <008601c31414$a26d0120$21795418@dell1700> <3EB82CF8.4030005@v.loewis.de> Message-ID: "Martin v. Löwis" writes: > Brian Quinlan wrote: > >> Wouldn't this only affect extension modules using PyFile_FromFile and >> PyFile_AsFile? > > That might be the case. However, notice that there might be other > incompatibilities which we might discover by chance only - Microsoft > hasn't documented any of this. They pretty much told you the exact score, through me. More details are available if necessary. -- Dave Abrahams Boost Consulting www.boost-consulting.com From dave@boost-consulting.com Wed May 7 03:20:59 2003 From: dave@boost-consulting.com (David Abrahams) Date: Tue, 06 May 2003 22:20:59 -0400 Subject: [Python-Dev] Re: MS VC 7 offer References: <16056.2317.124886.963460@montanaro.dyndns.org> <007e01c31405$1ea52fc0$21795418@dell1700> Message-ID: Brian Quinlan writes: >> I can see the downside (next to no experience with 7.x, and perhaps >> none >> before the final release). What's the upside? > > It's free and more standards compliant. That compliance means a lot to C++ programmers. It takes MSVC from being a real PITA to do any serious C++ in (VC7.0 was worse than 6 in some ways) to being a first-class contender among quality C++ implementations.
I'm not sure whether that should have any effect on decisions made about Python development, though ;-) -- Dave Abrahams Boost Consulting www.boost-consulting.com From tim.one@comcast.net Wed May 7 03:42:29 2003 From: tim.one@comcast.net (Tim Peters) Date: Tue, 06 May 2003 22:42:29 -0400 Subject: [Python-Dev] Re: heaps In-Reply-To: Message-ID: [David Eppstein] >>> For fairness, it might be interesting to try another run of your test >>> in which the input sequence is sorted in increasing order rather >>> than random. [Tim] >> Comparing the worst case of one against the best case of the >> other isn't my idea of fairness, but sure. [David] > Well, it doesn't seem any fairer to use random data to compare an > algorithm with an average time bound that depends on an assumption of > randomness in the data...anyway, the point was more to understand the > limiting cases. If one algorithm is usually 3x faster than the other, > and is never more than 10x slower, that's better than being usually 3x > faster but sometimes 1000x slower, for instance. Sure. In practice you need to know time distributions when using an algorithm -- best, expected, worst, and how likely each are under a variety of expected conditions. > My Java KBest code was written to make data subsets for a half-dozen web > pages (same data selected according to different criteria). Of these > six instances, one is presented the data in roughly ascending order, one > in descending order, and the other four are less clear but probably not > random. > > Robustness in the face of this sort of variation is why I prefer any > average-case assumptions in my code's performance to depend only on > randomness from a random number generator, and not arbitrariness in the > actual input. But I'm not sure I'd usually be willing to pay a 3x > penalty for that robustness. Most people aren't, until they hit a bad case <0.5 wink>. So "pure" algorithms rarely survive in the face of a large variety of large problem instances.
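The robustness David describes, where the average-case guarantee rests on randomness the algorithm injects itself rather than on the order of the caller's input, can be sketched like this (a hypothetical illustration of the idea, not David's Java KBest):

```python
import random

def n_largest(seq, n):
    """Return the n largest items of seq, sorted ascending.

    Shuffling a copy first makes the expected running time depend only
    on our own random number generator, never on how the input happens
    to be ordered (ascending, descending, or adversarial).
    """
    items = list(seq)
    if n <= 0:
        return []
    if n >= len(items):
        return sorted(items)
    random.shuffle(items)            # inject our own randomness
    k = len(items) - n               # final index of the cutoff element
    lo, hi = 0, len(items) - 1
    while lo < hi:
        # Lomuto partition around items[hi]
        pivot = items[hi]
        store = lo
        for i in range(lo, hi):
            if items[i] < pivot:
                items[store], items[i] = items[i], items[store]
                store += 1
        items[store], items[hi] = items[hi], items[store]
        if store == k:
            break
        elif store < k:
            lo = store + 1
        else:
            hi = store - 1
    return sorted(items[k:])         # the n largest, in ascending order
```

The shuffle costs O(len(seq)) up front; whether that insurance is worth a constant-factor penalty is exactly the trade-off being weighed here.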
The monumental complications Python's list.sort() endures to work well under many conditions (both friendly and hostile) are a good example of that. In industrial real life, I expect an all-purpose N-Best queue would need to take a hybrid approach, monitoring its fast-path gimmick in some cheap way in order to fall back to a more defensive algorithm when the fast-path gimmick isn't paying.

>> Here's a surprise: I coded a variant of the quicksort-like
>> partitioning method, at the bottom of this mail. On the largest-1000
>> of a million random-float case, times were remarkably steady across
>> trials (i.e., using a different set of a million random floats each
>> time):
>>
>> heapq                  0.96 seconds
>> sort (micro-optimized) 3.4  seconds
>> KBest (below)          2.6  seconds

> Huh. You're almost convincing me that asymptotic analysis works even in > the presence of Python's compiled-vs-interpreted anomalies. Indeed, you can't fight the math! It often takes a large problem for better O() behavior to overcome a smaller constant in a worse O() approach, and especially in Python. For example, I once wrote and tuned and timed an O(N) worst-case rank algorithm in Python ("find the k'th smallest item in a sequence"), using the median-of-medians-of-5 business. I didn't have enough RAM at the time to create a list big enough for it to beat "seq.sort(); return seq[k]". By playing lots of tricks, and boosting it to median-of-medians-of-11, IIRC I eventually got it to run faster than sorting on lists with "just" a few hundred thousand elements. But in *this* case I'm not sure that the only thing we're really measuring isn't:

1. Whether an algorithm has an early-out gimmick.

2. How effective that early-out gimmick is.

and

3. How expensive it is to *try* the early-out gimmick.

The heapq method Rulz on random data because its answers then are "yes, very, dirt cheap".
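For reference, the heap-based N-best scheme being timed here can be sketched as follows (a minimal stand-in using the standard heapq module, not the actual benchmark code); the `x > heap[0]` comparison is the inline early-out test under discussion:

```python
import heapq

def n_best(seq, n):
    """Return the n largest items of seq in ascending order."""
    heap = []                        # min-heap holding the current n best
    for x in seq:
        if len(heap) < n:
            heapq.heappush(heap, x)
        elif x > heap[0]:            # early-out: one cheap comparison
            heapq.heapreplace(heap, x)   # drop current minimum, add x
    return sorted(heap)
```

On random data the test fails for almost every item once the heap has warmed up, so most items cost a single comparison: the "yes, very, dirt cheap" behavior described above.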
I wrote the KBest test like so:

    def three(seq, N):
        NBest = KBest(N, -1e200)
        for x in seq:
            NBest.put(x)
        L = NBest.get()
        L.sort()
        return L

(the sort at the end is just so the results can be compared against the other methods, to ensure they all get the same answer). If I break into the abstraction and change the test like so:

    def three(seq, N):
        NBest = KBest(N, -1e200)
        cutoff = -1e200
        for x in seq:
            if x > cutoff:
                NBest.put(x)
                cutoff = NBest.cutoff
        L = NBest.get()
        L.sort()
        return L

then KBest is about 20% *faster* than heapq on random data. Doing the comparison inline avoids a method call when early-out pays, early-out pays more and more as the sequence nears its end, and simply avoiding the method call then makes the overall algorithm 3X faster. So O() analysis may triumph when equivalent low-level speed tricks are played (the heapq method did its early-out test inline too), but get swamped before doing so.

> The other surprise is that (unlike, say, the sort or heapq versions)
> your KBest doesn't look significantly more concise than my earlier Java
> implementation.

The only thing I was trying to minimize was my time in whipping up something correct to measure. Still, I count 107 non-blank, non-comment lines of Java, and 59 of Python. Java gets unduly penalized for curly braces, Python for tedious tricks like

    buf = self.buf
    k = self.k

to make locals for speed, and that I never put dependent code on the same line as an "if" or "while" test (while you do). Note that it's not quite the same algorithm: the Python version isn't restricted to ints, and in particular doesn't assume it can do arithmetic on a key to get "the next larger" key. Instead it does 3-way partitioning to find the items equal to the pivot. The greater generality may make the Python a little windier. BTW, the heapq code isn't really more compact than C, if you count the implementation code in heapq.py too: it's all low-level small-int arithmetic and array indexing.
The only real advantage Python has over assembler for code like that is that we can grow the list/heap dynamically without any coding effort. From martin@v.loewis.de Wed May 7 06:21:46 2003 From: martin@v.loewis.de (Martin v. Löwis) Date: 07 May 2003 07:21:46 +0200 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: References: Message-ID: Tim Peters writes: > There's no reason to believe that other code bases are immune from > needing changes too. OTOH, there is reason to believe that for many of these packages, the required changes have been made already, at least for those that get regular updates (Tcl/Tk, bsddb). Regards, Martin From martin@v.loewis.de Wed May 7 06:23:31 2003 From: martin@v.loewis.de (Martin v. Löwis) Date: 07 May 2003 07:23:31 +0200 Subject: [Python-Dev] Re: MS VC 7 offer In-Reply-To: References: <008601c31414$a26d0120$21795418@dell1700> <3EB82CF8.4030005@v.loewis.de> Message-ID: David Abrahams writes: > They pretty much told you the exact score, through me. More details > are available if necessary. That is information about the core ABI. I do need to be concerned about changes in the libraries, as well, in particular about incompatibilities resulting from multiple copies of the C library. You said you don't know much about that. Regards, Martin From paoloinvernizzi@dmsware.com Wed May 7 08:12:34 2003 From: paoloinvernizzi@dmsware.com (Paolo Invernizzi) Date: Wed, 07 May 2003 09:12:34 +0200 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <03dc01c31424$26758500$530f8490@eden> References: <03dc01c31424$26758500$530f8490@eden> Message-ID: <3EB8B1E2.2050108@dmsware.com> Mark Hammond wrote: >Another thing to consider is the "make" environment. If we don't use >DevStudio, then presumably our existing project files will become useless. >Not a huge problem, but a real one. MSVC exported makefiles are not >designed to be maintained.
>I'm having good success with autoconf and Python >on other projects, but that would raise the barrier to including cygwin in >your build environment. > I think scons (www.scons.org) will have in its next release full support for building targets using VC6 *project* files, and full support for VC7. Actually it has support also for cygwin and mingw... So I think it is possible to have an automated way for building VC7 python based only on some scons script and VC6 project files... The possible goal is to keep working with the VC6 IDE as now, and have a simple build script able to automatically build the VC7 version tracking changes.. I've inserted Greg Spencer, who I know is working on this... surely he can bring us more details. --- Paolo Invernizzi. From phil@riverbankcomputing.co.uk Wed May 7 09:02:46 2003 From: phil@riverbankcomputing.co.uk (Phil Thompson) Date: Wed, 7 May 2003 09:02:46 +0100 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: References: Message-ID: <200305070902.46353.phil@riverbankcomputing.co.uk> On Wednesday 07 May 2003 2:17 am, Tim Peters wrote: > [Phil Thompson] > > > ... > > If it's free, why pay any attention to the offer of a donation of a GUI > > frontend? (With a certain amount of irony, I don't attach any > > value to a GUI frontend to a compiler.) > > The GUI isn't just the compiler, it's also the automated dependency > analysis, a make system, and a (very good) debugger. > > > If it is free using some Microsoft definition of the word (eg. > > users have to upgrade to Win2K, or some other "read the small print" > > reason) then my vote is -1. > > Guido asked who would want one. You don't, but you don't get to vote that > nobody else does either. That's not the point I'm trying to make. If there is a cost to *users* of a change then that change must be managed properly. The statement on Microsoft's web page says... "Non-developers need to install the .NET Framework 1.1 to run applications developed using the .NET Framework 1.1."
The impression I'm getting is that a quick switchover to VC 7 is being suggested - that's what I'm "voting" against. Phil From harri.pasanen@trema.com Wed May 7 09:31:08 2003 From: harri.pasanen@trema.com (Harri Pasanen) Date: Wed, 7 May 2003 10:31:08 +0200 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <3EB81A8B.9090603@v.loewis.de> References: <007f01c31409$3a5f9670$21795418@dell1700> <200305062212.07539.aleax@aleax.it> <3EB81A8B.9090603@v.loewis.de> Message-ID: <200305071031.08474.harri.pasanen@trema.com> On Tuesday 06 May 2003 22:26, Martin v. Löwis wrote: > Alex Martelli wrote: > > When we discussed VC versions (back when we met in Oxford during > > PythonUK/ACCU), David Abrahams seemed adamant that VC7 and VC6 > > are indeed compatible > > I doubt he said this in this generality: he surely knows that you > cannot mix C++ object files on the object file level between those > compilers, as they implement completely different ABIs. > > For Python, the biggest problem is that you cannot pass FILE* from > one C library to the other, because of some stupid locking test in > the C library. This does cause crashes when you try to use Python > extension modules compiled with the wrong compiler. > One known failure case from the real world is the OmniORB free CORBA ORB. The omniidl parser, which is implemented as a mixture of python and C++, requires that python is compiled with the same VC version as you are compiling OmniORB with. So if you are using VC7 to compile OmniORB, you cannot use the binary Python 2.2.2 from pythonlabs for it, you need to compile your own python using VC7. I believe it is the FILE * that is causing the problem here. If I recall correctly, the size of the underlying FILE struct is different in msvcrt.dll and msvcrt7.dll. I don't know the gory details, I just know the cure. This issue was also discussed on the omniORB mailing list. For our own product we have to support both VC6 and VC7.
For our development version we have actually imported python 2.3 to our CVS, and we are compiling it with VC7.1. Our previous release continues to rely on VC6, and Python 2.2.2, so each developer actually has both VC6 and VC7.1 installed on their machine, and correspondingly both python 2.2.2 and python 2.3. Just another datapoint. -Harri From sjoerd@acm.org Wed May 7 09:36:46 2003 From: sjoerd@acm.org (Sjoerd Mullender) Date: Wed, 07 May 2003 10:36:46 +0200 Subject: [Python-Dev] pyconfig.h not regenerated by "config.status --recheck" In-Reply-To: <16056.2688.72423.251200@montanaro.dyndns.org> References: <16055.57554.364845.689049@montanaro.dyndns.org> <16056.2688.72423.251200@montanaro.dyndns.org> Message-ID: <20030507083646.6305F74230@indus.ins.cwi.nl> On Tue, May 6 2003 Skip Montanaro wrote: > > >> This generated pyconfig.h. It would thus appear that config.status > >> shouldn't be used by developers. Apparently one of the other flags > >> it appends to the generated configure command suppresses generation > >> of pyconfig.h (and maybe other files). > > Martin> Can you find out whether this is related to the fact that you > Martin> are building in a separate build directory? > > I just confirmed that it's not related to the separate build directory. > When you run config.status --recheck it reruns your latest configure command > with the extra flags --no-create and --no-recursion. Without rummaging > around in the configure file my guess is the --no-create flag is the > culprit. > > So, a word to the wise: avoid config.status --recheck. I don't agree. Just run ./config.status without arguments after running ./config.status --recheck. That *will* regenerate all files.
-- Sjoerd Mullender From Paul.Moore@atosorigin.com Wed May 7 11:49:36 2003 From: Paul.Moore@atosorigin.com (Moore, Paul) Date: Wed, 7 May 2003 11:49:36 +0100 Subject: [Python-Dev] MS VC 7 offer Message-ID: <16E1010E4581B049ABC51D4975CEDB88619A64@UKDCX001.uk.int.atosorigin.com> From: Guido van Rossum [mailto:guido@python.org] > > If this means that those of us with VC6, and with no plans/reasons to > > upgrade can no longer build our own extensions, this would be a > > disaster. > Part of the offer was: > | Potentially we can even figure out how to enable anyone to > | build Python using the freely downloadable compilers I mentioned > | above... Which is good news (don't get me wrong, I'm glad to see Microsoft supporting open source projects in this way). But wouldn't that imply unoptimised builds? I just checked:

>cl /O2
Microsoft (R) 32-bit C/C++ Standard Compiler Version 13.00.9466 for 80x86
Copyright (C) Microsoft Corporation 1984-2001. All rights reserved.
cl : Command line warning D4029 : optimization is not available in the standard edition compiler

So, specifically, if PythonLabs releases Python 2.3 built with MSVC7, and I want to build the latest version of PIL (maybe because Fredrik hasn't released a binary version yet), do I have no way of getting an optimised build (I pick PIL deliberately, because I guess that image processing would benefit from optimisation, and in the past, PIL binaries have been relatively hard to obtain at times)? That's the problem I see, personally. I have VC6 because my employer uses Visual Studio for Visual Basic development. But VB has changed so much in the transition to .NET, that I don't believe they will ever go to VS7. So I will have to remain with VS6 (I'm never going to buy VS7 myself, just for this sort of job). Paul.
From mhammond@skippinet.com.au Wed May 7 11:52:21 2003 From: mhammond@skippinet.com.au (Mark Hammond) Date: Wed, 07 May 2003 20:52:21 +1000 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <200305070902.46353.phil@riverbankcomputing.co.uk> Message-ID: <046701c31486$bdac6170$530f8490@eden> > "Non-developers need to install the .NET Framework 1.1 to run > applications developed using the .NET Framework 1.1." MSVC7 is not the .NET framework. Let's just relax a little and have some faith in the people making these decisions. Mark. From mhammond@skippinet.com.au Wed May 7 11:55:48 2003 From: mhammond@skippinet.com.au (Mark Hammond) Date: Wed, 7 May 2003 20:55:48 +1000 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <16E1010E4581B049ABC51D4975CEDB88619A64@UKDCX001.uk.int.atosorigin.com> Message-ID: <046a01c31487$399d3390$530f8490@eden> > That's the problem I see, personally. I have VC6 because my > employer uses Visual Studio for Visual Basic development. > But VB has changed so much in > the transition to .NET, that I don't believe they will ever > going to VS7. So I will have to remain with VS6 (I'm never > going to buy VS7 myself, just for this sort of job). I must say that anecdotally, I find this to be true. Developers are *not* flocking to VC7. I wonder if that fact has anything to do with MS offering free compilers? Maybe we could get 100 free versions out of them. Mark.
From dave@boost-consulting.com Wed May 7 12:06:22 2003 From: dave@boost-consulting.com (David Abrahams) Date: Wed, 07 May 2003 07:06:22 -0400 Subject: [Python-Dev] Re: MS VC 7 offer In-Reply-To: (Martin v. Löwis's message of "07 May 2003 07:23:31 +0200") References: <008601c31414$a26d0120$21795418@dell1700> <3EB82CF8.4030005@v.loewis.de> Message-ID: martin@v.loewis.de (Martin v. Löwis) writes: > David Abrahams writes: > >> They pretty much told you the exact score, through me. More details >> are available if necessary. > > That is information about the core ABI. I do need to be concerned > about changes in the libraries, as well, in particular about > incompatibilities resulting from multiple copies of the C library. You > said you don't know much about that. I can find out almost as easily, if you have specific questions.
Just let me know, -- Dave Abrahams Boost Consulting www.boost-consulting.com From mwh@python.net Wed May 7 12:31:58 2003 From: mwh@python.net (Michael Hudson) Date: Wed, 07 May 2003 12:31:58 +0100 Subject: [Python-Dev] pyconfig.h not regenerated by "config.status --recheck" In-Reply-To: <16056.2688.72423.251200@montanaro.dyndns.org> (Skip Montanaro's message of "Tue, 6 May 2003 14:18:24 -0500") References: <16055.57554.364845.689049@montanaro.dyndns.org> <16056.2688.72423.251200@montanaro.dyndns.org> Message-ID: <2m65on6lht.fsf@starship.python.net> Skip Montanaro writes: > So, a word to the wise: avoid config.status --recheck. I don't know if I'm wise or not but I do tend to go for rm -rf build && mkdir build && cd build && ../configure -q && make -s for most rebuilds... I guess I should trust my tools a bit more. Cheers, M. -- The meaning of "brunch" is as yet undefined. -- Simon Booth, ucam.chat From skip@pobox.com Wed May 7 12:42:21 2003 From: skip@pobox.com (Skip Montanaro) Date: Wed, 7 May 2003 06:42:21 -0500 Subject: [Python-Dev] pyconfig.h not regenerated by "config.status --recheck" In-Reply-To: <2m65on6lht.fsf@starship.python.net> References: <16055.57554.364845.689049@montanaro.dyndns.org> <16056.2688.72423.251200@montanaro.dyndns.org> <2m65on6lht.fsf@starship.python.net> Message-ID: <16056.61725.602991.181703@montanaro.dyndns.org> >> So, a word to the wise: avoid config.status --recheck. Michael> I don't know if I'm wise or not but I do tend to go for Michael> rm -rf build && mkdir build && cd build && ../configure -q && make -s Michael> for most rebuilds... I guess I should trust my tools a bit Michael> more. I got in the habit of using config.status --recheck because it allowed me to only remember a single configure-like command for most packages I build/install using configure. I only had to figure out what flags to pass to configure once, then later typing "C-r rech" in bash was sufficient to reconfigure the package.
It would be nice if config.status had a flag which actually executed configure without the --no-create and --no-recursion flags. Someone mentioned invoking config.status without the --recheck flag. I don't think that's wise in a development environment since that doesn't actually run configure. Since we're talking about building Python in a development environment, I find it hard to believe you'd want to skip configure altogether. Skip From Jack.Jansen@cwi.nl Wed May 7 14:08:44 2003 From: Jack.Jansen@cwi.nl (Jack Jansen) Date: Wed, 7 May 2003 15:08:44 +0200 Subject: [Python-Dev] bsddb185 module changes checked in In-Reply-To: <16056.8215.274307.904009@montanaro.dyndns.org> Message-ID: <085D82A5-808D-11D7-A6E2-0030655234CE@cwi.nl> On Tuesday, May 6, 2003, at 22:50 Europe/Amsterdam, Skip Montanaro wrote: > > The various bits necessary to implement the "build bsddb185 when > appropriate" have been checked in. I'm pretty sure I don't have the > best > possible test for the existence of a db library, but it will have to > do for > now. I suspect others can clean it up later during the beta cycle. > The > current detection code in setup.py should work for Nick on OSF/1 and > for > platforms which don't require a separate db library. > > I'd appreciate some extra pounding on this code. On SGI Irix 6.5 (MIPSpro Compilers: Version 7.2.1) it tries to build it, and fails. It complains about "u_int" and such not being defined. There's magic at the top of /usr/include/db.h for defining various types optionally, and that's as far as my understanding went. 
-- Jack Jansen, , http://www.cwi.nl/~jack If I can't dance I don't want to be part of your revolution -- Emma Goldman From andymac@bullseye.apana.org.au Wed May 7 10:18:41 2003 From: andymac@bullseye.apana.org.au (Andrew MacIntyre) Date: Wed, 7 May 2003 20:18:41 +1100 (edt) Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <3EB853C8.70100@ghaering.de> Message-ID: On Wed, 7 May 2003, Gerhard Häring wrote: > OTOH I'm pretty sure that a mingw build would be much easier if I just > wrote my own Makefiles, but that's probably unlikely to ever be merged. I'm maintaining the EMX port in a subdirectory of the PC directory (in CVS), and it is (basically) the way the MSVC build is being maintained - if you consider Visual Studio project files as abstract makefiles. -- Andrew I MacIntyre "These thoughts are mine alone..." E-mail: andymac@bullseye.apana.org.au | Snail: PO Box 370 andymac@pcug.org.au | Belconnen ACT 2616 Web: http://www.andymac.org/ | Australia From duncan@rcp.co.uk Wed May 7 14:30:14 2003 From: duncan@rcp.co.uk (Duncan Booth) Date: Wed, 7 May 2003 14:30:14 +0100 Subject: [Python-Dev] Microsoft speedup Message-ID: I was just playing around with the compiler options using Microsoft VC6 and I see that adding the option /Ob2 speeds up pystone by about 2.5% (/Ob2 is the option to automatically inline functions where the compiler thinks it is worthwhile.) The downside is that it increases the size of python23.dll by about 13%. It's not a phenomenal speedup, but it should be pretty low impact if the extra size is considered a worthwhile tradeoff. I haven't checked yet with VC7, but the compiler options are the same so the effect should also be similar. -- Duncan Booth duncan@rcp.co.uk int month(char *p){return(124864/((p[0]+p[1]-p[2]&0x1f)+1)%12)["\5\x8\3" "\6\7\xb\1\x9\xa\2\0\4"];} // Who said my code was obscure?
From sjoerd@acm.org Wed May 7 15:14:19 2003 From: sjoerd@acm.org (Sjoerd Mullender) Date: Wed, 07 May 2003 16:14:19 +0200 Subject: [Python-Dev] pyconfig.h not regenerated by "config.status --recheck" In-Reply-To: <16056.61725.602991.181703@montanaro.dyndns.org> References: <16055.57554.364845.689049@montanaro.dyndns.org> <16056.2688.72423.251200@montanaro.dyndns.org> <2m65on6lht.fsf@starship.python.net> <16056.61725.602991.181703@montanaro.dyndns.org> Message-ID: <20030507141419.B87AA74230@indus.ins.cwi.nl> On Wed, May 7 2003 Skip Montanaro wrote: > > >> So, a word to the wise: avoid config.status --recheck. > > Michael> I don't know if I'm wise or not but I do tend to go for > > Michael> rm -rf build && mkdir build && cd build && ../configure -q && make -s > > Michael> for most rebuilds... I guess I should trust my tools a bit > Michael> more. > > I got in the habit of using config.status --recheck because it allowed me to > only remember a single configure-like command for most packages I > build/install using configure. I only had to figure out what flags to pass > to configure once, then later typing "C-r rech" in bash was sufficient to > reconfigure the package. It would be nice if config.status had a flag which > actually executed configure without the --no-create and --no-recursion > flags. > > Someone mentioned invoking config.status without the --recheck flag. I > don't think that's wise in a development environment since that doesn't > actually run configure. Since we're talking about building Python in a > development environment, I find it hard to believe you'd want to skip > configure altogether. I mentioned that. But I also said to do that after running with the --recheck flag. In fact, I use the bit

Makefile: Makefile.in config.h.in config.status
	./config.status

config.status: configure
	./config.status --recheck

in some of my makefiles. I just type "make Makefile" and it does all it needs to do.
-- Sjoerd Mullender From skip@pobox.com Wed May 7 15:24:46 2003 From: skip@pobox.com (Skip Montanaro) Date: Wed, 7 May 2003 09:24:46 -0500 Subject: [Python-Dev] odd interpreter feature Message-ID: <16057.5934.556547.671279@montanaro.dyndns.org> I was editing the tutorial just now and noticed the secondary prompt (...) in a situation where I didn't think it was appropriate: >>> # The argument of repr() may be any Python object: ... repr(x, y, ('spam', 'eggs')) "(32.5, 40000, ('spam', 'eggs'))" It's caused by the trailing colon at the end of the comment. I verified it using current CVS: >>> hello = 'hello, world\n' hellos = repr(hello) print hellos 'hello, world\n' >>> # hello: ... >>> Shouldn't the trailing colon be ignored in comments? Bug, feature or wart? Skip From mwh@python.net Wed May 7 15:37:37 2003 From: mwh@python.net (Michael Hudson) Date: Wed, 07 May 2003 15:37:37 +0100 Subject: [Python-Dev] odd interpreter feature In-Reply-To: <16057.5934.556547.671279@montanaro.dyndns.org> (Skip Montanaro's message of "Wed, 7 May 2003 09:24:46 -0500") References: <16057.5934.556547.671279@montanaro.dyndns.org> Message-ID: <2mwuh26cwe.fsf@starship.python.net> Skip Montanaro writes: > I was editing the tutorial just now and noticed the secondary prompt (...) > in a situation where I didn't think it was appropriate: > > >>> # The argument of repr() may be any Python object: > ... repr(x, y, ('spam', 'eggs')) > "(32.5, 40000, ('spam', 'eggs'))" > > It's caused by the trailing colon at the end of the comment. Python 2.3b1+ (#1, May 6 2003, 18:00:11) [GCC 2.96 20000731 (Red Hat Linux 7.2 2.96-112.7.2)] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> # no it's not ... Cheers, M. -- The Internet is full. Go away. -- http://www.disobey.com/devilshat/ds011101.htm From amk@amk.ca Wed May 7 15:35:39 2003 From: amk@amk.ca (A.M. 
Kuchling) Date: Wed, 07 May 2003 10:35:39 -0400 Subject: [Python-Dev] Re: odd interpreter feature In-Reply-To: <16057.5934.556547.671279@montanaro.dyndns.org> References: <16057.5934.556547.671279@montanaro.dyndns.org> Message-ID: Skip Montanaro wrote: > It's caused by the trailing colon at the end of the comment. No, it's just the comment. >>> # hello ... print 'foo' foo >>> --amk From tim.one@comcast.net Wed May 7 15:42:22 2003 From: tim.one@comcast.net (Tim Peters) Date: Wed, 07 May 2003 10:42:22 -0400 Subject: [Python-Dev] odd interpreter feature In-Reply-To: <16057.5934.556547.671279@montanaro.dyndns.org> Message-ID: [Skip Montanaro] > I was editing the tutorial just now and noticed the secondary prompt (...) > in a situation where I didn't think it was appropriate: > > >>> # The argument of repr() may be any Python object: > ... repr(x, y, ('spam', 'eggs')) > "(32.5, 40000, ('spam', 'eggs'))" > > It's caused by the trailing colon at the end of the comment. I > verified it using current CVS: > > >>> hello = 'hello, world\n' hellos = repr(hello) print hellos > 'hello, world\n' > >>> # hello: > ... > >>> > > Shouldn't the trailing colon be ignored in comments? Bug, > feature or wart? This changed at some very early point in Python's life. I don't think the trailing colon is relevant: >>> 1+2 3 >>> # hello ... >>> From skip@pobox.com Wed May 7 15:51:10 2003 From: skip@pobox.com (Skip Montanaro) Date: Wed, 7 May 2003 09:51:10 -0500 Subject: [Python-Dev] odd interpreter feature In-Reply-To: <2mwuh26cwe.fsf@starship.python.net> References: <16057.5934.556547.671279@montanaro.dyndns.org> <2mwuh26cwe.fsf@starship.python.net> Message-ID: <16057.7518.148868.168522@montanaro.dyndns.org> >>> # no it's not ... Damn, thanks... I guess the question still remains though, should the secondary prompt be issued after a comment? Skip From fdrake@acm.org Wed May 7 15:55:45 2003 From: fdrake@acm.org (Fred L. Drake, Jr.) 
Date: Wed, 7 May 2003 10:55:45 -0400 Subject: [Python-Dev] odd interpreter feature In-Reply-To: References: <16057.5934.556547.671279@montanaro.dyndns.org> Message-ID: <16057.7793.975960.566995@grendel.zope.com> Tim Peters writes: > This changed at some very early point in Python's life. I don't think the > trailing colon is relevant: > > >>> 1+2 > 3 > >>> # hello > ... > >>> I think this is also a point on which Python and Jython differ, but I don't have Jython installed anywhere nearby to test with. -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From guido@python.org Wed May 7 16:02:21 2003 From: guido@python.org (Guido van Rossum) Date: Wed, 07 May 2003 11:02:21 -0400 Subject: [Python-Dev] odd interpreter feature In-Reply-To: "Your message of Wed, 07 May 2003 09:24:46 CDT." <16057.5934.556547.671279@montanaro.dyndns.org> References: <16057.5934.556547.671279@montanaro.dyndns.org> Message-ID: <200305071502.h47F2LK03176@pcp02138704pcs.reston01.va.comcast.net> > I was editing the tutorial just now and noticed the secondary prompt (...) > in a situation where I didn't think it was appropriate: > > >>> # The argument of repr() may be any Python object: > ... repr(x, y, ('spam', 'eggs')) > "(32.5, 40000, ('spam', 'eggs'))" > > It's caused by the trailing colon at the end of the comment. I verified it > using current CVS: > > >>> hello = 'hello, world\n' hellos = repr(hello) print hellos > 'hello, world\n' > >>> # hello: > ... > >>> > > Shouldn't the trailing colon be ignored in comments? Bug, feature or wart? It's not the trailing colon. Any line that consists of only a comment does this: >>> >>> # foo ... >>> # foo ... >>> 12 # foo 12 >>> And yes, it's a wart, but I don't know how to fix it. 
--Guido van Rossum (home page: http://www.python.org/~guido/) From thomas@xs4all.net Wed May 7 16:16:20 2003 From: thomas@xs4all.net (Thomas Wouters) Date: Wed, 7 May 2003 17:16:20 +0200 Subject: [Python-Dev] odd interpreter feature In-Reply-To: <16057.7793.975960.566995@grendel.zope.com> References: <16057.5934.556547.671279@montanaro.dyndns.org> <16057.7793.975960.566995@grendel.zope.com> Message-ID: <20030507151620.GI26254@xs4all.nl> On Wed, May 07, 2003 at 10:55:45AM -0400, Fred L. Drake, Jr. wrote: > Tim Peters writes: > > This changed at some very early point in Python's life. I don't think the > > trailing colon is relevant: > > > > >>> 1+2 > > 3 > > >>> # hello > > ... > > >>> > I think this is also a point on which Python and Jython differ, but I > don't have Jython installed anywhere nearby to test with. I do: debian:~ > jython Jython 2.1 on java1.1.8 (JIT: null) Type "copyright", "credits" or "license" for more information. >>> 1+2 3 >>> # hello >>> ^D (This is why I use Debian... 'apt-get install jython' :-) -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From skip@pobox.com Wed May 7 16:16:29 2003 From: skip@pobox.com (Skip Montanaro) Date: Wed, 7 May 2003 10:16:29 -0500 Subject: [Python-Dev] odd interpreter feature In-Reply-To: <200305071502.h47F2LK03176@pcp02138704pcs.reston01.va.comcast.net> References: <16057.5934.556547.671279@montanaro.dyndns.org> <200305071502.h47F2LK03176@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <16057.9037.913362.225855@montanaro.dyndns.org> Guido> And yes, it's a wart, but I don't know how to fix it. I did a little digging and noticed this comment dating from v 2.5 (Jul 91): /* Lines with only whitespace and/or comments shouldn't affect the indentation and are not passed to the parser as NEWLINE tokens, except *totally* empty lines in interactive mode, which signal the end of a command group. 
*/ Not surprisingly, given the age of the change, your fingerprints are all over it. ;-) I suspect if the code beneath that comment was executed only when the indentation level is zero we'd be okay, but I don't know if the tokenizer has that sort of information available. I'll do a little more poking around. Skip From akuchlin@mems-exchange.org Wed May 7 16:19:16 2003 From: akuchlin@mems-exchange.org (Andrew Kuchling) Date: Wed, 07 May 2003 11:19:16 -0400 Subject: [Python-Dev] Relying on ReST in the core? Message-ID: For PEP 314, it's been suggested that the Description field be written in RestructuredText. This change doesn't affect the Distutils code, because the Distutils just takes this field and copies it into an output file; programs using the metadata defined in PEP 314 would have to be able to process ReST, though. I know the plan is to eventually add ReST/docutils to the standard library, and that this isn't happening for Python 2.3. Question: is it OK to make something in the core implicitly depend on ReST before ReST is in the core? Until docutils is added, there's always the risk that we decide to never add ReST to the core, or ReST 2.0 changes the format completely, or we decide XYZ is much better, or something like that. --amk (www.amk.ca) IAGO: Poor and content is rich and rich enough. -- _Othello_, III, iii From guido@python.org Wed May 7 16:32:39 2003 From: guido@python.org (Guido van Rossum) Date: Wed, 07 May 2003 11:32:39 -0400 Subject: [Python-Dev] Relying on ReST in the core? In-Reply-To: "Your message of Wed, 07 May 2003 11:19:16 EDT." References: Message-ID: <200305071532.h47FWdX03514@pcp02138704pcs.reston01.va.comcast.net> > For PEP 314, it's been suggested that the Description field > be written in RestructuredText. This change doesn't affect the > Distutils code, because the Distutils just takes this field and > copies it into an output file; programs using the metadata defined > in PEP 314 would have to be able to process ReST, though. 
> > I know the plan is to eventually add ReST/docutils to the standard > library, and that this isn't happening for Python 2.3. Question: is > it OK to make something in the core implicitly depend on ReST before > ReST is in the core? Until docutils is added, there's always the risk > that we decide to never add ReST to the core, or ReST 2.0 changes the > format completely, or we decide XYZ is much better, or something like > that. I think it's okay to make this a recommendation, with the suggestion to be conservative in using reST features. Since a description is usually only a paragraph long, I think that should be okay. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Wed May 7 16:33:31 2003 From: guido@python.org (Guido van Rossum) Date: Wed, 07 May 2003 11:33:31 -0400 Subject: [Python-Dev] odd interpreter feature In-Reply-To: "Your message of Wed, 07 May 2003 10:16:29 CDT." <16057.9037.913362.225855@montanaro.dyndns.org> References: <16057.5934.556547.671279@montanaro.dyndns.org> <200305071502.h47F2LK03176@pcp02138704pcs.reston01.va.comcast.net> <16057.9037.913362.225855@montanaro.dyndns.org> Message-ID: <200305071533.h47FXVf03526@pcp02138704pcs.reston01.va.comcast.net> > Guido> And yes, it's a wart, but I don't know how to fix it. > > I did a little digging and noticed this comment dating from v 2.5 (Jul 91): > > /* Lines with only whitespace and/or comments > shouldn't affect the indentation and are > not passed to the parser as NEWLINE tokens, > except *totally* empty lines in interactive > mode, which signal the end of a command group. */ > > Not surprisingly, given the age of the change, your fingerprints are all > over it. ;-) > > I suspect if the code beneath that comment was executed only when the > indentation level is zero we'd be okay, but I don't know if the tokenizer > has that sort of information available. I'll do a little more poking > around. Please do. 
The indentation level should be easily available, since it is computed by the tokenizer. --Guido van Rossum (home page: http://www.python.org/~guido/) From skip@pobox.com Wed May 7 17:15:35 2003 From: skip@pobox.com (Skip Montanaro) Date: Wed, 7 May 2003 11:15:35 -0500 Subject: [Python-Dev] odd interpreter feature In-Reply-To: <200305071533.h47FXVf03526@pcp02138704pcs.reston01.va.comcast.net> References: <16057.5934.556547.671279@montanaro.dyndns.org> <200305071502.h47F2LK03176@pcp02138704pcs.reston01.va.comcast.net> <16057.9037.913362.225855@montanaro.dyndns.org> <200305071533.h47FXVf03526@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <16057.12583.500034.130135@montanaro.dyndns.org> Guido> Please do. The indentation level should be easily available, Guido> since it is computed by the tokenizer. Alas, it's more complicated than just the indentation level of the current line. I need to know if the previous line was indented, which I don't think the tokenizer knows (at least examining *tok in gdb under various conditions suggests it doesn't). I see the following possible cases (there are perhaps more, but I think they are similar enough to ignore here): >>> if x == y: ... # hello ... pass ... >>> if x == y: ... x = 1 ... # hello ... pass ... >>> x = 1 >>> # hello ... >>> Only the last case should display the primary prompt after the comment is entered. The other two correctly display the secondary prompt. It's distinguishing the second and third cases in the tokenizer without help from the parser that's the challenge. Oh well. Perhaps it's a wart best left alone. Skip From brian@sweetapp.com Wed May 7 18:02:19 2003 From: brian@sweetapp.com (Brian Quinlan) Date: Wed, 07 May 2003 10:02:19 -0700 Subject: [Python-Dev] Re: MS VC 7 offer In-Reply-To: Message-ID: <010501c314ba$6b8dbef0$21795418@dell1700> > > That is information about the core ABI. 
I do need to be concerned
> > about changes in the libraries, as well, in particular about
> > incompatibilities resulting from multiple copies of the C library.
> > You said you don't know much about that.
>
> I can find out almost as easily, if you have specific questions.

But the actual question that we would like to answer is quite broad: what are all of the possible compatibility problems associated with using a VC6 compiled DLL with a VC7 compiled application?

Assuming that only changed runtime data structures are going to be a problem, knowing which ones cannot be passed between the two versions would be nice. Below is a list of the standard types defined by Microsoft's VC6 runtime library (taken from the VC6 docs):

clock_t
_complex
_dev_t
div_t, ldiv_t
_exception
FILE
_finddata_t, _wfinddata_t, _wfinddatai64_t
_FPIEEE_RECORD
fpos_t
_HEAPINFO
jmp_buf
lconv
_off_t
_onexit_t
_PNH
ptrdiff_t
sig_atomic_t
size_t
_stat
time_t
_timeb
tm
_utimbuf
va_list
wchar_t
wctrans_t
wctype_t
wint_t

Cheers, Brian

From greg_spencer@acm.org Wed May 7 17:58:03 2003 From: greg_spencer@acm.org (Greg Spencer) Date: Wed, 7 May 2003 10:58:03 -0600 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <3EB8B1E2.2050108@dmsware.com> Message-ID:

Well, I'm almost done with the SCons integration for both VC6 and VC7. Just some tests to write and integration into the current codeline to do. Paolo, I'm not sure what you mean by "full" support for VC7, but here's what I'm working on:

1) SCons writes out and maintains (as a "product" of the build) a .dsp and .dsw file for VC6, or an .sln and .vcproj file for VC7.

2) The project and solution files contain "External Makefile" targets, which in MSVC means that it will launch an external command when the "build" button is pressed.

3) The project files contain all of the sources configured in the SCons file, and you can include as many additional files as you would like.
The SConscript file that generated the .dsp or .vcproj file is automatically included in the source list so you can edit it from the IDE. With this scheme, you can browse the class hierarchy, edit resource files, build the project, double-click on errors (if any :-), edit source files from the IDE, launch the executable (if any) in the debugger, lather, rinse, repeat. The build is then completely controlled by the Python SConscripts, with the full flexibility that offers, and the project files are now just products of the build that will be blown away and regenerated any time they need to be rebuilt. The only things I've discovered that you can't do with this scheme are insert ActiveX controls (because the menu items are disabled) and build individual object files. At first glance, it seems like the logical choice for VS integration is to build a plugin to Visual Studio, but for VS6, there aren't really enough trigger events to capture the appropriate information at the right times, so it's not really feasible. For VS7, I think things are much more promising in the plugin department, but truthfully, I'm not sure there's much added value. You could insert new ActiveX controls with the wizard and build individual files, sure. But do you really want to change build settings from within the IDE's dialogs? I haven't really decided how this would even work. Probably you'd need a third configuration file that both the VS7 tool and the SConscript could share so that they could get their build setting information. Yet another config file, and now you'd have to keep the .sln and .vcproj files too, making a total of four files that control the build. They'd be in sync, but one file is always better than four. Also, this only works for VS7, and it's complex. I'm still considering a VS7 plugin as a possible future direction, but I need some compelling reasons to do it. 
I've used the "External Makefile" scheme with classic Cons for four years now, and I haven't had any major complaints from anyone -- they're just overjoyed that their build is automated and "just works", and they can still use the IDE for 90% of what they used it for before. Not to mention all the benefits of using a build system like SCons (centralized setting of build parameters for all projects, for instance). I hope that addresses your needs. If you have suggestions or questions, feel free to e-mail me. BTW, I don't subscribe to python-dev, so be sure to CC me in this thread. -Greg. P.S. Thanks for creating a language that a Perl guy can learn in a week. And I thought shifting from classic cons to scons would be hard... :-) -----Original Message----- From: Paolo Invernizzi [mailto:paoloinvernizzi@dmsware.com] Sent: Wednesday, May 07, 2003 1:13 AM To: python-dev@python.org Cc: Mark Hammond; greg_spencer@acm.org Subject: Re: [Python-Dev] MS VC 7 offer Mark Hammond wrote: >Another thing to consider is the "make" environment. If we don't use >DevStudio, then presumably our existing project files will become useless. >Not a huge problem, but a real one. MSVC exported makefiles are not >designed to be maintained. I'm having good success with autoconf and Python >on other projects, but that would raise the barrier to including cygwin in >your build environment. > I think the scons (www.scons.org) will have in its next release full support for building targets using VC6 *project* file, and full support for VC7. Actually it has support also for cygwin and mingw... So I think is possible to have an automated way for building VC7 python based only on some scons script and VC6 project files... The possible goal is to keep working with VC6 IDE as now, and have a simple build script able to automatically build the VC7 version tracking changes.. I've inserted Greg Spencer, who I know is working on this... surely he can bring us more details. --- Paolo Invernizzi. 
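The scheme Greg describes keeps the IDE project files as disposable build products; the build itself is just an ordinary SCons description. As a rough sketch only (the file and target names here are invented, and this is not Greg's actual integration code), the SCons side can be as small as:

```python
# SConstruct -- hypothetical minimal build.  In Greg's scheme, the
# generated .dsp/.vcproj carries an "External Makefile" target that
# simply re-runs `scons` against this file when the IDE's build
# button is pressed.
env = Environment()            # picks up the installed MSVC toolchain
env.Program('myapp', ['main.c', 'util.c'])
```

Because the project files are regenerated from this one description, build settings live in exactly one place, while the IDE keeps class browsing, error navigation, and the debugger.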
From jepler@unpythonic.net Wed May 7 18:06:18 2003 From: jepler@unpythonic.net (Jeff Epler) Date: Wed, 7 May 2003 12:06:18 -0500 Subject: [Python-Dev] Startup time In-Reply-To: References: <200305061207.h46C7sj26462@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <20030507170618.GI27125@unpythonic.net>

On Tue, May 06, 2003 at 07:35:40PM +0200, Martin v. Löwis wrote:
> That would be easy to determine: Just disable the block
>
> #if defined(Py_USING_UNICODE) && defined(HAVE_LANGINFO_H) && defined(CODESET)
>
> in pythonrun.c, and see whether it changes anything. To my knowledge,
> this is the only cause of loading encodings during startup on Unix.

With this change, I typically see

real 0m0.020s
user 0m0.020s
sys 0m0.000s

instead of

real 0m0.022s
user 0m0.020s
sys 0m0.000s

The number of successful open()s decreases, but not by much:

# before change
$ strace -e open ./python-2.3 -S -c pass 2>&1 | grep -v ENOENT | wc -l
46
# after change
$ strace -e open ./python -S -c pass 2>&1 | grep -v ENOENT | wc -l
39

What about this line? It seems to be the cause of a bunch of imports, including the sre stuff:

/* pythonrun.c */
PyModule_WarningsModule = PyImport_ImportModule("warnings");

Jeff

From patmiller@llnl.gov Wed May 7 18:25:57 2003 From: patmiller@llnl.gov (Pat Miller) Date: Wed, 07 May 2003 10:25:57 -0700 Subject: [Python-Dev] odd interpreter feature Message-ID: <3EB941A5.5070003@llnl.gov>

Skip writes:
> >>> # hello:
> ...
> >>>
>
> Shouldn't the trailing colon be ignored in comments? Bug, feature or wart?

I figured it was a feature... Taking the view that any source block asks for continuations seemed natural, so I assumed Guido intended it that way ;-) If the comments were active objects (like doc strings), then it would be the desired association.

>>> # About to do something tricky
...
tricky()
>>>

Pat

-- Patrick Miller | (925) 423-0309 | http://www.llnl.gov/CASC/people/pmiller All you need in this life is ignorance and confidence, and then success is sure. -- Mark Twain

From martin@v.loewis.de Wed May 7 18:48:41 2003 From: martin@v.loewis.de (Martin v. Löwis) Date: 07 May 2003 19:48:41 +0200 Subject: [Python-Dev] bsddb185 module changes checked in In-Reply-To: <085D82A5-808D-11D7-A6E2-0030655234CE@cwi.nl> References: <085D82A5-808D-11D7-A6E2-0030655234CE@cwi.nl> Message-ID:

Jack Jansen writes:
> On SGI Irix 6.5 (MIPSpro Compilers: Version 7.2.1) it tries to build
> it, and fails. It complains about "u_int" and such not being
> defined. There's magic at the top of /usr/include/db.h for defining
> various types optionally, and that's as far as my understanding
> went.

I would not be worried about that too much. An Irix user who cares about that will propose a solution, if there are Irix users who care about that.

Regards, Martin

From greg_spencer@acm.org Wed May 7 19:25:11 2003 From: greg_spencer@acm.org (Greg Spencer) Date: Wed, 7 May 2003 12:25:11 -0600 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <3EB8B1E2.2050108@dmsware.com> Message-ID:

Actually, on re-reading your mail, I realize that you might just be talking about getting VC7 to work well with SCons (since it currently only knows about how to find VC6). I've got that part done, and it'll be in with the project file stuff.

-Greg.

-----Original Message----- From: Paolo Invernizzi [mailto:paoloinvernizzi@dmsware.com] Sent: Wednesday, May 07, 2003 1:13 AM To: python-dev@python.org Cc: Mark Hammond; greg_spencer@acm.org Subject: Re: [Python-Dev] MS VC 7 offer

Mark Hammond wrote:
>Another thing to consider is the "make" environment. If we don't use
>DevStudio, then presumably our existing project files will become useless.
>Not a huge problem, but a real one. MSVC exported makefiles are not
>designed to be maintained.
I'm having good success with autoconf and Python >on other projects, but that would raise the barrier to including cygwin in >your build environment. > I think the scons (www.scons.org) will have in its next release full support for building targets using VC6 *project* file, and full support for VC7. Actually it has support also for cygwin and mingw... So I think is possible to have an automated way for building VC7 python based only on some scons script and VC6 project files... The possible goal is to keep working with VC6 IDE as now, and have a simple build script able to automatically build the VC7 version tracking changes.. I've inserted Greg Spencer, who I know is working on this... surely he can bring us more details. --- Paolo Invernizzi. From jepler@unpythonic.net Wed May 7 19:30:26 2003 From: jepler@unpythonic.net (Jeff Epler) Date: Wed, 7 May 2003 13:30:26 -0500 Subject: [Python-Dev] Startup time In-Reply-To: <20030507170618.GI27125@unpythonic.net> References: <200305061207.h46C7sj26462@pcp02138704pcs.reston01.va.comcast.net> <20030507170618.GI27125@unpythonic.net> Message-ID: <20030507183025.GJ27125@unpythonic.net> On Wed, May 07, 2003 at 12:06:18PM -0500, Jeff Epler wrote: > What about this line? It seems to be the cause of a bunch of imports, > including the sre stuff: > /* pythonrun.c */ > PyModule_WarningsModule = PyImport_ImportModule("warnings"); With this *and* the unicode stuff removed, I see runtimes like this: $ time ./python -S -c pass real 0m0.008s user 0m0.010s sys 0m0.000s and opens are nearly down to 2.2 levels: $ strace -e open ./python -S -c pass 2>&1 | grep -v ENOENT | wc 11 44 489 $ strace -e open /usr/bin/python -S -c pass 2>&1 | grep -v ENOENT | wc 8 32 355 (the differences are libstdc++, libgcc_s, and librt) With *just* the import of warnings removed, I get this: $ time ./python -S -c pass real 0m0.017s user 0m0.010s sys 0m0.010s .. and the input of sre is back. 
I guess it's used in both warnings.py and encodings/__init__.py Jeff From jepler@unpythonic.net Wed May 7 19:52:46 2003 From: jepler@unpythonic.net (Jeff Epler) Date: Wed, 7 May 2003 13:52:46 -0500 Subject: [Python-Dev] Startup time In-Reply-To: <20030507183025.GJ27125@unpythonic.net> References: <200305061207.h46C7sj26462@pcp02138704pcs.reston01.va.comcast.net> <20030507170618.GI27125@unpythonic.net> <20030507183025.GJ27125@unpythonic.net> Message-ID: <20030507185245.GL27125@unpythonic.net> On Wed, May 07, 2003 at 01:30:26PM -0500, Jeff Epler wrote: > .. and the input of sre is back. I guess it's used in both warnings.py > and encodings/__init__.py In encodings.__init__.py, the only use of re is for the normalize_encoding function. It could potentially be replaced with only string operations: # translate all offending characters to whitespace _norm_encoding_trans = string.maketrans(...) def normalize_encoding(encoding): encoding = encoding.translate(_norm_encoding_trans) # let the str.split machinery take care of splitting # only once on repeated whitespace return "_".join(encoding.split()) .. or the import of re could be moved inside normalize_encoding. In warnings.py, re is used in two functions, filterwarnings() and _setoption(). it's probably safe to move 'import re' inside these functions. I'm guessing the 'import lock' warnings.py problem doesn't apply when parsing options or adding new warning filters. Furthermore, filterwarnings() will have to be changed to not use re.compile() when message is "" (the resulting RE is always successfully matched) since several filterwarnings() calls are already performed by default, but always with message="". These changes would prevent the import of 're' at startup time, which appears to be the real killer. 
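Jeff's string-only replacement for normalize_encoding can be fleshed out along these lines. This is a hedged sketch, not the actual encodings module code: it assumes the normalization rule is "every non-alphanumeric character is a separator" (the real re-based pattern may treat some characters differently), and it is written in later-Python idiom rather than the string.maketrans style of the era:

```python
def normalize_encoding(encoding):
    # Treat every non-alphanumeric character as a separator, then let
    # str.split() collapse runs of separators, so the result never has
    # doubled or trailing underscores.
    cleaned = ''.join(c if c.isalnum() else ' ' for c in encoding)
    return '_'.join(cleaned.split())

print(normalize_encoding('UTF-8'))       # -> UTF_8
print(normalize_encoding('iso 8859-1'))  # -> iso_8859_1
```

The point of the exercise is exactly what Jeff says: a version like this needs no `import re` at interpreter startup.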
(see my module import timings in an earlier message) From skip@pobox.com Wed May 7 20:05:06 2003 From: skip@pobox.com (Skip Montanaro) Date: Wed, 7 May 2003 14:05:06 -0500 Subject: [Python-Dev] Startup time In-Reply-To: <20030507185245.GL27125@unpythonic.net> References: <200305061207.h46C7sj26462@pcp02138704pcs.reston01.va.comcast.net> <20030507170618.GI27125@unpythonic.net> <20030507183025.GJ27125@unpythonic.net> <20030507185245.GL27125@unpythonic.net> Message-ID: <16057.22754.582272.377803@montanaro.dyndns.org> Jeff> In encodings.__init__.py, the only use of re is for the Jeff> normalize_encoding function. It could potentially be replaced with only Jeff> string operations: ... Jeff> .. or the import of re could be moved inside normalize_encoding. I don't know if this still holds true, but at one point during the 2.x series I think it was pretty expensive to perform imports inside functions, much more expensive than in 1.5.2 at least (maybe right after nested scopes were introduced?). If that is still true, moving the import might be false economy. Skip "Some people, when confronted with a problem, think 'I know, I'll use regular expressions.' Now they have two problems." 
-- Jamie Zawinski From jepler@unpythonic.net Wed May 7 20:42:17 2003 From: jepler@unpythonic.net (Jeff Epler) Date: Wed, 7 May 2003 14:42:17 -0500 Subject: [Python-Dev] Startup time In-Reply-To: <16057.22754.582272.377803@montanaro.dyndns.org> References: <200305061207.h46C7sj26462@pcp02138704pcs.reston01.va.comcast.net> <20030507170618.GI27125@unpythonic.net> <20030507183025.GJ27125@unpythonic.net> <20030507185245.GL27125@unpythonic.net> <16057.22754.582272.377803@montanaro.dyndns.org> Message-ID: <20030507194215.GM27125@unpythonic.net> On Wed, May 07, 2003 at 02:05:06PM -0500, Skip Montanaro wrote: > I don't know if this still holds true, but at one point during the 2.x > series I think it was pretty expensive to perform imports inside functions, > much more expensive than in 1.5.2 at least (maybe right after nested scopes > were introduced?). If that is still true, moving the import might be false > economy. $ ./python Lib/timeit.py -s "def f(): import sys" "f()" 100000 loops, best of 3: 3.34 usec per loop $ ./python Lib/timeit.py -s "def f(): pass" "import sys; f()" 100000 loops, best of 3: 3.3 usec per loop $ ./python Lib/timeit.py -s "def f(): pass" "f()" 1000000 loops, best of 3: 0.451 usec per loop $ ./python Lib/timeit.py 'import sys' 100000 loops, best of 3: 2.88 usec per loop About 2.8usec would be added to each invocation of the functions in question, about the same as the cost of a global-scope import. This means that you lose overall as soon as the function is called twice. .. but this was about speeding python startup, not just speeding python. 
<.0375 wink> Jeff From aleax@aleax.it Wed May 7 21:57:26 2003 From: aleax@aleax.it (Alex Martelli) Date: Wed, 7 May 2003 22:57:26 +0200 Subject: [Python-Dev] Startup time In-Reply-To: <16057.22754.582272.377803@montanaro.dyndns.org> References: <200305061207.h46C7sj26462@pcp02138704pcs.reston01.va.comcast.net> <20030507185245.GL27125@unpythonic.net> <16057.22754.582272.377803@montanaro.dyndns.org> Message-ID: <200305072257.26085.aleax@aleax.it> On Wednesday 07 May 2003 09:05 pm, Skip Montanaro wrote: > Jeff> In encodings.__init__.py, the only use of re is for the > Jeff> normalize_encoding function. It could potentially be replaced > with only Jeff> string operations: > ... > Jeff> .. or the import of re could be moved inside normalize_encoding. > > I don't know if this still holds true, but at one point during the 2.x > series I think it was pretty expensive to perform imports inside functions, > much more expensive than in 1.5.2 at least (maybe right after nested scopes > were introduced?). If that is still true, moving the import might be false > economy. 
Doesn't seem to be true in 2.3, if I understand what you're saying: [alex@lancelot src]$ python Lib/timeit.py -s'def f(): pass' 'import math; f()' 100000 loops, best of 3: 4.04 usec per loop [alex@lancelot src]$ python Lib/timeit.py -s'def f(): import math' 'pass; f()' 100000 loops, best of 3: 4.05 usec per loop or even [alex@lancelot src]$ python Lib/timeit.py -s'import math' -s'def f(): pass' 'reload(math); f()' 10000 loops, best of 3: 168 usec per loop [alex@lancelot src]$ python Lib/timeit.py -s'import math' -s'def f(): reload(math)' 'pass; f()' 10000 loops, best of 3: 169 usec per loop Alex From skip@pobox.com Wed May 7 22:16:28 2003 From: skip@pobox.com (Skip Montanaro) Date: Wed, 7 May 2003 16:16:28 -0500 Subject: [Python-Dev] Startup time In-Reply-To: <200305072257.26085.aleax@aleax.it> References: <200305061207.h46C7sj26462@pcp02138704pcs.reston01.va.comcast.net> <20030507185245.GL27125@unpythonic.net> <16057.22754.582272.377803@montanaro.dyndns.org> <200305072257.26085.aleax@aleax.it> Message-ID: <16057.30636.403064.675001@montanaro.dyndns.org> >> I don't know if this still holds true, but at one point during the >> 2.x series I think it was pretty expensive to perform imports inside >> functions, much more expensive than in 1.5.2 at least (maybe right >> after nested scopes were introduced?). Alex> Doesn't seem to be true in 2.3, if I understand what you're saying: Alex> [alex@lancelot src]$ python Lib/timeit.py -s'def f(): pass' 'import math; f()' Alex> 100000 loops, best of 3: 4.04 usec per loop Alex> [alex@lancelot src]$ python Lib/timeit.py -s'def f(): import math' 'pass; f()' Alex> 100000 loops, best of 3: 4.05 usec per loop Yes, you're correct. Guess I could have run that myself had I been thinking. (My sleeping cap wasn't on much last night, so my thinking cap hasn't been on much today.) 
Guido, any chance you can quickly run the above two through the thirty-leven versions of Python you have laying about so we can narrow this down or refute my faulty memory? I've seen some recent posts by you which had performance data as far back as 1.3. I tried with 2.1, 2.2 and CVS but saw no discernable differences within versions: % python ~/local/bin/timeit.py -s'def f(): pass' 'import math; f()' 100000 loops, best of 3: 7.44 usec per loop % python ~/local/bin/timeit.py -s'def f(): import math' 'f()' 100000 loops, best of 3: 7.6 usec per loop % python2.2 ~/local/bin/timeit.py -s'def f(): pass' 'import math; f()' 100000 loops, best of 3: 9.19 usec per loop % python2.2 ~/local/bin/timeit.py -s'def f(): import math' 'f()' 100000 loops, best of 3: 9.05 usec per loop % python2.1 ~/local/bin/timeit.py -s'def f(): pass' 'import math; f()' 100000 loops, best of 3: 9.16 usec per loop % python2.1 ~/local/bin/timeit.py -s'def f(): import math' 'f()' 100000 loops, best of 3: 9.12 usec per loop Maybe it was 2.0? Thx, Skip From drifty@alum.berkeley.edu Wed May 7 23:16:50 2003 From: drifty@alum.berkeley.edu (Brett Cannon) Date: Wed, 7 May 2003 15:16:50 -0700 (PDT) Subject: [Python-Dev] Make _strptime only time.strptime implementation? Message-ID: Someone filed a bug report wanting it to be mentioned that most libc implementations of strptime don't handle %Z. Michael asked whether _strptime was going to become the permanent version of time.strptime or not. This was partially discussed back when Guido used his amazing time machine to make time.strptime use _strptime exclusively for testing purposes. I vaguely remember Tim saying he supported moving to _strptime, but I don't remember Guido having an opinion. If this is going to happen for 2.3 I would like to know so as to fix the documentation to be better. 
-Brett

From python@rcn.com Thu May 8 00:55:03 2003 From: python@rcn.com (Raymond Hettinger) Date: Wed, 7 May 2003 19:55:03 -0400 Subject: [Python-Dev] Startup time References: <200305061207.h46C7sj26462@pcp02138704pcs.reston01.va.comcast.net> <20030507185245.GL27125@unpythonic.net> <16057.22754.582272.377803@montanaro.dyndns.org> <200305072257.26085.aleax@aleax.it> <16057.30636.403064.675001@montanaro.dyndns.org> Message-ID: <003e01c314f4$1414f780$125ffea9@oemcomputer>

> Guido, any chance you can quickly run the above two through the thirty-leven
> versions of Python you have laying about so we can narrow this down or
> refute my faulty memory? I've seen some recent posts by you which had
> performance data as far back as 1.3. I tried with 2.1, 2.2 and CVS but saw
> no discernable differences within versions:
>
> % python ~/local/bin/timeit.py -s'def f(): pass' 'import math; f()'
> 100000 loops, best of 3: 7.44 usec per loop
> % python ~/local/bin/timeit.py -s'def f(): import math' 'f()'
> 100000 loops, best of 3: 7.6 usec per loop
>
> % python2.2 ~/local/bin/timeit.py -s'def f(): pass' 'import math; f()'
> 100000 loops, best of 3: 9.19 usec per loop
> % python2.2 ~/local/bin/timeit.py -s'def f(): import math' 'f()'
> 100000 loops, best of 3: 9.05 usec per loop
>
> % python2.1 ~/local/bin/timeit.py -s'def f(): pass' 'import math; f()'
> 100000 loops, best of 3: 9.16 usec per loop
> % python2.1 ~/local/bin/timeit.py -s'def f(): import math' 'f()'
> 100000 loops, best of 3: 9.12 usec per loop

I don't think timeit.py helps here. It works by substituting *both* the setup and statement inside a compiled function.

So, *none* of the above timings show the effect of a top level import versus one that is inside a function. It does compare 1 deep nesting to 2 levels deep.

So, you'll likely have to roll your own miniature timer if you want a straight answer.
Raymond Hettinger From jepler@unpythonic.net Thu May 8 01:53:44 2003 From: jepler@unpythonic.net (Jeff Epler) Date: Wed, 7 May 2003 19:53:44 -0500 Subject: [Python-Dev] Startup time In-Reply-To: <003e01c314f4$1414f780$125ffea9@oemcomputer> References: <200305061207.h46C7sj26462@pcp02138704pcs.reston01.va.comcast.net> <20030507185245.GL27125@unpythonic.net> <16057.22754.582272.377803@montanaro.dyndns.org> <200305072257.26085.aleax@aleax.it> <16057.30636.403064.675001@montanaro.dyndns.org> <003e01c314f4$1414f780$125ffea9@oemcomputer> Message-ID: <20030508005342.GA3634@unpythonic.net> On Wed, May 07, 2003 at 07:55:03PM -0400, Raymond Hettinger wrote: > I don't think timeit.py helps here. It works by substituting *both* > the setup and statement inside a compiled function. > > So, *none* of the above timings show the effect of a top level import > versus one that is inside a function. It does compare 1 deep nesting > to 2 levels deep. This program prints clock() times for 4e6 imports, first at global and then at function scope. Function scope wins a little bit, possibly due to the speed of STORE_FAST instead of STORE_GLOBAL (or would it be STORE_NAME?) ######################################################################## # (on a different machine than my earlier timeit results, running 2.2.2) # time for global import 30.21 # time for function import 27.31 import time, sys t0 = time.clock() for i in range(1e6): import sys; import sys; import sys; import sys; t1 = time.clock() print "time for global import", t1-t0 def f(): for i in range(1e6): import sys; import sys; import sys; import sys; t0 = time.clock() f() t1 = time.clock() print "time for function import", t1-t0 ######################################################################## If Skip is thinking of a slowdown for import and function scope, could it be the {LOAD,STORE}_FAST performance killer 'import *'? (wow, LOAD_NAME isn't as much slower than LOAD_FAST as you might expect..) 
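[Jeff's STORE_FAST guess is easy to confirm with the dis module: the same "import sys" statement binds its result via STORE_NAME when compiled at module scope, but via STORE_FAST inside a function. A sketch follows; the exact opcode names are a CPython implementation detail and have varied across versions.]

```python
import dis

def f():
    import sys      # function scope: the bound name is a fast local
    return sys

# Compile the identical statement as module-level code for comparison
mod_code = compile("import sys", "<module>", "exec")
mod_ops = {ins.opname for ins in dis.get_instructions(mod_code)}
fun_ops = {ins.opname for ins in dis.get_instructions(f)}

print("module scope uses STORE_NAME:", "STORE_NAME" in mod_ops)
print("function scope uses STORE_FAST:", "STORE_FAST" in fun_ops)
```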
######################################################################## # time for 27.9 # time for 37.94 import new, sys, time m = new.module('m') sys.modules['m'] = m m.__dict__.update({'__all__': ['x'], 'x': None}) def f(): from m import x x; x; x; x; x; x; x; x; x; x x; x; x; x; x; x; x; x; x; x x; x; x; x; x; x; x; x; x; x x; x; x; x; x; x; x; x; x; x x; x; x; x; x; x; x; x; x; x def g(): from m import * x; x; x; x; x; x; x; x; x; x x; x; x; x; x; x; x; x; x; x x; x; x; x; x; x; x; x; x; x x; x; x; x; x; x; x; x; x; x x; x; x; x; x; x; x; x; x; x for fn in f, g: t0 = time.clock() for i in range(1e6): fn() t1 = time.clock() print "time for", fn, t1-t0 ######################################################################## From dave@boost-consulting.com Thu May 8 03:02:34 2003 From: dave@boost-consulting.com (David Abrahams) Date: Wed, 07 May 2003 22:02:34 -0400 Subject: [Python-Dev] Re: MS VC 7 offer In-Reply-To: <010501c314ba$6b8dbef0$21795418@dell1700> (Brian Quinlan's message of "Wed, 07 May 2003 10:02:19 -0700") References: <010501c314ba$6b8dbef0$21795418@dell1700> Message-ID: Brian Quinlan writes: >> > That is information about the core ABI. I do need to be concerned >> > about changes in the libraries, as well, in particular about >> > incompatibilities resulting from multiple copies of the C library. >> > You said you don't know much about that. >> >> I can find out almost as easily, if you have specific questions. > > But the actual question that we would like to answer is quite broad: > what are all of the possible compatibility problems associated with > using a VC6 compiled DLL with a VC7 compiled application? > > Assuming that only changed runtime data structures are going to be a > problem, knowing which ones cannot be passed between the two versions > would be nice. 
Below is a list of the standard types defined by > Microsoft's VC6 runtime library (taken from the VC6 docs): > > clock_t > _complex > _dev_t > div_t, ldiv_t > _exception > FILE > _finddata_t, _wfinddata_t, _wfinddatai64_t > _FPIEEE_RECORD > fpos_t > _HEAPINFO > jmp_buf > lconv > _off_t > _onexit_t > _PNH > ptrdiff_t > sig_atomic_t > size_t > _stat > time_t > _timeb > tm > _utimbuf > va_list > wchar_t > wctrans_t > wctype_t > wint_t So do you want me to ask what all the possible compatibility problems are, or do you want me to ask which of the above structures cannot be passed between the two versions (or neither)? -- Dave Abrahams Boost Consulting www.boost-consulting.com From logistix@cathoderaymission.net Thu May 8 03:48:50 2003 From: logistix@cathoderaymission.net (logistix) Date: Wed, 7 May 2003 22:48:50 -0400 Subject: [Python-Dev] Building Python with .NET 2003 SDK Message-ID: <000201c3150c$5b294cd0$20bba8c0@XP> I decided to see if you really could build Python with the .NET compiler. I just got a preliminary build done that passed 67 tests (and failed 17). Two big gotchas: 1) You also need to install the "Platform SDK". This one makes the .NET SDK download seem fast. 2) VC6-generated makefiles include references to a few .lib files that aren't included. They don't seem to be needed, either. The offending libraries are largeint.lib, odbc32.lib, and odbccp32.lib. More detailed notes on what had to be done to get it working can be found here: http://www.cathoderaymission.net/~logistix/python/buildingPythonWithDotNet.html Enjoy! -Grant From skip@pobox.com Thu May 8 04:23:58 2003 From: skip@pobox.com (Skip Montanaro) Date: Wed, 7 May 2003 22:23:58 -0500 Subject: [Python-Dev] local import cost Message-ID: <16057.52686.475079.530463@montanaro.dyndns.org> Thanks to Raymond H for pointing out the probable fallacy in my original timeit runs.
Here's a simple timer which I think gets at what I'm after: import time import math import sys N = 500000 def fmath(): import math def fpass(): pass v = sys.version.split()[0] t = time.clock() for i in xrange(N): fmath() fmathcps = N/(time.clock()-t) t = time.clock() for i in xrange(N): fpass() fpasscps = N/(time.clock()-t) print "%s fpass/fmath: %.1f" % (v, fpasscps/fmathcps) On my Mac I get these outputs: 2.1.3 fpass/fmath: 5.0 2.2.2 fpass/fmath: 5.6 2.3b1+ fpass/fmath: 5.3 Naturally, I expect fpass() to run a lot faster than fmath(). If my presumption is correct though, there will be a sharp increase in the ratio, maybe in 2.0 or 2.1, or whenever nested scopes were first introduced. I can't run anything earlier than 2.1.x (I'll see about building 2.1) on my Mac. I'd have to break out my Linux laptop and do a bunch of downloading and compiling to get earlier results. From brian@sweetapp.com Thu May 8 08:08:48 2003 From: brian@sweetapp.com (Brian Quinlan) Date: Thu, 08 May 2003 00:08:48 -0700 Subject: [Python-Dev] Re: MS VC 7 offer In-Reply-To: Message-ID: <002f01c31530$ac4b5ee0$21795418@dell1700> > So do you want me to ask what all the possible compatibility problems > are, or do you want me to ask which of the above structures cannot be > passed between the two versions (or neither)? The former question would be best as the later would seem to be a subset. 
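[Skip's ratio can also be reproduced with timeit itself, provided the function *call* (not the import) is what goes in the statement position, which sidesteps Raymond's objection about the setup being compiled into the same function. A sketch against the current timeit API; absolute numbers obviously won't match 2003 hardware.]

```python
import timeit

# Each setup defines the function under test; the statement only calls it.
setup_math = "def f():\n    import math"
setup_pass = "def f():\n    pass"

t_math = min(timeit.repeat("f()", setup=setup_math, number=100_000, repeat=3))
t_pass = min(timeit.repeat("f()", setup=setup_pass, number=100_000, repeat=3))
print("fpass/fmath: %.1f" % (t_math / t_pass))
```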
Cheers, Brian From sjoerd@acm.org Thu May 8 09:11:15 2003 From: sjoerd@acm.org (Sjoerd Mullender) Date: Thu, 08 May 2003 10:11:15 +0200 Subject: [Python-Dev] odd interpreter feature In-Reply-To: <16057.12583.500034.130135@montanaro.dyndns.org> References: <16057.5934.556547.671279@montanaro.dyndns.org> <200305071502.h47F2LK03176@pcp02138704pcs.reston01.va.comcast.net> <16057.9037.913362.225855@montanaro.dyndns.org> <200305071533.h47FXVf03526@pcp02138704pcs.reston01.va.comcast.net> <16057.12583.500034.130135@montanaro.dyndns.org> Message-ID: <20030508081115.79E3174230@indus.ins.cwi.nl> Isn't it the case that you should only get a secondary prompt after the comment line if the comment line *itself* had a secondary prompt? On Wed, May 7 2003 Skip Montanaro wrote: > > Guido> Please do. The indentation level should be easily available, > Guido> since it is computed by the tokenizer. > > Alas, it's more complicated than just the indentation level of the current > line. I need to know if the previous line was indented, which I don't think > the tokenizer knows (at least examining *tok in gdb under various conditions > suggests it doesn't). > > I see the following possible cases (there are perhaps more, but I think they > are similar enough to ignore here): > > >>> if x == y: > ... # hello > ... pass > ... > >>> if x == y: > ... x = 1 > ... # hello > ... pass > ... > >>> x = 1 > >>> # hello > ... > >>> > > Only the last case should display the primary prompt after the comment is > entered. The other two correctly display the secondary prompt. It's > distinguishing the second and third cases in the tokenizer without help from > the parser that's the challenge. > > Oh well. Perhaps it's a wart best left alone. -- Sjoerd Mullender From mal@lemburg.com Thu May 8 11:38:44 2003 From: mal@lemburg.com (M.-A. 
Lemburg) Date: Thu, 08 May 2003 12:38:44 +0200 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: References: Message-ID: <3EBA33B4.3080601@lemburg.com> Tim Peters wrote: > [Martin v. Lowis] > >>... >>Some are sincerely hoping, or even expecting, that Python 2.3 is >>released with VC7, so that they can embed Python in their VC7-based >>application without having to recompile it. >> >>No matter what the choice is, somebody will be unhappy. > > OTOH, I don't see anything to stop releasing VC6 and VC7 versions of Python, > except for the absence of a volunteer to do it. While the Wise installer is > proprietary, there's nothing hidden about what goes into a release, there > are several free installers people *could* use instead, and the build > process for the 3rd-party components is pretty exhaustively documented. > > Speaking of which, presumably Tcl/Tk and SSL and etc on Windows should also > be compiled under VC7 then. I'm sure commercial players like e.g. ActiveState will happily provide Windows installers for both versions. Personally I don't think that people will switch to VC7 all that soon -- the .NET libs are still far from being stable and as I read the quotes on the VC compiler included in the .NET SDK, it will only generate code that runs with the .NET libs installed. Could be wrong, though. Given that tools like distutils probably don't work out of the box with the VC7 compiler suite, I'd wait at least another release before making VC7 binaries the default on Windows. -- Marc-Andre Lemburg eGenix.com Professional Python Software directly from the Source (#1, May 08 2003) >>> Python/Zope Products & Consulting ... http://www.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... 
http://python.egenix.com/ ________________________________________________________________________ EuroPython 2003, Charleroi, Belgium: 47 days left From mwh@python.net Thu May 8 11:54:21 2003 From: mwh@python.net (Michael Hudson) Date: Thu, 08 May 2003 11:54:21 +0100 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Doc/ext noddy2.c,NONE,1.1 noddy3.c,NONE,1.1 newtypes.tex,1.21,1.22 In-Reply-To: <3EBA342A.7020006@zope.com> (Jim Fulton's message of "Thu, 08 May 2003 06:40:42 -0400") References: <2mu1c568eq.fsf@starship.python.net> <3EBA342A.7020006@zope.com> Message-ID: <2mr879674y.fsf@starship.python.net> [Ccing python-dev because of the last paragraph] Jim Fulton writes: > Michael Hudson wrote: >> dcjim@users.sourceforge.net writes: >> >>>Update of /cvsroot/python/python/dist/src/Doc/ext >>>In directory sc8-pr-cvs1:/tmp/cvs-serv13294 >>> >>>Modified Files: >>> newtypes.tex Added Files: >>> noddy2.c noddy3.c Log Message: >>>Rewrote the basic section of the chapter on defining new types. >> As the original author of this section, thank you! > > You're welcome. :) > > My main reason for doing this was to learn the material myself. > (I had the luxury of sitting next to Guido as I did it and bugging > him with questions. :) That would help :-) >> Do you mention anywhere that this only works for 2.2 and up? That >> might be an idea. > > OK, I'll add that in the introduction. It was *already* dependent on > Python 2.3 due to the use of PyMODINIT_FUNC as the type of the init > function. Yes. That wasn't me, and whoever changed it didn't keep the .c file in sync with the bits of it quoted in the .tex file, grumble. > I'm not sure why this is needed rather than void. Maybe I should change this > so it works with Python 2.2. I'll talk to Guido and Fred about this. The Py_MODINIT()/DL_IMPORT() thing is an annoying incompatibility-causer ... perhaps something to deal with this could be added to pymemcompat.h? (in which case it's misnamed...) Cheers, M. 
-- ARTHUR: Ford, you're turning into a penguin, stop it. -- The Hitch-Hikers Guide to the Galaxy, Episode 2 From paoloinvernizzi@dmsware.com Thu May 8 12:08:41 2003 From: paoloinvernizzi@dmsware.com (Paolo Invernizzi) Date: Thu, 08 May 2003 13:08:41 +0200 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <3EBA33B4.3080601@lemburg.com> References: <3EBA33B4.3080601@lemburg.com> Message-ID: <3EBA3AB9.8020305@dmsware.com> M.-A. Lemburg wrote: > as I read > the quotes on the VC compiler included in the .NET SDK, it will only > generate code that runs with the .NET libs installed. Could be wrong, > though. Uh? The VC compiler included with the .NET SDK can only generate managed code? I don't think so... > Given that tools like distutils probably don't work > out of the box with the VC7 compiler suite, I'd wait at least > another release before making VC7 binaries the default on > Windows. Actually I have VC6 *and* VC7 in my at-work machine, python22 (Standard distribution, VC6 based), python 23b1 (Standard, VC6 based) and python cvs, which I manually build with VC7. I can build/install distutils packages choosing which environment to use (6 or 7) and python to use (22, 23b1, 23 head). So I think this is a no-problem... But isn't it possible, at least, to have a 'not-default' release compiled with VC7? It can be a boost for having other *complicated* packages released with VC7 along with VC6 (I'm thinking of wxPython, and so...) --- Paolo Invernizzi From nhodgson@bigpond.net.au Thu May 8 12:31:20 2003 From: nhodgson@bigpond.net.au (Neil Hodgson) Date: Thu, 08 May 2003 21:31:20 +1000 Subject: [Python-Dev] MS VC 7 offer References: <3EBA33B4.3080601@lemburg.com> Message-ID: <000d01c31555$59222800$3da48490@neil> M.-A.
Lemburg: > Personally I don't think that people will switch to VC7 all that > soon -- the .NET libs are still far from being stable and as I read > the quotes on the VC compiler included in the .NET SDK, it will only > generate code that runs with the .NET libs installed. Could be wrong, > though. VC7 can produce stand-alone binaries that do not need the .NET framework or even the C runtime DLLs. I have distributed executable versions of my Scintilla and SciTE projects built with VC7 for 9 months now. The executables are quite a bit smaller and faster (average of 10%) over VC6. The link time code generation option which can inline functions at link time rather than compile time is effective. Possible issues with moving to VC7 are ensuring compatibility with extension modules and the End User License Agreement. I looked at the EULA thoroughly before buying VC7 as the license includes some clauses that may cause problems for open source software that may be included in GPLed applications. Redistributing applications compiled with VC7 is OK, but redistributing the runtime DLLs such as msvcr70.dll (which is not already present on pre VC7 versions of Windows) can not be done with GPLed code: """ (ii) not distributing Identified Software in conjunction with the Redistributables or a derivative work thereof; ... Identified Software includes, without limitation, any software that requires as a condition of use, modification and/or distribution of such software that other software incorporated into, derived from or distributed with such software be (1) disclosed or distributed in source code form; (2) be licensed for the purpose of making derivative works; or (3) be redistributable at no charge. """ MS may have come to their senses and dropped this for Visual Studio 2003. It can be quite fun tracking the EULA down and working out which components are licensed under which EULA. 
When downloading .NET before VC7 was available, the web site EULA was different to the installer's version. Neil From mal@lemburg.com Thu May 8 12:37:51 2003 From: mal@lemburg.com (M.-A. Lemburg) Date: Thu, 08 May 2003 13:37:51 +0200 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <3EBA3AB9.8020305@dmsware.com> References: <3EBA33B4.3080601@lemburg.com> <3EBA3AB9.8020305@dmsware.com> Message-ID: <3EBA418F.3020006@lemburg.com> Paolo Invernizzi wrote: > M.-A. Lemburg wrote: > >> as I read >> the quotes on the VC compiler included in the .NET SDK, it will only >> generate code that runs with the .NET libs installed. Could be wrong, >> though. > > Uh? The VC compiler included with the .NET SDK can only generate > managed code? I don't think so... That's what I read in messages on this topic on google groups. I've just downloaded the SDK myself and will probably give it a go later today. >> Given that tools like distutils probably don't work >> out of the box with the VC7 compiler suite, I'd wait at least >> another release before making VC7 binaries the default on >> Windows. > > Actually I have VC6 *and* VC7 in my at-work machine, python22 (Standard > distribution, VC6 based), python 23b1 (Standard, VC6 based) and python > cvs, wich I manually build with VC7. > I can build/install distutils packages choosing wich environment to use > (6 or 7) and python to use (22, 23b1, 23 head). > So I think this is a no-problem... That's good to know (btw, how do you tell distutils which VC version to use ? or does it find out by itself using the Python time machine ;-). > But isn't possible, at least, to have a 'not-default' release compiled > with VC7? > > It can be a boost for having other *complicated* packages released with > VC7 among with VC6 (I'm thinking at wxPython, and so...) If someone volunteers to maintain such a branch, I suppose there's nothing preventing it :-) Perhaps we should look at the offer in a different light... 
What advantage would the move from VC6 to VC7 give Python users ? -- Marc-Andre Lemburg eGenix.com Professional Python Software directly from the Source (#1, May 08 2003) >>> Python/Zope Products & Consulting ... http://www.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ EuroPython 2003, Charleroi, Belgium: 47 days left From Paul.Moore@atosorigin.com Thu May 8 12:53:54 2003 From: Paul.Moore@atosorigin.com (Moore, Paul) Date: Thu, 8 May 2003 12:53:54 +0100 Subject: [Python-Dev] MS VC 7 offer Message-ID: <16E1010E4581B049ABC51D4975CEDB88619A6B@UKDCX001.uk.int.atosorigin.com> From: Neil Hodgson [mailto:nhodgson@bigpond.net.au] > Possible issues with moving to VC7 are ensuring compatibility with > extension modules That's the one that I see as most important. For the PythonLabs distribution to move to VC7, it sounds as if many of the Windows binary extensions in existence will also need to be built with VC7. I've no idea how much of a problem this would be to extension authors, but it would be a problem to end users if extension authors could no longer provide binaries. For reference, extensions I'd be in trouble without include win32all, wxPython, cx_Oracle, pyXML (on occasion), ctypes, PIL, mod_python. I've used many others on occasion, and no VC7 version would be an issue for me. So I guess that's the key issue. Can the majority of extension authors produce VC7-compatible binaries? This probably needs to be asked on comp.lang.python, not just on python-dev. Paul. From paoloinvernizzi@dmsware.com Thu May 8 13:25:21 2003 From: paoloinvernizzi@dmsware.com (Paolo Invernizzi) Date: Thu, 08 May 2003 14:25:21 +0200 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <3EBA418F.3020006@lemburg.com> References: <3EBA33B4.3080601@lemburg.com> <3EBA3AB9.8020305@dmsware.com> <3EBA418F.3020006@lemburg.com> Message-ID: <3EBA4CB1.6020904@dmsware.com> M.-A. 
Lemburg wrote: > That's good to know (btw, how do you tell distutils which VC > version to use ? or does it find out by itself using the > Python time machine ;-). I simply run the right .bat file that sets all the needed variables before running the setup.py ;-) > If someone volunteers to maintain such a branch, I suppose > there's nothing preventing it :-) As I guessed :-) I think that the next release of scons can open new perspectives... (see previous post of Greg Spencer) > Perhaps we should look at the offer in a different light... > > What advantage would the move from VC6 to VC7 give Python users ? I don't know if there are advantages on *moving*... but I'm concerned on *adding*... a VC7 plus a VC6 release... --- Paolo Invernizzi From DavidA@ActiveState.com Thu May 8 17:29:14 2003 From: DavidA@ActiveState.com (David Ascher) Date: Thu, 08 May 2003 09:29:14 -0700 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <3EBA33B4.3080601@lemburg.com> References: <3EBA33B4.3080601@lemburg.com> Message-ID: <3EBA85DA.5050806@ActiveState.com> M.-A. Lemburg wrote: > Tim Peters wrote: > I'm sure commercial players like e.g. ActiveState will happily > provide Windows installers for both versions. We will as soon as our customers ask for it. So far, we've gotten no interest in that direction. --david From brian@sweetapp.com Thu May 8 17:34:10 2003 From: brian@sweetapp.com (Brian Quinlan) Date: Thu, 08 May 2003 09:34:10 -0700 Subject: [Python-Dev] Re: MS VC 7 offer In-Reply-To: Message-ID: <008701c3157f$a7169210$21795418@dell1700> Carl Kleffner referred me to an interesting discussion regarding VC6 and VC7 compatibility: http://tinyurl.com/baok The bottom line seems to be that the C runtime libraries for VC6 and VC7 are currently binary compatible but that might change in the future. And CRT-allocated resources cannot be shared between the two. Cheers, Brian From mal@lemburg.com Thu May 8 17:36:57 2003 From: mal@lemburg.com (M.-A. 
Lemburg) Date: Thu, 08 May 2003 18:36:57 +0200 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <3EBA85DA.5050806@ActiveState.com> References: <3EBA33B4.3080601@lemburg.com> <3EBA85DA.5050806@ActiveState.com> Message-ID: <3EBA87A9.7090805@lemburg.com> David Ascher wrote: > M.-A. Lemburg wrote: > >> Tim Peters wrote: > > >> I'm sure commercial players like e.g. ActiveState will happily >> provide Windows installers for both versions. > > We will as soon as our customers ask for it. So far, we've gotten no > interest in that direction. I suppose that's fair enough :-) -- Marc-Andre Lemburg eGenix.com Professional Python Software directly from the Source (#1, May 08 2003) >>> Python/Zope Products & Consulting ... http://www.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ EuroPython 2003, Charleroi, Belgium: 47 days left From lists@morpheus.demon.co.uk Thu May 8 19:03:32 2003 From: lists@morpheus.demon.co.uk (Paul Moore) Date: Thu, 08 May 2003 19:03:32 +0100 Subject: [Python-Dev] MS VC 7 offer References: <16E1010E4581B049ABC51D4975CEDB88619A64@UKDCX001.uk.int.atosorigin.com> <046a01c31487$399d3390$530f8490@eden> Message-ID: "Mark Hammond" writes: > I must say that anecdotally, I find this to be true. Developers are > *not* flocking to VC7. I wonder if that fact has anything to do with > MS offering free compilers? One further data point - the free mingw gcc compiler generates binaries which depend on msvcrt.dll. So, if the Pythonlabs distribution switches to MSVC7, developers using MSVC6 *and* developers using mingw will be unable to build compatible extensions. The only compatible compiler will be MSVC7 (either the paid for version or the free limited version). Whatever you may think of Microsoft's offer, I feel that this reduction in choice is a bad thing. Paul. 
-- This signature intentionally left blank From bbolli@ymail.ch Thu May 8 21:20:12 2003 From: bbolli@ymail.ch (Beat Bolli) Date: Thu, 8 May 2003 22:20:12 +0200 Subject: [Python-Dev] Subclassing int? [Was: Re: [PEP] += on return of function call result] Message-ID: <20030508202012.GA3809@bolli.homeip.net> Andrew Koenig wrote: > > Why can't you do this? > > foo = log.setdefault(r,'') > > foo += "test %d\n" % t > You can do it, but it's useless! I got bitten by the same problem some time ago. Please let me explain: I needed to count words, using a dict, of course. So, in my first enthusiasm, I wrote: count = {} for word in wordlist: count.setdefault(word, 0) += 1 This, as I soon realized, didn't work, exactly because ints are immutable. So I tried a different track. No problem, I thought, in the new Python object world, the native classes can be subclassed. I imagined I could enhance the int class with an inc() method, thusly: class Counter(int): def inc(self): # to be defined self += 1?? count = {} for word in wordlist: count.setdefault(word, Counter()).inc() As you can see, I have a problem at the comment: how do I access the inherited int value??? I realized that this also wasn't going to work, either. I finally used the perhaps idiomatic count = {} for word in wordlist: count[word] = count.get(word, 0) + 1 which of course is suboptimal, because the lookup is done twice. I decided not to implement a proper Counter class for memory efficiency reasons. The code would have been simple: class Counter: def __init__(self): self.n = 0 def inc(self): self.n += 1 def get(self): return self.n count = {} for word in wordlist: count.setdefault(word, Counter()).inc() But to restate the core question: can class Counter be written as a subclass of int?
Beat Bolli (please CC: me on replies, I'm not on the list) -- mail: `echo '' | sed -e 's/[A-S]//g'` pgp: 0x506A903A; 49D5 794A EA77 F907 764F D89E 304B 93CF 506A 903A icbm: 47° 02' 43.0" N, 07° 16' 17.5" E (WGS84) From lists@morpheus.demon.co.uk Thu May 8 21:05:28 2003 From: lists@morpheus.demon.co.uk (Paul Moore) Date: Thu, 08 May 2003 21:05:28 +0100 Subject: [Python-Dev] MS VC 7 offer References: <3EBA33B4.3080601@lemburg.com> <3EBA85DA.5050806@ActiveState.com> Message-ID: David Ascher writes: >> I'm sure commercial players like e.g. ActiveState will happily >> provide Windows installers for both versions. > > We will as soon as our customers ask for it. So far, we've gotten no > interest in that direction. Is that no interest in a VC7 version? If so, that's probably pretty relevant information... Paul. -- This signature intentionally left blank From tim.one@comcast.net Thu May 8 22:56:32 2003 From: tim.one@comcast.net (Tim Peters) Date: Thu, 08 May 2003 17:56:32 -0400 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <3EBA418F.3020006@lemburg.com> Message-ID: [M.-A. Lemburg] > ... > Perhaps we should look at the offer in a different light... > > What advantage would the move from VC6 to VC7 give Python users ? In general, smaller and faster code is a decent bet. For those who use VC7 already, an easier life. "Move" implies abandoning VC6, though, and I don't think that's a realistic possibility now -- although over time it's inevitable (VC6 is akin to Python 1.5.2 now: beloved by some but unsupported by all ). From tim_one@email.msn.com Fri May 9 05:10:46 2003 From: tim_one@email.msn.com (Tim Peters) Date: Fri, 9 May 2003 00:10:46 -0400 Subject: [Python-Dev] Make _strptime only time.strptime implementation? In-Reply-To: Message-ID: [Brett Cannon] > Someone filed a bug report wanting it to be mentioned that most libc > implementations of strptime don't handle %Z.
Michael asked whether > _strptime was going to become the permanent version of time.strptime or > not. This was partially discussed back when Guido used his amazing time > machine to make time.strptime use _strptime exclusively for testing > purposes. > > I vaguely remember Tim saying he supported moving to _strptime, but I > don't remember Guido having an opinion. If this is going to happen for > 2.3 I would like to know so as to fix the documentation to be better. As we left it, we were going to wait for the 2.3 alpha and beta testers to raise a stink if the new implementation didn't work out for them (you'll recall that the call to the platform strptime() is disabled in 2.3b1, via an unconditional #undef HAVE_STRPTIME in timemodule.c). Nobody has even cut a little gas yet, so I'd proceed under the assumption that nobody will, and that the disabled HAVE_STRPTIME code will be physically deleted. If that turns out to be wrong, big deal, you stay up all night fixing it under intense pressure . From tim_one@email.msn.com Fri May 9 05:17:59 2003 From: tim_one@email.msn.com (Tim Peters) Date: Fri, 9 May 2003 00:17:59 -0400 Subject: [Python-Dev] Microsoft speedup In-Reply-To: Message-ID: [Duncan Booth] > I was just playing around with the compiler options using > Microsoft VC6 and > I see that adding the option /Ob2 speeds up pystone by about 2.5% > (/Ob2 is the option to automatically inline functions where the compiler > thinks it is worthwhile.) > > The downside is that it increases the size of python23.dll by about 13%. > > It's not a phenomenal speedup, but it should be pretty low impact if the > extra size is considered a worthwhile tradeoff. I want to see much broader testing first.
A couple employers ago, we disabled all magical inlining options, because sometimes they made critical loops faster, and sometimes slower, and you couldn't guess which as the code changed, and in that problem domain (speech recognition) the critical loops were truly critical so we were acutely aware of compiled-code speed regressions. So I'm not discouraged that pystone sped up when you tried it, but not particularly encouraged either. I expect it's more worth trying in Python, as hardly any code in Python goes three lines without a function call or conditional branch. From python-list@python.org Fri May 9 08:49:33 2003 From: python-list@python.org (Alex Martelli) Date: Fri, 9 May 2003 09:49:33 +0200 Subject: [Python-Dev] Subclassing int? [Was: Re: [PEP] += on return of function call result] In-Reply-To: <20030508202012.GA3809@bolli.homeip.net> References: <20030508202012.GA3809@bolli.homeip.net> Message-ID: <200305090949.33064.aleax@aleax.it> Followups set to python-list since this is NOT an appropriate subject matter for python-dev. Please continue the discussion on python-list, thanks. On Thursday 08 May 2003 10:20 pm, Beat Bolli wrote: ... > count = {} > for word in wordlist: > count.setdefault(word, 0) += 1 > > This, as I soon realized, didn't work, exactly because ints are immutable. Actually it doesn't work because you cannot assign to a function call; the fact that ints are immutable doesn't enter the picture. > class Counter(int): > def inc(self): > # to be defined > self += 1?? HERE is where the fact that ints are immutable will bite. If += mutated self, this would work -- but it doesn't because ints are immutable. > As you can see, I have a problem at the comment: how do I access the > inherited int value??? I realized that this also wasn't going to work, int(self) will "access the inherited int value" if I understand your meaning. But it doesn't help you here. > either. 
I finally used the perhaps idiomatic > > count = {} > for word in wordlist: > count[word] = count.get(word, 0) + 1 > > which of course is suboptimal, because the lookup is done twice. I decided Yes. > not to implement a proper Counter class for memory efficiency reasons. The __slots__ fix your memory efficiency issues: that's the REASON they exist. However, there's ANOTHER problem...: > code would have been simple: > > class Counter: > def __init__(self): > self.n = 0 > def inc(self): > self.n += 1 > def get(self): > return self.n > > count = {} > for word in wordlist: > count.setdefault(word, Counter()).inc() > > But to restate the core question: can class Counter be written as a > subclass of int? No (not meaningfully). The performance tradeoff is tricky not because of memory considerations (which __slots__ fix) but because you're generating (and often throwing away) a Counter instance EVERY time. Witness: [alex@lancelot Lib]$ python timeit.py -s''' count = {} words = "some are and some are not and some are irksome".split() ''' 'for w in words:' ' count[w]=count.get(w,0)+1' 100000 loops, best of 3: 11.6 usec per loop versus: [alex@lancelot Lib]$ python timeit.py -s''' count = {} words = "some are and some are not and some are irksome".split() class Cnt(object): __slots__=["n"] def __init__(self): self.n=0 def inc(self): self.n+=1 ''' 'for w in words:' ' count.setdefault(w,Cnt()).inc()' 10000 loops, best of 3: 43.4 usec per loop See? It's not a speedup, but a slowdown by about FOUR times in this example. If you want speed, go for speed: [alex@lancelot Lib]$ python timeit.py -s''' count = {} words = "some are and some are not and some are irksome".split() import psyco psyco.full() ''' 'for w in words:' ' count[w]=count.get(w,0)+1' 100000 loops, best of 3: 3.33 usec per loop Now THIS is acceleration -- a speedup of over THREE times. And without any complication nor abandonment of the idiomatic way of expression, too. 
> Beat Bolli (please CC: me on replies, I'm not on the list) Done. But please use python-list for these discussions: python-dev is only for discussion about development of *Python itself*. Alex From duncan@rcp.co.uk Fri May 9 09:19:28 2003 From: duncan@rcp.co.uk (Duncan Booth) Date: Fri, 09 May 2003 09:19:28 +0100 Subject: [Python-Dev] Microsoft speedup In-Reply-To: References: Message-ID: <3EBB72A0.5651.54D47C4@localhost> On 9 May 2003 at 0:17, Tim Peters wrote: > [Duncan Booth] > > It's not a phenomenal speedup, but it should be pretty low impact if the > > extra size is considered a worthwhile tradeoff. > > I want to see much broader testing first. A couple employers ago, we > disabled all magical inlining options, because sometimes they made critical > loops faster, and sometimes slower, and you couldn't guess which as the code > changed, and in that problem domain (speech recognition) the critical loops > were truly critical so we were acutely aware of compiled-code speed > regressions. So I'm not discouraged that pystone sped up when you > tried it, but not particularly encouraged either. I'm not suggesting Guido rush out and change the options right now, but I wanted to know whether it would be worth looking at this further. For all I know it's been discussed and dismissed already, in which case there isn't much point my looking further at it. Also if the main distribution should move to VC7, then it would probably be better to check whether this sort of micro tweaking has any effect there before wasting time on it. I've had plenty of experience myself of changing Microsoft compiler options and finding the code then breaks, so I agree that it would need much more testing. It also needs more testing to see whether it makes any kind of difference to real programs as well as benchmarks.
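The benchmark-vs-real-program caution above is easy to reproduce at the Python level, where the analogue of an un-inlined helper is a per-iteration function call. A hedged sketch (the helper names are invented; absolute and even relative timings depend on the interpreter and machine, which is exactly why broad testing is needed):

```python
import timeit

# Calling a tiny helper per iteration vs. writing the expression inline.
# This measures Python-level call overhead, a stand-in for the C-level
# inlining effects discussed above.
def add1(x):
    return x + 1

def with_call(n=1000):
    total = 0
    for i in range(n):
        total = add1(total)      # one function call per iteration
    return total

def inlined(n=1000):
    total = 0
    for i in range(n):
        total = total + 1        # same work, no call
    return total

assert with_call() == inlined() == 1000
t_call = timeit.timeit(with_call, number=200)
t_inline = timeit.timeit(inlined, number=200)
print("call: %.4fs  inline: %.4fs" % (t_call, t_inline))
```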
If I knew any way to get the compiler to tell me which functions it inlined, then it would probably also be possible to get most of the speedup by explicitly inlining a few functions and avoiding most of the hit on the code size. -- Duncan Booth duncan@rcp.co.uk int month(char *p){return(124864/((p[0]+p[1]- p[2]&0x1f)+1)%12)["\5\x8\3" "\6\7\xb\1\x9\xa\2\0\4"];} // Who said my code was obscure? http://dales.rmplc.co.uk/Duncan From mal@lemburg.com Fri May 9 10:28:37 2003 From: mal@lemburg.com (M.-A. Lemburg) Date: Fri, 09 May 2003 11:28:37 +0200 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: References: Message-ID: <3EBB74C5.7090600@lemburg.com> Tim Peters wrote: > [M.-A. Lemburg] > >>... >>Perhaps we should look at the offer in a different light... >> >>What advantage would the move from VC6 to VC7 give Python users ? > > In general, smaller and faster code is a decent bet. For those who use VC7 > already, an easier life. "Move" implies abandoning VC6, though, and I don't > think that's a realistic possibility now -- although over time it's > inevitable (VC6 is akin to Python 1.5.2 now: beloved by some but > unsupported by all ). True :-) How about adding support for VC7 features in 2.4 and starting the transition in 2.5 ? -- Marc-Andre Lemburg eGenix.com Professional Python Software directly from the Source (#1, May 09 2003) >>> Python/Zope Products & Consulting ... http://www.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ EuroPython 2003, Charleroi, Belgium: 46 days left From mal@lemburg.com Fri May 9 10:29:57 2003 From: mal@lemburg.com (M.-A. Lemburg) Date: Fri, 09 May 2003 11:29:57 +0200 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <3EBB74C5.7090600@lemburg.com> References: <3EBB74C5.7090600@lemburg.com> Message-ID: <3EBB7515.2090709@lemburg.com> M.-A. 
Lemburg wrote: > How about adding support for VC7 features in 2.4 and starting the > transition in 2.5 ? This would also allow MS to ship SP2 for VC7 by then ;-) -- Marc-Andre Lemburg eGenix.com Professional Python Software directly from the Source (#1, May 09 2003) >>> Python/Zope Products & Consulting ... http://www.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ EuroPython 2003, Charleroi, Belgium: 46 days left From drifty@alum.berkeley.edu Fri May 9 10:31:26 2003 From: drifty@alum.berkeley.edu (Brett Cannon) Date: Fri, 9 May 2003 02:31:26 -0700 (PDT) Subject: [Python-Dev] Make _strptime only time.strptime implementation? In-Reply-To: References: Message-ID: [Tim Peters] > As we left it, we were going to wait for the 2.3 alpha and beta testers to > raise a stink if the new implementation didn't work out for them (you'll > recall that the call to the platform strptime() is disabled in 2.3b1, via an > unconditional > > #undef HAVE_STRPTIME > > in timemodule.c). Nobody has even cut a little gas yet, I got a single email from someone asking me to change the functionality so that it would raise an exception if part of the input string was not parsed. Otherwise I found one error and dealt with it. > so I'd proceed under the assumption that nobody will, and that the > disabled HAVE_STRPTIME code will be physically deleted. If that turns out > to be wrong, big deal, you stay up all night fixing it under intense > pressure . > OK. If by 2.3b2 no one has said anything I will go ahead and cut out the C code and update the docs. -Brett From jacobs@penguin.theopalgroup.com Fri May 9 11:30:45 2003 From: jacobs@penguin.theopalgroup.com (Kevin Jacobs) Date: Fri, 9 May 2003 06:30:45 -0400 (EDT) Subject: [Python-Dev] Make _strptime only time.strptime implementation?
In-Reply-To: Message-ID: On Fri, 9 May 2003, Tim Peters wrote: > As we left it, we were going to wait for the 2.3 alpha and beta testers to > raise a stink if the new implementation didn't work out for them (you'll > recall that the call to the platform strptime() is disabled in 2.3b1, via an > unconditional > > #undef HAVE_STRPTIME > > in timemodule.c). Nobody has even cut a little gas yet, so I'd proceed > under the assumption that nobody will, and that the disabled HAVE_STRPTIME > code will be physically deleted. If that turns out to be wrong, big deal, > you stay up all night fixing it under intense pressure . Actually, I did, and on python-dev. strptime did not roundtrip correctly with mktime on Linux. This made my application very unhappy, so I removed all calls to strptime. Right now I don't have a vested interest in shooting holes in the Python strptime, but I can't say I feel any warm fuzzies about it. It seems hard to imagine that others will not run into similar problems, regardless of the lack of specification for exactly how strptime ought to work. -Kevin -- -- Kevin Jacobs The OPAL Group - Enterprise Systems Architect Voice: (216) 986-0710 x 19 E-mail: jacobs@theopalgroup.com Fax: (216) 986-0714 WWW: http://www.theopalgroup.com
From gsw@agere.com Fri May 9 13:47:49 2003 From: gsw@agere.com (Williams, Gerald S (Jerry)) Date: Fri, 9 May 2003 08:47:49 -0400 Subject: [Python-Dev] MS VC 7 offer Message-ID: <937756AF9E0BDC4396C09F32D8B41F2B2FE238@pauex2ku01.agere.com> Paul Moore wrote: > One further data point - the free mingw gcc compiler generates > binaries which depend on msvcrt.dll. So, if the Pythonlabs > distribution switches to MSVC7, developers using MSVC6 *and* > developers using mingw will be unable to build compatible extensions. > The only compatible compiler will be MSVC7 (either the paid for > version or the free limited version). Are there any reasons why we can't just switch to MINGW instead? If the VC7 RT is the way of the future, then presumably MINGW will eventually support it. If not, it might be better to avoid VC7 anyway. :-) gsw From Paul.Moore@atosorigin.com Fri May 9 13:57:58 2003 From: Paul.Moore@atosorigin.com (Moore, Paul) Date: Fri, 9 May 2003 13:57:58 +0100 Subject: [Python-Dev] MS VC 7 offer Message-ID: <16E1010E4581B049ABC51D4975CEDB880113DACF@UKDCX001.uk.int.atosorigin.com> From: Williams, Gerald S (Jerry) [mailto:gsw@agere.com] > Are there any reasons why we can't just switch to MINGW > instead? If the VC7 RT is the way of the future, then > presumably MINGW will eventually support it. If not, it > might be better to avoid VC7 anyway. :-) I've asked on the mingw users list about VC7 compatibility. It's quite possible that the msvcr71.dll EULA conditions will make this a non-starter, though (I don't understand them, but they sound scary...) Paul.
From gh@ghaering.de Fri May 9 14:06:00 2003 From: gh@ghaering.de (Gerhard Häring) Date: Fri, 09 May 2003 15:06:00 +0200 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <937756AF9E0BDC4396C09F32D8B41F2B2FE238@pauex2ku01.agere.com> References: <937756AF9E0BDC4396C09F32D8B41F2B2FE238@pauex2ku01.agere.com> Message-ID: <3EBBA7B8.6030309@ghaering.de> Williams, Gerald S (Jerry) wrote: > Paul Moore wrote: > >>One further data point - the free mingw gcc compiler generates >>binaries which depend on msvcrt.dll. So, if the Pythonlabs >>distribution switches to MSVC7, developers using MSVC6 *and* >>developers using mingw will be unable to build compatible extensions. >>The only compatible compiler will be MSVC7 (either the paid for >>version or the free limited version). > > Are there any reasons why we can't just switch to MINGW > instead? Yes. Several: 1) Python can't be built with MINGW, yet. I'm working on it, and so are other people, apparently (search python-list). 2) The Microsoft IDE is a more productive development environment for those that develop Python on Windows. I'm not sure, but my uneducated guess is that there are only a few Python developers who do any significant work on the win32 side, I only know about Guido, Tim, Mark. Those that actually put Python forward on win32 should decide about their development environment, IMO. My guess is that MINGW will eventually be a supported platform, but not the primary method of building Python. FWIW, Mozilla recently (1.4 beta 1) got compilable with mingw on win32. They're calling mingw a "tier 3" platform, while MSVC is a "tier 1" platform. I haven't looked up the terms, but I guess that "tier 3" means "nice to have" for a release, while "tier 1" means "must have". I reckon the situation will be a similar one for Python once it'll gain mingw support. > If the VC7 RT is the way of the future, then > presumably MINGW will eventually support it. [...] "Eventually" being the keyword here.
-- Gerhard From Paul.Moore@atosorigin.com Fri May 9 14:09:47 2003 From: Paul.Moore@atosorigin.com (Moore, Paul) Date: Fri, 9 May 2003 14:09:47 +0100 Subject: [Python-Dev] MS VC 7 offer Message-ID: <16E1010E4581B049ABC51D4975CEDB880113DAD1@UKDCX001.uk.int.atosorigin.com> From: Moore, Paul > I've asked on the mingw users list about VC7 compatibility. > It's quite possible that the msvcr71.dll EULA conditions > will make this a non-starter, though (I don't understand > them, but they sound scary...) FWIW, I just got a reply from the mingw list. Because msvcrt is distributed with the OS, and msvcr7 is not, GPL compatibility becomes an issue. Specifically, mingw exploits a specific clause in the GPL which allows dependencies on "components of the OS". MSVCRT qualifies here, but MSVCR7 doesn't. So I don't think mingw will support building DLLs which use MSVCR7 for the foreseeable future :-( Paul. From tim@zope.com Fri May 9 15:30:34 2003 From: tim@zope.com (Tim Peters) Date: Fri, 9 May 2003 10:30:34 -0400 Subject: [Python-Dev] Make _strptime only time.strptime implementation? In-Reply-To: Message-ID: [Tim] >> Nobody has even cut a little gas yet, so I'd proceed under the >> assumption that nobody will, and that the disabled HAVE_STRPTIME >> code will be physically deleted. [Kevin Jacobs] > Actually, I did, and on python-dev. Sorry, I meant since 2.3b1 was released. It's the purpose of pre-releases to find problems, and the whineometer gets reset when a new pre-release goes out. > strptime did not roundtrip correctly with mktime on Linux. It was my understanding (possibly wrong, and please correct me if it is) that Brett fixed this. > This made my application very unhappy, so I removed all calls to > strptime. Right now I don't have a vested interest in shooting > holes in the Python strptime, but I can't say I feel any warm > fuzzies about it.
> It seems hard to imagine that others will not run > into similar problems, regardless of the lack of specification for > exactly how strptime ought to work. The primary problem isn't the lack of a crisp spec, although that's the root cause of the real problem: the problem is that how strptime behaves varies in fact across boxes. I don't expect anyone could have felt warm fuzzies about that either, although someone could fool themself into hoping that the platform strptime behavior they happened to get was the only behavior their app would ever see. With a single implementation of strptime across platforms, that pleasant fantasy gets close to becoming the truth. Python is supposed to be a *little* less platform-dependent than C . From guido@python.org Fri May 9 15:38:09 2003 From: guido@python.org (Guido van Rossum) Date: Fri, 09 May 2003 10:38:09 -0400 Subject: [Python-Dev] Make _strptime only time.strptime implementation? In-Reply-To: "Your message of Fri, 09 May 2003 02:31:26 PDT." References: Message-ID: <200305091438.h49Ec9K08904@pcp02138704pcs.reston01.va.comcast.net> > I got a single email from someone asking me to change the > functionality so that it would raise an exception if part of the > input string was not parsed. That sounds like a good idea on the face of it. Or will this break existing code? --Guido van Rossum (home page: http://www.python.org/~guido/) From jacobs@penguin.theopalgroup.com Fri May 9 15:48:28 2003 From: jacobs@penguin.theopalgroup.com (Kevin Jacobs) Date: Fri, 9 May 2003 10:48:28 -0400 (EDT) Subject: [Python-Dev] Make _strptime only time.strptime implementation? In-Reply-To: Message-ID: On Fri, 9 May 2003, Tim Peters wrote: > > strptime did not roundtrip correctly with mktime on Linux. > > It was my understanding (possibly wrong, and please correct me if it is) > that Brett fixed this. I've just retested with my original code and it does look like Brett has indeed fixed it.
Or at least fixed it to the point that mktime doesn't croak on Linux, Solaris, Tru64, and IRIX with our app. > The primary problem isn't the lack of a crisp spec, although that's the root > cause of the real problem: the problem is that how strptime behaves varies > in fact across boxes. Or more importantly that strptime is now standardized in Python, while mktime is not. Given that my previous problems with the Python strptime have been addressed, I am now +1 on using it (although I'm still going to avoid it and mktime in my code as much as possible). -Kevin -- -- Kevin Jacobs The OPAL Group - Enterprise Systems Architect Voice: (216) 986-0710 x 19 E-mail: jacobs@theopalgroup.com Fax: (216) 986-0714 WWW: http://www.theopalgroup.com From gsw@agere.com Fri May 9 17:12:57 2003 From: gsw@agere.com (Williams, Gerald S (Jerry)) Date: Fri, 9 May 2003 12:12:57 -0400 Subject: [Python-Dev] MS VC 7 offer Message-ID: <937756AF9E0BDC4396C09F32D8B41F2B2FE23B@pauex2ku01.agere.com> There'll always be pressure to use VC for interoperability reasons. Some would attribute this to FUD. I'm not ready to go that far. My personal (admittedly probably controversial) preference would be to eventually drop VC support entirely in favor of existing free optimizing compilers. Of course, if Microsoft makes an optimizing compiler available for free to everyone, it would make this position much more difficult to maintain. Surprisingly, it sounds like the latter may be more likely than the former. If Python is moving toward VC7, I'd like to be counted in for a copy. I'd rather not switch, but it sounds like I'd have to, especially if there are legal issues with the VC7 runtime libraries. Gerhard Häring wrote: > 1) Python can't be built with MINGW, yet. I'm working on it, > and so are other people, apparently (search python-list). Good point. We don't know the full extent of the issues yet.
> 2) The Microsoft IDE is a more productive development environment for > those that develop Python on Windows. I'm not going to tell anyone that they can't use their IDE of choice, but keep in mind that IDE != compiler. Setting up project files to use different build tools isn't hard. If you're concerned about VC-specific debug information, you could still use VC for debug builds. gsw From martin@v.loewis.de Fri May 9 17:28:09 2003 From: martin@v.loewis.de (Martin v. Löwis) Date: Fri, 09 May 2003 18:28:09 +0200 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <937756AF9E0BDC4396C09F32D8B41F2B2FE23B@pauex2ku01.agere.com> References: <937756AF9E0BDC4396C09F32D8B41F2B2FE23B@pauex2ku01.agere.com> Message-ID: <3EBBD719.7040209@v.loewis.de> Williams, Gerald S (Jerry) wrote: > My personal (admittedly probably controversial) preference > would be to eventually drop VC support entirely in favor of > existing free optimizing compilers. You make it sound as if compilers are a religion. They are tools, and it matters how well they cooperate with Python on some system. They are not competitors, so you can make Python cooperate with existing free optimizing compilers, and simultaneously support VC. > Of course, if Microsoft > makes an optimizing compiler available for free to everyone, > it would make this position much more difficult to maintain. Your position is already difficult to maintain. He who makes the release chooses the tool. This is free software: If you don't like that release, make a different one. > If Python is moving toward VC7, I'd like to be counted in > for a copy. Python is not moving towards or away from specific product. If it is moving at all, it is moving towards ISO C99. We are talking about the PythonLabs Windows installer, not about "Python". > I'm not going to tell anyone that they can't use their IDE > of choice, but keep in mind that IDE != compiler. Setting > up project files to use different build tools isn't hard.
> If you're concerned about VC-specific debug information, > you could still use VC for debug builds. Somebody will have to maintain the VC makefiles. That somebody won't simultaneously maintain a MingW infrastructure, because of time constraints. So to use a MingW release process, we would need a volunteer to produce such a release. Do you volunteer? Regards, Martin From gsw@agere.com Fri May 9 18:43:29 2003 From: gsw@agere.com (Williams, Gerald S (Jerry)) Date: Fri, 9 May 2003 13:43:29 -0400 Subject: [Python-Dev] MS VC 7 offer Message-ID: <937756AF9E0BDC4396C09F32D8B41F2B2FE23C@pauex2ku01.agere.com> Martin v. Löwis wrote: > You make it sound as if compilers are a religion. Hardly intended, but when Microsoft (and the FSF) are involved, somehow religious wars often pop up. :-) > > If Python is moving toward VC7, [...] > > Python is not moving towards or away from specific product [...] > We are talking about the PythonLabs Windows installer, You are correct. Replace "Python" with "the PythonLabs Windows installer". > Somebody will have to maintain the VC makefiles. That somebody won't > simultaneously maintain a MingW infrastructure, because of time > constraints. So to use a MingW release process, we would need a > volunteer to produce such a release. Do you volunteer? Not today, maybe tomorrow. I'm already maintaining the SWIG package for Cygwin and not putting as much time into that as I should. Plus I have a new public domain project on SourceForge that I'm trying to get off the ground. I appreciate the need for having somebody actually do the work (especially the initial port). I'm glad to hear a few people are already working on this one. gsw From tim.one@comcast.net Fri May 9 20:08:44 2003 From: tim.one@comcast.net (Tim Peters) Date: Fri, 09 May 2003 15:08:44 -0400 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <3EBB74C5.7090600@lemburg.com> Message-ID: [M.-A.
Lemburg] > How about adding support for VC7 features in 2.4 AFAIK, current CVS Python compiles under VC7 now. > and starting the transition in 2.5 ? I expect that mostly depends on who's doing the work. PLabs Windows development is on auto-pilot (aka benign neglect). The first person to volunteer time to do anything here gets to set the policy for the next two decades <0.9 wink>. From mal@lemburg.com Fri May 9 21:15:23 2003 From: mal@lemburg.com (M.-A. Lemburg) Date: Fri, 09 May 2003 22:15:23 +0200 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: References: Message-ID: <3EBC0C5B.7030101@lemburg.com> Tim Peters wrote: > [M.-A. Lemburg] > >>How about adding support for VC7 features in 2.4 > > AFAIK, current CVS Python compiles under VC7 now. That's nice :-) I meant: adding features from VC7 to Python. >>and starting the transition in 2.5 ? > > I expect that mostly depends on who's doing the work. PLabs Windows > development is on auto-pilot (aka benign neglect). The first person to > volunteer time to do anything here gets to set the policy for the next two > decades <0.9 wink>. How come ? I always thought that Zope's main deployment platform is Windows.... -- Marc-Andre Lemburg eGenix.com Professional Python Software directly from the Source (#1, May 09 2003) >>> Python/Zope Products & Consulting ... http://www.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ EuroPython 2003, Charleroi, Belgium: 46 days left From guido@python.org Fri May 9 21:32:05 2003 From: guido@python.org (Guido van Rossum) Date: Fri, 09 May 2003 16:32:05 -0400 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: Your message of "Tue, 06 May 2003 18:06:11 EDT." Message-ID: <200305092032.h49KW5B11937@pcp02138704pcs.reston01.va.comcast.net> Here's a reply from Nick Hodapp. 
In a later email he also said: > And I wouldn't dream of giving you Standard ;) > > Today the C++ optimizer is NOT in the freely available tools. We > are fixing this, but the timeframe is uncertain. My own suggestion: let's ask for copies for the lead Windows developers who are distributing Windows binaries of core Python (that would be Tim & me) or major addons (several have been mentioned here). Then we can see how well this works, and together we can agree on a "sunset date" for the VC6-based installer. If you feel you qualify or you know someone who you think qualify, send me *private* email. --Guido van Rossum (home page: http://www.python.org/~guido/) > From: "Nick Hodapp" > To: "Guido van Rossum" > Date: Thu, 8 May 2003 17:28:45 -0700 > > Guido -- > > I read much of the archived thread. I'll respond here in email: > > 1) There was confusion about which version of Visual C++. The version > that I'm willing to donate to core Python developers is the most recent, > Visual C++ .NET 2003, aka. VC 7.1. I don't fully understand how many > "core developers" there are, but let's cap my gift at 10 copies. > > 2) There was a question about how quickly I could provide the licenses, > and whether I would give media or cash. I'd be providing boxed copies > of the product (likely our "Professional" edition) and recipients would > likely have to wait a month or so since we're not yet stocked > internally. > > 3) There was a question about redistributing the C-runtime DLLs. While > Microsoft recommends redistributing these using "merge modules" to > prevent versioning issues, this is not mandatory. From the product > documentation: > > "Visual Studio .NET provides its redistributable files in the form of > merge modules. These merge modules encapsulate the redistributable DLLs > and can be used by setup projects or other redistribution tools. Using > the merge modules ensures that the correct files are redistributed with > an application. 
However, if your installer does not support distributing > merge modules, you can redistribute the DLLs embedded in the merge > modules. You need to either extract the DLLs from the merge modules or > get them from the product CD or DVD. Do not copy files from your hard > disk." > > Also, you can statically bind to the C-runtime to avoid this issue > entirely. > > 3) Several questions regarding the build system. What features you > make use of are entirely up to you. Know that VC7 and VC7.1 do not > support the "export makefile" feature that is in VC6. My recommendation > would be to use the VC build system, but that is personal taste. Allow > me to hint that a command-line-tool version of the build/project system > is likely to be made available for free in the near future. But that is > just a hint, not a promise. > > 4) Several questions about binary compatibility of object files. I > don't believe we broke binary compatibility for linking (you should be > able to link a VC6 object file with a VC7.1 generated object module). > I'll follow up and get a confirmation. We did break binary > compatibility -- on purpose -- for some of the libraries, including MFC. > I doubt you guys use MFC. > > > The kind of feedback on the thread you sent is great -- I can use it as > input for how we design and package future product. My sole intent here > is to provide our new tool to some influential C++ developers in the > community. I've made the same offer to the Boost community. > > I'm also willing to help figure out if we can build Python completely > with the freely available SDK tools I mentioned. I don't know if this > is possible, but it would be fun to try -- and a plus for your community > if we succeed. 
> > Nick From martin@v.loewis.de Fri May 9 21:54:25 2003 From: martin@v.loewis.de (Martin v. Löwis) Date: Fri, 09 May 2003 22:54:25 +0200 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <3EBC0C5B.7030101@lemburg.com> References: <3EBC0C5B.7030101@lemburg.com> Message-ID: <3EBC1581.3060303@v.loewis.de> M.-A. Lemburg wrote: >> AFAIK, current CVS Python compiles under VC7 now. > > That's nice :-) I meant: adding features from VC7 to Python. That is done as well. There is quite some conditional code that selects features available only in VC7, such as usage of getaddrinfo in the socket module. Regards, Martin From tim.one@comcast.net Fri May 9 21:58:28 2003 From: tim.one@comcast.net (Tim Peters) Date: Fri, 09 May 2003 16:58:28 -0400 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <3EBC0C5B.7030101@lemburg.com> Message-ID: [MAL] >>> How about adding support for VC7 features in 2.4 [Tim] >> AFAIK, current CVS Python compiles under VC7 now. [MAL] > That's nice :-) I meant: adding features from VC7 to Python. Umm, could you name one? VC7 is a compiler to me. I don't know what it means to add a compiler feature to Python, so I transformed the suggestion into one I understood. >>> and starting the transition in 2.5 ? >> I expect that mostly depends on who's doing the work. PLabs Windows >> development is on auto-pilot (aka benign neglect). The first person to >> volunteer time to do anything here gets to set the policy for >> the next two decades <0.9 wink>. > How come ? I always thought that Zope's main deployment platform > is Windows.... I don't know, but doubt it, and Windows users are conspicuous by absence on the public Zope dev mailing lists. Regardless, Zope strives to be a platform-neutral application, so I've never been surprised that Zope Corp's interest in Windows-specific Python work has been undetectable.
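Martin's getaddrinfo example is conditional C (an #ifdef in the socket module's source); the same feature-selection pattern at the Python level is usually written with hasattr(). A sketch only: resolve() is an invented helper for illustration, not anything in the stdlib.

```python
import socket

def resolve(host, port):
    # Prefer getaddrinfo when the build provides it (all modern builds do);
    # fall back to the older IPv4-only lookup otherwise. This mirrors the
    # C-level "#ifdef HAVE_GETADDRINFO" selection Martin describes.
    if hasattr(socket, "getaddrinfo"):
        info = socket.getaddrinfo(host, port, 0, socket.SOCK_STREAM)
        return info[0][4]  # first candidate's sockaddr, e.g. ('127.0.0.1', 80)
    return (socket.gethostbyname(host), port)

addr = resolve("localhost", 80)
print(addr)
```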
From drifty@alum.berkeley.edu Sat May 10 00:41:25 2003 From: drifty@alum.berkeley.edu (Brett Cannon) Date: Fri, 9 May 2003 16:41:25 -0700 (PDT) Subject: [Python-Dev] Make _strptime only time.strptime implementation? In-Reply-To: <200305091438.h49Ec9K08904@pcp02138704pcs.reston01.va.comcast.net> References: <200305091438.h49Ec9K08904@pcp02138704pcs.reston01.va.comcast.net> Message-ID: [Guido van Rossum] > > I got a single email from someone asking me to change the > > functionality so that it would raise an exception if part of the > > input string was not parsed. > > That sounds like a good idea on the face of it. Or will this break > existing code? > Maybe. If they depend on some specific behavior on a platform that offers it, then yes, there could be issues. But since the docs are so vague if it does break code it will most likely be because someone didn't follow the warnings in the spec. And while we are on this subject, does anyone have any issues if I cause _strptime to recognize UTC and GMT as timezones? The Solaris box I always use to do libc strptime comparisons to does not recognize it as an acceptable value for %Z, but since it is a known fact that neither have daylight savings I feel _strptime should recognize this fact and set the daylight savings value to 0 instead of raising an error saying it doesn't know about that timezone. Any objections to the change? -Brett From guido@python.org Sat May 10 01:35:46 2003 From: guido@python.org (Guido van Rossum) Date: Fri, 09 May 2003 20:35:46 -0400 Subject: [Python-Dev] Make _strptime only time.strptime implementation? In-Reply-To: "Your message of Fri, 09 May 2003 16:41:25 PDT." References: <200305091438.h49Ec9K08904@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <200305100035.h4A0Zk812664@pcp02138704pcs.reston01.va.comcast.net> [Brett] > > > I got a single email from someone asking me to change the > > > functionality so that it would raise an exception if part of the > > > input string was not parsed.
> > [Guido van Rossum] > > That sounds like a good idea on the face of it. Or will this break > > existing code? [Brett] > Maybe. If they depend on some specific behavior on a platform that offers > it, then yes, there could be issues. But since the docs are so vague if > it does break code it will most likely be because someone didn't follow > the warnings in the spec. If you add some flag to control this behavior, defaulting to strict, then at least people who rely on the old (non-strict) behavior can use the flag rather than redesign their application. > And while we are on this subject, does anyone have any issues if I cause > _strptime to recognize UTC and GMT as timezones? The Solaris box I always > use to do libc strptime comparisons to does not recognize it as an > acceptable value for %Z, but since it is a known fact that neither have > daylight savings I feel _strptime should recognize this fact and set the > daylight savings value to 0 instead of raising an error saying it > doesn't know about that timezone. > > Any objections to the change? Go for it. --Guido van Rossum (home page: http://www.python.org/~guido/) From drifty@alum.berkeley.edu Sat May 10 03:13:35 2003 From: drifty@alum.berkeley.edu (Brett C.) Date: Fri, 09 May 2003 19:13:35 -0700 Subject: [Python-Dev] Make _strptime only time.strptime implementation? In-Reply-To: <200305100035.h4A0Zk812664@pcp02138704pcs.reston01.va.comcast.net> References: <200305091438.h49Ec9K08904@pcp02138704pcs.reston01.va.comcast.net> <200305100035.h4A0Zk812664@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <3EBC604F.7040306@ocf.berkeley.edu> Guido van Rossum wrote: > [Brett] > >>>>I got a single email from someone asking me to change the >>>>functionality so that it would raise an exception if part of the >>>>input string was not parsed. >>> > [Guido van Rossum] > >>>That sounds like a good idea on the face of it. Or will this break >>>existing code? > > > [Brett] > >>Maybe.
If they depend on some specific behavior on a platform that offers >>it, then yes, there could be issues. But since the docs are so vague if >>it does break code it will most likely be because someone didn't follow >>the warnings in the spec. > > > If you add some flag to control this behavior, defaulting to strict, > then at least people who rely on the old (non-strict) behavior can use > the flag rather than redesign their application. > But the problem is that I have no idea what the old behavior is. Since the spec is so vague and open I have no clue what all the various libc versions do. I have just been patching strptime the best I can to handle strange edge cases that pop up and work as people like Kevin need it to. Unless you are suggesting a flag that when set controls whether the Python version or a libc version if available is used, which I guess could work as a transition to get people to move over. Is this what you are getting at, Guido? And if it is, do you want it at the function or module level? I say function, but that is because it would be easier to code. =) >>And while we are on this subject, does anyone have any issues if I cause >>_strptime to recognize UTC and GMT as timezones? The Solaris box I always >>use to do libc strptime comparisons to does not recognize it as an >>acceptable value for %Z, but since it is a known fact that neither have >>daylight savings I feel _strptime should recognize this fact and set the >>daylight savings value to 0 instead of raising an error saying it >>doesn't know about that timezone. >> >>Any objections to the change? > > > Go for it. > Great. Once we have settled on this possible strict flag I will make the change to _strptime. -Brett From mal@lemburg.com Sat May 10 08:32:38 2003 From: mal@lemburg.com (M.-A.
Lemburg) Date: Sat, 10 May 2003 09:32:38 +0200 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <3EBC1581.3060303@v.loewis.de> References: <3EBC0C5B.7030101@lemburg.com> <3EBC1581.3060303@v.loewis.de> Message-ID: <3EBCAB16.3080007@lemburg.com> Martin v. Löwis wrote: > M.-A. Lemburg wrote: > >>> AFAIK, current CVS Python compiles under VC7 now. >> >> >> That's nice :-) I meant: adding features from VC7 to Python. > > That is done as well. There is quite some conditional code that > selects features available only in VC7, such as usage of > getaddrinfo in the socket module. Cool, so the time machine has worked again :-) -- Marc-Andre Lemburg eGenix.com Professional Python Software directly from the Source (#1, May 10 2003) >>> Python/Zope Products & Consulting ... http://www.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ EuroPython 2003, Charleroi, Belgium: 45 days left From mal@lemburg.com Sat May 10 08:35:44 2003 From: mal@lemburg.com (M.-A. Lemburg) Date: Sat, 10 May 2003 09:35:44 +0200 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: References: Message-ID: <3EBCABD0.7050700@lemburg.com> Tim Peters wrote: > [MAL] > >>>>How about adding support for VC7 features in 2.4 > > [Tim] > >>>AFAIK, current CVS Python compiles under VC7 now. > > [MAL] > >>That's nice :-) I meant: adding features from VC7 to Python. > > Umm, could you name one? VC7 is a compiler to me. I don't know what it means to > add a compiler feature to Python, so I transformed the suggestion into one I > understood. There must be some or else why would someone want to buy VC7 (apart from trying to be hype-compliant) ? >>>>and starting the transition in 2.5 ? > >>>I expect that mostly depends on who's doing the work. PLabs Windows >>>development is on auto-pilot (aka benign neglect).
The first person to >>>volunteer time to do anything here gets to set the policy for >>>the next two decades <0.9 wink>. > >>How come ? I always thought that Zope's main deployment platform >>is Windows.... > > I don't know, but doubt it, and Windows users are conspicuous by absence on > the public Zope dev mailing lists. Regardless, Zope strives to be a > platform-neutral application, so I've never been surprised that Zope Corp's > interest in Windows-specific Python work has been undetectable. Interesting. I find that most downloads for our Zope software tend to be for the win32 platform. -- Marc-Andre Lemburg eGenix.com Professional Python Software directly from the Source (#1, May 10 2003) >>> Python/Zope Products & Consulting ... http://www.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ EuroPython 2003, Charleroi, Belgium: 45 days left From martin@v.loewis.de Sat May 10 08:53:26 2003 From: martin@v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=) Date: 10 May 2003 09:53:26 +0200 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <3EBCABD0.7050700@lemburg.com> References: <3EBCABD0.7050700@lemburg.com> Message-ID: "M.-A. Lemburg" writes: > There must be some or else why would someone want to buy VC7 > (apart from trying to be hype-compliant) ? There are many reasons to buy VC7. I assume the typical reasons are - if you never had a Microsoft compiler, and buy one now, it will be 7.x (you may not even get VC6 anymore) - the C++ compiler has much improved - debugging was improved - it includes a more recent Windows SDK, exposing functions available on W2k+ - it supports C# and .NET development Of those reasons, few are relevant for Python, except that some people are now using VC7 exclusively for other reasons, and want a VC7-built python to better integrate their extensions. 
Regards, Martin From guido@python.org Sat May 10 18:42:51 2003 From: guido@python.org (Guido van Rossum) Date: Sat, 10 May 2003 13:42:51 -0400 Subject: [Python-Dev] Make _strptime only time.strptime implementation? In-Reply-To: "Your message of Fri, 09 May 2003 19:13:35 PDT." <3EBC604F.7040306@ocf.berkeley.edu> References: <200305091438.h49Ec9K08904@pcp02138704pcs.reston01.va.comcast.net> <200305100035.h4A0Zk812664@pcp02138704pcs.reston01.va.comcast.net> <3EBC604F.7040306@ocf.berkeley.edu> Message-ID: <200305101742.h4AHgpv13692@pcp02138704pcs.reston01.va.comcast.net> > > [Brett] > > > >>>>I got a single email from someone asking me to change the > >>>>functionality so that it would raise an exception if part of the > >>>>input string was not parsed. > >>> > > [Guido van Rossum] > > > >>>That sounds like a good idea on the face of it. Or will this break > >>>existing code? > > > > > > [Brett] > > > >>Maybe. If they depend on some specific behavior on a platform that offers > >>it, then yes, there could be issues. But since the docs are so vague if > >>it does break code it will most likely be because someone didn't follow > >>the warnings in the spec. > > > > > > If you add some flag to control this behavior, defaulting to strict, > > then at least people who rely on the old (non-strict) behavior can use > > the flag rather than redesign their application. > > > > But the problem is that I have no idea what the old behavior is. Since > the spec is so vague and open I have no clue what all the various libc > versions do. I have just been patching strptime the best I can to > handle strange edge cases that pop up and work as people like Kevin need > it to. OK. Maybe I misunderstood (I've now got to admit that I've never tried strptime myself). 
From your initial message (still quoted above) I thought that it was a simple case of strptime parsing as much as it could and then giving up (sort of like sscanf), and that the suggestion you received was to make it insist on parsing everything or fail. I still think that would be a clear improvement. But if the original situation wasn't as clear-cut, maybe I should have stayed out of this... > Unless you are suggesting a flag that when set controls whether the > Python version or a libc version if available is used, which I guess > could work as a transition to get people to move over. Is this what you > are getting at, Guido? And if it is, do you want it at the function or > module level? I say function, but that is because it would be easier to > code. =) No, that's not what I was going for at all -- I think that would be a mistake that would just cause people to worry needlessly about which strptime version they should use. --Guido van Rossum (home page: http://www.python.org/~guido/) From drifty@alum.berkeley.edu Sat May 10 19:29:07 2003 From: drifty@alum.berkeley.edu (Brett C.) Date: Sat, 10 May 2003 11:29:07 -0700 Subject: [Python-Dev] Make _strptime only time.strptime implementation? In-Reply-To: <200305101742.h4AHgpv13692@pcp02138704pcs.reston01.va.comcast.net> References: <200305091438.h49Ec9K08904@pcp02138704pcs.reston01.va.comcast.net> <200305100035.h4A0Zk812664@pcp02138704pcs.reston01.va.comcast.net> <3EBC604F.7040306@ocf.berkeley.edu> <200305101742.h4AHgpv13692@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <3EBD44F3.9020904@ocf.berkeley.edu> Guido van Rossum wrote: >>>[Brett] >>> >>> >>>>>>I got a single email from someone asking me to change the >>>>>>functionality so that it would raise an exception if part of the >>>>>>input string was not parsed. >>>>> >>>[Guido van Rossum] >>> >>> >>>>>That sounds like a good idea on the face of it. Or will this break >>>>>existing code? >>> >>> >>>[Brett] >>> >>> >>>>Maybe.
If they depend on some specific behavior on a platform that offers >>>>it, then yes, there could be issues. But since the docs are so vague if >>>>it does break code it will most likely be because someone didn't follow >>>>the warnings in the spec. >>> >>> >>>If you add some flag to control this behavior, defaulting to strict, >>>then at least people who rely on the old (non-strict) behavior can use >>>the flag rather than redesign their application. >>> >> >>But the problem is that I have no idea what the old behavior is. Since >>the spec is so vague and open I have no clue what all the various libc >>versions do. I have just been patching strptime the best I can to >>handle strange edge cases that pop up and work as people like Kevin need >>it to. > > > OK. Maybe I misunderstood (I've now got to admit that I've never > tried strptime myself). From your initial message (still quoted > above) I thought that it was a simple case of strptime parsing as much > as it could and then giving up (sort of like sscanf), and that the > suggestion you received was to make it insist on parsing everything or > fail. I still think that would be a clear improvement. But if the > original situation wasn't as clear-cut, maybe I should have stayed out > of this... > I wasn't clear enough. I already patched strptime to raise an error if there is anything left that was not parsed (my first CVS checkin actually); this functionality is already there. So I think we just talked ourselves in a circle. =) > >>Unless you are suggesting a flag that when set controls whether the >>Python version or a libc version if available is used, which I guess >>could work as a transition to get people to move over. Is this what you >>are getting at, Guido? And if it is, do you want it at the function or >>module level? I say function, but that is because it would be easier to >>code. 
=) > > > No, that's not what I was going for at all -- I think that would be a > mistake that would just cause people to worry needlessly about which > strptime version they should use. > Well, now that I think we have the whole strict parsing cleared up, I assume we don't need this anymore. Are there any other worries? -Brett From tim@zope.com Sun May 11 03:46:18 2003 From: tim@zope.com (Tim Peters) Date: Sat, 10 May 2003 22:46:18 -0400 Subject: [Python-Dev] Make _strptime only time.strptime implementation? In-Reply-To: Message-ID: [Kevin Jacobs, on strptime] > I've just retested with my original code and it does look like Brett has > indeed fixed it.
Or at least fixed it to the point that mktime doesn't > croak on Linux, Solaris, Tru64, and IRIX with our app. Great! Hats off to Brett. >> ... the problem is that how strptime behaves varies in fact across >> boxes. > Or more importantly that strptime is now standardized in Python, while > mktime is not. Ya, that one's a real problem. The new-in-2.3 datetime module supplies a saner way to deal with dates & times, but is new, and is probably lacking some features some people need. The problem with mktime() is that Python also wants nice ways to play with random C libraries on your platform, and platform mktime() implementations are *really* different (they vary in their beliefs about when "the epoch" begins, what the first representable year is, what the last representable year is, and whether leap seconds exist; POSIX gives clear answers to the first and last, explicitly gives up on the middle two, and not all Python platforms try to follow POSIX anyway). So I expect mktime() will remain a cross-platform mess forever -- else Python wouldn't play nice with the mess that is your platform <0.9 wink>. > Given that my previous problems with the Python strptime have been > addressed, I am now +1 on using it (although I'm still going to > avoid it and mktime in my code as much as possible). Unfortunately, datetime doesn't supply a wholly sane way to do strftime yet, and no way to do strptime. The ISO formats are very easy (by design) to parse, so those might be best to use in portable code.
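[Archive note: the strict-parsing behavior discussed in this thread, and the ISO formats Tim recommends for portable code, can be sketched in modern Python. This is an illustration only, written long after the thread; `datetime.fromisoformat` is Python 3.7+ and did not exist in 2003.]

```python
import time
from datetime import datetime

# Strict parsing: time.strptime raises ValueError if any part of the
# input string is left unparsed -- the behavior Brett checked in.
try:
    time.strptime("2003-05-10 leftover", "%Y-%m-%d")
    strict = False
except ValueError:
    strict = True

# ISO 8601 timestamps sidestep platform mktime() quirks entirely;
# datetime.fromisoformat (Python 3.7+) parses them portably.
moment = datetime.fromisoformat("2003-05-10T22:46:18")
```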
From skip@mojam.com Sun May 11 13:00:24 2003 From: skip@mojam.com (Skip Montanaro) Date: Sun, 11 May 2003 07:00:24 -0500 Subject: [Python-Dev] Weekly Python Bug/Patch Summary Message-ID: <200305111200.h4BC0Ot21778@manatee.mojam.com> Bug/Patch Summary ----------------- 415 open / 3632 total bugs (-7) 137 open / 2144 total patches (+2) New Bugs -------- bsddb.*open mode should default to 'r' rather than 'c' (2003-05-05) http://python.org/sf/732951 Need an easy way to check the version (2003-05-06) http://python.org/sf/733231 kwargs handled incorrectly (2003-05-06) http://python.org/sf/733667 PackMan recursive/force fails on pseudo packages (2003-05-07) http://python.org/sf/733819 Function for creating/extracting CoreFoundation types (2003-05-08) http://python.org/sf/734695 telnetlib.read_until: float req'd for timeout (2003-05-08) http://python.org/sf/734806 pyxml setup error on Mac OS X (2003-05-08) http://python.org/sf/734844 Lambda functions in list comprehensions (2003-05-08) http://python.org/sf/734869 Mach-O gcc optimisation flag can boost performance up to 10% (2003-05-09) http://python.org/sf/735110 urllib2 parse_http_list wrong return (2003-05-09) http://python.org/sf/735248 FILEMODE not honoured (2003-05-09) http://python.org/sf/735274 Command line timeit.py sets sys.path badly (2003-05-09) http://python.org/sf/735293 urllib / urllib2 should cache 301 redirections (2003-05-09) http://python.org/sf/735515 cStringIO.StringIO (2003-05-09) http://python.org/sf/735535 libwinsound.tex is missing MessageBeep() description (2003-05-10) http://python.org/sf/735674 New Patches ----------- build of html docs broken (liboptparse.tex) (2003-05-04) http://python.org/sf/732174 Docs for test package (2003-05-04) http://python.org/sf/732394 Allows os.forkpty to work on more platforms (Solaris!) 
(2003-05-04) http://python.org/sf/732401 Make Tkinter.py's nametowidget work with cloned menu widgets (2003-05-07) http://python.org/sf/734176 time.tzset documentation (2003-05-08) http://python.org/sf/735051 Python2.3b1 makefile improperly installs IDLE (2003-05-10) http://python.org/sf/735613 Python makefile may install idle in the wrong place (2003-05-10) http://python.org/sf/735614 Pydoc.py fixes links (2003-05-10) http://python.org/sf/735694 Closed Bugs ----------- textwrap has problems wrapping hyphens (2002-08-17) http://python.org/sf/596434 httplib HEAD request fails - keepalive (2002-10-11) http://python.org/sf/622042 new.function ignores keyword arguments (2003-02-25) http://python.org/sf/692959 Mention gmtime in Chapter 6.9 "Time access and conversions" (2003-03-05) http://python.org/sf/697983 Clarify timegm documentation (2003-03-05) http://python.org/sf/697986 Clarify daylight variable meaning (2003-03-05) http://python.org/sf/697988 Clarify mktime semantics (2003-03-05) http://python.org/sf/697989 Problems building python with tkinter on HPUX... 
(2003-03-17) http://python.org/sf/704919 OpenBSD 3.2: make altinstall dumps core (2003-03-29) http://python.org/sf/712056 cPickle fails to pickle inf (2003-04-03) http://python.org/sf/714733 urlopen(url_to_a_non-existing-domain) raises gaierror (2003-04-18) http://python.org/sf/723831 textwrap.wrap infinite loop (2003-04-23) http://python.org/sf/726446 use bsddb185 if necessary in dbhash (2003-04-24) http://python.org/sf/727137 email parsedate still wrong (PATCH) (2003-04-25) http://python.org/sf/727719 Tools/msgfmt.py results in two warnings under Python 2.3b1 (2003-04-26) http://python.org/sf/728277 setup.py breaks during build of Python-2.3b1 (2003-04-27) http://python.org/sf/728322 Long file names in osa suites (2003-04-27) http://python.org/sf/728574 ConfigurePython gives depreaction warning (2003-04-27) http://python.org/sf/728608 Unexpected Changes in list Iterator (2003-04-30) http://python.org/sf/730296 HTTPRedirectHandler variable out of scope (2003-05-01) http://python.org/sf/730963 urllib2 raises AttributeError on redirect (2003-05-01) http://python.org/sf/731116 IDE "lookup in documentation" doesn't work in interactive wi (2003-05-02) http://python.org/sf/731643 GIL not released around getaddrinfo() (2003-05-02) http://python.org/sf/731644 Closed Patches -------------- textwrap.dedent, inspect.getdoc-ish (2002-08-21) http://python.org/sf/598163 release GIL around getaddrinfo() (2002-09-03) http://python.org/sf/604210 Allow more Unicode on sys.stdout (2002-09-21) http://python.org/sf/612627 MSVC 7.0 compiler support (2002-09-25) http://python.org/sf/614770 Port tests to unittest (2003-01-05) http://python.org/sf/662807 Optimize dictionary resizing (2003-01-20) http://python.org/sf/671454 Dictionary tuning (2003-04-29) http://python.org/sf/729395 assert from longobject.c, line 1215 (2003-04-30) http://python.org/sf/730594 From tim@multitalents.net Sun May 11 19:49:13 2003 From: tim@multitalents.net (Tim Rice) Date: Sun, 11 May 2003 11:49:13 -0700 (PDT) 
Subject: [Python-Dev] patch 718286 Message-ID: It would be nice to see patch 718286 (DESTDIR variable patch) applied. It would make package builder's life easier. -- Tim Rice Multitalents (707) 887-1469 tim@multitalents.net From martin@v.loewis.de Sun May 11 21:53:33 2003 From: martin@v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=) Date: 11 May 2003 22:53:33 +0200 Subject: [Python-Dev] patch 718286 In-Reply-To: References: Message-ID: Tim Rice writes: > It would be nice to see patch 718286 (DESTDIR variable patch) applied. > It would make package builder's life easier. Done. Martin From drifty@alum.berkeley.edu Mon May 12 00:33:52 2003 From: drifty@alum.berkeley.edu (Brett C.) Date: Sun, 11 May 2003 16:33:52 -0700 Subject: [Python-Dev] Need some patches checked Message-ID: <3EBEDDE0.3040308@ocf.berkeley.edu> Since I am trying to tackle patches that were not written by me for the first time I need someone to check that I am doing the right thing. http://www.python.org/sf/649742 is a patch to make adding headers to urllib2's Request object have consistent case. I cleaned up the patch and everything seems reasonable and I don't see how doing this will hurt backwards-compatibility short of code that tried to add multiple headers of the same name with different case which is not legal anyway for HTTP. http://www.python.org/sf/639139 is a patch wanting to remove an isinstance assertion. Raymond initially suggested weakening the assertion to doing attribute checks. I personally see no reason we can't just take the check out entirely since the code does not appear to have any place where it will mask an AttributeError exception and the comment for the assert says it is just for checking the interface. But since Raymond initially wanted to go another direction I need someone to step in and give me some advice (or Raymond can look at it again; patch is old). -Brett From guido@python.org Mon May 12 01:20:18 2003 From: guido@python.org (Guido van Rossum) Date: Sun, 11 May 2003 20:20:18 -0400 Subject: [Python-Dev] Need some patches checked In-Reply-To: "Your message of Sun, 11 May 2003 16:33:52 PDT." <3EBEDDE0.3040308@ocf.berkeley.edu> References: <3EBEDDE0.3040308@ocf.berkeley.edu> Message-ID: <200305120020.h4C0KIm29011@pcp02138704pcs.reston01.va.comcast.net> > Since I am trying to tackle patches that were not written by me for the > first time I need someone to check that I am doing the right thing. > > http://www.python.org/sf/649742 is a patch to make adding headers to > urllib2's Request object have consistent case.
I cleaned up the patch > and everything seems reasonable and I don't see how doing this will hurt > backwards-compatibility short of code that tried to add multiple headers > of the same name with different case which is not legal anyway for HTTP. Good! I just noticed with disgust that the headers dict is currently case-sensitive, so that if I want to change the Content-type header, I have to use the exact case used in the source. I can't imagine b/w compatibility issues with this. > http://www.python.org/sf/639139 is a patch wanting to remove an > isinstance assertion. Raymond initially suggested weakening the > assertion to doing attribute checks. I personally see no reason we > can't just take the check out entirely since the code does not appear to > have any place where it will mask an AttributeError exception and the > comment for the assert says it is just for checking the interface. But > since Raymond initially wanted to go another direction I need someone to > step in and give me some advice (or Raymond can look at it again; patch > is old). The advantage of the assert (or some other check) is to catch a type error early, rather than 4 call levels deeper, where the source of the AttributeError may not be obvious when it happens. But I agree that that is a minor issue, and for correct code removing the assert is fine. Checking exactly for the attributes that are (or may be) used is probably overly expensive. --Guido van Rossum (home page: http://www.python.org/~guido/) From greg@cosc.canterbury.ac.nz Mon May 12 01:38:01 2003 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Mon, 12 May 2003 12:38:01 +1200 (NZST) Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <16E1010E4581B049ABC51D4975CEDB880113DAD1@UKDCX001.uk.int.atosorigin.com> Message-ID: <200305120038.h4C0c1F23070@oma.cosc.canterbury.ac.nz> > Specifically, mingw exploits a specific clause in the GPL which allows > dependencies on "components of the OS". MSVCRT qualifies here, but
MSVCRT qualifies here, but > MSVCR7 doesn't. But surely any GPL issues with using mingw apply only to libraries that mingw *itself* depends on, and then only if one is redistributing a work derived from mingw -- not to anything *created* with mingw? Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From gh@ghaering.de Mon May 12 02:26:38 2003 From: gh@ghaering.de (=?ISO-8859-1?Q?Gerhard_H=E4ring?=) Date: Mon, 12 May 2003 03:26:38 +0200 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <200305120038.h4C0c1F23070@oma.cosc.canterbury.ac.nz> References: <200305120038.h4C0c1F23070@oma.cosc.canterbury.ac.nz> Message-ID: <3EBEF84E.4090704@ghaering.de> Greg Ewing wrote: >>Specifically, mingw exploits a specific clause in the GPL which allows >>dependencies on "components of the OS". MSVCRT qualifies here, but >>MSVCR7 doesn't. > > But surely any GPL issues with using mingw apply only > to libraries that mingw *itself* depends on, and then > only if one is redistributing a work derived from > mingw -- not to anything *created* with mingw? That's my interpretation of the GPL, as well. -- Gerhard From tim.one@comcast.net Mon May 12 02:47:33 2003 From: tim.one@comcast.net (Tim Peters) Date: Sun, 11 May 2003 21:47:33 -0400 Subject: [Python-Dev] Dictionary sparseness In-Reply-To: <20030505201416.GB17384@barsoom.org> Message-ID: [Agthorr] > An alternate optimization would be the additional of an immutable > dictionary type to the language, initialized from a mutable dictionary > type. Upon creation, this dictionary would optimize itself, in a > manner similar to "gperf" program which creates (nearly) minimal > zero-collision hash tables. Possibly, but it's fraught with difficulties. 
For example, Python dicts can be indexed by lots of things besides 8-bit strings, and you generally need to know a great deal about the internal structure of a key type to generate a sensible hash function. A more fundamental problem is that minimality can be harmful when failing lookups are frequent: a sparse table has a good chance of hitting a null entry immediately then, but a minimal table never does. In the former case full-blown key comparison can be skipped when a null entry is hit, in the latter case full-blown key comparison is always needed on a failing lookup. For symbol table apps, Bentley & Sedgewick rehabilitated the idea of ternary search trees a few years ago, and I think you'd enjoy reading their papers: http://www.cs.princeton.edu/~rs/strings/ In particular, they're faster than hashing in the failing-lookup case. From tim.one@comcast.net Mon May 12 03:38:25 2003 From: tim.one@comcast.net (Tim Peters) Date: Sun, 11 May 2003 22:38:25 -0400 Subject: [Python-Dev] os.path.walk() lacks 'depth first' option In-Reply-To: <2mist7nd62.fsf@starship.python.net> Message-ID: [Jeremy Fincher] >>> On another related front, sets (in my Python 2.3a2) raise KeyError on a >>> .remove(elt) when elt isn't in the set. Since sets aren't mappings, >>> should that be a ValueError (like list raises) instead? [Tim] >> Since sets aren't sequences either, why should sets raise the >> same exception lists raise? It's up to the type to use whichever >> fool exceptions it chooses. This doesn't always make life easy for >> users, alas -- there's not much consistency in exception behavior >> across packages. In this case, a user would be wise to avoid >> expecting IndexError or KeyError, and catch their common base class >> (LookupError) instead. The distinction between IndexError and KeyError >> isn't really useful (IMO; LookupError was injected as a base class >> recently in Python's life). [Michael Hudson] > Without me noticing, too! 
Well, I knew there was a lookup error that > you get when failing to find a codec, but I didn't know IndexError and > KeyError derived from it... > > Also note that Jeremy was suggesting *ValueError*, not IndexError... Oops! So he was -- I spaced out on that. > that any kind of index-or-key-ing is going on is trivia of the > implementation, surely? Sure. I don't care for ValueError in this context, though -- there's nothing wrong with the value I'm testing for set membership, after all. Of course I never cared for ValueError on a failing list.remove() either. I like ValueError best when an input is of the right type but outside the defined domain of a function, like math.sqrt(-1.0) or chr(500). Failing to find something feels more like a (possibly proper subclass of) LookupError to me. But I'd hate to create even more useless distinctions among different kinds of lookup failures, so am vaguely happy reusing the KeyError flavor of LookupError. In any case, I'm not unhappy enough with it to do something about it. I nevertheless agree Jerry raised a good point, and maybe somebody else is unhappy enough with it to change it? From agthorr@barsoom.org Mon May 12 04:28:18 2003 From: agthorr@barsoom.org (Agthorr) Date: Sun, 11 May 2003 20:28:18 -0700 Subject: [Python-Dev] Dictionary sparseness In-Reply-To: References: <20030505201416.GB17384@barsoom.org> Message-ID: <20030512032817.GA31824@barsoom.org> On Sun, May 11, 2003 at 09:47:33PM -0400, Tim Peters wrote: > Possibly, but it's fraught with difficulties. For example, Python dicts can > be indexed by lots of things besides 8-bit strings, and you generally need > to know a great deal about the internal structure of a key type to generate > a sensible hash function. > A more fundamental problem is that minimality can be harmful when > failing lookups are frequent: a sparse table has a good chance of > hitting a null entry immediately then, but a minimal table never > does. 
In the former case full-blown key comparison can be skipped > when a null entry is hit, in the latter case full-blown key > comparison is always needed on a failing lookup. Both good observations. > For symbol table apps, Bentley & Sedgewick rehabilitated the idea of ternary > search trees a few years ago, and I think you'd enjoy reading their papers: > > http://www.cs.princeton.edu/~rs/strings/ > > In particular, they're faster than hashing in the failing-lookup case. hhmmm.. yes, those are interesting. Thanks :-) A few months ago I implemented suffix trees for fun and practice. Suffix trees are based on tries, and I used a binary-tree for each node to keep track of its children (which the papers point out is an equivalent way of doing ternary trees). (Suffix trees let you input a set of strings of total length n. This has a cost of O(n) time and O(n) memory. Then, you can look to see if a string of length m is a substring of any of the strings in the set in O(m) time; this is impressive since the number and size of the set of strings only matters for the setup operation; it has no effect on the lookup speed whatsoever.) Ternary search trees seem like a good approach for string-only dictionaries. These seem like an inelegant optimization that might yield performance improvements for places where non-string keys are syntactically disallowed anyway (such as the members of a class or module). 
-- Agthorr From jack@performancedrivers.com Mon May 12 07:16:57 2003 From: jack@performancedrivers.com (Jack Diederich) Date: Mon, 12 May 2003 02:16:57 -0400 Subject: [Python-Dev] Dictionary sparseness In-Reply-To: ; from tim.one@comcast.net on Sun, May 11, 2003 at 09:47:33PM -0400 References: <20030505201416.GB17384@barsoom.org> Message-ID: <20030512021657.C951@localhost.localdomain> On Sun, May 11, 2003 at 09:47:33PM -0400, Tim Peters wrote: > [Agthorr] > > An alternate optimization would be the addition of an immutable > > dictionary type to the language, initialized from a mutable dictionary > > type. Upon creation, this dictionary would optimize itself, in a > > manner similar to the "gperf" program which creates (nearly) minimal > > zero-collision hash tables. > > For symbol table apps, Bentley & Sedgewick rehabilitated the idea of ternary > search trees a few years ago, and I think you'd enjoy reading their papers: > > http://www.cs.princeton.edu/~rs/strings/ > > In particular, they're faster than hashing in the failing-lookup case. They nest well too. And you can do some caching if the higher-level trees are unchanging (local scope can shortcut into builtins). I have a pure-python ternary tree and a C w/python wrappers of ternary trees lying around. They were written with symbol tables in mind; I haven't touched em since my presentation proposal on the topic [ternary trees in general, replacing python symbol dict w/ t-trees as the closing example] was declined for the Portland OReilly thingy (bruised ego, sour grapes, et al). Cut-n-paste from an off-list mail for this undying thread below. Hettinger's idea of treaps is a good one. A ternary-treap would also be possible. -jack [Raymond] > My thought is to use a treap. The binary search side would scan the > hash values while the heap part would organize from most frequent to > least frequently accessed key. It could even be dynamic and re-arrange > the heap according to usage patterns.
[me] treaps would probably be a better fit than ternary trees, especially for builtins for the reasons you mention. A good default ordering would go a long way. [me, about ternary trees] They nest nicely, a valid 'next' node can be another ternary tree, so pseudo code for import would be newmodule = __import__('mymodule') # assume __module_symdict__ is the module's symbol table __module_symdict__['mymodule.'] = newmodule.__module_symdict__ a lookup for 'mymodule.some_function' would happily run from the current module's tree into the 'mymodule' tree. The '.' separator would only remain special from a user's point of view. If symbols don't share leading characters, ternary trees are just binary trees that require additional bookkeeping. This is probably the case, so ternary trees become less neat [even if they do make for prettier pictures]. From drifty@alum.berkeley.edu Mon May 12 09:09:12 2003 From: drifty@alum.berkeley.edu (Brett C.) Date: Mon, 12 May 2003 01:09:12 -0700 Subject: [Python-Dev] Random SF tracker ettiquete questions Message-ID: <3EBF56A8.8090603@ocf.berkeley.edu> First, do we care about closing RFEs? I realized that Skip does not keep count of them in his weekly summary so I am not sure how much we care about them. Should I waste my time wading through them to close them? Second, when is it okay to reassign a tracker item to yourself or close an item that is assigned to another person? I ask this because Fred has some patches assigned to him that I think I can close myself, but I don't want to step on his toes since they are assigned to him. Third, when does someone warrant being mentioned in the ACKS.txt file? Only when they have done some significant body of work? Or does committing even a one-line patch warrant inclusion? -Brett P.S.: Python got to #6 on SF's most active projects. Maybe I am overdoing the comments on patches.
=) From walter@livinglogic.de Mon May 12 10:56:25 2003 From: walter@livinglogic.de (Walter Dörwald) Date: Mon, 12 May 2003 11:56:25 +0200 Subject: [Python-Dev] Dictionary sparseness In-Reply-To: References: Message-ID: <3EBF6FC9.3090805@livinglogic.de> Tim Peters wrote: > [...] > For symbol table apps, Bentley & Sedgewick rehabilitated the idea of ternary > search trees a few years ago, and I think you'd enjoy reading their papers: > > http://www.cs.princeton.edu/~rs/strings/ > > In particular, they're faster than hashing in the failing-lookup case. The digital search tries mentioned in the article seem to use the same fundamental approach as state machines, i.e. while traversing the string, remember the string prefix that has already been recognized. Digital search tries traverse the tree and the memory is in the path that has been traversed. State machines traverse a transition table and the memory is the current state. Digital search tries seem to be easy to update, while state machines are not. Has anybody tried state machines for symbol tables in Python? The size of the transition table might be a problem and any attempt to reduce the size might kill performance in the inner loop. Performance-wise, stringobject.c/string_hash() is hard to beat (especially when the hash value is already cached). Bye, Walter Dörwald From mwh@python.net Mon May 12 11:35:25 2003 From: mwh@python.net (Michael Hudson) Date: Mon, 12 May 2003 11:35:25 +0100 Subject: [Python-Dev] Random SF tracker ettiquete questions In-Reply-To: <3EBF56A8.8090603@ocf.berkeley.edu> ("Brett C."'s message of "Mon, 12 May 2003 01:09:12 -0700") References: <3EBF56A8.8090603@ocf.berkeley.edu> Message-ID: <2m3cjk78r6.fsf@starship.python.net> "Brett C." writes: > First, do we care about closing RFEs? I realized that Skip does not > keep count of them in his weekly summary so I am not sure how much we > care about them. Should I waste my time wading through them to close > them?
I remember Martin pointing out that we should prioritize patches over bug reports, as for a patch someone has actually put some work in. By this light RFEs are at the bottom of the pile. > Second, when is it okay to reassign a tracker item to yourself or > close an item that is assigned to another person? I ask this because > Fred has some patches assigned to him that I think I can close myself, > but I don't want to step on his toes since they are assigned to him. All doc bugs get assigned to Fred by default, IIRC. This means he probably doesn't feel too attached to them... > Third, when does someone warrant being mentioned in the ACKS.txt file? > Only when they have done some significant body of work? Or does > committing even a one-line patch warrant inclusion? I err on the side of adding people. It's a judgement call. IMHO pointing out a typo in the docs isn't sufficient, but just about anything that involves thinking is. > P.S.: Python got to #6 on SF's most active projects. Maybe I am > overdoing the comments on patches. =) Nonsense! Cheers, M. -- Good? Bad? Strap him into the IETF-approved witch-dunking apparatus immediately! -- NTK now, 21/07/2000 From fdrake@acm.org Mon May 12 11:54:11 2003 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Mon, 12 May 2003 06:54:11 -0400 Subject: [Python-Dev] Random SF tracker ettiquete questions In-Reply-To: <3EBF56A8.8090603@ocf.berkeley.edu> References: <3EBF56A8.8090603@ocf.berkeley.edu> Message-ID: <16063.32083.158258.152293@grendel.zope.com> Brett C. writes: > Second, when is it okay to reassign a tracker item to yourself or close > an item that is assigned to another person? I ask this because Fred has > some patches assigned to him that I think I can close myself, but I > don't want to step on his toes since they are assigned to him. If you have the time to review them, please feel free to reassign to yourself. I'm really busy with other projects at the moment, so I'm unlikely to get to them real soon. Thanks!
-Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From guido@python.org Mon May 12 12:58:19 2003 From: guido@python.org (Guido van Rossum) Date: Mon, 12 May 2003 07:58:19 -0400 Subject: [Python-Dev] os.path.walk() lacks 'depth first' option In-Reply-To: "Your message of Sun, 11 May 2003 22:38:25 EDT." References: Message-ID: <200305121158.h4CBwKD29921@pcp02138704pcs.reston01.va.comcast.net> > I like ValueError best when an input is of the right type but > outside the defined domain of a function, like math.sqrt(-1.0) or > chr(500). Failing to find something feels more like a (possibly > proper subclass of) LookupError to me. Yeah, [].remove(42) raising ValueError is a bit weird. It was put in before we had the concept of LookupError, and the rationale for using ValueError was that the *value* is not found -- can't use IndexError because the value is chosen from a different set than the index, can't use KeyError because lists don't have a concept of key. In retrospect, it would have been better to define a SearchError, subclassing LookupError. OTOH there's something to say for fewer errors, not more; e.g. sometimes I wish AttributeError and TypeError were unified, because AttributeError usually means that an object isn't of the expected type. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Mon May 12 13:11:22 2003 From: guido@python.org (Guido van Rossum) Date: Mon, 12 May 2003 08:11:22 -0400 Subject: [Python-Dev] Random SF tracker ettiquete questions In-Reply-To: "Your message of Mon, 12 May 2003 01:09:12 PDT." <3EBF56A8.8090603@ocf.berkeley.edu> References: <3EBF56A8.8090603@ocf.berkeley.edu> Message-ID: <200305121211.h4CCBNK29988@pcp02138704pcs.reston01.va.comcast.net> > First, do we care about closing RFEs? I realized that Skip does not > keep count of them in his weekly summary so I am not sure how much > we care about them. Should I waste my time wading through them to > close them? 
I don't think that's a waste of time. It's good to keep track of how much progress we make in any dimension. > Second, when is it okay to reassign a tracker item to yourself or > close an item that is assigned to another person? I ask this > because Fred has some patches assigned to him that I think I can > close myself, but I don't want to step on his toes since they are > assigned to him. All doc issues are automatically assigned to Fred (maybe this needs to be revised); I don't think he'll be offended if you take some work off his chest. > Third, when does someone warrant being mentioned in the ACKS.txt > file? Only when they have done some significant body of work? Or > does committing even a one-line patch warrant inclusion? I tend to add people to Misc/ACKS for any code contribution whatsoever, including one-liners. > -Brett > > P.S.: Python got to #6 on SF's most active projects. Maybe I am > overdoing the comments on patches. =) No, please keep it up. :-) --Guido van Rossum (home page: http://www.python.org/~guido/) From fdrake@acm.org Mon May 12 14:47:27 2003 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Mon, 12 May 2003 09:47:27 -0400 Subject: [Python-Dev] Random SF tracker ettiquete questions In-Reply-To: <200305121211.h4CCBNK29988@pcp02138704pcs.reston01.va.comcast.net> References: <3EBF56A8.8090603@ocf.berkeley.edu> <200305121211.h4CCBNK29988@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <16063.42479.84510.486834@grendel.zope.com> Guido van Rossum writes: > All doc issues are automatically assigned to Fred (maybe this needs to > be revised); I don't think he'll be offended if you take some work off > his chest. Is this still the case? I'm fairly certain I changed that, given the amount of time I haven't been able to spend over the past several months. (I'm pretty sure I actually changed that last year... I'd check but the SF website is down.) -Fred -- Fred L. Drake, Jr.
PythonLabs at Zope Corporation From pedronis@bluewin.ch Mon May 12 15:48:01 2003 From: pedronis@bluewin.ch (Samuele Pedroni) Date: Mon, 12 May 2003 16:48:01 +0200 Subject: [Python-Dev] codeop: small details (Q); commit priv request Message-ID: <5.2.1.1.0.20030512140727.02362ab0@localhost> 1) Python 2.3b1 (#40, Apr 25 2003, 19:06:24) [MSC v.1200 32 bit (Intel)] on win32 Type "help", "copyright", "credits" or "license" for more information. >>> import codeop >>> codeop.compile_command("",symbol="eval") Traceback (most recent call last): File "", line 1, in ? File "s:\transit\py23\lib\codeop.py", line 129, in compile_command return _maybe_compile(_compile, source, filename, symbol) File "s:\transit\py23\lib\codeop.py", line 106, in _maybe_compile raise SyntaxError, err1 File "", line 1 pass ^ SyntaxError: invalid syntax the error is basically an artifact of the logic that enforces: compile_command("",symbol="single") === compile_command("pass",symbol="single") (this makes typing enter immediately after the prompt at a simulated shell a nop as expected) I would expect compile_command("",symbol="eval") to return None, i.e. to simply signal an incomplete expression (that is what would happen if the code for the "eval" case avoided the cited logic). 2) symbol = "exec" is silently accepted but the documentation intentionally only refers to "eval" and "single" as valid values for symbol. Maybe a ValueError should be raised. Context: I was working on improving Jython codeop compatibility with CPython codeop.
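For contrast, the documented symbol="single" behaviour Samuele describes is easy to see with the public compile_command() API alone (a quick sketch; the contested "eval" case from the traceback above is deliberately omitted):

```python
import codeop

# Hitting Enter at an empty simulated prompt: with the default
# symbol="single", "" is treated as "pass", so a real (no-op) code
# object comes back rather than None.
code = codeop.compile_command("")
assert code is not None

# A syntactically incomplete statement signals "keep typing" by
# returning None -- this is what drives the "..." continuation prompt.
assert codeop.compile_command("if x:") is None
```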
Btw, as considered here by Guido http://sourceforge.net/tracker/index.php?func=detail&aid=645404&group_id=5470&atid=305470 I would ask to have commit privileges for CPython regards From tino.lange@isg.de Mon May 12 15:49:37 2003 From: tino.lange@isg.de (Tino Lange) Date: Mon, 12 May 2003 16:49:37 +0200 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <200305092032.h49KW5B11937@pcp02138704pcs.reston01.va.comcast.net> References: <200305092032.h49KW5B11937@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <3EBFB481.4070907@isg.de> Hi! It's still not clear to me after reading this thread: Besides the optimizer - is it really possible to build a python.exe that *doesn't* depend on the .NET framework with the free download combination of ".NET 1.1" / "latest SDK". Of course you can build it with the VC7.1 Pro, there's a framework-compiler and a standalone vc++-compiler included. But also in the free download edition? I thought this is only the several framework compilers? Thanks for giving me a hint. Best regards Tino From barry@python.org Mon May 12 16:15:14 2003 From: barry@python.org (Barry Warsaw) Date: 12 May 2003 11:15:14 -0400 Subject: [Python-Dev] codeop: small details (Q); commit priv request In-Reply-To: <5.2.1.1.0.20030512140727.02362ab0@localhost> References: <5.2.1.1.0.20030512140727.02362ab0@localhost> Message-ID: <1052752514.22883.16.camel@barry> On Mon, 2003-05-12 at 10:48, Samuele Pedroni wrote: > Btw, as considered here by Guido > http://sourceforge.net/tracker/index.php?func=detail&aid=645404&group_id=5470&atid=305470 > I would ask to have commit privileges for CPython Done! 
-Barry From mwh@python.net Mon May 12 16:22:22 2003 From: mwh@python.net (Michael Hudson) Date: Mon, 12 May 2003 16:22:22 +0100 Subject: [Python-Dev] codeop: small details (Q); commit priv request In-Reply-To: <5.2.1.1.0.20030512140727.02362ab0@localhost> (Samuele Pedroni's message of "Mon, 12 May 2003 16:48:01 +0200") References: <5.2.1.1.0.20030512140727.02362ab0@localhost> Message-ID: <2mvfwg5gwh.fsf@starship.python.net> Samuele Pedroni writes: > 1) > > Python 2.3b1 (#40, Apr 25 2003, 19:06:24) [MSC v.1200 32 bit (Intel)] on win32 > Type "help", "copyright", "credits" or "license" for more information. > >>> import codeop > >>> codeop.compile_command("",symbol="eval") > Traceback (most recent call last): > File "", line 1, in ? > File "s:\transit\py23\lib\codeop.py", line 129, in compile_command > return _maybe_compile(_compile, source, filename, symbol) > File "s:\transit\py23\lib\codeop.py", line 106, in _maybe_compile > raise SyntaxError, err1 > File "", line 1 > pass > ^ > SyntaxError: invalid syntax > > > the error is basically an artifact of the logic that enforces: > > compile_command("",symbol="single") === compile_command("pass",symbol="single") > > (this makes typing enter immediately after the prompt at a simulated > shell a nop as expected) > > I would expect > > compile_command("",symbol="eval") > > to return None, i.e. to simply signal an incomplete expression (that > is what would happen if the code for "eval" case would avoid the cited > logic). OK, but I think you should preserve the existing behaviour for symbol="single". Cheers, M. -- I also feel it essential to note, [...], that Description Logics, non-Monotonic Logics, Default Logics and Circumscription Logics can all collectively go suck a cow. Thank you. 
-- http://advogato.org/person/Johnath/diary.html?start=4 From harri.pasanen@trema.com Mon May 12 17:13:36 2003 From: harri.pasanen@trema.com (Harri Pasanen) Date: Mon, 12 May 2003 18:13:36 +0200 Subject: [Python-Dev] Python 2.3b1 _XOPEN_SOURCE value from configure.in Message-ID: <200305121813.36383.harri.pasanen@trema.com> Seems that Python 2.3b1 hardcodes the value of _XOPEN_SOURCE to 600 on Solaris. In configure.in: if test $define_xopen_source = yes then AC_DEFINE(_XOPEN_SOURCE, 600, Define to the level of X/Open that your system supports) Now the correct value for Solaris 2.7 in our case is 500, which is defined in the system headers, and boost config picks up the correct value. So when compiling boost-python, there are a zillion warning messages about redefinition of _XOPEN_SOURCE. Is this a problem people are aware of, and is someone fixing it as I write, or is a volunteer needed? -Harri From pedronis@bluewin.ch Mon May 12 17:31:13 2003 From: pedronis@bluewin.ch (Samuele Pedroni) Date: Mon, 12 May 2003 18:31:13 +0200 Subject: [Python-Dev] codeop: small details (Q); commit priv request In-Reply-To: <1052752514.22883.16.camel@barry> References: <5.2.1.1.0.20030512140727.02362ab0@localhost> <5.2.1.1.0.20030512140727.02362ab0@localhost> Message-ID: <5.2.1.1.0.20030512183026.01cef8c8@localhost> At 11:15 12.05.2003 -0400, Barry Warsaw wrote: >On Mon, 2003-05-12 at 10:48, Samuele Pedroni wrote: > > > Btw, as considered here by Guido > > > http://sourceforge.net/tracker/index.php?func=detail&aid=645404&group_id=5470&atid=305470 > > I would ask to have commit privileges for CPython >Done! >-Barry Thanks.
From pedronis@bluewin.ch Mon May 12 17:34:21 2003 From: pedronis@bluewin.ch (Samuele Pedroni) Date: Mon, 12 May 2003 18:34:21 +0200 Subject: [Python-Dev] codeop: small details (Q); commit priv request In-Reply-To: <2mvfwg5gwh.fsf@starship.python.net> References: <5.2.1.1.0.20030512140727.02362ab0@localhost> <5.2.1.1.0.20030512140727.02362ab0@localhost> Message-ID: <5.2.1.1.0.20030512183320.01d012c8@localhost> At 16:22 12.05.2003 +0100, Michael Hudson wrote: > > > > I would expect > > > > compile_command("",symbol="eval") > > > > to return None, i.e. to simply signal an incomplete expression (that > > is what would happen if the code for "eval" case would avoid the cited > > logic). > >OK, but I think you should preserve the existing behaviour for >symbol="single". Of course, I didn't mean otherwise. I can prepare a patch and I have also a somewhat beefed-up test_codeop. From tim.one@comcast.net Mon May 12 21:58:27 2003 From: tim.one@comcast.net (Tim Peters) Date: Mon, 12 May 2003 16:58:27 -0400 Subject: [Python-Dev] Dictionary sparseness In-Reply-To: <3EBF6FC9.3090805@livinglogic.de> Message-ID: [Walter Dörwald] > ... > Has anybody tried state machines for symbol tables in Python? Not that I know of. > The size of the transition table might be a problem and any attempt > to reduce the size might kill performance in the inner loop. > Performancewise stringobject.c/string_hash() is hard to > beat (especially when the hash value is already cached). Which is why, if nobody ever did or ever does try alternative approaches, I would be neither surprised nor disappointed <0.9 wink>. From martin@v.loewis.de Mon May 12 22:04:06 2003 From: martin@v.loewis.de (Martin v.
Löwis) Date: 12 May 2003 23:04:06 +0200 Subject: [Python-Dev] Python 2.3b1 _XOPEN_SOURCE value from configure.in In-Reply-To: <200305121813.36383.harri.pasanen@trema.com> References: <200305121813.36383.harri.pasanen@trema.com> Message-ID: Harri Pasanen writes: > So when compiling boost-python, there are zillion warning messages > about redefinition of _XOPEN_SOURCE. [...] > Is this a problem people are aware of, and is someone fixing it as I > write, or is a volunteer needed? I'm not aware of this problem specifically, but of the problem in general. I'd claim that this is a bug in Boost. Python.h should be the first header file included, before any header file from the application or the system, so it gets to define the value of _XOPEN_SOURCE. This is documented in the extensions manual. Of course, it would be sufficient to set it to a smaller value on systems that support only older X/Open issues; I think I'd accept a patch that changes this (if the patch is correct, of course). Regards, Martin From barry@barrys-emacs.org Mon May 12 22:37:35 2003 From: barry@barrys-emacs.org (Barry Scott) Date: Mon, 12 May 2003 22:37:35 +0100 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: References: <3EBCABD0.7050700@lemburg.com> <3EBCABD0.7050700@lemburg.com> Message-ID: <5.1.1.6.0.20030512222353.022ade78@torment.chelsea.private> Did I miss the answer to why bother to move to VC7? As a C project I know of very little to recommend VC7 or VC7.1. As a C++ developer I've decided that VC7 is little more than a broken VC6. Maybe Jesse Lipcon (who works for MS now) has managed to make VC7.1 more standards-compatible for C++ work, which would recommend it to C++ developers. Note that wxPython claims that it will not compile correctly with VC7 unless you add a workaround for a bug in the code generator.
Barry From nhodgson@bigpond.net.au Mon May 12 22:36:00 2003 From: nhodgson@bigpond.net.au (Neil Hodgson) Date: Tue, 13 May 2003 07:36:00 +1000 Subject: [Python-Dev] MS VC 7 offer References: <200305092032.h49KW5B11937@pcp02138704pcs.reston01.va.comcast.net> <3EBFB481.4070907@isg.de> Message-ID: <00c701c318ce$7b249330$3da48490@neil> Tino Lange: > Besides the optimizer - is it really possible to build a python.exe that > *doesn't* depend on the .NET framework with the free download > combination of ".NET 1.1" / "latest SDK". The C++ compiler in the free .NET SDK download can create executables with no dependence on the .NET runtime. There is only a small set of headers and libraries in the .NET SDK download but a full set is in the free Platform SDK download. IIRC, the first public beta of .NET even included the optimizer but that was swiftly removed. Neil From logistix@cathoderaymission.net Mon May 12 23:06:53 2003 From: logistix@cathoderaymission.net (logistix) Date: Mon, 12 May 2003 18:06:53 -0400 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <5.1.1.6.0.20030512222353.022ade78@torment.chelsea.private> Message-ID: <000001c318d2$cbf3b0d0$20bba8c0@XP> > -----Original Message----- > From: python-dev-admin@python.org > [mailto:python-dev-admin@python.org] On Behalf Of Barry Scott > Sent: Monday, May 12, 2003 5:38 PM > To: 'python-dev' > Subject: Re: [Python-Dev] MS VC 7 offer > > > Did I miss the answer to why bother to move to VC7? 
> From logistix@cathoderaymission.net Mon May 12 23:08:56 2003 From: logistix@cathoderaymission.net (logistix) Date: Mon, 12 May 2003 18:08:56 -0400 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <5.1.1.6.0.20030512222353.022ade78@torment.chelsea.private> Message-ID: <000101c318d3$1583c870$20bba8c0@XP> > -----Original Message----- > From: python-dev-admin@python.org > [mailto:python-dev-admin@python.org] On Behalf Of Barry Scott > Sent: Monday, May 12, 2003 5:38 PM > To: 'python-dev' > Subject: Re: [Python-Dev] MS VC 7 offer > > > Did I miss the answer to why bother to move to VC7? > Here's one reason. You can't buy VC6.0 anymore. I can't find any indication of an Official End-of-life on MS's site though. http://msdn.microsoft.com/vstudio/previous/downgrade.aspx From tdelaney@avaya.com Tue May 13 01:00:00 2003 From: tdelaney@avaya.com (Delaney, Timothy C (Timothy)) Date: Tue, 13 May 2003 10:00:00 +1000 Subject: [Python-Dev] os.path.walk() lacks 'depth first' option Message-ID: <338366A6D2E2CA4C9DAEAE652E12A1DE57F1E2@au3010avexu1.global.avaya.com> > From: Guido van Rossum [mailto:guido@python.org] > > OTOH there's something to say for fewer errors, not more; > e.g. sometimes I wish AttributeError and TypeError were unified, > because AttributeError usually means that an object isn't of the > expected type. Hmm ... I was going to ask if there was any reason not to make AttributeError a subclass of TypeError, but that would mean that code like: try: ... except TypeError: ... would also catch all AttributeErrors. Maybe we should have a __future__ directive and phase it in starting in 2.4? I wouldn't suggest making AttributeError and TypeError be synonyms though ... I think it is useful to distinguish the situations. I can't think of any case in *my* code where I would want to distinguish between a TypeError and an AttributeError - usually I end up having: try: ... except (TypeError, AttributeError): ...
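A small, invented example of the kind of code Tim means, where either exception reduces to "this object can't give me what I need" (the names here are illustrative, not from the thread):

```python
def label(obj):
    """Render obj's name, treating 'missing' and 'wrong type' alike."""
    try:
        return 'name: ' + obj.name
    except (TypeError, AttributeError):
        # AttributeError: obj has no .name attribute at all
        # TypeError: obj.name exists but isn't string-like
        return 'name: <unknown>'

class Named:
    name = 'guido'

class Anonymous:
    pass

class BadName:
    name = 42
```

label(Named()) gives 'name: guido', while label(Anonymous()) and label(BadName()) both give 'name: <unknown>': from this caller's point of view the two failure modes are indistinguishable, which is exactly why the combined except clause is natural here.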
Tim Delaney From pje@telecommunity.com Tue May 13 01:34:21 2003 From: pje@telecommunity.com (Phillip J. Eby) Date: Mon, 12 May 2003 20:34:21 -0400 Subject: [Python-Dev] os.path.walk() lacks 'depth first' option In-Reply-To: <338366A6D2E2CA4C9DAEAE652E12A1DE57F1E2@au3010avexu1.global .avaya.com> Message-ID: <5.1.1.6.0.20030512202835.03078bb0@telecommunity.com> At 10:00 AM 5/13/03 +1000, Delaney, Timothy C (Timothy) wrote: >I can't think of any case in *my* code where I would want to distinguish >between a TypeError and an AttributeError - usually I end up having: > > try: > ... > except (TypeError, AttributeError): > ... How odd. I was going to say the reverse; that I *always* want to distinguish between the two, because TypeError almost invariably is a programming error of some kind, while AttributeError is nearly always an error that I'm checking in order to have a fallback. E.g.: try: foo = thingy.foo except AttributeError: # default case else: foo() However, if 'thingy.foo' were to raise any other kind of error, such as a TypeError, it'd probably mean that thingy had a broken 'foo' descriptor that I'd want to know about. From guido@python.org Tue May 13 02:44:41 2003 From: guido@python.org (Guido van Rossum) Date: Mon, 12 May 2003 21:44:41 -0400 Subject: [Python-Dev] os.path.walk() lacks 'depth first' option In-Reply-To: "Your message of Mon, 12 May 2003 20:34:21 EDT." <5.1.1.6.0.20030512202835.03078bb0@telecommunity.com> References: <5.1.1.6.0.20030512202835.03078bb0@telecommunity.com> Message-ID: <200305130144.h4D1ifq30620@pcp02138704pcs.reston01.va.comcast.net> > How odd. I was going to say the reverse; that I *always* want to > distinguish between the two, because TypeError almost invariably is a > programming error of some kind, while AttributeError is nearly always an > error that I'm checking in order to have a fallback. 
E.g.: > > try: > foo = thingy.foo > except AttributeError: > # default case > else: > foo() > > However, if 'thingy.foo' were to raise any other kind of error, such as a > TypeError, it'd probably mean that thingy had a broken 'foo' descriptor > that I'd want to know about. This sounds like a much more advanced use, typical to a certain style of programming. Others would do this using hasattr() or three-argument getattr(); some will argue that you should have a base class that handles the default case so you don't need to handle that case separately at all (though that may not always be possible, e.g. when dealing with objects created by a 3rd party library). Your example argues for allowing to distinguish between AttributeError and TypeError, but doesn't convince me that they are totally different beasts. --Guido van Rossum (home page: http://www.python.org/~guido/) From pje@telecommunity.com Tue May 13 04:02:24 2003 From: pje@telecommunity.com (Phillip J. Eby) Date: Mon, 12 May 2003 23:02:24 -0400 Subject: [Python-Dev] os.path.walk() lacks 'depth first' option In-Reply-To: <200305130144.h4D1ifq30620@pcp02138704pcs.reston01.va.comca st.net> References: <"Your message of Mon, 12 May 2003 20:34:21 EDT." <5.1.1.6.0.20030512202835.03078bb0@telecommunity.com> <5.1.1.6.0.20030512202835.03078bb0@telecommunity.com> Message-ID: <5.1.0.14.0.20030512224621.03ef8500@mail.telecommunity.com> At 09:44 PM 5/12/03 -0400, Guido van Rossum wrote: >This sounds like a much more advanced use, typical to a certain style >of programming. Framework programming, for maximal adaptability of third-party code, yes. >Others would do this using hasattr() or three-argument >getattr() I use three-argument getattr() most of the time, actually. However, doesn't 'getattr()' rely on catching AttributeError? I just wanted my example to be explicit. >Your example argues for allowing to distinguish between >AttributeError and TypeError, but doesn't convince me that they are >totally different beasts.
Sure. My point is more that using exceptions to indicate failed lookups is a tricky business. I almost wish there was a way to declare the "normal" exceptions raised by an operation; or perhaps to easily query where an exception was raised. Nowadays, when designing interfaces that need to signal some kind of exceptional condition, I tend to want to have them return sentinel values rather than raise exceptions, in order to distinguish between "failed" and "broken". I'm sure that this is an issue specific to framework programming and to large team-built systems, though, and not something that bothers the mythical "average developer" a bit. :) From tanzer@swing.co.at Tue May 13 06:33:27 2003 From: tanzer@swing.co.at (Christian Tanzer) Date: Tue, 13 May 2003 07:33:27 +0200 Subject: [Python-Dev] os.path.walk() lacks 'depth first' option In-Reply-To: Your message of "Tue, 13 May 2003 10:00:00 +1000." <338366A6D2E2CA4C9DAEAE652E12A1DE57F1E2@au3010avexu1.global.avaya.com> Message-ID: "Delaney, Timothy C (Timothy)" wrote: > > From: Guido van Rossum [mailto:guido@python.org] > > > > OTOH there's something to say for fewer errors, not more; > > e.g. sometimes I wish AttributeError and TypeError were unified, > > because AttributeError usually means that an object isn't of the > > expected type. > > Hmm ... I was going to ask if there was any reason not to make > AttributeError a subclass of TypeError, but that would mean that code > like: > > try: > ... > except TypeError: > ... > > would also catch all AttributeErrors. > > Maybe we should have a __future__ directive and phase it in starting > in 2.4? > > I wouldn't suggest making AttributeError and TypeError be synonyms > though ... I think it is useful to distinguish the situations. > > I can't think of any case in *my* code where I would want to > distinguish between a TypeError and an AttributeError - usually I end > up having: > > try: > ... > except (TypeError, AttributeError): > ... More hmmm... 
Just grepped over my source tree (1293 .py files, ~ 300000 lines): - 45 occurrences of `except AttributeError` with no mention of `TypeError` - 16 occurrences of `except TypeError` with no mention of `AttributeError` - 3 occurrences of `except (AttributeError, TypeError)` Works well enough for me. Deriving both AttributeError and TypeError from a common base would make sense to me. Merging them wouldn't. PS: As that was my first post here, a short introduction. I'm a consultant using Python since early 1998. Since then the percentage of C/C++ use in my daily work steadily shrank. Nowadays, using C normally means generating C code from Python. -- Christian Tanzer tanzer@swing.co.at From mrussell@verio.net Tue May 13 10:52:43 2003 From: mrussell@verio.net (Mark Russell) Date: Tue, 13 May 2003 10:52:43 +0100 Subject: [Python-Dev] os.walk() silently ignores errors Message-ID: I've just noticed that os.walk() silently skips unreadable directories. I think this is surprising behaviour, which at least should be documented (there is a comment explaining this in source, but nothing in the doc string). Is it too late to add an optional callback argument to handle unreadable directories, so the caller could log them, raise an exception or whatever? I think the default behaviour should still be to silently ignore them, but it would be nice to have a way to override it. Mark Russell From Raymond Hettinger" Was there a reason that __slots__ makes initialized variables read-only?
It would be useful to have overridable default values (even if it entailed copying them into an instance's slots): class Pane(object): __slots__ = ('background', 'foreground', 'size', 'content') background = 'black' foreground = 'white' size = (80, 25) p = Pane() p.background = 'light blue' # override the default assert p.foreground == 'white' # other defaults still in-place Raymond Hettinger --------------------------- >>> class A(object): __slots__ = ('x',) x = 1 >>> class B(object): __slots__ = ('x',) >>> A().x = 2 Traceback (most recent call last): File "", line 1, in ? A().x = 2 AttributeError: 'A' object attribute 'x' is read-only >>> B().x = 2 From guido@python.org Tue May 13 14:51:33 2003 From: guido@python.org (Guido van Rossum) Date: Tue, 13 May 2003 09:51:33 -0400 Subject: [Python-Dev] os.walk() silently ignores errors In-Reply-To: Your message of "Tue, 13 May 2003 10:52:43 BST." References: Message-ID: <200305131351.h4DDpXG30768@odiug.zope.com> > I've just noticed that os.walk() silently skips unreadable > directories. I think this is surprising behaviour, which at least > should be documented (there is a comment explaining this is source, > but nothing in the doc string). Is it too late to add an optional > callback argument to handle unreadable directories, so the caller > could log them, raise an exception or whatever? I think the default > behaviour should still be to silently ignore them, but it would be > nice to have a way to override it. Ignoring is definitely the right thing to do by default, as otherwise the existence of a single unreadable directory would cause your entire walk to fail. What's your use case for wanting to do something else? --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Tue May 13 14:57:46 2003 From: guido@python.org (Guido van Rossum) Date: Tue, 13 May 2003 09:57:46 -0400 Subject: [Python-Dev] __slots__ and default values In-Reply-To: Your message of "Tue, 13 May 2003 09:23:49 EDT." 
<000601c31954$728e9500$32b02c81@oemcomputer> References: <000601c31954$728e9500$32b02c81@oemcomputer> Message-ID: <200305131357.h4DDvkJ31195@odiug.zope.com> > Was there a reason that __slots__ makes initialized > variables read-only? It would be useful to have > overridable default values (even if it entailed copying > them into an instance's slots): > > class Pane(object): > __slots__ = ('background', 'foreground', 'size', 'content') > background = 'black' > foreground = 'white' > size = (80, 25) > > p = Pane() > p.background = 'light blue' # override the default > assert p.foreground == 'white' # other defaults still in-place You can't do that. The class variable 'background' overrides the descriptor created by __slots__. background now appears read-only because there is no instance dict. --Guido van Rossum (home page: http://www.python.org/~guido/) From jacobs@penguin.theopalgroup.com Tue May 13 15:07:45 2003 From: jacobs@penguin.theopalgroup.com (Kevin Jacobs) Date: Tue, 13 May 2003 10:07:45 -0400 (EDT) Subject: [Python-Dev] __slots__ and default values In-Reply-To: <000601c31954$728e9500$32b02c81@oemcomputer> Message-ID: On Tue, 13 May 2003, Raymond Hettinger wrote: > Was there a reason that __slots__ makes initialized > variables read-only? It would be useful to have > overridable default values (even if it entailed copying > them into an instance's slots): > > class Pane(object): > __slots__ = ('background', 'foreground', 'size', 'content') > background = 'black' > foreground = 'white' > size = (80, 25) > > p = Pane() > p.background = 'light blue' # override the default > assert p.foreground == 'white' # other defaults still in-place Those attributes are read-only, because there is a name collision between the slot descriptors for 'background' and 'foreground', so the class favors the class variables. 
Thus, no slots are allocated for 'background' and 'foreground', so the instance, not having an instance dictionary, correctly reports that those attributes are indeed read-only. Also, slots are not automatically initialized from class variables, though one can easily write a metaclass to do so. (Actually, it is only easy to a first approximation; it is quite tricky to get 100% correct.) -Kevin -- -- Kevin Jacobs The OPAL Group - Enterprise Systems Architect Voice: (216) 986-0710 x 19 E-mail: jacobs@theopalgroup.com Fax: (216) 986-0714 WWW: http://www.theopalgroup.com From aahz@pythoncraft.com Tue May 13 15:17:19 2003 From: aahz@pythoncraft.com (Aahz) Date: Tue, 13 May 2003 10:17:19 -0400 Subject: [Python-Dev] __slots__ and default values In-Reply-To: <000601c31954$728e9500$32b02c81@oemcomputer> References: <000601c31954$728e9500$32b02c81@oemcomputer> Message-ID: <20030513141719.GA12321@panix.com> On Tue, May 13, 2003, Raymond Hettinger wrote: > > Was there a reason that __slots__ makes initialized variables > read-only? It would be useful to have overridable default values > (even if it entailed copying them into an instance's slots): > > class Pane(object): > __slots__ = ('background', 'foreground', 'size', 'content') > background = 'black' > foreground = 'white' > size = (80, 25) > > p = Pane() > p.background = 'light blue' # override the default > assert p.foreground == 'white' # other defaults still in-place Why not do the initializing in __init__? -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "In many ways, it's a dull language, borrowing solid old concepts from many other languages & styles: boring syntax, unsurprising semantics, few automatic coercions, etc etc. But that's one of the things I like about it."
--Tim Peters on Python, 16 Sep 93 From jeremy@zope.com Tue May 13 15:40:39 2003 From: jeremy@zope.com (Jeremy Hylton) Date: 13 May 2003 10:40:39 -0400 Subject: [Python-Dev] Need some patches checked In-Reply-To: <200305120020.h4C0KIm29011@pcp02138704pcs.reston01.va.comcast.net> References: <3EBEDDE0.3040308@ocf.berkeley.edu> <200305120020.h4C0KIm29011@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <1052836839.973.6.camel@slothrop.zope.com> On Sun, 2003-05-11 at 20:20, Guido van Rossum wrote: > > Since I am trying to tackle patches that were not written by me for the > > first time I need someone to check that I am doing the right thing. There are a bunch of open bugs and patches for urllib2. I've cleaned up a few things lately. We might make a concerted effort to close them all for 2.3b2. Whole-scale refactoring can be more effective than a large set of small fixes. Jeremy From harri.pasanen@trema.com Tue May 13 16:06:27 2003 From: harri.pasanen@trema.com (Harri Pasanen) Date: Tue, 13 May 2003 17:06:27 +0200 Subject: [Python-Dev] os.walk() silently ignores errors In-Reply-To: <200305131351.h4DDpXG30768@odiug.zope.com> References: <200305131351.h4DDpXG30768@odiug.zope.com> Message-ID: <200305131706.27202.harri.pasanen@trema.com> On Tuesday 13 May 2003 15:51, Guido van Rossum wrote: > > I've just noticed that os.walk() silently skips unreadable > > directories. I think this is surprising behaviour, which at > > least should be documented (there is a comment explaining this in > > the source, but nothing in the doc string). Is it too late to add an > > optional callback argument to handle unreadable directories, so > > the caller could log them, raise an exception or whatever? I > > think the default behaviour should still be to silently ignore > > them, but it would be nice to have a way to override it. > > Ignoring is definitely the right thing to do by default, as > otherwise the existence of a single unreadable directory would > cause your entire walk to fail.
What's your use case for wanting > to do something else? Sometimes I'm looking for something in files in a directory tree, forgetting I don't have access permissions to a particular subdirectory by default. So the search can silently fail, and I'm left with the wrong idea that what I was looking for is not there. Ideally, I'd like the possibility to have my script remind me to login as root prior to running it. I know I could do some defensive programming in the walker function to get around this, but this would likely imply more stat calls and impact performance. I've been bitten by this a couple of times, so I thought I'd pipe in. -Harri From duncan@rcp.co.uk Tue May 13 16:20:30 2003 From: duncan@rcp.co.uk (Duncan Booth) Date: Tue, 13 May 2003 16:20:30 +0100 Subject: [Python-Dev] __slots__ and default values References: <000601c31954$728e9500$32b02c81@oemcomputer> <20030513141719.GA12321@panix.com> Message-ID: Aahz wrote in news:20030513141719.GA12321@panix.com: > On Tue, May 13, 2003, Raymond Hettinger wrote: >> >> Was there a reason that __slots__ makes initialized variables >> read-only? It would be useful to have overridable default values >> (even if it entailed copying them into an instance's slots): >> >> class Pane(object): >> __slots__ = ('background', 'foreground', 'size', 'content') >> background = 'black' >> foreground = 'white' >> size = (80, 25) >> >> p = Pane() >> p.background = 'light blue' # override the default >> assert p.foreground == 'white' # other defaults still in-place > > Why not do the initializing in __init__?
The following works, but I can't remember whether you're supposed to be able to use a dict in __slots__ or if it just happens to be allowed: >>> class Pane(object): __slots__ = { 'background': 'black', 'foreground': 'white', 'size': (80, 25) } def __init__(self): for k, v in self.__slots__.iteritems(): setattr(self, k, v) >>> p = Pane() >>> p.background = 'blue' >>> p.background, p.foreground ('blue', 'white') >>> -- Duncan Booth duncan@rcp.co.uk int month(char *p){return(124864/((p[0]+p[1]-p[2]&0x1f)+1)%12)["\5\x8\3" "\6\7\xb\1\x9\xa\2\0\4"];} // Who said my code was obscure? From mcherm@mcherm.com Tue May 13 16:25:18 2003 From: mcherm@mcherm.com (Michael Chermside) Date: Tue, 13 May 2003 08:25:18 -0700 Subject: [Python-Dev] Re: __slots__ and default values Message-ID: <1052839518.3ec10e5e3f7c0@mcherm.com> Raymond Hettinger wrote: > class Pane(object): > __slots__ = ('background', 'foreground', 'size', 'content') > background = 'black' > foreground = 'white' > size = (80, 25) ...which doesn't work since the class variable overwrites the __slots__ descriptor. Aahz replies: > Why not do the initializing in __init__? I presume that Raymond's concern was not that there wouldn't be a way to do initialization, but that this would become a new c.l.p FAQ and point of confusion for newbies. Unfortunately, I fear that it will. Already I am seeing that people are "discovering" class variables as a sort of "initialized instance variable" instead of using __init__ as they "ought" to. Of course, it's NOT an initialized instance variable, but newbies stumble across it and seem to prefer it to using __init__. Combine this with the fact that newbies from statically typed languages tend to think of __slots__ as "practically mandatory" (because it prevents the use of instance variables not pre-declared, which they erroneously think is a good thing) rather than the special-purpose performance hack that it REALLY is, and you have a recipe for trouble.
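For reference, the __init__ route Aahz suggests sidesteps the descriptor collision entirely; a minimal sketch (mine, not from the thread):

```python
class Pane(object):
    # Only __slots__ at class level; the defaults live in __init__,
    # so nothing shadows the slot descriptors the way class-level
    # default values do.
    __slots__ = ('background', 'foreground', 'size', 'content')

    def __init__(self):
        self.background = 'black'
        self.foreground = 'white'
        self.size = (80, 25)
        self.content = None

p = Pane()
p.background = 'light blue'      # per-instance override works
assert p.foreground == 'white'   # other defaults still in place
```

The cost is naming each attribute once more in __init__; the benefit is that instances stay dict-less and every slot is writable.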
I'm not quite sure how to present things so as to steer them right, but there's definitely a potential pitfall here. -- Michael Chermside From aleax@aleax.it Tue May 13 16:29:54 2003 From: aleax@aleax.it (Alex Martelli) Date: Tue, 13 May 2003 17:29:54 +0200 Subject: [Python-Dev] os.walk() silently ignores errors In-Reply-To: <200305131706.27202.harri.pasanen@trema.com> References: <200305131351.h4DDpXG30768@odiug.zope.com> <200305131706.27202.harri.pasanen@trema.com> Message-ID: <200305131729.54759.aleax@aleax.it> On Tuesday 13 May 2003 05:06 pm, Harri Pasanen wrote: ... > > Ignoring is definitely the right thing to do by default, as > > otherwise the existence of a single unreadable directory would > > cause your entire walk to fail. What's your use case for wanting > > to do something else? > > Sometimes I'm looking for something in files in a directory tree, > forgetting I don't have access permissions to a particular > subdirectory by default. So the search can silently fail, and I'm > left with the wrong idea that what I was looking for is not there. > > Ideally, I'd like the possibility to have my script remind me to login as > root prior to running it. Seconded! The default of ignoring errors is just fine, but it WOULD be nice to optionally get a callback on errors so as to be able to raise warnings or exceptions. "Errors should never pass silently unless explicitly silenced" would argue for stronger diagnostic behavior, but compatibility surely constrains the default behavior -- BUT, an easy way to get non-silent behavior would be something I'd end up using, roughly, in 50% of my tree-walking scripts. Alex From mrussell@verio.net Tue May 13 16:40:07 2003 From: mrussell@verio.net (Mark Russell) Date: Tue, 13 May 2003 16:40:07 +0100 Subject: [Python-Dev] os.walk() silently ignores errors In-Reply-To: Your message of "Tue, 13 May 2003 09:59:01 EDT."
<20030513135901.5867.87468.Mailman@mail.python.org> Message-ID: >Ignoring is definitely the right thing to do by default, as otherwise >the existence of a single unreadable directory would cause your entire >walk to fail. What's your use case for wanting to do something else? I was using os.walk() to copy a directory tree with some modifications - I assumed that as no exceptions had been raised the tree had been copied successfully. It was only when I diffed the original and copy trees that I found some directories had been skipped because they were unreadable. Had I not checked I would have silently lost data - not behaviour I expect from a python script :-) Mark From guido@python.org Tue May 13 16:40:53 2003 From: guido@python.org (Guido van Rossum) Date: Tue, 13 May 2003 11:40:53 -0400 Subject: [Python-Dev] os.path.walk() lacks 'depth first' option In-Reply-To: Your message of "Mon, 12 May 2003 23:02:24 EDT." <5.1.0.14.0.20030512224621.03ef8500@mail.telecommunity.com> References: <"Your message of Mon, 12 May 2003 20:34:21 EDT." <5.1.1.6.0.20030512202835.03078bb0@telecommunity.com> <5.1.1.6.0.20030512202835.03078bb0@telecommunity.com> <5.1.0.14.0.20030512224621.03ef8500@mail.telecommunity.com> Message-ID: <200305131540.h4DFerS05699@odiug.zope.com> How about this patch? Index: os.py =================================================================== RCS file: /cvsroot/python/python/dist/src/Lib/os.py,v retrieving revision 1.70 diff -c -c -r1.70 os.py *** os.py 25 Apr 2003 07:11:48 -0000 1.70 --- os.py 13 May 2003 15:40:21 -0000 *************** *** 203,209 **** __all__.extend(["makedirs", "removedirs", "renames"]) ! def walk(top, topdown=True): """Directory tree generator. For each directory in the directory tree rooted at top (including top --- 203,209 ---- __all__.extend(["makedirs", "removedirs", "renames"]) ! def walk(top, topdown=True, onerror=None): """Directory tree generator. 
For each directory in the directory tree rooted at top (including top *************** *** 232,237 **** --- 232,243 ---- dirnames have already been generated by the time dirnames itself is generated. + By default errors from the os.listdir() call are ignored. If + optional arg 'onerror' is specified, it should be a function; + it will be called with one argument, an exception instance. It + can report the error to continue with the walk, or raise the + exception to abort the walk. + Caution: if you pass a relative pathname for top, don't change the current working directory between resumptions of walk. walk never changes the current directory, and assumes that the client doesn't *************** *** 259,265 **** # Note that listdir and error are globals in this module due # to earlier import-*. names = listdir(top) ! except error: return dirs, nondirs = [], [] --- 265,273 ---- # Note that listdir and error are globals in this module due # to earlier import-*. names = listdir(top) ! except error, err: ! if onerror is not None: ! onerror(err) return dirs, nondirs = [], [] *************** *** 274,280 **** for name in dirs: path = join(top, name) if not islink(path): ! for x in walk(path, topdown): yield x if not topdown: yield top, dirs, nondirs --- 282,288 ---- for name in dirs: path = join(top, name) if not islink(path): ! 
for x in walk(path, topdown, onerror): yield x if not topdown: yield top, dirs, nondirs --Guido van Rossum (home page: http://www.python.org/~guido/) From theller@python.net Tue May 13 16:45:34 2003 From: theller@python.net (Thomas Heller) Date: 13 May 2003 17:45:34 +0200 Subject: [Python-Dev] Re: __slots__ and default values In-Reply-To: <1052839518.3ec10e5e3f7c0@mcherm.com> References: <1052839518.3ec10e5e3f7c0@mcherm.com> Message-ID: Michael Chermside writes: > Combine this with the fact that newbies from statically typed > languages tend to think of __slots__ as "practically mandatory" > (because it prevents the use of instance variables not pre-declared, > which they erroneously think is a good thing) rather than the > special-purpose performance hack that it REALLY is, and you have > a recipe for trouble. Unrelated to *this* topic, but Andrew's "What's new in Python 2.2" still presents __slots__ as a way to constrain the instance variables: A new-style class can define a class attribute named __slots__ to constrain the list of legal attribute names. http://www.python.org/doc/current/whatsnew/sect-rellinks.html#SECTION000340000000000000000 This should probably be fixed. Thomas From walter@livinglogic.de Tue May 13 17:14:51 2003 From: walter@livinglogic.de (Walter Dörwald) Date: Tue, 13 May 2003 18:14:51 +0200 Subject: [Python-Dev] os.path.walk() lacks 'depth first' option In-Reply-To: <200305131540.h4DFerS05699@odiug.zope.com> References: <"Your message of Mon, 12 May 2003 20:34:21 EDT." <5.1.1.6.0.20030512202835.03078bb0@telecommunity.com> <5.1.1.6.0.20030512202835.03078bb0@telecommunity.com> <5.1.0.14.0.20030512224621.03ef8500@mail.telecommunity.com> <200305131540.h4DFerS05699@odiug.zope.com> Message-ID: <3EC119FB.5000302@livinglogic.de> Guido van Rossum wrote: > How about this patch? I like the increased flexibility. But how about the following version?
--- def walk(top, order=".d", recursive=True, onerror=None): from os.path import join, isdir, islink, normpath try: names = listdir(top) except error, err: if onerror is not None: onerror(err) return dirs, nondirs = [], [] for name in names: if isdir(join(top, name)): dirs.append(name) else: nondirs.append(name) for c in order: if c==".": yield top, dirs, nondirs elif c=="f": for nd in nondirs: yield normpath(join(top, nd)), [], [] elif c=="d": for name in dirs: path = join(top, name) if not islink(path): if recursive: for x in walk(path, order, recursive, onerror): yield (normpath(x[0]), x[1], x[2]) else: yield path else: raise ValueError, "unknown order %r" % c --- It combines recursive and non-recursive walks, topdown and bottomup walks, walks with and without files or directories. E.g. getting a list of all files, topdown: [x[0] for x in os.walk(top, order="fd")] or a list of directories bottom up: [x[0] for x in os.walk(top, order="d.")] or a list of files and directories, topdown, with files before subdirectories: [x[0] for x in os.walk(top, order=".fd")] Bye, Walter Dörwald From walter@livinglogic.de Tue May 13 17:36:18 2003 From: walter@livinglogic.de (Walter Dörwald) Date: Tue, 13 May 2003 18:36:18 +0200 Subject: [Python-Dev] os.path.walk() lacks 'depth first' option In-Reply-To: <200305131620.h4DGKMi08267@odiug.zope.com> References: <"Your message of Mon, 12 May 2003 20:34:21 EDT." <5.1.1.6.0.20030512202835.03078bb0@telecommunity.com> <5.1.1.6.0.20030512202835.03078bb0@telecommunity.com> <5.1.0.14.0.20030512224621.03ef8500@mail.telecommunity.com> <200305131540.h4DFerS05699@odiug.zope.com> <3EC119FB.5000302@livinglogic.de> <200305131620.h4DGKMi08267@odiug.zope.com> Message-ID: <3EC11F02.2060301@livinglogic.de> Guido van Rossum wrote: >> I like the increased flexibility. But how about the following >> version? > > > I don't think there's any need for such increased flexibility. Let's > stop while we're ahead.
Fixing the silent errors case is important > (see various posts here). Your generalization is a YAGNI though. True, getting a list of files in the current directory even works with the current os.walk: sum([[os.path.join(x[0], f) for f in x[2]] for x in os.walk(".")], []) Bye, Walter Dörwald From tim@zope.com Tue May 13 18:19:48 2003 From: tim@zope.com (Tim Peters) Date: Tue, 13 May 2003 13:19:48 -0400 Subject: [Python-Dev] os.path.walk() lacks 'depth first' option In-Reply-To: <3EC11F02.2060301@livinglogic.de> Message-ID: [Walter Dörwald] > True, getting a list of files in the current directory even works > with the current os.walk: > > sum([[os.path.join(x[0], f) for f in x[2]] for x in os.walk(".")], []) Convoluted one-liners are poor Python style, IMO. That walks the entire tree, too. If you want the files in just the current directory, for root, dirs, files in os.walk('.'): break print files or if clarity is disturbing: files = os.walk('.').next()[-1] From paul@pfdubois.com Tue May 13 18:25:38 2003 From: paul@pfdubois.com (Paul Dubois) Date: Tue, 13 May 2003 10:25:38 -0700 Subject: [Python-Dev] Inplace multiply Message-ID: <000401c31974$ac31c730$6801a8c0@NICKLEBY> My "masked array" class MA has a problem that I don't know how to solve. The inplace multiply function def __imul__ (self, other) is not getting called while my other input operations do work. The scenario is x = MA.array(...) x *= c If c is an int, this works correctly, calling MA.__imul__. Otherwise, I get a message from the Python runtime saying it can't multiply a sequence by a non-int. But change MA to Numeric, it works. Numeric is an extension type and MA is a (new style) class. MA defines __len__ as well as all the math operators. From guido@python.org Tue May 13 18:44:37 2003 From: guido@python.org (Guido van Rossum) Date: Tue, 13 May 2003 13:44:37 -0400 Subject: [Python-Dev] Inplace multiply In-Reply-To: Your message of "Tue, 13 May 2003 10:25:38 PDT."
<000401c31974$ac31c730$6801a8c0@NICKLEBY> References: <000401c31974$ac31c730$6801a8c0@NICKLEBY> Message-ID: <200305131744.h4DHibn13549@odiug.zope.com> > My "masked array" class MA has a problem that I don't know how to solve. The > inplace multiply function > > def __imul__ (self, other) > > is not getting called while my other input operations do work. The scenario > is > > x = MA.array(...) > > x *= c > > If c is an int, this works correctly, calling MA.__imul__. Otherwise, I get > a message from the Python runtime saying it can't multiply a sequence by a > non-int. But change MA to Numeric, it works. > > Numeric is an extension type and MA is a (new style) class. MA defines > __len__ as well as all the math operators. We won't be able to help without seeing your code. --Guido van Rossum (home page: http://www.python.org/~guido/) From jepler@unpythonic.net Tue May 13 19:31:45 2003 From: jepler@unpythonic.net (Jeff Epler) Date: Tue, 13 May 2003 13:31:45 -0500 Subject: [Python-Dev] Inplace multiply In-Reply-To: <000401c31974$ac31c730$6801a8c0@NICKLEBY> References: <000401c31974$ac31c730$6801a8c0@NICKLEBY> Message-ID: <20030513183144.GH11289@unpythonic.net> There must be something more to your problem than what you described. 
The following executes just fine for me (ditto if NewKoke is a subclass of object instead of list, and no matter whether I define __getitem__ or not [a guess based on your remark about 'multiply a sequence']): $ python dubois.py sausages vegetable-style breakfast patty sausages vegetable-style breakfast patty class Klassic: def __imul__(self, other): return "sausages" def __getitem__(self, i): return None class NewKoke(list): def __imul__(self, other): return "vegetable-style breakfast patty" def __getitem__(self, i): return None k = Klassic() o = NewKoke() k *= 1 o *= 1 print k, o k = Klassic() o = NewKoke() k *= "spam" o *= "spam" print k, o From tim.one@comcast.net Tue May 13 20:03:58 2003 From: tim.one@comcast.net (Tim Peters) Date: Tue, 13 May 2003 15:03:58 -0400 Subject: [Python-Dev] os.path.walk() lacks 'depth first' option In-Reply-To: Message-ID: [Christian Tanzer] > More hmmm... > > Just grepped over my source tree (1293 .py files, ~ 300000 lines): > > - 45 occurrences of `except AttributeError` with no mention of > `TypeError` > > - 16 occurrences of `except TypeError` with no mention of > `AttributeError` > > - 3 occurrences of `except (AttributeError, TypeError)` > > Works well enough for me. With a fixed release of Python, it would be hard not to work well enough. I have to point out, though, that *across* Python releases, a frequent kind of patch made to the Python test suite is changing former TypeError occurrences to AttributeError, or vice versa. I'm not sure which direction is most common overall, and it's often unclear which is more appropriate. For example, >>> d = {} >>> d.update('abc') Traceback (most recent call last): File "", line 1, in ? AttributeError: keys >>> I wouldn't be surprised if that changed to TypeError someday. > Deriving both AttributeError and TypeError from a common base would > make sense to me. Merging them wouldn't. Yes -- and we should derive all exceptions from LookupError . 
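The blurriness Tim describes is easy to reproduce: the same duck-typed call can surface either exception depending on how the wrong object trips up. A small illustration (modern Python, not from the thread):

```python
def shout(obj):
    # Duck-typed: anything with a usable .lower() works.  A wrong
    # argument fails with AttributeError (no such attribute) or
    # TypeError (attribute exists but the call is malformed), which
    # is why defensive callers often catch both, as Delaney does.
    try:
        return obj.lower() + '!'
    except (TypeError, AttributeError):
        return None

shout('ABC')   # 'abc!'
shout(42)      # None -- int has no .lower, so AttributeError
shout(str)     # None -- str.lower() missing its argument, so TypeError
```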
From tjreedy@udel.edu Tue May 13 20:12:50 2003 From: tjreedy@udel.edu (Terry Reedy) Date: Tue, 13 May 2003 15:12:50 -0400 Subject: [Python-Dev] Re: Inplace multiply References: <000401c31974$ac31c730$6801a8c0@NICKLEBY> <20030513183144.GH11289@unpythonic.net> Message-ID: "Jeff Epler" wrote in message news:20030513183144.GH11289@unpythonic.net... > There must be something more to your problem than what you described. > > The following executes just fine for me (ditto if NewKoke is a subclass > of object instead of list, and no matter whether I define __getitem__ or > not [a guess based on your remark about 'multiply a sequence']): > > $ python dubois.py > sausages vegetable-style breakfast patty > sausages vegetable-style breakfast patty On Win98 2.2.1, cut and paste into interactive window outputs sausages vegetable-style breakfast patty sausages [] > class Klassic: > def __imul__(self, other): > return "sausages" > def __getitem__(self, i): return None > > class NewKoke(list): > def __imul__(self, other): > return "vegetable-style breakfast patty" > def __getitem__(self, i): return None > > k = Klassic() > o = NewKoke() > > k *= 1 > o *= 1 > > print k, o > > k = Klassic() > o = NewKoke() > > k *= "spam" > o *= "spam" Because line above gives TypeError: can't multiply sequence to non-int > print k, o Maybe something has been 'fixed' since then. Terry J. Reedy From drifty@alum.berkeley.edu Tue May 13 20:44:48 2003 From: drifty@alum.berkeley.edu (Brett C.) Date: Tue, 13 May 2003 12:44:48 -0700 Subject: [Python-Dev] Need some patches checked In-Reply-To: <1052836839.973.6.camel@slothrop.zope.com> References: <3EBEDDE0.3040308@ocf.berkeley.edu> <200305120020.h4C0KIm29011@pcp02138704pcs.reston01.va.comcast.net> <1052836839.973.6.camel@slothrop.zope.com> Message-ID: <3EC14B30.1000102@ocf.berkeley.edu> Jeremy Hylton wrote: > There are a bunch of open bugs and patches for urllib2. I've cleaned up > a few things lately. 
We might make a concerted effort to close them all > for 2.3b2. Whole-scale refactoring can be more effective than a large > set of small fixes. > Aw, but Jeremy, I wanted to start working on the AST branch after I finished going through the open bugs and patches one time! =) We can, and I am willing to help, but I suspect it would be best to first get the test suite rewritten; it is severely lacking. Which reminds me, I need to finish writing urllib's tests by doing the network-requiring ones. -Brett From marc@informatik.uni-bremen.de Tue May 13 20:58:45 2003 From: marc@informatik.uni-bremen.de (Marc Recht) Date: Tue, 13 May 2003 21:58:45 +0200 Subject: [Python-Dev] Python 2.3b1 _XOPEN_SOURCE value from configure.in In-Reply-To: References: <200305121813.36383.harri.pasanen@trema.com> Message-ID: <8560000.1052855925@leeloo.intern.geht.de> > Of course, it would be sufficient to set it to a smaller value on > systems that support only older X/Open issues; I think I'd accept a > patch that changes this (if the patch is correct, of course). Defining __EXTENSIONS__ could also help. IIRC it works just like _GNU_SOURCE/_NETBSD_SOURCE. Regards, Marc mundus es fabula From mrussell@verio.net Tue May 13 21:08:25 2003 From: mrussell@verio.net (Mark Russell) Date: Tue, 13 May 2003 21:08:25 +0100 Subject: [Python-Dev] os.path.walk() lacks 'depth first' option In-Reply-To: Your message of "Tue, 13 May 2003 12:00:08 EDT."
<20030513160008.1261.22301.Mailman@mail.python.org> Message-ID: >How about this patch? Thanks - I tried that with my script and it worked nicely. Hopefully this is as complex as os.walk() needs to get. Mark From jepler@unpythonic.net Tue May 13 21:15:50 2003 From: jepler@unpythonic.net (Jeff Epler) Date: Tue, 13 May 2003 15:15:50 -0500 Subject: [Python-Dev] Re: Inplace multiply In-Reply-To: References: <000401c31974$ac31c730$6801a8c0@NICKLEBY> <20030513183144.GH11289@unpythonic.net> Message-ID: <20030513201549.GJ11289@unpythonic.net> On Tue, May 13, 2003 at 03:12:50PM -0400, Terry Reedy wrote: > On Win98 2.2.1, cut and paste into interactive window outputs > TypeError: can't multiply sequence to non-int > > > print k, o > > Maybe something has been 'fixed' since then. using RedHat9's "2.2.2-26" here. Jeff From martin@v.loewis.de Tue May 13 21:57:22 2003 From: martin@v.loewis.de (Martin v. Löwis) Date: 13 May 2003 22:57:22 +0200 Subject: [Python-Dev] Python 2.3b1 _XOPEN_SOURCE value from configure.in In-Reply-To: <8560000.1052855925@leeloo.intern.geht.de> References: <200305121813.36383.harri.pasanen@trema.com> <8560000.1052855925@leeloo.intern.geht.de> Message-ID: Marc Recht writes: > Defining __EXTENSIONS__ could also help. IIRC it works just like > _GNU_SOURCE/_NETBSD_SOURCE. We do define __EXTENSIONS__, this is not the issue. We define _XOPEN_SOURCE to 600 on all systems, because that is the highest value specified by any X/Open spec today. The system may not support all of the latest Posix features; this is not a problem because we autoconfiscate them. The problem really only occurs if somebody thinks they need to define _XOPEN_SOURCE to some other value; the compiler will then complain.
Regards, Martin From tdelaney@avaya.com Tue May 13 22:32:49 2003 From: tdelaney@avaya.com (Delaney, Timothy C (Timothy)) Date: Wed, 14 May 2003 07:32:49 +1000 Subject: [Python-Dev] __slots__ and default values Message-ID: <338366A6D2E2CA4C9DAEAE652E12A1DE57F30C@au3010avexu1.global.avaya.com> > From: Raymond Hettinger [mailto:raymond.hettinger@verizon.net] > > class Pane(object): > __slots__ = ('background', 'foreground', 'size', 'content') > background = 'black' > foreground = 'white' > size = (80, 25) > > p = Pane() > p.background = 'light blue' # override the default > assert p.foreground == 'white' # other defaults still in-place Wow - I hadn't realised that. I would prefer to think of this as a useful feature rather than a wart. Finally we can have true constants! class Pane (module): # Or whatever - you get the idea ;) __slots__ = ('background', 'foreground', 'size', 'content', '__name__', '__file__') __name__ = globals()['__name__'] __file__ = globals()['__file__'] background = 'black' foreground = 'white' size = (80, 25) import sys sys.modules[__name__] = Pane() OK - so you could get around it by getting the class of the "module" and then modifying that ... but it's the best yet. It even tells you that the attribute is read-only! Tim Delaney From guido@python.org Tue May 13 22:38:10 2003 From: guido@python.org (Guido van Rossum) Date: Tue, 13 May 2003 17:38:10 -0400 Subject: [Python-Dev] __slots__ and default values In-Reply-To: Your message of "Wed, 14 May 2003 07:32:49 +1000." <338366A6D2E2CA4C9DAEAE652E12A1DE57F30C@au3010avexu1.global.avaya.com> References: <338366A6D2E2CA4C9DAEAE652E12A1DE57F30C@au3010avexu1.global.avaya.com> Message-ID: <200305132138.h4DLcA930451@odiug.zope.com> > I would prefer to think of this as a useful feature rather than a > wart. Finally we can have true constants! Yuck. If you want that, define a property-like class that doesn't allow setting.
These "constants" of yours are easily subverted by defining a subclass which adds an instance __dict__ (any subclass that doesn't define __slots__ of its own does this). --Guido van Rossum (home page: http://www.python.org/~guido/) From tdelaney@avaya.com Tue May 13 22:40:10 2003 From: tdelaney@avaya.com (Delaney, Timothy C (Timothy)) Date: Wed, 14 May 2003 07:40:10 +1000 Subject: [Python-Dev] __slots__ and default values Message-ID: <338366A6D2E2CA4C9DAEAE652E12A1DE57F30D@au3010avexu1.global.avaya.com> > From: Delaney, Timothy C (Timothy) > > class Pane (module): # Or whatever - you get the idea ;) > > __slots__ = ('background', 'foreground', 'size', > 'content', '__name__', '__file__') > __name__ = globals()['__name__'] > __file__ = globals()['__file__'] > > background = 'black' > foreground = 'white' > size = (80, 25) > > import sys > sys.modules[__name__] = Pane() Hmm ... I was just wondering if we could use this technique to gain the advantages of fast lookup in other modules automatically (but optionally). The idea is, the last line of your module includes a call which transforms your module into a module subclass instance with slots. The functions in the module become methods which access the *class* instance variables (so they can modify them) but other modules can't. A fair bit of work, and probably not worthwhile, but it's interesting to think about ;) Tim Delaney From tdelaney@avaya.com Tue May 13 22:42:51 2003 From: tdelaney@avaya.com (Delaney, Timothy C (Timothy)) Date: Wed, 14 May 2003 07:42:51 +1000 Subject: [Python-Dev] __slots__ and default values Message-ID: <338366A6D2E2CA4C9DAEAE652E12A1DE57F30E@au3010avexu1.global.avaya.com> > From: Guido van Rossum [mailto:guido@python.org] > > > I would prefer to think of this as a useful feature rather than a > > wart. Finally we can have true constants! > > Yuck. If you want that, define a property-like class that doesn't > allow setting.
> > These "constants" of yours are easily subverted by defining a subclass > which adds an instance __dict__ (any subclass that doesn't define > __slots__ of its own does this). Sorry - missing smiley there ;) But see my other post for a potentially useful side-effect of this. Not something I think should be done, but fun to think about. The idea is that you shouldn't be able to create a subclass without some really nasty work, as the only way to get it is to determine what module the module subclass is defined in, then grab the class out of that. But it's a horrible hack and I should never have suggested it ;) Tim Delaney From lkcl@samba-tng.org Tue May 13 23:57:40 2003 From: lkcl@samba-tng.org (Luke Kenneth Casson Leighton) Date: Tue, 13 May 2003 22:57:40 +0000 Subject: [Python-Dev] sf.net/708007: expectlib.py telnetlib.py split Message-ID: <20030513225740.GH2305@localhost> [i am not on the python-dev list but i check the archives, please cc me] approximately two years ago i needed the functionality outlined in the present telnetlib.py for several other remote protocols, most notably commands (including ssh and bash) and also HTTP. i figure that this functionality should be more than invaluable to other python developers. for example even the existing python libraries such as the ftp client ftplib.py, which makes extensive use of regular expressions to parse commands, could possibly benefit from rewrites using the "new" expectlib.py. also i believe it's the sort of thing that the twisted crowd should already have invented, and they're mad if they haven't already got something similar. this message is therefore just a polite ping to the regular python developers that the above referenced patch appears not to have yet been looked at or assigned to anybody.
that having been said (perhaps with unintended implications of criticism that the regular python developers are slackers, which i most certainly am NOT saying!): the expectlib.py / telnetlib.py split is not exactly a top priority - just the sort of thing that one python fanatic would classify as "nice to have". l. -- -- expecting email to be received and understood is a bit like picking up the telephone and immediately dialing without checking for a dial-tone; speaking immediately without listening for either an answer or ring-tone; hanging up immediately and then expecting someone to call you (and to be able to call you). -- every day, people send out email expecting it to be received without being tampered with, read by other people, delayed or simply - without prejudice but lots of incompetence - destroyed. -- please therefore treat email more like you would a CB radio to communicate across the world (via relaying stations): ask and expect people to confirm receipt; send nothing that you don't mind everyone in the world knowing about... From python@rcn.com Wed May 14 02:12:03 2003 From: python@rcn.com (Raymond Hettinger) Date: Tue, 13 May 2003 21:12:03 -0400 Subject: [Python-Dev] Re: __slots__ and default values References: <1052839518.3ec10e5e3f7c0@mcherm.com> Message-ID: <002301c319b5$d44b8580$febd958d@oemcomputer> > Aahz replies: > > Why not do the initializing in __init__? > > Michael: > I presume that Raymond's concern was not that there wouldn't be > a way to do initialization, but that this would become a new c.l.p > FAQ and point of confusion for newbies. Unfortunately, I fear > that it will. Yes. Since it "works" with classic classes and unslotted newstyle classes, it isn't terribly unreasonable to believe it would work with __slots__. Further, there is no reason it couldn't work (either through an autoinit upon instance creation or through a default entry that references the class variable). 
> Michael: > Already I am seeing that people are "discovering" > class variables as a sort of "initialized instance variable" > instead of using __init__ as they "ought" to. Of course, it's NOT > an initialized instance variable, but newbies stumble across it > and seem to prefer it to using __init__. Perhaps I skipped school the day they taught that was bad, but it seems perfectly reasonable to me and I'm sure it is a common practice. I even find it to be clearer and more maintainable than using __init__. The only downside I see is that self.classvar += 1 reads from and writes to a different place. So, a reworded version of my question is "why not?". What is the downside of providing behavior that is similar to non-slotted classes? What is gained by this blocking of an assignment and reporting a read-only error? When I find an existing class can be made lighter by using __slots__, it would be nice to transform it with a single line. From: class Tree(object): left = None right = None def __init__(self, value): self.value = value adding only one line: __slots__ = ('left', 'right', 'value') It would be a bummer to also have to move the left/right = None into __init__ and transform them into self.left = self.right = None. > Duncan Booth: > The following works, but I can't remember whether you're supposed to be > able to use a dict in __slots__ or if it just happens to be allowed: > > >>> class Pane(object): > __slots__ = { 'background': 'black', 'foreground': 'white', > 'size': (80, 25) } > def __init__(self): > for k, v in self.__slots__.iteritems(): > setattr(self, k, v) __slots__ accepts any iterable. So, yes, you're allowed even though that particular use was not intended. There are several possible workarounds including metaclasses. My question is why there needs to be a workaround at all.
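Duncan's pattern still works in modern Python (items() rather than the old iteritems()); since __slots__ only needs an iterable of names, the defaults can live in a plain dict and be copied into the slots at instance creation. A minimal sketch, with names taken from the example above:

```python
class Pane(object):
    _defaults = {'background': 'black', 'foreground': 'white',
                 'size': (80, 25)}
    __slots__ = tuple(_defaults)        # any iterable of names will do

    def __init__(self):
        # Copy each default into its slot at instance-creation time.
        for name, default in Pane._defaults.items():
            setattr(self, name, default)

p = Pane()
p.background = 'light blue'             # slots stay writable, unlike the
                                        # class-variable-shadowing trick
```

Instances still have no per-instance __dict__, so the memory savings of __slots__ are kept while every attribute gets a per-instance default.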
> Thomas Heller: > Unrelated to *this* topic, but Andrew's "What's new in Python 2.2" still > presents __slots__ as a way to constrain the instance variables: > > A new-style class can define a class attribute named __slots__ to > constrain the list of legal attribute names. Though I think of __slots__ as a way to make lighter weight instances, constraining instance variables is also one of its functions. Raymond Hettinger From guido@python.org Wed May 14 02:38:27 2003 From: guido@python.org (Guido van Rossum) Date: Tue, 13 May 2003 21:38:27 -0400 Subject: [Python-Dev] Re: __slots__ and default values In-Reply-To: "Your message of Tue, 13 May 2003 21:12:03 EDT." <002301c319b5$d44b8580$febd958d@oemcomputer> References: <1052839518.3ec10e5e3f7c0@mcherm.com> <002301c319b5$d44b8580$febd958d@oemcomputer> Message-ID: <200305140138.h4E1cRx02045@pcp02138704pcs.reston01.va.comcast.net> > Yes. Since it "works" with classic classes and unslotted newstyle > classes, it isn't terribly unreasonable to believe it would work > with __slots__. Further, there is no reason it couldn't work (either > through an autoinit upon instance creation or through a default > entry that references the class variable). Really? The metaclass would have to copy the initializers to a safekeeping place, because they compete with the slot descriptors. Don't forget that when you write __slots__ = ['a', 'b', 'c'], descriptors named a, b and c are inserted into the class dict by the metaclass. And then the metaclass would have to add a hidden initializer that initializes the slot. Very messy... > > Michael: > > Already I am seeing that people are "discovering" > > class variables as a sort of "initialized instance variable" > > instead of using __init__ as they "ought" to. Of course, it's NOT > > an initialized instance variable, but newbies stumble across it > > and seem to prefer it to using __init__. 
> > Perhaps I skipped school the day they taught that was bad, but > it seems perfectly reasonable to me and I'm sure it is a common > practice. I even find it to be clearer and more maintainable than > using __init__. The only downside I see is that self.classvar += 1 > reads from and writes to a different place. It's also a bad idea for initializers that aren't immutable, because the initial values are shared between all instances (another example of the "aliasing" problem, also known from default argument values). > So, a reworded version of my question is "why not?". What is > the downside of providing behavior that is similar to non-slotted > classes? What is gained by this blocking of an assignment and > reporting a read-only error? It's not like I did any work to prevent what you want from working. Rather, what you seem to want would be hard to implement (see above). > When I find an existing class can be made lighter by using __slots__, > it would be nice to transform it with a single line. From: > > class Tree(object): > left = None > right = None > def __init__(self, value): > self.value = value > > adding only one line: > __slots__ = ('left', 'right', 'value') > > It would be a bummer to also have to move the left/right = None > into __init__ and transform them into self.left = self.right = None. Maybe I should remove slots from the language? <0.5 wink> They seem to be the most widely misunderstood feature of Python 2.2. If you don't understand how they work, please don't use them. > > Duncan Booth: > > The following works, but I can't remember whether you're supposed to be > > able to use a dict in __slots__ or if it just happens to be allowed: > > > > >>> class Pane(object): > > __slots__ = { 'background': 'black', 'foreground': 'white', > > 'size': (80, 25) } > > def __init__(self): > > for k, v in self.__slots__.iteritems(): > > setattr(self, k, v) > > __slots__ accepts any iterable.
So, yes, you're allowed > even though that particular use was not intended. This loophole was intentionally left for people to find a good use for. > There are several possible workarounds including metaclasses. > My question is why there needs to be a workaround at all. I hope that has been answered by now. > > Thomas Heller: > > Unrelated to *this* topic, but Andrew's "What's new in Python 2.2" still > > presents __slots__ as a way to constrain the instance variables: > > > > A new-style class can define a class attribute named __slots__ to > > constrain the list of legal attribute names. > > Though I think of __slots__ as a way to make lighter weight instances, > constraining instance variables is also one of its functions. Not true. That is at best an unintended side effect of slots. And there's nothing against having __slots__ include __dict__, so your instance has a __dict__ as well as slots. --Guido van Rossum (home page: http://www.python.org/~guido/) From python@rcn.com Wed May 14 07:51:24 2003 From: python@rcn.com (Raymond Hettinger) Date: Wed, 14 May 2003 02:51:24 -0400 Subject: [Python-Dev] Re: __slots__ and default values References: <1052839518.3ec10e5e3f7c0@mcherm.com> <002301c319b5$d44b8580$febd958d@oemcomputer> <200305140138.h4E1cRx02045@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <003e01c319e5$3d044100$8427c797@oemcomputer> > It's also a bad idea for initializers that aren't immutable, because > the initial values are shared between all instances (another example > of the "aliasing" problem, also known from default argument values). Right. I once knew that and had forgotten. > > Though I think of __slots__ as a way to make lighter weight instances, > > constraining instance variables is also one of its functions. > > Not true. That is at best an unintended side effect of slots. And > there's nothing against having __slots__ include __dict__, so your > instance has a __dict__ as well as slots.
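The __dict__-in-__slots__ escape hatch Guido describes (new in 2.3 at the time, and still how it works today) can be checked directly; the class name here is illustrative:

```python
class Point(object):
    # 'x' and 'y' live in fixed slots; listing '__dict__' as well restores
    # the per-instance dict, so arbitrary extra attributes still work.
    __slots__ = ('x', 'y', '__dict__')

pt = Point()
pt.x = 1             # stored in a slot, not in the dict
pt.label = 'origin'  # not a slot name: lands in the instance __dict__
```

So slots constrain nothing once __dict__ is listed; they only guarantee fast, fixed storage for the named attributes.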
That's something I never knew but wish I had known (and I *have* read the source). Live and learn. Raymond From mwh@python.net Wed May 14 10:53:04 2003 From: mwh@python.net (Michael Hudson) Date: Wed, 14 May 2003 10:53:04 +0100 Subject: [Python-Dev] Inplace multiply In-Reply-To: <000401c31974$ac31c730$6801a8c0@NICKLEBY> ("Paul Dubois"'s message of "Tue, 13 May 2003 10:25:38 -0700") References: <000401c31974$ac31c730$6801a8c0@NICKLEBY> Message-ID: <2mvfwdsvlr.fsf@starship.python.net> "Paul Dubois" writes: > My "masked array" class MA has a problem that I don't know how to solve. The > inplace multiply function > > def __imul__ (self, other) > > is not getting called while my other input operations do work. The scenario > is > > x = MA.array(...) > > x *= c > > If c is an int, this works correctly, calling MA.__imul__. Otherwise, I get > a message from the Python runtime saying it can't multiply a sequence by a > non-int. But change MA to Numeric, it works. > > Numeric is an extension type and MA is a (new style) class. MA defines > __len__ as well as all the math operators. What version of Python? This smells like a bug that has been (thought) fixed. Cheers, M. -- The ability to quote is a serviceable substitute for wit. -- W. Somerset Maugham From guido@python.org Wed May 14 15:03:08 2003 From: guido@python.org (Guido van Rossum) Date: Wed, 14 May 2003 10:03:08 -0400 Subject: [Python-Dev] Re: __slots__ and default values In-Reply-To: Your message of "Wed, 14 May 2003 02:51:24 EDT." <003e01c319e5$3d044100$8427c797@oemcomputer> References: <1052839518.3ec10e5e3f7c0@mcherm.com> <002301c319b5$d44b8580$febd958d@oemcomputer> <200305140138.h4E1cRx02045@pcp02138704pcs.reston01.va.comcast.net> <003e01c319e5$3d044100$8427c797@oemcomputer> Message-ID: <200305141403.h4EE38F25569@odiug.zope.com> > > Not true. That is at best an unintended side effect of slots. 
And > > there's nothing against having __slots__ include __dict__, so your > instance has a __dict__ as well as slots. > > That's something I never knew but wish I had known (and I *have* > read the source). Live and learn. Actually I think that's new in 2.3. --Guido van Rossum (home page: http://www.python.org/~guido/) From paul@pfdubois.com Wed May 14 16:12:35 2003 From: paul@pfdubois.com (Paul Dubois) Date: Wed, 14 May 2003 08:12:35 -0700 Subject: [Python-Dev] inplace multiply problem was a bug that has been fixed Message-ID: <000101c31a2b$4048e1e0$6801a8c0@NICKLEBY> My question about inplace multiply was answered by Todd Miller: it was a bug in Python that is now fixed. I upgraded and my problem went away. From jeremy@zope.com Wed May 14 16:55:57 2003 From: jeremy@zope.com (Jeremy Hylton) Date: 14 May 2003 11:55:57 -0400 Subject: [Python-Dev] Startup time In-Reply-To: References: Message-ID: <1052927757.7258.38.camel@slothrop.zope.com> I don't know if this thread came to any conclusion. I did the same strace that everyone else has reported on, and I've included a summary of the results here. I have one directory in my PYTHONPATH, which affects the number of directories that are searched for each imported module. Comparing Python 2.3 current CVS with Python 2.2 CVS, I see the following system call counts (limited to top 6 in 2.3).

                2.3    2.2
    open        305    104
    stat64      102     44
    fstat64      74     34
    read         71     30
    rt_sig...    69     68
    brk          62     74

When a single module is imported from the standard library, Python 2.2 looks in 10 different places. Specifically, it looks for five different files in two different directories -- PYTHONPATH and the std library directory. For files that aren't found (e.g. sitecustomize), it looks in 25 places (5 files x 5 directories). Interesting to note that PYTHONPATH directory is not searched for sitecustomize.
In Python 2.3, the standard library module requires 15 lookups because /usr/local/lib/python23.zip is added to the path before the std library directory. The failed lookup of sitecustomize takes 35 lookups, because PYTHONPATH and python23.zip are now on the path. The list of attempted imports is much larger in 2.3 than in 2.2. -- 2.3 -- -- 2.2 -- __future__ codecs copy_reg copy_reg encodings encodings/__init__ encodings/aliases encodings/iso_8859_15 exceptions linecache os os posixpath posixpath re site site sitecustomize sitecustomize sre sre_compile sre_constants sre_parse stat stat string strop types types UserDict UserDict warnings 22 total 7 total The increase in open, stat64, and fstat64 all seem consistent with a 3x increase in the number of modules searched for. The use of re in the warnings module seems the primary culprit, since it pulls in re, sre and friends, string, and strop. Jeremy From noah@noah.org Wed May 14 17:34:32 2003 From: noah@noah.org (Noah Spurrier) Date: Wed, 14 May 2003 09:34:32 -0700 Subject: [Python-Dev] Re: sf.net/708007: expectlib.py telnetlib.py split Message-ID: <3EC27018.9020503@noah.org> Tell me more about expectlib.py. Is it pty based? I ask because I wonder if it's like my pexpect module: http://pexpect.sourceforge.net/ Yours, Noah From python@rcn.com Wed May 14 17:55:22 2003 From: python@rcn.com (Raymond Hettinger) Date: Wed, 14 May 2003 12:55:22 -0400 Subject: [Python-Dev] Re: sf.net/708007: expectlib.py telnetlib.py split References: <3EC27018.9020503@noah.org> Message-ID: <003001c31a39$9c554d80$7c22a044@oemcomputer> > Tell me more about expectlib.py. Is it pty based? 
> I ask because I wonder if it's like my pexpect module: > http://pexpect.sourceforge.net/ Hello Noah, Googling for "expect.py" gives several useful hits: http://www.google.com/search?sourceid=navclient&q=expect%2Epy If those don't answer your question, I recommend posting to the comp.lang.python newsgroup where you can benefit from the experiences of hundreds of users. The python-dev list isn't a good place to follow up because these kinds of questions are not the primary focus here. Raymond Hettinger From skip@pobox.com Wed May 14 18:35:13 2003 From: skip@pobox.com (Skip Montanaro) Date: Wed, 14 May 2003 12:35:13 -0500 Subject: [Python-Dev] Startup time In-Reply-To: <1052927757.7258.38.camel@slothrop.zope.com> References: <1052927757.7258.38.camel@slothrop.zope.com> Message-ID: <16066.32337.635236.691405@montanaro.dyndns.org> Jeremy> I don't know if this thread came to any conclusion. I don't think so. I think it bogged down about the time I suggested that executing import from within a function might slow things down. Jeremy> The use of re in the warnings module seems the primary culprit, Jeremy> since it pulls in re, sre and friends, string, and strop. I just peeked at warnings.py. None of the uses of re.* in there seem like they'd be in time-critical functions. The straightforward change (migrate "import re" into the functions which use the module) worked for me, so I went ahead and checked it in. Skip From guido@python.org Wed May 14 18:37:22 2003 From: guido@python.org (Guido van Rossum) Date: Wed, 14 May 2003 13:37:22 -0400 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Lib warnings.py,1.19,1.20 In-Reply-To: Your message of "Wed, 14 May 2003 10:33:55 PDT." References: Message-ID: <200305141737.h4EHbMv06730@odiug.zope.com> > Modified Files: > warnings.py > Log Message: > defer re module imports to help improve interpreter startup Are you sure that's going to help?
"import warnings" calls _processoptions() and makes a few calls to filterwarnings() which brings in the re module anyway... --Guido van Rossum (home page: http://www.python.org/~guido/) From jeremy@zope.com Wed May 14 18:45:33 2003 From: jeremy@zope.com (Jeremy Hylton) Date: 14 May 2003 13:45:33 -0400 Subject: [Python-Dev] Startup time In-Reply-To: <16066.32337.635236.691405@montanaro.dyndns.org> References: <1052927757.7258.38.camel@slothrop.zope.com> <16066.32337.635236.691405@montanaro.dyndns.org> Message-ID: <1052934332.7260.45.camel@slothrop.zope.com> On Wed, 2003-05-14 at 13:35, Skip Montanaro wrote: > Jeremy> I don't know if this thread came to any conclusion. > > I don't think so. I think it bogged down about the time I suggested that > executing import from within a function might slow things down. > > Jeremy> The use of re in the warnings module seems the primary culprit, > Jeremy> since it pulls in re, sre and friends, string, and strop. > > I just peeked at warnings.py. None of the uses of re.* in there seem like > they'd be in time-critical functions. The straightforward change (migrate > "import re" into the functions which use the module) worked for me, so I > went ahead and checked it in. Guido and I looked at that briefly. It doesn't make any difference, does it? The functions that use re are called when the module is imported. Jeremy From skip@pobox.com Wed May 14 19:02:05 2003 From: skip@pobox.com (Skip Montanaro) Date: Wed, 14 May 2003 13:02:05 -0500 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Lib warnings.py,1.19,1.20 In-Reply-To: <200305141737.h4EHbMv06730@odiug.zope.com> References: <200305141737.h4EHbMv06730@odiug.zope.com> Message-ID: <16066.33949.903064.834797@montanaro.dyndns.org> >> defer re module imports to help improve interpreter startup Guido> Are you sure that's going to help? "import warnings" calls Guido> _processoptions() and makes a few calls to filterwarnings() which Guido> brings in the re module anyway...
Apparently not. :-( The call to _processoptions() won't hurt unless the user invokes the interpreter with a -W arg. Not much we can do there. I think the import in filterwarnings can be avoided by deferring the re compilation until warn_explicit. I'll see what I can come up with and submit a patch. Skip From skip@pobox.com Wed May 14 19:02:42 2003 From: skip@pobox.com (Skip Montanaro) Date: Wed, 14 May 2003 13:02:42 -0500 Subject: [Python-Dev] Startup time In-Reply-To: <1052934332.7260.45.camel@slothrop.zope.com> References: <1052927757.7258.38.camel@slothrop.zope.com> <16066.32337.635236.691405@montanaro.dyndns.org> <1052934332.7260.45.camel@slothrop.zope.com> Message-ID: <16066.33986.472857.935825@montanaro.dyndns.org> Jeremy> Guido and I looked at that briefly. It doesn't make any Jeremy> difference does it? The functions that use re are called when Jeremy> the module is imported. You're right. I'll come up with something. Skip From jepler@unpythonic.net Wed May 14 19:08:03 2003 From: jepler@unpythonic.net (Jeff Epler) Date: Wed, 14 May 2003 13:08:03 -0500 Subject: [Python-Dev] Startup time In-Reply-To: <16066.33986.472857.935825@montanaro.dyndns.org> References: <1052927757.7258.38.camel@slothrop.zope.com> <16066.32337.635236.691405@montanaro.dyndns.org> <1052934332.7260.45.camel@slothrop.zope.com> <16066.33986.472857.935825@montanaro.dyndns.org> Message-ID: <20030514180801.GN11289@unpythonic.net> On Wed, May 14, 2003 at 01:02:42PM -0500, Skip Montanaro wrote: > > Jeremy> Guido and I looked at that briefly. It doesn't make any > Jeremy> difference does it? The functions that use re are called when > Jeremy> the module is imported. > > You're right. I'll come up with something. I'd suggested (or I think I suggested) that re needs to only be imported when message != "" in filterwarnings. The other use, in _processoptions -> _setoption, is only hit when sys.warnoptions has a non-empty value. 
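The deferral being discussed is the standard move-the-import-into-the-function pattern; a generic sketch (the function name is illustrative, not from warnings.py):

```python
# Module-level "import re" would be paid by every importer of this
# module at startup, pulling in the whole sre machinery.

def filter_matches(pattern, text):
    # Function-level import: the cost of loading re is deferred until
    # the first call; subsequent calls hit the sys.modules cache.
    import re
    return re.match(pattern, text) is not None
```

The trade-off noted earlier in the thread still applies: each call pays a small sys.modules lookup, and function-level imports can deadlock the import lock in threaded code (the bug 683658 caveat that warnings.py itself documents).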
This still leaves the usage of re in encodings.__init__.normalize_encoding() which I also suggested moving into the function -- however, I never checked when .normalize_encoding() is called, so it might always be hit at startup anyway. This could also be rewritten as string operations, too. Jeff From skip@pobox.com Wed May 14 19:14:43 2003 From: skip@pobox.com (Skip Montanaro) Date: Wed, 14 May 2003 13:14:43 -0500 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Lib warnings.py,1.19,1.20 In-Reply-To: <16066.33949.903064.834797@montanaro.dyndns.org> References: <200305141737.h4EHbMv06730@odiug.zope.com> <16066.33949.903064.834797@montanaro.dyndns.org> Message-ID: <16066.34707.101894.297890@montanaro.dyndns.org> Skip> I'll see what I can come up with and submit a patch. Okay, this seems too simple. ;-) There's no need to compile the message and module arguments to filterwarnings. Just store them as strings and call re.match() later with the appropriate args. That takes care of that one. I think use of the -W command line flag is infrequent enough that it doesn't really matter that _processoptions, _setoption and _getcategory might get called at startup. Most of the time sys.warnoptions will be an empty list, so _setoption and _getcategory won't be called. Aside: Why are message and module names given on the command line treated as literal strings while message and module names which are passed directly to filterwarnings() treated as regular expressions? If they were treated as regular expressions, the calls to re.escape() could be removed and _setoption wouldn't use re either. Skip From guido@python.org Wed May 14 19:23:16 2003 From: guido@python.org (Guido van Rossum) Date: Wed, 14 May 2003 14:23:16 -0400 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Lib warnings.py,1.19,1.20 In-Reply-To: Your message of "Wed, 14 May 2003 13:14:43 CDT."
<16066.34707.101894.297890@montanaro.dyndns.org> References: <200305141737.h4EHbMv06730@odiug.zope.com> <16066.33949.903064.834797@montanaro.dyndns.org> <16066.34707.101894.297890@montanaro.dyndns.org> Message-ID: <200305141823.h4EINGt15343@odiug.zope.com> > Skip> I'll see what I can come up with and submit a patch. > > Okay, this seems too simple. ;-) There's no need to compile the message and > module arguments to filterwarnings. Just store them as strings and call > re.match() later with the appropriate args. That takes care of that one. I > think use of the -W command line flag is infrequent enough that it doesn't > really matter that _processoptions, _setoption and _getcategory might get > called at startup. Most of the time sys.warnoptions will be an empty list, > so _setoption and _getcategory won't be called. OK. Please report old and new startup times! > Aside: Why are message and module names given on the command line treated as > literal strings while message and module names which are passed directly to > filterwarnings() treated as regular expressions? If they were treated as > regular expressions, the calls to re.escape() could be removed and > _setoption wouldn't use re either. Um, I don't remember. It would seem to be useful, wouldn't it? The only reason I can come up with is that for dotted names, the dot would have to be escaped on the command line, and escaping something on the command line is painful because \ is also a shell escape character, so you'd have to escape the escape.
--Guido van Rossum (home page: http://www.python.org/~guido/) From skip@pobox.com Wed May 14 20:32:23 2003 From: skip@pobox.com (Skip Montanaro) Date: Wed, 14 May 2003 14:32:23 -0500 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Lib warnings.py, 1.19, 1.20 In-Reply-To: <200305141823.h4EINGt15343@odiug.zope.com> References: <200305141737.h4EHbMv06730@odiug.zope.com> <16066.33949.903064.834797@montanaro.dyndns.org> <16066.34707.101894.297890@montanaro.dyndns.org> <200305141823.h4EINGt15343@odiug.zope.com> Message-ID: <16066.39367.822086.836812@montanaro.dyndns.org> Guido> OK. Please report old and new startup times! That seems to be a little harder than you'd think. First, I had to fix encodings/__init__.py as well. Here are some numbers. In all cases the interpreter is as I built it from CVS on May 6. PYTHONSTARTUP is not defined.

"import re" at top level:

    Without -S (best real time of 5 runs):
    % time python -c pass
    real 0m0.169s  user 0m0.100s  sys 0m0.030s

    With -S (best of 5):
    % time python -S -c pass
    real 0m0.125s  user 0m0.020s  sys 0m0.080s

Proposed mods to warnings.py (and provisional replacement of re with string translation tables in encodings/__init__.py):

    Without -S (best of 5):
    % time python -c pass
    real 0m0.187s  user 0m0.120s  sys 0m0.030s

    With -S (best of 5):
    % time python -S -c pass
    real 0m0.118s  user 0m0.020s  sys 0m0.020s

Not too exciting. I verified using -v that these modules are imported in 2.3 with no PYTHONSTARTUP and the -S flag after my changes: UserDict copy_reg linecache os posix posixpath stat types warnings zipimport Without -S (and my sitecustomize.py file moved) I get these: UserDict copy_reg linecache os posix posixpath site stat types warnings zipimport I've got to get back to some paying work, so I can't pursue this more at the moment.
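The "string translation table" replacement mentioned above presumably works along these lines -- a modern-syntax sketch of the idea, not the attached Python-2 patch itself:

```python
import string

# Characters the old regular expression [^a-zA-Z0-9.] would have kept.
_KEEP = set(string.ascii_letters + string.digits + '.')

# A 256-entry table mapping every other character to '_'.
_NORM_TABLE = {i: (chr(i) if chr(i) in _KEEP else '_') for i in range(256)}

def normalize_encoding(encoding):
    # Same effect as '_'.join(re.split('[^a-zA-Z0-9.]', encoding)),
    # but without importing re at interpreter startup.
    return encoding.translate(_NORM_TABLE)
```

Building the table once at module load is cheap; the per-call work is a single C-level translate pass instead of a regex split and join.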
Attached are my current diffs for warnings.py and encodings/__init__.py if someone has a few moments to look at it. Skip [Attachment: nore.diff -- base64-encoded context diff against Lib/warnings.py and Lib/encodings/__init__.py, omitted here] From dave@boost-consulting.com Thu May 15 01:22:46 2003 From: dave@boost-consulting.com (David Abrahams) Date: Wed, 14 May 2003 20:22:46 -0400 Subject: [Python-Dev] Re: MS VC 7 offer References: <3EBCABD0.7050700@lemburg.com> <5.1.1.6.0.20030512222353.022ade78@torment.chelsea.private> Message-ID: Barry Scott writes: > Did I miss the answer to why bother to move to VC7? > > As a C project I know of very little to recommend VC7 or VC7.1. > As a C++ developer I've decided that VC7 is little more than a broken > VC6. That was roughly my experience, support for template template arguments notwithstanding. > Maybe Jesse Lipcon (who works for MS now) has managed to > make VC7.1 more standards compatible for C++ work, which would > recommend it to C++ developers. That's not a maybe. As a hard-core C++-head, I can tell you that it's like night and day. VC7.1 is very, very good.
> Note that wxPython claims that it will not compile correctly with
> VC7 unless you add a workaround for a bug in the code generator.

It's very unlikely that this bug survived the VC7.1 release, but I suppose it's possible.

--
Dave Abrahams
Boost Consulting
www.boost-consulting.com

From tim.one@comcast.net Thu May 15 04:08:18 2003
From: tim.one@comcast.net (Tim Peters)
Date: Wed, 14 May 2003 23:08:18 -0400
Subject: [Python-Dev] Simple dicts
Message-ID:

Behind the scenes, Damien Morton has been playing with radically different designs for "small" dicts. This reminded me that, in previous lives, I always used chaining ("linked lists") for collision resolution in hash tables. I don't have time to pursue this now, but it would make a nice experiment for someone who wants to try a non-trivial but still-simple and interesting project. It may even pay off, but that shouldn't be a motivation.

Design notes:

PyDictEntry would grow a new

    PyDictEntry *next;

slot, boosting it from 12 to 16 bytes on a 32-bit box. This wasn't reasonable when Python was designed, but pymalloc allocates 16-byte chunks with virtually no wasted space.

PyDictObject would lose the ma_fill and ma_table and ma_smalltable members, and gain a pointer to a variable-size vector of pointers

    PyDictEntry **first;

For a hash table with 2**n slots, this vector holds 2**n pointers, memset to NULL initially. The hash chain for an object with hash code h starts at first[h & ma_mask], and is linked together via PyDictEntry.next pointers. Hash chains per hash code are independent.

There's no use for the "dummy" state. There's no *logical* need for the "unused" state either, although a micro-optimizer may want to retain that in some form.

Memory use isn't nearly as bad as it may first appear -- to the contrary, it's probably better on average! Assuming a 32-bit box:

Current: tables are normally 1/3 to 2/3 full.
If there are N active objects in the table, at 1/3 full the table contains 3*N PyDictEntries, and at 2/3 full it contains 1.5*N PyDictEntries, for a total of (multiplying by 12 bytes per PyDictEntry) 18*N to 36*N bytes.

Chaining: assuming tables are still 1/3 to 2/3 full. At 1/3 full there are 3*N first pointers and at 2/3 full there are 1.5*N first pointers, for a total of 6*N to 12*N bytes for first pointers. Independent of load factor, 16*N bytes are consumed by the larger PyDictEntry structs. Adding, that's 22*N to 28*N bytes. This relies on pymalloc's tiny wastage when allocating 16-byte chunks (under 1%).

The worst case is worse than the current scheme, and the best case is better. The average is probably better.

Note that "full" is a misnomer here. A chained table with 2**i slots can actually hold any number of objects, even if i==0; on average, each hash chain contains N/2**i PyDictEntry structs. Note that a small load factor is less important with chained resolution than with open addressing, because collisions at different hash codes can't interfere with each other (IOW, an object in slot #i never slows access to an object in the slot #j collision list, whenever i != j; "breathing room" to ease cross-hashcode collision pressure isn't needed; primary collisions are all that exist).

Collision resolution code: Just a list walk. For example, lookdict_string could be, in its entirety:

    static dictentry *
    lookdict_string(dictobject *mp, PyObject *key, register long hash)
    {
        dictentry *p = mp->first[hash & mp->ma_mask];
        if (PyString_CheckExact(key)) {
            for (; p != NULL; p = p->next) {
                if (p->me_key == key ||
                    (p->me_hash == hash && PyString_Eq(p->me_key, key)))
                    return p;
            }
            return NULL;
        }
        mp->ma_lookup = lookdict;
        return lookdict(mp, key, hash);
    }

Resizing: Probably much faster. The vector of first pointers can be realloc'ed, and sometimes benefit from the platform malloc extending it in-place. No other memory allocation operation is needed on a resize.
Instead about half the PyDictEntry structs will need to move to "the other half" of the table (the structs themselves don't get copied or moved; they just get relinked via their next pointers). Copying: Probably slower, due to needing a PyObject_Malloc() call for each key/value pair. Building a dict up: Probably slower, again due to repeated PyObject_Malloc() calls. Referencing a dict: Probably a wash, although because the code can be so much simpler compilers may do a better job of optimizing it, and no tests are needed to distinguish among three kinds of states. Out-of-cache dicts are killers either way. Also see next point. Optimizations: The cool algorithmic thing about chaining is that self-organizing gimmicks (like swap-toward-front (or move-to-front) on reference) are easy to code and run fast (again, the dictentry structs don't move, you just need to fiddle a few next pointers). When collision chains can collide, dynamic table reorganization is so complicated and expensive that nobody has even thought about trying it in Python. When they can't collide, it's simple. Note too that since the memory burden per unused slot falls from 12 to 4 bytes, sparser tables are less painful to contemplate. Small dicts: There's no gimmick here to favor them. From cgw@alum.mit.edu Thu May 15 05:38:26 2003 From: cgw@alum.mit.edu (Charles G Waldman) Date: Wed, 14 May 2003 23:38:26 -0500 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Lib warnings.py,1.19,1.20 In-Reply-To: <20030515030905.382.60542.Mailman@mail.python.org> References: <20030515030905.382.60542.Mailman@mail.python.org> Message-ID: <16067.6594.326128.398884@nyx.dyndns.org> GvR> only reason I can come up with is that for dotted names, the dot would GvR> have to be escaped on the command line, and escaping something on the GvR> command line is painful because \ is also a shell escape character, so GvR> you'd have to escape the escape. 
I'm afraid I must be missing something terribly obvious here, but why would you need to escape a dot on a command line? None of the shells I'm familiar with treat dot as a metacharacter. Isn't `?' the standard shell metacharacter for "any character"? Filename patterns on the shell command line are "glob patterns", not RE's. But, like I said, I'm probably missing something. I think I'll go back into the shadows to lurk some more now.... Charles From fdrake@acm.org Thu May 15 05:43:57 2003 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Thu, 15 May 2003 00:43:57 -0400 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Lib warnings.py,1.19,1.20 In-Reply-To: <16067.6594.326128.398884@nyx.dyndns.org> References: <20030515030905.382.60542.Mailman@mail.python.org> <16067.6594.326128.398884@nyx.dyndns.org> Message-ID: <16067.6925.897980.796041@grendel.zope.com> Charles G Waldman writes: > I'm afraid I must be missing something terribly obvious here, but why > would you need to escape a dot on a command line? None of the shells > I'm familiar with treat dot as a metacharacter. Isn't `?' the > standard shell metacharacter for "any character"? Filename patterns > on the shell command line are "glob patterns", not RE's. It's not the shell that treats it as a metacharacter, but the RE syntax. Preventing "." from being treated as an RE metacharacter would be done by inserting a "\" character, which is a shell metacharacter, and would need another "\" to escape that, so that one of the "\" would end up in the RE. Of course, my favorite way of dealing with this is to use single quotes around the argument rather than backslashes; that works fine in sh-syntax shells, and doesn't require doubling-up backslashes. -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From drifty@alum.berkeley.edu Thu May 15 07:45:54 2003 From: drifty@alum.berkeley.edu (Brett C.) 
Date: Wed, 14 May 2003 23:45:54 -0700 Subject: [Python-Dev] Simple dicts In-Reply-To: References: Message-ID: <3EC337A2.7060702@ocf.berkeley.edu> Tim Peters wrote: > Behind the scenes, Damien Morton has been playing with radically different > designs for "small" dicts. This reminded me that, in previous lives, I > always used chaining ("linked lists") for collision resolution in hash > tables. I don't have time to pursue this now, but it would make a nice > experiment for someone who wants to try a non-trivial but still-simple and > interesting project. It may even pay off, but that shouldn't be a > motivation . > When I took data structures I was taught that chaining was actually the easiest way to do hash tables and they still had good performance compared to open addressing. Because of this taught bias I always wondered why Python used open addressing; can someone tell me? I am interested in seeing how this would pan out, but I am unfortunately going to be busy for the next three days (if anyone is going to be at E3 Thursday or Friday for some odd reason let me know since I will be there). If someone takes this up please let me know; I am interested in helping if I can. Perhaps this should be a sandbox thing? -Brett From lkcl@samba-tng.org Thu May 15 09:59:27 2003 From: lkcl@samba-tng.org (Luke Kenneth Casson Leighton) Date: Thu, 15 May 2003 08:59:27 +0000 Subject: [Python-Dev] Re: sf.net/708007: expectlib.py telnetlib.py split In-Reply-To: <20030513225740.GH2305@localhost> References: <20030513225740.GH2305@localhost> Message-ID: <20030515085927.GD908@localhost> raymond, regarding expect.py which you give a link to: - expect.py is extremely basic, offering pretty much only read and write. what it _actually_ offers is an advantage over the python distribution's popen2/3 because it catches ptys (stdin) even on ssh and passwd. - expectlib.py [new] _is_ telnetlib.py [old] - with over-rideable read, write, open and close methods. - pexpect is like... 
an independently developed version of the above, with all of the above functionality And Then Some - including an ANSI screen emulator should an application developer choose to use it. what i figure is a sensible roadmap to suggest / propose to people: - telnetlib.py [old] gets split into telnetlib.py [patched] plus expectlib.py [patched]. - noah investigates expectlib.py and a) works some magic on it b) uses it in pexpect. - someone independently investigates expect.py's popen2 c-code capability to see if it can be merged into the python distribution. i do not know if it is a "bug" that python's popen functions cannot capture ssh / passwd but it would certainly appear to be sensible to have an option to allow ALL user input to be captured. certainly i found it a total pain two years ago to have to patch ssh to allow a user password to be accepted on the command-line! [i didn't know about expect.py then] last time i spoke to guido about the telnetlib.py/expectlib.py patch, he a) wasn't so madly busy as he is now, b) rejected the then-patch because it wasn't clean c) acknowledged that telnetlib.py is a mess and needed a complete rewrite. since that time, i notice that telnetlib.py has had a control-char handling function, which alleviates some of the need for a complete rewrite. l. On Tue, May 13, 2003 at 10:57:40PM +0000, Luke Kenneth Casson Leighton wrote: > [i am not on the python-dev list but i check the archives, please cc me] > > approximately two years ago i needed the functionality outlined > in the present telnetlib.py for several other remote protocols, > most notably commands (including ssh and bash) and also HTTP. > > i figure that this functionality should be more than invaluable > to other python developers. 
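[The pty behaviour Luke describes, that programs like ssh and passwd prompt on the controlling terminal rather than on stdin/stdout so that plain popen() pipes never see the prompt, can be demonstrated with the standard-library pty module. This is a modern, Unix-only editor's sketch, not part of any patch under discussion:

```python
import os
import pty

# pty.fork() makes the new pseudo-terminal the child's controlling tty,
# so even output the child sends to /dev/tty (as passwd does for its
# prompt) shows up on the master side, where the parent can capture it.
pid, master_fd = pty.fork()
if pid == 0:
    # Child: write a "prompt" straight to the controlling tty, bypassing
    # stdout, just as password-reading programs do.
    with open("/dev/tty", "w") as tty:
        tty.write("Password: ")
    os._exit(0)

captured = os.read(master_fd, 1024).decode()
os.waitpid(pid, 0)
os.close(master_fd)
```

A pipe-based popen() would capture nothing here, which is the whole point of routing the child through a pty.]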
From walter@livinglogic.de Thu May 15 11:05:22 2003 From: walter@livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=) Date: Thu, 15 May 2003 12:05:22 +0200 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Lib warnings.py, 1.19, 1.20 In-Reply-To: <16066.39367.822086.836812@montanaro.dyndns.org> References: <200305141737.h4EHbMv06730@odiug.zope.com> <16066.33949.903064.834797@montanaro.dyndns.org> <16066.34707.101894.297890@montanaro.dyndns.org> <200305141823.h4EINGt15343@odiug.zope.com> <16066.39367.822086.836812@montanaro.dyndns.org> Message-ID: <3EC36662.5070706@livinglogic.de> This is a multi-part message in MIME format. --------------020507060202080607040303 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 8bit Skip Montanaro wrote: > [...] > I've got to get back to some paying work, so I can't pursue this more at the > moment. Attached are my current diffs for warnings.py and encodings/ > __init__.py if someone has a few moments to look at it. Your normalize_encoding() doesn't preserve the "." and it doesn't collapse consecutive non-alphanumeric characters. Furthermore it imports the string module. How about the attached patch? Constructing the translation string might be bad for startup time. 
Bye, Walter Dörwald --------------020507060202080607040303 Content-Type: text/plain; name="diff.txt" Content-Transfer-Encoding: 7bit Content-Disposition: inline; filename="diff.txt" Index: Lib/encodings/__init__.py =================================================================== RCS file: /cvsroot/python/python/dist/src/Lib/encodings/__init__.py,v retrieving revision 1.18 diff -u -r1.18 __init__.py --- Lib/encodings/__init__.py 24 Apr 2003 16:02:49 -0000 1.18 +++ Lib/encodings/__init__.py 15 May 2003 10:00:52 -0000 @@ -32,7 +32,13 @@ _cache = {} _unknown = '--unknown--' _import_tail = ['*'] -_norm_encoding_RE = re.compile('[^a-zA-Z0-9.]') +_norm_encoding_trans = [] +for i in xrange(128): + c = chr(i) + if not c.isalnum() and not c==".": + c = "_" + _norm_encoding_trans.append(c) +_norm_encoding_trans = "".join(_norm_encoding_trans) + "_"*128 class CodecRegistryError(exceptions.LookupError, exceptions.SystemError): @@ -48,7 +54,7 @@ becomes '_'. """ - return '_'.join(_norm_encoding_RE.split(encoding)) + return '_'.join(filter(None, encoding.translate(_norm_encoding_trans).split("_"))) def search_function(encoding): --------------020507060202080607040303-- From guido@python.org Thu May 15 12:07:04 2003 From: guido@python.org (Guido van Rossum) Date: Thu, 15 May 2003 07:07:04 -0400 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Lib warnings.py,1.19,1.20 In-Reply-To: "Your message of Thu, 15 May 2003 00:43:57 EDT." <16067.6925.897980.796041@grendel.zope.com> References: <20030515030905.382.60542.Mailman@mail.python.org> <16067.6594.326128.398884@nyx.dyndns.org> <16067.6925.897980.796041@grendel.zope.com> Message-ID: <200305151107.h4FB75O17039@pcp02138704pcs.reston01.va.comcast.net> > Of course, my favorite way of dealing with this is to use single > quotes around the argument rather than backslashes; that works fine in > sh-syntax shells, and doesn't require doubling-up backslashes. 
Agreed, but you're still using two levels of quoting, and with anything less, "foo.bar" will also match a module named "foolbar". --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Thu May 15 12:10:37 2003 From: guido@python.org (Guido van Rossum) Date: Thu, 15 May 2003 07:10:37 -0400 Subject: [Python-Dev] Simple dicts In-Reply-To: "Your message of Wed, 14 May 2003 23:45:54 PDT." <3EC337A2.7060702@ocf.berkeley.edu> References: <3EC337A2.7060702@ocf.berkeley.edu> Message-ID: <200305151110.h4FBAb717062@pcp02138704pcs.reston01.va.comcast.net> > When I took data structures I was taught that chaining was actually the > easiest way to do hash tables and they still had good performance > compared to open addressing. Because of this taught bias I always > wondered why Python used open addressing; can someone tell me? It was my choice, but I don't recall why. Probably because Knuth said so. Or because it's simpler to implement with a single allocated block (I think I was aware of the cost of malloc(), or else tuples and strings would have used two blocks. BTW, why don't Unicode objects use this trick?) --Guido van Rossum (home page: http://www.python.org/~guido/) From fdrake@acm.org Thu May 15 13:09:57 2003 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Thu, 15 May 2003 08:09:57 -0400 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Lib warnings.py,1.19,1.20 In-Reply-To: <200305151107.h4FB75O17039@pcp02138704pcs.reston01.va.comcast.net> References: <20030515030905.382.60542.Mailman@mail.python.org> <16067.6594.326128.398884@nyx.dyndns.org> <16067.6925.897980.796041@grendel.zope.com> <200305151107.h4FB75O17039@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <16067.33685.659075.459229@grendel.zope.com> Guido van Rossum writes: > Agreed, but you're still using two levels of quoting, and with > anything less, "foo.bar" will also match a module named "foolbar". Agreed. 
"foo\.bar" will match "foolbar" as well, but 'foo\.bar' only matches "foo.bar". The advantage of single quotes is that you're not escaping the escape characters with themselves; what's inside the quotes is simple RE syntax, so you only need to think about one of the layers at a time. Either approach works, of course. -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From aahz@pythoncraft.com Thu May 15 13:44:34 2003 From: aahz@pythoncraft.com (Aahz) Date: Thu, 15 May 2003 08:44:34 -0400 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Lib warnings.py,1.19,1.20 In-Reply-To: <16067.33685.659075.459229@grendel.zope.com> References: <20030515030905.382.60542.Mailman@mail.python.org> <16067.6594.326128.398884@nyx.dyndns.org> <16067.6925.897980.796041@grendel.zope.com> <200305151107.h4FB75O17039@pcp02138704pcs.reston01.va.comcast.net> <16067.33685.659075.459229@grendel.zope.com> Message-ID: <20030515124434.GA20979@panix.com> On Thu, May 15, 2003, Fred L. Drake, Jr. wrote: > Guido van Rossum writes: >> >> Agreed, but you're still using two levels of quoting, and with >> anything less, "foo.bar" will also match a module named "foolbar". > > Agreed. "foo\.bar" will match "foolbar" as well, but 'foo\.bar' only > matches "foo.bar". The advantage of single quotes is that you're not > escaping the escape characters with themselves; what's inside the > quotes is simple RE syntax, so you only need to think about one of the > layers at a time. The point is that with current behavior you can use foo.bar on the command line and not worry, because "." is a meta character in neither shell nor Python. -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "In many ways, it's a dull language, borrowing solid old concepts from many other languages & styles: boring syntax, unsurprising semantics, few automatic coercions, etc etc. But that's one of the things I like about it." 
--Tim Peters on Python, 16 Sep 93 From gward@python.net Thu May 15 14:39:27 2003 From: gward@python.net (Greg Ward) Date: Thu, 15 May 2003 09:39:27 -0400 Subject: [Python-Dev] Simple dicts In-Reply-To: <3EC337A2.7060702@ocf.berkeley.edu> References: <3EC337A2.7060702@ocf.berkeley.edu> Message-ID: <20030515133927.GA15523@cthulhu.gerg.ca> On 14 May 2003, Brett C. said: > When I took data structures I was taught that chaining was actually the > easiest way to do hash tables and they still had good performance > compared to open addressing. Because of this taught bias I always > wondered why Python used open addressing; can someone tell me? If your nodes are small, chaining has a huge overhead -- an extra pointer for each node in a chain. You can play around with glomming several nodes together to amortize the cost of those pointers, but ISTR the win isn't that big. Open addressing is more memory-efficient, but when the hash table fills (or gets close to full), you absolutely positively have to rehash. (Back in January, I played around with writing a custom hash table for keeping ZODB indexes in memory without using a Python dict, so that's why I'm fairly fresh on hash table minutiae.) Greg -- Greg Ward http://www.gerg.ca/ NOBODY expects the Spanish Inquisition! 
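[For readers following the thread, the chained scheme Tim and Greg discuss above is easy to sketch in Python. This toy is an editor's illustration, not anyone's proposed implementation; it shows the shape of the thing: a vector of "first" pointers, per-slot linked lists, and no forced rehash when chains grow, at the cost of one link field per node, which is exactly the overhead Greg mentions:

```python
class ChainedDict:
    """Toy hash table using chaining, in the style of Tim's design notes.

    Each slot heads a linked list of [key, value, next] nodes; the slot
    vector is the analogue of the proposed PyDictEntry **first.
    """

    def __init__(self, nslots=8):
        self.first = [None] * nslots   # one "first pointer" per slot
        self.mask = nslots - 1

    def insert(self, key, value):
        i = hash(key) & self.mask
        node = self.first[i]
        while node is not None:        # walk this slot's chain
            if node[0] == key:
                node[1] = value        # key already present: overwrite
                return
            node = node[2]
        # Prepend a new node.  Chains can grow without bound, so unlike
        # open addressing there is never a *forced* resize.
        self.first[i] = [key, value, self.first[i]]

    def lookup(self, key):
        node = self.first[hash(key) & self.mask]
        while node is not None:
            if node[0] == key:
                return node[1]
            node = node[2]
        raise KeyError(key)
```

In C the per-node link is the jump from 12-byte to 16-byte PyDictEntry structs that Tim's memory arithmetic accounts for.]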
From skip@pobox.com Thu May 15 15:20:36 2003 From: skip@pobox.com (Skip Montanaro) Date: Thu, 15 May 2003 09:20:36 -0500 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Lib warnings.py,1.19,1.20 In-Reply-To: <16067.33685.659075.459229@grendel.zope.com> References: <20030515030905.382.60542.Mailman@mail.python.org> <16067.6594.326128.398884@nyx.dyndns.org> <16067.6925.897980.796041@grendel.zope.com> <200305151107.h4FB75O17039@pcp02138704pcs.reston01.va.comcast.net> <16067.33685.659075.459229@grendel.zope.com> Message-ID: <16067.41524.485389.151519@montanaro.dyndns.org> Fred> Guido van Rossum writes: >> Agreed, but you're still using two levels of quoting, and with >> anything less, "foo.bar" will also match a module named "foolbar". Fred> Agreed. "foo\.bar" will match "foolbar" as well, but 'foo\.bar' Fred> only matches "foo.bar". Coming back to my original question, does it make sense to allow regular expressions in the message and module fields in a -W command line arg? The complexity of all the shell/re quoting suggests not, but having -W args treated differently than the args to filterwarnings() doesn't seem right. Perhaps this is something that never happens in practice. I've never used -W. Are there people out there who have used it and wished the message and module fields could be regular expressions? Conversely, does anyone make use of the fact that the message and module args to filterwarnings() can be regular expressions? Looking through the Python source I see several examples of filterwarning() where one or the other of the message and module args are regular expressions, so that answers the second question. The first remains open. Skip From guido@python.org Thu May 15 15:27:58 2003 From: guido@python.org (Guido van Rossum) Date: Thu, 15 May 2003 10:27:58 -0400 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Lib warnings.py,1.19,1.20 In-Reply-To: Your message of "Thu, 15 May 2003 09:20:36 CDT." 
<16067.41524.485389.151519@montanaro.dyndns.org> References: <20030515030905.382.60542.Mailman@mail.python.org> <16067.6594.326128.398884@nyx.dyndns.org> <16067.6925.897980.796041@grendel.zope.com> <200305151107.h4FB75O17039@pcp02138704pcs.reston01.va.comcast.net> <16067.33685.659075.459229@grendel.zope.com> <16067.41524.485389.151519@montanaro.dyndns.org> Message-ID: <200305151427.h4FERwL14363@odiug.zope.com> > Fred> Guido van Rossum writes: > >> Agreed, but you're still using two levels of quoting, and with > >> anything less, "foo.bar" will also match a module named "foolbar". > > Fred> Agreed. "foo\.bar" will match "foolbar" as well, but 'foo\.bar' > Fred> only matches "foo.bar". > > Coming back to my original question, does it make sense to allow regular > expressions in the message and module fields in a -W command line arg? The > complexity of all the shell/re quoting suggests not, but having -W args > treated differently than the args to filterwarnings() doesn't seem right. > > Perhaps this is something that never happens in practice. I've never used > -W. Are there people out there who have used it and wished the message and > module fields could be regular expressions? Conversely, does anyone make > use of the fact that the message and module args to filterwarnings() can be > regular expressions? > > Looking through the Python source I see several examples of filterwarning() > where one or the other of the message and module args are regular > expressions, so that answers the second question. The first remains open. I'll call YAGNI on regexps for -W. --Guido van Rossum (home page: http://www.python.org/~guido/) From tim.one@comcast.net Thu May 15 15:33:56 2003 From: tim.one@comcast.net (Tim Peters) Date: Thu, 15 May 2003 10:33:56 -0400 Subject: [Python-Dev] Simple dicts In-Reply-To: <3EC337A2.7060702@ocf.berkeley.edu> Message-ID: [Brett C.] 
> When I took data structures I was taught that chaining was actually the > easiest way to do hash tables and they still had good performance > compared to open addressing. Because of this taught bias I always > wondered why Python used open addressing; can someone tell me? malloc overhead is a major drag; as my msg said, the feasibility depends on pymalloc's very low overhead, and that dictentry nodes are 12 bytes apiece even now; pymalloc didn't exist back then, and Python wasn't originally micro-optimized as it is now (e.g., there wasn't the current zoo of dedicated free-lists, or pre-allocation strategies, and dicts could *only* be indexed by strings so collision only had to worry about the kinds of problems a single known hash function was prone to). > I am interested in seeing how this would pan out, but I am unfortunately > going to be busy for the next three days (if anyone is going to be at E3 > Thursday or Friday for some odd reason let me know since I will be > there). If someone takes this up please let me know; I am interested in > helping if I can. Perhaps this should be a sandbox thing? There's no rush , and I'd be surprised if Python adopted a different scheme in the end anyway. It's likely a just-for-fun to-see-what-happens project. Note one nasty class of problem: in chaining *only* primary collisions exist. The current dict implementation turned the problem of open-addressing's secondary collisions into "a feature", which will become clear when you contemplate dictobject.c's [i << 16 for i in range(20000)] example. Python's original dict design didn't have a problem with this because it used prime numbers for table sizes and reduced 32-bit hashes via mod-by-a-prime. The current scheme of just grabbing the last i bits is both much faster and more delicate than that, and we really rely on the collision resolution strategy now to protect against unlucky bit patterns. 
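[Tim's [i << 16 for i in range(20000)] example is easy to reproduce. With a 2**16-slot table, taking only the last 16 bits of the hash puts every one of those keys in slot 0; it is the perturb recurrence that feeds the high bits into the probe sequence and rescues the case. An editor's Python rendering of the probing logic in dictobject.c, not the real C code:

```python
def place_all(hashes, power):
    """Insert hash codes into an open-addressed table of 2**power slots,
    probing with a Python rendering of dictobject.c's recurrence:

        i = (i << 2) + i + perturb + 1;  perturb >>= 5;
    """
    mask = (1 << power) - 1
    table = [None] * (mask + 1)
    total_probes = 0
    for h in hashes:
        i = h & mask                   # only the last `power` bits
        perturb = h
        while table[i] is not None:    # collision: keep probing
            total_probes += 1
            i = (i * 5 + perturb + 1) & mask
            perturb >>= 5              # let the high bits participate
        table[i] = h
    return table, total_probes

keys = [i << 16 for i in range(20000)]
# Every key has zero low bits, so the first probe is slot 0 for all of them:
assert len({k & 0xffff for k in keys}) == 1
# ... yet the perturbed probe sequence still places every key:
table, probes = place_all(keys, 16)
assert sum(1 for slot in table if slot is not None) == len(keys)
```

Once perturb has been shifted down to zero the recurrence degenerates to i = 5*i + 1 (mod 2**power), which visits every slot, so the loop always terminates.]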
Another way of looking at this is that the current scheme has a way to get all 32 bits of a hash code to participate in which table slot gets selected; mod-by-an-odd-prime also gets all bits into play; peel-off-the-last-i-bits does not. From damien.morton@acm.org Thu May 15 23:25:13 2003 From: damien.morton@acm.org (Damien Morton) Date: Thu, 15 May 2003 18:25:13 -0400 Subject: [Python-Dev] Simple dicts Message-ID: <006301c31b30$da69e8e0$6401a8c0@damien> Im currently working on an open-chaining dict. Between paying work and coming to grips with the python innards, it might take a little while. I was working on an implementation optimised for dicts with <256 entries that attempted to squeeze the most out of memory by using bytes for the 'first' and 'next' pointers. This kind of hashtable can be _extremely_ sparse compared to the current dict implementation. With the byte-oriented open-chaining approach, the break-even point for memory usage (compared to the current approach) happens at a max load factor of about 0.1. Im not sure that alloc()/free() for each dictentry is a win (if only because of pymalloc call overhead), and instead imagine a scheme whereby each dict would pre-alloc() a block of memory and manages its own free-lists. Theoretically, this makes copying and disposing of dicts much easier. It also helps ensure locality of reference. In fact, immediately after a doubling, the open-addressing hashtable scheme still 'uses' (in the sense of potentially addressing) all of the memory allocated to it, whereas the open-chaining approach 'uses' only the first pointers and the actual dictentries in use - about 2/3 of the space the open-addressing scheme uses. On the other hand, as Tim pointed out to me in a private email, there is so much overhead in just getting to the hashtable inner loop, going around that loop one time instead of two or three seems inconsequential. 
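[Damien's break-even figure above can be sanity-checked with back-of-the-envelope arithmetic, using the sizes from Tim's earlier message (12-byte open-addressing entries, 16-byte chained nodes, 4-byte pointers) plus an assumed 13 bytes per entry for the byte-linked small-dict variant; the constants here are an editor's guesses, not Damien's measurements:

```python
def open_addressing_bytes(n, load):
    """Current scheme: n live entries in a table kept `load` full."""
    return 12 * n / load               # 12-byte slots, most of them empty

def chained_bytes(n, load):
    """Pointer-chained: a first-pointer vector plus 16-byte nodes."""
    return 4 * n / load + 16 * n

def byte_chained_bytes(n, load):
    """Byte-linked variant: 1-byte 'first'/'next' indexes (<= 256 entries)."""
    return 1 * n / load + (12 + 1) * n

n = 100
# Tim's 22*N-to-28*N range for pointer chaining at 2/3-to-1/3 full:
assert abs(chained_bytes(n, 2 / 3) - 22 * n) < 1e-6
assert abs(chained_bytes(n, 1 / 3) - 28 * n) < 1e-6
# The byte-linked table can afford to be very sparse: even at 10% load it
# is no bigger than the current scheme at its sparsest (1/3 full) point.
assert byte_chained_bytes(n, 0.10) <= open_addressing_bytes(n, 1 / 3)
```

With these assumed sizes the break-even load factor lands in the 0.05-to-0.2 range depending on which end of the current scheme's 18*N-to-36*N span you compare against, which is consistent with Damien's "about 0.1".]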
On the third hand, first-miss and first-hit lookups are simple enough that they could easily be put into a macro.

I will need to take a closer look at Oren Tirosh's fastnames patch.

I have a question that someone may be able to answer:

There seem to be two different ways to get/set/del from a dictionary.

The first is using PyDict_[Get|Set|Del]Item()

The second is using the embarrassingly named dict_ass_sub() and its partner dict_subscript().

Which of these two access methods is most likely to be used?

From lkcl@samba-tng.org Thu May 15 22:44:18 2003
From: lkcl@samba-tng.org (Luke Kenneth Casson Leighton)
Date: Thu, 15 May 2003 21:44:18 +0000
Subject: [Python-Dev] [PEP] += on return of function call result
In-Reply-To:
References: <20030402090726.GN1048@localhost>
Message-ID: <20030515214417.GF3900@localhost>

On Wed, Apr 02, 2003 at 09:54:35AM -0500, Andrew Koenig wrote:
> Luke> example code:
> Luke> log = {}
>
> Luke> for t in range(5):
> Luke>     for r in range(10):
> Luke>         log.setdefault(r, '') += "test %d\n" % t
>
> Luke> pprint(log)
>
> Luke> instead, as the above is not possible, the following must be used:
>
> Luke> from operator import add
>
> Luke> ...
> Luke> ...
> Luke> ...
>
> Luke> add(log.setdefault(r, ''), "test %d\n" % t)
>
> Luke> ... ARGH! just checked - NOPE! add doesn't work.
> Luke> and there's no function "radd" or "__radd__" in the
> Luke> operator module.
>
> Why can't you do this?
>
> for t in range(5):
>     for r in range(10):
>         foo = log.setdefault(r,'')
>         foo += "test %d\n" % t

after running this code, log = {0: '', 1: '', 2: '', 3: '' ... 9: ''} and foo equals "test 4\n".

if, however, you do this:

    for t in range(5):
        for r in range(10):
            foo = log.setdefault(r,[])
            foo.append("test %d\n" % t)

then empirically i conclude that you DO end up with the expected results (but is this true all of the time?)
the reason why your example, andrew, does not work, is because '' is a string - a basic type to which a pointer is NOT returned i presume that the foo += "test %d"... returns a DIFFERENT result object such that the string in the dictionary is DIFFERENT from the string result of foo being updated. if that makes absolutely no sense whatsoever then think of it being the difference between integers and pointers-to-integers in c. can anyone tell me if there are any PARTICULAR circumstances where foo = log.setdefault(r,[]) foo.append("test %d\n" % t) will FAIL to work as expected? andrew, sorry it took me so long to respond: i initially thought that under all circumstances for all types of foo, your example would work. l. -- -- expecting email to be received and understood is a bit like picking up the telephone and immediately dialing without checking for a dial-tone; speaking immediately without listening for either an answer or ring-tone; hanging up immediately and then expecting someone to call you (and to be able to call you). -- every day, people send out email expecting it to be received without being tampered with, read by other people, delayed or simply - without prejudice but lots of incompetence - destroyed. -- please therefore treat email more like you would a CB radio to communicate across the world (via relaying stations): ask and expect people to confirm receipt; send nothing that you don't mind everyone in the world knowing about... From guido@python.org Fri May 16 01:27:58 2003 From: guido@python.org (Guido van Rossum) Date: Thu, 15 May 2003 20:27:58 -0400 Subject: [Python-Dev] Simple dicts In-Reply-To: "Your message of Thu, 15 May 2003 18:25:13 EDT." <006301c31b30$da69e8e0$6401a8c0@damien> References: <006301c31b30$da69e8e0$6401a8c0@damien> Message-ID: <200305160027.h4G0Rwa17853@pcp02138704pcs.reston01.va.comcast.net> > There seem to be two different ways to get/set/del from a dictionary. 
> > The first is using PyDict_[Get|Set|Del]Item() This is the API that all C code uses (except code that doesn't know whether it's dealing with dicts or some other mapping, which has to use PyObject_GetItem() etc, which is even slower). > The second is using the embarrassingly named dict_ass_sub() and its > partner dict_subscript(). This is what PyObject_GetItem() calls. > Which of these two access methods is most likely to be used? That's a hard question. Maybe a profiler can answer. The thing is, there's a lot of C code that calls PyDict_GetItem() directly, e.g. the attribute lookup code. But of course there's also a lot of Python code using dicts. Yet, I'd bet on PyDict_*(). --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Fri May 16 01:32:19 2003 From: guido@python.org (Guido van Rossum) Date: Thu, 15 May 2003 20:32:19 -0400 Subject: [Python-Dev] Vacation; Python 2.2.3 release. Message-ID: <200305160032.h4G0WJx17890@pcp02138704pcs.reston01.va.comcast.net> I'm going on vacation tomorrow; I'll be in Holland for 10 days and will return to the US on May 26. I expect to have some email access but won't use it much. Now, I'd like Python 2.2.3 to be released soon. Barry has volunteered to be the release manager. I think it's pretty much ready to go out any time, except that Jeremy mentioned that he has a few things he'd like to backport; since Jeremy and Barry share an office I'm sure they can work this out. :-) I won't be disappointed if 2.2.3 hasn't been released yet when I'm back, but I won't be surprised if in fact it does go out while I'm gone -- it's ready, stick a fork in it! :-) --Guido van Rossum (home page: http://www.python.org/~guido/) From cnetzer@mail.arc.nasa.gov Fri May 16 01:56:43 2003 From: cnetzer@mail.arc.nasa.gov (Chad Netzer) Date: 15 May 2003 17:56:43 -0700 Subject: [Python-Dev] Vacation; Python 2.2.3 release.
In-Reply-To: <200305160032.h4G0WJx17890@pcp02138704pcs.reston01.va.comcast.net> References: <200305160032.h4G0WJx17890@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <1053046603.972.31.camel@sayge.arc.nasa.gov> On Thu, 2003-05-15 at 17:32, Guido van Rossum wrote: > Now, I'd like Python 2.2.3 to be released soon. Barry has volunteered > to be the release manager. Stupid question. Where can I get a prerelease (or CVS access) to 2.2.3, or a list of patches/features applied since 2.2.2? I looked around for the info, but apparently not hard enough (or I just don't understand CVS branching well enough). Chad From tim.one@comcast.net Fri May 16 02:06:27 2003 From: tim.one@comcast.net (Tim Peters) Date: Thu, 15 May 2003 21:06:27 -0400 Subject: [Python-Dev] Vacation; Python 2.2.3 release. In-Reply-To: <1053046603.972.31.camel@sayge.arc.nasa.gov> Message-ID: [Chad Netzer] > Stupid question. Where can I get a prerelease (or CVS access) to 2.2.3, > or a list of patches/features applied since 2.2.2? I looked around for > the info, but apparently not hard enough (or I just don't understand CVS > branching well enough). It's not a stupid question, it's a maddening feature of CVS that there's no place to store meta-data about branches. What you want to do is pass this argument to the checkout command: -r release22-maint There's no reasonable way you could have guessed that. The Misc/NEWS file in that branch summarizes the changes since 2.2.2 (at least those fixes that people bothered to make a NEWS entry for ). From guido@python.org Fri May 16 02:28:01 2003 From: guido@python.org (Guido van Rossum) Date: Thu, 15 May 2003 21:28:01 -0400 Subject: [Python-Dev] codeop: small details (Q); commit priv request In-Reply-To: "Your message of Mon, 12 May 2003 16:48:01 +0200."
<5.2.1.1.0.20030512140727.02362ab0@localhost> References: <5.2.1.1.0.20030512140727.02362ab0@localhost> Message-ID: <200305160128.h4G1S1U18083@pcp02138704pcs.reston01.va.comcast.net> > 1) > > Python 2.3b1 (#40, Apr 25 2003, 19:06:24) [MSC v.1200 32 bit (Intel)] on win32 > Type "help", "copyright", "credits" or "license" for more information. > >>> import codeop > >>> codeop.compile_command("",symbol="eval") > Traceback (most recent call last): > File "", line 1, in ? > File "s:\transit\py23\lib\codeop.py", line 129, in compile_command > return _maybe_compile(_compile, source, filename, symbol) > File "s:\transit\py23\lib\codeop.py", line 106, in _maybe_compile > raise SyntaxError, err1 > File "", line 1 > pass > ^ > SyntaxError: invalid syntax > > > the error is basically an artifact of the logic that enforces: > > compile_command("",symbol="single") === compile_command("pass",symbol="single") > > (this makes typing enter immediately after the prompt at a simulated shell > a nop as expected) > > I would expect > > compile_command("",symbol="eval") > > to return None, i.e. to simply signal an incomplete expression (that is > what would happen if the code for "eval" case would avoid the cited logic). Thanks for reporting this. I've fixed this by avoiding the change to "pass" when symbol == "eval". > 2) symbol = "exec" is silently accepted but the documentation intentionally > only refers to "exec" and "single" as valid values for symbol. Maybe a > ValueError should be raised. I don't know that that is intentional. I'd say that, like for the built-in compile(), the valid values for symbol should be "eval", "exec", and "single", and the docs ought to be updated (I didn't fix this). > Context: I was working on improving Jython codeop compatibility with > CPython codeop. Cool. 
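[Editor's note: the behaviour under discussion is easy to probe with the codeop module; the sketch below shows the eval-symbol contract as it stands in modern CPython, which includes the fix Guido describes — a complete expression compiles, an unfinished one signals "incomplete" by returning None, and the blank-line-means-pass logic applies only to "single".]

```python
import codeop

# A complete expression compiles to an evaluable code object...
code = codeop.compile_command("1 + 1", symbol="eval")
assert code is not None
assert eval(code) == 2

# ...while an obviously unfinished one returns None ("keep typing").
assert codeop.compile_command("(1 +", symbol="eval") is None

# For the default "single" symbol, a blank line compiles as a no-op,
# so hitting Enter at a simulated shell prompt does nothing.
assert codeop.compile_command("") is not None
```

This None-means-incomplete protocol is what code.InteractiveInterpreter relies on to decide between the `>>>` and `...` prompts.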
> Btw, as considered here by Guido > http://sourceforge.net/tracker/index.php?func=detail&aid=645404&group_id=5470&atid=305470 > I would ask to have commit privileges for CPython Barry has sworn you in by now. Welcome to the club! --Guido van Rossum (home page: http://www.python.org/~guido/) From Anthony Baxter Fri May 16 02:47:11 2003 From: Anthony Baxter (Anthony Baxter) Date: Fri, 16 May 2003 11:47:11 +1000 Subject: [Python-Dev] Vacation; Python 2.2.3 release. In-Reply-To: <200305160032.h4G0WJx17890@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <200305160147.h4G1lCa09066@localhost.localdomain> >>> Guido van Rossum wrote > Now, I'd like Python 2.2.3 to be released soon. Barry has volunteered > to be the release manager. I think it's pretty much ready to go out > any time, except that Jeremy mentioned that he has a few things he'd > like to backport; since Jeremy and Barry share an office I'm sure they > can work this out. :-) There's a bunch of cvs commit messages I've saved off as "potential branch-patches". I might try to get to them this weekend. -- Anthony Baxter It's never too late to have a happy childhood. From barry@python.org Fri May 16 03:04:57 2003 From: barry@python.org (Barry Warsaw) Date: 15 May 2003 22:04:57 -0400 Subject: [Python-Dev] Vacation; Python 2.2.3 release. In-Reply-To: <200305160032.h4G0WJx17890@pcp02138704pcs.reston01.va.comcast.net> References: <200305160032.h4G0WJx17890@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <1053050696.26479.35.camel@geddy> On Thu, 2003-05-15 at 20:32, Guido van Rossum wrote: > I'm going on vacation tomorrow; I'll be in Holland for 10 days and > will return to the US on May 26. I expect to have some email access > but won't use it much. > > Now, I'd like Python 2.2.3 to be released soon. Barry has volunteered > to be the release manager. 
I think it's pretty much ready to go out > any time, except that Jeremy mentioned that he has a few things he'd > like to backport; since Jeremy and Barry share an office I'm sure they > can work this out. :-) > > I won't be disappointed if 2.2.3 hasn't been released yet when I'm > back, but I won't be surprised if in fact it does go out while I'm > gone -- it's ready, stick a fork in it! :-) FWIW, I'm going to be around, and am fairly free during the US Memorial Day weekend 24th - 26th. Can we shoot for getting a release out that weekend? If we can code freeze by the 22nd, I can throw together a release candidate on Friday (with Tim's help for Windows) and a final by Monday. What do you folks think? -Barry From fdrake@acm.org Fri May 16 03:30:07 2003 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Thu, 15 May 2003 22:30:07 -0400 Subject: [Python-Dev] Vacation; Python 2.2.3 release. In-Reply-To: <1053050696.26479.35.camel@geddy> References: <200305160032.h4G0WJx17890@pcp02138704pcs.reston01.va.comcast.net> <1053050696.26479.35.camel@geddy> Message-ID: <16068.19759.176884.680744@grendel.zope.com> Barry Warsaw writes: > FWIW, I'm going to be around, and am fairly free during the US Memorial > Day weekend 24th - 26th. Can we shoot for getting a release out that > weekend? If we can code freeze by the 22nd, I can throw together a > release candidate on Friday (with Tim's help for Windows) and a final by > Monday. I'll be away that Friday through Tuesday, and don't expect any kind of internet/email access. -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From barry@python.org Fri May 16 03:43:09 2003 From: barry@python.org (Barry Warsaw) Date: 15 May 2003 22:43:09 -0400 Subject: [Python-Dev] Vacation; Python 2.2.3 release. 
In-Reply-To: <16068.19759.176884.680744@grendel.zope.com> References: <200305160032.h4G0WJx17890@pcp02138704pcs.reston01.va.comcast.net> <1053050696.26479.35.camel@geddy> <16068.19759.176884.680744@grendel.zope.com> Message-ID: <1053052989.26479.39.camel@geddy> On Thu, 2003-05-15 at 22:30, Fred L. Drake, Jr. wrote: > Barry Warsaw writes: > > FWIW, I'm going to be around, and am fairly free during the US Memorial > > Day weekend 24th - 26th. Can we shoot for getting a release out that > > weekend? If we can code freeze by the 22nd, I can throw together a > > release candidate on Friday (with Tim's help for Windows) and a final by > > Monday. > > I'll be away that Friday through Tuesday, and don't expect any kind of > internet/email access. So can we have all the doc changes in place before then, or should we freeze on Wednesday? -Barry From fdrake@acm.org Fri May 16 04:03:11 2003 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Thu, 15 May 2003 23:03:11 -0400 Subject: [Python-Dev] Vacation; Python 2.2.3 release. In-Reply-To: <1053052989.26479.39.camel@geddy> References: <200305160032.h4G0WJx17890@pcp02138704pcs.reston01.va.comcast.net> <1053050696.26479.35.camel@geddy> <16068.19759.176884.680744@grendel.zope.com> <1053052989.26479.39.camel@geddy> Message-ID: <16068.21743.727134.815827@grendel.zope.com> Barry Warsaw writes: > So can we have all the doc changes in place before then, or should we > freeze on Wednesday? We can probably have things done; there's only one big thing that needs to be back-ported in the docs. (I'm pretty sure we solved a fonts problem on the trunk; that fix really needs to be back-ported, but I'll have to spend a little time digging it out. This is really reason to separate the documentation processing tools from the doc tree.) Normally, the docs distributed with a release candidate are marked as being for the RC in the versioning; I could build both sets of packages ahead of time if we get the CVS tagging right. 
That would prevent any changes to the docs after the RC, which should be fine. We can deal with the mechanics of that next week, to the extent that anything much needs to happen. -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From mwh@python.net Fri May 16 11:47:51 2003 From: mwh@python.net (Michael Hudson) Date: Fri, 16 May 2003 11:47:51 +0100 Subject: [Python-Dev] Vacation; Python 2.2.3 release. In-Reply-To: <200305160147.h4G1lCa09066@localhost.localdomain> (Anthony Baxter's message of "Fri, 16 May 2003 11:47:11 +1000") References: <200305160147.h4G1lCa09066@localhost.localdomain> Message-ID: <2mn0hnyxpk.fsf@starship.python.net> Anthony Baxter writes: >>>> Guido van Rossum wrote >> Now, I'd like Python 2.2.3 to be released soon. Barry has volunteered >> to be the release manager. I think it's pretty much ready to go out >> any time, except that Jeremy mentioned that he has a few things he'd >> like to backport; since Jeremy and Barry share an office I'm sure they >> can work this out. :-) > > There's a bunch of cvs commit messages I've saved off as "potential > branch-patches". I might try to get to them this weekend. My python-bugfixes mbox is still online: http://starship.python.net/crew/mwh/python-bugfixes Some of it might still be relevant -- I haven't been that conscientious about keeping it up to date. Cheers, M. -- The PROPER way to handle HTML postings is to cancel the article, then hire a hitman to kill the poster, his wife and kids, and fuck his dog and smash his computer into little bits. Anything more is just extremism. -- Paul Tomblin, asr From ark@research.att.com Fri May 16 13:07:23 2003 From: ark@research.att.com (Andrew Koenig) Date: 16 May 2003 08:07:23 -0400 Subject: [Python-Dev] [PEP] += on return of function call result In-Reply-To: <20030515214417.GF3900@localhost> References: <20030402090726.GN1048@localhost> <20030515214417.GF3900@localhost> Message-ID: ark> Why can't you do this?
ark> for t in range(5): ark> for r in range(10): ark> foo = log.setdefault(r,'') ark> foo += "test %d\n" % t Luke> after running this code, Luke> log = {0: '', 1: '', 2:'', 3: '' ... 9: ''} Luke> and foo equals "test 5". Then that is what foo would be if you were able to write log.setdefault(r,'') += "test %d\n" % t as you had wished. Luke> if, however, you do this: Luke> for t in range(5): Luke> for r in range(10): Luke> foo = log.setdefault(r,[]) Luke> foo.append("test %d\n" % t) Luke> then empirically i conclude that you DO end up with the Luke> expected results (but is this true all of the time?) I presume that is because you are now dealing with vectors instead of strings. In that case, you could also have written for t in range(5): for r in range(10): foo = log.setdefault(r,[]) foo += ["test %d\n" % t] with the same effect. Luke> the reason why your example, andrew, does not work, is Luke> because '' is a string - a basic type to which a pointer is Luke> NOT returned i presume that the foo += "test %d"... returns a Luke> DIFFERENT result object such that the string in the dictionary Luke> is DIFFERENT from the string result of foo being updated. Well, yes. But that is what you would have gotten had you been allowed to write log.setdefault(r,"") += in the first place. Luke> if that makes absolutely no sense whatsoever then think of it Luke> being the difference between integers and pointers-to-integers Luke> in c. I think this analogy is pointless, as the only people who will understand it are those who didn't need it in the first place :-) Luke> can anyone tell me if there are any PARTICULAR circumstances where Luke> foo = log.setdefault(r,[]) Luke> foo.append("test %d\n" % t) Luke> will FAIL to work as expected? It will fail if your expectations are incorrect or unrealistic. Luke> andrew, sorry it took me so long to respond: i initially Luke> thought that under all circumstances for all types of foo, Luke> your example would work. But it does!
At least in the sense of the original query. The original query was of the form Why can't I write an expression like f(x) += y? and my answer was, in effect, If you could, it would have the same effect as if you had written foo = f(x) foo += y and then used the value of foo. Perhaps I'm missing something, but I don't think that anything you've said contradicts this answer. -- Andrew Koenig, ark@research.att.com, http://www.research.att.com/info/ark From jepler@unpythonic.net Fri May 16 13:34:42 2003 From: jepler@unpythonic.net (Jeff Epler) Date: Fri, 16 May 2003 07:34:42 -0500 Subject: [Python-Dev] [PEP] += on return of function call result In-Reply-To: References: <20030402090726.GN1048@localhost> <20030515214417.GF3900@localhost> Message-ID: <20030516123440.GA933@unpythonic.net> It seems almost within the bounds of possibility that pychecker could learn to find bugs of the form t = expression # t results from computation t += i # inplace op on (immutable/no-__iadd__) t del t # or t otherwise not used before function return by doing type and liveness analysis on t. (the type analysis being the hard part) Is there any time that the described situation would not be a bug? I can't see it. Jeff From lkcl@samba-tng.org Fri May 16 15:24:51 2003 From: lkcl@samba-tng.org (Luke Kenneth Casson Leighton) Date: Fri, 16 May 2003 14:24:51 +0000 Subject: [Python-Dev] [PEP] += on return of function call result In-Reply-To: References: <20030402090726.GN1048@localhost> <20030515214417.GF3900@localhost> Message-ID: <20030516142451.GI6196@localhost> On Fri, May 16, 2003 at 08:07:23AM -0400, Andrew Koenig wrote: > ark> Why can't you do this? > > ark> for t in range(5): > ark> for r in range(10): > ark> foo = log.setdefault(r,'') > ark> foo += "test %d\n" % t > > Luke> after running this code, > > Luke> log = {0: '', 1: '', 2:'', 3: '' ... 9: ''} > > Luke> and foo equals "test 5". 
> > Then that is what foo would be if you were able to write > > log.setdefault(r,'') += "test %d\n" % t > > as you had wished. hmm... ..mmmm... you're absolutely right!!! > Luke> if, however, you do this: > > Luke> for t in range(5): > Luke> for r in range(10): > Luke> foo = log.setdefault(r,[]) > Luke> foo.append("test %d\n" % t) > > Luke> then empirically i conclude that you DO end up with the > Luke> expected results (but is this true all of the time?) > > I presume that is because you are now dealing with vectors instead > of strings. In that case, you could also have written > > for t in range(5): > for r in range(10): > foo = log.setdefault(r,[]) > foo += ["test %d]n" % t] > > with the same effect. > > Luke> the reason why your example, andrew, does not work, is > Luke> because '' is a string - a basic type to which a pointer is > Luke> NOT returned i presume that the foo += "test %d"... returns a > Luke> DIFFERENT result object such that the string in the dictionary > Luke> is DIFFERENT from the string result of foo being updated. > > Well, yes. But that is what you would have gotten had you been allowed > to write > > log.setdefault(r,"") += > > in the first place. [i oversimplified in the example, leading to all the communication problems. the _actual_ usage i was expecting is based on {}.setdefault(0, []) += [1,2] rather than setdefault(0, '') += 'hh' ] > Luke> can anyone tell me if there are any PARTICULAR circumstances where > > Luke> foo = log.setdefault(r,[]) > Luke> foo.append("test %d\n" % t) > > Luke> will FAIL to work as expected? > > It will fail if your expectations are incorrect or unrealistic. ... that sounds like a philosophical or "undefined" answer rather than the technical one i was seeking ... 
but it is actually quite a _useful_ answer :) to put the question in a different way, or to say again, to put a different, more specific, question: can anyone tell me if there are circumstances under which the second argument from setdefault will SOMETIMES be copied instead of returned and SOMETIMES be returned as-is, such that operations of the type being attempted will SOMETIMES work as expected and SOMETIMES not? > Luke> andrew, sorry it took me so long to respond: i initially > Luke> thought that under all circumstances for all types of foo, > Luke> your example would work. > > But it does! At least in the sense of the original query. where the sense was mistaken; consequently the results are, as you rightly pointed out, not as expected. > The original query was of the form > > Why can't I write an expression like f(x) += y? > > and my answer was, in effect, > > If you could, it would have the same effect as if you had written > > foo = f(x) > foo += y > > and then used the value of foo. > > Perhaps I'm missing something, but I don't think that anything you've said > contradicts this answer. based on this clarification, my queries are two-fold: 1) what is the technical, syntactical or language-specific reason why I can't write an expression like f(x) += y ? 2) the objections that i can see as to why f(x) += y should not be _allowed_ to work are that, as andrew points out, some people's expectations of any_function() += y may be unrealistic, particularly as normally the result of a function is discarded. however, in the case of setdefault() and get() on dictionaries, the result of the function is NOT discarded: in SOME instances, a reference is returned to the dictionary item. under such circumstances, why should the objections - to disallow {}.setdefault(0, []) += [] or {}.get([]) += [] - stand? l.
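[Editor's note: on Luke's specific question — in CPython, setdefault never returns a copy. It always hands back the very object stored in the dictionary, whether it inserted the default just now or found an existing value. The difference in outcomes comes entirely from whether `+=` mutates that object in place (lists) or rebinds the name to a new object (strings). A quick check:]

```python
d = {}

# setdefault returns the object actually stored in the dict, never a copy.
v = d.setdefault('k', [])
assert v is d['k']
assert d.setdefault('k', []) is v     # a second call returns the same object

v += [1]                              # lists support in-place add: mutates d['k']
assert d['k'] == [1]

s = d.setdefault('s', '')
assert s is d['s']
s += 'x'                              # str is immutable: += rebinds s only
assert d['s'] == ''                   # the stored value is unchanged
```

So `foo = log.setdefault(r, []); foo.append(...)` works in all circumstances; it is only the immutable-value pattern that silently goes nowhere.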
From tim@zope.com Fri May 16 15:30:37 2003 From: tim@zope.com (Tim Peters) Date: Fri, 16 May 2003 10:30:37 -0400 Subject: [Python-Dev] test_urllibnet failing on Windows Message-ID: I'm not familiar with this test. In a release build: """ C:\Code\python\PCbuild>python ../lib/test/test_urllibnet.py testURLread (__main__.URLTimeoutTest) ... ok test_bad_address (__main__.urlopenNetworkTests) ... ok test_basic (__main__.urlopenNetworkTests) ... ok test_fileno (__main__.urlopenNetworkTests) ... ERROR test_geturl (__main__.urlopenNetworkTests) ... ok test_info (__main__.urlopenNetworkTests) ... ok test_readlines (__main__.urlopenNetworkTests) ... ok test_basic (__main__.urlretrieveNetworkTests) ... ok test_header (__main__.urlretrieveNetworkTests) ... ok test_specified_path (__main__.urlretrieveNetworkTests) ... ok ====================================================================== ERROR: test_fileno (__main__.urlopenNetworkTests) ---------------------------------------------------------------------- Traceback (most recent call last): File "../lib/test/test_urllibnet.py", line 91, in test_fileno FILE = os.fdopen(fd) OSError: (0, 'Error') ---------------------------------------------------------------------- Ran 10 tests in 7.081s FAILED (errors=1) Traceback (most recent call last): File "../lib/test/test_urllibnet.py", line 149, in ? test_main() File "../lib/test/test_urllibnet.py", line 146, in test_main urlretrieveNetworkTests) File "C:\Code\python\lib\test\test_support.py", line 259, in run_unittest run_suite(suite, testclass) File "C:\Code\python\lib\test\test_support.py", line 247, in run_suite raise TestFailed(err) test.test_support.TestFailed: Traceback (most recent call last): File "../lib/test/test_urllibnet.py", line 91, in test_fileno FILE = os.fdopen(fd) OSError: (0, 'Error') """ In a debug build: """ C:\Code\python\PCbuild>python_d ../lib/test/test_urllibnet.py testURLread (__main__.URLTimeoutTest) ... ok test_bad_address (__main__.urlopenNetworkTests) ... 
ok test_basic (__main__.urlopenNetworkTests) ... ok test_fileno (__main__.urlopenNetworkTests) ... """ and there it dies with an assertion error in the bowels of Microsoft's fdopen.c. That's called by Python's posix_fdopen, here: fp = fdopen(fd, mode); At this point, fd is 436. MS's fdopen is unhappy because only 32 handles actually exist at this point, and 436 is bigger than that. In the release build, the MS assert doesn't (of course) trigger; instead, that 436 >= 32 causes MS's fdopen to return NULL. From skip@pobox.com Fri May 16 15:41:22 2003 From: skip@pobox.com (Skip Montanaro) Date: Fri, 16 May 2003 09:41:22 -0500 Subject: [Python-Dev] test_urllibnet failing on Windows In-Reply-To: References: Message-ID: <16068.63634.915049.706423@montanaro.dyndns.org> Tim> I'm not familiar with this test. Tim> In a release build: ... Brett added a bunch of tests to that file the other day. I imagine he'll take a look when the sun comes up on the west coast. Skip From guido@python.org Fri May 16 16:02:59 2003 From: guido@python.org (Guido van Rossum) Date: Fri, 16 May 2003 11:02:59 -0400 Subject: [Python-Dev] test_urllibnet failing on Windows In-Reply-To: "Your message of Fri, 16 May 2003 10:30:37 EDT." References: Message-ID: <200305161502.h4GF2xk20132@pcp02138704pcs.reston01.va.comcast.net> > I'm not familiar with this test. Me neither, but Brett C should be. :-) > In a debug build: > > """ > C:\Code\python\PCbuild>python_d ../lib/test/test_urllibnet.py > testURLread (__main__.URLTimeoutTest) ... ok > test_bad_address (__main__.urlopenNetworkTests) ... ok > test_basic (__main__.urlopenNetworkTests) ... ok > test_fileno (__main__.urlopenNetworkTests) ... > """ > > and there it dies with an assertion error in the bowels of Microsoft's > fdopen.c. That's called by Python's posix_fdopen, here: > > fp = fdopen(fd, mode); > > At this point, fd is 436. MS's fdopen is unhappy because only 32 > handles actually exist at this point, and 436 is bigger than that. 
> In the release build, the MS assert doesn't (of course) trigger; > instead, that 436 >= 32 causes MS's fdopen to return NULL. The test assumes that the fileno() from a socket object can be passed to os.fdopen(). That works on Unix. But on Windows it cannot: the small ints used to refer to open files are chosen from a different (though potentially overlapping) space than the small ints used to refer to open sockets, and the two cannot be mixed. So the test should be disabled on Windows. I don't know if we can protect os.fdopen() from crashing when passed an out of range number. --Guido van Rossum (home page: http://www.python.org/~guido/) From barry@python.org Fri May 16 16:09:05 2003 From: barry@python.org (Barry Warsaw) Date: 16 May 2003 11:09:05 -0400 Subject: [Python-Dev] Vacation; Python 2.2.3 release. In-Reply-To: <2mn0hnyxpk.fsf@starship.python.net> References: <200305160147.h4G1lCa09066@localhost.localdomain> <2mn0hnyxpk.fsf@starship.python.net> Message-ID: <1053097745.1849.26.camel@barry> On Fri, 2003-05-16 at 06:47, Michael Hudson wrote: > > There's a bunch of cvs commit messages I've saved off as "potential > > branch-patches". I might try to get to them this weekend. > > My python-bugfixes mbox is still online: > > http://starship.python.net/crew/mwh/python-bugfixes > > Some of it might still be relevant -- I haven't been that conscientious > about keeping it up to date. I definitely do not have the time to triage or apply backports. I think you'll have to use your own judgment, tempered by your available time, to decide which patches should be backported. Guido obviously thinks 2.2.3 is ready now so you should prioritize, but be conservative. If you have specific questions, python-dev is the place to ask.
-Barry From tim@zope.com Fri May 16 16:33:24 2003 From: tim@zope.com (Tim Peters) Date: Fri, 16 May 2003 11:33:24 -0400 Subject: [Python-Dev] test_urllibnet failing on Windows In-Reply-To: <200305161502.h4GF2xk20132@pcp02138704pcs.reston01.va.comcast.net> Message-ID: [Guido] > The test assumes that the fileno() from a socket object can be passed > to os.fdopen(). Yup, Jeremy figured that out here. I have a patch waiting to go, but SF isn't cooperating. > That works on Unix. But on Windows it cannot, the small ints used to > refer to open files are chosen from a different (though potentially > overlapping) space than the small ints used to refer to open sockets, > and the two cannot be mixed. Just so. > So the test should be disabled on Windows. > > I don't know if we can protect os.fdopen() from crashing when passed > an out of range number. This is an issue only in the MSVC debug build. The release-build MS libraries *still* explicitly check for out-of-range, and arrange for an error return when it is out of range. I really don't understand why they're asserting in-range in their debug build libraries, because nothing in *their* code assumes the fd is in-range -- their code is defensive enough in the release build that nothing bad will happen even when it is out of range. From barry@python.org Fri May 16 16:44:44 2003 From: barry@python.org (Barry Warsaw) Date: 16 May 2003 11:44:44 -0400 Subject: [Python-Dev] [development doc updates] In-Reply-To: <20030516153518.68B1B18EC13@grendel.zope.com> References: <20030516153518.68B1B18EC13@grendel.zope.com> Message-ID: <1053099884.1849.49.camel@barry> On Fri, 2003-05-16 at 11:35, Fred L. Drake wrote: > The development version of the documentation has been updated: > > http://www.python.org/dev/doc/devel/ > > Various updates, including Jim Fulton's efforts on updating the Extending & > Embedding manual. 
I think this one's gonna make my Python quotes file: "So, if you want to define a new object type, you need to create a new type object." :) -Barry From jim@zope.com Fri May 16 16:46:19 2003 From: jim@zope.com (Jim Fulton) Date: Fri, 16 May 2003 11:46:19 -0400 Subject: [Python-Dev] C new-style classes and GC Message-ID: <3EC507CB.6080502@zope.com> Lately I've been re-learning how to write new types in C. Things changed drastically (for the better) in 2.2. I've been updating the documentation on writing new types as I go: http://www.python.org/dev/doc/devel/ext/defining-new-types.html (I'm also updating modulator.) I'm starting to try to figure out how to integrate support for GC. The current documentation in the section "Supporting the Cycle Collector" doesn't reflect new-style types and is, thus, out of date. Frankly, I'm taking the approach that there is only One Way to create types in C, the new way, based on new-style types as now documented in the manual. I'll also note that most new-style types don't need and thus don't implement custom allocators. They leave the tp_alloc and tp_free slots empty. So given that we have a new style type, to add support for GC, we need to: - Set the Py_TPFLAGS_HAVE_GC type flag, - Provide implementations of tp_traverse and tp_clear, as described in the section "Supporting the Cycle Collector" section of the docs. - Call PyObject_GC_UnTrack at the beginning of the deallocator, before decrefing any members. I think that that is *all* we have to do. In particular, since we have a new style type that inherits the standard allocator, we don't need to fool with PyObject_GC_New, and PyObject_GC_DEL, because the default tp_alloc and tp_free take care of that for us. Similarly, we don't need to call PyObject_GC_Track, because that is done by the default allocator. (Because of that, our traverse function has to check for null object pointers in our object's members.) Did I get this right? 
I intend to update the docs to reflect this understanding (or a corrected one, of course). Jim -- Jim Fulton mailto:jim@zope.com Python Powered! CTO (703) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From mwh@python.net Fri May 16 17:03:10 2003 From: mwh@python.net (Michael Hudson) Date: Fri, 16 May 2003 17:03:10 +0100 Subject: [Python-Dev] [development doc updates] In-Reply-To: <1053099884.1849.49.camel@barry> (Barry Warsaw's message of "16 May 2003 11:44:44 -0400") References: <20030516153518.68B1B18EC13@grendel.zope.com> <1053099884.1849.49.camel@barry> Message-ID: <2m7k8qkhfl.fsf@starship.python.net> Barry Warsaw writes: > On Fri, 2003-05-16 at 11:35, Fred L. Drake wrote: >> The development version of the documentation has been updated: >> >> http://www.python.org/dev/doc/devel/ >> >> Various updates, including Jim Fulton's efforts on updating the Extending & >> Embedding manual. > > I think this one's gonna make my Python quotes file: > > "So, if you want to define a new object type, you need to create a new > type object." > > :) That's been there since rev 1.1, which I actually wrote... Cheers, M. -- For their next act, they'll no doubt be buying a firewall running under NT, which makes about as much sense as building a prison out of meringue. -- -:Tanuki:- -- http://home.xnet.com/~raven/Sysadmin/ASR.Quotes.html From fdrake@acm.org Fri May 16 17:11:58 2003 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Fri, 16 May 2003 12:11:58 -0400 Subject: [Python-Dev] [development doc updates] In-Reply-To: <2m7k8qkhfl.fsf@starship.python.net> References: <20030516153518.68B1B18EC13@grendel.zope.com> <1053099884.1849.49.camel@barry> <2m7k8qkhfl.fsf@starship.python.net> Message-ID: <16069.3534.513541.138020@grendel.zope.com> Michael Hudson writes: > That's been there since rev 1.1, which I actually wrote... That explains why it sounded vaguely familiar. 
;-) I have actually read the version you wrote, and didn't find that sentence in need of changing. Perhaps Barry is too easily amused? ;-) -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From mwh@python.net Fri May 16 17:32:53 2003 From: mwh@python.net (Michael Hudson) Date: Fri, 16 May 2003 17:32:53 +0100 Subject: [Python-Dev] [development doc updates] In-Reply-To: <16069.3534.513541.138020@grendel.zope.com> ("Fred L. Drake, Jr."'s message of "Fri, 16 May 2003 12:11:58 -0400") References: <20030516153518.68B1B18EC13@grendel.zope.com> <1053099884.1849.49.camel@barry> <2m7k8qkhfl.fsf@starship.python.net> <16069.3534.513541.138020@grendel.zope.com> Message-ID: <2m4r3ukg22.fsf@starship.python.net> "Fred L. Drake, Jr." writes: > Michael Hudson writes: > > That's been there since rev 1.1, which I actually wrote... > > That explains why it sounded vaguely familiar. ;-) I have actually > read the version you wrote, and didn't find that sentence in need of > changing. I must have been in a fairly odd mood when I wrote it -- "A PyObject is not a very magnificent object" is one of mine, too. > Perhaps Barry is too easily amused? ;-) I don't think documentation should be disallowed from being entertaining :-) (or short, but that's a different rant) Cheers, M. -- Hey, if I thought I was wrong, I'd change my mind. :) -- Grant Edwards, comp.lang.python From jeremy@zope.com Fri May 16 17:42:03 2003 From: jeremy@zope.com (Jeremy Hylton) Date: 16 May 2003 12:42:03 -0400 Subject: [Python-Dev] C new-style classes and GC In-Reply-To: <3EC507CB.6080502@zope.com> References: <3EC507CB.6080502@zope.com> Message-ID: <1053103323.456.71.camel@slothrop.zope.com> On Fri, 2003-05-16 at 11:46, Jim Fulton wrote: > So given that we have a new style type, to add support for GC, we need > to: > > - Set the Py_TPFLAGS_HAVE_GC type flag, > > - Provide implementations of tp_traverse and tp_clear, as described in > the section "Supporting the Cycle Collector" section of the docs. 
> > - Call PyObject_GC_UnTrack at the beginning of the deallocator, > before decrefing any members. > > I think that that is *all* we have to do. > > In particular, since we have a new style type that inherits the > standard allocator, we don't need to fool with PyObject_GC_New, and > PyObject_GC_DEL, because the default tp_alloc and tp_free take care of > that for us. Similarly, we don't need to call PyObject_GC_Track, > because that is done by the default allocator. (Because of that, our > traverse function has to check for null object pointers in our > object's members.) It depends on how the objects are used in C code. I've upgraded a lot of C extensions to make their types collectable recently. In several cases, it was necessary to change PyObject_New to PyObject_GC_New and add a PyObject_GC_Track. I think the docs ought to explain how to do this. It's not clear to me what the one right way to implement a tp_dealloc slot is. I've seen two common patterns in the Python source: call obj->ob_type->tp_free or call PyObject_GC_Del. The type object initializes tp_free to PyObject_GC_Del, so in most cases the two spellings are equivalent. Calling PyObject_GC_Del feels more straightforward to me. This question isn't specific to GC. Perhaps it's a question of what tp_free is used for and when it should be called. Pure-Python classes and instances have tp_dealloc implementations that call tp_free. I'm not sure if that's a generic recommendation for all types written in C. > Did I get this right? I intend to update the docs to reflect this > understanding (or a corrected one, of course). The three items you listed were sufficient for all the types I've worked on, excepting the issues I noted above.
Jeremy From jim@zope.com Fri May 16 18:08:34 2003 From: jim@zope.com (Jim Fulton) Date: Fri, 16 May 2003 13:08:34 -0400 Subject: [Python-Dev] C new-style classes and GC In-Reply-To: <1053103323.456.71.camel@slothrop.zope.com> References: <3EC507CB.6080502@zope.com> <1053103323.456.71.camel@slothrop.zope.com> Message-ID: <3EC51B12.8070407@zope.com> Jeremy Hylton wrote: > On Fri, 2003-05-16 at 11:46, Jim Fulton wrote: > >>So given that we have a new style type, to add support for GC, we need >>to: >> >>- Set the Py_TPFLAGS_HAVE_GC type flag, >> >>- Provide implementations of tp_traverse and tp_clear, as described in >> the section "Supporting the Cycle Collector" section of the docs. >> >>- Call PyObject_GC_UnTrack at the beginning of the deallocator, >> before decrefing any members. >> >>I think that that is *all* we have to do. >> >>In particular, since we have a new style type that inherits the >>standard allocator, we don't need to fool with PyObject_GC_New, and >>PyObject_GC_DEL, because the default tp_alloc and tp_free take care of >>that for us. Similarly, we don't need to call PyObject_GC_Track, >>because that is done by the default allocator. (Because of that, our >>traverse function has to check for null object pointers in our >>object's members.) > > > It depends on how the objects are used in C code. I've upgraded a lot > of C extensions to make their types collectable recently. In several > cases, it was necessary to change PyObject_New to PyObject_GC_New and > add a PyObject_GC_Track. I think the docs ought to explain how to do > this. If you write types the New Way, there are no PyObject_New calls and no need to call PyObject_GC_Track. > It's not clear to me what the one right way to implement a tp_dealloc > slot is. I've seen two common patterns in the Python source: call > obj->ob_type->tp_free or call PyObject_GC_Del. The type object > initializes tp_free to PyObject_GC_Del, so in most cases the two > spellings are equivalent. 
Calling PyObject_GC_Del feels more > straightforward to me. You need to call obj->ob_type->tp_free to support subclassing. I suggest that every new type should call obj->ob_type->tp_free as a matter of course. Jim -- Jim Fulton mailto:jim@zope.com Python Powered! CTO (703) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From mal@lemburg.com Fri May 16 18:09:48 2003 From: mal@lemburg.com (M.-A. Lemburg) Date: Fri, 16 May 2003 19:09:48 +0200 Subject: [Python-Dev] Startup time In-Reply-To: <1052927757.7258.38.camel@slothrop.zope.com> References: <1052927757.7258.38.camel@slothrop.zope.com> Message-ID: <3EC51B5C.2080307@lemburg.com> Jeremy Hylton wrote: > The use of re in the warnings module seems the primary culprit, since it > pulls in re, sre and friends, string, and strop. FWIW, I've removed the re usage from encodings/__init__.py. Could you check whether this makes a difference in startup time now ? Thanks, -- Marc-Andre Lemburg eGenix.com Professional Python Software directly from the Source (#1, May 16 2003) >>> Python/Zope Products & Consulting ... http://www.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ EuroPython 2003, Charleroi, Belgium: 39 days left From barry@python.org Fri May 16 18:21:04 2003 From: barry@python.org (Barry Warsaw) Date: 16 May 2003 13:21:04 -0400 Subject: [Python-Dev] [development doc updates] In-Reply-To: <2m4r3ukg22.fsf@starship.python.net> References: <20030516153518.68B1B18EC13@grendel.zope.com> <1053099884.1849.49.camel@barry> <2m7k8qkhfl.fsf@starship.python.net> <16069.3534.513541.138020@grendel.zope.com> <2m4r3ukg22.fsf@starship.python.net> Message-ID: <1053105659.2342.2.camel@barry> On Fri, 2003-05-16 at 12:32, Michael Hudson wrote: > I must have been in a fairly odd mood when I wrote it -- "A PyObject > is not a very magnificent object" is one of mine, too. 
That's the other one I enjoyed! > > Perhaps Barry is too easily amused? ;-) That may be true, but it /did/ have a nice lyrical quality to it. I'm not saying it should be changed! In fact we need more documentation like that. -Barry From mal@lemburg.com Fri May 16 18:25:04 2003 From: mal@lemburg.com (M.-A. Lemburg) Date: Fri, 16 May 2003 19:25:04 +0200 Subject: [Python-Dev] test_time fails with current CVS Message-ID: <3EC51EF0.3080701@lemburg.com> Just thought you'd like to know: test test_time failed -- Traceback (most recent call last): File "/home/lemburg/projects/Python/Dev-Python/Lib/test/test_time.py", line 107, in test_tzset self.failUnless(time.tzname[1] == 'AEDT', str(time.tzname[1])) File "/home/lemburg/projects/Python/Dev-Python/Lib/unittest.py", line 268, in failUnless if not expr: raise self.failureException, msg AssertionError: AEST In case it helps: I live on the northern hemisphere :-) BTW, the correct time zone names are: EAST and EADT -- perhaps that's what's causing the problem ? -- Marc-Andre Lemburg eGenix.com Professional Python Software directly from the Source (#1, May 16 2003) >>> Python/Zope Products & Consulting ... http://www.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ EuroPython 2003, Charleroi, Belgium: 39 days left From jeremy@zope.com Fri May 16 18:35:33 2003 From: jeremy@zope.com (Jeremy Hylton) Date: 16 May 2003 13:35:33 -0400 Subject: [Python-Dev] C new-style classes and GC In-Reply-To: <3EC51B12.8070407@zope.com> References: <3EC507CB.6080502@zope.com> <1053103323.456.71.camel@slothrop.zope.com> <3EC51B12.8070407@zope.com> Message-ID: <1053106533.453.78.camel@slothrop.zope.com> On Fri, 2003-05-16 at 13:08, Jim Fulton wrote: > If you write types the New Way, there are no PyObject_New calls and > no need to call PyObject_GC_Track. I don't follow.
There are plenty of types that are garbage collectable that also use PyObject_GC_New. One example is PyDict_New(). If something is widespread in the Python source tree (a common source of example code for programmers), it ought to be documented. > > It's not clear to me what the one right way to implement a tp_dealloc > > slot is. I've seen two common patterns in the Python source: call > > obj->ob_type->tp_free or call PyObject_GC_Del. The type object > > initializes tp_free to PyObject_GC_Del, so in most cases the two > > spellings are equivalent. Calling PyObject_GC_Del feels more > > straightforward to me. > > You need to call obj->ob_type->tp_free to support subclassing. > > I suggest that every new type should call obj->ob_type->tp_free > as a matter of course. If the type is going to support subclassing. Jeremy From guido@python.org Fri May 16 18:37:21 2003 From: guido@python.org (Guido van Rossum) Date: Fri, 16 May 2003 13:37:21 -0400 Subject: [Python-Dev] C new-style classes and GC In-Reply-To: "Your message of 16 May 2003 12:42:03 EDT." <1053103323.456.71.camel@slothrop.zope.com> References: <3EC507CB.6080502@zope.com> <1053103323.456.71.camel@slothrop.zope.com> Message-ID: <200305161737.h4GHbLl06562@pcp02138704pcs.reston01.va.comcast.net> > It's not clear to me what the one right way to implement a tp_dealloc > slot is. I've seen two common patterns in the Python source: call > obj->ob_type->tp_free or call PyObject_GC_Del. The type object > initializes tp_free to PyObject_GC_Del, so in most cases the two > spellings are equivalent. Calling PyObject_GC_Del feels more > straightforward to me. But calling tp_free is more correct. This allows a subclass to change the memory allocation policy. (This is also important if a base class is not collectible but a subclass is -- then it's essential that the base class dealloc handler calls tp_free.) 
--Guido van Rossum (home page: http://www.python.org/~guido/) From jeremy@zope.com Fri May 16 19:12:33 2003 From: jeremy@zope.com (Jeremy Hylton) Date: 16 May 2003 14:12:33 -0400 Subject: [Python-Dev] C new-style classes and GC In-Reply-To: <200305161737.h4GHbLl06562@pcp02138704pcs.reston01.va.comcast.net> References: <3EC507CB.6080502@zope.com> <1053103323.456.71.camel@slothrop.zope.com> <200305161737.h4GHbLl06562@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <1053108753.457.100.camel@slothrop.zope.com> On Fri, 2003-05-16 at 13:37, Guido van Rossum wrote: > > It's not clear to me what the one right way to implement a tp_dealloc > > slot is. I've seen two common patterns in the Python source: call > > obj->ob_type->tp_free or call PyObject_GC_Del. The type object > > initializes tp_free to PyObject_GC_Del, so in most cases the two > > spellings are equivalent. Calling PyObject_GC_Del feels more > > straightforward to me. > > But calling tp_free is more correct. This allows a subclass to change > the memory allocation policy. (This is also important if a base class > is not collectible but a subclass is -- then it's essential that the > base class dealloc handler calls tp_free.) There are dozens of objects in Python that do not call tp_free. For example, range objects have a tp_dealloc that is set to PyObject_Del(). Should we change those? Or should we say that it's okay to call PyObject_Del() and PyObject_GC_Del() from objects that are not intended to be subclassed? (patch pending :-) Jeremy From tim@zope.com Fri May 16 19:12:12 2003 From: tim@zope.com (Tim Peters) Date: Fri, 16 May 2003 14:12:12 -0400 Subject: [Python-Dev] C new-style classes and GC In-Reply-To: <200305161737.h4GHbLl06562@pcp02138704pcs.reston01.va.comcast.net> Message-ID: [Guido] > But calling tp_free is more correct. This allows a subclass to change > the memory allocation policy.
(This is also important if a base class > is not collectible but a subclass is -- then it's essential that the > base class dealloc handler calls tp_free.) I think the scoop is that cyclic gc got added before new-style classes, so at the time new-style classes were introduced *all* tp_dealloc slots for gc'able types called the gc del function directly. After that, I expect the only ones that got changed were those reviewed (lists, tuples, dicts, ...) as part of making test_descr.py's subclass-from-builtin tests work. Jeremy is rummaging thru current CVS now changing the others (frames, functions, ...). Does this count as a bugfix, i.e. should it be backported to 2.2.3? From guido@python.org Fri May 16 19:23:51 2003 From: guido@python.org (Guido van Rossum) Date: Fri, 16 May 2003 14:23:51 -0400 Subject: [Python-Dev] C new-style classes and GC In-Reply-To: "Your message of 16 May 2003 14:12:33 EDT." <1053108753.457.100.camel@slothrop.zope.com> References: <3EC507CB.6080502@zope.com> <1053103323.456.71.camel@slothrop.zope.com> <200305161737.h4GHbLl06562@pcp02138704pcs.reston01.va.comcast.net> <1053108753.457.100.camel@slothrop.zope.com> Message-ID: <200305161823.h4GINpR06675@pcp02138704pcs.reston01.va.comcast.net> > There are dozens of objects in Python that do not call tp_free. For > example, range object's have a tp_dealloc that is set to PyObject_Del(). > Should we change those? Or should we say that it's okay to call > PyObject_Del() and PyObject_GC_Del() from objects that are not intended > to be subclassed? If those types don't have the Py_TPFLAGS_BASETYPE flag set, they're okay. Otherwise they should be fixed. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Fri May 16 19:24:24 2003 From: guido@python.org (Guido van Rossum) Date: Fri, 16 May 2003 14:24:24 -0400 Subject: [Python-Dev] C new-style classes and GC In-Reply-To: "Your message of Fri, 16 May 2003 14:12:12 EDT." 
References: Message-ID: <200305161824.h4GIOOf06688@pcp02138704pcs.reston01.va.comcast.net> > Jeremy is rummaging thru current CVS now changing the others > (frames, functions, ...). Does this count as a bugfix, i.e. should > it be backported to 2.2.3? For the ones that are subclassable in 2.2.3, yes. --Guido van Rossum (home page: http://www.python.org/~guido/) From fdrake@acm.org Fri May 16 19:36:44 2003 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Fri, 16 May 2003 14:36:44 -0400 Subject: [Python-Dev] a strange case Message-ID: <16069.12220.217558.569689@grendel.zope.com> Here's a strange case we just ran across, led along by a typo in an import statement. This is using the head of the 2.2.x maintenance branch; I've not tested this against the trunk yet. >>> import os >>> class Foo(os): ... pass ... >>> Foo I suspect this isn't intentional behavior. ;-) -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From barry@python.org Fri May 16 19:44:39 2003 From: barry@python.org (Barry Warsaw) Date: 16 May 2003 14:44:39 -0400 Subject: [Python-Dev] a strange case In-Reply-To: <16069.12220.217558.569689@grendel.zope.com> References: <16069.12220.217558.569689@grendel.zope.com> Message-ID: <1053110679.2342.4.camel@barry> On Fri, 2003-05-16 at 14:36, Fred L. Drake, Jr. wrote: > Here's a strange case we just ran across, led along by a typo in an > import statement. This is using the head of the 2.2.x maintenance > branch; I've not tested this against the trunk yet. > > >>> import os > >>> class Foo(os): > ... pass > ... > >>> Foo > > > I suspect this isn't intentional behavior. ;-) No, it's not, and in 2.3 you get an error (albeit a TypeError with a rather unhelpful message). I guess the "fix" hasn't been backported. 
-Barry From jeremy@zope.com Fri May 16 19:50:00 2003 From: jeremy@zope.com (Jeremy Hylton) Date: 16 May 2003 14:50:00 -0400 Subject: [Python-Dev] a strange case In-Reply-To: <1053110679.2342.4.camel@barry> References: <16069.12220.217558.569689@grendel.zope.com> <1053110679.2342.4.camel@barry> Message-ID: <1053111000.456.111.camel@slothrop.zope.com> On Fri, 2003-05-16 at 14:44, Barry Warsaw wrote: > No, it's not, and in 2.3 you get an error (albeit a TypeError with a > rather unhelpful message). I guess the "fix" hasn't been backported. I think we decided this wasn't a pure bugfix :-). Some poor soul may have code that relies on being able to subclass a module. Jeremy From fred@zope.com Fri May 16 19:49:38 2003 From: fred@zope.com (Fred L. Drake, Jr.) Date: Fri, 16 May 2003 14:49:38 -0400 Subject: [Python-Dev] a strange case In-Reply-To: <16069.12220.217558.569689@grendel.zope.com> References: <16069.12220.217558.569689@grendel.zope.com> Message-ID: <16069.12994.728753.504190@grendel.zope.com> I wrote: > Here's a strange case we just ran across, led along by a typo in an > import statement. This is using the head of the 2.2.x maintenance > branch; I've not tested this against the trunk yet. Ok, the trunk does a little better, but the error message is a little confusing: Python 2.3b1+ (#2, May 16 2003, 14:42:51) [GCC 2.96 20000731 (Red Hat Linux 7.3 2.96-113)] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> import os >>> class Foo(os): ... pass ... Traceback (most recent call last): File "<stdin>", line 1, in ? TypeError: function takes at most 2 arguments (3 given) >>> class Foo(os, sys): ... pass ... Traceback (most recent call last): File "<stdin>", line 1, in ? TypeError: function takes at most 2 arguments (3 given) -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From fdrake@acm.org Fri May 16 19:56:47 2003 From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Fri, 16 May 2003 14:56:47 -0400 Subject: [Python-Dev] a strange case In-Reply-To: <1053111000.456.111.camel@slothrop.zope.com> References: <16069.12220.217558.569689@grendel.zope.com> <1053110679.2342.4.camel@barry> <1053111000.456.111.camel@slothrop.zope.com> Message-ID: <16069.13423.156806.769779@grendel.zope.com> Jeremy Hylton writes: > I think we decided this wasn't a pure bugfix :-). Some poor soul may > have code that relies on being able to subclass a module. I just played with one of these things; they're as vacuous as modules can possibly be! If anyone depends on this, they're insane. -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From skip@pobox.com Fri May 16 20:00:24 2003 From: skip@pobox.com (Skip Montanaro) Date: Fri, 16 May 2003 14:00:24 -0500 Subject: [Python-Dev] Startup time In-Reply-To: <3EC51B5C.2080307@lemburg.com> References: <1052927757.7258.38.camel@slothrop.zope.com> <3EC51B5C.2080307@lemburg.com> Message-ID: <16069.13640.892428.185711@montanaro.dyndns.org> mal> FWIW, I've removed the re usage from encodings/__init__.py. mal> Could you check whether this makes a difference in startup time mal> now? Well... Not really, but it's not your fault. site.py imports distutils.util which imports re. It does a fair amount of regex compiling, some at the module level, so deferring "import re" may take a couple minutes of work. Hang on... Okay, now re isn't imported. The only runtime difference between the two sets of times below is encodings/__init__.py 1.18 vs 1.19. Each set of times is for this command: % time ./python.exe -c pass Everything was already byte compiled. The times reported were the best of five. I tried to quiet the system as much as possible. Still, since the amount of work being done is minimal, it's tough to get a good feel for any differences.
version 1.18 (w/ re) real 0m0.143s user 0m0.030s sys 0m0.060s version 1.19 (no re) real 0m0.142s user 0m0.040s sys 0m0.060s Note the rather conspicuous lack of any difference. The only modifications to my Lib tree are these: M cgitb.py M warnings.py M distutils/util.py M encodings/__init__.py M test/test_bsddb185.py I verified that site was imported from my Lib tree: % ./python.exe -v -c pass 2>&1 | egrep 'site' # /Users/skip/src/python/head/dist/src/build.opt/../Lib/site.pyc matches /Users/skip/src/python/head/dist/src/build.opt/../Lib/site.py import site # precompiled from /Users/skip/src/python/head/dist/src/build.opt/../Lib/site.pyc # cleanup[1] site It would appear that the encodings stuff isn't getting imported on my platform (Mac OS X): % ./python.exe -v -c pass 2>&1 | egrep 'encoding' % Looking at site.py shows that the encodings package is only imported on win32 and only if the codecs.lookup() call fails. Oh well, we don't care about minority platforms. ;-) More seriously, to test your specific change someone will have to run the check on Windows. To contribute something maybe positive, here's the same timing comparison using my changed version of distutils.util vs CVS: CVS: real 0m0.155s user 0m0.050s sys 0m0.070s Changed (no module-level re import): real 0m0.143s user 0m0.070s sys 0m0.040s It appears eliminating "import re" has only a very small effect for me. It looks like an extra 6 modules are imported (25 vs 19). Skip From barry@python.org Fri May 16 20:09:31 2003 From: barry@python.org (Barry Warsaw) Date: 16 May 2003 15:09:31 -0400 Subject: [Python-Dev] Startup time In-Reply-To: <16069.13640.892428.185711@montanaro.dyndns.org> References: <1052927757.7258.38.camel@slothrop.zope.com> <3EC51B5C.2080307@lemburg.com> <16069.13640.892428.185711@montanaro.dyndns.org> Message-ID: <1053112171.2342.7.camel@barry> On Fri, 2003-05-16 at 15:00, Skip Montanaro wrote: > Well... Not really, but it's not your fault. Skip, you're going about this all wrong.
We already have the technology to start Python up blazingly fast. All you have to do is port XEmacs's unexec code. Then you load up Python with all the modules you think you're going to need, unexec it, then the next time it starts up like lightning. Disk space is cheap! -Barry From jeremy@zope.com Fri May 16 20:07:53 2003 From: jeremy@zope.com (Jeremy Hylton) Date: 16 May 2003 15:07:53 -0400 Subject: [Python-Dev] Startup time In-Reply-To: <16069.13640.892428.185711@montanaro.dyndns.org> References: <1052927757.7258.38.camel@slothrop.zope.com> <3EC51B5C.2080307@lemburg.com> <16069.13640.892428.185711@montanaro.dyndns.org> Message-ID: <1053112072.451.114.camel@slothrop.zope.com> On Fri, 2003-05-16 at 15:00, Skip Montanaro wrote: > mal> FWIW, I've removed the re usage from encodings/__init__.py. > > mal> Could you check whether this makes a difference in startup time > mal> now? > > Well... Not really, but it's not your fault. site.py imports > distutils.util which imports re. It does a fair amount of regex compiling, > some at the module level, so deferring "import re" may take a couple minutes > of work. Hang on... I don't think you need to do anything to distutils. In the case we care about (an installed Python) distutils.util isn't imported. Check this code in site.py: # Append ./build/lib. in case we're running in the build dir # (especially for Guido :-) # XXX This should not be part of site.py, since it is needed even when # using the -S option for Python. See http://www.python.org/sf/586680 if (os.name == "posix" and sys.path and os.path.basename(sys.path[-1]) == "Modules"): from distutils.util import get_platform s = "build/lib.%s-%.3s" % (get_platform(), sys.version) s = os.path.join(os.path.dirname(sys.path[-1]), s) sys.path.append(s) del get_platform, s Jeremy From amk@amk.ca Fri May 16 20:09:51 2003 From: amk@amk.ca (A.M.
Kuchling) Date: Fri, 16 May 2003 15:09:51 -0400 Subject: [Python-Dev] Re: Startup time In-Reply-To: <16069.13640.892428.185711@montanaro.dyndns.org> References: <1052927757.7258.38.camel@slothrop.zope.com> <3EC51B5C.2080307@lemburg.com> <16069.13640.892428.185711@montanaro.dyndns.org> Message-ID: Skip Montanaro wrote: > Well... Not really, but it's not your fault. site.py imports > distutils.util which imports re. Note that this doesn't apply to an installed Python; that import is only done when running the interpreter from the build directory. --amk From skip@pobox.com Fri May 16 20:16:09 2003 From: skip@pobox.com (Skip Montanaro) Date: Fri, 16 May 2003 14:16:09 -0500 Subject: [Python-Dev] a strange case In-Reply-To: <1053111000.456.111.camel@slothrop.zope.com> References: <16069.12220.217558.569689@grendel.zope.com> <1053110679.2342.4.camel@barry> <1053111000.456.111.camel@slothrop.zope.com> Message-ID: <16069.14585.371615.56117@montanaro.dyndns.org> Jeremy> On Fri, 2003-05-16 at 14:44, Barry Warsaw wrote: >> No, it's not, and in 2.3 you get an error (albeit a TypeError with a >> rather unhelpful message). I guess the "fix" hasn't been backported. Jeremy> I think we decided this wasn't a pure bugfix :-). Some poor Jeremy> soul may have code that relies on being able to subclass a Jeremy> module. How about at least deprecating that feature in 2.2.3 and warning about it so that poor soul knows this won't be supported forever? Skip From skip@pobox.com Fri May 16 20:19:03 2003 From: skip@pobox.com (Skip Montanaro) Date: Fri, 16 May 2003 14:19:03 -0500 Subject: [Python-Dev] Startup time In-Reply-To: <1053112072.451.114.camel@slothrop.zope.com> References: <1052927757.7258.38.camel@slothrop.zope.com> <3EC51B5C.2080307@lemburg.com> <16069.13640.892428.185711@montanaro.dyndns.org> <1053112072.451.114.camel@slothrop.zope.com> Message-ID: <16069.14759.862301.686434@montanaro.dyndns.org> Jeremy> I don't think you need to do anything to distutils. 
In the case Jeremy> we care about (an installed Python) distutils.util isn't Jeremy> imported. Check this code in site.py: Ah, thanks. That's the code I saw, but I didn't consider the preface comment. Skip From tim@zope.com Fri May 16 20:29:54 2003 From: tim@zope.com (Tim Peters) Date: Fri, 16 May 2003 15:29:54 -0400 Subject: [Python-Dev] C new-style classes and GC In-Reply-To: <3EC507CB.6080502@zope.com> Message-ID: [Jim Fulton] > ... > I'll also note that most new-style types don't need and thus don't > implement custom allocators. They leave the tp_alloc and tp_free slots > empty. I'm worried about half of that: tp_free is needed to release memory no matter whether obtained in a standard or custom way. I don't think tp_free slots always get filled in to something non-NULL by magic, and in the current Python source almost all new-style C types explicitly define a tp_free function (the exceptions are "strange" in some way). PEP 253 may be partly out of date here -- or not. In the section on creating a subclassable type, it says: """ The base type must do the following: - Add the flag value Py_TPFLAGS_BASETYPE to tp_flags. - Declare and use tp_new(), tp_alloc() and optional tp_init() slots. - Declare and use tp_dealloc() and tp_free(). - Export its object structure declaration. - Export a subtyping-aware type-checking macro. """ This doesn't leave a choice about defining tp_alloc() or tp_free() -- it says both are required. For a subclassable type, I believe both must actually be implemented too. For a non-subclassable type, I expect they're optional. But if you don't define tp_free in that case, then I believe you must also not do the obj->ob_type->tp_free(obj) business in the tp_dealloc slot (else it will segfault).
From jim@ZOPE.COM Fri May 16 20:33:29 2003 From: jim@ZOPE.COM (Jim Fulton) Date: Fri, 16 May 2003 15:33:29 -0400 Subject: [Python-Dev] Re: C new-style classes and GC In-Reply-To: <1053106533.453.78.camel@slothrop.zope.com> References: <3EC507CB.6080502@zope.com> <1053103323.456.71.camel@slothrop.zope.com> <3EC51B12.8070407@zope.com> <1053106533.453.78.camel@slothrop.zope.com> Message-ID: <3EC53D09.3050505@zope.com> Jeremy Hylton wrote: > On Fri, 2003-05-16 at 13:08, Jim Fulton wrote: > >>If you write types the New Way, there are no PyObject_New calls and >>no need to call PyObject_GC_Track. > > > I don't follow. There are plenty of types that are garbage collectable > that also use PyObject_GC_New. One example is PyDict_New(). If > something is widespread in the Python source tree (a common source of > example code for programmers), it ought to be documented. It is documented in the API reference. Perhaps the API reference should explain that there's a preferred way to do things. There should be one preferred way to write types. It just happens that that way is a *new* way and most existing types don't follow that way. In the how-to style manual, we should only document the one preferred way to write new types. We shouldn't describe all of the various obsolete variations. It's unfortunate that there aren't many examples of how to do things the new way, although that's understandable, since the new way wasn't documented until recently. Jim -- Jim Fulton mailto:jim@zope.com Python Powered!
CTO (703) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From skip@pobox.com Fri May 16 20:32:40 2003 From: skip@pobox.com (Skip Montanaro) Date: Fri, 16 May 2003 14:32:40 -0500 Subject: [Python-Dev] Startup time In-Reply-To: <1053112171.2342.7.camel@barry> References: <1052927757.7258.38.camel@slothrop.zope.com> <3EC51B5C.2080307@lemburg.com> <16069.13640.892428.185711@montanaro.dyndns.org> <1053112171.2342.7.camel@barry> Message-ID: <16069.15576.807563.525662@montanaro.dyndns.org> Barry> We already have the technology to start Python up blazingly fast. Barry> All you have to do is port XEmacs's unexec code. So Barry, how far along are you on this? We all know you're the XEmacs whiz of the Python crowd. ;-) DEFUN-ly, yr's, Skip From barry@python.org Fri May 16 20:40:55 2003 From: barry@python.org (Barry Warsaw) Date: 16 May 2003 15:40:55 -0400 Subject: [Python-Dev] Startup time In-Reply-To: <16069.15576.807563.525662@montanaro.dyndns.org> References: <1052927757.7258.38.camel@slothrop.zope.com> <3EC51B5C.2080307@lemburg.com> <16069.13640.892428.185711@montanaro.dyndns.org> <1053112171.2342.7.camel@barry> <16069.15576.807563.525662@montanaro.dyndns.org> Message-ID: <1053114055.2342.10.camel@barry> On Fri, 2003-05-16 at 15:32, Skip Montanaro wrote: > Barry> We already have the technology to start Python up blazingly fast. > Barry> All you have to do is port XEmacs's unexec code. > > So Barry, how far along are you on this? We all know you're the XEmacs whiz > of the Python crowd. ;-) Well, it's actually working pretty well and I'm about to cvs com.... ...oh! The cat's just eaten it. Sorry. 
bad-kitty-ly y'rs, -Barry From jeremy@zope.com Fri May 16 20:42:40 2003 From: jeremy@zope.com (Jeremy Hylton) Date: 16 May 2003 15:42:40 -0400 Subject: [Python-Dev] Re: C new-style classes and GC In-Reply-To: <3EC53D09.3050505@zope.com> References: <3EC507CB.6080502@zope.com> <1053103323.456.71.camel@slothrop.zope.com> <3EC51B12.8070407@zope.com> <1053106533.453.78.camel@slothrop.zope.com> <3EC53D09.3050505@zope.com> Message-ID: <1053114159.457.117.camel@slothrop.zope.com> I'm willing to believe there is a new and better way, but I don't think I know what it is. How do we change this code, written using the old PyObject_GC_New() to do things the new way? Jeremy PyObject * PyDict_New(void) { register dictobject *mp; if (dummy == NULL) { /* Auto-initialize dummy */ dummy = PyString_FromString(""); if (dummy == NULL) return NULL; #ifdef SHOW_CONVERSION_COUNTS Py_AtExit(show_counts); #endif } mp = PyObject_GC_New(dictobject, &PyDict_Type); if (mp == NULL) return NULL; EMPTY_TO_MINSIZE(mp); mp->ma_lookup = lookdict_string; #ifdef SHOW_CONVERSION_COUNTS ++created; #endif _PyObject_GC_TRACK(mp); return (PyObject *)mp; } From jim@zope.com Fri May 16 21:22:53 2003 From: jim@zope.com (Jim Fulton) Date: Fri, 16 May 2003 16:22:53 -0400 Subject: [Python-Dev] Re: C new-style classes and GC In-Reply-To: References: <3EC507CB.6080502@zope.com> Message-ID: <3EC5489D.9070208@zope.com> Tim Peters wrote: > [Jim Fulton] > >>... >>I'll also note that most new-style types don't need and thus don't >>implement custom allocators. They leave the tp_alloc and tp_free slots >>empty. > > > I'm worried about half of that: tp_free is needed to release memory no > matter whether obtained in a standard or custom way. I don't think tp_free > slots always get filled in to something non-NULL by magic, and in the > current Python source almost all new-style C types explicitly define a > tp_free function (the exceptions are "strange" in some way). > > PEP 253 may be partly out of date here -- or not. 
In the section on > creating a subclassable type, it says: > > """ > The base type must do the following: > > - Add the flag value Py_TPFLAGS_BASETYPE to tp_flags. > > - Declare and use tp_new(), tp_alloc() and optional tp_init() > slots. > > - Declare and use tp_dealloc() and tp_free(). > > - Export its object structure declaration. > > - Export a subtyping-aware type-checking macro. > """ > > This doesn't leave a choice about defining tp_alloc() or tp_free() -- it > says both are required. For a subclassable type, I believe both must > actually be implemented too. > > For a non-subclassable type, I expect they're optional. But if you don't > define tp_free in that case, then I believe you must also not do the > > obj->ob_type->tp_free(obj) > > business in the tp_dealloc slot (else it will segfault). Hm, I didn't read the PEP, I just went by what Guido told me. :) I was told that PyType_Ready fills in tp_alloc and tp_free with default values. I updated the noddy example in the docs. In this example, I filled in neither tp_alloc or tp_free. I tested the examples and verified that they work. I just added printf calls to verify that these slots are indeed null before the call to PyType_Ready and non-null afterwards. Jim -- Jim Fulton mailto:jim@zope.com Python Powered! CTO (703) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From jim@zope.com Fri May 16 21:30:47 2003 From: jim@zope.com (Jim Fulton) Date: Fri, 16 May 2003 16:30:47 -0400 Subject: [Python-Dev] Re: C new-style classes and GC In-Reply-To: <1053114159.457.117.camel@slothrop.zope.com> References: <3EC507CB.6080502@zope.com> <1053103323.456.71.camel@slothrop.zope.com> <3EC51B12.8070407@zope.com> <1053106533.453.78.camel@slothrop.zope.com> <3EC53D09.3050505@zope.com> <1053114159.457.117.camel@slothrop.zope.com> Message-ID: <3EC54A77.7090106@zope.com> Jeremy Hylton wrote: > I'm willing to believe there is a new and better way, but I don't think > I know what it is.
You can read the documentation for it here: http://www.python.org/dev/doc/devel/ext/defining-new-types.html :)

> How do we change this code, written using the old
> PyObject_GC_New() to do things the new way?
>
> Jeremy
>
> PyObject *
> PyDict_New(void)
> {
>     register dictobject *mp;
>     if (dummy == NULL) { /* Auto-initialize dummy */
>         dummy = PyString_FromString("");
>         if (dummy == NULL)
>             return NULL;
> #ifdef SHOW_CONVERSION_COUNTS
>         Py_AtExit(show_counts);
> #endif
>     }
>     mp = PyObject_GC_New(dictobject, &PyDict_Type);
>     if (mp == NULL)
>         return NULL;
>     EMPTY_TO_MINSIZE(mp);
>     mp->ma_lookup = lookdict_string;
> #ifdef SHOW_CONVERSION_COUNTS
>     ++created;
> #endif
>     _PyObject_GC_TRACK(mp);
>     return (PyObject *)mp;
> }

See dict_new in the same file. The new way to create instances of types is to call the types. I don't know why PyDict_New doesn't just call the dict type. Maybe doing things in-line like this is just an optimization. Jim -- Jim Fulton mailto:jim@zope.com Python Powered! CTO (703) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From jim@zope.com Fri May 16 22:08:27 2003 From: jim@zope.com (Jim Fulton) Date: Fri, 16 May 2003 17:08:27 -0400 Subject: [Python-Dev] Re: C new-style classes and GC In-Reply-To: <3EC507CB.6080502@zope.com> References: <3EC507CB.6080502@zope.com> Message-ID: <3EC5534B.5000102@zope.com> Jim Fulton wrote: > Lately I've been re-learning how to write new types in C. Things > changed drastically (for the better) in 2.2. I've been updating the > documentation on writing new types as I go: > > http://www.python.org/dev/doc/devel/ext/defining-new-types.html > > (I'm also updating modulator.) > > I'm starting to try to figure out how to integrate support for GC. > The current documentation in the section "Supporting the Cycle > Collector" doesn't reflect new-style types and is, thus, out of date.
> > Frankly, I'm taking the approach that there is only One Way to create > types in C, the new way, based on new-style types as now documented > in the manual. > > I'll also note that most new-style types don't need and thus don't > implement custom allocators. They leave the tp_alloc and tp_free slots > empty. > > So given that we have a new style type, to add support for GC, we need > to: > > - Set the Py_TPFLAGS_HAVE_GC type flag, > > - Provide implementations of tp_traverse and tp_clear, as described in > the section "Supporting the Cycle Collector" section of the docs. > > - Call PyObject_GC_UnTrack at the beginning of the deallocator, > before decrefing any members. > > I think that that is *all* we have to do. It looks like the answer is "no". :) I tried to write a type using this formula and segfaulted. Looking at other types, I found that if I want to support GC and am using the default allocator, which I get for free, I have to fill the tp_free slot with PyObject_GC_Del (_PyObject_GC_Del if I want to support Python 2.2 and 2.3). I *think* this is all I have to do. Jim -- Jim Fulton mailto:jim@zope.com Python Powered! CTO (703) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From tim@zope.com Fri May 16 22:37:07 2003 From: tim@zope.com (Tim Peters) Date: Fri, 16 May 2003 17:37:07 -0400 Subject: [Python-Dev] Re: C new-style classes and GC In-Reply-To: <3EC5489D.9070208@zope.com> Message-ID: [Jim Fulton] > Hm, I didn't read the PEP, I just went by what Guido told me. :) That's a good idea -- I think the PEP is out of date here. > I was told that PyType_Ready fills in tp_alloc and tp_free with > default values. And I finally found the code that does that . > I updated the noddy example in the docs. In this example, I filled > in neither tp_alloc or tp_free. I tested the examples and verified that > they work. 
> > I just added printf calls to verify that these slots are indeed > null before the call to PyType_Ready and non-null afterwards. This is the scoop: if your type does *not* define the tp_base or tp_bases slot, then PyType_Ready() sets your type's tp_base slot to &PyBaseObject_Type by magic (this is the C spelling of the type named "object" in Python), and the tp_bases slot to (object,) by magic. A whole pile of type slots are then inherited from whatever tp_bases points to after that (which is the singleton PyBaseObject_Type if you didn't set tp_base or tp_bases yourself). The tp_alloc slot it inherits from object is PyType_GenericAlloc. The tp_free slot it inherits from object is PyObject_Del. This works, but as we both discovered later, it leads to a segfault if your type participates in cyclic gc too: your type *still* inherits a tp_free of PyObject_Del from object then, but that's the wrong deallocation function for gc'able objects. However, the default tp_alloc is aware of gc, and does the right thing either way. Guido, would you be agreeable to making this magic even more magical? It seems to me that we can know whether the current type intends to participate in cyclic gc, and give it a correct default tp_free value instead if so. The hairier type_new() function already has this extra level of Py_TPFLAGS_HAVE_GC-dependent magic for dynamically created types, setting tp_free to PyObject_Del in one case and to PyObject_GC_Del in the other. PyType_Ready() can supply a wrong deallocation function by default ("explicit is better than implicit" has no force when talking about PyType_Ready() ). From tim@zope.com Fri May 16 22:43:51 2003 From: tim@zope.com (Tim Peters) Date: Fri, 16 May 2003 17:43:51 -0400 Subject: [Python-Dev] Re: C new-style classes and GC In-Reply-To: <3EC54A77.7090106@zope.com> Message-ID: [Jim Fulton] > ... > I don't know why PyDict_New doesn't just call the dict type. > Maybe doing things in-line like this is just an optimization.
All aspects of dict objects are indeed micro-optimized. In the office, Jeremy raised some darned good points about things in dictobject.c that don't make good sense anymore, but I'll skip those here because they don't relate to the topic at hand (in brief, "module initialization" for dictobject.c is still hiding inside PyDict_New, but there's no guarantee that *ever* gets called anymore). From troy@gci.net Fri May 16 22:45:25 2003 From: troy@gci.net (Troy Melhase) Date: Fri, 16 May 2003 13:45:25 -0800 Subject: [Python-Dev] a strange case In-Reply-To: <20030516202402.30333.72761.Mailman@mail.python.org> References: <20030516202402.30333.72761.Mailman@mail.python.org> Message-ID: <200305161345.25415.troy@gci.net> > Jeremy> I think we decided this wasn't a pure bugfix :-). Some poor > Jeremy> soul may have code that relies on being able to subclass a > Jeremy> module. > > How about at least deprecating that feature in 2.2.3 and warning about it > so that poor soul knows this won't be supported forever? I think I'm knocking on the poor-house door. Just last night, it occurred to me that modules could be made callable via subclassing. "Why in the world would you want callable modules you ask?" I don't have a real need, but I often see the line blurred between package, module, and class. Witness:

    from Foo import Bar
    frob = Bar()

If Bar is initially a class, then is reimplemented as a module, client code must change to account for that. If Bar is reimplemented as a callable module, clients remain unaffected. I haven't any code that relies on subclassing the module type, but many times I've gone thru the cycle of coding a class then promoting it to a module as it becomes more complex. I'm certainly not advocating that the module type be subclassable or not, but I did want to point out a possible legitimate need to derive from it. Many apologies if I'm wasting space and time.
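A minimal sketch of the same idea from another angle — an instance of a ModuleType subclass installed directly into sys.modules is both importable and callable, so the `frob = Bar()` call sites never change. The module name 'Bar' and its main() entry point here are hypothetical, for illustration only:

```python
import sys
import types

class CallableModule(types.ModuleType):
    """A module subtype whose instances can be called like a class."""
    def __call__(self, *args, **kw):
        # Delegate to a conventional entry point defined on the module.
        return self.main(*args, **kw)

# Build a stand-in module named 'Bar' (hypothetical) and register it.
bar = CallableModule('Bar')
bar.main = lambda x: x * 2
sys.modules['Bar'] = bar

import Bar       # import finds the instance already in sys.modules
print(Bar(21))   # call sites written against the old class keep working
```

Nothing here subclasses an actual module object; only the module *type* is subclassed, which is the distinction Phillip draws later in this thread.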
-troy

Silly example:

troy@marchhare tmp $ cat foo.py
def op():
    print 'foo op'

def frob():
    print 'foo frob'

def __call__(a, b, c):
    print 'module foo called!', a, b, c

troy@marchhare tmp $ cat bar.py
class ModuleObject(type(__builtins__)):
    def __init__(self, amodule):
        self.amodule = amodule
        self.__name__ = amodule.__name__
        self.__file__ = amodule.__file__
    def __getattr__(self, attr):
        return getattr(self.amodule, attr)
    def __call__(self, *a, **b):
        return self.amodule.__call__(*a, **b)

import foo
foo = ModuleObject(foo)
foo(1,2,3)

troy@marchhare tmp $ python2.3 bar.py
module foo called! 1 2 3

From drifty@alum.berkeley.edu Fri May 16 23:13:34 2003 From: drifty@alum.berkeley.edu (Brett C.) Date: Fri, 16 May 2003 15:13:34 -0700 Subject: [Python-Dev] test_urllibnet failing on Windows In-Reply-To: References: Message-ID: <3EC5628E.5060302@ocf.berkeley.edu> Tim Peters wrote: >>... >>The docs for fdopen say nothing about this restriction. Anyone mind if >>I add to the docs a mention of this limitation? > > > AFAICT, you only asked me, so I'll answer : Joys of missing the "reply all" button. I am cc'ing python-dev on this now. > I think this is better > spelled out in the docs for socket.fileno(). What it says now: > > Return the socket's file descriptor (a small integer). This is > useful with select.select(). > > is correct for Unix, but on Windows it does not return a file descriptor (it > returns a Windows socket handle, which is also "a small integer", and is > also useful with select.select() -- although on both Windows and Unix, > select.select() extracts the fileno() from socket objects automatically, so > there's no *need* to invoke fileno() explicitly in order to call select()). > OK. I will fix those docs. -Brett From drifty@alum.berkeley.edu Sat May 17 00:25:02 2003 From: drifty@alum.berkeley.edu (Brett C.)
Date: Fri, 16 May 2003 16:25:02 -0700 Subject: [Python-Dev] test_bsddb185 failing under OS X Message-ID: <3EC5734E.30209@ocf.berkeley.edu>

======================================================================
FAIL: test_anydbm_create (__main__.Bsddb185Tests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "test_bsddb185.py", line 35, in test_anydbm_create
    self.assertNotEqual(ftype, "bsddb185")
  File "/Users/drifty/cvs_code/lib/python2.3/unittest.py", line 300, in failIfEqual
    raise self.failureException, \
AssertionError: 'bsddb185' == 'bsddb185'

DBs are not my area of expertise so I don't know how to go about to attempt to fix this. -Brett
From tismer@tismer.com Sat May 17 00:52:20 2003 From: tismer@tismer.com (Christian Tismer) Date: Sat, 17 May 2003 01:52:20 +0200 Subject: [Python-Dev] Need advice, maybe support Message-ID: <3EC579B4.9000303@tismer.com> Hi Guido, all, In the last months, I made very much progress with Stackless 3.0 . Finally, I was able to make much more of Python stackless (which means, does not use recursive interpreter calls) than I could achieve with 1.0 . There is one drawback with this, and I need advice: Compared to older Python versions, Py 2.2.2 and up uses more indirection through C function pointers than ever. This blocked my implementation of stackless versions, in the first place. Then the idea hit me like a blizzard: Most problems simply vanish if I add another slot to the PyMethodDef structure, which is NULL by default: ml_meth_nr is a function pointer with the same semantics as ml_meth, but it tries to perform its action without doing a recursive call. It tries instead to push a frame and to return Py_UnwindToken. Doing this change made Stackless crystal clear and simple: A C extension not aware of Stackless does what it does all the time: call ml_meth. Stackless aware C code (like my modified ceval.c code) calls the ml_meth_nr slots, instead, which either defaults to the ml_meth code, or has a special version which avoids recursive interpreter calls. I also added a tp_call_nr slot to typeobject, for similar reasons. While this is just great for me, yielding complete source code compatibility, it is a slight drawback, since almost all extension modules make use of the PyMethodDef structure. Therefore, binary compatibility of Stackless has degraded, dramatically.
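The two-slot dispatch described above can be modeled schematically. This sketch is NOT the real C-level PyMethodDef; it only illustrates the fallback rule: a stackless-aware caller prefers the non-recursive slot when one is present, and legacy entries (with the slot left empty) keep working unchanged:

```python
class MethodDefModel:
    """Schematic stand-in for the proposed extended method-table entry."""
    def __init__(self, name, ml_meth, ml_meth_nr=None):
        self.ml_name = name
        self.ml_meth = ml_meth        # always present; may recurse
        self.ml_meth_nr = ml_meth_nr  # None unless the extension is Stackless-aware

def call_preferring_nr(entry, arg):
    # Stackless-aware callers try ml_meth_nr first, falling back to ml_meth,
    # so extensions that never heard of the new slot are unaffected.
    meth = entry.ml_meth_nr if entry.ml_meth_nr is not None else entry.ml_meth
    return meth(arg)

legacy = MethodDefModel('legacy', lambda x: x + 1)               # old extension
aware = MethodDefModel('aware', lambda x: x + 1, lambda x: x + 1)  # new slot filled

print(call_preferring_nr(legacy, 1), call_preferring_nr(aware, 2))
```

The binary-compatibility problem Christian mentions is exactly that the real structure grows a field, so anything compiled against the old layout must be rebuilt.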
I'm now in some kind of dilemma: On the one side, I'm happy with this solution (while I have to admit that it is not too inexpensive, but well, all the new descriptor objects are also not cheap, but just great), on the other hand, simply replacing python22.dll is no longer sufficient. You need to re-compile everything, which might be a hard thing on Windows (win32 extensions, wxPython). Sure, I would stand this, if there is no alternative, I would have to supply a complete replacement package of everything. Do you (does anybody) have an alternative suggestion how to efficiently maintain a "normal" and a "non-recursive" version of a method without changing the PyMethodDef struc? Alternatively, would it be reasonable to ask the Python core developers, if they would accept to augment PyMethodDef and PyTypeObject with an extra field (default NULL, no maintenance), just for me and Stackless? Many thanks for any reply - sincerely -- chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Johannes-Niemeyer-Weg 9a : *Starship* http://starship.python.net/ 14109 Berlin : PGP key -> http://wwwkeys.pgp.net/ work +49 30 89 09 53 34 home +49 30 802 86 56 pager +49 173 24 18 776 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? http://www.stackless.com/ From pje@telecommunity.com Sat May 17 00:48:21 2003 From: pje@telecommunity.com (Phillip J. Eby) Date: Fri, 16 May 2003 19:48:21 -0400 Subject: [Python-Dev] a strange case In-Reply-To: <200305161345.25415.troy@gci.net> References: <20030516202402.30333.72761.Mailman@mail.python.org> <20030516202402.30333.72761.Mailman@mail.python.org> Message-ID: <5.1.1.6.0.20030516194238.02fb2d90@telecommunity.com> At 01:45 PM 5/16/03 -0800, Troy Melhase wrote: > > Jeremy> I think we decided this wasn't a pure bugfix :-). Some poor > > Jeremy> soul may have code that relies on being able to subclass a > > Jeremy> module. 
> > > > How about at least deprecating that feature in 2.2.3 and warning about it > > so that poor soul knows this won't be supported forever? > >I think I'm knocking on the poor-house door. > >Just last night, it occurred to me that modules could be made callable via >subclassing. This isn't about subclassing the module *type*, but about subclassing *modules*. Subclassing a module doesn't do anything useful. Subclassing the module *type* does, as you demonstrate. Python 2.3 still allows you to subclass the module type, even though it does not allow you to subclass modules. Now, if you *really* want to subclass a *module*, then you should check out PEAK's "module inheritance" technique that lets you define new modules in terms of other modules. It's useful for certain types of AOP/SOP techniques. But it's currently implemented using bytecode hacking, and is therefore evil. ;) Anyway, it doesn't rely on actually *subclassing* modules. Speaking of bytecode hacking, it would be so much easier to implement "portable magic" if there were a fast, easy to use, language-defined intermediate representation for Python code that one could hack with. And don't tell me to "use Lisp", either... ;) From skip@pobox.com Sat May 17 03:52:41 2003 From: skip@pobox.com (Skip Montanaro) Date: Fri, 16 May 2003 21:52:41 -0500 Subject: [Python-Dev] test_bsddb185 failing under OS X In-Reply-To: <3EC5734E.30209@ocf.berkeley.edu> References: <3EC5734E.30209@ocf.berkeley.edu> Message-ID: <16069.41977.390568.226852@montanaro.dyndns.org> Brett> AssertionError: 'bsddb185' == 'bsddb185' Brett> DBs are not my area of expertise so I don't know how to go about Brett> to attempt to fix this. I'll look into it. 
Skip From skip@pobox.com Sat May 17 14:13:05 2003 From: skip@pobox.com (Skip Montanaro) Date: Sat, 17 May 2003 08:13:05 -0500 Subject: [Python-Dev] test_bsddb185 failing under OS X In-Reply-To: <3EC5734E.30209@ocf.berkeley.edu> References: <3EC5734E.30209@ocf.berkeley.edu> Message-ID: <16070.13665.129282.617413@montanaro.dyndns.org> Brett, I goofed a bit in my (private) note to you yesterday. anydbm._name isn't of interest. It's anydbm._defaultmod. On my system, if I mv Lib/bsddb to Lib/bsddb- I no longer have the bsddb package available (as you said you didn't). In that situation, for me, anydbm._defaultmod is the gdbm module. All three tests succeed: % ./python.exe ../Lib/test/test_bsddb185.py test_anydbm_create (__main__.Bsddb185Tests) ... ok test_open_existing_hash (__main__.Bsddb185Tests) ... ok test_whichdb (__main__.Bsddb185Tests) ... ok If I delete gdbm.so I get dbm as anydbm._defaultmod. Again, success: % ./python.exe ../Lib/test/test_bsddb185.py test_anydbm_create (__main__.Bsddb185Tests) ... ok test_open_existing_hash (__main__.Bsddb185Tests) ... ok test_whichdb (__main__.Bsddb185Tests) ... ok Delete dbm.so. Run again. Now dumbdbm is anydbm._defaultmod. Run again. Success again: % ./python.exe ../Lib/test/test_bsddb185.py test_anydbm_create (__main__.Bsddb185Tests) ... ok test_open_existing_hash (__main__.Bsddb185Tests) ... ok test_whichdb (__main__.Bsddb185Tests) ... ok In short, I can't reproduce your error. Can you do some more debugging to see why your anydbm.open seems to be calling bsddb185.open? 
Thx, Skip From jepler@unpythonic.net Sat May 17 16:21:39 2003 From: jepler@unpythonic.net (Jeff Epler) Date: Sat, 17 May 2003 10:21:39 -0500 Subject: [Python-Dev] [PEP] += on return of function call result In-Reply-To: <20030516142451.GI6196@localhost> References: <20030402090726.GN1048@localhost> <20030515214417.GF3900@localhost> <20030516142451.GI6196@localhost> Message-ID: <20030517152137.GA25579@unpythonic.net> I think that looking at the generated bytecode is useful.

# Running with 'python -O'
>>> def f(x): x += 1
>>> dis.dis(f)
  0 LOAD_FAST       0 (x)
  3 LOAD_CONST      1 (1)
  6 INPLACE_ADD
  7 STORE_FAST      0 (x)     ***
 10 LOAD_CONST      0 (None)
 13 RETURN_VALUE

>>> def g(x): x[0] += 1
>>> dis.dis(g)
  0 LOAD_GLOBAL     0 (x)
  3 LOAD_CONST      1 (0)
  6 DUP_TOPX        2
  9 BINARY_SUBSCR
 10 LOAD_CONST      2 (1)
 13 INPLACE_ADD
 14 ROT_THREE
 15 STORE_SUBSCR              ***
 16 LOAD_CONST      0 (None)
 19 RETURN_VALUE

>>> def h(x): x.a += 1
>>> dis.dis(h)
  0 LOAD_GLOBAL     0 (x)
  3 DUP_TOP
  4 LOAD_ATTR       1 (a)
  7 LOAD_CONST      1 (1)
 10 INPLACE_ADD
 11 ROT_TWO
 12 STORE_ATTR      1 (a)     ***
 15 LOAD_CONST      0 (None)
 18 RETURN_VALUE

In each case, there's a STORE step to the inplace statement. In the case of the proposed

def j(x): x() += 1

what STORE instruction would you use?

>>> [opname for opname in dis.opname if opname.startswith("STORE")]
['STORE_SLICE+0', 'STORE_SLICE+1', 'STORE_SLICE+2', 'STORE_SLICE+3',
 'STORE_SUBSCR', 'STORE_NAME', 'STORE_ATTR', 'STORE_GLOBAL', 'STORE_FAST',
 'STORE_DEREF']

If you don't want one from the list, then you're looking at substantial changes to Python..
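The same load / inplace-op / store shape can be checked on a current interpreter with the modern dis API. Individual opcode names have shifted across versions (e.g. INPLACE_ADD later became a generic BINARY_OP), but the trailing store chosen by the target's syntax survives:

```python
import dis

def f(x):
    x += 1
    return x

# An augmented assignment to a plain local still ends in a store whose
# opcode is picked by the target syntax: STORE_FAST for a local name.
ops = [ins.opname for ins in dis.get_instructions(f)]
assert "STORE_FAST" in ops
```

This is the crux of Jeff's question: `f(x) += 1` offers no syntactic target, so there is no store opcode the compiler could emit.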
(and STORE_DEREF probably doesn't do anything that's relevant to this situation, though the name sure sounds promising, doesn't it) Jeff From tjreedy@udel.edu Sat May 17 18:34:05 2003 From: tjreedy@udel.edu (Terry Reedy) Date: Sat, 17 May 2003 13:34:05 -0400 Subject: [Python-Dev] Re: [PEP] += on return of function call result References: <20030402090726.GN1048@localhost> <20030515214417.GF3900@localhost> <20030516142451.GI6196@localhost> Message-ID: "Luke Kenneth Casson Leighton" wrote in message > 1) what is the technical, syntactical or language-specific reason why > I can't write an expression like f(x) += y ? In general, ignoring repetition of side-effects, this translates to f(x) = f(x) + y. Abstractly, the assignment pattern is target(s) = object(s), whereas above is object = object. As some of us have tried to point out to the cbv (call-by-value) proponents on a clp thread, targets are not objects, so object = object is not proper Python. The reason inplace op syntax is possible is that syntax that defines a target on the left instead denotes an object when on the right (of '='), so that syntax to the left of op= does double duty. As Jeff Epler pointed out in his response, the compiler uses that syntax to determine the type of target and thence the appropriate store instruction. But function call syntax only denotes an object and does not define a target and hence cannot do double duty. The exception to all this is listob += seq, which translates to listob.extend(seq). So if f returns a list, f(x) += y could be executed, but only with runtime selection of the appropriate byte code. However, if you know that f is going to return a list, so that f(x)+=y seems sensible, you can write f(x).extend(y) directly (or f(x).append(y) if that is what you actually want). However, since this does not bind the result to anything, even this is pointless unless all f does is to select from lists that you can already access otherwise.
(Example: f(lista,listb,bool_exp).extend(y).) Terry J. Reedy From drifty@alum.berkeley.edu Sat May 17 19:13:15 2003 From: drifty@alum.berkeley.edu (Brett C.) Date: Sat, 17 May 2003 11:13:15 -0700 Subject: [Python-Dev] test_bsddb185 failing under OS X In-Reply-To: <16070.13665.129282.617413@montanaro.dyndns.org> References: <3EC5734E.30209@ocf.berkeley.edu> <16070.13665.129282.617413@montanaro.dyndns.org> Message-ID: <3EC67BBB.4090003@ocf.berkeley.edu> Skip Montanaro wrote: > Brett, > > I goofed a bit in my (private) note to you yesterday. anydbm._name isn't of > interest. It's anydbm._defaultmod. >>> anydbm._defaultmod > On my system, if I mv Lib/bsddb to > Lib/bsddb- I no longer have the bsddb package available (as you said you > didn't). In that situation, for me, anydbm._defaultmod is the gdbm module. > All three tests succeed: > > % ./python.exe ../Lib/test/test_bsddb185.py > test_anydbm_create (__main__.Bsddb185Tests) ... ok > test_open_existing_hash (__main__.Bsddb185Tests) ... ok > test_whichdb (__main__.Bsddb185Tests) ... ok > > If I delete gdbm.so I get dbm as anydbm._defaultmod. Again, success: > > % ./python.exe ../Lib/test/test_bsddb185.py > test_anydbm_create (__main__.Bsddb185Tests) ... ok > test_open_existing_hash (__main__.Bsddb185Tests) ... ok > test_whichdb (__main__.Bsddb185Tests) ... ok > > Delete dbm.so. Run again. Now dumbdbm is anydbm._defaultmod. Run again. > Success again: > No success for me when it is using dumbdbm: ====================================================================== ERROR: test_anydbm_create (__main__.Bsddb185Tests) ---------------------------------------------------------------------- Traceback (most recent call last): File "Lib/test/test_bsddb185.py", line 39, in test_anydbm_create os.rmdir(tmpdir) OSError: [Errno 66] Directory not empty: '/tmp/tmpkiVKcZ' ---------------------------------------------------------------------- Looks like foo.dat and foo.dir are left (files used by the DB?). 
I will fix the test again to be more aggressive about deleting files. ... done. Just used shutil.rmtree instead of the nested 'try' statements that called os.unlink and os.rmdir . Now the tests pass for dumbdbm. So it seems to be dbm.so for some reason. I will see what I can figure out or at least get as much info as I can that I think can help in debugging this. -Brett From drifty@alum.berkeley.edu Sat May 17 19:22:00 2003 From: drifty@alum.berkeley.edu (Brett C.) Date: Sat, 17 May 2003 11:22:00 -0700 Subject: [Python-Dev] test_bsddb185 failing under OS X In-Reply-To: <3EC67BBB.4090003@ocf.berkeley.edu> References: <3EC5734E.30209@ocf.berkeley.edu> <16070.13665.129282.617413@montanaro.dyndns.org> <3EC67BBB.4090003@ocf.berkeley.edu> Message-ID: <3EC67DC8.4050809@ocf.berkeley.edu> Brett C. wrote: > No success for me when it is using dumbdbm: > > ====================================================================== > ERROR: test_anydbm_create (__main__.Bsddb185Tests) > ---------------------------------------------------------------------- > Traceback (most recent call last): > File "Lib/test/test_bsddb185.py", line 39, in test_anydbm_create > os.rmdir(tmpdir) > OSError: [Errno 66] Directory not empty: '/tmp/tmpkiVKcZ' > > ---------------------------------------------------------------------- > > Looks like foo.dat and foo.dir are left (files used by the DB?). I will > fix the test again to be more aggressive about deleting files. > > ... done. Just used shutil.rmtree instead of the nested 'try' > statements that called os.unlink and os.rmdir . Now the tests pass for > dumbdbm. So it seems to be dbm.so for some reason. > But then Skip checked in the exact change I was going to I think almost simultaneously. And guess what? Now the darned tests pass using dbm! I am going to do a completely clean compile and test again to make sure this is not a fluke since the only change was ``cvs update`` for test_bsddb185.py and that only changed how files were deleted.
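The cleanup fix amounts to one shutil.rmtree call in place of the nested try/os.unlink/os.rmdir dance. A small sketch of why it helps when the DB leaves stray files like foo.dat behind:

```python
import os
import shutil
import tempfile

# Simulate the failure mode: a temp dir with leftover DB files
# (the names foo.dat / foo.dir are taken from the report in this thread).
tmpdir = tempfile.mkdtemp()
for name in ('foo.dat', 'foo.dir'):
    open(os.path.join(tmpdir, name), 'w').close()

# os.rmdir(tmpdir) would raise "Directory not empty" at this point;
# shutil.rmtree removes the directory along with anything still inside it.
shutil.rmtree(tmpdir)
assert not os.path.exists(tmpdir)
```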
Ah, the joys of coding. -Brett From drifty@alum.berkeley.edu Sat May 17 20:18:23 2003 From: drifty@alum.berkeley.edu (Brett C.) Date: Sat, 17 May 2003 12:18:23 -0700 Subject: [Python-Dev] test_bsddb185 failing under OS X In-Reply-To: <3EC67DC8.4050809@ocf.berkeley.edu> References: <3EC5734E.30209@ocf.berkeley.edu> <16070.13665.129282.617413@montanaro.dyndns.org> <3EC67BBB.4090003@ocf.berkeley.edu> <3EC67DC8.4050809@ocf.berkeley.edu> Message-ID: <3EC68AFF.3020900@ocf.berkeley.edu> Brett C. wrote: > > But then Skip checked in the exact change I was going to I think almost > simultaneously. And guess what? Now the darned tests pass using dbm! I > am going to do a completely clean compile and test again to make sure > this is not a fluke since the only change was ``cvs update`` for > test_bsddb185.py and that only changed how files were deleted. > Well, I recompiled and the test is still passing. The only thing I am aware of that changed between the tests failing and passing was me changing the test to use shutil.rmtree to clean up after itself and renaming dbm.so and then putting its name back. I have no idea why it is working now, but it is. -Brett
If you wish to "OPT-OUT" from this mailing as well as the lists of thousands of other email providers please visit http://www.a6zing29.com/1/ +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ ofqjfdllcqzd kxjg e shuvrmthvfwfi --CFA0__4ED39__0CEF44-- From skip@pobox.com Sat May 17 23:03:20 2003 From: skip@pobox.com (Skip Montanaro) Date: Sat, 17 May 2003 17:03:20 -0500 Subject: [Python-Dev] test_bsddb185 failing under OS X In-Reply-To: <3EC67BBB.4090003@ocf.berkeley.edu> References: <3EC5734E.30209@ocf.berkeley.edu> <16070.13665.129282.617413@montanaro.dyndns.org> <3EC67BBB.4090003@ocf.berkeley.edu> Message-ID: <16070.45480.640314.144944@montanaro.dyndns.org> Brett> No success for me when it is using dumbdbm: Brett> ====================================================================== Brett> ERROR: test_anydbm_create (__main__.Bsddb185Tests) Brett> ---------------------------------------------------------------------- Brett> Traceback (most recent call last): Brett> File "Lib/test/test_bsddb185.py", line 39, in test_anydbm_create Brett> os.rmdir(tmpdir) Brett> OSError: [Errno 66] Directory not empty: '/tmp/tmpkiVKcZ' This problem is fixed in CVS. Have you updated? Brett> ... done. Just used shutil.rmtree instead of the nested 'try' Brett> statements that called os.unlink and os.rmdir . Now the tests Brett> pass for dumbdbm. So it seems to be dbm.so for some reason. This is just what I checked in. Skip From tim.one@comcast.net Sun May 18 02:12:12 2003 From: tim.one@comcast.net (Tim Peters) Date: Sat, 17 May 2003 21:12:12 -0400 Subject: [Python-Dev] Simple dicts In-Reply-To: <006301c31b30$da69e8e0$6401a8c0@damien> Message-ID: [Damien Morton] > ... > On the other hand, as Tim pointed out to me in a private email, there is > so much overhead in just getting to the hashtable inner loop, going > around that loop one time instead of two or three seems inconsequential. For in-cache tables. For out-of-cache tables, each trip around the loop is deadly. 
Since heavily used portions of small dicts are likely to be in-cache no matter how they're implemented, that's what makes me dubious about pouring lots of effort into reducing collisions for small dicts specifically. > ... > There seem to be two different ways to get/set/del from a dictionary. > > The first is using PyDict_[Get|Set|Del]Item() > > The second is using the embarrassingly named dict_ass_sub() and its > partner dict_subscript(). > > Which of these two access methods is most likely to be used? My guess matches Guido's: PyDict_*, except in programs making heavy use of explicit Python dicts. All programs use dicts under the covers for namespace mapping, and, e.g., instance.attr and module.attr end up calling PyDict_GetItem() directly. Python-level explicit dict subscripting ends up calling dict_*, essentially because Python has no idea at compile-time whether the x in x[y] *is* a dict, so generates code that goes thru the all-purpose type-dispatch machinery. On the third hand, some explicit-dict slinging code seems to use x = somedict.get(y) everywhere, and dict_get() doesn't call PyDict_GetItem() or dict_subscript(). From Raymond Hettinger" When I look at www.python.org/sf/732174 , there is no Submit button on the screen. But I see it for other docs and patches. Is anyone else having the same issue? Without a submit button, it is darned difficult to mark the bug as fixed and close it. --R From tim.one@comcast.net Sun May 18 02:52:18 2003 From: tim.one@comcast.net (Tim Peters) Date: Sat, 17 May 2003 21:52:18 -0400 Subject: [Python-Dev] SF oddity In-Reply-To: <007101c31cdd$e4423440$125ffea9@oemcomputer> Message-ID: [Raymond Hettinger] > When I look at www.python.org/sf/732174 , there is no Submit > button on the screen. > But I see it for other docs and patches. Is anyone else having > the same issue? > Without a submit button, it is darned difficult to mark the bug > as fixed and close it.
Look at the state of your browser's horizontal scrollbar, and scroll waaaaaay to the right. There's a very long line in this item's description, and that pushes the Submit button off the edge of anything less than a 161-inch monitor . From python@rcn.com Sun May 18 03:01:13 2003 From: python@rcn.com (Raymond Hettinger) Date: Sat, 17 May 2003 22:01:13 -0400 Subject: [Python-Dev] SF oddity References: Message-ID: <000201c31cf2$5a08dee0$125ffea9@oemcomputer> > [Raymond Hettinger] > > When I look at www.python.org/sf/732174 , there is no Submit > > button on the screen. > > But I see it for other docs and patches. Is anyone else having > > the same issue? > > Without a submit button, it is darned difficult to mark the bug > > as fixed and close it. [Timbot] > Look at the state of your browser's horizontal scrollbar, and scroll > waaaaaay to the right. There's a very long line in this item's description, > and that pushes the Submit button off the edge of anything less than a > 161-inch monitor . Hmphh! 
################################################################# ################################################################# ################################################################# ##### ##### ##### ################################################################# ################################################################# ################################################################# From skip@mojam.com Sun May 18 13:00:26 2003 From: skip@mojam.com (Skip Montanaro) Date: Sun, 18 May 2003 07:00:26 -0500 Subject: [Python-Dev] Weekly Python Bug/Patch Summary Message-ID: <200305181200.h4IC0Qa21569@manatee.mojam.com> Bug/Patch Summary ----------------- 403 open / 3646 total bugs (-11) 141 open / 2161 total patches (+6) New Bugs -------- Problem With email.MIMEText Package (2003-05-12) http://python.org/sf/736407 allow HTMLParser error recovery (2003-05-12) http://python.org/sf/736428 markupbase parse_declaration cannot recognize comments (2003-05-12) http://python.org/sf/736659 forcing function to act like an unbound method dumps core (2003-05-13) http://python.org/sf/736892 CGIHTTPServer does not handle scripts in sub-dirs (2003-05-13) http://python.org/sf/737202 os.symlink docstring is ambiguous. 
(2003-05-13) http://python.org/sf/737291 need doc for new trace module (2003-05-14) http://python.org/sf/737734 Failed assert in stringobject.c (2003-05-14) http://python.org/sf/737947 Interpreter crash: sigfpe on Alpha (2003-05-14) http://python.org/sf/738066 Section 13.3: htmllib.HTMLParser constructor definition amen (2003-05-15) http://python.org/sf/738090 pdb doesn't find some source files (2003-05-15) http://python.org/sf/738154 crash error in glob.glob; directories with brackets (2003-05-15) http://python.org/sf/738361 csv.Sniffer docs need updating (2003-05-15) http://python.org/sf/738471 On Windows, os.listdir() throws incorrect exception (2003-05-15) http://python.org/sf/738617 urllib2 CacheFTPHandler doesn't work on multiple dirs (2003-05-16) http://python.org/sf/738973 array.insert and negative indices (2003-05-17) http://python.org/sf/739313 New Patches ----------- Mutable PyCObject (2001-11-02) http://python.org/sf/477441 Improvement of cgi.parse_qsl function (2002-01-25) http://python.org/sf/508665 CGIHTTPServer execfile should save cwd (2002-01-25) http://python.org/sf/508730 rlcompleter does not expand on [ ] (2002-04-22) http://python.org/sf/547176 ConfigParser.read() should return list of files read (2003-01-30) http://python.org/sf/677651 DESTDIR improvement (2003-05-12) http://python.org/sf/736413 Put DEFS back to Makefile.pre.in (2003-05-12) http://python.org/sf/736417 Trivial improvement to NameError message (2003-05-12) http://python.org/sf/736730 interpreter final destination location (2003-05-12) http://python.org/sf/736857 docs for interpreter final destination location (2003-05-12) http://python.org/sf/736859 Port tests to unittest (Part 2) (2003-05-13) http://python.org/sf/736962 traceback module caches sources invalid (2003-05-13) http://python.org/sf/737473 minor codeop fixes (2003-05-14) http://python.org/sf/737999 for i in range(N) optimization (2003-05-15) http://python.org/sf/738094 fix for glob with directories which contain 
brackets (2003-05-15) http://python.org/sf/738389 Add use_default_colors support to curses module. (2003-05-17) http://python.org/sf/739124 Closed Bugs ----------- Regular expression tests: SEGV on Mac OS (2001-04-16) http://python.org/sf/416526 CGIHTTPServer crashes Explorer in WinME (2001-05-31) http://python.org/sf/429193 MacPy21: sre "recursion limit" bug (2001-06-29) http://python.org/sf/437472 provide a documented serialization func (2001-10-02) http://python.org/sf/467384 Security review of pickle/marshal docs (2001-10-16) http://python.org/sf/471893 Improvement of cgi.parse_qsl function (2002-01-25) http://python.org/sf/508665 CGIHTTPServer execfile should save cwd (2002-01-25) http://python.org/sf/508730 metaclasses and 2.2 highlights (2002-02-08) http://python.org/sf/515137 bsddb keys corruption (2002-02-25) http://python.org/sf/522780 test_pyclbr: bad dependency for input (2002-03-12) http://python.org/sf/529135 Wrong exception from re.compile() (2002-04-18) http://python.org/sf/545855 regex segfault on Mac OS X (2002-04-19) http://python.org/sf/546059 rlcompleter does not expand on [ ] (2002-04-22) http://python.org/sf/547176 os.spawnv() fails with underscores (2002-06-30) http://python.org/sf/575770 Print line number of string if at EOF (2002-07-04) http://python.org/sf/577295 Build error using make VPATH feature (2002-10-22) http://python.org/sf/626926 Have exception arguments keep their type (2003-01-27) http://python.org/sf/675928 No documentation of static/dynamic python modules. 
(2003-03-12) http://python.org/sf/702157 Distutils documentation amputated (2003-04-01) http://python.org/sf/713722 datetime types don't work as bases (2003-04-13) http://python.org/sf/720908 _winreg doesn't handle NULL bytes in value names (2003-04-16) http://python.org/sf/722413 add timeout support in socket using modules (2003-04-17) http://python.org/sf/723287 Minor /Tools/Scripts/crlf.py bugs (2003-04-20) http://python.org/sf/724767 rexec not listed as dead (2003-04-29) http://python.org/sf/729817 Clarification of "pos" and "endpos" for match objects. (2003-05-04) http://python.org/sf/732124 telnetlib.read_until: float req'd for timeout (2003-05-08) http://python.org/sf/734806 cStringIO.StringIO (2003-05-09) http://python.org/sf/735535 libwinsound.tex is missing MessageBeep() description (2003-05-10) http://python.org/sf/735674 Closed Patches -------------- xmlrpclib: Optional 'nil' support (2002-10-24) http://python.org/sf/628208 Remove type-check from urllib2 (2002-11-15) http://python.org/sf/639139 urllib2.Request's headers are case-sens. (2002-12-06) http://python.org/sf/649742 has_function() method for CCompiler (2003-04-07) http://python.org/sf/717152 DESTDIR variable patch (2003-04-09) http://python.org/sf/718286 socketmodule inet_ntop built when IPV6 is disabled (2003-04-30) http://python.org/sf/730603 make threading join() method return a value (2003-05-02) http://python.org/sf/731607 exit status of latex2html "ignored" (2003-05-04) http://python.org/sf/732143 build of html docs broken (liboptparse.tex) (2003-05-04) http://python.org/sf/732174 Docs for test package (2003-05-04) http://python.org/sf/732394 Python2.3b1 makefile improperly installs IDLE (2003-05-10) http://python.org/sf/735613 Python makefile may install idle in the wrong place (2003-05-10) http://python.org/sf/735614
From jim@zope.com Sun May 18 19:28:55 2003 From: jim@zope.com (Jim Fulton) Date: Sun, 18 May 2003 14:28:55 -0400 Subject: [Python-Dev] doctest extensions Message-ID: <3EC7D0E7.9000705@zope.com> I've written some doctest extensions to: - Generate a unittest (pyunit) test suite from a module with doctest tests. Each doc string containing one or more doctest tests becomes a test case. If a test fails, an error message is included in the unittest output that has the module file name and the approximate line number of the docstring containing the failed test formatted in a way understood by emacs error parsing. This is important. ;) - Debug doctest tests. Normally, doctest tests can't be debugged with pdb because, while they are running, doctest has taken over standard output. This tool extracts the tests in a doc string into a separate script and runs pdb on it. - Extract a doctest doc string into a script file. I think that these would be good additions to doctest and propose to add them. The current source can be found here: http://cvs.zope.org/Zope3/src/zope/testing/doctestunit.py?rev=HEAD&content-type=text/vnd.viewcvs-markup I ended up using a slightly different (and simpler) strategy for finding docstrings than doctest uses. This might be an issue. Jim -- Jim Fulton mailto:jim@zope.com Python Powered! CTO (703) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From guido@python.org Sun May 18 20:02:55 2003 From: guido@python.org (Guido van Rossum) Date: Sun, 18 May 2003 15:02:55 -0400 Subject: [Python-Dev] C new-style classes and GC In-Reply-To: "Your message of Fri, 16 May 2003 15:29:54 EDT." References: Message-ID: <200305181902.h4IJ2ti17624@pcp02138704pcs.reston01.va.comcast.net> > PEP 253 may be partly out of date here -- or not.
In the section on > creating a subclassable type, it says: > > """ > The base type must do the following: > > - Add the flag value Py_TPFLAGS_BASETYPE to tp_flags. > > - Declare and use tp_new(), tp_alloc() and optional tp_init() > slots. > > - Declare and use tp_dealloc() and tp_free(). > > - Export its object structure declaration. > > - Export a subtyping-aware type-checking macro. > """ > > This doesn't leave a choice about defining tp_alloc() or tp_free() -- it > says both are required. For a subclassable type, I believe both must > actually be implemented too. > > For a non-subclassable type, I expect they're optional. But if you don't > define tp_free in that case, then I believe you must also not do the > > obj->ob_type->tp_free(obj) > > business in the tp_dealloc slot (else it will segfault). PyType_Ready() inherits tp_free from the base class, so it's optional. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Sun May 18 21:33:44 2003 From: guido@python.org (Guido van Rossum) Date: Sun, 18 May 2003 16:33:44 -0400 Subject: [Python-Dev] Re: C new-style classes and GC In-Reply-To: "Your message of Fri, 16 May 2003 16:30:47 EDT." <3EC54A77.7090106@zope.com> References: <3EC507CB.6080502@zope.com> <1053103323.456.71.camel@slothrop.zope.com> <3EC51B12.8070407@zope.com> <1053106533.453.78.camel@slothrop.zope.com> <3EC53D09.3050505@zope.com> <1053114159.457.117.camel@slothrop.zope.com> <3EC54A77.7090106@zope.com> Message-ID: <200305182033.h4IKXi317732@pcp02138704pcs.reston01.va.comcast.net> > I don't know why PyDict_New doesn't just call the dict type. > Maybe doing things in-line like this is just an optimization. Yes; and because PyDict_New is much older than callable type objects.
--Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Sun May 18 21:39:30 2003 From: guido@python.org (Guido van Rossum) Date: Sun, 18 May 2003 16:39:30 -0400 Subject: [Python-Dev] Re: C new-style classes and GC In-Reply-To: "Your message of Fri, 16 May 2003 17:37:07 EDT." References: Message-ID: <200305182039.h4IKdU717764@pcp02138704pcs.reston01.va.comcast.net> > Guido, would you be agreeable to making this magic even more magical? It > seems to me that we can know whether the current type intends to participate > in cyclic gc, and give it a correct default tp_free value instead if so. > The hairier type_new() function already has this extra level of > Py_TPFLAGS_HAVE_GC-dependent magic for dynamically created types, setting > tp_free to PyObject_Del in one case and to PyObject_GC_Del in the other. > PyType_Ready() can supply a wrong deallocation function by default > ("explicit is better than implicit" has no force when talking about > PyType_Ready() ). Yes, I think this is the right thing to do -- either only inherit tp_free when the GC bit of the base and derived class are the same, or -- in addition -- special case inheriting PyObject_Del and turn it into PyObject_GC_Del when the base class adds the GC bit. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Sun May 18 21:42:34 2003 From: guido@python.org (Guido van Rossum) Date: Sun, 18 May 2003 16:42:34 -0400 Subject: [Python-Dev] a strange case In-Reply-To: "Your message of Fri, 16 May 2003 13:45:25 -0800." <200305161345.25415.troy@gci.net> References: <20030516202402.30333.72761.Mailman@mail.python.org> <200305161345.25415.troy@gci.net> Message-ID: <200305182042.h4IKgYA17778@pcp02138704pcs.reston01.va.comcast.net> > "Why in the world would you want callable modules you ask?" I > don't have a real need, but I often see the line blurred between package, > module, and class. Please don't try to blur the line between module and class. 
This has been proposed many times, and the net result IMO is always more confusion and no more power. This is also why in 2.3, modules are no longer subclassable. If you really need to have a module that has behavior beyond what a module can offer, the officially sanctioned way is to stick an instance of a class in sys.modules[__name__] from inside the module's code. (I would explain more about *why* I think it's a really bad idea, but I'm officially on vacation.) --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Sun May 18 22:04:40 2003 From: guido@python.org (Guido van Rossum) Date: Sun, 18 May 2003 17:04:40 -0400 Subject: [Python-Dev] Need advice, maybe support In-Reply-To: "Your message of Sat, 17 May 2003 01:52:20 +0200." <3EC579B4.9000303@tismer.com> References: <3EC579B4.9000303@tismer.com> Message-ID: <200305182104.h4IL4eY17830@pcp02138704pcs.reston01.va.comcast.net> > In the last months, I made very much progress with Stackless 3.0 . > Finally, I was able to make much more of Python stackless > (which means, does not use recursive interpreter calls) than > I could achieve with 1.0 . > > There is one drawback with this, and I need advice: > Compared to older Python versions, Py 2.2.2 and up uses > more indirection through C function pointers than ever. > This blocked my implementation of stackless versions, in > the first place. > > Then the idea hit me like a blizzard: > Most problems simply vanish if I add another slot to the > PyMethodDef structure, which is NULL by default: > ml_meth_nr is a function pointer with the same semantics > as ml_meth, but it tries to perform its action without > doing a recursive call. It tries instead to push a frame > and to return Py_UnwindToken. > Doing this change made Stackless crystal clear and simple: > A C extension not aware of Stackless does what it does > all the time: call ml_meth.
> Stackless aware C code (like my modified ceval.c code) > calls the ml_meth_nr slots, instead, which either defaults > to the ml_meth code, or has a special version which avoids > recursive interpreter calls. > I also added a tp_call_nr slot to typeobject, for similar > reasons. > > While this is just great for me, yielding complete > source code compatibility, it is a slight drawback, since > almost all extension modules make use of the PyMethodDef > structure. Therefore, binary compatibility of Stackless > has degraded dramatically. > > I'm now in some kind of dilemma: > On the one side, I'm happy with this solution (while I have > to admit that it is not too inexpensive, but well, all the > new descriptor objects are also not cheap, but just great), > on the other hand, simply replacing python22.dll is no longer > sufficient. You need to re-compile everything, which might > be a hard thing on Windows (win32 extensions, wxPython). > Sure, I would stand this, if there is no alternative, I would > have to supply a complete replacement package of everything. > > Do you (does anybody) have an alternative suggestion how > to efficiently maintain a "normal" and a "non-recursive" > version of a method without changing the PyMethodDef struc? > > Alternatively, would it be reasonable to ask the Python core > developers, if they would accept to augment PyMethodDef and > PyTypeObject with an extra field (default NULL, no maintenance), > just for me and Stackless? > > Many thanks for any reply - sincerely -- chris I don't think we can just add an extra field to PyMethodDef, because it would break binary compatibility. Currently, in most cases, a 3rd-party extension module compiled for an earlier Python version can still be used with a later version. Because PyMethodDef is used as an array, adding a field to it would break this. I have less of a problem with extending PyTypeObject, it grows all the time and the tp_flags bits tell you how large the one you've got is.
(I still have some problems with this, because things that are of no use to the regular Python core developers tend to either confuse them, or be broken on a regular basis.) Maybe you could get away with defining an alternative structure for PyMethodDef and having a flag in tp_flags say which it is; there are plenty of unused bits and I don't mind reserving one for you. Then you'd have to change all the code that *uses* tp_methods, but there isn't much of that; in fact, the only place I see is in typeobject.c. If this doesn't work for you, maybe you could somehow fold the two implementation functions into one, and put something special in the argument list to signal that the non-recursive version is wanted? (Thinking aloud here -- I don't know exactly what the usage pattern of the nr versions will be.) --Guido van Rossum (home page: http://www.python.org/~guido/) From tim@zope.com Sun May 18 22:42:28 2003 From: tim@zope.com (Tim Peters) Date: Sun, 18 May 2003 17:42:28 -0400 Subject: [Python-Dev] Re: C new-style classes and GC In-Reply-To: <200305182039.h4IKdU717764@pcp02138704pcs.reston01.va.comcast.net> Message-ID: [Guido] > Yes, I think this is the right thing to do -- either only inherit > tp_free when the GC bit of the base and derived class are the same, Jim is keen to have gc'able classes defined in C get the right deallocation function by magic. In these cases, he leaves tp_free NULL, but indicates gc-ability in tp_flags. tp_base becomes "object" by magic then, and the GC bits are not the same, and neither inheriting object.tp_free nor leaving derived_class.tp_free NULL can work. It seems like a reasonable thing to me to want it to work, so on to the next: > or -- in addition -- special case inheriting PyObject_Del and turn it > into PyObject_GC_Del when the base class adds the GC bit. 
That's what I had in mind, s/base/derived/, plus raising an exception if a gc'able class explicitly sets tp_free to PyObject_Del (probably a cut-'n-paste error when that happens, or that gc-ability was tacked on to a previously untracked type). If that's all OK, enjoy your vacation, and I'll take care of this (for 2.3 and 2.2.3). From troy@gci.net Sun May 18 23:07:37 2003 From: troy@gci.net (Troy Melhase) Date: Sun, 18 May 2003 14:07:37 -0800 Subject: [Python-Dev] a strange case In-Reply-To: <200305182042.h4IKgYA17778@pcp02138704pcs.reston01.va.comcast.net> References: <20030516202402.30333.72761.Mailman@mail.python.org> <200305161345.25415.troy@gci.net> <200305182042.h4IKgYA17778@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <200305181407.37880.troy@gci.net> > Please don't try to blur the line between module and class. This has > been proposed many times, and the net result IMO is always more > confusion and no more power. This is also why in 2.3, modules are no > longer subclassable. Loud and clear! > (I would explain more about *why* I think it's a really bad idea, but > I'm officially on vacation.) "There should be one-- and preferably only one --obvious way to do it" if I had to guess. Happy holidays. -troy From walter@livinglogic.de Sun May 18 23:19:18 2003 From: walter@livinglogic.de (Walter Dörwald) Date: Mon, 19 May 2003 00:19:18 +0200 Subject: [Python-Dev] a strange case In-Reply-To: <200305182042.h4IKgYA17778@pcp02138704pcs.reston01.va.comcast.net> References: <20030516202402.30333.72761.Mailman@mail.python.org> <200305161345.25415.troy@gci.net> <200305182042.h4IKgYA17778@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <3EC806E6.3040204@livinglogic.de> Guido van Rossum wrote: >>"Why in the world would you want callable modules you ask?" I >>don't have a real need, but I often see the line blurred between package, >>module, and class. > > Please don't try to blur the line between module and class.
This has > been proposed many times, It sounds familiar! ;) > and the net result IMO is always more > confusion and no more power. This is also why in 2.3, modules are no > longer subclassable. > > If you really need to have a module that has behavior beyond what a > module can offer, the officially sanctioned way is to stick an > instance of a class in sys.modules[__name__] from inside the module's > code. But reload() won't work for these pseudo modules (See http://www.python.org/sf/701743). What about the imp module? > (I would explain more about *why* I think it's a really bad idea, but > I'm officially on vacation.) Sure, this can wait. Bye, Walter Dörwald From jepler@unpythonic.net Mon May 19 02:22:14 2003 From: jepler@unpythonic.net (Jeff Epler) Date: Sun, 18 May 2003 20:22:14 -0500 Subject: Using emacs' unexec to speed Python startup (was Re: [Python-Dev] Startup time) In-Reply-To: <1053112171.2342.7.camel@barry> References: <1052927757.7258.38.camel@slothrop.zope.com> <3EC51B5C.2080307@lemburg.com> <16069.13640.892428.185711@montanaro.dyndns.org> <1053112171.2342.7.camel@barry> Message-ID: <20030519012212.GA10317@unpythonic.net> On Fri, May 16, 2003 at 03:09:31PM -0400, Barry Warsaw wrote: > Skip, you're going about this all wrong. We already have the technology > to start Python up blazingly fast. All you have to do is port > XEmacs's unexec code. Then you load up Python with all the modules you > think you're going to need, unexec it, then the next time it starts up > like lightning. Disk space is cheap! I gave it a try, starting with 2.3b1 and using FSF Emacs 21.3's unexelf.c. An unexec'd binary loads faster than 'python -S -c pass', and seems to work properly with two exceptions and a few limitations. The only change to Python is in main(): I use mallopt() to force all allocations to go through brk() instead of through mmap(), because unexec doesn't support mmap'd memory.
I also used Modules/Setup.local to make some normally-shared modules not shared (for the same reason). dump.py loads the requested modules (- forces the module to *not* be found) and then calls unexec(), producing a new binary with the given name.

$ time ./python -S -c pass              # best 'real' of 5 runs
real    0m0.054s
user    0m0.040s
sys     0m0.010s
$ time ./python -c 'import cgi'         # best 'real' of 5 runs
real    0m0.127s
user    0m0.110s
sys     0m0.010s
$ strace -e open ./python -c 'import cgi' 2>&1 | grep -v ENOENT | wc -l
88
$ ./python dump.py cgipython -_ssl cgi
$ time ./cgipython -c 'import cgi'      # best 'real' of 5 runs
real    0m0.039s
user    0m0.020s
sys     0m0.020s
$ strace -e open ./cgipython -c 'import cgi' 2>&1 | grep -v ENOENT | wc -l
9
$ ./python dump.py dython
-rwxrwxr-x    1 jepler   jepler    4983713 May 18 19:42 cgipython
-rwxrwxr-x    1 jepler   jepler    3603737 May 18 19:39 python
-rwxrwxr-x    1 jepler   jepler    4541345 May 18 19:55 dython

(a minimal unexec'd python is about 90k bigger than the regular Python binary) I'm running the test suite now .. it hangs in test_signal for some reason. test_thread seems to hang too, which may be related. (but test_threading completes?)

$ ./dython Lib/test/regrtest.py -x test_signal -x test_thread
[...]
225 tests OK.
26 tests skipped: test_aepack test_al test_bsddb3 test_bz2 test_cd test_cl
    test_curses test_email_codecs test_gl test_imgfile test_linuxaudiodev
    test_macfs test_macostools test_nis test_normalization test_ossaudiodev
    test_pep277 test_plistlib test_scriptpackages test_socket_ssl
    test_socketserver test_sunaudiodev test_timeout test_urllibnet
    test_winreg test_winsound
1 skip unexpected on linux2: test_bz2

Well, if it worked right it'd sure be interesting. OTOH, unexelf.c is GPL'd and there's also the nightmare of different unex* for different platforms.
Jeff

########################################################################
# dump.py
import unexec, sys

for m in sys.argv[2:]:
    if m[0] == "-":
        sys.modules[m[1:]] = None
        continue
    __import__(m)

for m in sys.modules.keys():
    mod = sys.modules[m]
    if mod is None:
        continue                        # negatively cached entry
    if not hasattr(mod, "__file__"):
        continue                        # builtin module
    if mod.__file__.endswith(".so"):
        raise RuntimeError, "Cannot dump with shared module %s" % m

unexec.dump(sys.argv[1], sys.executable)

/**********************************************************************/
/* unexecmodule.c (needs unexec() eg from unexelf.c) */
#include "Python.h"

extern void unexec (char *new_name, char *old_name, unsigned data_start,
                    unsigned bss_start, unsigned entry_address);

static PyObject *
dump_python(PyObject *self, PyObject *args)
{
    char *filename, *symfile;
    if (!PyArg_ParseTuple(args, "ss", &filename, &symfile))
        return NULL;
    unexec(filename, symfile, 0, 0, (unsigned)Py_Main);
    _exit(99);
}

static PyMethodDef dump_methods[] = {
    {"dump", dump_python, METH_VARARGS,
     PyDoc_STR("dump(filename, symfile) -> None")},
    {NULL, NULL}
};

PyDoc_STRVAR(module_doc,
"Support for undumping the Python executable, a la Emacs");

PyMODINIT_FUNC
initunexec(void)
{
    Py_InitModule3("unexec", dump_methods, module_doc);
}

########################################################################
# Setup.local
# Edit this file for local setup changes
unexec unexecmodule.c unexelf.c
time timemodule.c
_socket socketmodule.c
_random _randommodule.c
math mathmodule.c
fcntl fcntlmodule.c

From drifty@alum.berkeley.edu Mon May 19 02:38:24 2003 From: drifty@alum.berkeley.edu (Brett C.) Date: Sun, 18 May 2003 18:38:24 -0700 Subject: [Python-Dev] python-dev Summary for 2003-05-01 through 2003-05-15 Message-ID: <3EC83590.1000306@ocf.berkeley.edu> It's that time of the month again. The only thing I would like help with this summary is if someone knows the attribute lookup order (instance, class, class descriptor, ...)
off the top of their heads, can you let me know? If not I can find it out by going through the docs but I figure someone out there has to know it by heart and any possible quirks (like whether descriptors take precedence over non-descriptor attributes). I won't send this off until Wednesday. ---------------------- +++++++++++++++++++++++++++++++++++++++++++++++++++++ python-dev Summary for 2003-05-01 through 2003-05-15 +++++++++++++++++++++++++++++++++++++++++++++++++++++ This is a summary of traffic on the `python-dev mailing list`_ from May 1, 2003 through May 15, 2003. It is intended to inform the wider Python community of on-going developments on the list and to have an archived summary of each thread started on the list. To comment on anything mentioned here, just post to python-list@python.org or `comp.lang.python`_ with a subject line mentioning what you are discussing. All python-dev members are interested in seeing ideas discussed by the community, so don't hesitate to take a stance on something. And if all of this really interests you then get involved and join `python-dev`_! This is the seventeenth summary written by Brett Cannon (going to grad school, baby!). All summaries are archived at http://www.python.org/dev/summary/ . Please note that this summary is written using reStructuredText_ which can be found at http://docutils.sf.net/rst.html . Any unfamiliar punctuation is probably markup for reST_ (otherwise it is probably regular expression syntax or a typo =); you can safely ignore it, although I suggest learning reST; it's simple and is accepted for `PEP markup`__. Also, because of the wonders of programs that like to reformat text, I cannot guarantee you will be able to run the text version of this summary through Docutils_ as-is unless it is from the original text file. __ http://www.python.org/peps/pep-0012.html The in-development version of the documentation for Python can be found at http://www.python.org/dev/doc/devel/ .
To view files in the Python CVS online, go to http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/python/ . .. _python-dev: http://www.python.org/dev/ .. _python-dev mailing list: http://mail.python.org/mailman/listinfo/python-dev .. _comp.lang.python: http://groups.google.com/groups?q=comp.lang.python .. _Docutils: http://docutils.sf.net/ .. _reST: .. _reStructuredText: http://docutils.sf.net/rst.html .. contents:: .. _last summary: http://www.python.org/dev/summary/2003-04-16_2003-04-30.html ====================== Summary Announcements ====================== So, to help keep my sanity longer than my predecessors I am no longer going to link to individual modules in the stdlib nor to files in CVS. It sucks down a ton of time and at least Raymond Hettinger thinks it clutters the summaries. Along the lines of the look of the summaries, I am trying out a new layout for listing splinter threads. If you have a preference in comparison to the old style or new style speak up and let me know. ========================== `Dictionary sparseness`__ ========================== __ http://mail.python.org/pipermail/python-dev/2003-May/035295.html Splinter threads: `Where'd my memory go?`__ __ http://mail.python.org/pipermail/python-dev/2003-May/035340.html After all the work Raymond Hettinger did on dictionaries he suggested two possible methods on dictionaries that would allow the programmer to control how sparse (at what point a dictionary doubles its size in order to lower collisions) a dictionary should be. Both got shot down on the grounds that most people would not know how to properly use the methods and are more likely to shoot themselves in the foot than get any gain out of them. There was also a bunch of talk about the "old days" when computers were small and didn't measure the amount of RAM they had in megabytes unless they were supercomputers. But then the discussion changed to memory footprints. 
There was some mention of the extra output one can get from a special build (all listed in Misc/SpecialBuilds.txt) such as Py_DEBUG. But the issue at hand is that there are int, float, and frameobject free lists which keep alive any and all created constant values (although the frameobject one is bounded in size). This is why if you do ``range(2000000)`` you won't get the memory allocated for all of those integers back until you shut down the interpreter. This led to the suggestion of just doing away with the free lists. There would be a performance hit since numerical constants would have to be reallocated if they are constantly created, deleted, and then created again. It was also suggested to limit the size of the free lists and basically chop off the tail if they grew too large. But it turns out that the memory is allocated in large blocks that are chopped up by intobject.c. Thus there is no way to just get rid of a few entries without taking out a whole block of objects. ================================= `__slots__ and default values`__ ================================= __ http://mail.python.org/pipermail/python-dev/2003-May/035575.html Ever initialized a variable in a class that uses __slots__? If you have you may have discovered that the variable becomes read-only::

    class Parrot(object):
        __slots__ = ["dead"]
        dead = True

    bought_bird = Parrot()
    bought_bird.dead = False

That raises an AttributeError saying that 'dead' is read-only. This occurs because the class attribute "overrides the descriptor created by __slots__" and "now appears read-only because there is no instance dict" thanks to __slots__ suppressing the creation of one. But don't go using this trick! If you want read-only attributes use a property with its set function set to raise an exception. If you want to avoid this problem just do your initialization of attributes in the __init__ call.
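The fix the summary recommends, initializing the attribute per instance in __init__ instead of on the class, can be sketched like this (the Parrot class is the summary's own example; the __init__ body is an added illustration):

```python
class Parrot(object):
    __slots__ = ["dead"]        # instances get a "dead" slot and no __dict__

    def __init__(self):
        self.dead = True        # stored in the slot, so it stays writable

bought_bird = Parrot()
bought_bird.dead = False        # no AttributeError this time
```

Because the value lives in the slot rather than being shadowed by a class attribute, assignment works normally, and the instances still avoid the memory cost of a per-instance __dict__.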
You can also include __dict__ in __slots__ and then your instances will have a fully functioning instance __dict__ (new in 2.3). The key thing to come away with from this is twofold. One is the resolution order of attribute lookup which is XXX. The other is that __slots__ is meant purely to cut down on memory usage, nothing more. Do not start abusing it with little tricks like the one mentioned above or Guido will pull it from the language. ========= Quickies ========= `Draft of dictnotes.txt`__ After all the work Raymond Hettinger did to try to speed up dictionaries, he wrote a text file documenting everything he tried and learned. .. __: http://mail.python.org/pipermail/python-dev/2003-May/035246.html `_socket efficiencies ideas`__ This thread was first covered in the `last summary`_. Guido discovered that the socket module used to special-case receiving a numeric address in order to skip the overhead of bothering to resolve the IP address. It has been put back into the code. .. __: http://mail.python.org/pipermail/python-dev/2003-May/035248.html `Demos and Tools in binary distributions`__ Jack Jansen asked where other platform-specific binary distributions of Python put the Demo and Tools directories. The thread ended with the winning solution being to put them in /Applications/Python2.3/Extras/ so they are a level below the root directory to prevent newbies from getting overwhelmed by the code there since it is not all simple. .. __: http://mail.python.org/pipermail/python-dev/2003-May/035252.html `updated notes about building bsddb185 module`__ Splinter threads: - `bsddb185 module changes checked in`__ Someone wanted the bsddb185 module back. Initially it was suggested to build that module as bsddb if the new bsddb3 module could not be built (since that module currently gets named bsddb). The final outcome was that bsddb185 will get built under certain conditions and be named bsddb185. ..
__: http://mail.python.org/pipermail/python-dev/2003-May/035257.html
__ http://mail.python.org/pipermail/python-dev/2003-May/035409.html

`broke email date parsing`__
... but it got fixed.

.. __: http://mail.python.org/pipermail/python-dev/2003-May/035259.html

`New thread death in test_bsddb3`__
This is a continuation from the `last summary`_. You can create as many thread states as you like as long as you only use one at any given point.

.. __: http://mail.python.org/pipermail/python-dev/2003-May/035230.html

`removing csv directory from nondist/sandbox - how?`__
Joys of CVS. You can never remove a directory unless you have direct access to the physical directory on the CVS root server. The best you can do is to empty the directory (make sure to get files named ".*") and assume people will do a ``cvs update -dP``. You can also remove the empty directories locally by hand if you like.

.. __: http://mail.python.org/pipermail/python-dev/2003-May/035270.html

`posixmodule.c patch to support forkpty`__
A patch that tries to get os.forkpty to work on more platforms was incorrectly sent to python-dev. It is now up on SourceForge_ and it is `patch #732401 `__.

.. __: http://mail.python.org/pipermail/python-dev/2003-May/035281.html
.. _SourceForge: http://www.sf.net/projects/python

`Timbot?`__
There is a real Timbot robot out there: http://www.cse.ogi.edu/~mpj/timbot/#Programming .

.. __: http://mail.python.org/pipermail/python-dev/2003-May/035287.html

`optparse docs need proofreading`__
What the 'subject' says.

.. __: http://mail.python.org/pipermail/python-dev/2003-May/035288.html

`heaps`__
This is a continuation of a thread from the `last summary`_. Lots of talk about heaps, priority queues, and other theoretical algorithm talk.

.. __: http://mail.python.org/pipermail/python-dev/2003-May/035343.html

Weekly Python Bug/Patch Summary

The first one ended on `2003-05-04 `__. The second one ended on `2003-05-11 `__.
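For readers following the heaps thread above: Python 2.3's new heapq module implements the basic priority-queue operations being discussed. A minimal sketch (the task data here is invented for illustration)::

```python
import heapq

# heapq maintains the list invariant that the smallest item sits at
# index 0; heappush and heappop are O(log n).
tasks = []
heapq.heappush(tasks, (2, "write summary"))
heapq.heappush(tasks, (1, "read python-dev"))
heapq.heappush(tasks, (3, "sleep"))

# Popping yields items in ascending (priority) order.
ordered = [heapq.heappop(tasks) for _ in range(3)]
# ordered == [(1, 'read python-dev'), (2, 'write summary'), (3, 'sleep')]
```

Tuples compare element-by-element, so using (priority, task) pairs gets priority ordering for free.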
`Distutils using apply`__
Since Distutils must be kept backwards-compatible (as stated in `PEP 291`_), it still uses 'apply'. This raises a PendingDeprecationWarning which is normally silent unless you want all warnings raised.

.. __: http://mail.python.org/pipermail/python-dev/2003-May/035293.html
.. _PEP 291: http://www.python.org/peps/pep-0291.html

`How to test this?`__
Dummy files can be checked into Lib/test.

.. __: http://mail.python.org/pipermail/python-dev/2003-May/035318.html

`Windows installer request...`__
Someone wanted the default drive on the Windows installer to be your boot drive and not C. It has been fixed.

.. __: http://mail.python.org/pipermail/python-dev/2003-May/035319.html

`Election of Todd Miller as head of numpy team`__
What the 'subject' says.

.. __: http://mail.python.org/pipermail/python-dev/2003-May/035326.html

`Startup time`__
Guido noticed that although Python 2.3 is already faster than 2.2, its startup time is slower. It looks like it is from failing stat calls. Speeding this all up is still being worked on.

.. __: http://mail.python.org/pipermail/python-dev/2003-May/035359.html

`testing with and without pyc files present`__
Why does ``make test`` delete all .pyc and .pyo files before running the regression tests? To act as a large scale test of the marshaling code.

.. __: http://mail.python.org/pipermail/python-dev/2003-May/035362.html

`pyconfig.h not regenerated by "config.status --recheck"`__
``./config.status --recheck`` doesn't work too well.

.. __: http://mail.python.org/pipermail/python-dev/2003-May/035366.html

`Python Technical Lead, New York, NY - 80-85k`__
Wrong place for a job announcement.

.. __: http://mail.python.org/pipermail/python-dev/2003-May/035369.html

`RedHat 9 _random failure under -pg`__
gcc ain't perfect.

.. __: http://mail.python.org/pipermail/python-dev/2003-May/035386.html

`SF CVS offline`__
... but it came back up.

..
__: http://mail.python.org/pipermail/python-dev/2003-May/035398.html

`Microsoft speedup`__
It was noticed that turning on more aggressive inlining for VC6 sped up pystone by 2.5% while upping the executable size by 13%. Tim Peters noted that "A couple employers ago, we disabled all magical inlining options, because sometimes they made critical loops faster, and sometimes slower, and you couldn't guess which as the code changed".

.. __: http://mail.python.org/pipermail/python-dev/2003-May/035454.html

`Relying on ReST in the core?`__
Although docutils_ is not in the core yet, it is being used more and more. But is this safe? As long as it's kept conservative and not required anywhere, yes.

.. __: http://mail.python.org/pipermail/python-dev/2003-May/035465.html

`Make _strptime only time.strptime implementation?`__
As long as no one complains too loudly by 2.3b2, _strptime.strptime will become the exclusive implementation of time.strptime. _strptime.strptime also learned how to recognize UTC and GMT as timezones.

.. __: http://mail.python.org/pipermail/python-dev/2003-May/035481.html

`Building Python with .NET 2003 SDK`__
Logistix was nice enough to try to build Python on .NET 2003 and post notes on how he did it at http://www.cathoderaymission.net/~logistix/python/buildingPythonWithDotNet.html .

.. __: http://mail.python.org/pipermail/python-dev/2003-May/035485.html

`local import cost`__
Trying to find out how much doing imports in the local namespace costs compared to doing it at the global level.

.. __: http://mail.python.org/pipermail/python-dev/2003-May/035486.html

`Subclassing int?`__
This thread started `two summaries ago `__. Subclassing int to make it mutable just doesn't work.

.. __: http://mail.python.org/pipermail/python-dev/2003-May/035500.html

`patch 718286`__
The patch was applied.

..
__: http://mail.python.org/pipermail/python-dev/2003-May/035538.html

`Need some patches checked`__
Some patches needed to be cleared by more senior members of python-dev since they were being handled by the young newbie of the group. Jeremy Hylton also mentioned that a full-scale refactoring of urllib2 is needed and would allow the closure of some patches.

.. __: http://mail.python.org/pipermail/python-dev/2003-May/035540.html

`os.path.walk() lacks 'depth first' option`__
Splinter threads:

- `os.walk() silently ignores errors`__

This thread started in the `last summary`_. LookupError exists and has both IndexError and KeyError as subclasses. Rather handy when you don't care whether you are dealing with a list or dictionary but do care if what you are looking for doesn't exist. os.walk also gained an argument called onerror that takes a function that will be passed any exception raised by os.walk as it does its thing; previously os.walk ignored all errors.

.. __: http://mail.python.org/pipermail/python-dev/2003-May/035546.html
__ http://mail.python.org/pipermail/python-dev/2003-May/035574.html

`Random SF tracker ettiquete questions`__
Does python-dev care about dealing with RFEs? Sort of; it isn't a priority like patches and bugs, but cleaning them out once in a while doesn't hurt. Is it okay to assign a tracker item to yourself even if it is already assigned to another person? If the original person it was assigned to is not actively working on it, then yes. When should someone be put into the Misc/ACKS file? When they have done anything that required some amount of brain power (yes, this includes one-line patches).

.. __: http://mail.python.org/pipermail/python-dev/2003-May/035549.html

`codeop: small details (Q); commit priv request`__
Some issues with codeop were worked out and Samuele Pedroni got commit privileges.

..
__: http://mail.python.org/pipermail/python-dev/2003-May/035556.html

`Python 2.3b1 _XOPEN_SOURCE value from configure.in`__
Python.h should always be included in extension modules first, before any other header files.

.. __: http://mail.python.org/pipermail/python-dev/2003-May/035560.html

`Inplace multiply`__
Someone thought they had found a bug. Michael Hudson thought it was an old bug that was fixed.

.. __: http://mail.python.org/pipermail/python-dev/2003-May/035591.html

`sf.net/708007: expectlib.py telnetlib.py split`__
A request for people to look at http://www.python.org/sf/708007 .

.. __: http://mail.python.org/pipermail/python-dev/2003-May/035605.html

`Simple dicts`__
Tim Peters suggested that if someone wanted something to do they could try re-implementing dicts to use chaining instead of open addressing. It turns out Damien Morton (who did a ton of work trying to optimize Python's bytecode) is working on an implementation.

.. __: http://mail.python.org/pipermail/python-dev/2003-May/035625.html

`python/dist/src/Lib warnings.py,1.19,1.20`__
As part of the attempts to speed up startup time, the attempted elimination of the required import of the re module came up. This thread brought up the question as to whether it was desired to be able to pass a regexp as an argument for the -W command-line option for Python.

.. __: http://mail.python.org/pipermail/python-dev/2003-May/035616.html

`[PEP] += on return of function call result`__
You can't assign to the return value of a method call.

.. __: http://mail.python.org/pipermail/python-dev/2003-May/035640.html

`Vacation; Python 2.2.3 release.`__
Guido is going on vacation and won't be back until May 26. He would like Python 2.2.3 to be out shortly after he gets back, although if it comes out while he is gone he definitely won't complain.
=) You can get an anonymous CVS checkout of the 2.2 maintenance branch by executing ``cvs -d :pserver:anonymous@cvs.python.sourceforge.net:/cvsroot/python checkout -d -r release22-maint python`` and changing the <> note to be the directory you want to put your CVS copy into. .. __: http://mail.python.org/pipermail/python-dev/2003-May/035642.html `MS VC 7 offer`__ At `Python UK`_ Guido was offered free copies of `Visual C++ 2003`_ by the project lead of VC, Nick Hodapp, for key developers (a free copy of the compiler is available at http://www.msdn.microsoft.com/netframework/downloads/howtoget.aspx ). This instantly led to the discussion of whether Python's binary distribution for Windows should be moved off of VC 6 to 7. The biggest issue is that apparently passing FILE * values across library boundaries breaks code. The final decision seemed to be that Tim, Guido, and developers of major extensions should get free copies. Then an end date of when Python will be moved off of VC 6 and over to 7 will be decided. None of this will affect Python 2.3 . This thread was 102 emails long. I don't use Windows. This was painful. .. __: http://mail.python.org/pipermail/python-dev/2003-May/035375.html .. _Python UK: http://www.python-uk.org/ .. _Visual C++ 2003: http://msdn.microsoft.com/visualc/ From dberlin@dberlin.org Mon May 19 02:56:58 2003 From: dberlin@dberlin.org (Daniel Berlin) Date: Sun, 18 May 2003 21:56:58 -0400 Subject: Using emacs' unexec to speed Python startup (was Re: [Python-Dev] Startup time) In-Reply-To: <20030519012212.GA10317@unpythonic.net> Message-ID: <2D129034-899D-11D7-BB2B-000A95A34564@dberlin.org> On Sunday, May 18, 2003, at 09:22 PM, Jeff Epler wrote: > On Fri, May 16, 2003 at 03:09:31PM -0400, Barry Warsaw wrote: >> Skip, you're going about this all wrong. We already have the >> technology >> to start Python up blazingly fast. All you have to do is port >> XEmacs's unexec code. 
Then you load up Python with all the modules >> you >> think you're going to need, unexec it, then the next time it starts up >> like lightning. Disk space is cheap! > > I gave it a try, starting with 2.3b1 and using FSF Emacs 21.3's > unexelf.c. XEmacs has a portable undumper, IIRC. > An unexec'd binary loads faster than 'python -S -c pass', and seems to > work properly with two exceptions and a few limitations. > > The only change to Python is in main(): I use mallopt() to force all > allocations to go through brk() instead of through mmap(), because > unexec > doesn't support mmap'd memory. I also used Modules/Setup.local to make > some normally-shared modules not shared (for the same reason). > > dump.py loads the requested modules (- forces the module to > *not* > be found) and then calls unexec(), producing a new binary with the > given > name. > > $ time ./python -S -c pass # best 'real' of 5 runs > real 0m0.054s > user 0m0.040s > sys 0m0.010s > $ time ./python -c 'import cgi' # best 'real' of 5 runs > real 0m0.127s > user 0m0.110s > sys 0m0.010s > $ strace -e open ./python -c 'import cgi' 2>&1 | grep -v ENOENT | wc -l > 88 > $ ./python dump.py cgipython -_ssl cgi > $ time ./cgipython -c 'import cgi' # best 'real' of 5 runs > real 0m0.039s > user 0m0.020s > sys 0m0.020s > $ strace -e open ./cgipython -c 'import cgi' 2>&1 | grep -v ENOENT | > wc -l > 9 > $ ./python dump.py dython > -rwxrwxr-x 1 jepler jepler 4983713 May 18 19:42 cgipython > -rwxrwxr-x 1 jepler jepler 3603737 May 18 19:39 python > -rwxrwxr-x 1 jepler jepler 4541345 May 18 19:55 dython > > (a minimal unexec'd python is about 90k bigger than the regular Python > binary) > > I'm running the test suite now .. it hangs in test_signal for some > reason. > test_thread seems to hang too, which may be related. (but > test_threading > completes?) > > $ ./dython Lib/test/regrtest.py -x test_signal -x test_thread > [...] > 225 tests OK. 
> 26 tests skipped: > test_aepack test_al test_bsddb3 test_bz2 test_cd test_cl > test_curses test_email_codecs test_gl test_imgfile > test_linuxaudiodev test_macfs test_macostools test_nis > test_normalization test_ossaudiodev test_pep277 test_plistlib > test_scriptpackages test_socket_ssl test_socketserver > test_sunaudiodev test_timeout test_urllibnet test_winreg > test_winsound > 1 skip unexpected on linux2: > test_bz2 > > Well, if it worked right it'd sure be interesting. OTOH, unexelf.c is > GPL'd and there's also the nightmare of different unex* for different > platforms. > > Like I said, xemacs has a "portable" undumper. --Dan From aahz@pythoncraft.com Mon May 19 02:58:22 2003 From: aahz@pythoncraft.com (Aahz) Date: Sun, 18 May 2003 21:58:22 -0400 Subject: [Python-Dev] Attribute lookup (was Re: python-dev Summary for 2003-05-01 through 2003-05-15) In-Reply-To: <3EC83590.1000306@ocf.berkeley.edu> References: <3EC83590.1000306@ocf.berkeley.edu> Message-ID: <20030519015822.GA10320@panix.com> [Normally I send my corrections to Brett privately, but since I'm taking a whack at attribute lookup, I figured this ought to be public.] On Sun, May 18, 2003, Brett C. wrote: > > The only thing I would like help with this summary is if someone knows > the attribute lookup order (instance, class, class descriptor, ...) off > the top of their heads, can you let me know? If not I can find it out > by going through the docs but I figure someone out there has to know it > by heart and any possible quirks (like whether descriptors take > precedence over non-descriptor attributes). This gets real tricky. For simple attributes of an instance, the order is instance, class/type, and base classes of the class/type (but *not* the metaclass). However, method resolution of the special methods goes straight to the class. 
Finally, if an attribute is found on the instance, a search goes through the hierarchy to see whether a set descriptor overrides (note specifically that it's a set descriptor; methods are implemented using get descriptors). I *think* I have this right, but I'm sure someone will correct me if I'm wrong. > LookupError exists and subclasses both IndexError and KeyError. > Rather handy when you don't care whether you are dealing with a list or > dictionary but do care if what you are looking for doesn't exist. > os.walk also gained a parameter argument called onerror that takes > a function that will be passed any exception raised by os.walk as it > does its thing; previously os.walk ignored all errors. "and has as subclasses" -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "In many ways, it's a dull language, borrowing solid old concepts from many other languages & styles: boring syntax, unsurprising semantics, few automatic coercions, etc etc. But that's one of the things I like about it." --Tim Peters on Python, 16 Sep 93 From greg@cosc.canterbury.ac.nz Mon May 19 03:05:16 2003 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Mon, 19 May 2003 14:05:16 +1200 (NZST) Subject: Unifying modules and classes? (Re: [Python-Dev] a strange case) In-Reply-To: <200305161345.25415.troy@gci.net> Message-ID: <200305190205.h4J25GJ27449@oma.cosc.canterbury.ac.nz> Troy Melhase : > Just last night, it occurred to me that modules could be made callable via > subclassing. "Why in the world would you want callable modules you ask?" This has given me a thought concerning the naming problem that arises when you have a module (e.g. socket) that exists mainly to hold a single class. What if there were some easy way to make the class and the module the same thing? 
I'm thinking about having an alternative filename suffix, such as ".cls", whose contents is treated as though it were inside a class statement, and then the resulting class is put into sys.modules as though it were a module. Not sure how you'd specify base classes -- maybe a special __bases__ class attribute or something. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From pje@telecommunity.com Mon May 19 03:18:43 2003 From: pje@telecommunity.com (Phillip J. Eby) Date: Sun, 18 May 2003 22:18:43 -0400 Subject: [Python-Dev] Attribute lookup (was Re: python-dev Summary for 2003-05-01 through 2003-05-15) In-Reply-To: <20030519015822.GA10320@panix.com> References: <3EC83590.1000306@ocf.berkeley.edu> <3EC83590.1000306@ocf.berkeley.edu> Message-ID: <5.1.0.14.0.20030518220635.01f09ce0@mail.telecommunity.com> At 09:58 PM 5/18/03 -0400, Aahz wrote: >[Normally I send my corrections to Brett privately, but since I'm taking >a whack at attribute lookup, I figured this ought to be public.] > >On Sun, May 18, 2003, Brett C. wrote: > > > > The only thing I would like help with this summary is if someone knows > > the attribute lookup order (instance, class, class descriptor, ...) off > > the top of their heads, can you let me know? If not I can find it out > > by going through the docs but I figure someone out there has to know it > > by heart and any possible quirks (like whether descriptors take > > precedence over non-descriptor attributes). > >This gets real tricky. For simple attributes of an instance, the order >is instance, class/type, and base classes of the class/type (but *not* >the metaclass). However, method resolution of the special methods goes >straight to the class. 
Finally, if an attribute is found on the >instance, a search goes through the hierarchy to see whether a set >descriptor overrides (note specifically that it's a set descriptor; >methods are implemented using get descriptors). > >I *think* I have this right, but I'm sure someone will correct me if I'm >wrong. Here's the algorithm in a bit more detail: 1. First, the class/type and its bases are searched, checking dictionaries only. 2. If the object found is a "data descriptor" (i.e. has a type with a non-null tp_descr_set pointer, which is closely akin to whether the descriptor has a '__set__' attribute), then the data descriptor's __get__ method is invoked. 3. If the object is not found, or not a data descriptor, the instance dictionary is checked. If the attribute isn't in the instance dictionary, then the descriptor's __get__ method is invoked (assuming a descriptor was found). 4. Invoke __getattr__ if present. (Note that replacing __getattribute__ *replaces* this entire algorithm.) Also note that special methods are *not* handled specially here. The behavior Aahz is referring to is that slots (e.g. tp_call) on new-style types do not retrieve an instance attribute; they are based purely on class-level data. So, although you *can* override the values in an instance, they have no effect on the class behavior. E.g.:

>>> class Foo(object):
...     def __call__(self,*args):
...         print "foo",args
...
>>> f=Foo()
>>> f.__call__ = 'spam'
>>> f.__call__
'spam'
>>> f()
foo ()
>>>

Notice that the behavior of the instance '__call__' attribute does not affect the class-level definition of '__call__'. To recast the algorithm as a precedence order: 1. Data descriptors (ones with tp_descr_set/__set__) found in the type __mro__ (note that this includes __slots__, property(), and custom descriptors) 2. Instance attributes found in ob.__dict__ 3. Non-data descriptors, such as methods, or any other object found in the type __mro__ under that name 4. 
__getattr__ From jepler@unpythonic.net Mon May 19 03:46:18 2003 From: jepler@unpythonic.net (Jeff Epler) Date: Sun, 18 May 2003 21:46:18 -0500 Subject: Using emacs' unexec to speed Python startup (was Re: [Python-Dev] Startup time) In-Reply-To: <20030519012212.GA10317@unpythonic.net> References: <1052927757.7258.38.camel@slothrop.zope.com> <3EC51B5C.2080307@lemburg.com> <16069.13640.892428.185711@montanaro.dyndns.org> <1053112171.2342.7.camel@barry> <20030519012212.GA10317@unpythonic.net> Message-ID: <20030519024618.GB10317@unpythonic.net> On Sun, May 18, 2003 at 08:22:14PM -0500, Jeff Epler wrote: > I'm running the test suite now .. it hangs in test_signal for some reason. > test_thread seems to hang too, which may be related. (but test_threading > completes?) If I make another change, to call PyOS_InitInterrupts just after Py_Initialize in Modules/main.c, these two tests pass. Py_Initialize believes it's already initialized so returns without doing anything. But unexec doesn't preserve signal handlers, so this must be re-done explicitly. Jeff From barry@wooz.org Mon May 19 04:28:08 2003 From: barry@wooz.org (Barry Warsaw) Date: Sun, 18 May 2003 23:28:08 -0400 Subject: Using emacs' unexec to speed Python startup (was Re: [Python-Dev] Startup time) In-Reply-To: <20030519024618.GB10317@unpythonic.net> Message-ID: On Sunday, May 18, 2003, at 10:46 PM, Jeff Epler wrote: > On Sun, May 18, 2003 at 08:22:14PM -0500, Jeff Epler wrote: >> I'm running the test suite now .. it hangs in test_signal for some >> reason. >> test_thread seems to hang too, which may be related. (but >> test_threading >> completes?) > > If I make another change, to call PyOS_InitInterrupts just after > Py_Initialize in Modules/main.c, these two tests pass. > > Py_Initialize believes it's already initialized so returns without > doing > anything. But unexec doesn't preserve signal handlers, so this must be > re-done explicitly. 
Y'know, I wrote that as a joke, and it's quite FAST that you've taken it and made it real. Very cool too, congrats! Since it looks like you implemented the meat of it as a module, I wonder if it couldn't be cleaned up (with the interrupt reset either pulled into the extension or exposed to Python) and added to Python 2.3? -Barry From barry@python.org Mon May 19 04:57:00 2003 From: barry@python.org (Barry Warsaw) Date: Sun, 18 May 2003 23:57:00 -0400 Subject: [Python-Dev] test_bsddb185 failing under OS X In-Reply-To: <16070.45480.640314.144944@montanaro.dyndns.org> Message-ID: Yee haw! All expected tests pass for me w/ Python 2.3cvs on OSX 10.2.6. Gonna try Python 2.2.3 next. -Barry From barry@python.org Mon May 19 08:04:02 2003 From: barry@python.org (Barry Warsaw) Date: Mon, 19 May 2003 03:04:02 -0400 Subject: [Python-Dev] test_bsddb185 failing under OS X In-Reply-To: Message-ID: <12146C5B-89C8-11D7-B165-003065EEFAC8@python.org> On Sunday, May 18, 2003, at 11:57 PM, Barry Warsaw wrote: > Yee haw! All expected tests pass for me w/ Python 2.3cvs on OSX > 10.2.6. > > Gonna try Python 2.2.3 next. Looks good. -Barry From dmorton@bitfurnace.com Mon May 19 09:32:32 2003 From: dmorton@bitfurnace.com (damien morton) Date: Mon, 19 May 2003 04:32:32 -0400 Subject: [Python-Dev] Simple dicts In-Reply-To: Message-ID: <000001c31de1$322a0e90$6401a8c0@damien> Well, I implemented the chained dict, and against stock 2.3b1 I am seeing about a 4-5% speedup on pystone, and about a 10-15% speedup on a simplistic largedict benchmark which inserts 200K strings, retrieves them once, and then removes them one at a time. Suggestions for more appropriate benchmarks are, as usual, always welcome. (raymond - if you have a suite of benchmarks specifically for dicts, I would love to have access to them). Move-to-front chains are implemented, and a benchmark that exercised skewed access patterns would be great. Memory usage is more than the current implementation, but is highly tunable. 
You can adjust the ratio of dictentry structs to first pointers. My simple largedict benchmark performed best with an 8:16 ratio, while the pystone benchmark performed best with a 6:16 ratio. Moving up to a 100:16 ratio on the largedict benchmark adversely affected performance by about 10%. It may pay off to schedule the sparsity ratio according to the size of the dict. Also, because performance and memory usage varies roughly linearly with sparsity, it may be a less dangerous candidate for being user settable. Where fail-fast characteristics are required, sparsity may be highly desirable. I need to address Tim's concerns about the poor hash function used for python integers, but I think this can be addressed easily enough. I would welcome some guidance about what hash functions need to be addressed though. Is it just integers? (there's a great article on integer hash functions at www.cris.com/~Ttwang/tech/inthash.htm) If anyone wants to try out the code, please download www.bitfurnace.com/python/dict.zip I'm still trying to get the above code to pass the regression tests. Most things go smoothly, but some tests throw out this kind of error: "unknown scope for self in test_len(103) in C:\Documents and Settings\Administrator\Desktop\python\Python-2.3b1\lib\test\test_builtin.py symbols: {} locals: {} globals: {} " Still trying to track down the source of this error. No idea why symbols, locals and globals would all be empty at this point though. Comments, suggestions, etc welcome. 
- Damien Morton From lkcl@samba-tng.org Mon May 19 10:08:11 2003 From: lkcl@samba-tng.org (Luke Kenneth Casson Leighton) Date: Mon, 19 May 2003 09:08:11 +0000 Subject: [Python-Dev] [PEP] += on return of function call result In-Reply-To: <20030517152137.GA25579@unpythonic.net> References: <20030402090726.GN1048@localhost> <20030515214417.GF3900@localhost> <20030516142451.GI6196@localhost> <20030517152137.GA25579@unpythonic.net> Message-ID: <20030519090811.GB737@localhost> hiya jeff, on radio 4 today there was a discussion about art - what makes people go "wow" instead of being shocked. seeing the byte code in front of my eyes isn't so much of a shock, more of a "wow" because i have at some point in my past actually _looked_ at the python sources stack machine, for investigating parallelising it (!!!!!) okay. how do i run the examples you list? dis.dis(f) gives an "unrecognised variablename dis". okay. let's give this a shot. Script started on Mon May 19 08:44:19 2003 lkcl@highfield:~$ python O Python 2.2.2 (#1, Jan 18 2003, 10:18:59) [GCC 3.2.2 20030109 (Debian prerelease)] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> import dis >>> def g(): return f(x) ... >>> dis.dis(g) 0 LOAD_GLOBAL 0 (f) 3 LOAD_GLOBAL 1 (x) 6 CALL_FUNCTION 1 9 RETURN_VALUE 10 LOAD_CONST 0 (None) 13 RETURN_VALUE >>> def g(): f(x) ... >>> dis.dis(g) 0 LOAD_GLOBAL 0 (f) 3 LOAD_GLOBAL 1 (x) 6 CALL_FUNCTION 1 9 POP_TOP 10 LOAD_CONST 0 (None) 13 RETURN_VALUE >>> lkcl@highfield:~$ exit Script done on Mon May 19 08:44:56 2003 right. the difference between these two is the POP_TOP. so, the return result is placed on the stack, from the call to f(x). so... if there's instead an f(x) += 1 instead of f(x), then the result is going to be pushed onto the top of the stack, followed by the += 1, followed at the end by a POP_TOP. if the result is used (e.g. assigned to a variable), x = f(x) += 1 then you don't do the POP_TOP. ... am i missing something? what am i missing? 
that it's not known what type of variable is returned, therefore you're not certain as to what type of STORE to use? Script started on Mon May 19 08:51:22 2003 lkcl@highfield:~$ python -O Python 2.2.2 (#1, Jan 18 2003, 10:18:59) [GCC 3.2.2 20030109 (Debian prerelease)] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> def f(): return 5 ... >>> def g(): ... x = f() + 1 ... return x ... >>> import dis >>> dis.dis(g) 0 LOAD_GLOBAL 0 (f) 3 CALL_FUNCTION 0 6 LOAD_CONST 1 (1) 9 BINARY_ADD 10 STORE_FAST 0 (x) 13 LOAD_FAST 0 (x) 16 RETURN_VALUE 17 LOAD_CONST 0 (None) 20 RETURN_VALUE >>> lkcl@highfield:~$ Script done on Mon May 19 08:52:40 2003 okay... soo.... you get an assignment into a variable... ... okay, i think i see what the problem is. because the return result _may_ not be used, you don't know what type of STORE to use? or, because there are optimisations added, it's not always possible to "pass down" the right kind of STORE_xxx to the previous stack level? i believe you may be thinking that this is more complex than it is. that's very patronising of me. scratch that. i believe this should not be complex :) "+=" itself is a function call with two arguments and a return result, where the return result is the first argument. it just _happens_ that that function call has been drastically optimised - with its RETURN_VALUE removed; STORE_xxx removed. more thought needed. i'll go look at some code. l. p.s. 10 and 13 in the 8:52:40am typescript above look like they could be optimised / removed. p.p.s. yes i _have_ written a stack-machine optimiser before. On Sat, May 17, 2003 at 10:21:39AM -0500, Jeff Epler wrote: > I think that looking at the generated bytecode is useful. 
> > # Running with 'python -O' > >>> def f(x): x += 1 > >>> dis.dis(f) > 0 LOAD_FAST 0 (x) > 3 LOAD_CONST 1 (1) > 6 INPLACE_ADD > 7 STORE_FAST 0 (x) *** > 10 LOAD_CONST 0 (None) > 13 RETURN_VALUE > >>> def g(x): x[0] += 1 > >>> dis.dis(g) > 0 LOAD_GLOBAL 0 (x) > 3 LOAD_CONST 1 (0) > 6 DUP_TOPX 2 > 9 BINARY_SUBSCR > 10 LOAD_CONST 2 (1) > 13 INPLACE_ADD > 14 ROT_THREE > 15 STORE_SUBSCR *** > 16 LOAD_CONST 0 (None) > 19 RETURN_VALUE > >>> def h(x): x.a += 1 > >>> dis.dis(h) > 0 LOAD_GLOBAL 0 (x) > 3 DUP_TOP > 4 LOAD_ATTR 1 (a) > 7 LOAD_CONST 1 (1) > 10 INPLACE_ADD > 11 ROT_TWO > 12 STORE_ATTR 1 (a) *** > 15 LOAD_CONST 0 (None) > 18 RETURN_VALUE > > In each case, there's a STORE step to the inplace statement. In the case of the proposed > def j(x): x() += 1 > what STORE instruction would you use? > > >>> [opname for opname in dis.opname if opname.startswith("STORE")] > ['STORE_SLICE+0', 'STORE_SLICE+1', 'STORE_SLICE+2', 'STORE_SLICE+3', > 'STORE_SUBSCR', 'STORE_NAME', 'STORE_ATTR', 'STORE_GLOBAL', 'STORE_FAST', > 'STORE_DEREF'] > > If you don't want one from the list, then you're looking at substantial > changes to Python.. (and STORE_DEREF probably doesn't do anything that's > relevant to this situation, though the name sure sounds promising, > doesn't it) > > Jeff -- -- expecting email to be received and understood is a bit like picking up the telephone and immediately dialing without checking for a dial-tone; speaking immediately without listening for either an answer or ring-tone; hanging up immediately and then expecting someone to call you (and to be able to call you). -- every day, people send out email expecting it to be received without being tampered with, read by other people, delayed or simply - without prejudice but lots of incompetence - destroyed. 
-- please therefore treat email more like you would a CB radio to communicate across the world (via relaying stations): ask and expect people to confirm receipt; send nothing that you don't mind everyone in the world knowing about... From Paul.Moore@atosorigin.com Mon May 19 10:36:27 2003 From: Paul.Moore@atosorigin.com (Moore, Paul) Date: Mon, 19 May 2003 10:36:27 +0100 Subject: [Python-Dev] Re: C new-style classes and GC Message-ID: <16E1010E4581B049ABC51D4975CEDB880113DB01@UKDCX001.uk.int.atosorigin.com> From: Jim Fulton [mailto:jim@zope.com] > You can read the documentation for it here: > http://www.python.org/dev/doc/devel/ext/defining-new-types.html Just looking at this, I note the "Note" at the top. The way this reads, it implies that details of how things used to work have been removed. I don't know if this is true, but I'd prefer if it wasn't. People upgrading their extensions would find the older information useful (actually, an "Upgrading from the older API" section would be even nicer, but that involves more work...) Having to refer to an older copy of the documentation (which they may not even have installed) could tip the balance between "let's keep up to date" and "if it works, don't fix it". Heck, I still have some code I wrote for the 1.4 API which still works. I've never got round to upgrading it, on the basis that someone might be using it with 1.5 still. 
From jim@zope.com Mon May 19 11:30:04 2003 From: jim@zope.com (Jim Fulton) Date: Mon, 19 May 2003 06:30:04 -0400 Subject: [Python-Dev] Re: C new-style classes and GC In-Reply-To: <16E1010E4581B049ABC51D4975CEDB880113DB01@UKDCX001.uk.int.atosorigin.com> References: <16E1010E4581B049ABC51D4975CEDB880113DB01@UKDCX001.uk.int.atosorigin.com> Message-ID: <3EC8B22C.2070109@zope.com> Moore, Paul wrote: > From: Jim Fulton [mailto:jim@zope.com] > >>You can read the documentation for it here: > > >>http://www.python.org/dev/doc/devel/ext/defining-new-types.html > > > Just looking at this, I note the "Note" at the top. The way > this reads, it implies that details of how things used to work > has been removed. I don't know if this is true, but I'd prefer > if it wasn't. The section has been rewritten. The examples are quite different than they used to be. There's no way to document the old and new ways together without: - Making this a lot more confusing, and - violating the "one way to do it" in Python rule. > People upgrading their extensions would find the older > information useful (actually, an "Upgrading from the older API" > section would be even nicer, but that involves more work...) > Having to refer to an older copy of the documentation (which they > may not even have installed) could tip the balance between "lets > keep up to date" and "if it works, don't fix it". In general, I'd say that if the old extensions aren't broke, don't fix them. If someone *is* going to go through the trouble to update them, then I think they can manage to get the old docs. Further, if you have written an old extension, you probably already know the old way to define types, so you don't need the old docs. > Heck, I still have some code I wrote for the 1.4 API which still > works. I've never got round to upgrading it, on the basis that > someone might be using it with 1.5 still. 
But when I do, I'd > dump pre-2.2 support, so *I* have no use for "older" documentation > except to find out what all that old code meant... :-) > > If the old information is still there, maybe it's just the tone > of the note that should be changed. The old information is not still there. I'm not gonna add it back, because it would make the document far more confusing. Jim -- Jim Fulton mailto:jim@zope.com Python Powered! CTO (703) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From dmorton@bitfurnace.com Mon May 19 12:29:56 2003 From: dmorton@bitfurnace.com (damien morton) Date: Mon, 19 May 2003 07:29:56 -0400 Subject: [Python-Dev] Simple dicts In-Reply-To: Message-ID: <000201c31df9$f9e39650$6401a8c0@damien> Not so simple after all, or maybe too simple. My 'simple' largedict test was simplistic and flawed, and a more thorough test shows a slowdown on large dicts. I was inserting, accessing, and deleting the keys without randomising the order, and once randomised, cache effects kicked in. The slowdown isn't too huge though. Further testing against small dicts shows a much larger slowdown. The 5% improvement in pystone results still stands, but I think the main reason for the improvement is that I had inlined some fail-fast tests into ceval.c. Oh well, back to the drawing board. From jepler@unpythonic.net Mon May 19 13:06:36 2003 From: jepler@unpythonic.net (Jeff Epler) Date: Mon, 19 May 2003 07:06:36 -0500 Subject: Using emacs' unexec to speed Python startup (was Re: [Python-Dev] Startup time) In-Reply-To: References: <20030519024618.GB10317@unpythonic.net> Message-ID: <20030519120633.GA12073@unpythonic.net> On Sun, May 18, 2003 at 11:28:08PM -0400, Barry Warsaw wrote: > Since it looks like you implemented the meat of it as a module, I > wonder if it couldn't be cleaned up (with the interrupt reset either > pulled in the extension or exposed to Python) and added to Python 2.3? 
First off, I sure doubt that this feature could be truly made "non-experimental" before 2.3 is released. There was one "strange bug" so far (the signal thing), though that was quickly solved (with another change to the core Python source code). Secondly, forcing all allocations to come from the heap instead of mmap'd space may hurt performance. Thirdly, the files implementing unexec itself, which come from fsf emacs, are covered by the GNU GPL, which I think makes them unsuitable for compiling into Python. (There's something called "dynodump" in Emacs that appears to apply to ELF binaries which bears this license: * This source code is a product of Sun Microsystems, Inc. and is provided * for unrestricted use provided that this legend is included on all tape * media and as a part of the software program in whole or part. Users * may copy or modify this source code without charge, but are not authorized * to license or distribute it to anyone else except as part of a product or * program developed by the user. I wish I understood what "except as part of a product or program developed by the user" meant--does that mean that Alice can't download Python then give it to Bob if it includes dynodump? After all, Alice didn't develop it, she simply downloaded it. 
The other dumpers in xemacs seem to be GPL, and I think that the "portable undump" mentioned by another poster is a placeholder for a project that isn't written yet: http://www.xemacs.org/Architecting-XEmacs/unexec.html) Fourthly, we'd have to duplicate whatever machinery chooses the correct unexec implementation for the platform you're running on---there are lots to choose from: unexaix.c unexconvex.c unexenix.c unexnext.c unexw32.c unexalpha.c unexec.c unexhp9k800.c unexsni.c unexapollo.c unexelf.c unexmips.c unexsunos4.c (Of course, it's well known that only elf and win32 matter in these modern times) I'd be excited to see "my work" in Python, though the fact of the matter is that I just tried this out because I was bored on a Sunday afternoon. Jeff From lkcl@samba-tng.org Mon May 19 13:53:17 2003 From: lkcl@samba-tng.org (Luke Kenneth Casson Leighton) Date: Mon, 19 May 2003 12:53:17 +0000 Subject: [Python-Dev] [PEP] += on return of function call result In-Reply-To: <20030517152137.GA25579@unpythonic.net> References: <20030402090726.GN1048@localhost> <20030515214417.GF3900@localhost> <20030516142451.GI6196@localhost> <20030517152137.GA25579@unpythonic.net> Message-ID: <20030519125317.GC737@localhost> jeff, Beat Bolli's code example: count[word] = count.get(word, 0) + 1 i think best illustrates what issue you are trying to raise. okay, we know there are two issues so let's give an example that removes one of those issues: count = {} count[word] = count.get(word, []) + ['hello'] the issue is that the difference between the above 'hello' example and this: count.get(word, []) += ['hello'] is that you don't know what STORE to use after the use of get() in the second example, but you do in the first example because it's explicitly set out. so, does this help illustrate what might be done? 
if it's possible to return a result and know what should be done with it, then surely it should be possible to return a result from a += "function" and know what should be done with it? l. On Sat, May 17, 2003 at 10:21:39AM -0500, Jeff Epler wrote: > I think that looking at the generated bytecode is useful. > > # Running with 'python -O' > >>> def f(x): x += 1 > >>> dis.dis(f) > 0 LOAD_FAST 0 (x) > 3 LOAD_CONST 1 (1) > 6 INPLACE_ADD > 7 STORE_FAST 0 (x) *** > 10 LOAD_CONST 0 (None) > 13 RETURN_VALUE > >>> def g(x): x[0] += 1 > >>> dis.dis(g) > 0 LOAD_GLOBAL 0 (x) > 3 LOAD_CONST 1 (0) > 6 DUP_TOPX 2 > 9 BINARY_SUBSCR > 10 LOAD_CONST 2 (1) > 13 INPLACE_ADD > 14 ROT_THREE > 15 STORE_SUBSCR *** > 16 LOAD_CONST 0 (None) > 19 RETURN_VALUE > >>> def h(x): x.a += 1 > >>> dis.dis(h) > 0 LOAD_GLOBAL 0 (x) > 3 DUP_TOP > 4 LOAD_ATTR 1 (a) > 7 LOAD_CONST 1 (1) > 10 INPLACE_ADD > 11 ROT_TWO > 12 STORE_ATTR 1 (a) *** > 15 LOAD_CONST 0 (None) > 18 RETURN_VALUE > > In each case, there's a STORE step to the inplace statement. In the case of the proposed > def j(x): x() += 1 > what STORE instruction would you use? > > >>> [opname for opname in dis.opname if opname.startswith("STORE")] > ['STORE_SLICE+0', 'STORE_SLICE+1', 'STORE_SLICE+2', 'STORE_SLICE+3', > 'STORE_SUBSCR', 'STORE_NAME', 'STORE_ATTR', 'STORE_GLOBAL', 'STORE_FAST', > 'STORE_DEREF'] > > If you don't want one from the list, then you're looking at substantial > changes to Python.. (and STORE_DEREF probably doesn't do anything that's > relevant to this situation, though the name sure sounds promising, > doesn't it) > > Jeff -- -- expecting email to be received and understood is a bit like picking up the telephone and immediately dialing without checking for a dial-tone; speaking immediately without listening for either an answer or ring-tone; hanging up immediately and then expecting someone to call you (and to be able to call you). 
-- every day, people send out email expecting it to be received without being tampered with, read by other people, delayed or simply - without prejudice but lots of incompetence - destroyed. -- please therefore treat email more like you would a CB radio to communicate across the world (via relaying stations): ask and expect people to confirm receipt; send nothing that you don't mind everyone in the world knowing about... From barry@python.org Mon May 19 14:09:59 2003 From: barry@python.org (Barry Warsaw) Date: Mon, 19 May 2003 09:09:59 -0400 Subject: Using emacs' unexec to speed Python startup (was Re: [Python-Dev] Startup time) In-Reply-To: <20030519120633.GA12073@unpythonic.net> Message-ID: <31A13D92-89FB-11D7-B165-003065EEFAC8@python.org> On Monday, May 19, 2003, at 08:06 AM, Jeff Epler wrote: > First off, I sure doubt that this feature could be truly made > "non-experimental" before 2.3 is released. There was one "strange > bug" so > far (the signal thing), though that was quickly solved (with another > change to the core Python source code). Yeah, I was just tired and rambling after a long weekend. :) Still, cool stuff! -Barry From skip@pobox.com Mon May 19 15:24:54 2003 From: skip@pobox.com (Skip Montanaro) Date: Mon, 19 May 2003 09:24:54 -0500 Subject: [Python-Dev] Simple dicts In-Reply-To: <000001c31de1$322a0e90$6401a8c0@damien> References: <000001c31de1$322a0e90$6401a8c0@damien> Message-ID: <16072.59702.946136.830167@montanaro.dyndns.org> damien> Suggestions for more appropriate benchmarks are, as usual, damien> always welcome. There's always Marc André Lemburg's pybench package. 
Skip From skip@pobox.com Mon May 19 15:40:21 2003 From: skip@pobox.com (Skip Montanaro) Date: Mon, 19 May 2003 09:40:21 -0500 Subject: [Python-Dev] Re: Using emacs' unexec to speed Python startup (was Re: [Python-Dev] Startup time) In-Reply-To: <31A13D92-89FB-11D7-B165-003065EEFAC8@python.org> References: <20030519120633.GA12073@unpythonic.net> <31A13D92-89FB-11D7-B165-003065EEFAC8@python.org> Message-ID: <16072.60629.593925.7052@montanaro.dyndns.org> >> First off, I sure doubt that this feature could be truly made >> "non-experimental" before 2.3 is released. There was one "strange >> bug" so far (the signal thing), though that was quickly solved (with >> another change to the core Python source code). Barry> Yeah, I was just tired and rambling after a long weekend. :) On the other hand, I think it would be nice to check it into the sandbox if it's not already there. If licensing is an issue, just include a README file which says, "Get thus-and-such from a recent Emacs (or XEmacs?) distribution." Skip From lkcl@samba-tng.org Mon May 19 15:57:55 2003 From: lkcl@samba-tng.org (Luke Kenneth Casson Leighton) Date: Mon, 19 May 2003 14:57:55 +0000 Subject: [Python-Dev] [debian build error] Message-ID: <20030519145755.GB25000@localhost> there is at present a problem with python2.2 on debian, unstable dist. there are dependency issues. gcc 3.3 is now the latest for unstable. gcc 3.3 contains a package libstdc++-5. python2.2 is compiled with gcc 3.2. installing the latest libstdc++-5, which is compiled with gcc 3.3, causes python2.2 to complain: /usr/lib/libgcc1_s.so.1 cannot find GCC_3.3 in libstdc++-5. i thought you should know. l. p.s. it's not the only program affected by the broken libstdc++-5. 
-- -- expecting email to be received and understood is a bit like picking up the telephone and immediately dialing without checking for a dial-tone; speaking immediately without listening for either an answer or ring-tone; hanging up immediately and then expecting someone to call you (and to be able to call you). -- every day, people send out email expecting it to be received without being tampered with, read by other people, delayed or simply - without prejudice but lots of incompetence - destroyed. -- please therefore treat email more like you would a CB radio to communicate across the world (via relaying stations): ask and expect people to confirm receipt; send nothing that you don't mind everyone in the world knowing about... From skip@pobox.com Mon May 19 16:16:50 2003 From: skip@pobox.com (Skip Montanaro) Date: Mon, 19 May 2003 10:16:50 -0500 Subject: [Python-Dev] [debian build error] In-Reply-To: <20030519145755.GB25000@localhost> References: <20030519145755.GB25000@localhost> Message-ID: <16072.62818.314237.459419@montanaro.dyndns.org> Luke> gcc 3.3 is now the latest for unstable. Luke> gcc 3.3 contains a package libstdc++-5. Luke> python2.2 is compiled with gcc 3.2. Luke> installing the latest libstdc++-5, which is compiled with gcc 3.3, Luke> causes python2.2 to complain: Luke> /usr/lib/libgcc1_s.so.1 cannot find GCC_3.3 in libstdc++-5. Is python2.2 compiled by you from source or is it a Debian-provided package? If it was provided by Debian I think they'll have to be the ones to solve the problem. Skip From Jack.Jansen@cwi.nl Mon May 19 16:20:09 2003 From: Jack.Jansen@cwi.nl (Jack Jansen) Date: Mon, 19 May 2003 17:20:09 +0200 Subject: [Python-Dev] Vacation; Python 2.2.3 release. In-Reply-To: <200305160032.h4G0WJx17890@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <6097E7A8-8A0D-11D7-9DD7-0030655234CE@cwi.nl> I seem to remember that there's one or two bugfixes assigned to me that I thought fairly important for 2.2.3 at the time. 
Unfortunately sf.net is down at the moment, so I can't check this, and I don't remember whether they were OSX-related (so they have to go into the main release) or OS9 only (so they needn't hold up the main release). I'll try to get around to these tomorrow. -- Jack Jansen, , http://www.cwi.nl/~jack If I can't dance I don't want to be part of your revolution -- Emma Goldman From jepler@unpythonic.net Mon May 19 16:24:01 2003 From: jepler@unpythonic.net (Jeff Epler) Date: Mon, 19 May 2003 10:24:01 -0500 Subject: Using emacs' unexec to speed Python startup (was Re: [Python-Dev] Startup time) In-Reply-To: <16072.60629.593925.7052@montanaro.dyndns.org> References: <20030519120633.GA12073@unpythonic.net> <31A13D92-89FB-11D7-B165-003065EEFAC8@python.org> <16072.60629.593925.7052@montanaro.dyndns.org> Message-ID: <20030519152359.GA13673@unpythonic.net> On Mon, May 19, 2003 at 09:40:21AM -0500, Skip Montanaro wrote: > On the other hand, I think it would be nice to check it into the sandbox if > it's not already there. If licensing is an issue, just include a README > file which says, "Get thus-and-such from a recent Emacs (or XEmacs?) > distribution." Sure, I think that could be a good idea. How should the changes to core python be included? Making 'import site' happen when loading a dumped binary required another change. I could easily produce a diff for them. Jeff From skip@pobox.com Mon May 19 16:39:33 2003 From: skip@pobox.com (Skip Montanaro) Date: Mon, 19 May 2003 10:39:33 -0500 Subject: [Python-Dev] Re: Using emacs' unexec to speed Python startup (was Re: [Python-Dev] Startup time) In-Reply-To: <20030519152359.GA13673@unpythonic.net> References: <20030519120633.GA12073@unpythonic.net> <31A13D92-89FB-11D7-B165-003065EEFAC8@python.org> <16072.60629.593925.7052@montanaro.dyndns.org> <20030519152359.GA13673@unpythonic.net> Message-ID: <16072.64181.575078.727768@montanaro.dyndns.org> Jeff> How should the changes to core python be included? 
Making 'import Jeff> site' happen when loading a dumped binary required another change. Jeff> I could easily produce a diff for them. For now a context diff will probably work. Slightly longer term, if the changes look promising but are somehow incompatible with other stuff (like your mallopt call) I think they should be conditionally compiled and an --enable-unexec flag added to configure. (I assume none of this stuff will work on Windows.) Skip From jepler@unpythonic.net Mon May 19 16:48:48 2003 From: jepler@unpythonic.net (Jeff Epler) Date: Mon, 19 May 2003 10:48:48 -0500 Subject: Using emacs' unexec to speed Python startup (was Re: [Python-Dev] Startup time) In-Reply-To: <16072.64181.575078.727768@montanaro.dyndns.org> References: <20030519120633.GA12073@unpythonic.net> <31A13D92-89FB-11D7-B165-003065EEFAC8@python.org> <16072.60629.593925.7052@montanaro.dyndns.org> <20030519152359.GA13673@unpythonic.net> <16072.64181.575078.727768@montanaro.dyndns.org> Message-ID: <20030519154848.GD13673@unpythonic.net> On Mon, May 19, 2003 at 10:39:33AM -0500, Skip Montanaro wrote: > I assume none of this stuff will work on Windows. there *is* a "unexnt.c" in xemacs, and "unexw32.c" in emacs. I don't have the ability to try them, but in theory they would work in the same way. jeff From pedronis@bluewin.ch Mon May 19 17:07:08 2003 From: pedronis@bluewin.ch (Samuele Pedroni) Date: Mon, 19 May 2003 18:07:08 +0200 Subject: [Python-Dev] [PEP] += on return of function call result In-Reply-To: <20030519125317.GC737@localhost> References: <20030517152137.GA25579@unpythonic.net> <20030402090726.GN1048@localhost> <20030515214417.GF3900@localhost> <20030516142451.GI6196@localhost> <20030517152137.GA25579@unpythonic.net> Message-ID: <5.2.1.1.0.20030519180035.0242bcd0@localhost> At 12:53 19.05.2003 +0000, Luke Kenneth Casson Leighton wrote: >jeff, > >beat bolli's code example: > > count[word] = count.get(word, 0) + 1 > >i think best illustrates what issue you are trying to raise. 
> >okay, we know there are two issues so let's give an example >that removes one of those issues: > > count = {} > > count[word] = count.get(word, []) + ['hello'] > >the issue is that the difference between the above 'hello' >example and this: > > count.get(word, []) += ['hello'] > >is that you don't know what STORE to use after the use of get() >in the second example, but you do in the first example because >it's explicity set out. > >so, does this help illustrate what might be done? > >if it's possible to return a result and know what should be done >with it, then surely it should be possible to return a result from >a += "function" and know what should be done with it? > >l. >>> def refiadd(r,v): # r+=v, r is a reference, not an lvalue ... if hasattr(r.__class__,'__iadd__'): ... r.__class__.__iadd__(r,v) ... else: ... raise ValueError,"non-sense" ... >>> greetings={} >>> refiadd(greetings.setdefault('susy',[]),['hello']) # greetings.setdefault('susy',[]) += ['hello'] >>> refiadd(greetings.setdefault('susy',[]),['!']) # greetings.setdefault('susy',[]) += ['!'] >>> greetings {'susy': ['hello', '!']} >>> refiadd(greetings.setdefault('betty',1),1) # greetings.setdefault('betty',1) += 1 Traceback (most recent call last): File "<stdin>", line 1, in ? File "<stdin>", line 5, in refiadd ValueError: non-sense regards. 
From lkcl@samba-tng.org Mon May 19 17:31:20 2003 From: lkcl@samba-tng.org (Luke Kenneth Casson Leighton) Date: Mon, 19 May 2003 16:31:20 +0000 Subject: [Python-Dev] [PEP] += on return of function call result In-Reply-To: <5.2.1.1.0.20030519180035.0242bcd0@localhost> References: <20030517152137.GA25579@unpythonic.net> <20030402090726.GN1048@localhost> <20030515214417.GF3900@localhost> <20030516142451.GI6196@localhost> <20030517152137.GA25579@unpythonic.net> <5.2.1.1.0.20030519180035.0242bcd0@localhost> Message-ID: <20030519163120.GB26355@localhost> On Mon, May 19, 2003 at 06:07:08PM +0200, Samuele Pedroni wrote: > >>> def refiadd(r,v): # r+=v, r is a reference, not a an lvalue > ... if hasattr(r.__class__,'__iadd__'): > ... r.__class__.__iadd__(r,v) > ... else: > ... raise ValueError,"non-sense" > ... you're a star - thank you! From lkcl@samba-tng.org Mon May 19 17:32:48 2003 From: lkcl@samba-tng.org (Luke Kenneth Casson Leighton) Date: Mon, 19 May 2003 16:32:48 +0000 Subject: [Python-Dev] [debian build error] In-Reply-To: <16072.62818.314237.459419@montanaro.dyndns.org> References: <20030519145755.GB25000@localhost> <16072.62818.314237.459419@montanaro.dyndns.org> Message-ID: <20030519163247.GD26355@localhost> On Mon, May 19, 2003 at 10:16:50AM -0500, Skip Montanaro wrote: > > Luke> gcc 3.3 is now the latest for unstable. > > Luke> gcc 3.3 contains a package libstdc++-5. > > Luke> python2.2 is compiled with gcc 3.2. > > Luke> installing the latest libstdc++-5, which is compiled with gcc 3.3, > Luke> causes python2.2 to complain: > > Luke> /usr/lib/libgcc1_s.so.1 cannot find GCC_3.3 in libstdc++-5. > > Is python2.2 compiled by you from source or is it a Debian-provided package? debian-provided. i've actually had to remove gcc altogether in order to solve the problem (!!!) l. 
From tismer@tismer.com Mon May 19 19:12:48 2003 From: tismer@tismer.com (Christian Tismer) Date: Mon, 19 May 2003 20:12:48 +0200 Subject: [Python-Dev] Need advice, maybe support In-Reply-To: <200305182104.h4IL4eY17830@pcp02138704pcs.reston01.va.comcast.net> References: <3EC579B4.9000303@tismer.com> <200305182104.h4IL4eY17830@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <3EC91EA0.5090105@tismer.com> Guido van Rossum wrote: [me, about how to add _nr cfunction versions in a compatible way] > I don't think we can just add an extra field to PyMethodDef, because > it would break binary incompatibility. Currently, in most cases, a > 3r party extension module compiled for an earlier Python version can > still be used with a later version. Because PyMethodDef is used as an > array, adding a field to it would break this. Bad news. I hoped you would break binary compatibility between major versions (like from 2.2 to 2.3), but well, now I also understand why there are so many flags in typeobjects :-) > I have less of a problem with extending PyTypeObject, it grows all the > time and the tp_flags bits tell you how large the one you've got is. > (I still have some problems with this, because things that are of no > use to the regular Python core developers tend to either confuse them, > or be broken on a regular basis.) For the typeobjects, I'm simply asking for reservation of a bit number. What I used is #ifdef STACKLESS #define Py_TPFLAGS_HAVE_CALL_NR (1L<<15) #else #define Py_TPFLAGS_HAVE_CALL_NR 0 #endif but I think nobody needs to know about this, and maybe it is better (requiring no change of Python) if I used a bit from the higher end (31) or such? > Maybe you could get away with defining an alternative structure for > PyMethodDef and having a flag in tp_flags say which it is; there are > plenty of unused bits and I don't mind reserving one for you. 
Then > you'd have to change all the code that *uses* tp_methods, but there > isn't much of that; in fact, the only place I see is in typeobject.c. The problem is that I need to give extra semantics to existing objects, which are PyCFunction objects. I think putting an extra bit into the type object doesn't help, unless I use a new type. But then I don't need the flag. An old extension module which is loaded into my Python will always use my PyCFunction, since this is always borrowed. > If this doesn't work for you, maybe you could somehow fold the two > implementation functions into one, and put something special in the > argument list to signal that the non-recursive version is wanted? > (Thinking aloud here -- I don't know exactly what the usage pattern of > the nr versions will be.) This is hard to do. I'm adding _nr versions to existing functions, and I don't want to break their parameter lists. Ok, what I did is rather efficient, quite a bit ugly of course, but binary compatible as much as possible. It required to steal some bits of ml_flags as a small integer, which are interpreted as "distance to my sibling". I'm extending the MethodDef arrays in a special way by just adding some extra records without name fields at the end of the array, which hold the _nr pointers. An initialization function initializes the small integer in ml_flags with the distance to this "sibling", and the nice thing about this is that it will never fail if not initialized: A distance of zero gives just the same record. So what I'm asking for in this case is a small number of bits of the ml_flags word which will not be used, otherwise. Do you think the number of bits in ml_flags might ever grow beyond 16, or should I just assume that I can safely abuse them? thanks a lot -- chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! 
Take a ride on Python's Johannes-Niemeyer-Weg 9a : *Starship* http://starship.python.net/ 14109 Berlin : PGP key -> http://wwwkeys.pgp.net/ work +49 30 89 09 53 34 home +49 30 802 86 56 pager +49 173 24 18 776 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? http://www.stackless.com/ From martin@v.loewis.de Mon May 19 21:10:45 2003 From: martin@v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=) Date: 19 May 2003 22:10:45 +0200 Subject: [Python-Dev] Need advice, maybe support In-Reply-To: <3EC91EA0.5090105@tismer.com> References: <3EC579B4.9000303@tismer.com> <200305182104.h4IL4eY17830@pcp02138704pcs.reston01.va.comcast.net> <3EC91EA0.5090105@tismer.com> Message-ID: Christian Tismer writes: > The problem is that I need to give extra semantics to existing > objects, which are PyCFunction objects. I think putting an extra > bit into the type object doesn't help, unless I use a new type. But > then I don't need the flag. An old extension module which is loaded > into my Python will always use my PyCFunction, since this is always > borrowed. I understand the concern is not about changing PyCFunction, but about changing PyMethodDef, which would get another field. I think you can avoid adding a field to PyMethodDef, by providing a PyMethodDefEx structure, which has the extra field, and is referred-to from (a new slot in) the type object. The slots in the type object that refer to PyMethodDefs would either get set to NULL, or initialized with a copy of the PyMethodDefEx with the extra field removed. Regards, Martin From guido@python.org Mon May 19 21:33:32 2003 From: guido@python.org (Guido van Rossum) Date: Mon, 19 May 2003 16:33:32 -0400 Subject: [Python-Dev] a strange case In-Reply-To: "Your message of Mon, 19 May 2003 00:19:18 +0200." 
<3EC806E6.3040204@livinglogic.de> References: <20030516202402.30333.72761.Mailman@mail.python.org> <200305161345.25415.troy@gci.net> <200305182042.h4IKgYA17778@pcp02138704pcs.reston01.va.comcast.net> <3EC806E6.3040204@livinglogic.de> Message-ID: <200305192033.h4JKXWe19538@pcp02138704pcs.reston01.va.comcast.net> > But reload() won't work for these pseudo modules (See > http://www.python.org/sf/701743). Reload() is a hack that doesn't really work except in the most simple cases. This isn't one of those. > What about the imp module? Yes, what about it? (I don't understand the remark.) --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Mon May 19 21:47:39 2003 From: guido@python.org (Guido van Rossum) Date: Mon, 19 May 2003 16:47:39 -0400 Subject: [Python-Dev] Need advice, maybe support In-Reply-To: "Your message of Mon, 19 May 2003 20:12:48 +0200." <3EC91EA0.5090105@tismer.com> References: <3EC579B4.9000303@tismer.com> <200305182104.h4IL4eY17830@pcp02138704pcs.reston01.va.comcast.net> <3EC91EA0.5090105@tismer.com> Message-ID: <200305192047.h4JKldW19641@pcp02138704pcs.reston01.va.comcast.net> > Guido van Rossum wrote: > > [me, about how to add _nr cfunction versions in a compatible way] > > > I don't think we can just add an extra field to PyMethodDef, because > > it would break binary incompatibility. Currently, in most cases, a > > 3r party extension module compiled for an earlier Python version can > > still be used with a later version. Because PyMethodDef is used as an > > array, adding a field to it would break this. > > Bad news. I hoped you would break binary compatibility between > major versions (like from 2.2 to 2.3), but well, now I also > understand why there are so many flags in typeobjects :-) > > > I have less of a problem with extending PyTypeObject, it grows all the > > time and the tp_flags bits tell you how large the one you've got is. 
> > (I still have some problems with this, because things that are of no > > use to the regular Python core developers tend to either confuse them, > > or be broken on a regular basis.) > > For the typeobjects, I'm simply asking for reservation > of a bit number. What I used is > > #ifdef STACKLESS > #define Py_TPFLAGS_HAVE_CALL_NR (1L<<15) > #else > #define Py_TPFLAGS_HAVE_CALL_NR 0 > #endif > > but I think nobody needs to know about this, and maybe > it is better (requiring no change of Python) if I used > a bit from the higer end (31) or such? > > > Maybe you could get away with defining an alternative structure for > > PyMethodDef and having a flag in tp_flags say which it is; there are > > plenty of unused bits and I don't mind reserving one for you. Then > > you'd have to change all the code that *uses* tp_methods, but there > > isn't much of that; in fact, the only place I see is in typeobject.c. > > The problem is that I need to give extra semantics to > existing objects, which are PyCFunction objects. > I think putting an extra bit into the type object > doesn't help, unless I use a new type. But then I don't > need the flag. > An old extension module which is loaded into my Python > will always use my PyCFunction, since this is always > borrowed. > > > If this doesn't work for you, maybe you could somehow fold the two > > implementation functions into one, and put something special in the > > argument list to signal that the non-recursive version is wanted? > > (Thinking aloud here -- I don't know exactly what the usage pattern of > > the nr versions will be.) > > This is hard to do. I'm adding _nr versions to existing > functions, and I don't want to break their parameter lists. > > > Ok, what I did is rather efficient, quite a bit ugly of > course, but binary compatible as much as possible. > It required to steal some bits of ml_flags as a small > integer, which are interpreted as "distance to my sibling". 
> I'm extending the MethodDef arrays in a special way > by just adding some extra records without name fields > at the end of the array, which hold the _nr pointers. > > An initialization functions initializes the small integer > in ml_flags with the distance to this "sibling", and > the nice thing about this is that it will never fail > if not initialized: > A distance of zero gives just the same record. > > So what I'm asking for in this case is a small number > of bits of the ml_flags word which will not be used, > otherwise. > > Do you think the number of bits in ml_flags might ever > grow beyond 16, or should I just assume that I can > safely abuse them? > > thanks a lot -- chris It's better to reserve bits explicitly. Can you submit a patch to SF that makes reservations of the bits you need? All they need is a definition of a symbol and a comment explaining what it is for; "reserved for Stackless" is fine. --Guido van Rossum (home page: http://www.python.org/~guido/) From z23byn95@earthlink.com Tue May 20 07:05:22 2003 From: z23byn95@earthlink.com (Gino Wilcox) Date: Tue, 20 May 03 06:05:22 GMT Subject: [Python-Dev] Rates? xb lxomwmk cyo Message-ID: This is a multi-part message in MIME format. --A1ADFE371. Content-Type: text/plain Content-Transfer-Encoding: quoted-printable Interest Rates are at their lowest point in 40 years! We help you find the best rate for your situation by matching your needs with hundreds of lenders! Home Improvement, Refinance, Second Mortgage, Home Equity Loans, and More! Even with less than perfect credit! This service is 100% FREE to home owners and new home buyers without any obligation. Just fill out a quick, simple form and jump-start your future plans today! 
From doko@cs.tu-berlin.de Mon May 19 22:13:05 2003 From: doko@cs.tu-berlin.de (Matthias Klose) Date: Mon, 19 May 2003 23:13:05 +0200 Subject: [Python-Dev] Vacation; Python 2.2.3 release. In-Reply-To: <1053050696.26479.35.camel@geddy> References: <200305160032.h4G0WJx17890@pcp02138704pcs.reston01.va.comcast.net> <1053050696.26479.35.camel@geddy> Message-ID: <16073.18657.581177.570701@gargle.gargle.HOWL> Barry Warsaw writes: > FWIW, I'm going to be around, and am fairly free during the US Memorial > Day weekend 24th - 26th. Can we shoot for getting a release out that > weekend? If we can code freeze by the 22nd, I can throw together a > release candidate on Friday (with Tim's help for Windows) and a final by > Monday. I'd like to see the following patches included, they are in HEAD and currently applied in the python2.2 Debian packages, so they got some testing. - Send anonymous password when using anonftp Lib/ftplib.py 1.62 1.63 See http://python.org/sf/497420 - robotparser.py fails on some URLs (including change of copyright from "Python 2.0 open source license"). See http://python.org/sf/499513 - make tkinter compatible with tk-8.4.2. See http://python.org/sf/707701 Matthias From tismer@tismer.com Mon May 19 22:20:18 2003 From: tismer@tismer.com (Christian Tismer) Date: Mon, 19 May 2003 23:20:18 +0200 Subject: [Python-Dev] Need advice, maybe support In-Reply-To: <200305192047.h4JKldW19641@pcp02138704pcs.reston01.va.comcast.net> References: <3EC579B4.9000303@tismer.com> <200305182104.h4IL4eY17830@pcp02138704pcs.reston01.va.comcast.net> <3EC91EA0.5090105@tismer.com> <200305192047.h4JKldW19641@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <3EC94A92.2040604@tismer.com> Guido van Rossum wrote: ...
>>Do you think the number of bits in ml_flags might ever >>grow beyond 16, or should I just assume that I can >>safely abuse them? >> >>thanks a lot -- chris > > > It's better to reserve bits explicitly. Can you submit a patch to SF > that makes reservations of the bits you need? All they need is a > definition of a symbol and a comment explaining what it is for; > "reserved for Stackless" is fine. Ok, what I'm asking for is: "please reserve one bit for me in tp->flags" (31 preferred) and "please reserve 8 bits for me in ml->flags" (24-31 preferred). The latter will also not degrade performance, since these bits shalt simply not be used, but if STACKLESS isn't defined, there is no need to mask these bits off. I also will name these fields in a way that makes it obvious for everybody that they better should not touch these. Iff you agree, I'm going to submit my patch now, and my thanks will follow you for the rest of the subset of our lives. :) sincerely -- chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Johannes-Niemeyer-Weg 9a : *Starship* http://starship.python.net/ 14109 Berlin : PGP key -> http://wwwkeys.pgp.net/ work +49 30 89 09 53 34 home +49 30 802 86 56 pager +49 173 24 18 776 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? http://www.stackless.com/ From cnetzer@mail.arc.nasa.gov Mon May 19 22:25:57 2003 From: cnetzer@mail.arc.nasa.gov (Chad Netzer) Date: 19 May 2003 14:25:57 -0700 Subject: [Python-Dev] Vacation; Python 2.2.3 release. 
In-Reply-To: <16073.18657.581177.570701@gargle.gargle.HOWL> References: <200305160032.h4G0WJx17890@pcp02138704pcs.reston01.va.comcast.net> <1053050696.26479.35.camel@geddy> <16073.18657.581177.570701@gargle.gargle.HOWL> Message-ID: <1053379556.533.74.camel@sayge.arc.nasa.gov> On Mon, 2003-05-19 at 14:13, Matthias Klose wrote: > I'd like to see the following patches included, they are in HEAD and > currently applied in the python2.2 Debian packages, so they got some > testing. > - make tkinter compatible with tk-8.4.2. > See http://python.org/sf/707701 Don't know about the others, but this one at least seems to have been applied. Chad From niemeyer@conectiva.com Mon May 19 22:28:08 2003 From: niemeyer@conectiva.com (Gustavo Niemeyer) Date: Mon, 19 May 2003 18:28:08 -0300 Subject: [Python-Dev] urllib2 proxy support broken? Message-ID: <20030519212807.GA29002@ibook.distro.conectiva> I've just tried to use the proxy support in urllib2, and was surprised by the fact that it seems to be broken, at least in 2.2 and 2.3. Can somebody please confirm that it's really broken, so that I can prepare a patch? 
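The shadowing problem described below can be modelled in miniature. The handler classes here are invented stand-ins, not urllib2's real ones; the point is only that when every handler in a chain claims http_open() and the first non-None result wins, whichever handler is added first shadows the rest:

```python
# Toy model of a handler chain: the first handler whose http_open()
# returns a non-None result handles the request.

class DirectHandler:        # stand-in for a default HTTP handler
    def http_open(self, req):
        return "direct:" + req

class ProxyHandler:         # stand-in for a custom proxy handler
    def http_open(self, req):
        return "proxied:" + req

def open_with(handlers, req):
    for h in handlers:
        result = h.http_open(req)
        if result is not None:
            return result

url = "http://www.python.org/"

# Defaults installed before customs: the proxy handler never runs.
assert open_with([DirectHandler(), ProxyHandler()], url) == "direct:" + url

# Customs first (what the unobvious workaround achieves): proxy wins.
assert open_with([ProxyHandler(), DirectHandler()], url) == "proxied:" + url
```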
If I understood it correctly, that's how the proxy support is supposed to work: import urllib2 proxy_support = urllib2.ProxyHandler({"http":"http://ahad-haam:3128"}) opener = urllib2.build_opener(proxy_support) urllib2.install_opener(opener) f = urllib2.urlopen('http://www.python.org/') OTOH, code in build_opener() does this: # Remove default handler if a custom handler was provided for klass in default_classes: for check in handlers: if inspect.isclass(check): if issubclass(check, klass): skip.append(klass) elif isinstance(check, klass): skip.append(klass) for klass in skip: default_classes.remove(klass) # Instantiate default handler and append them for klass in default_classes: opener.add_handler(klass()) # Instantiate custom handler and append them for h in handlers: if inspect.isclass(h): h = h() opener.add_handler(h) Notice that default handlers are added *before* custom handlers, so HTTPHandler.http_open() ends up being called before ProxyHandler.http_open(), and the latter doesn't work. To make the first snippet work, one would have to use the unobvious version: import urllib2 proxy_support = urllib2.ProxyHandler({"http":"http://ahad-haam:3128"}) http_support = urllib2.HTTPHandler() opener = urllib2.build_opener(proxy_support, http_support) urllib2.install_opener(opener) f = urllib2.urlopen('http://www.python.org/') Is this really broken, or perhaps it's a known "feature" which should be left as is to avoid side effects (and I should patch the documentation instead)? -- Gustavo Niemeyer [ 2AAC 7928 0FBF 0299 5EB5 60E2 2253 B29A 6664 3A0C ] From guido@python.org Mon May 19 22:36:24 2003 From: guido@python.org (Guido van Rossum) Date: Mon, 19 May 2003 17:36:24 -0400 Subject: [Python-Dev] Need advice, maybe support In-Reply-To: "Your message of Mon, 19 May 2003 23:20:18 +0200."
<3EC94A92.2040604@tismer.com> References: <3EC579B4.9000303@tismer.com> <200305182104.h4IL4eY17830@pcp02138704pcs.reston01.va.comcast.net> <3EC91EA0.5090105@tismer.com> <200305192047.h4JKldW19641@pcp02138704pcs.reston01.va.comcast.net> <3EC94A92.2040604@tismer.com> Message-ID: <200305192136.h4JLaOX20032@pcp02138704pcs.reston01.va.comcast.net> > > It's better to reserve bits explicitly. Can you submit a patch to SF > > that makes reservations of the bits you need? All they need is a > > definition of a symbol and a comment explaining what it is for; > > "reserved for Stackless" is fine. > > Ok, what I'm asking for is: > "please reserve one bit for me in tp->flags" (31 preferred) and > "please reserve 8 bits for me in ml->flags" (24-31 preferred). > The latter will also not degrade performance, since > these bits shalt simply not be used, but if STACKLESS isn't > defined, there is no need to mask these bits off. > I also will name these fields in a way that makes it obvious > for everybody that they better should not touch these. > > Iff you agree, I'm going to submit my patch now, and my thanks > will follow you for the rest of the subset of our lives. :) +1 --Guido van Rossum (home page: http://www.python.org/~guido/) From doko@cs.tu-berlin.de Mon May 19 22:31:47 2003 From: doko@cs.tu-berlin.de (Matthias Klose) Date: Mon, 19 May 2003 23:31:47 +0200 Subject: [Python-Dev] Vacation; Python 2.2.3 release. In-Reply-To: <16073.18657.581177.570701@gargle.gargle.HOWL> References: <200305160032.h4G0WJx17890@pcp02138704pcs.reston01.va.comcast.net> <1053050696.26479.35.camel@geddy> <16073.18657.581177.570701@gargle.gargle.HOWL> Message-ID: <16073.19779.115820.624940@gargle.gargle.HOWL> Matthias Klose writes: > Barry Warsaw writes: > > FWIW, I'm going to be around, and am fairly free during the US Memorial > > Day weekend 24th - 26th. Can we shoot for getting a release out that > > weekend? 
If we can code freeze by the 22nd, I can throw together a > > release candidate on Friday (with Tim's help for Windows) and a final by > > Monday. > > I'd like to see the following patches included, they are in HEAD and > currently applied in the python2.2 Debian packages, so they got some > testing. > - make tkinter compatible with tk-8.4.2. > See http://python.org/sf/707701 oops, sorry this one is already applied. From tismer@tismer.com Mon May 19 23:09:14 2003 From: tismer@tismer.com (Christian Tismer) Date: Tue, 20 May 2003 00:09:14 +0200 Subject: [Python-Dev] Need advice, maybe support In-Reply-To: References: <3EC579B4.9000303@tismer.com> <200305182104.h4IL4eY17830@pcp02138704pcs.reston01.va.comcast.net> <3EC91EA0.5090105@tismer.com> Message-ID: <3EC9560A.9070602@tismer.com> Dear Martin, > Christian Tismer writes: > > >>The problem is that I need to give extra semantics to existing >>objects, which are PyCFunction objects. I think putting an extra >>bit into the type object doesn't help, unless I use a new type. But >>then I don't need the flag. An old extension module which is loaded >>into my Python will always use my PyCFunction, since this is always >>borrowed. > > > I understand the concern is not about changing PyCFunction, but about > changing PyMethodDef, which would get another field. Exactly. This is the static structure which is lingering around in many old extension modules, and to change it would require massive recompilation. > I think you can avoid adding a field to PyMethodDef, by providing a > PyMethodDefEx structure, which has the extra field, and is referred-to > from (a new slot in) the type object. The slots in the type object > that refer to PyMethodDefs would either get set to NULL, or > initialized with a copy of the PyMethodDefEx with the extra field > removed. Hey, that's really not bad!
Today, I've banged my head on my desk many times, trying to find out how to turn a clean, new approach into the least hackish surrogate, which is binary compatible. Well, I found some, not really pretty but working. It uses not an extra field, but extra records, which are used as sibling fields, past the end of the method table. I have to think about what implementation is more efficient, and uses less of my resources. Since Guido donated 8+1 bits to me, I have a big degree of freedom about how I will implement things in the future. Maybe I'd go ahead and see these bits checked in ASAP, and then re-think the design. Perhaps I will give back 8 bits, when I really don't need them, but I really don't know, yet. thanks anyway -- good idea - chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Johannes-Niemeyer-Weg 9a : *Starship* http://starship.python.net/ 14109 Berlin : PGP key -> http://wwwkeys.pgp.net/ work +49 30 89 09 53 34 home +49 30 802 86 56 pager +49 173 24 18 776 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? http://www.stackless.com/ From martin@v.loewis.de Mon May 19 23:33:25 2003 From: martin@v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 20 May 2003 00:33:25 +0200 Subject: [Python-Dev] Vacation; Python 2.2.3 release. In-Reply-To: <16073.18657.581177.570701@gargle.gargle.HOWL> References: <200305160032.h4G0WJx17890@pcp02138704pcs.reston01.va.comcast.net> <1053050696.26479.35.camel@geddy> <16073.18657.581177.570701@gargle.gargle.HOWL> Message-ID: <3EC95BB5.7000706@v.loewis.de> Matthias Klose wrote: > - make tkinter compatible with tk-8.4.2. > See http://python.org/sf/707701 As the comment indicates, the patch was already applied as 1.160.10.3. Is anything needed beyond that? 
> - Send anonymous password when using anonftp > Lib/ftplib.py 1.62 1.63 > See http://python.org/sf/497420 > > - robotparser.py fails on some URLs (including change of copyright > from "Python 2.0 open source license"). > See http://python.org/sf/499513 I will look into those two. Regards, Martin From cgw@alum.mit.edu Mon May 19 23:55:09 2003 From: cgw@alum.mit.edu (Charles G Waldman) Date: Mon, 19 May 2003 17:55:09 -0500 Subject: [Python-Dev] portable undumper in xemacs In-Reply-To: <20030519160007.6607.29714.Mailman@mail.python.org> References: <20030519160007.6607.29714.Mailman@mail.python.org> Message-ID: <16073.24781.473766.414482@nyx.dyndns.org> > develop it, she simply downloaded it. The other dumpers in xemacs > seem to be GPL, and I think that the "portable undump" mentioned by > another poster is a placeholder for a project that isn't written yet: > http://www.xemacs.org/Architecting-XEmacs/unexec.html) I'm pretty sure that the "Architecting-XEmacs" page is out of date, and the "portable undump" is a reality. Grab current xemacs sources and try doing "./configure --with-pdump" From tim@zope.com Tue May 20 00:26:24 2003 From: tim@zope.com (Tim Peters) Date: Mon, 19 May 2003 19:26:24 -0400 Subject: [Python-Dev] Re: C new-style classes and GC In-Reply-To: <16E1010E4581B049ABC51D4975CEDB880113DB01@UKDCX001.uk.int.atosorigin.com> Message-ID: [Moore, Paul, on http://www.python.org/dev/doc/devel/ext/defining-new-types.html ] > Just looking at this, I note the "Note" at the top. The way > this reads, it implies that details of how things used to work > has been removed. I don't know if this is true, but I'd prefer > if it wasn't. > > People upgrading their extensions would find the older > information useful I'm not sure how. If their extensions work now, there's a very high degree of compatibility, and they should continue to work. 
If they want to make life simpler by exploiting new API features, then they need the new docs, and the old docs say nothing useful about that (since they were written before the newer API gimmicks were even ideas). > (actually, an "Upgrading from the older API" section would be even > nicer, but that involves more work...) Except that the old API still functions. Even major abusers like ExtensionClass still work under 2.3. There's one sometimes-expressed need that isn't being met: people who need their extensions to run under many versions of Python. The canonical examples of extensions are in the Python core, and of course those only need to run with the current Python release, so staring at them doesn't yield any clues. I'm not sure we (the developers) give it much thought, either (e.g., I know I don't -- the # of things I can worry about at once decreases as I grow older <0.3 wink>). Micheal Hudson made a nice start in that direction, with 2.3's Misc/pymemcompat.h If you write your code to 2.3's simpler memory API, and #include that file, it will translate 2.3's spellings (via macros) into older spellings back through 1.5.2, keying off PY_VERSION_HEX to choose the right renamings. Jim is doing something related by hand in these docs, via the unnecessary #ifndef PyMODINIT_FUNC /* declarations for DLL import/export */ #define PyMODINIT_FUNC void #endif blocks. That is, PyMODINIT_FUNC is defined (via Python.h) in 2.3, so the development docs shouldn't encourage pretending it may not be. It would be a good idea to add suitable redefinitions of PyMODINIT_FUNC to pymemcompat.h too, but whether someone will volunteer to do so is an open question. > Having to refer to an older copy of the documentation (which they > may not even have installed) could tip the balance between "lets > keep up to date" and "if it works, don't fix it". > > Heck, I still have some code I wrote for the 1.4 API which still > works. It probably still does. 
> I've never got round to upgrading it, on the basis that someone might > be using it with 1.5 still. But when I do, I'd dump pre-2.2 support, so > *I* have no use for "older" documentation except to find out what all > that old code meant... :-) What remains unclear is what good the older documentation would do anyone. You're going to migrate or you're not. If you don't, you don't need the new docs; if you do, you don't need the old docs; it's those who want to support multiple Pythons simultaneously who need to know everything, and they really need more help than throwing all releases' docs into one giant pile. From dberlin@dberlin.org Tue May 20 04:04:35 2003 From: dberlin@dberlin.org (Daniel Berlin) Date: Mon, 19 May 2003 23:04:35 -0400 Subject: Using emacs' unexec to speed Python startup (was Re: [Python-Dev] Startup time) In-Reply-To: <20030519120633.GA12073@unpythonic.net> Message-ID: On Monday, May 19, 2003, at 08:06 AM, Jeff Epler wrote: > On Sun, May 18, 2003 at 11:28:08PM -0400, Barry Warsaw wrote: >> Since it looks like you implemented the meat of it as a module, I >> wonder if it couldn't be cleaned up (with the interrupt reset either >> pulled in the extension or exposed to Python) and added to Python 2.3? > > First off, I sure doubt that this feature could be truly made > "non-experimental" before 2.3 is released. There was one "strange > bug" so > far (the signal thing), though that was quickly solved (with another > change to the core Python source code). > > Secondly, forcing all allocations to come from the heap instead of > mmap'd > space may hurt performance. > > Thirdly, the files implementing unexec itself, which come from fsf > emacs, > are covered by the GNU GPL, which I think makes them unsuitable for > compiling into Python. (There's something called "dynodump" in Emacs > that > appears to apply to ELF binaries which bears this license: > * This source code is a product of Sun Microsystems, Inc.
and is > provided > * for unrestricted use provided that this legend is included on all > tape > * media and as a part of the software program in whole or part. Users > * may copy or modify this source code without charge, but are not > authorized > * to license or distribute it to anyone else except as part of a > product or > * program developed by the user. > I wish I understood what "except as part of a product or program > developed > by the user" meant--does that mean that Alice can't download Python > then give it to Bob if it includes dynodump? After all, Alice didn't > develop it, she simply downloaded it. The other dumpers in xemacs > seem to be GPL, and I think that the "portable undump" mentioned by > another poster is a placeholder for a project that isn't written yet: > http://www.xemacs.org/Architecting-XEmacs/unexec.html) It was written and is on by default since 21.2 came out, the website is out of date. See http://www.xemacs.org/Releases/Public-21.2/projects/pdump.html It's probably too xemacs specific, however. The file you want is dumper.c. > > Fourthly, we'd have to duplicate whatever machinery chooses the correct > unexec implementation for the platform you're running on---there are > lots to > choose from: Only if you do undumping the same way. The portable dumper way was to not make an executable, instead putting it in a separate file, and storing it in a neutral format that was architected to make loading fast. It's still faster than loading byte-compiled files, since nothing needs to be executed as we are just recreating the in-memory representation. > unexaix.c unexconvex.c unexenix.c unexnext.c unexw32.c > unexalpha.c unexec.c unexhp9k800.c unexsni.c > unexapollo.c unexelf.c unexmips.c unexsunos4.c > (Of course, it's well known that only elf and win32 matter in these > modern > times) > > I'd be excited to see "my work" in Python, though the fact of the > matter > is that I just tried this out because I was bored on a Sunday > afternoon.
> > Jeff > > _______________________________________________ > Python-Dev mailing list > Python-Dev@python.org > http://mail.python.org/mailman/listinfo/python-dev From dberlin@dberlin.org Tue May 20 04:06:01 2003 From: dberlin@dberlin.org (Daniel Berlin) Date: Mon, 19 May 2003 23:06:01 -0400 Subject: Using emacs' unexec to speed Python startup (was Re: [Python-Dev] Startup time) In-Reply-To: <20030519154848.GD13673@unpythonic.net> Message-ID: On Monday, May 19, 2003, at 11:48 AM, Jeff Epler wrote: > On Mon, May 19, 2003 at 10:39:33AM -0500, Skip Montanaro wrote: >> I assume none of this stuff will work on Windows. > > there *is* a "unexnt.c" in xemacs, Unused. Xemacs uses the dumper.c portable dumper by default on NT nowadays. If you were to choose an unexec, you'd thus want the one from emacs, since it's presumably still maintained. > and "unexw32.c" in emacs. From dsilva@ccs.neu.edu Tue May 20 05:20:35 2003 From: dsilva@ccs.neu.edu (Daniel Silva) Date: Tue, 20 May 2003 00:20:35 -0400 (EDT) Subject: [Python-Dev] Python Run-Time System and Extensions Message-ID: [Note: I first sent this to Jeremy Hylton through his MIT e-mail address, but in case he no longer uses that one, I'm resending to python-dev and his Zope account.] Hello, My name is Daniel Silva and I'm working on a Python compiler that generates PLT Scheme code. The work is nearly done, except for large parts of the run-time system and support for python C extensions. Since PLT's platform is MzScheme, I need to connect the MzScheme foreign-function interface to the C Python foreign-function interface and vice-versa. MzScheme's FFI works with SchemeObject C data structures and Python's FFI works with PyObject, among others. We aim for source compatibility, not binary. To achieve this, we see two possibilities: provide our own Python.h and typedef PyObject as another name for SchemeObject, or marshall SchemeObject structures into PyObject structures.
If we were to pretend that SchemeObjects are PyObjects, we could have Scheme do most of the work, but we run into problems with C structure field access. Through this method, we can use the existing code of the Python runtime system that uses selectors -- which I heard you are responsible for (thank you!) -- and replace the implementation of selectors like PyString_Get_Size with calls to Scheme equivalents, such as scheme_string_get_size, which would not break code that uses selectors. This approach is problematic when we encounter C code that looks like my_py_obj->some_field. This obviously would be incompatible, as SchemeObjects do not have the same fields. Such a style is used in various parts of the Python runtime system, and we would have to re-implement all of those. That is a bit of a burden, but more worrisome is the possibility of third-party Python C extensions using this style -- those would not work with our system. The alternative is to marshall every SchemeObject into the PyObject data structure described in CPython's own headers. This method would make it possible for us to use both the Python runtime system (and automatically keep up with changes) and third-party extensions. However, once our objects are marshalled into PyObjects, any change made to the new target is not seen by the original SchemeObject, so we lose mutation. Without mutation, our interpreter is useless for virtually every Python program. We are ready to pick an option and run with it. Do you think one of those two holds better hope than the other, or do you see a third alternative? I am willing to provide the remaining selectors for the CPython project, or if they already exist, to write the necessary documentation to advocate their use to those writing extensions.
Regards, Daniel Silva From BPettersen@NAREX.com Tue May 20 07:25:26 2003 From: BPettersen@NAREX.com (Bjorn Pettersen) Date: Tue, 20 May 2003 00:25:26 -0600 Subject: [Python-Dev] Attribute lookup (was Re: python-dev Summary for 2003-05-01 through 2003-05-15) Message-ID: <60FB8BB7F0EFC7409B75EEEC13E20192022DE51A@admin56.narex.com> > From: Phillip J. Eby [mailto:pje@telecommunity.com] > > At 09:58 PM 5/18/03 -0400, Aahz wrote: > > [Normally I send my corrections to Brett privately, but > > since I'm taking a whack at attribute lookup, I figured > > this ought to be public.] > > > >On Sun, May 18, 2003, Brett C. wrote: > > > > > > The only thing I would like help with this summary is if > > > someone knows the attribute lookup order (instance, > > > class, class descriptor, ...) [...] > >This gets real tricky. For simple attributes of an > >instance, the order is instance, class/type, and base > >classes of the class/type (but *not* the metaclass). > >However, method resolution of the special methods goes > >straight to the class. Finally, if an attribute is found on the > >instance, a search goes through the hierarchy to see whether a set > >descriptor overrides (note specifically that it's a set descriptor; > >methods are implemented using get descriptors). > > > >I *think* I have this right, but I'm sure someone will > >correct me if I'm wrong. > > Here's the algorithm in a bit more detail: > > 1. First, the class/type and its bases are searched, checking > dictionaries only. > > 2. If the object found is a "data descriptor" (i.e. has a > type with a non-null tp_descr_set pointer, which is closely > akin to whether the descriptor has a '__set__' attribute), > then the data descriptor's __get__ method is invoked. > > 3. If the object is not found, or not a data descriptor, the > instance dictionary is checked.
If the attribute isn't in the > instance dictionary, then the descriptor's __get__ method is > invoked (assuming a descriptor was found). > > 4. Invoke __getattr__ if present. > > (Note that replacing __getattribute__ *replaces* this entire > algorithm.) > > Also note that special methods are *not* handled specially here. > The behavior Aahz is referring to is that slots (e.g. tp_call) on > new-style types do not retrieve an instance attribute; they are > based purely on class-level data. [...] Wouldn't that be explicitly specified class-level data, i.e. it circumvents the __getattr__ hook completely: >>> class C(object): ... def __getattr__(self, attr): ... if attr == '__len__': ... return lambda:42 ... >>> c = C() >>> len(c) Traceback (most recent call last): File "", line 1, in ? TypeError: len() of unsized object this makes it impossible to implement a __getattr__ anywhere that intercepts len(obj): >>> class meta(type): ... def __getattr__(self, attr): ... if attr == '__len__': ... return lambda:42 ... >>> class C(object): ... __metaclass__ = meta ... >>> C.__len__() 42 >>> c = C() >>> len(c) Traceback (most recent call last): File "", line 1, in ? TypeError: len() of unsized object >>> len(C) Traceback (most recent call last): File "", line 1, in ? TypeError: len() of unsized object The meta example would have to work to be able to create "true" proxy objects(?) Is this intended behaviour? -- bjorn From Paul.Moore@atosorigin.com Tue May 20 10:19:09 2003 From: Paul.Moore@atosorigin.com (Moore, Paul) Date: Tue, 20 May 2003 10:19:09 +0100 Subject: [Python-Dev] Re: C new-style classes and GC Message-ID: <16E1010E4581B049ABC51D4975CEDB880113DB17@UKDCX001.uk.int.atosorigin.com> From: Tim Peters [mailto:tim@zope.com] > What remains unclear is what good the older documentation > would do anyone. You're going to migrate or you're not.
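Stepping back to the attribute-lookup thread above, both halves of the algorithm Phillip quoted are easy to check directly: data descriptors beat the instance dictionary, non-data descriptors lose to it, and an instance attribute named like a special method is invisible to the corresponding slot, which is exactly what Bjorn ran into with len(). (A sketch on new-style classes; the class names are invented.)

```python
class DataDesc:                      # has __set__: a data descriptor
    def __get__(self, obj, objtype=None):
        return "from data descriptor"
    def __set__(self, obj, value):
        raise AttributeError("read-only")

class NonDataDesc:                   # no __set__: a non-data descriptor
    def __get__(self, obj, objtype=None):
        return "from non-data descriptor"

class C(object):
    d = DataDesc()
    n = NonDataDesc()

c = C()
c.__dict__["d"] = "from instance dict"
c.__dict__["n"] = "from instance dict"

assert c.d == "from data descriptor"   # step 2: data descriptor wins
assert c.n == "from instance dict"     # step 3: instance dict wins

# Special methods bypass the instance entirely: len() consults only
# the class, so an instance __len__ is ignored.
class L(object):
    pass

obj = L()
obj.__len__ = lambda: 42
try:
    len(obj)
except TypeError:
    pass                               # exactly Bjorn's observation
else:
    raise AssertionError("len() should not find the instance __len__")
```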
> If you don't, you don't need the new docs; if you do, you > don't need the old docs My thought was that if & when I ever go back to this code, the chance of me remembering what the old APIs do is pretty small. Upgrading therefore includes an element of reading the old docs to reverse engineer my original intent :-) But I take the point - this scenario is unlikely enough to be not worth worrying about. Thanks for your explanation, Paul. From flight@debian.org Tue May 20 10:59:25 2003 From: flight@debian.org (Gregor Hoffleit) Date: Tue, 20 May 2003 11:59:25 +0200 Subject: [Python-Dev] [debian build error] In-Reply-To: <20030519163247.GD26355@localhost> References: <20030519145755.GB25000@localhost> <16072.62818.314237.459419@montanaro.dyndns.org> <20030519163247.GD26355@localhost> Message-ID: <20030520095925.GB20760@hal.mediasupervision.de> * Luke Kenneth Casson Leighton [030519 18:39]: > On Mon, May 19, 2003 at 10:16:50AM -0500, Skip Montanaro wrote: > > > > Luke> gcc 3.3 is now the latest for unstable. > > > > Luke> gcc 3.3 contains a package libstdc++-5. > > > > Luke> python2.2 is compiled with gcc 3.2. > > > > Luke> installing the latest libstdc++-5, which is compiled with gcc 3.3, > > Luke> causes python2.2 to complain: > > > > Luke> /usr/lib/libgcc1_s.so.1 cannot find GCC_3.3 in libstdc++-5. > > > > Is python2.2 compiled by you from source or is it a Debian-provided package? > > debian-provided. i've actually had to remove gcc altogether in order > to solve the problem (!!!) Please report such issues to the Debian Bug Tracking System (http://bugs.debian.org). I'm not able to reproduce this problem when I "apt-get install -t unstable python2.2 gcc-3.3 g++-3.3". On my system, python2.2 is linked with /usr/lib/libstdc++.so.5, which is provided by the package libstdc++5, that has been built from the gcc-3.3 source indeed. And still python2.2 just works fine. The line with /usr/lib/libgcc1_s.so.1 looks dubious. 
This ought to be /lib/libgcc_s.so.1, which is provided by the libgcc1 package, which is also derived from the gcc-3.3 source. Can you please make sure that this is really the Debian python2.2 binary, and that you're indeed using /usr/lib/libgcc1_s.so.1 ? Then, please issue a bug report including information such as the header lines from starting python2.2, the revision numbers of the affected packages (at least python2.2, g++-3.3, libstdc++5 and libgcc1). Thanks, Gregor From mwh@python.net Tue May 20 12:04:05 2003 From: mwh@python.net (Michael Hudson) Date: Tue, 20 May 2003 12:04:05 +0100 Subject: [Python-Dev] Re: C new-style classes and GC In-Reply-To: ("Tim Peters"'s message of "Mon, 19 May 2003 19:26:24 -0400") References: Message-ID: <2mbrxxanh6.fsf@starship.python.net> "Tim Peters" writes: > Micheal Hudson made a nice start in that direction, with 2.3's Hey, even Tims can't spell my name right! > Misc/pymemcompat.h > > If you write your code to 2.3's simpler memory API, and #include that file, > it will translate 2.3's spellings (via macros) into older spellings back > through 1.5.2, keying off PY_VERSION_HEX to choose the right renamings. > > Jim is doing something related by hand in these docs, via the unnecessary > > #ifndef PyMODINIT_FUNC /* declarations for DLL import/export */ > #define PyMODINIT_FUNC void > #endif > > blocks. That is, PyMODINIT_FUNC is defined (via Python.h) in 2.3, so the > development docs shouldn't encourage pretending it may not be. It would be > a good idea to add suitable redefinitions of PyMODINIT_FUNC to pymemcompat.h > too, but whether someone will volunteer to do so is an open question. Well, I could do this in a minute, but (a) the file then becomes misnamed (perhaps pyapicompat.h ...) (b) I suspect some fraction of the value of pymemcompat.h is that it is short and has just-less-than abusive guidance on which memory API functions to use. Cheers, M. -- ARTHUR: Ford, you're turning into a penguin, stop it.
-- The Hitch-Hikers Guide to the Galaxy, Episode 2 From walter@livinglogic.de Tue May 20 12:51:16 2003 From: walter@livinglogic.de (=?ISO-8859-15?Q?Walter_D=F6rwald?=) Date: Tue, 20 May 2003 13:51:16 +0200 Subject: [Python-Dev] a strange case In-Reply-To: <200305192033.h4JKXWe19538@pcp02138704pcs.reston01.va.comcast.net> References: <20030516202402.30333.72761.Mailman@mail.python.org> <200305161345.25415.troy@gci.net> <200305182042.h4IKgYA17778@pcp02138704pcs.reston01.va.comcast.net> <3EC806E6.3040204@livinglogic.de> <200305192033.h4JKXWe19538@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <3ECA16B4.5070509@livinglogic.de> Guido van Rossum wrote: >>But reload() won't work for these pseudo modules (See >>http://www.python.org/sf/701743). > > Reload() is a hack that doesn't really work except in the most simple > cases. This isn't one of those. It could be made to work, if the code in a module had a way of knowing whether this import is the first one or not, and it had access to what was in sys.modules before the import mechanism replaces it with an empty module. >>What about the imp module? > > Yes, what about it? (I don't understand the remark.) Does the imp module work with modules that replace the module entry in sys.modules? (Code in PyImport_ExecCodeModuleEx() seems to indicate that it does.) 
Bye, Walter Dörwald From lkcl@samba-tng.org Tue May 20 15:12:20 2003 From: lkcl@samba-tng.org (Luke Kenneth Casson Leighton) Date: Tue, 20 May 2003 14:12:20 +0000 Subject: [Python-Dev] [debian build error] In-Reply-To: <20030520095925.GB20760@hal.mediasupervision.de> References: <20030519145755.GB25000@localhost> <16072.62818.314237.459419@montanaro.dyndns.org> <20030519163247.GD26355@localhost> <20030520095925.GB20760@hal.mediasupervision.de> Message-ID: <20030520141220.GI26355@localhost> On Tue, May 20, 2003 at 11:59:25AM +0200, Gregor Hoffleit wrote: > * Luke Kenneth Casson Leighton [030519 18:39]: > > On Mon, May 19, 2003 at 10:16:50AM -0500, Skip Montanaro wrote: > > > > > > Luke> gcc 3.3 is now the latest for unstable. > > > > > > Luke> gcc 3.3 contains a package libstdc++-5. > > > > > > Luke> python2.2 is compiled with gcc 3.2. > > > > > > Luke> installing the latest libstdc++-5, which is compiled with gcc 3.3, > > > Luke> causes python2.2 to complain: > > > > > > Luke> /usr/lib/libgcc1_s.so.1 cannot find GCC_3.3 in libstdc++-5. > > > > > > Is python2.2 compiled by you from source or is it a Debian-provided package? > > > > debian-provided. i've actually had to remove gcc altogether in order > > to solve the problem (!!!) > > Please report such issues to the Debian Bug Tracking System > (http://bugs.debian.org). done that: i was just endeavouring to catch the attention of the relevant people. > I'm not able to reproduce this problem when I "apt-get install -t > unstable python2.2 gcc-3.3 g++-3.3". try adding unstable to your /etc/apt/source.list and then doing an apt-get upgrade. > On my system, python2.2 is linked > with /usr/lib/libstdc++.so.5, which is provided by the package > libstdc++5, that has been built from the gcc-3.3 source indeed. And > still python2.2 just works fine. yes but python2.2 (python2.2-5 or 6) is built and linked with gcc 3.2 not gcc 3.3. 
by upgrading the libstdc++.so.5 to one that was built with gcc-3.3 you get the problem that occurs on my system. > The line with /usr/lib/libgcc1_s.so.1 looks dubious. This ought to be > /lib/libgcc_s.so.1, which is provided by the libgcc1 package, which is > also derived from the gcc-3.3 source. > Can you please make sure that this is really the Debian python2.2 > binary, and that you're indeed using /usr/lib/libgcc1_s.so.1 ? yes it is the debian python2.2 binary. and /usr/lib/libgcc1_s.so.1. i appear not to have /lib in my /etc/ld.so.conf i do _not_ know why not. ... it may be because i have upgraded from debian potato on cds repeatedly over a period of at least two years? > Then, please issue an bug report including information such as the > header lines from starting python2.2, the revision numbers of the > affected packages (at least python2.2, g++-3.3, libstdc++5 and libgcc1). i have to work on this as a production system. i spent several frantic hours coming up with a procedure to recover my system back to a useable state. unfortunately i cannot risk the time it might take up on having a broken system. if all programs built with gcc-3.2 (including python2.2 and update-menus and groff and minicom and a whole boat-load of others) are replaced with programs built with gcc-3.3 then the problem i experienced goes away. l. -- -- expecting email to be received and understood is a bit like picking up the telephone and immediately dialing without checking for a dial-tone; speaking immediately without listening for either an answer or ring-tone; hanging up immediately and then expecting someone to call you (and to be able to call you). -- every day, people send out email expecting it to be received without being tampered with, read by other people, delayed or simply - without prejudice but lots of incompetence - destroyed. 
-- please therefore treat email more like you would a CB radio to communicate across the world (via relaying stations): ask and expect people to confirm receipt; send nothing that you don't mind everyone in the world knowing about... From tismer@tismer.com Tue May 20 15:38:03 2003 From: tismer@tismer.com (Christian Tismer) Date: Tue, 20 May 2003 16:38:03 +0200 Subject: [Python-Dev] Need advice, maybe support In-Reply-To: <200305192136.h4JLaOX20032@pcp02138704pcs.reston01.va.comcast.net> References: <3EC579B4.9000303@tismer.com> <200305182104.h4IL4eY17830@pcp02138704pcs.reston01.va.comcast.net> <3EC91EA0.5090105@tismer.com> <200305192047.h4JKldW19641@pcp02138704pcs.reston01.va.comcast.net> <3EC94A92.2040604@tismer.com> <200305192136.h4JLaOX20032@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <3ECA3DCB.50306@tismer.com> Guido van Rossum wrote: >>>It's better to reserve bits explicitly. Can you submit a patch to SF >>>that makes reservations of the bits you need? All they need is a >>>definition of a symbol and a comment explaining what it is for; >>>"reserved for Stackless" is fine. Tismer: >>Ok, what I'm asking for is: >>"please reserve one bit for me in tp->flags" (31 preferred) and >>"please reserve 8 bits for me in ml->flags" (24-31 preferred). There is one second thought about this, but I'm not sure whether it is allowed to do so: Assuming that I *would* simply do add a field to PyMethodDef, and take care that all types coming from foreign binaries don't have that special type bit set, could I not simply create a new method table and replace it for that external type by just changing its method table pointer? I think traversing method tables is always an action that the core dll does. Or do I have to fear that an extension does special things to method tables at runtime? If that approach is trustworthy, I also could drop the request for these 8 bits. thanks - chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! 
Take a ride on Python's Johannes-Niemeyer-Weg 9a : *Starship* http://starship.python.net/ 14109 Berlin : PGP key -> http://wwwkeys.pgp.net/ work +49 30 89 09 53 34 home +49 30 802 86 56 pager +49 173 24 18 776 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? http://www.stackless.com/ From pje@telecommunity.com Tue May 20 17:19:43 2003 From: pje@telecommunity.com (Phillip J. Eby) Date: Tue, 20 May 2003 12:19:43 -0400 Subject: [Python-Dev] Attribute lookup (was Re: python-dev Summary for 2003-05-01 through 2003-05-15) In-Reply-To: <60FB8BB7F0EFC7409B75EEEC13E20192022DE51A@admin56.narex.com > Message-ID: <5.1.1.6.0.20030520121436.0311de20@telecommunity.com> At 12:25 AM 5/20/03 -0600, Bjorn Pettersen wrote: > > From: Phillip J. Eby [mailto:pje@telecommunity.com] > > > > 1. First, the class/type and its bases are searched, checking > > dictionaries only. > > > > 2. If the object found is a "data descriptor" (i.e. has a > > type with a non-null tp_descr_set pointer, which is closely > > akin to whether the descriptor has a '__set__' attribute), > > then the data descriptor's __get__ method is invoked. > > > > 3. If the object is not found, or not a data descriptor, the > > instance dictionary is checked. If the attribute isn't in the > > instance dictionary, then the descriptor's __get__ method is > > invoked (assuming a descriptor was found). > > > > 4. Invoke __getattr__ if present. > > > > (Note that replacing __getattribute__ *replaces* this entire > > algorithm.) > > > > Also note that special methods are *not* handled specially here. > > The behavior Aahz is referring to is that slots (e.g. tp_call) on > > new-style types do not retrieve an instance attribute; they are > > based purely on class-level data. >[...] > >Wouldn't that be explicitly specified class-level data, i.e. 
it circumvents the __getattr__ hook completely:

I was focusing on documenting the attribute lookup behavior, not the "special methods" behavior. :) My point was only that "special methods" aren't implemented via attribute lookup, so the attribute lookup rules don't apply.

>this makes it impossible to implement a __getattr__ anywhere that
>intercepts len(obj):
>
> >>> class meta(type):
> ...     def __getattr__(self, attr):
> ...         if attr == '__len__':
> ...             return lambda: 42
> ...
> >>> class C(object):
> ...     __metaclass__ = meta
> ...
> >>> C.__len__()
> 42
> >>> c = C()
> >>> len(c)
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
> TypeError: len() of unsized object
> >>> len(C)
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
> TypeError: len() of unsized object
>
>The meta example would have to work to be able to create "true" proxy
>objects(?)

You can always do this:

class C(object):
    def __len__(self):
        return self.getLength()

    def __getattr__(self, attr):
        if attr == 'getLength':
            return lambda: 42

if you really need to do that.

>Is this intended behaviour?

You'd have to ask Guido that.

From tim.one@comcast.net Tue May 20 19:13:53 2003
From: tim.one@comcast.net (Tim Peters)
Date: Tue, 20 May 2003 14:13:53 -0400
Subject: [Python-Dev] Re: C new-style classes and GC
In-Reply-To: <2mbrxxanh6.fsf@starship.python.net>
Message-ID:

[Tim]
>> Micheal Hudson made a nice start in that direction, with 2.3's

[Michael Hudson]
> Hey, even Tims can't spell my name right!

Are you sure it wasn't your parents who screwed up here? I have a flu, and am lucky to spell anything write these dayz. My apologies to you and your parents.

>> It would be a good idea to add suitable redefinitions of
>> PyMODINIT_FUNC to pymemcompat.h too, but whether someone will
>> volunteer to do so is an open question.

> Well, I could do this in a minute, but

Time's up.

> (a) the file then becomes misnamed (perhaps pyapicompat.h ...)

Sounds good to me.
> (b) I suspect some fraction of the value of pymemcompat.h is that it > is short and has just-less-than abusive guidance on which memory > API functions to use. A new pyapicompat.h could just #include the current pymemcompat.h and a new pywhatevercompat.h. I'm not sure how easy the latter would be. The new PyAPI_FUNC(type) PyAPI_DATA(type) PyMODINIT_FUNC have snaky platform-dependent expansions, and were introduced because the older spellings were approximately incomprehensibly smushed together. Since I don't know what to do offhand if I wanted to support multiple Pythons using the current API here, I have to guess most users won't either (for example, Jim's sample docs change the last one to plain void, which isn't always right); so if you do, I believe it would be a real help. From mwh@python.net Tue May 20 19:31:43 2003 From: mwh@python.net (Michael Hudson) Date: Tue, 20 May 2003 19:31:43 +0100 Subject: [Python-Dev] Re: C new-style classes and GC In-Reply-To: (Tim Peters's message of "Tue, 20 May 2003 14:13:53 -0400") References: Message-ID: <2mel2tpj00.fsf@starship.python.net> Tim Peters writes: > [Tim] >>> Micheal Hudson made a nice start in that direction, with 2.3's > > [Michael Hudson] >> Hey, even Tims can't spell my name right! > > Are you sure it wasn't your parents who screwed up here ? It would certainly be easier for the large fraction of the world who aren't called Michael if it was spelled like that, but it ain't. It is remarkable just how often people do that though. > I have a flu, and am lucky to spell anything write these dayz. My > apologies to you and your parents. Heh, well I'm taking enough drugs to cope with my wisdom teeth today you're lucky if I make sense never mind spell things right. >>> It would be a good idea to add suitable redefinitions of >>> PyMODINIT_FUNC to pymemcompat.h too, but whether someone will >>> volunteer to do so is an open question. > >> Well, I could do this in a minute, but > > Time's up. 
I was clearly being optimistic here :-/ >> (a) the file then becomes misnamed (perhaps pyapicompat.h ...) > > Sounds good to me. > >> (b) I suspect some fraction of the value of pymemcompat.h is that it >> is short and has just-less-than abusive guidance on which memory >> API functions to use. > > A new pyapicompat.h could just #include the current pymemcompat.h and a new > pywhatevercompat.h. I'm not sure how easy the latter would be. The new > > PyAPI_FUNC(type) > PyAPI_DATA(type) > PyMODINIT_FUNC > > have snaky platform-dependent expansions, and were introduced because the > older spellings were approximately incomprehensibly smushed together. Since > I don't know what to do offhand if I wanted to support multiple Pythons > using the current API here, I have to guess most users won't either (for > example, Jim's sample docs change the last one to plain void, which isn't > always right); so if you do, I believe it would be a real help. I thought the problem with DL_IMPORT/DL_EXPORT was that you wanted one when statically linking and the other when dynamically linking. But I could be wrong. pyapicompat.h could presumably import more or less verbatim the whole preprocessory mess that defines PyAPI_FUNC in Python today? AFAIK it doesn't depend on anything else from Python or autoconf or so on. Maybe. Cheers, M. -- NUTRIMAT: That drink was individually tailored to meet your personal requirements for nutrition and pleasure. ARTHUR: Ah. So I'm a masochist on a diet am I? 
-- The Hitch-Hikers Guide to the Galaxy, Episode 9

From mwh@python.net Tue May 20 19:46:42 2003
From: mwh@python.net (Michael Hudson)
Date: Tue, 20 May 2003 19:46:42 +0100
Subject: [Python-Dev] Re: C new-style classes and GC
In-Reply-To: <2mel2tpj00.fsf@starship.python.net> (Michael Hudson's message of "Tue, 20 May 2003 19:31:43 +0100")
References: <2mel2tpj00.fsf@starship.python.net>
Message-ID: <2mbrxxpib1.fsf@starship.python.net>

Michael Hudson writes:

> pyapicompat.h could presumably import more or less verbatim the
> whole preprocessory mess that defines PyAPI_FUNC in Python today?
> AFAIK it doesn't depend on anything else from Python or autoconf or
> so on. Maybe.

This is *still* too simplistic, but is probably the right idea. I'll try to have a look at it, but won't be disappointed if someone beats me to it.

Cheers,
M.

-- There are two kinds of large software systems: those that evolved from small systems and those that don't work. -- Seen on slashdot.org, then quoted by amk

From tim.one@comcast.net Tue May 20 20:05:47 2003
From: tim.one@comcast.net (Tim Peters)
Date: Tue, 20 May 2003 15:05:47 -0400
Subject: [Python-Dev] Re: C new-style classes and GC
In-Reply-To: <2mbrxxpib1.fsf@starship.python.net>
Message-ID:

[Michael Hudson]
>> pyapicompat.h could presumably import more or less verbatim the
>> whole preprocessory mess that defines PyAPI_FUNC in Python today?
>> AFAIK it doesn't depend on anything else from Python or autoconf or
>> so on. Maybe.

[Michael too]
> This is *still* too simplistic, but is probably the right idea. I'll
> try to have a look at it, but won't be disappointed if someone beats
> me to it.

I agree on all counts. A difficulty with preprocessor symbols is their very low "discoverability"; for example, the current maze takes as input symbols like Py_ENABLE_SHARED and HAVE_DECLSPEC_DLL, and it's rarely clear where all those may be defined, or why.
I got as far as noting that the current version of PC/pyconfig.h defines HAVE_DECLSPEC_DLL, and may define Py_ENABLE_SHARED if Py_NO_ENABLE_SHARED isn't defined, ..., and then the flu convinced me it's time for another nap.

From BPettersen@NAREX.com Tue May 20 20:28:18 2003
From: BPettersen@NAREX.com (Bjorn Pettersen)
Date: Tue, 20 May 2003 13:28:18 -0600
Subject: [Python-Dev] Attribute lookup (was Re: python-dev Summary for 2003-05-01 through 2003-05-15)
Message-ID: <60FB8BB7F0EFC7409B75EEEC13E20192022DE53F@admin56.narex.com>

> From: Phillip J. Eby [mailto:pje@telecommunity.com]

[attribute lookup...]

> > > Also note that special methods are *not* handled specially here.
> > > The behavior Aahz is referring to is that slots (e.g. tp_call) on
> > > new-style types do not retrieve an instance attribute; they are
> > > based purely on class-level data.
> >[...]
> >
> >Wouldn't that be explicitly specified class-level data, i.e. it
> >circumvents the __getattr__ hook completely:
>
> I was focusing on documenting the attribute lookup
> behavior, not the "special methods" behavior. :)

Fair enough :-)

> My point was only that "special methods" aren't implemented
> via attribute lookup, so the attribute lookup rules don't apply.

Very true, although I don't think I could find that in the documentation anywhere... RefMan 3.3 paragraph 1, last sentence "Except where mentioned, attempts to execute an operation raise an exception when no appropriate method is defined." comes close, but seems to be contradicted by the "__getattr__" documentation in 3.3.2.

[..implementing __len__ through __getattr__..]

> >The meta example would have to work to be able to create "true" proxy
> >objects(?)
>
> You can always do this:
>
> class C(object):
>     def __len__(self):
>         return self.getLength()
>
>     def __getattr__(self, attr):
>         if attr == 'getLength':
>             return lambda: 42
>
> if you really need to do that.

Well... no. E.g.
a general RPC proxy might not know what it needs to special case:

class MyProxy(object):
    def __init__(self, server, objID, credentials):
        self.__obj = someRPClib.connect(server, objID, credentials)
    def __getattr__(self, attr):
        def send(*args, **kw):
            self.__obj.remoteExec(attr, args, kw)
        return send

Do you mean defining "stub" methods for _all_ the special methods? (there are quite a few of them...)

> >Is this intended behavior?
>
> You'd have to ask Guido that. :-)

The reason I ask is that I'm trying to convert a compiler.ast graph into a .NET CodeDom graph, and the current behavior seemed unnecessarily restrictive...

-- bjorn

From guido@python.org Tue May 20 20:30:56 2003
From: guido@python.org (Guido van Rossum)
Date: Tue, 20 May 2003 15:30:56 -0400
Subject: [Python-Dev] Need advice, maybe support
In-Reply-To: "Your message of Tue, 20 May 2003 16:38:03 +0200." <3ECA3DCB.50306@tismer.com>
References: <3EC579B4.9000303@tismer.com> <200305182104.h4IL4eY17830@pcp02138704pcs.reston01.va.comcast.net> <3EC91EA0.5090105@tismer.com> <200305192047.h4JKldW19641@pcp02138704pcs.reston01.va.comcast.net> <3EC94A92.2040604@tismer.com> <200305192136.h4JLaOX20032@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <200305201931.h4KJUuT21506@pcp02138704pcs.reston01.va.comcast.net>

> There is one second thought about this, but I'm not sure
> whether it is allowed to do so:
>
> Assuming that I *would* simply do add a field to PyMethodDef,
> and take care that all types coming from foreign binaries
> don't have that special type bit set, could I not simply create
> a new method table and replace it for that external type
> by just changing its method table pointer?

Probably.

I just realize that there are two uses of PyMethodDef.

One is the "classic", where the type's tp_getattr[o] implementation calls Py_FindMethod. The other is the new style where the PyMethodDef array is in tp_methods, and is scanned once by PyType_Ready.
3rd party modules that have been around for a while are likely to use Py_FindMethod. With Py_FindMethod you don't have a convenient way to store the pointer to the converted table, so it may be better to simply check your bit in the first array element and then cast to a PyMethodDef or a PyMethodDefEx array based on what the bit says (you can safely assume that all elements of an array are the same size :-).

> I think traversing method tables is always an action that
> the core dll does. Or do I have to fear that an extension
> does special things to method tables at runtime?

I wouldn't lose sleep over that.

> If that approach is trustworthy, I also could drop
> the request for these 8 bits.

Sure. Ah, a bit in the type would work just as well, and Py_FindMethod *does* have access to the type.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From op73418@mail.telepac.pt Wed May 21 01:41:16 2003
From: op73418@mail.telepac.pt (Gonçalo Rodrigues)
Date: Wed, 21 May 2003 01:41:16 +0100
Subject: [Python-Dev] Descriptor API
Message-ID: <000501c31f31$b0c820b0$f3100dd5@violante>

I was doing some tricks with metaclasses and descriptors in Python 2.2 and stumbled on the following:

>>> class test(object):
...     a = property(lambda: 1)
...
>>> print test.a
<property object at 0x...>
>>> print test.a.__set__
<method-wrapper object at 0x...>
>>> print test.a.fset
None

What this means in practice is that if I want to test if a descriptor is read-only I have to have two tests: One for custom descriptors, checking that getting __set__ does not barf, and another for property, checking that fset returns None.

So, why doesn't getting __set__ raise AttributeError in the above case? Is this a bug? If it's not, it sure is a (minor) feature request from my part :-)

With my best regards,
G.
Rodrigues From tismer@tismer.com Wed May 21 01:50:40 2003 From: tismer@tismer.com (Christian Tismer) Date: Wed, 21 May 2003 02:50:40 +0200 Subject: [Python-Dev] Need advice, maybe support In-Reply-To: <200305201931.h4KJUuT21506@pcp02138704pcs.reston01.va.comcast.net> References: <3EC579B4.9000303@tismer.com> <200305182104.h4IL4eY17830@pcp02138704pcs.reston01.va.comcast.net> <3EC91EA0.5090105@tismer.com> <200305192047.h4JKldW19641@pcp02138704pcs.reston01.va.comcast.net> <3EC94A92.2040604@tismer.com> <200305192136.h4JLaOX20032@pcp02138704pcs.reston01.va.comcast.net> <3ECA3DCB.50306@tismer.com> <200305201931.h4KJUuT21506@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <3ECACD60.10503@tismer.com> Guido van Rossum wrote: >>There is one second thought about this, but I'm not sure >>whether it is allowed to do so: >> >>Assuming that I *would* simply do add a field to PyMethodDef, >>and take care that all types coming from foreign binaries >>don't have that special type bit set, could I not simply create >>a new method table and replace it for that external type >>by just changing its method table pointer? > > > Probably. Promising! Let's see... > I just realize that there are two uses of PyMethodDef. > > One is the "classic", where the type's tp_getattr[o] implementation > calls Py_FindMethod. Right. This one is under my control, since I have the type and so I have or don't have the bit. > The other is the new style where the PyMethodDef > array is in tp_methods, and is scanned once by PyType_Ready. Right, again. Now, under the hopeful assumption that every sensible extension module that has some types to publish also does this through its module dictionary, I would have the opportunity to cause PyType_Ready being called early enough to modify the method table, before any of its methods is used at all. > 3rd party modules that have been around for a while are likely to use > Py_FindMethod. 
With Py_FindMethod you don't have a convenient way to
> store the pointer to the converted table, so it may be better to
> simply check your bit in the first array element and then cast to a
> PyMethodDef or a PyMethodDefEx array based on what the bit says (you
> can safely assume that all elements of an array are the same size :-).

Hee hee, yeah. Of course, if there isn't a reliable way to intercept method table access before the first Py_FindMethod call, I could of course modify Py_FindMethod. For instance, a modified, new-style method table might be required to always start with a dummy entry, where the flags word is completely -1, to signal having been converted to new-style.

...

>>If that approach is trustworthy, I also could drop
>>the request for these 8 bits.
>
> Sure. Ah, a bit in the type would work just as well, and
> Py_FindMethod *does* have access to the type.

You think of the definition in methodobject.c, as it is

"""
/* Find a method in a single method list */
PyObject *
Py_FindMethod(PyMethodDef *methods, PyObject *self, char *name)
"""

, assuming that self always is not NULL, but representing a valid object with a type, and this type is already referring to the methods table? Except for module objects, this seems to be right. I've run Python against a lot of Python modules, but none seems to call Py_FindMethod with a self parameter of NULL. If that is true, then I can patch a small couple of C functions to check for the new bit, and if it's not there, re-create the method table in place. This is music to my ears.

But... Well, there is a drawback: I *do* need two bits, and I hope you will allow me to add this second bit, as well. The one, first bit, tells me if the source has been compiled with Stackless and its extension stuff. Nullo problemo. I can then in-place modify the method table in a compatible way, or leave it as it is, by default. But then, this isn't sufficient to set this bit then, like an "everything is fine, now" relief.
This is so, since this is *still* an old module, and while its type's method tables have been patched, the type is still not augmented by new slots, like the new tp_call_nr slots (and maybe a bazillion to come, soon).

The drawback is that I cannot simply replace the whole type object, since type objects are not represented as object pointers (like they are now, most of the time, in the dynamic heaptype case), but they are constant struct addresses, which the old C module might be referring to.

So, what I think I need is no longer 9 bits, but two of them: One that says "everything great from the beginning", and another one that says "well, ok so far, but this is still an old object".

I do think this is the complete story, now. Instead of requiring nine bits, I'm asking for two. But this is just *your* options; I also can live with one bit, but then I have to add a special, invalid method table entry that just serves for this purpose. In order to keep my source code hack to the minimum, I'd really like to ask for the two bits in the typeobject flags.

Thanks so much for being so supportive -- chris

-- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Johannes-Niemeyer-Weg 9a : *Starship* http://starship.python.net/ 14109 Berlin : PGP key -> http://wwwkeys.pgp.net/ work +49 30 89 09 53 34 home +49 30 802 86 56 pager +49 173 24 18 776 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? http://www.stackless.com/

From tim_one@email.msn.com Wed May 21 04:56:11 2003
From: tim_one@email.msn.com (Tim Peters)
Date: Tue, 20 May 2003 23:56:11 -0400
Subject: [Python-Dev] Simple dicts
In-Reply-To: <000001c31de1$322a0e90$6401a8c0@damien>
Message-ID:

[damien morton]
> ...
> I need to address Tim's concerns about the poor hash function used for
> python integers, but I think this can be addressed easily enough.

Caution: in the context of the current scheme, it's an excellent hash function.
No hash function could be cheaper to compute, and in the common case of dicts indexed by a contiguous range of integers, there are no collisions at all. Christian Tismer contributed the pathological case in the dictobject.c comments, but I don't know that any such case has been seen in real life; the current scheme does OK with it. > I would welcome some guidance about what hash functions need to be > addressed though. Is it just integers? Because, e.g., 42 == 42.0 == 42L, and objects that compare equal must have equal hashcodes, what we do for ints has to be duplicated for at least some floats and longs too, and more generally for user-defined numeric types that can call themselves equal to ints (for example, rationals). For this reason it may not be possible to change the hash code for integers (although it would be possible to scramble the incoming hash code when mapping to a table slot, which is effectively what the current scheme does but only when a primary collision occurs). The string hash code is regular for "consecutive" strings, too (like "ab1", "ab2", "ab3", ...). Instances of user-defined classes that don't define their own __hash__ effectively use the memory address as the hash code, and of course that's also very regular across objects at times. > (theres a great article on integer hash functions at > www.cris.com/~Ttwang/tech/inthash.htm) Cool! I hadn't seen that before -- thanks for the link. From pje@telecommunity.com Wed May 21 12:52:53 2003 From: pje@telecommunity.com (Phillip J. Eby) Date: Wed, 21 May 2003 07:52:53 -0400 Subject: [Python-Dev] Descriptor API In-Reply-To: <000501c31f31$b0c820b0$f3100dd5@violante> Message-ID: <5.1.0.14.0.20030521074949.01feb1d0@mail.telecommunity.com> At 01:41 AM 5/21/03 +0100, Gonçalo Rodrigues wrote: >So, why doesn't getting __set__ raise AttributeError in the above case? Because property() is a type. And that type has __get__ and __set__ methods. >Is this a bug? No. 
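The two checks discussed in this thread (one for custom descriptors, a second for property's fset) can be folded into one small helper. This is only a sketch, the helper name is invented and nothing here is stdlib API:

```python
class ReadOnlyDesc(object):
    """A custom read-only descriptor: defines __get__ but no __set__."""
    def __get__(self, obj, objtype=None):
        return 1

def is_writable_descriptor(d):
    # property instances always expose __set__ via their type, even when
    # read-only, so they need the separate fset check discussed above.
    if isinstance(d, property):
        return d.fset is not None
    return hasattr(type(d), '__set__')

p = property(lambda self: 1)
print(hasattr(type(p), '__set__'))             # True, despite being read-only
print(is_writable_descriptor(p))               # False
print(is_writable_descriptor(ReadOnlyDesc()))  # False
```

In current CPython the read-only case is only detected at assignment time: property.__set__ exists regardless, and raises AttributeError when fset is None, which is why the attribute check alone cannot distinguish the two cases.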
>If it's not, it sure is a (minor) feature request from my >part :-) To do this would require there to be two types, and 'property()' be a function that selected which of the two types to instantiate. Why do you care whether the attribute is read-only? Are you writing a documentation tool? From g9robjef@cdf.toronto.edu Thu May 22 03:04:28 2003 From: g9robjef@cdf.toronto.edu (Jeffery Roberts) Date: Wed, 21 May 2003 22:04:28 -0400 (EDT) Subject: [Python-Dev] Introduction Message-ID: Hello all ! I'm new to the list and thought I would quickly introduce myself. My name is Jeff and I am a university student [4th year] living in Toronto. I would love to be able to help with Python-dev in some way. I'm especially interested in issues directly related to the interpreter itself. I have gained some compiler development experience while at the university and would love to continue working in this area. If anyone has any thoughts or suggestions on how best I could proceed in this direction, I would love to hear them. Thanks ! Jeff Roberts From tismer@tismer.com Thu May 22 03:25:48 2003 From: tismer@tismer.com (Christian Tismer) Date: Thu, 22 May 2003 04:25:48 +0200 Subject: [Python-Dev] Introduction In-Reply-To: References: Message-ID: <3ECC352C.5060307@tismer.com> Jeffery Roberts wrote: > Hello all ! > > I'm new to the list and thought I would quickly introduce myself. My name > is Jeff and I am a university student [4th year] living in Toronto. > > I would love to be able to help with Python-dev in some way. I'm > especially interested in issues directly related to the interpreter > itself. I have gained some compiler development experience while at the > university and would love to continue working in this area. > > If anyone has any thoughts or suggestions on how best I could proceed in > this direction, I would love to hear them. All I can say is: Get involved with PyPy! There is nothing harder, Python related stuff that I know of. 
It can of course do some damage to your brain. I know what I'm talking about. Google for pypy and you got it. cheers - chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Johannes-Niemeyer-Weg 9a : *Starship* http://starship.python.net/ 14109 Berlin : PGP key -> http://wwwkeys.pgp.net/ work +49 30 89 09 53 34 home +49 30 802 86 56 pager +49 173 24 18 776 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? http://www.stackless.com/ From tim.one@comcast.net Thu May 22 04:11:51 2003 From: tim.one@comcast.net (Tim Peters) Date: Wed, 21 May 2003 23:11:51 -0400 Subject: [Python-Dev] Introduction In-Reply-To: Message-ID: [Jeffery Roberts] > I'm new to the list and thought I would quickly introduce myself. My > name is Jeff and I am a university student [4th year] living in > Toronto. > > I would love to be able to help with Python-dev in some way. I'm > especially interested in issues directly related to the interpreter > itself. I have gained some compiler development experience while at > the university and would love to continue working in this area. > > If anyone has any thoughts or suggestions on how best I could proceed > in this direction, I would love to hear them. As Christian said, you should enjoy pypy (an ambitious new project). Less ambitious is a rewrite of the front end, currently in progress on the ast-branch branch of the Python CVS repository. If you'd like to get your feet wet first, there's always a backlog of Python bug and patch reports on SourceForge begging for attention. Check out http://www.python.org/dev/ for orientation, and leave your spare time at the door . From martin@v.loewis.de Thu May 22 08:10:12 2003 From: martin@v.loewis.de (Martin v. 
=?iso-8859-15?q?L=F6wis?=) Date: 22 May 2003 09:10:12 +0200 Subject: [Python-Dev] Introduction In-Reply-To: References: Message-ID: Jeffery Roberts writes: > I would love to be able to help with Python-dev in some way. I'm > especially interested in issues directly related to the interpreter > itself. I have gained some compiler development experience while at the > university and would love to continue working in this area. In addition to what Christian suggested, the most valuable short-term contribution would be to look into open bug reports, and propose fixes for them. In particular, the Parser/Compiler, and "Python Interpreter Core" bug categories might attract you (there are 4 bugs in the former, and about 40 in the latter category). Many of these issues are still open because they are really tricky, so expect some of these to be middle-sized projects on their own. Regards, Martin From fdrake@acm.org Thu May 22 15:09:25 2003 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Thu, 22 May 2003 10:09:25 -0400 Subject: [Python-Dev] Preparing docs for Python 2.2.3 Message-ID: <16076.55829.218985.714016@grendel.zope.com> I'll be preparing the Python docs for the 2.2.3 release today. If there are any fixes for 2.2.3 that absolutely *must* go in, we need to get them in over the next four hours. I don't expect to have any sort of Internet access from Friday (tomorrow) through next Tuesday, so the docs really need to be finished today. -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From barry@python.org Thu May 22 15:23:56 2003 From: barry@python.org (Barry Warsaw) Date: 22 May 2003 10:23:56 -0400 Subject: [Python-Dev] Python 2.2.3 Message-ID: <1053613436.816.3.camel@barry> We're going to put together Python 2.2.3 for release today. Plan on a check-in freeze starting at 3pm EDT. If you have stuff you need to get in, do it now, but please be conservative. 
-Barry From barry@python.org Thu May 22 16:03:32 2003 From: barry@python.org (Barry Warsaw) Date: 22 May 2003 11:03:32 -0400 Subject: [Python-Dev] Re: Python 2.2.3 In-Reply-To: <1053613436.816.3.camel@barry> References: <1053613436.816.3.camel@barry> Message-ID: <1053615812.816.26.camel@barry> On Thu, 2003-05-22 at 10:23, Barry Warsaw wrote: > We're going to put together Python 2.2.3 for release today. Plan on a > check-in freeze starting at 3pm EDT. If you have stuff you need to get > in, do it now, but please be conservative. Let me clarify. After Pylab discussions, we've decided we're going to make this 2.2.3c1 (release candidate 1). It's important that folks with commercial (and other) interest in a solid 2.2.3 release have time to test it, so we'll do the final 2.2.3 release next week. -Barry From skip@pobox.com Thu May 22 17:35:47 2003 From: skip@pobox.com (Skip Montanaro) Date: Thu, 22 May 2003 11:35:47 -0500 Subject: [Python-Dev] Of what use is commands.getstatus() Message-ID: <16076.64611.579396.832520@montanaro.dyndns.org> I was reading the docs for the commands module and noticed getstatus() seems to be completely unrelated to getstatusoutput() and getoutput(). I thought, "I'll correct the docs. They must be wrong." Then I looked at commands.py and saw the docs are correct. It's the function definition which is weird. Of what use is it to return 'ls -ld file'? Based on its name I would have guessed its function was def getstatus(cmd): """Return status of executing cmd in a shell.""" return getstatusoutput(cmd)[0] This particular function dates from 1990, so it clearly can't just be deleted, but it seems completely superfluous to me, especially given the existence of os.stat, os.listdir, etc. Should it be deprecated or modified to do (what I think is) the obvious thing?
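[Editor's note: for readers following along today, the pair Skip contrasts getstatus() with survives as subprocess.getstatusoutput()/getoutput() in Python 3; the re-creation of getstatus() below is illustrative only, not the historical commands.py source.]

```python
# Sketch of the behaviour under discussion, using the modern subprocess
# equivalents of the old commands module functions.
import subprocess

# getstatusoutput() runs a command in a shell and returns (status, output),
# with the trailing newline stripped from the output.
status, output = subprocess.getstatusoutput('echo hello')
print(status, repr(output))

# Roughly what commands.getstatus(file) actually did: run "ls -ld file"
# and return its *output*, not a status (illustrative re-creation):
def getstatus(file):
    return subprocess.getoutput('ls -ld ' + file)
```
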
Skip From barry@python.org Thu May 22 17:43:21 2003 From: barry@python.org (Barry Warsaw) Date: 22 May 2003 12:43:21 -0400 Subject: [Python-Dev] Python 2.2.3 setup.py patch for RH9 (redux) Message-ID: <1053621801.816.45.camel@barry> --=-GNmTfOCl3Eqs11XlSBK0 Content-Type: text/plain Content-Transfer-Encoding: 7bit Back here: http://mail.python.org/pipermail/python-dev/2003-April/035120.html I mentioned a failure with dbm module on RedHat 9 which does not fail for RedHat 7.3. Here's what I think is a slightly better patch that I'd like to commit. Anybody else who's doing testing on other systems, could you please try this out and let me know if it causes any problems? Thanks, -Barry --=-GNmTfOCl3Eqs11XlSBK0 Content-Disposition: attachment; filename=setup.py-patch Content-Type: text/x-patch; name=setup.py-patch; charset=ISO-8859-15 Content-Transfer-Encoding: 7bit Index: setup.py =================================================================== RCS file: /cvsroot/python/python/dist/src/setup.py,v retrieving revision 1.73.4.18 diff -u -r1.73.4.18 setup.py --- setup.py 18 May 2003 13:42:58 -0000 1.73.4.18 +++ setup.py 22 May 2003 16:39:08 -0000 @@ -406,6 +406,9 @@ elif self.compiler.find_library_file(lib_dirs, 'db1'): exts.append( Extension('dbm', ['dbmmodule.c'], libraries = ['db1'] ) ) + elif self.compiler.find_library_file(lib_dirs, 'gdbm'): + exts.append( Extension('dbm', ['dbmmodule.c'], + libraries = ['gdbm'] ) ) else: exts.append( Extension('dbm', ['dbmmodule.c']) ) --=-GNmTfOCl3Eqs11XlSBK0-- From barry@python.org Thu May 22 17:56:02 2003 From: barry@python.org (Barry Warsaw) Date: 22 May 2003 12:56:02 -0400 Subject: [Python-Dev] One other 2.2.3 failure Message-ID: <1053622561.816.51.camel@barry> The only other test suite failure I see for Python 2.2.3 is in test_linuxaudiodev.py. But since this fails for me in Python 2.3cvs too, I'm inclined to chalk that up to not having audio set up correctly on my boxes. What say ye who haveth a working audio on Linux?
-Barry From barry@python.org Thu May 22 17:57:25 2003 From: barry@python.org (Barry Warsaw) Date: 22 May 2003 12:57:25 -0400 Subject: [Python-Dev] Python 2.2.3 setup.py patch for RH9 (redux) In-Reply-To: <1053621801.816.45.camel@barry> References: <1053621801.816.45.camel@barry> Message-ID: <1053622645.816.53.camel@barry> On Thu, 2003-05-22 at 12:43, Barry Warsaw wrote: > Back here: > > http://mail.python.org/pipermail/python-dev/2003-April/035120.html > > I mentioned a failure with dbm module on RedHat 9 which does not fail > for RedHat 7.3. Here's I think a slightly better patch that I'd like to > commit. Anybody else who's doing testing on other systems, could you > please try this out and let me know if it causes any problems? I see no regressions for RedHat 7.3 so I'm feeling optimistic about this patch . -Barry From skip@pobox.com Thu May 22 18:30:18 2003 From: skip@pobox.com (Skip Montanaro) Date: Thu, 22 May 2003 12:30:18 -0500 Subject: [Python-Dev] Python 2.2.3 setup.py patch for RH9 (redux) In-Reply-To: <1053621801.816.45.camel@barry> References: <1053621801.816.45.camel@barry> Message-ID: <16077.2346.379038.567216@montanaro.dyndns.org> Barry> I mentioned a failure with dbm module on RedHat 9 which does not Barry> fail for RedHat 7.3. Here's I think a slightly better patch that Barry> I'd like to commit. Anybody else who's doing testing on other Barry> systems, could you please try this out and let me know if it Barry> causes any problems? Works for me on Mac OS X. Of course, it doesn't actually link with gdbm, so plot a very small data point on your graph. ;-) Skip From g9robjef@cdf.toronto.edu Thu May 22 21:56:51 2003 From: g9robjef@cdf.toronto.edu (Jeffery Roberts) Date: Thu, 22 May 2003 16:56:51 -0400 (EDT) Subject: [Python-Dev] Introduction In-Reply-To: References: Message-ID: Thanks for all of your replies. The front-end rewrite sounds especially interesting. I'm going to look into that. 
Is the entire front end changing (ie scan/parse/ast) or just the AST structure ? If you have any more information or directions please let me know. Jeff On Wed, 21 May 2003, Tim Peters wrote: > [Jeffery Roberts] > > I'm new to the list and thought I would quickly introduce myself. My > > name is Jeff and I am a university student [4th year] living in > > Toronto. > > > > I would love to be able to help with Python-dev in some way. I'm > > especially interested in issues directly related to the interpreter > > itself. I have gained some compiler development experience while at > > the university and would love to continue working in this area. > > > > If anyone has any thoughts or suggestions on how best I could proceed > > in this direction, I would love to hear them. > > As Christian said, you should enjoy pypy (an ambitious new project). Less > ambitious is a rewrite of the front end, currently in progress on the > ast-branch branch of the Python CVS repository. If you'd like to get your > feet wet first, there's always a backlog of Python bug and patch reports on > SourceForge begging for attention. Check out > > http://www.python.org/dev/ > > for orientation, and leave your spare time at the door . > > From drifty@alum.berkeley.edu Thu May 22 21:29:20 2003 From: drifty@alum.berkeley.edu (Brett C.) Date: Thu, 22 May 2003 13:29:20 -0700 Subject: [Python-Dev] Introduction In-Reply-To: References: Message-ID: <3ECD3320.8030605@ocf.berkeley.edu> Tim Peters wrote: >>I moved over to Mozilla Mail and I keep hitting "Reply" when I mean to >>hit "Reply All". Sorry about that. > > > Oh, it doesn't bother me a bit, Brett! I'm more concerned that your > response would have been helpful to the OP, and he didn't get to see it. > Well, lets find out! Here is my email that was meant to go to the list pasted below. Tim Peters wrote: > [Jeffery Roberts] > >> I would love to be able to help with Python-dev in some way. 
I'm >> especially interested in issues directly related to the interpreter >> itself. I have gained some compiler development experience while at >> the university and would love to continue working in this area. >> >> If anyone has any thoughts or suggestions on how best I could proceed >> in this direction, I would love to hear them. > > > If you'd like to get your > feet wet first, there's always a backlog of Python bug and patch reports on > SourceForge begging for attention. I know I learned a lot from working on patches and bugs. It especially helps if you jump in on a patch that is being actively worked on and can ask how something works. Otherwise just read the source until your eyes bleed and curse anyone who doesn't write extensive documentation for code. =) There also has been mention of the AST branch. I know I plan on working on that after I finish going through the bug and patch backlog. Only trouble is that the guys who actually fully understand it (Jeremy, Tim, and Neal) are rather busy so it is going to be a "jump in the pool and drown and hope your flailing manages to at least generate something useful but you die and come back in another life wiser and able to attempt again until you stop drowning and manage to only get sick from gulping down so much chlorinated water". =) > Check out > > http://www.python.org/dev/ > > for orientation, and leave your spare time at the door . > I will vouch for the loss of spare time. This has become a job. Best job ever, though. =) The only big piece of advice I can offer is to just make sure you are nice and cordial on the list; there is a low tolerance for jerks here. Don't take this as meaning to not take a stand on an issue! All I am saying is realize that email does not transcribe humor perfectly and until the list gets used to your personal writing style you might have to just make sure what you write does not come off as insulting. 
-Brett From drifty@alum.berkeley.edu Fri May 23 04:21:20 2003 From: drifty@alum.berkeley.edu (Brett C.) Date: Thu, 22 May 2003 20:21:20 -0700 Subject: [Python-Dev] Introduction In-Reply-To: References: Message-ID: <3ECD93B0.5060704@ocf.berkeley.edu> Jeffery Roberts wrote: > Thanks for all of your replies. The front-end rewrite sounds especially > interesting. I'm going to look into that. Is the entire front end > changing (ie scan/parse/ast) or just the AST structure ? > > If you have any more information or directions please let me know. > It is just a new AST. Redoing/replacing pgen is something else entirely. =) The branch that this is being developed under in CVS is ast-branch. There is an incomplete README in Python/compile.txt that explains the basic idea and direction. -Brett From barry@python.org Fri May 23 04:30:45 2003 From: barry@python.org (Barry Warsaw) Date: Thu, 22 May 2003 23:30:45 -0400 Subject: [Python-Dev] RELEASED Python 2.2.3c1 Message-ID: I'm happy to announce the release of Python 2.2.3c1 (release candidate 1). This is a bug fix release for the stable Python 2.2 code line. Barring any critical issues, we expect to release Python 2.2.3 final by this time next week. We encourage those with an interest in a solid 2.2.3 release to download this candidate and test it on their code. The new release is available here: http://www.python.org/2.2.3/ Python 2.2.3 has a large number of bug fixes and memory leak patches. For full details, see the release notes at http://www.python.org/2.2.3/NEWS.txt There are a small number of minor incompatibilities with Python 2.2.2; for details see: http://www.python.org/2.2.3/bugs.html Perhaps the most important is that the Bastion.py and rexec.py modules have been disabled, since we do not deem them to be safe. As usual, a Windows installer and a Unix/Linux source tarball are made available, as well as tarballs of the documentation in various forms.
At the moment, no Mac version or Linux RPMs are available, although I expect them to appear soon after 2.2.3 final is released. On behalf of Guido, I'd like to thank everyone who contributed to this release, and who continue to ensure Python's success. Enjoy, -Barry From Jack.Jansen@cwi.nl Fri May 23 12:42:20 2003 From: Jack.Jansen@cwi.nl (Jack Jansen) Date: Fri, 23 May 2003 13:42:20 +0200 Subject: [Python-Dev] RELEASED Python 2.2.3c1 In-Reply-To: Message-ID: <9CF2178A-8D13-11D7-A3D6-0030655234CE@cwi.nl> On Friday, May 23, 2003, at 05:30 Europe/Amsterdam, Barry Warsaw wrote: > I'm happy to announce the release of Python 2.2.3c1 (release candidate > 1). Oops, that suddenly went *very* fast, I thought I had until the weekend... Is there a chance I could get #723495 still in before 2.2.3 final? I was also hoping to find a fix for #571343, but I don't have a patch yet (although I'll try to get one up in the next few hours). -- Jack Jansen, , http://www.cwi.nl/~jack If I can't dance I don't want to be part of your revolution -- Emma Goldman From guido@python.org Fri May 23 14:11:35 2003 From: guido@python.org (Guido van Rossum) Date: Fri, 23 May 2003 09:11:35 -0400 Subject: [Python-Dev] Of what use is commands.getstatus() In-Reply-To: "Your message of Thu, 22 May 2003 11:35:47 CDT." <16076.64611.579396.832520@montanaro.dyndns.org> References: <16076.64611.579396.832520@montanaro.dyndns.org> Message-ID: <200305231311.h4NDBZ725779@pcp02138704pcs.reston01.va.comcast.net> > I was reading the docs for the commands module and noticed getstatus() seems > to be completely unrelated to getstatusoutput() and getoutput(). I thought, > "I'll correct the docs. They must be wrong." Then I looked at commands.py > and saw the docs are correct. It's the function definition which is weird. > Of what use is it to return 'ls -ld file'?
Based on its name I would have > guessed its function was > > def getstatus(cmd): > """Return status of executing cmd in a shell.""" > return getstatusoutput(cmd)[0] > > This particular function dates from 1990, so it clearly can't just be > deleted, but it seems completely superfluous to me, especially given the > existence of os.stat, os.listdir, etc. Should it be deprecated or modified > to do (what I think is) the obvious thing? That whole module wasn't thought out very well. I recently tried to use it and found that the stripping of the trailing \n on getoutput() is also a counterproductive feature. I suggest that someone should design a replacement, perhaps to live in shutil, and then we can deprecate it. Until then I would leave it alone. Certainly don't "fix" it by doing something incompatible. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Fri May 23 15:06:05 2003 From: guido@python.org (Guido van Rossum) Date: Fri, 23 May 2003 10:06:05 -0400 Subject: [Python-Dev] Descriptor API In-Reply-To: "Your message of Wed, 21 May 2003 01:41:16 BST." <000501c31f31$b0c820b0$f3100dd5@violante> References: <000501c31f31$b0c820b0$f3100dd5@violante> Message-ID: <200305231406.h4NE65T26180@pcp02138704pcs.reston01.va.comcast.net> > I was doing some tricks with metaclasses and descriptors in Python 2.2 and > stumbled on the following: > > >>> class test(object): > ... a = property(lambda: 1) > ... > >>> print test.a > > >>> print test.a.__set__ > > >>> print test.a.fset > None > > What this means in practice, is that if I want to test if a > descriptor is read-only I have to have two tests: One for custom > descriptors, checking that getting __set__ does not barf and another > for property, checking that fset returns None. Why are you interested in knowing whether a descriptor is read-only? > So, why doesn't getting __set__ raise AttributeError in the above case? > This is a feature.
The presence of __set__ (even if it always raises AttributeError when *called*) signals this as a "data descriptor". The difference between data descriptors and others is that a data descriptor can not be overridden by putting something in the instance dict; a non-data descriptor can be overridden by assignment to an instance attribute, which will store a value in the instance dict. For example, a method is a non-data descriptor (and the prevailing example of such). This means that the following example works: class C(object): def meth(self): return 42 x = C() x.meth() # returns 42 x.meth = lambda: 24 x.meth() # returns 24 > Is this a bug? If it's not, it sure is a (minor) feature request > from my part :-) Because of the above explanation, the request cannot be granted. You can test the property's fset attribute however to tell whether a 'set' argument was passed to the constructor. --Guido van Rossum (home page: http://www.python.org/~guido/) From op73418@mail.telepac.pt Fri May 23 15:34:14 2003 From: op73418@mail.telepac.pt (=?iso-8859-1?Q?Gon=E7alo_Rodrigues?=) Date: Fri, 23 May 2003 15:34:14 +0100 Subject: [Python-Dev] Descriptor API References: <000501c31f31$b0c820b0$f3100dd5@violante> <200305231406.h4NE65T26180@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <001a01c32138$625c09b0$0d40c151@violante> ----- Original Message ----- From: "Guido van Rossum" To: "Gonçalo Rodrigues" Cc: Sent: Friday, May 23, 2003 3:06 PM Subject: Re: [Python-Dev] Descriptor API > > I was doing some tricks with metaclasses and descriptors in Python 2.2 and > > stumbled on the following: > > > > >>> class test(object): > > ... a = property(lambda: 1) > > ...
> > >>> print test.a > > > > >>> print test.a.__set__ > > > > >>> print test.a.fset > > None > > > > What this means in practice, is that if I want to test if a > > descriptor is read-only I have to have two tests: One for custom > > descriptors, checking that getting __set__ does not barf and another > > for property, checking that fset returns None. > > Why are you interested in knowing whether a descriptor is read-only? > Introspection dealing with a metaclass that injected methods in its instances depending on a descriptor. In other words, having fun with Python's wacky tricks. > > So, why doesn't getting __set__ raise AttributeError in the above case? > > This is a feature. The presence of __set__ (even if it always raises > AttributeError when *called*) signals this as a "data descriptor". > The difference between data descriptors and others is that a data > descriptor can not be overridden by putting something in the instance > dict; a non-data descriptor can be overridden by assignment to an > instance attribute, which will store a value in the instance dict. > > For example, a method is a non-data descriptor (and the prevailing > example of such). This means that the following example works: > > class C(object): > def meth(self): return 42 > > x = C() > x.meth() # prints 42 > x.meth = lambda: 24 > x.meth() # prints 24 > > > Is this a bug? If it's not, it sure is a (minor) feature request > > from my part :-) > > Because of the above explanation, the request cannot be granted. > Thanks for the reply (and also to P. Eby, btw). I was way off track when I sent the email, because it did not occur to me that property was a type implementing __get__ and __set__. With this piece of info connecting the dots the idea is just plain foolish. > You can test the property's fset attribute however to tell whether a > 'set' argument was passed to the constructor. > > --Guido van Rossum (home page: http://www.python.org/~guido/) With my best regards, G.
Rodrigues From guido@python.org Fri May 23 15:39:10 2003 From: guido@python.org (Guido van Rossum) Date: Fri, 23 May 2003 10:39:10 -0400 Subject: [Python-Dev] Need advice, maybe support In-Reply-To: "Your message of Wed, 21 May 2003 02:50:40 +0200." <3ECACD60.10503@tismer.com> References: <3EC579B4.9000303@tismer.com> <200305182104.h4IL4eY17830@pcp02138704pcs.reston01.va.comcast.net> <3EC91EA0.5090105@tismer.com> <200305192047.h4JKldW19641@pcp02138704pcs.reston01.va.comcast.net> <3EC94A92.2040604@tismer.com> <200305192136.h4JLaOX20032@pcp02138704pcs.reston01.va.comcast.net> <3ECA3DCB.50306@tismer.com> <200305201931.h4KJUuT21506@pcp02138704pcs.reston01.va.comcast.net> <3ECACD60.10503@tismer.com> Message-ID: <200305231439.h4NEdAu26309@pcp02138704pcs.reston01.va.comcast.net> > > The other is the new style where the PyMethodDef > > array is in tp_methods, and is scanned once by PyType_Ready. > > Right, again. Now, under the hopeful assumption that every > sensible extension module that has some types to publish also > does this through its module dictionary, I would have the > opportunity to cause PyType_Ready being called early enough > to modify the method table, before any of its methods is used > at all. Dangerous assumption! It's not inconceivable that a class would instantiate some of its own classes as part of its module initialization. > > 3rd party modules that have been around for a while are likely to use > > Py_FindMethod. With Py_FindMethod you don't have a convenient way to > > store the pointer to the converted table, so it may be better to > > simply check your bit in the first array element and then cast to a > > PyMethodDef or a PyMethodDefEx array based on what the bit says (you > > can safely assume that all elements of an array are the same size :-). > > Hee hee, yeah. Of course, if there isn't a reliable way to > intercept method table access before the first Py_FindMethod > call, I could of course modify Py_FindMethod. 
For instance, > a modified, new-style method table might be required to always > start with a dummy entry, where the flags word is completely > -1, to signal having been converted to new-style. Why so drastic? You could just set a reserved bit. > ... > > >>If that approach is trustworthy, I also could drop > >>the request for these 8 bits. > > > > Sure. Ah, a bit in the type would work just as well, and > > Py_FindMethod *does* have access to the type. > > You think of the definition in methodobject.c, as it is > > """ > /* Find a method in a single method list */ > > PyObject * > Py_FindMethod(PyMethodDef *methods, PyObject *self, char *name) > """ > > , assuming that self always is not NULL, but representing a valid > object with a type, and this type is already referring to the > methods table? Right. There is already code that uses self->ob_type in Py_FindMethodInChain(), which is called by Py_FindMethod(). > Except for module objects, this seems to be right. I've run > Python against a lot of Python modules, but none seems > to call Py_FindMethod with a self parameter of NULL. I don't think it would be safe to do so. > If that is true, then I can patch a small couple of > C functions to check for the new bit, and if it's not > there, re-create the method table in place. > This is music to my ears. But... > > Well, there is a drawback: > I *do* need two bits, and I hope you will allow me to add this > second bit, as well. > > The one, first bit, tells me if the source has been compiled > with Stackless and its extension stuff. Nullo problemo. > I can then in-place modify the method table in a compatible > way, or leave it as it is, by default. > But then, this isn't sufficient to set this bit, like an > "everything is fine, now" relief.
This is so, since this is *still* > an old module, and while its type's method tables have been > patched, the type is still not augmented by new slots, like > the new tp_call_nr slots (and maybe a bazillion to come, soon). > The drawback is that I cannot simply replace the whole type > object, since type objects are not represented as object > pointers (like they are now, most of the time, in the dynamic > heaptype case), but they are constant struct addresses, where > the old C module might be referring to. > > So, what I think I need is no longer 9 bits, but two of them: > One that says "everything great from the beginning", and another > one that says "well, ok so far, but this is still an old object". > > I do think this is the complete story, now. > Instead of requiring nine bits, I'm asking for two. > But this is just *your* options; I also can live with one bit, > but then I have to add a special, invalid method table entry > that just serves for this purpose. > In order to keep my source code hack to the minimum, I'd really > like to ask for the two bits in the typeobject flags. OK, two bits you shall have. Don't spend them all at once! > Thanks so much for being so supportive -- chris Anything to keep actual stackless support out of the core. :-) --Guido van Rossum (home page: http://www.python.org/~guido/) From barry@python.org Fri May 23 15:55:51 2003 From: barry@python.org (Barry Warsaw) Date: Fri, 23 May 2003 10:55:51 -0400 Subject: [Python-Dev] RELEASED Python 2.2.3c1 In-Reply-To: <9CF2178A-8D13-11D7-A3D6-0030655234CE@cwi.nl> References: <9CF2178A-8D13-11D7-A3D6-0030655234CE@cwi.nl> Message-ID: <3ECE3677.1030500@python.org> Jack Jansen wrote: > > Oops, that suddenly went *very* fast, I thought I had until the weekend... But probably not as fast as it should have. I had fun reading checkin comments like (paraphrasing), "this change is probably important enough for a 2.2.3 release" dated from December of last year.
:) > Is there a chance I could get #723495 still in before 2.2.3 final? I was > also hoping to find a fix for #571343, but I don't have a patch yet > (although I'll try to get one up in the next few hours). I think it would be fine to get these into 2.2.3 final. -Barry From theller@python.net Fri May 23 16:31:06 2003 From: theller@python.net (Thomas Heller) Date: 23 May 2003 17:31:06 +0200 Subject: [Python-Dev] Need advice, maybe support In-Reply-To: <200305231439.h4NEdAu26309@pcp02138704pcs.reston01.va.comcast.net> References: <3EC579B4.9000303@tismer.com> <200305182104.h4IL4eY17830@pcp02138704pcs.reston01.va.comcast.net> <3EC91EA0.5090105@tismer.com> <200305192047.h4JKldW19641@pcp02138704pcs.reston01.va.comcast.net> <3EC94A92.2040604@tismer.com> <200305192136.h4JLaOX20032@pcp02138704pcs.reston01.va.comcast.net> <3ECA3DCB.50306@tismer.com> <200305201931.h4KJUuT21506@pcp02138704pcs.reston01.va.comcast.net> <3ECACD60.10503@tismer.com> <200305231439.h4NEdAu26309@pcp02138704pcs.reston01.va.comcast.net> Message-ID: > > > The other is the new style where the PyMethodDef > > > array is in tp_methods, and is scanned once by PyType_Ready. > > > > Right, again. Now, under the hopeful assumption that every > > sensible extension module that has some types to publish also > > does this through its module dictionary, I would have the > > opportunity to cause PyType_Ready being called early enough > > to modify the method table, before any of its methods is used > > at all. > > Dangerous assumption! It's not inconceivable that a class would > instantiate some of its own classes as part of its module > initialization. I do not really know what you are talking about here, but that assumption is violated by the ctypes module. It has a number of metaclasses implemented in C, neither of them is exposed in the module dictionary, and there *have been* types which were not exposed, because they are only used internally. 
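[Editor's note: Thomas's point is easy to demonstrate even in pure Python. The sketch below is a hypothetical stand-in for what ctypes does in C, not ctypes itself: a metaclass can be created and used purely internally, never published in the module dictionary, so any scheme that only walks module dicts will miss it.]

```python
# A (meta)type used only internally; nothing here is ever exported.
def _build():
    class _InternalMeta(type):      # hypothetical internal metaclass
        pass
    class Exposed(metaclass=_InternalMeta):
        pass
    return Exposed

Exposed = _build()

# The metaclass is alive and reachable through the type, but a scan of
# the module's namespace would never have seen it:
print(type(Exposed).__name__)       # prints _InternalMeta
```
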
Thomas From jeremy@ZOPE.COM Fri May 23 17:23:43 2003 From: jeremy@ZOPE.COM (Jeremy Hylton) Date: 23 May 2003 12:23:43 -0400 Subject: [Python-Dev] Introduction In-Reply-To: References: Message-ID: <1053707023.28095.3.camel@slothrop.zope.com> On Thu, 2003-05-22 at 16:56, Jeffery Roberts wrote: > Thanks for all of your replies. The front-end rewrite sounds especially > interesting. I'm going to look into that. Is the entire front end > changing (ie scan/parse/ast) or just the AST structure ? > > If you have any more information or directions please let me know. The current plan is to create an AST and replace the bytecode compiler. We're leaving a rewrite of the parser for a later project. It's a fairly big project; large parts of it are done, but there is work remaining to do in nearly every part -- the concrete-to-abstract translator, error checking, compilation to byte-code. At the moment, it's possible to start an interactive interpreter session and see what works. But it isn't possible to compile and run all of site.py and everything it imports. Jeremy From logistix@cathoderaymission.net Fri May 23 20:44:19 2003 From: logistix@cathoderaymission.net (logistix) Date: Fri, 23 May 2003 14:44:19 -0500 (CDT) Subject: [Python-Dev] Introduction In-Reply-To: <1053707023.28095.3.camel@slothrop.zope.com> Message-ID: On 23 May 2003, Jeremy Hylton wrote: > On Thu, 2003-05-22 at 16:56, Jeffery Roberts wrote: > > Thanks for all of your replies. The front-end rewrite sounds especially > > interesting. I'm going to look into that. Is the entire front end > > changing (ie scan/parse/ast) or just the AST structure ? > > > > If you have any more information or directions please let me know. > > The current plan is to create an AST and replace the bytecode compiler. > We're leaving a rewrite of the parser for a later project. 
It's a > fairly big project; large parts of it are done, but there is work > remaining to do in nearly every part -- the concrete-to-abstract > translator, error checking, compilation to byte-code. > > At the moment, it's possible to start an interactive interpreter session > and see what works. But it isn't possible to compile and run all of > site.py and everything it imports. > > Jeremy > Should patches just go to sourceforge's "parser/compiler" category, or will that create too much confusion? From jeremy@zope.com Fri May 23 20:44:14 2003 From: jeremy@zope.com (Jeremy Hylton) Date: 23 May 2003 15:44:14 -0400 Subject: [Python-Dev] Introduction In-Reply-To: References: Message-ID: <1053719054.28074.13.camel@slothrop.zope.com> On Fri, 2003-05-23 at 15:44, logistix wrote: > Should patches just go to sourceforge's "parser/compiler" category, or > will that create too much confusion? I think that would be fine. We don't have a lot of parser/compiler patches. Jeremy From tismer@tismer.com Fri May 23 23:54:24 2003 From: tismer@tismer.com (Christian Tismer) Date: Sat, 24 May 2003 00:54:24 +0200 Subject: [Python-Dev] Need advice, maybe support In-Reply-To: References: <3EC579B4.9000303@tismer.com> <200305182104.h4IL4eY17830@pcp02138704pcs.reston01.va.comcast.net> <3EC91EA0.5090105@tismer.com> <200305192047.h4JKldW19641@pcp02138704pcs.reston01.va.comcast.net> <3EC94A92.2040604@tismer.com> <200305192136.h4JLaOX20032@pcp02138704pcs.reston01.va.comcast.net> <3ECA3DCB.50306@tismer.com> <200305201931.h4KJUuT21506@pcp02138704pcs.reston01.va.comcast.net> <3ECACD60.10503@tismer.com> <200305231439.h4NEdAu26309@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <3ECEA6A0.2020206@tismer.com> Thomas Heller wrote: >>>>The other is the new style where the PyMethodDef >>>>array is in tp_methods, and is scanned once by PyType_Ready. >>> >>>Right, again. 
Now, under the hopeful assumption that every >>>sensible extension module that has some types to publish also >>>does this through its module dictionary, I would have the >>>opportunity to cause PyType_Ready being called early enough >>>to modify the method table, before any of its methods is used >>>at all. >> >>Dangerous assumption! It's not inconceivable that a class would >>instantiate some of its own classes as part of its module >>initialization. First time that I saw this. I do agree that it is possible to break every compatibility scheme. Especially in your module's case, I would not assume that anybody would consider not to use the most recent version and compile it against the most recent sources? The topic I'm talking about is old code which should continue to run. > I do not really know what you are talking about here, but that > assumption is violated by the ctypes module. > It has a number of metaclasses implemented in C, neither of them > is exposed in the module dictionary, and there *have been* types which > were not exposed, because they are only used internally. Hmm. Ok. Then I am really interested if you have an idea how to solve this efficiently. My current solution is augmenting method tables by sibling elements, which is a) not nice and b) involves extra flags in ml_flags, which is not as efficient as possible. Martin proposed to grow a second method table and to maintain it in parallel. This is possible, but also seems to involve quite some runtime overhead. What I'm seeking is a place that gives a secure solution, without involving code that is executed frequently. On the other hand, this issue is about *most* foreign, old code. I think I could stand it if the one or the other module simply requires to be re-compiled with the current stackless version, if this doesn't mean to re-compile everything. ciao - chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break!
Take a ride on Python's Johannes-Niemeyer-Weg 9a : *Starship* http://starship.python.net/ 14109 Berlin : PGP key -> http://wwwkeys.pgp.net/ work +49 30 89 09 53 34 home +49 30 802 86 56 pager +49 173 24 18 776 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? http://www.stackless.com/ From tismer@tismer.com Sat May 24 00:08:40 2003 From: tismer@tismer.com (Christian Tismer) Date: Sat, 24 May 2003 01:08:40 +0200 Subject: [Python-Dev] Need advice, maybe support In-Reply-To: <200305231439.h4NEdAu26309@pcp02138704pcs.reston01.va.comcast.net> References: <3EC579B4.9000303@tismer.com> <200305182104.h4IL4eY17830@pcp02138704pcs.reston01.va.comcast.net> <3EC91EA0.5090105@tismer.com> <200305192047.h4JKldW19641@pcp02138704pcs.reston01.va.comcast.net> <3EC94A92.2040604@tismer.com> <200305192136.h4JLaOX20032@pcp02138704pcs.reston01.va.comcast.net> <3ECA3DCB.50306@tismer.com> <200305201931.h4KJUuT21506@pcp02138704pcs.reston01.va.comcast.net> <3ECACD60.10503@tismer.com> <200305231439.h4NEdAu26309@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <3ECEA9F8.9000007@tismer.com> Guido van Rossum wrote: [me about time of type initialization] > Dangerous assumption! It's not inconceivable that a class would > instantiate some of its own classes as part of its module > initialization. But we agree that an extension would somehow call into the core to initialize its types/classes. >>For instance, >>a modified, new-style method table might be required to always >>start with a dummy entry, where the flags word is completely >>-1, to signal having been converted to new-style. > > > Why so drastic? You could just set a reserved bit. Doesn't matter. What I want is that, at initialization time, it is very clear what to initialize and how. At run-time, I don't want anything to remain that slows matters down.
Therefore, creating an invalid slot for method tables was kind of an idea to signal that there is some special attention needed during method initialization. ... >>Except for module objects, this seems to be right. I've run >>Python against a lot of Python modules, but none seems >>to call Py_FindMethod with a self parameter of NULL. > > > I don't think it would be safe to do so. Further analysis has proven that you're right. [more theoretical stuff, maybe not trustworthy without verification] > OK, two bits you shall have. Don't spend them all at once! Took them, chewing on them. >>Thanks so much for being so supportive -- chris > > Anything to keep actual stackless support out of the core. :-) Ahhh, that's the reason behind the generous intention? :-)) Ok with me, I got my two bits. But there is something else that might be interesting for very many Python users. Not yet announced, but you are invited to my EuroPy talk. cheers - chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Johannes-Niemeyer-Weg 9a : *Starship* http://starship.python.net/ 14109 Berlin : PGP key -> http://wwwkeys.pgp.net/ work +49 30 89 09 53 34 home +49 30 802 86 56 pager +49 173 24 18 776 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? http://www.stackless.com/
From niemeyer@conectiva.com Sat May 24 16:26:12 2003 From: niemeyer@conectiva.com (Gustavo Niemeyer) Date: Sat, 24 May 2003 12:26:12 -0300 Subject: [Python-Dev] RELEASED Python 2.2.3c1 In-Reply-To: <3ECE3677.1030500@python.org> References: <9CF2178A-8D13-11D7-A3D6-0030655234CE@cwi.nl> <3ECE3677.1030500@python.org> Message-ID: <20030524152612.GA22309@ibook.distro.conectiva> Hi Barry! > >Oops, that suddenly went *very* fast, I thought I had until the weekend... > > But probably not as fast as it should have. I had fun reading checkin > comments like (paraphrasing), "this change is probably important enough > for a 2.2.3 release" dated from December of last year. :) Indeed. I'd like to have worked more on Python 2.2.3. Unfortunately, Conectiva Linux was released just a few weeks ago, and my free time suddenly vanished in that period. > >Is there a chance I could get #723495 still in before 2.2.3 final? I was > >also hoping to find a fix for #571343, but I don't have a patch yet > >(although I'll try to get one up in the next few hours). > > I think it would be fine to get these into 2.2.3 final. I have some time this weekend to work on Python.
Do you think it'd be ok to backport some of the fixes we have introduced in the regular expression engine in 2.3 to 2.2.3, or is it too late? We have a sf patch open about that, but I'd like to port only the changes that don't require major changes in the engine. Also, have you seen the message about urllib2 I sent a few days ago? Would that be something important to have in 2.2.3 (or even in 2.3)? Do you plan to produce another release candidate? Thanks! -- Gustavo Niemeyer [ 2AAC 7928 0FBF 0299 5EB5 60E2 2253 B29A 6664 3A0C ] From martin@v.loewis.de Sat May 24 17:01:08 2003 From: martin@v.loewis.de (Martin v. Löwis) Date: 24 May 2003 18:01:08 +0200 Subject: [Python-Dev] RELEASED Python 2.2.3c1 In-Reply-To: <20030524152612.GA22309@ibook.distro.conectiva> References: <9CF2178A-8D13-11D7-A3D6-0030655234CE@cwi.nl> <3ECE3677.1030500@python.org> <20030524152612.GA22309@ibook.distro.conectiva> Message-ID: Gustavo Niemeyer writes: > I have some time this weekend to work on Python. Do you think it'd be ok > to backport some of the fixes we have introduced in the regular > expression engine in 2.3 to 2.2.3, or is it too late? We have a sf patch > open about that, but I'd like to port only the changes that don't > require major changes in the engine. I strongly advise to defer such changes to 2.2.4. This is really tricky code, and changes should ideally be reviewed by three different experts (including the author of the changes). > Also, have you seen the message about urllib2 I sent a few days ago? > Would that be something important to have in 2.2.3 (or even in 2.3)? Nothing that is not in 2.3 right now can go into 2.2.3. Only backports of accepted changes should be applied to the 2.2 branch. > Do you plan to produce another release candidate? My understanding is that there was only one release candidate planned.
Changes that require another release candidate should not be applied right now, since another release candidate won't give them the testing that they need. Regards, Martin From niemeyer@conectiva.com Sat May 24 17:08:33 2003 From: niemeyer@conectiva.com (Gustavo Niemeyer) Date: Sat, 24 May 2003 13:08:33 -0300 Subject: [Python-Dev] RELEASED Python 2.2.3c1 In-Reply-To: References: <9CF2178A-8D13-11D7-A3D6-0030655234CE@cwi.nl> <3ECE3677.1030500@python.org> <20030524152612.GA22309@ibook.distro.conectiva> Message-ID: <20030524160832.GB22309@ibook.distro.conectiva> > > I have some time this weekend to work on Python. Do you think it'd be ok > > to backport some of the fixes we have introduced in the regular > > expression engine in 2.3 to 2.2.3, or is it too late? We have a sf patch > > open about that, but I'd like to port only the changes that don't > > require major changes in the engine. > > I strongly advise to defer such changes to 2.2.4. This is really > tricky code, and changes should ideally be reviewed by three different > experts (including the author of the changes). Ack. I'll wait until 2.2.3 is out to touch that code. I'll look for something else to do on Python this weekend. If you need any help with 2.2.3, please contact me. > > Also, have you seen the message about urllib2 I sent a few days ago? > > Would that be something important to have in 2.2.3 (or even in 2.3)? > > Nothing that is not in 2.3 right now can go into 2.2.3. Only backports > of accepted changes should be applied to the 2.2 branch. Ok. I'll leave 2.2.3 alone, and fix that behavior in urllib2 for 2.3. > > Do you plan to produce another release candidate? > > My understanding is that there was only one release candidate > planned. Changes that require another release candidate should not be > applied right now, since another release candidate won't give them the > testing that they need. Agreed.
-- Gustavo Niemeyer [ 2AAC 7928 0FBF 0299 5EB5 60E2 2253 B29A 6664 3A0C ] From niemeyer@conectiva.com Sat May 24 20:03:25 2003 From: niemeyer@conectiva.com (Gustavo Niemeyer) Date: Sat, 24 May 2003 16:03:25 -0300 Subject: [Python-Dev] urllib2 proxy support broken? In-Reply-To: <20030519212807.GA29002@ibook.distro.conectiva> References: <20030519212807.GA29002@ibook.distro.conectiva> Message-ID: <20030524190325.GA30748@ibook.distro.conectiva> > I've just tried to use the proxy support in urllib2, and was surprised > by the fact that it seems to be broken, at least in 2.2 and 2.3. Can > somebody please confirm that it's really broken, so that I can prepare > a patch? Ok.. I have prepared a simple fix for this, and sent it to SF patch #742823. This fix should be backwards compatible, and at the same time allows any kind of further customization of pre-defined and user-defined classes. Can someone please have a look at it before I check it in? -- Gustavo Niemeyer [ 2AAC 7928 0FBF 0299 5EB5 60E2 2253 B29A 6664 3A0C ] From barry@python.org Sat May 24 21:49:48 2003 From: barry@python.org (Barry Warsaw) Date: Sat, 24 May 2003 16:49:48 -0400 Subject: [Python-Dev] RELEASED Python 2.2.3c1 In-Reply-To: Message-ID: <4215F6F7-8E29-11D7-A28B-003065EEFAC8@python.org> On Saturday, May 24, 2003, at 12:01 PM, Martin v. Löwis wrote: >> Do you plan to produce another release candidate? > > My understanding is that there was only one release candidate > planned. Changes that require another release candidate should not be > applied right now, since another release candidate won't give them the > testing that they need. Martin's right. Unless Guido specifically overrides, please be ultra-conservative.
-Barry From tim_one@email.msn.com Sun May 25 06:21:22 2003 From: tim_one@email.msn.com (Tim Peters) Date: Sun, 25 May 2003 01:21:22 -0400 Subject: [Python-Dev] RE: [Python-checkins] python/dist/src/Lib weakref.py,1.19,1.20 In-Reply-To: <000201c32277$59041c00$125ffea9@oemcomputer> Message-ID: [redirected to python-dev] [Tim] >> Someone review this, please! Final releases are getting close, Fred >> (the weakref guy) won't be around until Tuesday, and the pre-patch >> code can indeed raise spurious RuntimeErrors in the presence of >> threads or mutating comparison functions. >> >> See the bug report for my confusions: I can't see any reason why >> __delitem__ iterated over the keys. [Raymond Hettinger] > Until reading the note on threads, I didn't see the error and thought > the original code was valid because it returned after the deletion > instead of continuing to loop through iterkeys. Note that one of the new tests I checked in provoked RuntimeError without threads. >> The new one-liner implementation is much faster, can't raise >> RuntimeError, and should be >> better-behaved in all respects wrt threads. > Yes, that solves the OP's problem. >> Bugfix candidate for 2.2.3 too, if someone else agrees with this >> patch. > The original code does its contortions to avoid raising a KeyError > whenever the dictionary entry might have disappeared due to the > ref count falling to zero and then a new, equal key was formed later. Sorry, I can't picture what you're trying to say. Show some code? If in a weak-keyed dict d I do d[k1] = v del k1 # last reference, so the dict mutation went away del d[k2] # where k2 happens to compare equal to what k1 was then I claim that *should* raise KeyError, and pretty obviously so.
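Tim's claim is easy to demonstrate. This is not code from the thread, just a minimal sketch of the semantics he describes, run under modern Python (whose WeakKeyDictionary matches the patched behavior); the class name Key is invented for illustration:

```python
import gc
import weakref

class Key:
    """A weakly referenceable key (most user-defined classes are)."""

d = weakref.WeakKeyDictionary()
k1 = Key()
d[k1] = "v"
assert len(d) == 1

del k1        # drop the last strong reference; the entry goes away
gc.collect()  # immediate under CPython's refcounting, explicit elsewhere
assert len(d) == 0

# Deleting a key that has no entry raises KeyError, just like a plain dict:
try:
    del d[Key()]
except KeyError:
    print("KeyError, as expected")
```

Once the referent is collected, the mapping behaves as if the entry never existed, so a subsequent delete through an equal key has nothing to find.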
Note that the other new test I checked in showed that del d[whatever] never raised KeyError before; I can't see how that can be called a feature, and if someone thinks it was they neglected to document it, or write a test that failed when I changed the behavior . > If the data disappeared, then, I think ref(key) will return None No, ref(x) never returns None, regardless of what x may be. It may raise TypeError if x is not of a weakly referencable type, and it may raise MemoryError if we don't have enough memory left to construct a weakref, but those are the only things that can go wrong. w = ref(x) followed later by w() will return None, iff x has gone away in the meantime -- maybe that's what you're thinking of. > which is a bummer because that is then used (in your patch) > as a lookup key. If x and y are two weakly-referencable objects (not necessarily distinct) that compare equal, then ref(x) == ref(y) and hash(ref(x)) == hash(ref(y)) so long as both ref(x)() and ref(y)() don't return None (i.e., so long as x and y are both still alive). So when I map del d[k1] to del d.data[ref(k1)] it will succeed if and only if d.data has a key for a still-live object, and that key compares equal to k1; else it will raise KeyError (or maybe TypeError if k1 is a silly key to test in a weak-keyed dict, or MemoryError if we run out of memory). That's what I believe it should do. > The safest approach (until Fred re-appears) is to keep the original > approach but use keys() instead of iterkeys(). Then, wrap the > actual deletion in a try / except KeyError to handle a thread > race to delete the same weakref object. I'm not clear on what that means. By "delete the same weakref object", do you mean that both threads try to do del d[k] with the same k and the same weak-valued dict d? If so, then I think one of them *should* see KeyError, exactly the same as if they tried pulling this trick with a regular dict. > I'm sure there is a better way and will take another look tomorrow.
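The distinction Tim draws between ref(x) itself and calling its result can be shown in a few lines (again a sketch, not code from the thread; the class name Obj is invented):

```python
import gc
import weakref

class Obj:
    """A weakly referenceable object."""

x = Obj()
w = weakref.ref(x)   # ref() itself never returns None...
assert w is not None
assert w() is x      # ...and calling w returns the referent while alive

del x                # drop the last strong reference
gc.collect()
assert w() is None   # now calling w returns None

# ref() can, however, raise TypeError for non-weakly-referenceable types:
try:
    weakref.ref(1)
except TypeError:
    print("TypeError, as expected")
```

So ref(key) in __delitem__ either yields a usable lookup key or raises; it never silently produces None, which is the point Tim is making against Raymond's reading of the docs.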
Thanks for trying, but I still don't get it. It would help if you could show specific code that you believe worked correctly before but is broken now. I added two new tests showing what I believe to be code that was broken before but works now, and no changes to the existing tests were needed. From python@rcn.com Sun May 25 07:05:28 2003 From: python@rcn.com (Raymond Hettinger) Date: Sun, 25 May 2003 02:05:28 -0400 Subject: [Python-Dev] RE: [Python-checkins] python/dist/src/Lib weakref.py,1.19,1.20 References: Message-ID: <001a01c32283$a4ec0fe0$125ffea9@oemcomputer> [Raymondo] > > The original code does its contortions to avoid raising a KeyError > > whenever the dictionary entry might have disappeared due to the > > ref count falling to zero and then a new, equal key was formed later. [Timbot] > Sorry, I can't picture what you're trying to say. Show some code? Python 2.3b1 (#40, Apr 25 2003, 19:06:24) [MSC v.1200 32 bit (Intel)] on win32 Type "copyright", "credits" or "license" for more information. IDLE 0.8 -- press F1 for help >>> class C: pass >>> import weakref >>> wkd = weakref.WeakKeyDictionary() >>> del wkd[C()] >>> # No complaints Python 2.3b1+ (#40, May 23 2003, 00:08:36) [MSC v.1200 32 b Type "help", "copyright", "credits" or "license" for more i >>> class C: pass ... >>> import weakref >>> wkd = weakref.WeakKeyDictionary() >>> del wkd[C()] Traceback (most recent call last): File "", line 1, in ? File "C:\PY23\lib\weakref.py", line 167, in __delitem__ del self.data[ref(key)] KeyError: >>> # Complains now. [Raymond] > > If the data disappeared, then, I think ref(key) will return None [Timbot] > No, ref(x) never returns None, regardless of what x may be. It may raise > TypeError if x is not of a weakly referencable type, and it may raise > MemoryError if we don't have enough memory left to construct a weakref, but > those are the only things that can go wrong. [Current version of the docs] """ ref( object[, callback]) Return a weak reference to object. 
The original object can be retrieved by calling the reference object if the referent is still alive; if the referent is no longer alive, calling the reference object will cause None to be returned. """ Raymond Hettinger From tim_one@email.msn.com Sun May 25 07:29:02 2003 From: tim_one@email.msn.com (Tim Peters) Date: Sun, 25 May 2003 02:29:02 -0400 Subject: [Python-Dev] RE: [Python-checkins] python/dist/src/Lib weakref.py,1.19,1.20 In-Reply-To: <001a01c32283$a4ec0fe0$125ffea9@oemcomputer> Message-ID: [Raymondo] >>> The original code does its contortions to avoid raising a KeyError >>> whenever the dictionary entry might have disappeared due to the >>> ref count falling to zero and then a new, equal key was formed >>> later. [Timbot] >> Sorry, I can't picture what you're trying to say. Show some code? [Razor] > Python 2.3b1 (#40, Apr 25 2003, 19:06:24) [MSC v.1200 32 bit (Intel)] > on win32 Type "copyright", "credits" or "license" for more > information. IDLE 0.8 -- press F1 for help > >>> class C: pass > ... > >>> import weakref > >>> wkd = weakref.WeakKeyDictionary() > >>> del wkd[C()] > >>> # No complaints Right, and I call that a bug. One of the new tests I checked in does exactly that, BTW. As I said last time, the idea that trying to delete a key from a weak-keyed dict never raises KeyError was neither documented nor verified by a test, so there's no reason to believe it was anything other than a bug in the implementation of __delitem__. > Python 2.3b1+ (#40, May 23 2003, 00:08:36) [MSC v.1200 32 b > Type "help", "copyright", "credits" or "license" for more i > >>> class C: pass > ... > >>> import weakref > >>> wkd = weakref.WeakKeyDictionary() > >>> del wkd[C()] > Traceback (most recent call last): > File "", line 1, in ? > File "C:\PY23\lib\weakref.py", line 167, in __delitem__ > del self.data[ref(key)] > KeyError: > >>> # Complains now. Right, and that's intentional, and tested now too. 
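The consistency Tim argues for here — __delitem__ complaining the same way __getitem__ always has — can be sketched directly (modern Python, whose WeakKeyDictionary behaves like the patched version; the class name C mirrors the sessions above):

```python
import weakref

class C:
    pass

wkd = weakref.WeakKeyDictionary()

# Lookup and deletion of a never-inserted key now fail the same way,
# because both reduce to an operation on wkd.data[ref(key)]:
for op in (lambda: wkd[C()], lambda: wkd.__delitem__(C())):
    try:
        op()
    except KeyError:
        print("KeyError")
```

Both operations print KeyError, matching the "Complains now" session above rather than the silent pre-patch behavior.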
It's always been the case (and still is) that wkd[C()] raised KeyError too -- why should __delitem__, and only __delitem__, be exempt from complaining about a senseless operation? >>> If the data disappeared, then, I think ref(key) will return None >> No, ref(x) never returns None, regardless of what x may be. It may >> raise TypeError if x is not of a weakly referencable type, and it >> may raise MemoryError if we don't have enough memory left to >> construct a weakref, but those are the only things that can go wrong. > [Current version of the docs] > """ > ref( object[, callback]) > > Return a weak reference to object. The original object can be > retrieved by calling the reference object if the referent is still > alive; if the referent is no longer alive, calling the reference > object will cause None to be returned. """ That's what I said last time: w = ref(x) followed later by w() will return None, iff x has gone away in the meantime -- maybe that's what you're thinking of. Note that "calling the reference object" in the docs does not mean the call "ref(x)" itself, it means calling the object returned by ref(x) (what I named "w" in the quote just above). From python@rcn.com Sun May 25 07:35:17 2003 From: python@rcn.com (Raymond Hettinger) Date: Sun, 25 May 2003 02:35:17 -0400 Subject: [Python-Dev] RE: [Python-checkins] python/dist/src/Lib weakref.py,1.19,1.20 References: Message-ID: <002b01c32287$cef99ba0$125ffea9@oemcomputer> The old behavior for missing keys may have been a bug. Do you care about the previous behavior for deleting based on equality rather than equality *and* hash? Python 2.3b1 (#40, Apr 25 2003, 19:06:24) [MSC v.1200 32 bit (Intel)] on win32 Type "copyright", "credits" or "license" for more information. 
IDLE 0.8 -- press F1 for help >>> class One: def __eq__(self, other): return other == 1 def __hash__(self): return 1492 >>> import weakref >>> wkd = weakref.WeakKeyDictionary() >>> o = One() >>> wkd[o] = None >>> len(wkd) 1 >>> del wkd[1] >>> len(wkd) 0 Python 2.3b1+ (#40, May 23 2003, 00:08:36) [MSC v.1200 32 Type "help", "copyright", "credits" or "license" for more >>> class One: ... def __eq__(self, other): ... return other == 1 ... def __hash__(self): ... return 1492 ... >>> import weakref >>> wkd = weakref.WeakKeyDictionary() >>> o = One() >>> wkd[o] = None >>> len(wkd) 1 >>> del wkd[1] Traceback (most recent call last): File "", line 1, in ? File "C:\PY23\lib\weakref.py", line 167, in __delitem__ del self.data[ref(key)] TypeError: cannot create weak reference to 'int' object >>> len(wkd) 1 >>> Raymond Hettinger From tim_one@email.msn.com Sun May 25 07:48:25 2003 From: tim_one@email.msn.com (Tim Peters) Date: Sun, 25 May 2003 02:48:25 -0400 Subject: [Python-Dev] RE: [Python-checkins] python/dist/src/Lib weakref.py,1.19,1.20 In-Reply-To: <002b01c32287$cef99ba0$125ffea9@oemcomputer> Message-ID: [Raymond Hettinger] > The old behavior for missing keys may have been a bug. > Do you care about the previous behavior for deleting > based on equality rather than equality *and* hash? Nope, because it was neither documented nor tested, and was behavior unique to the WeakKeyDictionary flavor of dict -- no other flavor of dict works that way, and it was just an accident due to the __delitem__ implementation. Note too that it's a documented requirement of the mapping protocol that keys that compare equal must also return equal hash values. > Python 2.3b1 (#40, Apr 25 2003, 19:06:24) [MSC v.1200 32 bit (Intel)] > on win32 Type "copyright", "credits" or "license" for more > information. 
IDLE 0.8 -- press F1 for help > >>> class One: > def __eq__(self, other): > return other == 1 > def __hash__(self): > return 1492 > >>> import weakref > >>> wkd = weakref.WeakKeyDictionary() > >>> o = One() > >>> wkd[o] = None > >>> len(wkd) > 1 > >>> del wkd[1] > >>> len(wkd) > 0 Just a case of GIGO (garbage in, garbage out) to me. > Python 2.3b1+ (#40, May 23 2003, 00:08:36) [MSC v.1200 32 > Type "help", "copyright", "credits" or "license" for more > >>> class One: > .. def __eq__(self, other): > .. return other == 1 > .. def __hash__(self): > .. return 1492 > .. > >>> import weakref > >>> wkd = weakref.WeakKeyDictionary() > >>> o = One() > >>> wkd[o] = None > >>> len(wkd) > 1 > >>> del wkd[1] > Traceback (most recent call last): > File "", line 1, in ? > File "C:\PY23\lib\weakref.py", line 167, in __delitem__ > del self.data[ref(key)] > TypeError: cannot create weak reference to 'int' object > >>> len(wkd) > 1 > >>> As I said the first time , will succeed if and only if d.data has a key for a still-live object, and that key compares equal to k1; else it will raise KeyError (or maybe TypeError if k1 is a silly key to test in a weak-keyed dict, ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ or MemoryError if we run out of memory). That's what I believe it should do. I didn't add "and the hash codes are the same too" because that requirement is part of the mapping protocol. From python@rcn.com Sun May 25 07:46:41 2003 From: python@rcn.com (Raymond Hettinger) Date: Sun, 25 May 2003 02:46:41 -0400 Subject: [Python-Dev] RE: [Python-checkins] python/dist/src/Lib weakref.py,1.19,1.20 References: Message-ID: <003701c32289$66e42ec0$125ffea9@oemcomputer> Here's the rest of the last example: >>> class AltOne(One): ... def __hash__(self): ... return 1776 ... >>> del wkd[AltOne()] Traceback (most recent call last): File "", line 1, in ?
File "C:\PY23\lib\weakref.py", line 167, in __delitem__ del self.data[ref(key)] KeyError: > [Razor] Hmm, a new moniker is born ... [Tim] > > Note that "calling the reference object" in the docs does not mean the call > "ref(x)" itself, it means calling the object returned by ref(x) (what I > named "w" in the quote just above). Hmm, I read the docs just a little too quickly. Speed reading is not all it's cracked up to be. Raymond From python@rcn.com Sun May 25 07:52:23 2003 From: python@rcn.com (Raymond Hettinger) Date: Sun, 25 May 2003 02:52:23 -0400 Subject: [Python-Dev] RE: [Python-checkins] python/dist/src/Lib weakref.py,1.19,1.20 References: Message-ID: <000201c3228a$73b3cba0$125ffea9@oemcomputer> > As I said the first time , > > will succeed if and only if d.data has a key for a still-live object, > and that key compares equal to k1; else it will raise KeyError > (or maybe TypeError if k1 is a silly key to test in a weak-keyed dict, > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > or MemoryError if we run out of memory). That's what I believe it > should do. > > I didn't add "and the hash codes are the same too" because that requirement > is part of the mapping protocol. Okay, you've had a second review on the patch and backporting to 2.2.3 is reasonable. Please add a news item for the two changes in behavior. BTW, I wasn't trying to be difficult, I was starting from the presumption that Fred wasn't smoking dope when he put in that weird block of code.
Looks like the presumption was wrong ;-) Raymond From python@rcn.com Sun May 25 08:08:33 2003 From: python@rcn.com (Raymond Hettinger) Date: Sun, 25 May 2003 03:08:33 -0400 Subject: [Python-Dev] RE: [Python-checkins] python/dist/src/Lib weakref.py,1.19,1.20 References: <003701c32289$66e42ec0$125ffea9@oemcomputer> Message-ID: <000901c3228c$74f0a680$125ffea9@oemcomputer> Arghh. One more example always arises after going to bed. The original required only equality. The new version requires equality, hashability, *and* weak referencability. Python 2.3b1 (#40, Apr 25 2003, 19:06:24) [MSC v.1200 32 bit (Intel)] on win32 Type "copyright", "credits" or "license" for more information. IDLE 0.8 -- press F1 for help >>> import weakref >>> class One: def __eq__(self, other): return other == 1 def __hash__(self): return hash(1) >>> wkd = weakref.WeakKeyDictionary() >>> o = One() >>> wkd[o] = 1 >>> len(wkd) 1 >>> del wkd[1] >>> len(wkd) 0 Python 2.3b1+ (#40, May 23 2003, 00:08:36) [MSC v.1200 32 Type "help", "copyright", "credits" or "license" for more >>> class One: ... def __eq__(self,other): return other==1 ... def __hash__(self): return hash(1) ... >>> import weakref >>> wkd = weakref.WeakKeyDictionary() >>> o = One() >>> wkd[o] = 1 >>> len(wkd) 1 >>> del wkd[1] Traceback (most recent call last): File "", line 1, in ? File "C:\PY23\lib\weakref.py", line 167, in __delitem__ del self.data[ref(key)] TypeError: cannot create weak reference to 'int' object >>> len(wkd) 1 Raymond From skip@mojam.com Sun May 25 13:00:28 2003 From: skip@mojam.com (Skip Montanaro) Date: Sun, 25 May 2003 07:00:28 -0500 Subject: [Python-Dev] Weekly Python Bug/Patch Summary Message-ID: <200305251200.h4PC0Sg08651@manatee.mojam.com> Bug/Patch Summary ----------------- 372 open / 3658 total bugs (-30) 151 open / 2173 total patches (+12) New Bugs -------- IMAP4_SSL broken (2003-05-19) http://python.org/sf/739909 re.finditer() listed as new in 2.2.?
(2003-05-19) http://python.org/sf/740026 test/build-failures on FreeBSD stable/current (2003-05-19) http://python.org/sf/740234 Can't browse methods and Classes (2003-05-20) http://python.org/sf/740407 MacPython-OS9 distutils breaks on OSX (2003-05-20) http://python.org/sf/740424 HTMLParser -- possible bug in handle_comment (2003-05-21) http://python.org/sf/741029 Configure does NOT set properly *FLAGS for thread support (2003-05-21) http://python.org/sf/741307 test_long failure (2003-05-22) http://python.org/sf/741806 curses support on Python-2.3b1/Tru64Unix 5.1A (2003-05-22) http://python.org/sf/741843 Python crashes if recursively reloading modules (2003-05-23) http://python.org/sf/742342 WeakKeyDictionary __delitem__ uses iterkeys (2003-05-24) http://python.org/sf/742860 Memory fault on complex weakref/weakkeydict delete (2003-05-24) http://python.org/sf/742911 New Patches ----------- inspect.getargspec: None instead of () (2002-11-12) http://python.org/sf/637217 zlib.decompressobj under-described. 
(2002-11-18) http://python.org/sf/640236
Several objects don't decref tmp on failure in subtype_new (2003-03-14) http://python.org/sf/703666
strange warnings messages in interpreter (2003-03-14) http://python.org/sf/703779
build of html docs broken (liboptparse.tex) (2003-05-04) http://python.org/sf/732174
HP-UX support for unixccompiler.py (2003-05-20) http://python.org/sf/740301
add urldecode() method to urllib (2003-05-20) http://python.org/sf/740827
unicode "support" for shlex.py (2003-05-23) http://python.org/sf/742290
SocketServer timeout, zombies (2003-05-23) http://python.org/sf/742598
ast-branch: msvc project sync (2003-05-23) http://python.org/sf/742621
check for true in diffrent paths, -pthread support (2003-05-24) http://python.org/sf/742741
Ordering of handlers in urllib2 (2003-05-24) http://python.org/sf/742823

Closed Bugs
-----------

crash in shelve module (2001-03-13) http://python.org/sf/408271
maximum recursion limit exceeded (2.1) (2001-04-24) http://python.org/sf/418626
raw-unicode-escape codec fails roundtrip (2001-07-25) http://python.org/sf/444514
strange IRIX test_re/test_sre failure (2001-08-28) http://python.org/sf/456398
New httplib lacks documentation (2001-09-04) http://python.org/sf/458447
maximum recursion limit exceeded in match (2001-12-14) http://python.org/sf/493252
inconsistent behavior of __getslice__ (2002-05-24) http://python.org/sf/560064
Mixing framework and static Pythons (2002-06-19) http://python.org/sf/571343
inheriting from property and docstrings (2002-07-03) http://python.org/sf/576990
unittest.py, better error message (2002-07-30) http://python.org/sf/588825
installation errors (2002-08-07) http://python.org/sf/592161
test_nis test fails on TRU64 5.1 (2002-08-14) http://python.org/sf/594998
non greedy match bug (2002-08-30) http://python.org/sf/602444
cgitb tracebacks not accessible (2002-08-31) http://python.org/sf/602893
faster [None]*n or []*n (2002-09-04) http://python.org/sf/604716
Max recursion limit with "*?" pattern (2002-10-08) http://python.org/sf/620412
cStringIO().write TypeError (2002-11-08) http://python.org/sf/635814
inspect.getargspec: None instead of () (2002-11-12) http://python.org/sf/637217
zlib.decompressobj under-described. (2002-11-18) http://python.org/sf/640236
Poor error message for augmented assign (2002-11-26) http://python.org/sf/644345
gettext.py crash on bogus preamble (2002-12-24) http://python.org/sf/658233
BoundaryError: multipart message with no defined boundary (2003-01-14) http://python.org/sf/667931
bsddb doc error (2003-01-28) http://python.org/sf/676233
test_logging fails (2003-01-31) http://python.org/sf/678217
new.function() leads to segfault (2003-02-25) http://python.org/sf/692776
Python 2.3a2 Build fails on HP-UX11i (2003-02-27) http://python.org/sf/694431
Several objects don't decref tmp on failure in subtype_new (2003-03-14) http://python.org/sf/703666
strange warnings messages in interpreter (2003-03-14) http://python.org/sf/703779
Assertion failed, python aborts (2003-03-17) http://python.org/sf/705231
Error when using PyZipFile to create archive (2003-03-17) http://python.org/sf/705295
Minor nested scopes doc issues (2003-04-06) http://python.org/sf/716168
Uthread problem - Pipe left open (2003-04-08) http://python.org/sf/717614
Mac OS X painless compilation (2003-04-11) http://python.org/sf/719549
runtime_library_dirs broken under OS X (2003-04-17) http://python.org/sf/723495
email/quopriMIME.py exception on int (lstrip) (2003-04-20) http://python.org/sf/724621
Possible OSX module location bug (2003-04-21) http://python.org/sf/725026
comparing versions - one a float (2003-04-28) http://python.org/sf/729317
Lambda functions in list comprehensions (2003-05-08) http://python.org/sf/734869
FILEMODE not honoured (2003-05-09) http://python.org/sf/735274
Command line timeit.py sets sys.path badly (2003-05-09) http://python.org/sf/735293
csv.Sniffer docs need updating (2003-05-15) http://python.org/sf/738471
On Windows, os.listdir() throws incorrect exception (2003-05-15) http://python.org/sf/738617
array.insert and negative indices (2003-05-17) http://python.org/sf/739313

Closed Patches
--------------

Optional output streams for dis (2003-02-08) http://python.org/sf/683074
Add copyrange method to array. (2003-04-14) http://python.org/sf/721061

From andymac@bullseye.apana.org.au Sun May 25 01:35:18 2003
From: andymac@bullseye.apana.org.au (Andrew MacIntyre)
Date: Sun, 25 May 2003 10:35:18 +1000 (EST)
Subject: [Python-Dev] _sre changes
In-Reply-To: <20030524152612.GA22309@ibook.distro.conectiva>
References: <9CF2178A-8D13-11D7-A3D6-0030655234CE@cwi.nl> <3ECE3677.1030500@python.org> <20030524152612.GA22309@ibook.distro.conectiva>
Message-ID: <20030525102158.S40394@bullseye.apana.org.au>

On Sat, 24 May 2003, Gustavo Niemeyer wrote:

> to backport some of the fixes we have introduced in the regular
> expression engine in 2.3 to 2.2.3, or is it too late? We have a sf patch
> open about that, but I'd like to port only the changes that don't
> require major changes in the engine.

These sre changes are giving me fits on FreeBSD. The fix (recursion limit
down to 7500 for gcc 3.x) applied for 2.3b1 now needs to be extended to
gcc 2.95, and the limit for gcc 3.x lowered further - not a particularly
satisfactory outcome.

I have identified that the problem is not the compiler specifically, but
an interaction with FreeBSD's pthreads implementation (libc_r) -
./configure --without-threads produces an interpreter which survives
test_re with a recursion limit of 10000 regardless of compiler.

I'm still trying to frame a query to a FreeBSD forum about this.

--
Andrew I MacIntyre "These thoughts are mine alone..."
E-mail: andymac@bullseye.apana.org.au (pref) | Snail: PO Box 370
        andymac@pcug.org.au (alt)            |        Belconnen ACT 2616
Web:    http://www.andymac.org/              |        Australia

From mwh@python.net Sun May 25 17:27:22 2003
From: mwh@python.net (Michael Hudson)
Date: Sun, 25 May 2003 17:27:22 +0100
Subject: [Python-Dev] _sre changes
In-Reply-To: <20030525102158.S40394@bullseye.apana.org.au> (Andrew MacIntyre's message of "Sun, 25 May 2003 10:35:18 +1000 (EST)")
References: <9CF2178A-8D13-11D7-A3D6-0030655234CE@cwi.nl> <3ECE3677.1030500@python.org> <20030524152612.GA22309@ibook.distro.conectiva> <20030525102158.S40394@bullseye.apana.org.au>
Message-ID: <2md6i7gff9.fsf@starship.python.net>

Andrew MacIntyre writes:

> On Sat, 24 May 2003, Gustavo Niemeyer wrote:
>
>> to backport some of the fixes we have introduced in the regular
>> expression engine in 2.3 to 2.2.3, or is it too late? We have a sf patch
>> open about that, but I'd like to port only the changes that don't
>> require major changes in the engine.
>
> These sre changes are giving me fits on FreeBSD. The fix (recursion
> limit down to 7500 for gcc 3.x) applied for 2.3b1 now needs to be extended
> to gcc 2.95, and the limit for gcc 3.x lowered further - not a
> particularly satisfactory outcome.
>
> I have identified that the problem is not the compiler specifically, but
> an interaction with FreeBSD's pthreads implementation (libc_r) -
> ./configure --without-threads produces an interpreter which survives
> test_re with a recursion limit of 10000 regardless of compiler.

This is to be expected. If you run a threads-disabled Python with
ulimit -s you can recurse until you run out of VIRTUAL MEMORY! When
there are threads in the picture, things are significantly more complex...
(which is another way of stating that I don't understand it, but you can
understand that with multiple stacks you can't just say "here's a really
high address, work down from here"[1]).

Cheers,
M.

[1] or vice versa depending on architecture.

--
-Dr. Olin Shivers, Ph.D., Cranberry-Melon School of Cucumber Science
     -- seen in comp.lang.scheme

From tim_one@email.msn.com Sun May 25 18:10:42 2003
From: tim_one@email.msn.com (Tim Peters)
Date: Sun, 25 May 2003 13:10:42 -0400
Subject: [Python-Dev] RE: [Python-checkins] python/dist/src/Lib weakref.py,1.19,1.20
In-Reply-To: <000201c3228a$73b3cba0$125ffea9@oemcomputer>
Message-ID: 

[Raymond Hettinger]
> Okay, you've had a second review on the patch and backporting
> to 2.2.3 is reasonable. Please add a news item for the two changes
> in behavior.

I'll wait for Fred to get back. I'm not sure you've used weakrefs.

> BTW, I wasn't trying to be difficult, I was starting from the
> presumption that Fred wasn't smoking dope when he put in that weird
> block of code. Looks like the presumption was wrong ;-)

That's cool, I started from the same presumption, and am still not
entirely over it -- it was such an outrageously inefficient way to delete
a key that the suspicion still nags there was *some* reason for it (other
than really good dope).

From tim_one@email.msn.com Sun May 25 18:17:22 2003
From: tim_one@email.msn.com (Tim Peters)
Date: Sun, 25 May 2003 13:17:22 -0400
Subject: [Python-Dev] RE: [Python-checkins] python/dist/src/Lib weakref.py,1.19,1.20
In-Reply-To: <000901c3228c$74f0a680$125ffea9@oemcomputer>
Message-ID: 

[Raymond Hettinger]
> Arghh. One more example always arises after going to bed.
>
> The original required only equality. The new version requires
> equality, hashability, *and* weak referencability.

It always required (and still does) all three for __setitem__ and
__getitem__: key equality and hashability are required for all dicts, and
*of course* key weak referencability is required for a weak-keyed dict:
that's why it's called a weak-keyed dict <0.5 wink>. __delitem__ alone
used a bizarre algorithm, and indeed one that broke other normal dict
invariants, such as that del d[k] always deletes the same (key, value)
pair that d[k] = v would have replaced.
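[Editor's note: Tim's three requirements — equality, hashability, and weak referencability — and the del d[k] invariant are easy to see with `weakref.WeakKeyDictionary` in a current Python. A minimal illustration (mine, not from the thread):]

```python
import weakref

class Key:
    """A plain class: instances are hashable and weakly referenceable."""
    def __init__(self, name):
        self.name = name

d = weakref.WeakKeyDictionary()
k = Key("spam")
d[k] = 1              # __setitem__ needs equality, hashability, weak refs
assert d[k] == 1
del d[k]              # deletes exactly the pair that d[k] = v would replace
assert k not in d

try:                  # ints are hashable but NOT weakly referenceable
    d[42] = 2
except TypeError:
    pass              # rejected, as Tim's "all three" requirement predicts
```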
From Jack.Jansen@cwi.nl Sun May 25 21:55:11 2003
From: Jack.Jansen@cwi.nl (Jack Jansen)
Date: Sun, 25 May 2003 22:55:11 +0200
Subject: [Python-Dev] RELEASED Python 2.2.3c1
In-Reply-To: <3ECE3677.1030500@python.org>
Message-ID: <2D0200D5-8EF3-11D7-AA7E-000A27B19B96@cwi.nl>

On vrijdag, mei 23, 2003, at 16:55 Europe/Amsterdam, Barry Warsaw wrote:

>> Is there a chance I could get #723495 still in before 2.2.3 final? I
>> was also hoping to find a fix for #571343, but I don't have a patch
>> yet (although I'll try to get one up in the next few hours).
>
> I think it would be fine to get these into 2.2.3 final.

Okay, I'm done (as far as the unix distribution is concerned): 723495 is
checked in, and 571343 I've closed as "won't fix" because it turns out
that the trouble-case I expected is very unlikely to happen.

--
- Jack Jansen http://www.cwi.nl/~jack -
- If I can't dance I don't want to be part of your revolution -- Emma Goldman -

From gward@python.net Mon May 26 03:16:35 2003
From: gward@python.net (Greg Ward)
Date: Sun, 25 May 2003 22:16:35 -0400
Subject: [Python-Dev] Change to ossaudiodev setparameters() method
Message-ID: <20030526021635.GA15814@cthulhu.gerg.ca>

Currently, oss_audio_device objects have a setparameters() method with a
rather silly interface:

  oss.setparameters(sample_rate, sample_size, num_channels, format [, emulate])

This is silly because 1) 'sample_size' is implicit in 'format', and 2)
the implementation doesn't actually *use* sample_size for anything -- it
just checks that you have passed in the correct sample size, ie. if you
specify an 8-bit format, you must pass sample_size=8. (This is code
inherited from linuxaudiodev that I never got around to cleaning up.)

In addition to being silly, this is not the documented interface. The
docs don't mention the 'sample_size' argument at all. Presumably the doc
writer realized the silliness and was going to pester me to remove
'sample_size', but never got around to it. (Lot of that going around.)

So, even though we're in a beta cycle, am I allowed to change the code so
it's 1) sensible and 2) consistent with the documentation?

Greg
--
Greg Ward http://www.gerg.ca/
Sure, I'm paranoid... but am I paranoid ENOUGH?

From guido@python.org Mon May 26 07:39:59 2003
From: guido@python.org (Guido van Rossum)
Date: Mon, 26 May 2003 02:39:59 -0400
Subject: [Python-Dev] Change to ossaudiodev setparameters() method
In-Reply-To: "Your message of Sun, 25 May 2003 22:16:35 EDT." <20030526021635.GA15814@cthulhu.gerg.ca>
References: <20030526021635.GA15814@cthulhu.gerg.ca>
Message-ID: <200305260640.h4Q6dxm08507@pcp02138704pcs.reston01.va.comcast.net>

> Currently, oss_audio_device objects have a setparameters() method with a
> rather silly interface:
>
>   oss.setparameters(sample_rate, sample_size, num_channels, format [, emulate])
>
> This is silly because 1) 'sample_size' is implicit in 'format', and 2)
> the implementation doesn't actually *use* sample_size for anything -- it
> just checks that you have passed in the correct sample size, ie. if you
> specify an 8-bit format, you must pass sample_size=8. (This is code
> inherited from linuxaudiodev that I never got around to cleaning up.)
>
> In addition to being silly, this is not the documented interface. The
> docs don't mention the 'sample_size' argument at all. Presumably the
> doc writer realized the silliness and was going to pester me to remove
> 'sample_size', but never got around to it. (Lot of that going around.)
>
> So, even though we're in a beta cycle, am I allowed to change the code
> so it's 1) sensible and 2) consistent with the documentation?

Yes. I like silliness in a MP skit, but not in my APIs.
:-)

--Guido van Rossum (home page: http://www.python.org/~guido/)

From g9robjef@cdf.toronto.edu Mon May 26 17:51:05 2003
From: g9robjef@cdf.toronto.edu (Jeffery Roberts)
Date: Mon, 26 May 2003 12:51:05 -0400 (EDT)
Subject: [Python-Dev] Re: Introduction
In-Reply-To: <20030523143901.8660.97348.Mailman@mail.python.org>
References: <20030523143901.8660.97348.Mailman@mail.python.org>
Message-ID: 

Thanks for that reply Brett. It is really helpful. I'm currently in
Ottawa at the GCC summit trying to sponge some knowledge but I will begin
following your advice when I get back home later this week.

Thanks again !

Jeff

> I know I learned a lot from working on patches and bugs.
> It especially helps if you jump in on a patch that is being actively
> worked on and can ask how something works. Otherwise just read the
> source until your eyes bleed and curse anyone who doesn't write
> extensive documentation for code. =)
>
> There also has been mention of the AST branch. I know I plan on working
> on that after I finish going through the bug and patch backlog. Only
> trouble is that the guys who actually fully understand it (Jeremy, Tim,
> and Neal) are rather busy so it is going to be a "jump in the pool and
> drown and hope your flailing manages to at least generate something
> useful but you die and come back in another life wiser and able to
> attempt again until you stop drowning and manage to only get sick from
> gulping down so much chlorinated water". =)
>
>> Check out
>>
>> http://www.python.org/dev/
>>
>> for orientation, and leave your spare time at the door.
>
> I will vouch for the loss of spare time. This has become a job. Best
> job ever, though. =)
>
> The only big piece of advice I can offer is to just make sure you are
> nice and cordial on the list; there is a low tolerance for jerks here.
> Don't take this as meaning to not take a stand on an issue! All I am
> saying is realize that email does not transcribe humor perfectly and
> until the list gets used to your personal writing style you might have
> to just make sure what you write does not come off as insulting.
>
> -Brett
>
> --__--__--
>
> Message: 9
> Date: Thu, 22 May 2003 20:21:20 -0700
> From: "Brett C."
> Reply-To: drifty@alum.berkeley.edu
> To: Jeffery Roberts
> CC: Tim Peters , python-dev@python.org
> Subject: Re: [Python-Dev] Introduction
>
> Jeffery Roberts wrote:
>> Thanks for all of your replies. The front-end rewrite sounds especially
>> interesting. I'm going to look into that. Is the entire front end
>> changing (ie scan/parse/ast) or just the AST structure ?
>>
>> If you have any more information or directions please let me know.
>
> It is just a new AST. Redoing/replacing pgen is something else
> entirely. =)
>
> The branch that this is being developed under in CVS is ast-branch.
> There is an incomplete README in Python/compile.txt that explains the
> basic idea and direction.
>
> -Brett
>
> --__--__--
>
> Message: 10
> Date: Thu, 22 May 2003 23:30:45 -0400
> Cc: python-list@python.org, python-dev@python.org
> To: python-announce@python.org
> From: Barry Warsaw
> Subject: [Python-Dev] RELEASED Python 2.2.3c1
>
> I'm happy to announce the release of Python 2.2.3c1 (release candidate
> 1). This is a bug fix release for the stable Python 2.2 code line.
> Barring any critical issues, we expect to release Python 2.2.3 final by
> this time next week. We encourage those with an interest in a solid
> 2.2.3 release to download this candidate and test it on their code.
>
> The new release is available here:
>
>     http://www.python.org/2.2.3/
>
> Python 2.2.3 has a large number of bug fixes and memory leak patches.
> For full details, see the release notes at
>
>     http://www.python.org/2.2.3/NEWS.txt
>
> There are a small number of minor incompatibilities with Python 2.2.2;
> for details see:
>
>     http://www.python.org/2.2.3/bugs.html
>
> Perhaps the most important is that the Bastion.py and rexec.py modules
> have been disabled, since we do not deem them to be safe.
>
> As usual, a Windows installer and a Unix/Linux source tarball are made
> available, as well as tarballs of the documentation in various forms.
> At the moment, no Mac version or Linux RPMs are available, although I
> expect them to appear soon after 2.2.3 final is released.
>
> On behalf of Guido, I'd like to thank everyone who contributed to this
> release, and who continue to ensure Python's success.
>
> Enjoy,
> -Barry
>
> --__--__--
>
> Message: 11
> Date: Fri, 23 May 2003 13:42:20 +0200
> Subject: Re: [Python-Dev] RELEASED Python 2.2.3c1
> Cc: python-dev@python.org
> To: Barry Warsaw
> From: Jack Jansen
>
> On Friday, May 23, 2003, at 05:30 Europe/Amsterdam, Barry Warsaw wrote:
>
>> I'm happy to announce the release of Python 2.2.3c1 (release candidate
>> 1).
>
> Oops, that suddenly went *very* fast, I thought I had until the
> weekend...
>
> Is there a chance I could get #723495 still in before 2.2.3 final? I
> was also hoping to find a fix for #571343, but I don't have a patch yet
> (although I'll try to get one up in the next few hours).
> --
> Jack Jansen, , http://www.cwi.nl/~jack
> If I can't dance I don't want to be part of your revolution -- Emma
> Goldman
>
> --__--__--
>
> Message: 12
> Date: Fri, 23 May 2003 09:11:35 -0400
> From: Guido van Rossum
> Subject: Re: [Python-Dev] Of what use is commands.getstatus()
> To: skip@pobox.com
> Cc: python-dev@python.org
>
>> I was reading the docs for the commands module and noticed getstatus() seems
>> to be completely unrelated to getstatusoutput() and getoutput(). I thought,
>> "I'll correct the docs. They must be wrong." Then I looked at commands.py
>> and saw the docs are correct. It's the function definition which is weird.
>> Of what use is it to return 'ls -ld file'? Based on its name I would have
>> guessed its function was
>>
>>     def getstatus(cmd):
>>         """Return status of executing cmd in a shell."""
>>         return getstatusoutput(cmd)[0]
>>
>> This particular function dates from 1990, so it clearly can't just be
>> deleted, but it seems completely superfluous to me, especially given the
>> existence of os.stat, os.listdir, etc. Should it be deprecated or modified
>> to do (what I think is) the obvious thing?
>
> That whole module wasn't thought out very well. I recently tried to
> use it and found that the strip of the trailing \n on getoutput() is
> also a counterproductive feature. I suggest that someone should
> design a replacement, perhaps to live in shutil, and then we can
> deprecate it. Until then I would leave it alone. Certainly don't
> "fix" it by doing something incompatible.
>
> --Guido van Rossum (home page: http://www.python.org/~guido/)
>
> --__--__--
>
> Message: 13
> Date: Fri, 23 May 2003 10:06:05 -0400
> From: Guido van Rossum
> Subject: Re: [Python-Dev] Descriptor API
> To: Gonçalo Rodrigues
> Cc: python-dev@python.org
>
>> I was doing some tricks with metaclasses and descriptors in Python 2.2
>> and stumbled on the following:
>>
>> >>> class test(object):
>> ...     a = property(lambda: 1)
>> ...
>> >>> print test.a
>>
>> >>> print test.a.__set__
>>
>> >>> print test.a.fset
>> None
>>
>> What this means in practice, is that if I want to test if a
>> descriptor is read-only I have to have two tests: One for custom
>> descriptors, checking that getting __set__ does not barf and another
>> for property, checking that fset returns None.
>
> Why are you interested in knowing whether a descriptor is read-only?
>
>> So, why doesn't getting __set__ raise AttributeError in the above case?
>
> This is a feature. The presence of __set__ (even if it always raises
> AttributeError when *called*) signals this as a "data descriptor".
> The difference between data descriptors and others is that a data
> descriptor can not be overridden by putting something in the instance
> dict; a non-data descriptor can be overridden by assignment to an
> instance attribute, which will store a value in the instance dict.
>
> For example, a method is a non-data descriptor (and the prevailing
> example of such).
> This means that the following example works:
>
>     class C(object):
>         def meth(self): return 42
>
>     x = C()
>     x.meth() # prints 42
>     x.meth = lambda: 24
>     x.meth() # prints 24
>
>> Is this a bug? If it's not, it sure is a (minor) feature request
>> from my part :-)
>
> Because of the above explanation, the request cannot be granted.
>
> You can test the property's fset attribute however to tell whether a
> 'set' argument was passed to the constructor.
>
> --Guido van Rossum (home page: http://www.python.org/~guido/)
>
> --__--__--
>
> Message: 14
> From: Gonçalo Rodrigues
> To:
> Subject: Re: [Python-Dev] Descriptor API
> Date: Fri, 23 May 2003 15:34:14 +0100
>
> ----- Original Message -----
> From: "Guido van Rossum"
> To: "Gonçalo Rodrigues"
> Cc:
> Sent: Friday, May 23, 2003 3:06 PM
> Subject: Re: [Python-Dev] Descriptor API
>
>>> I was doing some tricks with metaclasses and descriptors in Python 2.2
>>> and stumbled on the following:
>>>
>>> >>> class test(object):
>>> ...     a = property(lambda: 1)
>>> ...
>>> >>> print test.a
>>>
>>> >>> print test.a.__set__
>>>
>>> >>> print test.a.fset
>>> None
>>>
>>> What this means in practice, is that if I want to test if a
>>> descriptor is read-only I have to have two tests: One for custom
>>> descriptors, checking that getting __set__ does not barf and another
>>> for property, checking that fset returns None.
>>
>> Why are you interested in knowing whether a descriptor is read-only?
>
> Introspection dealing with a metaclass that injected methods in its
> instances depending on a descriptor. In other words, having fun with
> Python's wacky tricks.
>
>>> So, why doesn't getting __set__ raise AttributeError in the above case?
>>
>> This is a feature. The presence of __set__ (even if it always raises
>> AttributeError when *called*) signals this as a "data descriptor".
>> The difference between data descriptors and others is that a data
>> descriptor can not be overridden by putting something in the instance
>> dict; a non-data descriptor can be overridden by assignment to an
>> instance attribute, which will store a value in the instance dict.
>>
>> For example, a method is a non-data descriptor (and the prevailing
>> example of such). This means that the following example works:
>>
>>     class C(object):
>>         def meth(self): return 42
>>
>>     x = C()
>>     x.meth() # prints 42
>>     x.meth = lambda: 24
>>     x.meth() # prints 24
>>
>>> Is this a bug? If it's not, it sure is a (minor) feature request
>>> from my part :-)
>>
>> Because of the above explanation, the request cannot be granted.
>
> Thanks for the reply (and also to P. Eby, btw). I was way off track when
> I sent the email, because it did not occur to me that property was a
> type implementing __get__ and __set__. With this piece of info
> connecting the dots the idea is just plain foolish.
>
>> You can test the property's fset attribute however to tell whether a
>> 'set' argument was passed to the constructor.
>>
>> --Guido van Rossum (home page: http://www.python.org/~guido/)
>
> With my best regards,
> G. Rodrigues
>
> --__--__--
>
> Message: 15
> Date: Fri, 23 May 2003 10:39:10 -0400
> From: Guido van Rossum
> Subject: Re: [Python-Dev] Need advice, maybe support
> To: Christian Tismer
> Cc: python-dev@python.org
>
>>> The other is the new style where the PyMethodDef
>>> array is in tp_methods, and is scanned once by PyType_Ready.
>>
>> Right, again. Now, under the hopeful assumption that every
>> sensible extension module that has some types to publish also
>> does this through its module dictionary, I would have the
>> opportunity to cause PyType_Ready being called early enough
>> to modify the method table, before any of its methods is used
>> at all.
>
> Dangerous assumption!
It's not inconceivable that a class would > instantiate some of its own classes as part of its module > initialization. > > > > 3rd party modules that have been around for a while are likely to use > > > Py_FindMethod. With Py_FindMethod you don't have a convenient way to > > > store the pointer to the converted table, so it may be better to > > > simply check your bit in the first array element and then cast to a > > > PyMethodDef or a PyMethodDefEx array based on what the bit says (you > > > can safely assume that all elements of an array are the same size :-)= =2E > > > > Hee hee, yeah. Of course, if there isn't a reliable way to > > intercept method table access before the first Py_FindMethod > > call, I could of course modify Py_FindMethod. For instance, > > a modified, new-style method table might be required to always > > start with a dummy entry, where the flags word is completely > > -1, to signal having been converted to new-style. > > Why so drastic? You could just set a reserved bit.f > > > ... > > > > >>If that approach is trustworthy, I also could drop > > >>the request for these 8 bits. > > > > > > Sure. Ah, a bit in the type would work just as well, and > > > Py_FindMethod *does* have access to the type. > > > > You think of the definition in methodobject.c, as it is > > > > """ > > /* Find a method in a single method list */ > > > > PyObject * > > Py_FindMethod(PyMethodDef *methods, PyObject *self, char *name) > > """ > > > > , assuming that self always is not NULL, but representing a valid > > object with a type, and this type is already referring to the > > methods table? > > Right. There is already code that uses self->ob_type in > Py_FindMethodInChain(), which is called by Py_FindMethod(). > > > Except for module objects, this seems to be right. I've run > > Python against a lot of Python modules, but none seems > > to call Py_FindMethod with a self parameter of NULL. > > I don't think it would be safe to do so. 
> > > If that is true, then I can patch a small couple of > > C functions to check for the new bit, and if it's not > > there, re-create the method table in place. > > This is music to me ears. But... > > > > Well, there is a drawback: > > I *do* need two bits, and I hope you will allow me to add this > > second bit, as well. > > > > The one, first bit, tells me if the source has been compiled > > with Stackless and its extension stuff. Nullo problemo. > > I can then in-place modify the method table in a compatible > > way, or leave it as it is, bny default. > > But then, this isn't sufficient to set this bit then, like an > > "everything is fine, now" relief. This is so, since this is *still* > > an old module, and while its type's method tables have been > > patched, the type is still not augmented by new slots, like > > the new tp_call_nr slots (and maybe a bazillion to come, soon). > > The drawback is, that I cannot simply replace the whole type > > object, since type objects are not represented as object > > pointers (like they are now, most of the time, in the dynamic > > heaptype case), but they are constant struct addresses, where > > the old C module might be referring to. > > > > So, what I think to need is no longer 9 bits, but two of them: > > One that says "everything great from the beginning", and another > > one that says "well, ok so far, but this is still an old object". > > > > I do think this is the complete story, now. > > Instead of requiring nine bits, I'm asking for two. > > But this is just *your options; I also can live with one bit, > > but then I have to add a special, invalid method table entry > > that just serves for this purpose. > > In order to keep my souce code hack to the minimum, I'd really > > like to ask for the two bits in the typeobject flags. > > OK, two bits you shall have. Don't spend them all at once! > > > Thanks so much for being so supportive -- chris > > Anything to keep ctual stackless support out of the core. 
> :-)
>
> --Guido van Rossum (home page: http://www.python.org/~guido/)
>
> --__--__--
>
> _______________________________________________
> Python-Dev mailing list
> Python-Dev@python.org
> http://mail.python.org/mailman/listinfo/python-dev
>
> End of Python-Dev Digest

From jacobs@penguin.theopalgroup.com Mon May 26 18:22:08 2003
From: jacobs@penguin.theopalgroup.com (Kevin Jacobs)
Date: Mon, 26 May 2003 13:22:08 -0400 (EDT)
Subject: [Python-Dev] RE: Decimal class
In-Reply-To: 
Message-ID: 

On Mon, 26 May 2003, Tim Peters wrote:

> [Kevin Jacobs]
>> Anyhow, the next big thing I want to do is to make Decimal instances
>> immutable like other Python numeric types, so they can be used as
>> hash keys, so common values can be re-used, and some of the code can
>> be simplified.
>
> Offhand I didn't see anything in the code that mutates any inputs, so I
> expect it's at worst close. But this kind of discussion should be in
> public, so others can jump in too (especially Eric!).

I agree 100%. Does anyone else have feelings for or against having
mutable Decimal instances? In the mean time, I will prepare a patch to do
this so we can evaluate the practical effects on the code.

-Kevin

--
-- Kevin Jacobs The OPAL Group - Enterprise Systems Architect
Voice: (216) 986-0710 x 19 E-mail: jacobs@theopalgroup.com
Fax: (216) 986-0714 WWW: http://www.theopalgroup.com

From tim_one@email.msn.com Mon May 26 18:36:23 2003
From: tim_one@email.msn.com (Tim Peters)
Date: Mon, 26 May 2003 13:36:23 -0400
Subject: [Python-Dev] RE: Decimal class
In-Reply-To: 
Message-ID: 

[Kevin Jacobs]
>>> Anyhow, the next big thing I want to do is to make Decimal instances
>>> immutable like other Python numeric types, so they can be used as
>>> hash keys, so common values can be re-used, and some of the code can
>>> be simplified.

[Tim]
>> Offhand I didn't see anything in the code that mutates any inputs,
>> so I expect it's at worst close. But this kind of discussion should
>> be in public, so others can jump in too (especially Eric!).

[Kevin]
> I agree 100%. Does anyone else have feelings for or against having
> mutable Decimal instances? In the mean time, I will prepare a patch
> to do this so we can evaluate the practical effects on the code.

Oh yes, they have to be immutable, meaning that no public API operation
mutates a Decimal in a user-visible way.

From aahz@pythoncraft.com Mon May 26 18:36:29 2003
From: aahz@pythoncraft.com (Aahz)
Date: Mon, 26 May 2003 13:36:29 -0400
Subject: [Python-Dev] RE: Decimal class
In-Reply-To: 
References: 
Message-ID: <20030526173629.GA27743@panix.com>

On Mon, May 26, 2003, Kevin Jacobs wrote:
>
> I agree 100%. Does anyone else have feelings for or against having
> mutable Decimal instances? In the mean time, I will prepare a patch
> to do this so we can evaluate the practical effects on the code.

I'm opposed to mutable Decimal instances because Uncle Timmy says so. ;-)

--
Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/

"In many ways, it's a dull language, borrowing solid old concepts from
many other languages & styles: boring syntax, unsurprising semantics,
few automatic coercions, etc etc. But that's one of the things I like
about it." --Tim Peters on Python, 16 Sep 93

From greg@cosc.canterbury.ac.nz Mon May 26 23:45:27 2003
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Tue, 27 May 2003 10:45:27 +1200 (NZST)
Subject: [Python-Dev] RE: Decimal class
In-Reply-To: 
Message-ID: <200305262245.h4QMjRc05319@oma.cosc.canterbury.ac.nz>

Kevin Jacobs :

> Does anyone else have feelings for or against having mutable Decimal
> instances?

Having mutable decimal instances would feel *very* strange to me, given
that all other numeric types in Python are immutable.

+1 on making them immutable.
Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From tim.one@comcast.net Tue May 27 00:13:17 2003 From: tim.one@comcast.net (Tim Peters) Date: Mon, 26 May 2003 19:13:17 -0400 Subject: [Python-Dev] RE: Decimal class In-Reply-To: <200305262245.h4QMjRc05319@oma.cosc.canterbury.ac.nz> Message-ID: [Greg Ewing] > Having mutable decimal instances would feel *very* strange > to me, given that all other numeric types in Python are > immutable. > > +1 on making them immutable. I don't believe there's any argument in favor of making them mutable. The question may arise because my old FixedPoint class had mutable instances. That was a mistake -- I wrote that class in an afternoon, and wasn't thinking when I added the .set_precision() method. If they're not immutable, they can't be used as dict keys, and that's a killer-strong argument all by itself. From r.vanputten@hexapole.com Tue May 27 09:28:52 2003 From: r.vanputten@hexapole.com (Rob van Putten) Date: Tue, 27 May 2003 10:28:52 +0200 Subject: [Python-Dev] install debian package Message-ID: <7JELK71URD0SPXU73UQ84KJC795KJNM.3ed321c4@rob_pc> Hi there, I am not sure if this is the right place for my comment but here it is; I tried to install python-dev on my new Debian woody system but it returned an error because it tried to remove the modutils package (probably because it was incompatible). The problem was solved after I upgraded the modutils package (2.4.21-2). I am no Debian package expert but it looks to me as if the modutils package should be upgraded from the python-dev package (if this is somehow possible of course :-) Hope this helps to improve the debian package.
Regards, Rob From gh@ghaering.de Tue May 27 09:54:27 2003 From: gh@ghaering.de (=?windows-1252?Q?Gerhard_H=E4ring?=) Date: Tue, 27 May 2003 10:54:27 +0200 Subject: [Python-Dev] install debian package In-Reply-To: <7JELK71URD0SPXU73UQ84KJC795KJNM.3ed321c4@rob_pc> References: <7JELK71URD0SPXU73UQ84KJC795KJNM.3ed321c4@rob_pc> Message-ID: <3ED327C3.7050608@ghaering.de> Rob van Putten wrote: > Hi there, > > I am not sure if this is the right place for my comment but here it is; It isn't the right place. This is the list for development of Python itself, not for the Debian package. > I tried to install python-dev on my new Debian woody system but it > returned an error because it tried to remove the modutils package > (probably because it was incompatible) [...] I'd suggest you contact either the Debian-Python mailing list (http://lists.debian.org/debian-python/), or the maintainer itself. Personally I didn't have this problem on Woody, btw. Or just report it to the Debian bugtracking system using for example 'reportbug'. -- Gerhard From skip@pobox.com Wed May 28 02:21:34 2003 From: skip@pobox.com (Skip Montanaro) Date: Tue, 27 May 2003 20:21:34 -0500 Subject: [Python-Dev] Python bug 544473 - bugfix candidate - was it applied? Message-ID: <16084.3870.666366.928341@montanaro.dyndns.org> Several times today I had a Queue object (Python 2.2.2) wind up deadlocked with its fsema locked but its queue full (apparently threads are waiting to put more items in the queue than it's supposed to hold). Looking back at the cvs log for the Queue module I see this message revision 1.15 date: 2002/04/19 00:11:31; author: mhammond; state: Exp; lines: +33 -14 Fix bug 544473 - "Queue module can deadlock". Use try/finally to ensure all Queue locks remain stable. Includes test case. Bugfix candidate. but no indication that was ever applied to the maint22 branch. 
I'm not suggesting that this bug fix will solve my problem (it's probably a bug in my code), but it seems that it should have been applied but wasn't. Should it be applied at this point or is 2.2.3 too close to release? Skip From guido@python.org Wed May 28 12:20:39 2003 From: guido@python.org (Guido van Rossum) Date: Wed, 28 May 2003 07:20:39 -0400 Subject: [Python-Dev] Python bug 544473 - bugfix candidate - was it applied? In-Reply-To: "Your message of Tue, 27 May 2003 20:21:34 CDT." <16084.3870.666366.928341@montanaro.dyndns.org> References: <16084.3870.666366.928341@montanaro.dyndns.org> Message-ID: <200305281120.h4SBKdQ11691@pcp02138704pcs.reston01.va.comcast.net> > Several times today I had a Queue object (Python 2.2.2) wind up deadlocked > with its fsema locked but its queue full (apparently threads are waiting to > put more items in the queue than it's supposed to hold). Looking back at > the cvs log for the Queue module I see this message > > revision 1.15 > date: 2002/04/19 00:11:31; author: mhammond; state: Exp; lines: +33 -14 > Fix bug 544473 - "Queue module can deadlock". > Use try/finally to ensure all Queue locks remain stable. > Includes test case. Bugfix candidate. > > but no indication that was ever applied to the maint22 branch. cvs log of the release22-maint branch shows it was applied. > I'm not suggesting that this bug fix will solve my problem (it's probably a > bug in my code), but it seems that it should have been applied but wasn't. > Should it be applied at this point or is 2.2.3 too close to release? Are you using the tip of the branch is the next question? --Guido van Rossum (home page: http://www.python.org/~guido/) From skip@pobox.com Wed May 28 13:52:25 2003 From: skip@pobox.com (Skip Montanaro) Date: Wed, 28 May 2003 07:52:25 -0500 Subject: [Python-Dev] Python bug 544473 - bugfix candidate - was it applied? 
In-Reply-To: <200305281120.h4SBKdQ11691@pcp02138704pcs.reston01.va.comcast.net> References: <16084.3870.666366.928341@montanaro.dyndns.org> <200305281120.h4SBKdQ11691@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <16084.45321.727469.548923@montanaro.dyndns.org> Guido> cvs log of the release22-maint branch shows it was applied. ... Guido> Are you using the tip of the branch is the next question? I guess I misunderstood how "cvs log" worked. Given "cvs log foo" I thought it would list the checkin comments for all versions and all branches of foo. I didn't think it mattered which version of the file I asked about. Skip From terry@wayforward.net Wed May 28 18:32:29 2003 From: terry@wayforward.net (Terence Way) Date: Wed, 28 May 2003 13:32:29 -0400 Subject: [Python-Dev] Introduction Message-ID: <5B3B5FCE-9132-11D7-BD8E-00039344A0EC@wayforward.net> I've been lurking for a bit, and now seems like a good time to introduce myself. * I build messaging systems for banks, earlier I was CTO of a dot-com. * I started programming on the TRS-80 and the RCA COSMAC VIP, later on the Apple ][. * I am a Java refugee (well, I might still code in Java for pay). * I'm into formal methods. Translation: I like *talking* about formal methods, but I never use them myself :-) I read somewhere that the best way to build big Python callouses was to write a PEP. Here goes: http://www.wayforward.net/pycontract/pep-0999.html Programming by Contract for Python... pre-conditions, post-conditions, invariants, with all the Eiffel goodness like weakening pre-conditions and strengthening invariants and post-conditions on inheritance, and access to old values. All from docstrings, like doctest. I'm also into handling insane numbers of incoming connections on cheap boxes: compare Jef Poskanzer's thttpd to Apache. 10000 simultaneous HTTP connections on a $400 computer just gets me giggling. Stackless Python intrigues me greatly for the same reason. I guess that's it for now... Cheers! 
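The "Eiffel goodness" mentioned above has precise semantics: an heir may only weaken a precondition, and Eiffel combines the heir's clause with the parent's using "require else", a logical OR. A small sketch of that rule, with `require_any` and the condition functions invented purely for illustration (not part of the PEP):

```python
def require_any(*conditions):
    # Eiffel's weakening rule: a call is legal if ANY applicable
    # precondition (the parent's or the heir's) holds.
    def deco(func):
        def wrapper(self, *args):
            if not any(cond(self, *args) for cond in conditions):
                raise AssertionError("no applicable precondition holds")
            return func(self, *args)
        return wrapper
    return deco

def parent_pre(self, n):   # the base class's contract
    return n > 0

def heir_pre(self, n):     # the heir's weaker addition
    return n == 0

class Heir(object):
    # Effective precondition: parent_pre OR heir_pre.
    @require_any(parent_pre, heir_pre)
    def accept(self, n):
        return n

h = Heir()
assert h.accept(5) == 5   # satisfies the parent's precondition
assert h.accept(0) == 0   # satisfies only the heir's weaker clause
```

The OR is why the heir's body can legitimately run even when the heir's own clause fails, so long as the parent's still holds; postconditions and invariants go the other way and are AND-ed.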
From pje@telecommunity.com Wed May 28 19:23:39 2003 From: pje@telecommunity.com (Phillip J. Eby) Date: Wed, 28 May 2003 14:23:39 -0400 Subject: [Python-Dev] Introduction In-Reply-To: <5B3B5FCE-9132-11D7-BD8E-00039344A0EC@wayforward.net> Message-ID: <5.1.1.6.0.20030528140810.0249d640@telecommunity.com> At 01:32 PM 5/28/03 -0400, Terence Way wrote: >I read somewhere that the best way to build big Python callouses was >to write a PEP. Guess I'll start helping you work on the callouses, then. :) > Here goes: > http://www.wayforward.net/pycontract/pep-0999.html Please don't number a pre-PEP; I believe PEP 1 recommends using 'XXX' until a PEP number has been assigned by the PEP editors. >Programming by Contract for Python... pre-conditions, post-conditions, >invariants, with all the Eiffel goodness like weakening pre-conditions >and strengthening invariants and post-conditions on inheritance, and >access to old values. All from docstrings, like doctest. A number of things aren't clear from your PEP. For example, how would syntax errors in assertions be handled? How is backward compatibility handled with existing docstrings that may use 'inv:' or 'pre:' to specify conditions informally? Are you proposing that this be part of Python's core syntax? If so, then why do it as docstrings? Are you proposing instead that your implementation be part of the standard library? If so, then where is the documentation for how a developer enables the behavior? Also, I didn't find the motivation section convincing. Your answer to "Why not have several different implementations, or let programmers implement their own assertions?" isn't actually a justification. If Alice uses some package to wrap her methods with checks, I can weaken the preconditions in a subclass by simply overriding the methods. If I can't do that, then it is a weakness of the DBC package Alice used, or of Alice's package, not a weakness of Python.
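The overriding Phillip describes needs no framework at all; a sketch with plain asserts (the queue classes are invented for illustration):

```python
class AlicesQueue(object):
    def put(self, item):
        # Alice's precondition: no None entries allowed.
        assert item is not None, "pre: item is not None"
        self.last = item

class BobsQueue(AlicesQueue):
    def put(self, item):
        # Bob weakens the precondition by overriding:
        # None is now accepted as a sentinel value.
        self.last = item

q = BobsQueue()
q.put(None)            # accepted; AlicesQueue.put would have refused it
assert q.last is None
```

Nothing stops Bob today; the open question in the thread is whether a shared mechanism should mediate how his weaker check composes with Alice's.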
From barry@python.org Wed May 28 19:46:13 2003 From: barry@python.org (Barry Warsaw) Date: 28 May 2003 14:46:13 -0400 Subject: [Python-Dev] Plans for Python 2.2.3 final Message-ID: <1054147573.10580.20.camel@barry> I've not heard about any showstoppers for Python 2.2.3. Just to let everyone know, I'd like to release it Some PM, this Friday night, EDT. I'll need to coordinate specifics with Fred and Tim, but expect a check-in freeze on the branch at some point Friday, with a release to follow shortly thereafter. -Barry From terry@wayforward.net Wed May 28 20:37:14 2003 From: terry@wayforward.net (Terence Way) Date: Wed, 28 May 2003 15:37:14 -0400 Subject: [Python-Dev] Introduction In-Reply-To: <5.1.1.6.0.20030528140810.0249d640@telecommunity.com> Message-ID: On Wednesday, May 28, 2003, at 02:23 PM, Phillip J. Eby wrote: > Please don't number a pre-PEP; I believe PEP 1 recomends using 'XXX' > until a PEP number has been assigned by the PEP editors. > Ack. Oops. I've sent it off to peps@python.org with the XXX, but posted here with the 999. > A number of things aren't clear from your PEP. For example, how would > syntax errors in assertions be handled? How is backward compatibility > with existing docstrings that may use 'inv:' or 'pre:' to specify > conditions informally? Um. No thought given to that. My first guess is: syntax errors printed to standard error, optionally silently ignored, no safety checks installed either way. Run-time errors trapped and re-raised as some kind of ContractViolation:: def read_stuff(input) """pre: input.readline""" would be valid, and the AttributeError would be wrapped inside a PreconditionViolationError if the ``input`` parameter isn't some type of input stream. > Are you proposing that this be part of Python's core syntax? If so, > then why do it as docstrings? Are you proposing instead that your > implementation be part of the standard library? If so, then where is > the documentation for how a developer enables the behavior? 
Proposing that some implementation, hopefully mine, be put in the standard library. I *really* don't think contracts should be part of the core syntax: contracts belong in the documentation, and changing all the doc tools to parse code looking for contract assertions is harder than building one or two docstring implementations. self.note(): where *is* the documentation on how to enable the behavior. > Also, I didn't find the motivation section convincing. Your answer to > "Why not have several different implementations, or let programmers > implement their own assertions?" isn't actually a justification. If > Alice uses some package to wrap her methods with checks, I can weaken > the preconditions in a subclass, by simply overriding the methods. If > I can't do that, then it is a weakness of the DBC package Alice used, > or of Alice's package, not a weakness of Python. Consider when Alice's preconditions work, but Bob's do not. Code that thinks it's calling Alice's code *must not* break when calling Bob's. Weakening pre-conditions means that Alice's pre-conditions must be tested as well: and Bob's code is run even if his pre-conditions fail. The converse is also true: code that understands Bob's pre-conditions must not fail even if Alice's pre-conditions fail. This is tough to do with asserts, or with incompatible contract packages. I haven't made that clear in the PEP or the samples, and it needs to be clear, because it is the /only/ reason why contracts need to be in the language/standard runtime. Excellent points, thanks for taking an interest. From tim@zope.com Wed May 28 21:03:59 2003 From: tim@zope.com (Tim Peters) Date: Wed, 28 May 2003 16:03:59 -0400 Subject: [Python-Dev] Plans for Python 2.2.3 final In-Reply-To: <1054147573.10580.20.camel@barry> Message-ID: [Barry] > I've not heard about any showstoppers for Python 2.2.3. 
Mike Fletcher submitted two weakref bugs, one in WeakValueDictionary which I fixed in 2.3 but am waiting to hear from Fred about before backporting, the other a segfault I think I traced to subtype_dealloc then assigned to Guido. The segfault should be a showstopper: http://www.python.org/sf/742911 From tim.one@comcast.net Wed May 28 21:08:48 2003 From: tim.one@comcast.net (Tim Peters) Date: Wed, 28 May 2003 16:08:48 -0400 Subject: [Python-Dev] Plans for Python 2.2.3 final In-Reply-To: Message-ID: [Tim] > Mike Fletcher submitted two weakref bugs, one in WeakValueDictionary Nope, WeakKeyDictionary. > which I fixed in 2.3 but am waiting to hear from Fred about before > backporting, the other a segfault I think I traced to subtype_dealloc > then assigned to Guido. The segfault should be a showstopper: > > http://www.python.org/sf/742911 No change there. From barry@python.org Wed May 28 21:22:00 2003 From: barry@python.org (Barry Warsaw) Date: 28 May 2003 16:22:00 -0400 Subject: [Python-Dev] Plans for Python 2.2.3 final In-Reply-To: References: Message-ID: <1054153320.12509.3.camel@barry> On Wed, 2003-05-28 at 16:08, Tim Peters wrote: > [Tim] > > Mike Fletcher submitted two weakref bugs, one in WeakValueDictionary > > Nope, WeakKeyDictionary. > > > which I fixed in 2.3 but am waiting to hear from Fred about before > > backporting, the other a segfault I think I traced to subtype_dealloc > > then assigned to Guido. The segfault should be a showstopper: > > > > http://www.python.org/sf/742911 > > No change there. Guido promised to look into the latter. We'll withhold lunch from Fred until he looks at the former. kung-pao-ish-ly y'rs, -Barry
From pjones@redhat.com Wed May 28 21:55:56 2003 From: pjones@redhat.com (Peter Jones) Date: Wed, 28 May 2003 16:55:56 -0400 (EDT) Subject: [Python-Dev] Introduction In-Reply-To: Message-ID: Hi, I'm Peter. Long time listener, first time caller. On Wed, 28 May 2003, Terence Way wrote: > > A number of things aren't clear from your PEP. For example, how would > > syntax errors in assertions be handled? How is backward compatibility > > with existing docstrings that may use 'inv:' or 'pre:' to specify > > conditions informally? > > Um. No thought given to that. My first guess is: syntax errors printed > to standard error, optionally silently ignored, no safety checks > installed either way. It seems like either of these methods of coping with legacy docstrings thwart your basic premises. Unless the well-formed nature of the contracts are enforced, it seems to be fairly difficult to e.g. randomly test a function. What if the docstring fails to parse?
I have to be listening to stderr to know that it didn't work. I then have to parse the message from stderr to figure out which function didn't work, and finally I have to somehow mark this function as not compliant, and ignore whatever results I get. It really seems like you want them either "on" or "off", not "on, but it might fail in some silent or hard to trap way". > > Are you proposing that this be part of Python's core syntax? If so, > > then why do it as docstrings? Are you proposing instead that your > > implementation be part of the standard library? If so, then where is > > the documentation for how a developer enables the behavior? > > Proposing that some implementation, hopefully mine, be put in the > standard library. I *really* don't think contracts should be part of > the core syntax: contracts belong in the documentation, and changing all > the doc tools to parse code looking for contract assertions is harder > than building one or two docstring implementations. The assertion that contracts don't belong in the core seems entirely separate from the discussion of their place in docstrings or in real code. That being said, you still haven't explained *why* contracts belong in docstrings (or in documentation in general). They are executable code; why not treat them as such? > self.note(): where *is* the documentation on how to enable the > behavior. I suspect we have to know this before we can know which way is easier. That being said, I really don't see how these contracts can be meaningful as part of a docstring without some better mechanism for handling old docstrings that have been ruled malformed. What's your reasoning against making them their own kind of block, like "try:"?
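For concreteness, the docstring route Terence favors can be sketched in a few lines. This is only an illustration, not the PEP's implementation: pull `pre:` lines out of `__doc__`, compile them once (so a malformed contract fails loudly at wrap time, which speaks to Peter's parsing worry), and wrap the function:

```python
import functools

def check_contracts(func):
    """Wrap func so that any 'pre: <expr>' lines in its docstring are
    evaluated against the call's arguments before the body runs.
    (Illustrative only; the real proposal has much richer syntax.)"""
    pres = []
    for line in (func.__doc__ or "").splitlines():
        line = line.strip()
        if line.startswith("pre:"):
            # compile() raises SyntaxError here, at wrapping time,
            # rather than letting a bad contract hide until runtime.
            pres.append(compile(line[4:].strip(), "<contract>", "eval"))

    @functools.wraps(func)
    def wrapper(*args, **kw):
        # Bind positional arguments to parameter names for eval().
        env = dict(zip(func.__code__.co_varnames, args))
        env.update(kw)
        for pre in pres:
            if not eval(pre, {}, env):
                raise AssertionError("precondition failed")
        return func(*args, **kw)
    return wrapper

@check_contracts
def divide(a, b):
    """Return a / b.

    pre: b != 0
    """
    return a / b

assert divide(6, 3) == 2
```

Peter's alternative, contracts as real code blocks, would get the same loud failure for free from the compiler; the tradeoff is that docstring contracts stay visible in `help()` output.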
-- Peter From patmiller@llnl.gov Wed May 28 22:01:24 2003 From: patmiller@llnl.gov (Pat Miller) Date: Wed, 28 May 2003 14:01:24 -0700 Subject: [Python-Dev] Introduction References: <5.1.1.6.0.20030528140810.0249d640@telecommunity.com> Message-ID: <3ED523A4.7030405@llnl.gov> > http://www.wayforward.net/pycontract/pep-0999.html I think another issue with using doc strings in this way is that you are overloading a feature visible to end users. If I look at the doc string then I would expect to be confused by the result:

>>> help(circbuf)
Help on class circbuf in module __main__:

class circbuf
 |  Methods defined here:
 |
 |  get(self)
 |      Pull an entry from a non-empty circular buffer.
 |
 |      pre: not self.is_empty()
 |      post[self.g, self.len]:
 |          __return__ == self.buf[__old__.self.g]
 |          self.len == __old__.self.len - 1
 |  ...

Way too cryptic even for me :-) I think you could get the same effect by overloading property so you could make methods "smart" about pre and post conditions. The following is a first quick hack at it...:

class eiffel(property):
    """eiffel(method,precondition,postcondition)

    Implement an Eiffel-style method that enforces pre- and
    post-conditions.  I guess you could turn this on and off
    if you wanted...

    class foo:
        def pre(self):
            assert self.x > 0
        def post(self):
            assert self.x > 0
        def increment(self):
            self.x += 1
            return
        increment = eiffel(increment, pre, post)
    """
    def __init__(self, method, precondition, postcondition, doc=None):
        self.method = method
        self.precondition = precondition
        self.postcondition = postcondition
        super(eiffel, self).__init__(self.__get, None, None, doc)
        return

    def __get(self, this):
        # Return a callable that checks the contract around the call.
        class funny_method:
            def __init__(self, this, method, precondition, postcondition):
                self.this = this
                self.method = method
                self.precondition = precondition
                self.postcondition = postcondition
                return
            def __call__(self, *args, **kw):
                self.precondition(self.this)
                value = self.method(self.this, *args, **kw)
                self.postcondition(self.this)
                return value
        return funny_method(this, self.method,
                            self.precondition, self.postcondition)

class circbuf:
    def __init__(self):
        self.stack = []
        return
    def _get_pre(self):
        assert not self.is_empty()
        return
    def _get_post(self):
        assert not self.is_empty()
        return
    def _get(self):
        """Pull an entry from a non-empty circular buffer."""
        val = self.stack[-1]
        del self.stack[-1]
        return val
    get = eiffel(_get, _get_pre, _get_post)
    def put(self, val):
        self.stack.append(val)
        return
    def is_empty(self):
        return len(self.stack) == 0

B = circbuf()
B.put('hello')
B.put('world')
print B.get()   # OK: the buffer is still non-empty afterwards
print B.get()   # Will bomb... the postcondition fails once the buffer empties

-- Patrick Miller | (925) 423-0309 | http://www.llnl.gov/CASC/people/pmiller If you think you can do a thing or think you can't do a thing, you're right. -- Henry Ford From pje@telecommunity.com Wed May 28 23:05:30 2003 From: pje@telecommunity.com (Phillip J. Eby) Date: Wed, 28 May 2003 18:05:30 -0400 Subject: [Python-Dev] Contracts PEP (was re: Introduction) In-Reply-To: References: <5.1.1.6.0.20030528140810.0249d640@telecommunity.com> Message-ID: <5.1.1.6.0.20030528174512.01eb5e50@telecommunity.com> At 03:37 PM 5/28/03 -0400, Terence Way wrote: >On Wednesday, May 28, 2003, at 02:23 PM, Phillip J. Eby wrote: >>Also, I didn't find the motivation section convincing.
Your answer to >>"Why not have several different implementations, or let programmers >>implement their own assertions?" isn't actually a justification. If >>Alice uses some package to wrap her methods with checks, I can weaken the >>preconditions in a subclass, by simply overriding the methods. If I >>can't do that, then it is a weakness of the DBC package Alice used, or of >>Alice's package, not a weakness of Python. >Consider when Alice's preconditions work, but Bob's do not. Code that >thinks it's calling Alice's code *must not* break when calling Bob's. Okay, you've completely lost me now, because I don't know what you mean by "work" in this context. Do you mean, "are met by the caller"? Or "are syntactically valid"? Or...? >Weakening pre-conditions means that Alice's pre-conditions must be >tested as well: and Bob's code is run even if his pre-conditions fail. Whaa? That can't be right. Weakening a precondition means that Bob's preconditions should *replace* Alice's preconditions, if Bob has supplied newer, weaker preconditions. Bob's code should *not* be run if Bob's preconditions are not met. Just to make sure we're not on completely different pages here, I'm thinking this: class AlicesClass: def something(self): """pre: foo and bar""" class BobsClass(AlicesClass): def something(self): """pre: foo""" That, to me, is weakening a precondition. Now, if what you're saying is that Bob's code must work if *Alice's* preconditions are met, then that's something different. What you're saying then, is that it's required that a precondition in a subclass be logically implied by each of the corresponding preconditions in the base classes. That is certainly a reasonable requirement, but I don't see why the language needs to enforce it, certainly not by running Bob's code even when Bob's precondition fails! If you're going to enforce it, it should be enforced by issuing an error for preconditions that aren't logically implied by their superclass preconditions. 
Then you actually get some benefit from the static checking. If you just run Bob's code, he has no way to notice that he's violating Alice's contract, until his code keeps breaking at runtime. (And then, he will almost certainly come to the conclusion that the contract checker is broken!) OTOH, if you accept Bob's precondition as he stated it, then he gets the behavior he asked for. If this is a violation of Alice's contract, Bob's users will either read the fact in his docs, or complain. >The converse is also true: code that understands Bob's pre-conditions >must not fail even if Alice's pre-conditions fail. This is tough to >do with asserts, or with incompatible contract packages. I still don't understand. If Bob has replaced Alice's method, what do her preconditions have to do with it any more? If Bob's code *calls* Alice's method, then the conditions of Alice's method presumably *do* need to apply for that upcall, or else she has written them without enough indirection. >I haven't made that clear in the PEP or the samples, and it needs to >be clear, because it is the /only/ reason why contracts need to be in >the language/standard runtime. Yep, and I'm still totally not seeing why Alice and Bob have to use the same mechanism. Alice could use method wrappers, Bob could use a metaclass, and Carol could use assert statements, as far as I can see, unless you are looking for static correctness checking. (In which case, docstrings are the wrong place for this.) From barry@python.org Wed May 28 23:09:50 2003 From: barry@python.org (Barry Warsaw) Date: 28 May 2003 18:09:50 -0400 Subject: [Python-Dev] Plans for Python 2.2.3 final In-Reply-To: References: Message-ID: <1054159790.4482.0.camel@geddy> On Wed, 2003-05-28 at 16:03, Tim Peters wrote: > [Barry] > > I've not heard about any showstoppers for Python 2.2.3. 
> > Mike Fletcher submitted two weakref bugs, one in WeakValueDictionary which I > fixed in 2.3 but am waiting to hear from Fred about before backporting, the > other a segfault I think I traced to subtype_dealloc then assigned to Guido. > The segfault should be a showstopper: > > http://www.python.org/sf/742911 Ok, I just spoke to Fred. He gives his seal of approval for the weakref backport. I'll do that, after testing the patches and backporting the tests. Guido's still going to look at the latter bug. -Barry From martin@v.loewis.de Thu May 29 02:03:53 2003 From: martin@v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=) Date: 29 May 2003 03:03:53 +0200 Subject: [Python-Dev] Python bug 544473 - bugfix candidate - was it applied? In-Reply-To: <16084.45321.727469.548923@montanaro.dyndns.org> References: <16084.3870.666366.928341@montanaro.dyndns.org> <200305281120.h4SBKdQ11691@pcp02138704pcs.reston01.va.comcast.net> <16084.45321.727469.548923@montanaro.dyndns.org> Message-ID: Skip Montanaro writes: > I guess I misunderstood how "cvs log" worked. Given "cvs log foo" I thought > it would list the checkin comments for all versions and all branches of foo. And indeed that's what it does. Look to the very end of the log. Regards, Martin From gward@python.net Thu May 29 02:32:29 2003 From: gward@python.net (Greg Ward) Date: Wed, 28 May 2003 21:32:29 -0400 Subject: [Python-Dev] Change to ossaudiodev setparameters() method In-Reply-To: <200305260640.h4Q6dxm08507@pcp02138704pcs.reston01.va.comcast.net> References: <20030526021635.GA15814@cthulhu.gerg.ca> <200305260640.h4Q6dxm08507@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <20030529013229.GA19091@cthulhu.gerg.ca> [me, on ossaudiodev.setparameters()] > In addition to being silly, this is not the documented interface. The > docs don't mention the 'sample_size' argument at all. Presumably the > doc writer realized the silliness and was going to pester me to remove > 'sample_size', but never got around to it. 
(Lot of that going around.) > > So, even though we're in a beta cycle, am I allowed to change the code > so it's 1) sensible and 2) consistent with the documentation? [Guido] > Yes. I like silliness in a MP skit, but not in my APIs. :-) OK, done. I've also beefed up the test script a bit. So, once again, if you have a Linux or FreeBSD system with working sound card, can you run ./python Lib/test/regrtest.py -uaudio test_ossaudiodev ...preferably before and after a "cvs up && make" to see if things are better, worse, or unchanged? Greg -- Greg Ward http://www.gerg.ca/ All the world's a stage and most of us are desperately unrehearsed. From guido@python.org Thu May 29 15:50:06 2003 From: guido@python.org (Guido van Rossum) Date: Thu, 29 May 2003 10:50:06 -0400 Subject: [Python-Dev] Change to ossaudiodev setparameters() method In-Reply-To: Your message of "Wed, 28 May 2003 21:32:29 EDT."
<20030529013229.GA19091@cthulhu.gerg.ca> References: <20030526021635.GA15814@cthulhu.gerg.ca> <200305260640.h4Q6dxm08507@pcp02138704pcs.reston01.va.comcast.net> <20030529013229.GA19091@cthulhu.gerg.ca> Message-ID: <200305291450.h4TEo6q15846@odiug.zope.com> > [me, on ossaudiodev.setparameters()] > > In addition to being silly, this is not the documented interface. The > > docs don't mention the 'sample_size' argument at all. Presumably the > > doc writer realized the silliness and was going to pester me to remove > > 'sample_size', but never got around to it. (Lot of that going around.) > > > > So, even though we're in a beta cycle, am I allowed to change the code > > so it's 1) sensible and 2) consistent with the documentation? > > [Guido] > > Yes. I like silliness in a MP skit, but not in my APIs. :-) > > OK, done. I've also beefed up the test script a bit. So, once again, > if you have a Linux or FreeBSD system with working sound card, can you > run > > ./python Lib/test/regrtest.py -uaudio test_ossaudiodev > > ...preferably before and after a "cvs up && make" to see if things are > better, worse, or unchanged? Did you check in the changes to ossaudiodev? A cvs update gave me new test files:

P Lib/test/test_ossaudiodev.py
P Lib/test/output/test_ossaudiodev

but no new C code, and now I get this error when I run the above test:

$ ./python ../Lib/test/regrtest.py -uaudio test_ossaudiodev
test_ossaudiodev
test test_ossaudiodev crashed -- exceptions.TypeError: setparameters() takes at least 4 arguments (3 given)
1 test failed:
    test_ossaudiodev
$

Before the cvs update, the test produced some audio and then hung; when I interrupted, here's the traceback:

Traceback (most recent call last):
  File "../Lib/test/regrtest.py", line 974, in ?
    main()
  File "../Lib/test/regrtest.py", line 264, in main
    ok = runtest(test, generate, verbose, quiet, testdir)
  File "../Lib/test/regrtest.py", line 394, in runtest
    the_package = __import__(abstest, globals(), locals(), [])
  File "/mnt/home/guido/projects/trunk/Lib/test/test_ossaudiodev.py", line 96, in ?
    test()
  File "/mnt/home/guido/projects/trunk/Lib/test/test_ossaudiodev.py", line 93, in test
    play_sound_file(data, rate, ssize, nchannels)
  File "/mnt/home/guido/projects/trunk/Lib/test/test_ossaudiodev.py", line 56, in play_sound_file
    a.write(data)

--Guido van Rossum (home page: http://www.python.org/~guido/) From terry@wayforward.net Thu May 29 15:53:16 2003 From: terry@wayforward.net (Terence Way) Date: Thu, 29 May 2003 10:53:16 -0400 Subject: [Python-Dev] Introduction In-Reply-To: Message-ID: <47C92342-91E5-11D7-BD8E-00039344A0EC@wayforward.net> On Wednesday, May 28, 2003, at 04:55 PM, Peter Jones wrote: > What if the docstring fails to parse? I have to be listening to > stderr to > know that it didn't work. I then have to parse the message from > stderr to > figure out which function didn't work, and finally I have to somehow > mark > this function as not compliant, and ignore whatever results I get. > I probably would have figured that out too, eventually... :-) More on this further down, when I talk about how to enable docstring testing. > That being said, you still haven't explained *why* contracts belong in > docstrings (or in documentation in general). They are executable code; > why not treat them as such? > Okay, the whole docstring vs syntax thing, and I'm going to quote liberally from Bertrand Meyer's Object Oriented Software Construction, 1st edition, 7.9 Using Assertions. There are four main reasons for adding contracts to code:

"""
* Help in writing correct software.
* Documentation aid.
* Debugging tool.
* Support for software fault tolerance.
[...]
The second use is essential in the production of reusable software elements and, more generally, in organizing the interfaces of modules in large software systems. Preconditions, postconditions, and class invariants provide potential clients of a module with crucial information about the services offered by the module, expressed in a concise and precise form. No amount of verbose documentation can replace a set of carefully expressed assertions. """ I really like Extreme Programming's cut-to-the-bone approach: there are only two things worth knowing about the code: *what* it does and *how* it does it. In XP, what the code does can be inferred from test cases; how it does it from the source code. And if you can't read the code, you have no business talking about how the software does what it does anyway. With contracts, I want to move the knowledge of *what* the code does from the test cases back into the programming documentation. It is merely a bonus feature that this documentation can be executed. When I was learning Python (um, not too long ago) the epiphany of what this language was all about hit me when I saw the 'doctest' module. We're *always* using examples as clear, concise ways to describe what our code does, but we're all guilty of letting those examples get out-of-date. Doctest can crawl into our software deep enough to keep us honest about our documentation. Contracts extend this so it's not just about the basic sample cases, but about the entire state space that a function supports... "Here be dragons" but over there be heap-based priority queues. >> self.note(): where *is* the documentation on how to enable the >> behavior. > > I suspect we have to know this before we can know which way is easier. 
>

Now that I've come out as a doctest fanboy, it should be no surprise
that contracts are enabled like this:

    import contracts, mymodule
    contracts.checkmod(mymodule)

The checkmod side effect is that all functions within mymodule are
replaced by auto-generated checking functions.

And now I think I'm clear in my own mind about backwards-compatibility
with informal 'pre:' docstrings... a programmer doesn't run checkmod
unless she's sure that all docstring contracts are valid.  Syntax error
exceptions will be passed through to the checkmod caller.

Cheers!

From terry@wayforward.net Thu May 29 19:26:36 2003
From: terry@wayforward.net (Terence Way)
Date: Thu, 29 May 2003 14:26:36 -0400
Subject: [Python-Dev] Contracts PEP (was re: Introduction)
In-Reply-To: <5.1.1.6.0.20030528174512.01eb5e50@telecommunity.com>
Message-ID: <151E799C-9203-11D7-BD8E-00039344A0EC@wayforward.net>

On Wednesday, May 28, 2003, at 06:05 PM, Phillip J. Eby wrote:

> ... I'm still totally not seeing why Alice and Bob have to use the
> same mechanism.  Alice could use method wrappers, Bob could use a
> metaclass, and Carol could use assert statements, as far as I can see,
> unless you are looking for static correctness checking.  (In which
> case, docstrings are the wrong place for this.)

Here is the full behavior (all quotes are straight from Bertrand
Meyer's Object Oriented Software Construction, 11.1 Inheritance and
Assertions):

"""
Parents' invariant rule: The invariants of all the parents of a class
apply to the class itself.  The parents' invariants are considered to
be added to the class's own invariant, "addition" being here a logical
*and*.
"""

Having a single contract implementation means that Bob's overriding
class can check Alice's invariants, even if none of Alice's methods are
actually called.

"""
Assertion redefinition rule: Let r be a routine in class A and s a
redefinition of r in a descendant of A, or an effective definition of r
if r was deferred.
Then pre(s) must be weaker than or equal to pre(r), and post(s) must be
stronger than or equal to post(r)
"""

Having a single contract implementation means that Bob's overriding
methods' postconditions check Alice's postconditions, even if none of
Alice's methods are actually called.

I hope I've at least convinced you that it would be nice to have a
single implementation to support 'inv:' and 'post:' with inheritance.
Now on to those irritating pre-conditions.

> That, to me, is weakening a precondition.  Now, if what you're saying
> is that Bob's code must work if *Alice's* preconditions are met, then
> that's something different.  What you're saying then, is that it's
> required that a precondition in a subclass be logically implied by
> each of the corresponding preconditions in the base classes.
>
> That is certainly a reasonable requirement, but I don't see why the
> language needs to enforce it, certainly not by running Bob's code even
> when Bob's precondition fails!  If you're going to enforce it, it
> should be enforced by issuing an error for preconditions that aren't
> logically implied by their superclass preconditions.  Then you
> actually get some benefit from the static checking.  If you just run
> Bob's code, he has no way to notice that he's violating Alice's
> contract, until his code keeps breaking at runtime.  (And then, he
> will almost certainly come to the conclusion that the contract checker
> is broken!)

This is especially irritating because what you're asking for is exactly
what my implementation was doing three weeks ago.  I *agree* with you.
There seem to be two opposing groups:

  Academics: Pre-conditions are ORed!  Liskov Substitution Principle!
  Programmers: this is a debugging tool!  Tell me when I mess up!

I admit, I'm doing something different by supporting OR pre-conditions.
Meyer again:

"""
So the require and ensure clause must always be given for a routine,
even if it is a redefinition, and even if these clauses are identical
to their antecedents in the original.
"""

Well, this is error-prone and wrong for postconditions.  It's not an
issue to just AND a method's post()s with all overridden post()s, we've
covered that earlier.  It's only those pesky preconditions.

Summary: I agree with your point... pre-conditions should only be
checked on a method call for the pre-conditions of the method itself.
Overridden methods' preconditions are ignored.  However, this still
means some communication between super-class and overridden class is
necessary.  Contract invariants and postconditions of overridden
classes/methods still need to be checked.

Cheers!

From fdrake@acm.org Thu May 29 19:46:12 2003
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Thu, 29 May 2003 14:46:12 -0400
Subject: [Python-Dev] Python 2.2.3 docs freeze
Message-ID: <16086.21876.190793.365508@grendel.zope.com>

I'm going to generate the Python 2.2.3 documentation packages now, so
please no more checkins in the Doc/ tree on the release22-maint branch.
Thanks!

  -Fred

--
Fred L. Drake, Jr.
PythonLabs at Zope Corporation

From jack@performancedrivers.com Thu May 29 20:14:21 2003
From: jack@performancedrivers.com (Jack Diederich)
Date: Thu, 29 May 2003 15:14:21 -0400
Subject: [Python-Dev] Release question
Message-ID: <20030529151421.G1276@localhost.localdomain>

The PEPs are pretty thorough about how to announce and build releases.
My question is about cvs, branching, and feature freezes.  I joined
python-dev after the 2.2 release so I haven't been around for a 'round
number' release yet.  I'm guessing 2.3 is in bugfix-only mode.  When is
2.4 tagged, and what is the timeframe on that?  (the linux kernel
generally waits a while before starting the next dev branch).  Assume I
asked intelligent questions about related things and please answer them
too *wink*.
Thanks,
-jack

From scrosby@cs.rice.edu Thu May 29 21:33:12 2003
From: scrosby@cs.rice.edu (Scott A Crosby)
Date: 29 May 2003 15:33:12 -0500
Subject: [Python-Dev] Algoritmic Complexity Attack on Python
Message-ID: 

Hello.  We have analyzed this software to determine its vulnerability
to a new class of DoS attacks related to a recent paper, ''Denial of
Service via Algorithmic Complexity Attacks.''

This paper discusses a new class of denial of service attacks that work
by exploiting the difference between average-case performance and
worst-case performance.  In an adversarial environment, the data
structures used by an application may be forced to experience their
worst-case performance.  For instance, hash tables are usually thought
of as constant-time operations, but with large numbers of collisions
they degrade to a linked list and may lead to a 100-10,000 times
performance degradation.  Because of the widespread use of hash tables,
the potential for attack is extremely widespread.

Fortunately, in many cases, other limits on the system bound the impact
of these attacks.  To be attackable, an application must have a
deterministic or predictable hash function and accept untrusted input.
In general, for the attack to be significant, the application must be
willing and able to accept hundreds to tens of thousands of 'attack
inputs'.  Because of that requirement, it is difficult to judge the
impact of these attacks without knowing the source code extremely well,
and knowing all the ways in which a program is used.

As part of this project, I have examined Python 2.3b1, and the hash
function 'string_hash' is deterministic.  Thus any script that hashes
untrusted input may be vulnerable to our attack.  Furthermore, the
structure of the hash function allows our fast collision generation
algorithm to work.  This means that any script written in Python that
hashes a large number of keys from an untrusted source is potentially
subject to a severe performance degradation.
Depending on the application or script, this could be a critical DoS.

The solution for these attacks on hash tables is to make the hash
function unpredictable via a technique known as universal hashing.
Universal hashing is a keyed hash function where, based on the key, one
of a large set of hash functions is chosen.  When benchmarking, we
observe that for short or medium length inputs, it is comparable in
performance to simple predictable hash functions such as the ones in
Python or Perl.  Our paper has graphs and charts of our benchmarked
performance.

I highly advise using a universal hashing library, either our own or
someone else's.  As has historically been seen, it is very easy to make
silly mistakes when attempting to implement your own 'secure'
algorithm.

The abstract, paper, and a library implementing universal hashing are
available at http://www.cs.rice.edu/~scrosby/hash/.

Scott

From python@rcn.com Thu May 29 21:55:35 2003
From: python@rcn.com (Raymond Hettinger)
Date: Thu, 29 May 2003 16:55:35 -0400
Subject: [Python-Dev] Algoritmic Complexity Attack on Python
References: 
Message-ID: <006f01c32624$a8c0ed80$125ffea9@oemcomputer>

> For instance, hash tables are usually thought of as being constant
> time operations, but with large numbers of collisions will degrade to
> a linked list and may lead to a 100-10,000 times performance
> degradation.

True enough.  And it's not hard to create tons of keys that will
collide (Uncle Tim even gives an example in the source for those who
care to read).  Going from O(1) to O(n) for each insertion would be a
bit painful during the process of building up a large dictionary.

So, did your research show a prevalence of or even existence of online
applications that allow someone to submit high volumes of meaningless
keys to be saved in a hash table?
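The O(1)-to-O(n) degradation Raymond describes is easy to demonstrate even without Scott's attack file, by giving every key the same hash code. This is a hypothetical sketch in modern Python (not the attack from the paper, which finds real colliding strings for the fixed string hash):

```python
# Hypothetical sketch: force worst-case dict behavior by making every
# key hash to the same value, so each insertion walks one long
# collision chain (O(n) per insert, O(n^2) to build the dict).
import time

class Colliding:
    """A key whose hash is constant, so all instances collide."""
    def __init__(self, n):
        self.n = n
    def __hash__(self):
        return 42                  # every key probes the same slots
    def __eq__(self, other):
        return isinstance(other, Colliding) and self.n == other.n

def build(keys):
    t0 = time.time()
    d = {k: 1 for k in keys}
    return d, time.time() - t0

d_bad, t_bad = build([Colliding(i) for i in range(1000)])
d_ok, t_ok = build(list(range(1000)))      # well-distributed int keys
print(len(d_bad), len(d_ok))               # both dicts hold 1000 entries
print(t_bad > t_ok)                        # the colliding build is far slower
```

The attack in the paper achieves the same chain-walking behavior with plain strings, which is what makes it dangerous: the victim never sees a suspicious key type, only slow input.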
Raymond Hettinger

From jeremy@zope.com Thu May 29 22:01:19 2003
From: jeremy@zope.com (Jeremy Hylton)
Date: 29 May 2003 17:01:19 -0400
Subject: [Python-Dev] Algoritmic Complexity Attack on Python
In-Reply-To: 
References: 
Message-ID: <1054242079.6832.26.camel@slothrop.zope.com>

Scott,

I just took a minute to look at this.  I downloaded the python-attack
file from your Web site.  I loaded all the strings and then inserted
them into a dictionary.  I also generated a list of 10,000 random
strings and inserted them into a dictionary.

The script is below.  The results show that inserting the
python-attack strings is about 4 times slower than inserting random
strings.

slothrop:~/src/python/dist/src/build> ./python ~/attack.py ~/python-attack
time 0.0898009538651
size 10000
slothrop:~/src/python/dist/src/build> ./python ~/attack.py ~/simple
time 0.0229719877243
size 10000

Jeremy

import time

def main(path):
    L = [l.strip() for l in open(path)]
    d = {}
    t0 = time.time()
    for k in L:
        d[k] = 1
    t1 = time.time()
    print "time", t1 - t0
    print "size", len(d)

if __name__ == "__main__":
    import sys
    main(sys.argv[1])

From scrosby@cs.rice.edu Thu May 29 22:10:04 2003
From: scrosby@cs.rice.edu (Scott A Crosby)
Date: 29 May 2003 16:10:04 -0500
Subject: [Python-Dev] Algoritmic Complexity Attack on Python
In-Reply-To: <1054242079.6832.26.camel@slothrop.zope.com>
References: <1054242079.6832.26.camel@slothrop.zope.com>
Message-ID: 

On 29 May 2003 17:01:19 -0400, Jeremy Hylton writes:

> Scott,
>
> I just took a minute to look at this.  I downloaded the python-attack
> file from your Web site.  I loaded all the strings and then inserted
> them into a dictionary.  I also generated a list of 10,000 random
> strings and inserted them into a dictionary.

Ok.  It should have taken almost a minute instead of .08 seconds in the
attack version.  My file is broken.  I'll be constructing a new one
later this evening.  If you test perl with the perl files, you'll see
what should have occurred in this case.
> The script is below. Thank you. Scott From scrosby@cs.rice.edu Thu May 29 22:23:24 2003 From: scrosby@cs.rice.edu (Scott A Crosby) Date: 29 May 2003 16:23:24 -0500 Subject: [Python-Dev] Algoritmic Complexity Attack on Python In-Reply-To: <006f01c32624$a8c0ed80$125ffea9@oemcomputer> References: <006f01c32624$a8c0ed80$125ffea9@oemcomputer> Message-ID: On Thu, 29 May 2003 16:55:35 -0400, "Raymond Hettinger" writes: > So, did your research show a prevalence of or even existence of > online applications that allow someone to submit high volumes of > meaningless keys to be saved in a hash table? I am not a python guru and We weren't looking for specific applications, so I wouldn't know. Scott From gward@python.net Thu May 29 22:56:52 2003 From: gward@python.net (Greg Ward) Date: Thu, 29 May 2003 17:56:52 -0400 Subject: [Python-Dev] Change to ossaudiodev setparameters() method In-Reply-To: <200305291450.h4TEo6q15846@odiug.zope.com> References: <20030526021635.GA15814@cthulhu.gerg.ca> <200305260640.h4Q6dxm08507@pcp02138704pcs.reston01.va.comcast.net> <20030529013229.GA19091@cthulhu.gerg.ca> <200305291450.h4TEo6q15846@odiug.zope.com> Message-ID: <20030529215652.GB28065@cthulhu.gerg.ca> On 29 May 2003, Guido van Rossum said: > Did you check in the changes to ossaudiodev? Oops! I did now -- thanks. Please try again. Greg -- Greg Ward http://www.gerg.ca/ I just read that 50% of the population has below median IQ! From guido@python.org Thu May 29 23:29:39 2003 From: guido@python.org (Guido van Rossum) Date: Thu, 29 May 2003 18:29:39 -0400 Subject: [Python-Dev] Change to ossaudiodev setparameters() method In-Reply-To: Your message of "Thu, 29 May 2003 17:56:52 EDT." 
<20030529215652.GB28065@cthulhu.gerg.ca>
References: <20030526021635.GA15814@cthulhu.gerg.ca>
 <200305260640.h4Q6dxm08507@pcp02138704pcs.reston01.va.comcast.net>
 <20030529013229.GA19091@cthulhu.gerg.ca>
 <200305291450.h4TEo6q15846@odiug.zope.com>
 <20030529215652.GB28065@cthulhu.gerg.ca>
Message-ID: <200305292229.h4TMTdU19567@odiug.zope.com>

> > Did you check in the changes to ossaudiodev?
>
> Oops!  I did now -- thanks.  Please try again.

Alas, no change.  Still some squeaks from the speaker followed by a
hanging process:

Traceback (most recent call last):
  File "../Lib/test/regrtest.py", line 974, in ?
    main()
  File "../Lib/test/regrtest.py", line 264, in main
    ok = runtest(test, generate, verbose, quiet, testdir)
  File "../Lib/test/regrtest.py", line 394, in runtest
    the_package = __import__(abstest, globals(), locals(), [])
  File "/mnt/home/guido/projects/trunk/Lib/test/test_ossaudiodev.py", line 119, in ?
    test()
  File "/mnt/home/guido/projects/trunk/Lib/test/test_ossaudiodev.py", line 116, in test
    play_sound_file(data, rate, ssize, nchannels)
  File "/mnt/home/guido/projects/trunk/Lib/test/test_ossaudiodev.py", line 58, in play_sound_file
    dsp.write(data)
KeyboardInterrupt

--Guido van Rossum (home page: http://www.python.org/~guido/)

From tim@zope.com Thu May 29 23:31:56 2003
From: tim@zope.com (Tim Peters)
Date: Thu, 29 May 2003 18:31:56 -0400
Subject: [Python-Dev] Algoritmic Complexity Attack on Python
In-Reply-To: 
Message-ID: 

[Jeremy Hylton]
>> I just took a minute to look at this.  I downloaded the python-attack
>> file from your Web site.  I loaded all the strings and then inserted
>> them into a dictionary.  I also generated a list of 10,000 random
>> strings and inserted them into a dictionary.

[Scott Crosby]
> Ok.  It should have taken almost a minute instead of .08 seconds in
> the attack version.  My file is broken.  I'll be constructing a new
> one later this evening.  If you test perl with the perl files, you'll
> see what should have occurred in this case.
Note that the 10,000 strings in the file map to 400 distinct 32-bit
hash codes under Python's hash.  It's not enough to provoke worst-case
behavior in Python just to collide on the low-order bits: all 32 bits
contribute to the probe sequence (just colliding on the initial bucket
slot doesn't have much effect).  As is, it's effectively creating 400
collision chains, ranging in length from 7 to 252, with a mean length
of 25 and a median of 16.

From scrosby@cs.rice.edu Fri May 30 00:29:45 2003
From: scrosby@cs.rice.edu (Scott A Crosby)
Date: 29 May 2003 18:29:45 -0500
Subject: [Python-Dev] Algoritmic Complexity Attack on Python
In-Reply-To: 
References: 
Message-ID: 

On Thu, 29 May 2003 18:31:56 -0400, "Tim Peters" writes:

> [Jeremy Hylton]
> >> I just took a minute to look at this.  I downloaded the
> >> python-attack file from your Web site.  I loaded all the strings
> >> and then inserted them into a dictionary.  I also generated a list
> >> of 10,000 random strings and inserted them into a dictionary.
>
> [Scott Crosby]
> > Ok.  It should have taken almost a minute instead of .08 seconds in
> > the attack version.  My file is broken.  I'll be constructing a new
> > one later this evening.  If you test perl with the perl files,
> > you'll see what should have occurred in this case.
>
> Note that the 10,000 strings in the file map to 400 distinct 32-bit hash
> codes under Python's hash.  It's not enough to provoke worst-case behavior

Yes.  Jeremy has made me aware of this.  I appear to have made a
mistake when inserting Python's hash code into my program that finds
generators.  The fact that I find so many collisions, but I don't have
everything colliding, indicates what the problem is.

> in Python just to collide on the low-order bits: all 32 bits contribute to
> the probe sequence (just colliding on the initial bucket slot doesn't have
> much effect).

Correct.  My program, as per the paper, generates full 32-bit hash
collisions.  It doesn't generate bucket collisions.
> As is, it's effectively creating 400 collision chains, ranging in
> length from 7 to 252, with a mean length of 25 and a median of 16.

Yes.  I don't know Python, so I had no test program to verify that my
attack file was correct.  It's not.  Now that I have one, I'll quash
the bug and release a new file later this evening.

Scott

From tim.one@comcast.net Fri May 30 02:54:20 2003
From: tim.one@comcast.net (Tim Peters)
Date: Thu, 29 May 2003 21:54:20 -0400
Subject: [Python-Dev] Algoritmic Complexity Attack on Python
In-Reply-To: 
Message-ID: 

I've got one meta-comment here:

[Scott A Crosby]
> Hello.  We have analyzed this software to determine its vulnerability
> to a new class of DoS attacks related to a recent paper, ''Denial of
> Service via Algorithmic Complexity Attacks.''

I don't think this is new.  For example, a much simpler kind of attack
is to exploit the way backtracking regexp engines work -- it's easy to
find regexp + target_string combos that take time exponential in the
sum of the lengths of the input strings.  It's not so easy to recognize
such a pair when it's handed to you.  In Python, exploiting
unbounded-int arithmetic is another way to soak up eons of CPU with few
characters, e.g.

    10**10**10

will suck up all your CPU *and* all your RAM.  Another easy way is to
study a system's C qsort() implementation, and provoke it into
quadratic-time behavior (BTW, McIlroy wrote a cool paper on this in
'98: http://www.cs.dartmouth.edu/~doug/mdmspe.pdf ).

I'm uninterested in trying to "do something" about these.  If
resource-hogging is a serious potential problem in some context, then
resource limitation is an operating system's job, and any use of Python
(or Perl, etc) in such a context should be under the watchful eyes of
OS subsystems that track actual resource usage.
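Tim's regexp example is easy to reproduce: a nested quantifier makes a backtracking engine try exponentially many ways to fail before giving up. A deliberately bounded sketch (kept small so it finishes quickly; a few dozen more a's would effectively hang the process):

```python
# A bounded demonstration of catastrophic regexp backtracking: (a+)+$
# cannot match a string of a's ending in '!', but a backtracking engine
# explores roughly 2**(n-1) partitions of the a's before giving up.
import re
import time

pattern = re.compile(r'(a+)+$')

def time_failure(n):
    s = 'a' * n + '!'              # the trailing '!' forces failure
    t0 = time.time()
    result = pattern.match(s)
    elapsed = time.time() - t0
    assert result is None
    return elapsed

# Each additional 'a' roughly doubles the time to fail; keep n small.
for n in (12, 16, 20):
    print(n, time_failure(n))
```

The same pattern matches a plain run of a's instantly; only the *failing* case explodes, which is exactly why such pairs are "not so easy to recognize" by inspection.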
From scrosby@cs.rice.edu Fri May 30 03:19:57 2003 From: scrosby@cs.rice.edu (Scott A Crosby) Date: 29 May 2003 21:19:57 -0500 Subject: [Python-Dev] Algoritmic Complexity Attack on Python In-Reply-To: References: Message-ID: On Thu, 29 May 2003 21:54:20 -0400, Tim Peters writes: > I've got one meta-comment here: > > [Scott A Crosby] > > Hello. We have analyzed this software to determine its vulnerability > > to a new class of DoS attacks that related to a recent paper. ''Denial > > of Service via Algorithmic Complexity Attacks.'' > > I don't think this is new. For example, a much simpler kind of attack is to > exploit the way backtracking regexp engines work -- it's easy to find regexp > + target_string combos that take time exponential in the sum of the lengths > of the input strings. It's not so easy to recognize such a pair when it's > handed to you. In Python, exploiting unbounded-int arithmetic is another > way to soak up eons of CPU with few characters, e.g. > > 10**10**10 > These ways require me having the ability to feed a program, an expression, or a regular expression into the victim's python interpreter. The attack I discuss only require that it hash some arbitrary input by the attacker, so these attacks apply in many more cases. > will suck up all your CPU *and* all your RAM. Another easy way is to study > a system's C qsort() implementation, and provoke it into quadratic-time > behavior (BTW, McIlroy wrote a cool paper on this in '98: > > http://www.cs.dartmouth.edu/~doug/mdmspe.pdf This is a very cool paper in exactly the same vein as ours. Thanks. > I'm uninterested in trying to "do something" about these. If > resource-hogging is a serious potential problem in some context, then > resource limitation is an operating system's job, and any use of Python (or > Perl, etc) in such a context should be under the watchful eyes of OS > subsystems that track actual resource usage. I disagree. Changing the hash function eliminates these attacks on hash tables. 
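The keyed hashing Scott advocates can be sketched as a polynomial hash selected by a random key, in the Carter-Wegman spirit. This is an illustrative toy, not the UHASH library from the paper (the real families there have stronger, proven collision bounds):

```python
# Toy illustration of universal (keyed) hashing: a random key (a, b)
# selects one function from a family, so an attacker who cannot see the
# key cannot precompute colliding inputs.  A sketch only, not UHASH.
import random

P = 2**61 - 1                      # a Mersenne prime, comfortably > 2**32

class KeyedStringHash:
    def __init__(self, seed=None):
        rng = random.Random(seed)
        self.a = rng.randrange(1, P)     # the secret key
        self.b = rng.randrange(0, P)
    def __call__(self, s):
        h = 0
        for byte in s.encode('utf-8'):
            h = (h * self.a + byte) % P  # polynomial hash keyed by a
        return (h + self.b) % P

h1 = KeyedStringHash(seed=1)
h2 = KeyedStringHash(seed=2)
print(h1("hello") == h1("hello"))   # deterministic under a fixed key
print(h1("hello") == h2("hello"))   # different keys: (almost surely) different
```

The key is chosen once per process, so all the lookups in one interpreter stay consistent; what the attacker loses is the ability to compute collisions offline against a hash function fixed in the source code.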
Scott From tim_one@email.msn.com Fri May 30 04:00:35 2003 From: tim_one@email.msn.com (Tim Peters) Date: Thu, 29 May 2003 23:00:35 -0400 Subject: [Python-Dev] Algoritmic Complexity Attack on Python In-Reply-To: Message-ID: [Scott A Crosby] > These ways require me having the ability to feed a program, an > expression, or a regular expression into the victim's python > interpreter. I think you underestimate the perversity of regular expressions in particular. Many programs use (fixed) regexps to parse input, and it's often possible to construct inputs that take horribly long times to match, or, especially, to fail to match. Blowing the hardware stack (provoking a memory fault) is also usually easy. The art of writing robust regexps for use with a backtracking engine is obscure and difficult (see Friedl's "Mastering Regular Expressions" (O'Reilly) for a practical intro to the topic), and regexps are ubiquitous now. > The attack I discuss only require that it hash some arbitrary input by > the attacker, so these attacks apply in many more cases. While a regexp attack only requires that the program parse one user-supplied string >> http://www.cs.dartmouth.edu/~doug/mdmspe.pdf > This is a very cool paper in exactly the same vein as ours. Thanks. It is a cool paper, and you're welcome. I don't think it's in the same vein, though -- McIlroy presented it as an interesting discovery, not as "a reason" for people to get agitated about programs using quicksort. The most likely reason you didn't find references to it before is because nobody in real life cares much about this attack possibility. >> I'm uninterested in trying to "do something" about these. If >> resource-hogging is a serious potential problem in some context, then >> resource limitation is an operating system's job, and any use of >> Python (or Perl, etc) in such a context should be under the watchful >> eyes of OS subsystems that track actual resource usage. > I disagree. 
Changing the hash function eliminates these attacks on > hash tables. It depends on how much access an attacker has, and, as you said before, you're not aware of any specific application in Python that *can* be attacked this way. I'm not either. In any case, the universe of resource attacks is much larger than just picking on hash functions, so plugging a hole in those alone wouldn't do anything to ease my fears -- provided I had such fears, which is admittedly a stretch . If I did have such fears, I'd want the OS to alleviate them all at once. From martin@v.loewis.de Fri May 30 07:37:00 2003 From: martin@v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=) Date: 30 May 2003 08:37:00 +0200 Subject: [Python-Dev] Release question In-Reply-To: <20030529151421.G1276@localhost.localdomain> References: <20030529151421.G1276@localhost.localdomain> Message-ID: Jack Diederich writes: > The PEPs are pretty thorough about how to announce and build > releases. My question is about cvs, branching, and feature freezes. > I joined python-dev after the 2.2 release so I haven't been around > for a 'round number' release yet. Notice that PEP 101 *does* talk about CVS branches and tags, for releases. Branches are not used much for anything else in Python. > I'm guessing 2.3 is in bugfix only mode. Correct; there will be one more beta, release candidates, and a release. > When is 2.4 tagged, and what is the timeframe on that? (the linux > kernel generally waits a while before starting the next dev branch). This is not how Python works. Immediately after 2.3 is released, 2.4 development starts. Bugs discovered in 2.3 are then fixed both on the 2.3-maint branch and HEAD (if there are any volunteers from the PBF, those patches may also get applied to the 2.2-maint branch if applicable). In Linux, a new "unstable" kernel is usually started with a major restructuring of everything, so the "unstable" code base diverges quickly from the previous release. 
This is not the case for Python - we still have a lot of code that was in 1.5. Regards, Martin From guido@python.org Fri May 30 12:39:18 2003 From: guido@python.org (Guido van Rossum) Date: Fri, 30 May 2003 07:39:18 -0400 Subject: [Python-Dev] Algoritmic Complexity Attack on Python In-Reply-To: "Your message of 29 May 2003 21:19:57 CDT." References: Message-ID: <200305301139.h4UBdIb15112@pcp02138704pcs.reston01.va.comcast.net> [Tim Peters] > > I'm uninterested in trying to "do something" about these. If > > resource-hogging is a serious potential problem in some context, then > > resource limitation is an operating system's job, and any use of Python (or > > Perl, etc) in such a context should be under the watchful eyes of OS > > subsystems that track actual resource usage. [Scott Crosby] > I disagree. Changing the hash function eliminates these attacks on > hash tables. At what cost for Python? 99.99% of all Python programs are not vulnerable to this kind of attack, because they don't take huge amounts of arbitrary input from an untrusted source. If the hash function you propose is even a *teensy* bit slower than the one we've got now (and from your description I'm sure it has to be), everybody would be paying for the solution to a problem they don't have. You keep insisting that you don't know Python. Hashing is used an awful lot in Python -- as an interpreted language, most variable lookups and all method and instance variable lookups use hashing. So this would affect every Python program. Scott, we thank you for pointing out the issue, but I think you'll be wearing out your welcome here quickly if you keep insisting that we do things your way based on the evidence you've produced so far. 
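Guido's point that hashing sits on every program's critical path is visible in the bytecode: global and attribute references compile to name-based dictionary lookups. A sketch using the modern `dis` module (opcode details differ from the 2.3-era interpreter, but the principle is the same):

```python
# Global and attribute references compile to name-based lookups, which
# is why string hashing is on the critical path of ordinary Python code
# (modern bytecode shown; 2.3-era opcodes differ in detail).
import dis
import io

def f(obj):
    return len(obj.items)       # one global lookup, one attribute lookup

buf = io.StringIO()
dis.dis(f, file=buf)
listing = buf.getvalue()
print('LOAD_GLOBAL' in listing)   # 'len' comes from the globals dict
print('LOAD_ATTR' in listing)     # 'items' comes from the instance dict
```

Interned strings cache their hash, so the per-lookup cost is dominated by dict probing rather than rehashing, but any slowdown in the string hash is still paid somewhere in nearly every program.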
--Guido van Rossum (home page: http://www.python.org/~guido/) From scrosby@cs.rice.edu Fri May 30 17:18:14 2003 From: scrosby@cs.rice.edu (Scott A Crosby) Date: 30 May 2003 11:18:14 -0500 Subject: [Python-Dev] Algoritmic Complexity Attack on Python In-Reply-To: <200305301139.h4UBdIb15112@pcp02138704pcs.reston01.va.comcast.net> References: <200305301139.h4UBdIb15112@pcp02138704pcs.reston01.va.comcast.net> Message-ID: On Fri, 30 May 2003 07:39:18 -0400, Guido van Rossum writes: > [Tim Peters] > > > I'm uninterested in trying to "do something" about these. If > > > resource-hogging is a serious potential problem in some context, then > > > resource limitation is an operating system's job, and any use of Python (or > > > Perl, etc) in such a context should be under the watchful eyes of OS > > > subsystems that track actual resource usage. > > [Scott Crosby] > > I disagree. Changing the hash function eliminates these attacks on > > hash tables. > > At what cost for Python? 99.99% of all Python programs are not > vulnerable to this kind of attack, because they don't take huge > amounts of arbitrary input from an untrusted source. If the hash > function you propose is even a *teensy* bit slower than the one we've > got now (and from your description I'm sure it has to be), everybody We included several benchmarks in our paper: On here, http://www.cs.rice.edu/~scrosby/hash/CrosbyWallach_UsenixSec2003/CacheEffects2.png when we compare hash functions as the working set changes, we notice that a single L2 cache miss far exceeds hashing time for all algorithms. On http://www.cs.rice.edu/~scrosby/hash/CrosbyWallach_UsenixSec2003/LengthEffects.png UHASH exceeds the performance of perl's hash function, which is simpler than your own. 
Even for small strings, UHASH is only about half the speed of perl's
hash function, and your function already performs a multiplication per
byte:

#define HASH(hi,ho,c)   ho = (1000003*hi) ^ c
#define HASH0(ho,c)     ho = ((c << 7)*1000003) ^ c

The difference between this and CW12 is one 32-bit modulo operation.
(Please note that CW12 is currently broken.  Fortunately it didn't
affect the benchmarking on x86.)

> would be paying for the solution to a problem they don't have.  You
> keep insisting that you don't know Python.  Hashing is used an awful
> lot in Python -- as an interpreted language, most variable lookups and
> all method and instance variable lookups use hashing.  So this would
> affect every Python program.

Have you done benchmarking to prove that string_hash is in fact an
important hotspot in the Python interpreter?  If so, and doing one
modulo operation per string is unacceptable, then you may wish to
consider Jenkins's hash.  The Linux kernel people are switching to
using a keyed variant of Jenkins's hash.  However, Jenkins's hash,
AFAIK, has no proof that it is in fact universal.  It, however,
probably is safe.  It is not unlikely that if you went that route you'd
be somewhat safer, and faster, but if you want full safety, you'd need
to go with a universal hash.

Scott

From jepler@unpythonic.net Fri May 30 21:00:21 2003
From: jepler@unpythonic.net (Jeff Epler)
Date: Fri, 30 May 2003 15:00:21 -0500
Subject: [Python-Dev] Algoritmic Complexity Attack on Python
In-Reply-To: 
References: <200305301139.h4UBdIb15112@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <20030530200021.GB30507@unpythonic.net>

On Fri, May 30, 2003 at 11:18:14AM -0500, Scott A Crosby wrote:
> On
>
> http://www.cs.rice.edu/~scrosby/hash/CrosbyWallach_UsenixSec2003/LengthEffects.png
>
> UHASH exceeds the performance of perl's hash function, which is
> simpler than your own.
I notice that you say "with strings longer than around 44-bytes, UHASH dominates all the other hash functions, due in no small part to its extensive performance tuning and *hand-coded assembly routines.*" [emphasis mine] It's all well and good for people who can run your hand-coded VAX assembly, but when Intel's 80960* comes out and people start running Unix on it, won't they be forced to code your hash function all over again? Since everybody has hopes for Python beyond the VAX (heck, in 20 years VAX might have as little as 5% of the market -- anything could happen) there has been a conscious decision not to hand-code anything in assembly in Python. Jeff * The Intel 80960, in case you haven't heard of it, is a superscalar processor that will require highly-tuned compilers and will run like a bat out of hell when the code is tuned right. I think it's capable of one floating-point and two integer instructions per cycle! From guido@python.org Fri May 30 21:35:53 2003 From: guido@python.org (Guido van Rossum) Date: Fri, 30 May 2003 16:35:53 -0400 Subject: [Python-Dev] KeyboardInterrupt on Windows Message-ID: <200305302035.h4UKZr220087@pcp02138704pcs.reston01.va.comcast.net> I received this problem report (Kurt is the IDLEFORK developer). Does anybody know what could be the matter here? What changed recently??? --Guido van Rossum (home page: http://www.python.org/~guido/) ------- Forwarded Message Date: Fri, 30 May 2003 15:50:15 -0400 From: kbk@shore.net (Kurt B. Kaiser) To: Guido van Rossum Subject: KeyboardInterrupt I find that while 1: pass doesn't respond to a KeyboardInterrupt on Python2.3b1 on either WinXP or W2K. Is this generally known? I couldn't find any mention of it. while 1: a = 0 is fine on 2.3b1, and both work on Python2.2.
- -- KBK ------- End of Forwarded Message From tim@zope.com Fri May 30 21:41:03 2003 From: tim@zope.com (Tim Peters) Date: Fri, 30 May 2003 16:41:03 -0400 Subject: [Python-Dev] KeyboardInterrupt on Windows In-Reply-To: <200305302035.h4UKZr220087@pcp02138704pcs.reston01.va.comcast.net> Message-ID: [Guido] > I received this problem report (Kurt is the IDLEFORK developer). Does > anybody know what could be the matter here? What changed recently??? Looks like eval-loop optimizations. The first version essentially compiles to a JUMP_ABSOLUTE to itself

    >>   10 JUMP_ABSOLUTE           10

and

    case JUMP_ABSOLUTE:
        JUMPTO(oparg);
        goto fast_next_opcode;

This skips the ticker checks, so never checks for interrupts. As usual, I expect we can blame Raymond Hettinger's good intentions. > ------- Forwarded Message > > Date: Fri, 30 May 2003 15:50:15 -0400 > From: kbk@shore.net (Kurt B. Kaiser) > To: Guido van Rossum > Subject: KeyboardInterrupt > > I find that > > while 1: pass > > doesn't respond to a KeyboardInterrupt on Python2.3b1 on either > WinXP or W2K. Is this generally known? I couldn't find any mention > of it. > > while 1: a = 0 > > is fine on 2.3b1, and both work on Python2.2. > > - -- > KBK > > ------- End of Forwarded Message From neal@metaslash.com Fri May 30 21:40:01 2003 From: neal@metaslash.com (Neal Norwitz) Date: Fri, 30 May 2003 16:40:01 -0400 Subject: [Python-Dev] KeyboardInterrupt on Windows In-Reply-To: <200305302035.h4UKZr220087@pcp02138704pcs.reston01.va.comcast.net> References: <200305302035.h4UKZr220087@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <20030530204000.GH27502@epoch.metaslash.com> On Fri, May 30, 2003 at 04:35:53PM -0400, Guido van Rossum wrote: > I received this problem report (Kurt is the IDLEFORK developer). Does > anybody know what could be the matter here? What changed recently??? > while 1: pass > > doesn't respond to a KeyboardInterrupt on Python2.3b1 on either > WinXP or W2K.
Could this be from the optimization Raymond did:

>>> def f():
...     while 1: pass
>>> dis.dis(f)
  2           0 SETUP_LOOP              12 (to 15)
              3 JUMP_FORWARD             4 (to 10)
              6 JUMP_IF_FALSE            4 (to 13)
              9 POP_TOP
        >>   10 JUMP_ABSOLUTE           10
        >>   13 POP_TOP
             14 POP_BLOCK
        >>   15 LOAD_CONST               0 (None)
             18 RETURN_VALUE

3 jumps to 10, 10 jumps to itself unless I'm reading this wrong. See Python/compile.c::optimize_code (starting around line 339) Neal From jeremy@zope.com Fri May 30 21:43:28 2003 From: jeremy@zope.com (Jeremy Hylton) Date: 30 May 2003 16:43:28 -0400 Subject: [Python-Dev] KeyboardInterrupt on Windows In-Reply-To: <200305302035.h4UKZr220087@pcp02138704pcs.reston01.va.comcast.net> References: <200305302035.h4UKZr220087@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <1054327408.2917.9.camel@slothrop.zope.com> It's probably an unintended consequence of the "while 1" optimization and the fast-next-opcode optimization. "while 1" doesn't do a test at runtime anymore. And opcodes like JUMP_ABSOLUTE bypass the test for pending exceptions. The net result is that while 1: pass puts the interpreter in a tight loop doing a JUMP_ABSOLUTE that goes nowhere. That is, offset X has JUMP_ABSOLUTE X. I'd be inclined to call this a bug, but I'm not sure how to fix it. Jeremy From neal@metaslash.com Fri May 30 22:04:30 2003 From: neal@metaslash.com (Neal Norwitz) Date: Fri, 30 May 2003 17:04:30 -0400 Subject: [Python-Dev] KeyboardInterrupt on Windows In-Reply-To: <20030530204000.GH27502@epoch.metaslash.com> References: <200305302035.h4UKZr220087@pcp02138704pcs.reston01.va.comcast.net> <20030530204000.GH27502@epoch.metaslash.com> Message-ID: <20030530210430.GI27502@epoch.metaslash.com> On Fri, May 30, 2003 at 04:40:00PM -0400, Neal Norwitz wrote: > On Fri, May 30, 2003 at 04:35:53PM -0400, Guido van Rossum wrote: > > I received this problem report (Kurt is the IDLEFORK developer). Does > > anybody know what could be the matter here? What changed recently???
> > > while 1: pass > > > > doesn't respond to a KeyboardInterrupt on Python2.3b1 on either > > WinXP or W2K. The patch below fixes the problem by not optimizing while 1:pass. Seems kinda hacky though. Neal --

Index: Python/compile.c
===================================================================
RCS file: /cvsroot/python/python/dist/src/Python/compile.c,v
retrieving revision 2.289
diff -w -u -r2.289 compile.c
--- Python/compile.c	22 May 2003 22:00:04 -0000	2.289
+++ Python/compile.c	30 May 2003 21:02:26 -0000
@@ -411,6 +411,8 @@
 	tgttgt -= i + 3;	/* Calc relative jump addr */
 	if (tgttgt < 0)		/* No backward relative jumps */
 		continue;
+	if (i == tgttgt && opcode == JUMP_ABSOLUTE)
+		goto exitUnchanged;
 	codestr[i] = opcode;
 	SETARG(codestr, i, tgttgt);
 	break;

From barry@python.org Fri May 30 22:09:27 2003 From: barry@python.org (Barry Warsaw) Date: 30 May 2003 17:09:27 -0400 Subject: [Python-Dev] I'm tagging the Python 2.2.3 tree Message-ID: <1054328967.13804.33.camel@geddy> No more changes please. -Barry From mwh@python.net Fri May 30 22:22:32 2003 From: mwh@python.net (Michael Hudson) Date: Fri, 30 May 2003 22:22:32 +0100 Subject: [Python-Dev] KeyboardInterrupt on Windows In-Reply-To: <1054327408.2917.9.camel@slothrop.zope.com> (Jeremy Hylton's message of "30 May 2003 16:43:28 -0400") References: <200305302035.h4UKZr220087@pcp02138704pcs.reston01.va.comcast.net> <1054327408.2917.9.camel@slothrop.zope.com> Message-ID: <2mvfvsf7tz.fsf@starship.python.net> Jeremy Hylton writes: > It's probably an unintended consequence of the "while 1" optimization > and the fast-next-opcode optimization. "while 1" doesn't do a test at > runtime anymore. And opcodes like JUMP_ABSOLUTE bypass the test for > pending exceptions. The net result is that while 1: pass puts the > interpreter in a tight loop doing a JUMP_ABSOLUTE that goes nowhere. > That is, offset X has JUMP_ABSOLUTE X. > > I'd be inclined to call this a bug, but I'm not sure how to fix it.
Take out the while 1: optimizations? I don't want to belittle Raymond's efforts, but I am conscious of[1] Tim's repeated observations of the correlation between the number of optimizations in the compiler and the number of weird bugs therein. Cheers, M. [1] I'm also warming up for an end-of-PyPy-sprint drunken hacking session so you probably shouldn't take me too seriously :-) -- ARTHUR: Why should a rock hum? FORD: Maybe it feels good about being a rock. -- The Hitch-Hikers Guide to the Galaxy, Episode 8 From python@rcn.com Fri May 30 22:23:47 2003 From: python@rcn.com (Raymond Hettinger) Date: Fri, 30 May 2003 17:23:47 -0400 Subject: [Python-Dev] KeyboardInterrupt on Windows References: <200305302035.h4UKZr220087@pcp02138704pcs.reston01.va.comcast.net> <20030530204000.GH27502@epoch.metaslash.com> <20030530210430.GI27502@epoch.metaslash.com> Message-ID: <004401c326f1$c2391300$125ffea9@oemcomputer> > On Fri, May 30, 2003 at 04:40:00PM -0400, Neal Norwitz wrote: > > On Fri, May 30, 2003 at 04:35:53PM -0400, Guido van Rossum wrote: > > > I received this problem report (Kurt is the IDLEFORK developer). Does > > > anybody know what could be the matter here? What changed recently??? > > > > > while 1: pass > > > > > > doesn't respond to a KeyboardInterrupt on Python2.3b1 on either > > > WinXP or W2K. > > The patch below fixes the problem by not optimizing while 1:pass. That looks like a good fix to me.
There are two other ways:

* disable the goto fast_next_opcode for JUMP_ABSOLUTE
* disable the byte optimization for a jump-to-a-jump

Raymond From scrosby@cs.rice.edu Fri May 30 23:02:51 2003 From: scrosby@cs.rice.edu (Scott A Crosby) Date: 30 May 2003 17:02:51 -0500 Subject: [Python-Dev] Algoritmic Complexity Attack on Python In-Reply-To: <20030530200021.GB30507@unpythonic.net> References: <200305301139.h4UBdIb15112@pcp02138704pcs.reston01.va.comcast.net> <20030530200021.GB30507@unpythonic.net> Message-ID: On Fri, 30 May 2003 15:00:21 -0500, Jeff Epler writes: > On Fri, May 30, 2003 at 11:18:14AM -0500, Scott A Crosby wrote: > > On > > > > http://www.cs.rice.edu/~scrosby/hash/CrosbyWallach_UsenixSec2003/LengthEffects.png > > > > UHASH exceeds the performance of perl's hash function, which is > > simpler than your own. > > I notice that you say "with strings longer than around 44-bytes, > UHASH dominates all the other hash functions, due in no small part to its > extensive performance tuning and *hand-coded assembly routines.*" > [emphasis mine] It's all well and good for people who can run your We benchmarked it, and without assembly optimizations, uhash still exceeds perl. Also please note that we did not create uhash. We merely used it as a high performance universal hash which we could cite and benchmark. Freshly computed raw benchmarks on a P2-450 are at the end of this email. Looking at them now. I think we may have slightly erred and used the non-assembly version of the hash in constructing the graphs, because the crossover point compared to perl looks to be about 20 bytes with assembly, and about 48 without. Roughly, they show that uhash is about half the speed on a P2-450 without assembly. I do not have benchmarks on other platforms to compare it to. However, CW is known to be about 10 times worse, relatively, than jenkin's on a SPARC.
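For readers unfamiliar with the term being debated here: a universal hash family draws the function at random from a keyed family, so no fixed input set can be made to collide badly except by luck. A minimal multiply-mod-prime sketch of the idea (an illustration only — not the CW12 or UHASH constructions benchmarked in the paper, which are engineered for speed on byte strings):

```python
import random

P = (1 << 61) - 1  # a Mersenne prime, comfortably above 32-bit keys

def make_hash(m):
    """Draw h(x) = ((a*x + b) mod P) mod m from a Carter-Wegman family.

    For randomly chosen a and b, any two distinct keys x != y collide
    with probability about 1/m, no matter how an adversary picks them.
    """
    a = random.randrange(1, P)
    b = random.randrange(0, P)
    return lambda x: ((a * x + b) % P) % m

h = make_hash(1024)  # one randomly keyed hash into 1024 buckets
```

The per-key cost is the multiply plus the mod-by-a-prime that the thread is arguing about.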
The python community will have to judge whether the performance difference of the current hash is worth the risk of the attack. Also, I'd like to thank Tim Peters for telling me about the potential of degradation that regular expressions may offer. Scott

Time benchmarking actual hash (including benchmarking overhead) with a working set size of 12kb.

Time(perl-5.8.0): 12.787 Mbytes/sec with string length 4, buf 12000
Time(uh_cw-1024): 6.010 Mbytes/sec with string length 4, buf 12000
Time(python): 14.952 Mbytes/sec with string length 4, buf 12000
Time(test32out_uhash): 4.584 Mbytes/sec with string length 4, buf 12000
Time(test32out_assembly_uhash): 6.014 Mbytes/sec with string length 4, buf 12000

Time(perl-5.8.0): 29.125 Mbytes/sec with string length 16, buf 12000
Time(uh_cw-1024): 11.898 Mbytes/sec with string length 16, buf 12000
Time(python): 36.445 Mbytes/sec with string length 16, buf 12000
Time(test32out_uhash): 19.169 Mbytes/sec with string length 16, buf 12000
Time(test32out_assembly_uhash): 25.660 Mbytes/sec with string length 16, buf 12000

Time(perl-5.8.0): 45.440 Mbytes/sec with string length 64, buf 12000
Time(uh_cw-1024): 16.168 Mbytes/sec with string length 64, buf 12000
Time(python): 62.213 Mbytes/sec with string length 64, buf 12000
Time(test32out_uhash): 71.396 Mbytes/sec with string length 64, buf 12000
Time(test32out_assembly_uhash): 106.873 Mbytes/sec with string length 64, buf 12000

Time benchmarking actual hash (including benchmarking overhead) with a working set size of 6mb.
Time(perl-5.8.0): 8.099 Mbytes/sec with string length 4, buf 6000000
Time(uh_cw-1024): 4.660 Mbytes/sec with string length 4, buf 6000000
Time(python): 8.840 Mbytes/sec with string length 4, buf 6000000
Time(test32out_uhash): 3.932 Mbytes/sec with string length 4, buf 6000000
Time(test32out_assembly_uhash): 4.859 Mbytes/sec with string length 4, buf 6000000

Time(perl-5.8.0): 20.878 Mbytes/sec with string length 16, buf 6000000
Time(uh_cw-1024): 9.964 Mbytes/sec with string length 16, buf 6000000
Time(python): 24.450 Mbytes/sec with string length 16, buf 6000000
Time(test32out_uhash): 16.168 Mbytes/sec with string length 16, buf 6000000
Time(test32out_assembly_uhash): 19.929 Mbytes/sec with string length 16, buf 6000000

Time(perl-5.8.0): 35.265 Mbytes/sec with string length 64, buf 6000000
Time(uh_cw-1024): 14.400 Mbytes/sec with string length 64, buf 6000000
Time(python): 46.650 Mbytes/sec with string length 64, buf 6000000
Time(test32out_uhash): 48.719 Mbytes/sec with string length 64, buf 6000000
Time(test32out_assembly_uhash): 63.523 Mbytes/sec with string length 64, buf 6000000

From nas@python.ca Fri May 30 23:22:04 2003 From: nas@python.ca (Neil Schemenauer) Date: Fri, 30 May 2003 15:22:04 -0700 Subject: [Python-Dev] KeyboardInterrupt on Windows In-Reply-To: <1054327408.2917.9.camel@slothrop.zope.com> References: <200305302035.h4UKZr220087@pcp02138704pcs.reston01.va.comcast.net> <1054327408.2917.9.camel@slothrop.zope.com> Message-ID: <20030530222204.GB404@glacier.arctrix.com> Jeremy Hylton wrote: > I'd be inclined to call this a bug, but I'm not sure how to fix it. I think the right fix is to make JUMP_ABSOLUTE not bypass the test for pending exceptions. We have to be really careful with using fast_next_opcode. Originally it was only used by SET_LINENO, LOAD_FAST, LOAD_CONST, STORE_FAST, POP_TOP. Using it from jump opcodes is asking for trouble, IMHO. Shall I prepare a patch?
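Neil's diagnosis can be modeled in a few lines. In this sketch (names are illustrative, not ceval.c's), a jump opcode that takes the fast path skips the periodic pending-work check, so a jump-to-itself never notices the interrupt flag:

```python
def run(code, fast_jumps, pending, max_steps=1000):
    """Toy dispatch loop: returns step count, or raises on interrupt."""
    pc = 0
    for steps in range(1, max_steps + 1):
        op, arg = code[pc]
        # the "ticker" check normally done between opcodes
        if not (op == "JUMP" and fast_jumps):
            if pending["interrupt"]:
                raise KeyboardInterrupt
        pc = arg if op == "JUMP" else pc + 1
    return steps  # fuel exhausted: the interrupt was never seen

loop = [("JUMP", 0)]  # 'while 1: pass' compiles to a jump to itself
```

With fast_jumps=True, run(loop, True, {"interrupt": True}) spins until its fuel runs out; with fast_jumps=False it raises immediately — the difference in behavior Kurt reported.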
Neil From python@rcn.com Fri May 30 23:29:50 2003 From: python@rcn.com (Raymond Hettinger) Date: Fri, 30 May 2003 18:29:50 -0400 Subject: [Python-Dev] KeyboardInterrupt on Windows References: <200305302035.h4UKZr220087@pcp02138704pcs.reston01.va.comcast.net> <20030530204000.GH27502@epoch.metaslash.com> <20030530210430.GI27502@epoch.metaslash.com> Message-ID: <009501c326fa$fc8bb680$125ffea9@oemcomputer> > The patch below fixes the problem by not optimizing while 1:pass. > Seems kinda hacky though. > > Neal My version of the patch and a testcase is on SF at: www.python.org/sf/746376 if anyone wants to take a look. While we're focused on the compiler, there is a nasty one still outstanding that relates to the fast_function() optimization: www.python.org/sf/733667 Raymond Hettinger From python@rcn.com Fri May 30 23:47:47 2003 From: python@rcn.com (Raymond Hettinger) Date: Fri, 30 May 2003 18:47:47 -0400 Subject: [Python-Dev] KeyboardInterrupt on Windows References: <200305302035.h4UKZr220087@pcp02138704pcs.reston01.va.comcast.net> <1054327408.2917.9.camel@slothrop.zope.com> <20030530222204.GB404@glacier.arctrix.com> Message-ID: <00ab01c326fd$7e584000$125ffea9@oemcomputer> [Neil S] > I think the right fix is to make JUMP_ABSOLUTE not bypass the test for > pending exceptions. Yes. That's the correct fix because it handles all cases including:

    while 1: x=1

Please go ahead and patch it up. Raymond From tim.one@comcast.net Sat May 31 00:07:38 2003 From: tim.one@comcast.net (Tim Peters) Date: Fri, 30 May 2003 19:07:38 -0400 Subject: [Python-Dev] Algoritmic Complexity Attack on Python In-Reply-To: Message-ID: [Scott Crosby] > ... > Have you done benchmarking to prove that string_hash is in fact an > important hotspot in the python interpreter? It depends on the specific application, of course. The overall speed of dicts is crucial, but in many "namespace" dict apps the strings are interned, implying (among other things) that a hash code is computed only once per string.
In those apps the speed of the string hash function isn't so important. Overall, though, I'd welcome a faster string hash, and I agree that Python's isn't particularly zippy. OTOH, it's just a couple lines of C that run fine on everything from Palm Pilots to Crays, and have created no portability or maintenance headaches. Browsing your site, you've got over 100KB of snaky C code to implement hashing functions, some with known bugs, others with cautions about open questions wrt platforms with different endianness and word sizes than the code was initially tested on. Compared to what Python is using now, that's a maintenance nightmare. Note that Python's hash API doesn't return 32 bits, it returns a hash code of the same size as the native C long. The multiplication gimmick doesn't require any pain to do that. Other points that arise in practical deployment: + Python dicts can be indexed by many kinds of immutable objects. Strings are just one kind of key, and Python has many hash functions. + If I understand what you're selling, the hash code of a given string will almost certainly change across program runs. That's a very visible change in semantics, since hash() is a builtin Python function available to user code. Some programs use hash codes to index into persistent (file- or database- based) data structures, and such code would plain break if the hash code of a string changed from one run to the next. I expect the user-visible hash() would have to continue using a predictable function. + Some Python apps run for months, and universal hashing doesn't remove the possibility of quadratic-time behavior. If I can poke at a long-running app and observe its behavior, over time I can deduce a set of keys that collide badly for any hashing scheme fixed when the program starts. In that sense I don't believe this gimmick wholly plugs the hole it's trying to plug. > If so, and doing one modulo operation per string is unacceptable, If it's mod by a prime, probably. 
Some architectures Python runs on require hundreds of cycles to do an integer mod, and we don't have the resources to construct custom mod-by-an-int shortcut code for dozens of obscure architectures. > then you may wish to consider Jenkin's hash. The linux kernel people > are switching to using a keyed veriant of Jenkin's hash. However, > Jenkin's, AFAIK, has no proofs that it is in fact universal. It, > however, probably is safe. Nobody writing a Python program *has* to use a dict. That dicts have quadratic-time worst-case behavior isn't hidden, and there's no cure for that short of switching to a data structure with better worst-case bounds. I certainly agree it's something for programmers to be aware of. I still don't see any reason for the core language to care about this, though. From scrosby@cs.rice.edu Sat May 31 00:56:51 2003 From: scrosby@cs.rice.edu (Scott A Crosby) Date: 30 May 2003 18:56:51 -0500 Subject: [Python-Dev] Algoritmic Complexity Attack on Python In-Reply-To: References: Message-ID: On Fri, 30 May 2003 19:07:38 -0400, Tim Peters writes: > [Scott Crosby] > > ... > > Have you done benchmarking to prove that string_hash is in fact an > > important hotspot in the python interpreter? > > It depends on the specific application, of course. The overall speed of > dicts is crucial, but in many "namespace" dict apps the strings are > interned, implying (among other things) that a hash code is computed only > once per string. In those apps the speed of the string hash function isn't > so important. Overall, though, I'd welcome a faster string hash, and I > agree that Python's isn't particularly zippy. Actually, at least on x86, it is faster than perl. On other platforms, it may be somewhat slower. > OTOH, it's just a couple lines of C that run fine on everything from Palm > Pilots to Crays, and have created no portability or maintenance headaches. 
> Browsing your site, you've got over 100KB of snaky C code to implement > hashing functions, some with known bugs, others with cautions about open > questions wrt platforms with different endianness and word sizes than the > code was initially tested on. Compared to what Python is using now, that's > a maintenance nightmare. Yes, I am aware of the problems with the UHASH code. Unfortunately, I am not a hash function designer, that code is not mine, and I only use it as a black box. I also consider all code, until verified otherwise, to potentially suffer from endianness, alignment, and 32/64 bit issues. Excluding alignment issues (which I'm not sure whether to say that its OK to fail on strange alignments or not) it has passed *my* self-tests on big endian and 64 bit. > + Python dicts can be indexed by many kinds of immutable objects. > Strings are just one kind of key, and Python has many hash functions. > > + If I understand what you're selling, the hash code of a given string > will almost certainly change across program runs. That's a very > visible change in semantics, since hash() is a builtin Python > function available to user code. Some programs use hash codes to > index into persistent (file- or database- based) data structures, and > such code would plain break if the hash code of a string changed > from one run to the next. I expect the user-visible hash() would have > to continue using a predictable function. The hash has to be keyed upon something. It is possible to store the key in a file and reuse the same one across all runs. However, depending on the universal hash function used, leaking pairs of (input,hash(input)) may allow an attacker to determine the secret key, and allow attack again. But yeah, preserving these semantics becomes very messy. The hash-key becomes part of the system state that must be migrated along with other data that depends on it. 
> + Some Python apps run for months, and universal hashing doesn't remove > the possibility of quadratic-time behavior. If I can poke at a > long-running app and observe its behavior, over time I can deduce a I argued on linux-kernel with someone else that this was extremely unlikely. It requires the latency of a collision/non-collision being noticeable over a noisy network stack and system. In almost all cases, for short inputs, the cost of a single L2 cache miss far exceeds that of hashing. A more serious danger is an application that leaks actual hash values. > If it's mod by a prime, probably. I'd benchmark it in practice, microbenchmarking on a sparc says that it is rather expensive. However, on an X86, the cost of an L2 cache miss exceeds the cost of hashing a small string. You'd have a better idea what impact this might have on the total runtime of the system in the worst case. > > then you may wish to consider Jenkin's hash. The linux kernel people > > are switching to using a keyed variant of Jenkin's hash. However, > > Jenkin's, AFAIK, has no proofs that it is in fact universal. It, > > however, probably is safe. > > Nobody writing a Python program *has* to use a dict. That dicts have > quadratic-time worst-case behavior isn't hidden, and there's no cure for Agreed, many have realized over the years that hash tables can have quadratic behavior in an adversarial environment. It isn't hidden. Cormen, Leiserson, and Rivest even warn about this in their seminal algorithms textbook in 1991. It *is* obvious when thought of, but the reason I was able to ship out so many vulnerability reports yesterday was because few actually *have* thought of that deterministic worst-case when writing their programs. I predict this trend will continue. I like hash tables a lot; with UH, their time bounds are randomized but pretty tight, and the constant factors handily beat those of balanced binary trees.
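The quadratic blow-up under discussion is easy to reproduce with a toy chained table (a simulation of the effect only — a real attack needs strings that actually collide under the target's hash function, as in the paper):

```python
def insert_all(keys, hash_fn, nbuckets=64):
    """Insert keys into a chained hash table; return total probe count."""
    buckets = [[] for _ in range(nbuckets)]
    probes = 0
    for k in keys:
        chain = buckets[hash_fn(k) % nbuckets]
        probes += len(chain) + 1  # walk the chain, then append
        chain.append(k)
    return probes

n = 1000
spread = insert_all(range(n), hash)          # ordinary behavior
clogged = insert_all(range(n), lambda k: 0)  # adversarial: all collide
# clogged is 1 + 2 + ... + n = n*(n+1)//2 probes: quadratic in n
```

Doubling n roughly quadruples the adversarial cost, which is the scaling Scott measures for perl and python above.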
Scott From guido@python.org Sat May 31 01:41:54 2003 From: guido@python.org (Guido van Rossum) Date: Fri, 30 May 2003 20:41:54 -0400 Subject: [Python-Dev] Algoritmic Complexity Attack on Python In-Reply-To: "Your message of Fri, 30 May 2003 19:07:38 EDT." References: Message-ID: <200305310041.h4V0fsF20796@pcp02138704pcs.reston01.va.comcast.net> > + If I understand what you're selling, the hash code of a given string > will almost certainly change across program runs. That's a very > visible change in semantics, since hash() is a builtin Python > function available to user code. Some programs use hash codes to > index into persistent (file- or database- based) data structures, and > such code would plain break if the hash code of a string changed > from one run to the next. I expect the user-visible hash() would have > to continue using a predictable function. Of course, such programs are already vulnerable to changes in the hash implementation between Python versions (which has happened before). --Guido van Rossum (home page: http://www.python.org/~guido/) From barry@python.org Sat May 31 04:11:45 2003 From: barry@python.org (Barry Warsaw) Date: 30 May 2003 23:11:45 -0400 Subject: [Python-Dev] RELEASED Python 2.2.3 (final) Message-ID: <1054350704.14645.16.camel@geddy> I'm happy to announce the release of Python 2.2.3 (final). This is a bug fix release for the stable Python 2.2 code line. It contains more than 40 bug fixes and memory leak patches since Python 2.2.2, and all Python 2.2 users are encouraged to upgrade. The new release is available here: http://www.python.org/2.2.3/ For full details, see the release notes at http://www.python.org/2.2.3/NEWS.txt There are a small number of minor incompatibilities with Python 2.2.2; for details see: http://www.python.org/2.2.3/bugs.html Perhaps the most important is that the Bastion.py and rexec.py modules have been disabled, since we do not deem them to be safe. 
As usual, a Windows installer and a Unix/Linux source tarball are made available. The documentation has been updated as well, and is available both on-line and in many different formats. At the moment, no Mac version or Linux RPMs are available, although I expect them to appear soon. On behalf of Guido, I'd like to thank everyone who contributed to this release, and who continue to ensure Python's success. Enjoy, -Barry From jepler@unpythonic.net Sat May 31 14:05:06 2003 From: jepler@unpythonic.net (Jeff Epler) Date: Sat, 31 May 2003 08:05:06 -0500 Subject: [Python-Dev] Algoritmic Complexity Attack on Python In-Reply-To: <200305310041.h4V0fsF20796@pcp02138704pcs.reston01.va.comcast.net> References: <200305310041.h4V0fsF20796@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <20030531130503.GA16185@unpythonic.net> On Fri, May 30, 2003 at 08:41:54PM -0400, Guido van Rossum wrote: > Of course, such programs are already vulnerable to changes in the hash > implementation between Python versions (which has happened before). Is there at least a guarantee that the hashing algorithm won't change in a bugfix release? For instance, can I depend that python222 -c 'print hash(1), hash("a")' python223 -c 'print hash(1), hash("a")' will both output the same thing, even if python23 -c 'print hash(1), hash("a")' and python3000 -c 'print hash(1), hash("a")' may print something different? Jeff From pje@telecommunity.com Sat May 31 14:17:16 2003 From: pje@telecommunity.com (Phillip J. 
Eby) Date: Sat, 31 May 2003 09:17:16 -0400 Subject: [Python-Dev] Algoritmic Complexity Attack on Python In-Reply-To: References: <20030530200021.GB30507@unpythonic.net> <200305301139.h4UBdIb15112@pcp02138704pcs.reston01.va.comcast.net> <20030530200021.GB30507@unpythonic.net> Message-ID: <5.1.0.14.0.20030531090628.02567ad0@mail.telecommunity.com> At 05:02 PM 5/30/03 -0500, Scott A Crosby wrote: >The python community will have to judge whether the performance >difference of the current hash is worth the risk of the attack. Note that the "community" doesn't really have to judge. An individual developer can, if they have an application they deem vulnerable, do something like this:

class SafeString(str):
    def __hash__(self):
        # code to return a hash code

safe = SafeString(string_from_untrusted_source)

and then use only these "safe" strings as keys for a given dictionary. Or, with a little more work, they can subclass 'dict' and make a dictionary that converts its keys to "safe" strings. As far as current vulnerability goes, I'd say that the most commonly available attack point for this would be CGI programs that accept POST operations. A POST can supply an arbitrarily large number of form field keys. If you can show that the Python 'cgi' module is vulnerable to such an attack, in a dramatic disproportion to the size of the data transmitted (since obviously it's as much of a DoS to flood a script with a large quantity of data), then it might be worth making changes to the 'cgi' module, or at least warning the developers of alternatives to CGI (e.g. Zope, Quixote, SkunkWeb, CherryPy, etc.) that alternate hashes might be a good idea. But based on the discussion so far, I'm not sure I see how this attack would produce an effect that was dramatically disproportionate to the amount of data transmitted.
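One hypothetical way to fill in the __hash__ body in Phillip's sketch is to key a cryptographic digest with a per-process secret (a modern-Python illustration, noticeably slower than the built-in hash; as Tim notes elsewhere in the thread, any hash codes persisted to disk would need the key persisted too):

```python
import hashlib
import os

_SECRET = os.urandom(16)  # fresh each run, so attackers can't
                          # precompute colliding inputs offline

class SafeString(str):
    """str subclass whose hash is keyed on a per-process secret."""
    def __hash__(self):
        digest = hashlib.sha1(_SECRET + self.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "little", signed=True)

d = {SafeString("spam"): 1}
```

Equal strings still hash equal, so dict semantics are preserved — provided every key going into a given dictionary is wrapped, which is exactly the discipline Phillip describes.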
From dave@boost-consulting.com Sat May 31 15:35:27 2003 From: dave@boost-consulting.com (David Abrahams) Date: Sat, 31 May 2003 10:35:27 -0400 Subject: [Python-Dev] more-precise instructions for "Python.h first"? Message-ID: Boost.Python is now trying hard to accommodate the "Python.h before system headers rule". Unfortunately, we still need a wrapper around Python.h, at least for some versions of Python, so that we can work around some issues like:

//
// Python's LongObject.h helpfully #defines ULONGLONG_MAX for us
// even when it's not defined by the system which confuses Boost's
// config
//

To cope with that correctly, we need to see (a system header) before longobject.h. Currently, we're including , then , well, and then the wrapper gets a little complicated adjusting for various compilers. Anyway, the point is that I'd like to have the rule changed to "You have to include Python.h or xxxx.h before any system header" where xxxx.h is one of the other existing headers #included in Python.h that is responsible for setting up whatever macros cause this inclusion-order requirement in the first place (preferably not LongObject.h!) That way I might be able to get those configuration issues sorted out without violating the #inclusion order rule. What I have now seems to work, but I'd rather do the right thing (TM). -- Dave Abrahams Boost Consulting www.boost-consulting.com From scrosby@cs.rice.edu Sat May 31 16:48:28 2003 From: scrosby@cs.rice.edu (Scott A Crosby) Date: 31 May 2003 10:48:28 -0500 Subject: [Python-Dev] Algoritmic Complexity Attack on Python In-Reply-To: <5.1.0.14.0.20030531090628.02567ad0@mail.telecommunity.com> References: <20030530200021.GB30507@unpythonic.net> <200305301139.h4UBdIb15112@pcp02138704pcs.reston01.va.comcast.net> <20030530200021.GB30507@unpythonic.net> <5.1.0.14.0.20030531090628.02567ad0@mail.telecommunity.com> Message-ID: On Sat, 31 May 2003 09:17:16 -0400, "Phillip J.
Eby" writes: > At 05:02 PM 5/30/03 -0500, Scott A Crosby wrote: > But based on the discussion so far, I'm not sure I see how this attack > would produce an effect that was dramatically disproportionate to the > amount of data transmitted. I apologize for not having this available earlier, but a corrected file of 10,000 inputs is now available and shows the behavior I claimed. (Someone else independently reimplemented the attack and has sent me a corrected set for python.) With 10,000 inputs, python requires 19 seconds to process instead of .2 seconds. A file of half the size requires 4 seconds, showing the quadratic behavior, as with the case of perl. (Benchmarked on a P2-450) I thus predict that twice the inputs would take about 80 seconds. I can only guess what python applications might experience an interesting impact from this, so I'll be silent. However, here are the concrete benchmarks. Scott From martin@v.loewis.de Sat May 31 17:28:25 2003 From: martin@v.loewis.de (Martin v. Löwis) Date: 31 May 2003 18:28:25 +0200 Subject: [Python-Dev] more-precise instructions for "Python.h first"? In-Reply-To: References: Message-ID: David Abrahams writes: > Anyway, the point is that I'd like to have the rule changed to "You > have to include Python.h or xxxx.h before any system header" where > xxxx.h is one of the other existing headers #included in Python.h that > is responsible for setting up whatever macros cause this > inclusion-order requirement in the first place (preferably not > LongObject.h!) If I understand correctly, you want to follow the rule "I want to change things as long as it continues to work for me". For that, you don't need any permission. If it works for you, you can ignore any rules you feel uncomfortable with. The rule is there for people who don't want to understand the specific details of system configuration. If you manage to get a consistent configuration in a different way, just go for it.
You should make sure then that your users can't run into problems, though. Regards, Martin From guido@python.org Sat May 31 17:55:21 2003 From: guido@python.org (Guido van Rossum) Date: Sat, 31 May 2003 12:55:21 -0400 Subject: [Python-Dev] Algoritmic Complexity Attack on Python In-Reply-To: "Your message of Sat, 31 May 2003 08:05:06 CDT." <20030531130503.GA16185@unpythonic.net> References: <200305310041.h4V0fsF20796@pcp02138704pcs.reston01.va.comcast.net> <20030531130503.GA16185@unpythonic.net> Message-ID: <200305311655.h4VGtLk21998@pcp02138704pcs.reston01.va.comcast.net> > On Fri, May 30, 2003 at 08:41:54PM -0400, Guido van Rossum wrote: > > Of course, such programs are already vulnerable to changes in the hash > > implementation between Python versions (which has happened before). > > Is there at least a guarantee that the hashing algorithm won't change in a > bugfix release? For instance, can I depend that > python222 -c 'print hash(1), hash("a")' > python223 -c 'print hash(1), hash("a")' > will both output the same thing, even if > python23 -c 'print hash(1), hash("a")' > and > python3000 -c 'print hash(1), hash("a")' > may print something different? That's a reasonable assumption, yes. We realize that changing the hash algorithm is a feature change, even if it is a very subtle one. --Guido van Rossum (home page: http://www.python.org/~guido/) From jepler@unpythonic.net Sat May 31 18:42:17 2003 From: jepler@unpythonic.net (Jeff Epler) Date: Sat, 31 May 2003 12:42:17 -0500 Subject: [Python-Dev] Algoritmic Complexity Attack on Python In-Reply-To: References: <20030530200021.GB30507@unpythonic.net> <200305301139.h4UBdIb15112@pcp02138704pcs.reston01.va.comcast.net> <20030530200021.GB30507@unpythonic.net> <5.1.0.14.0.20030531090628.02567ad0@mail.telecommunity.com> Message-ID: <20030531174214.GA18222@unpythonic.net> On Sat, May 31, 2003 at 10:48:28AM -0500, Scott A Crosby wrote: > On Sat, 31 May 2003 09:17:16 -0400, "Phillip J. 
Eby" writes: > > > At 05:02 PM 5/30/03 -0500, Scott A Crosby wrote: > > > But based on the discussion so far, I'm not sure I see how this attack > > would produce an effect that was dramatically disproportionate to the > > amount of data transmitted. > > I apologize for not having this available earlier, but a corrected > file of 10,000 inputs is now available and shows the behavior I > claimed. (Someone else independently reimplemented the attack and has > sent me a corrected set for python.) With 10,000 inputs, python > requires 19 seconds to process instead of .2 seconds. A file of half > the size requires 4 seconds, showing the quadratic behavior, as with > the case of perl. (Benchmarked on a P2-450) I thus predict that twice > the inputs would take about 80 seconds. > > I can only guess what python applications might experience an > interesting impact from this, so I'll be silent. However, here are the > concrete benchmarks. the CGI module was mentioned earlier as a possible "problem area" for this attack, I wrote a script that demonstrates this, using Scott's list of hash-colliding strings. I do see quadratic growth in runtime. When runnng the attack on mailman, however, I don't see such a large runtime, and the growth in runtime appears to be linear. This may be because the mailman installation is running on 2.1 (?) and requires a different set of attack strings. I used the cgi.py "self-test" script (the one you get when you run cgi.py *as* a cgi script) on the CGIHTTPServer.py server, and sent a long URL of the form test.cgi?x=1&=1&=1&... I looked at the size of the URL, the size of the response, and the time to transfer the response. My system is a mobile Pentium III running at 800MHz, RedHat 9, Python 2.2.2. The mailman testing system is a K6-2 running at 350MHz, RedHat 7.1, Python 2.1. In the results below, the very fast times and low reply sizes are due to the fact that the execve() call fails for argv+envp>128kb. 
This limitation might not exist if the CGI was POSTed, or running as fcgi, mod_python, or another system which does not pass the GET form contents in the environment. Here are the results, for various query sizes:

########################################################################
# Output 1: Running attack in listing 1 on cgi.py

# Parameters in query: 0
Length of URL: 40
Length of contents: 2905
Time for request: 0.537268042564

# Parameters in query: 1
Length of URL: 64
Length of contents: 3001
Time for request: 0.14549601078

# Parameters in query: 10
Length of URL: 307
Length of contents: 5537
Time for request: 0.151428103447

# Parameters in query: 100
Length of URL: 2737
Length of contents: 31817
Time for request: 0.222425937653

# Parameters in query: 1000
Length of URL: 27037
Length of contents: 294617
Time for request: 4.47611808777

# Parameters in query: 2000
Length of URL: 54037
Length of contents: 586617
Time for request: 18.8749380112

# Parameters in query: 4800
Length of URL: 129637
Length of contents: 1404217
Time for request: 106.951847911

# Parameters in query: 5000
Length of URL: 135037
Length of contents: 115
Time for request: 0.516644954681

# Parameters in query: 10000
Length of URL: 270037
Length of contents: 115
Time for request: 1.01809692383

When I attempted to run the attack against Apache 1.3/Mailman, any moderately-long GET requests provoked an Apache error message.
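[Editorial aside: the quadratic growth Jeff measures comes straight from dict collision handling: each colliding key must be compared against the keys already on its probe chain. A minimal sketch (a hypothetical `Collider` class in modern Python, not Scott's actual attack strings) makes the growth visible by counting equality tests:]

```python
class Collider:
    """Toy key: every instance hashes alike, forcing worst-case probing."""
    eq_calls = 0  # total __eq__ invocations across all instances

    def __init__(self, n):
        self.n = n

    def __hash__(self):
        return 0  # every key lands on the same probe chain

    def __eq__(self, other):
        Collider.eq_calls += 1
        return self.n == other.n

def insert_cost(count):
    """Insert `count` all-colliding keys; return how many comparisons it took."""
    Collider.eq_calls = 0
    d = {}
    for i in range(count):
        d[Collider(i)] = i
    return Collider.eq_calls

small, large = insert_cost(200), insert_cost(400)
# Doubling the input roughly quadruples the comparisons: O(n**2) overall.
print(small, large)
```

With random keys the comparison count would grow linearly; here it quadruples when the input doubles, which is exactly the shape of the benchmark numbers above.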
########################################################################
# Listing 1: test_cgi.py

import urllib, time

def time_url(url):
    t = time.time()
    u = urllib.urlopen(url)
    contents = u.read()
    t1 = time.time()
    print "Length of URL:", len(url)
    print "Length of contents:", len(contents)
    print contents[:200]
    print "Time for request:", t1-t
    print

#URL="http://www.example.com/mailman/subscribe/test"
URL="http://localhost:8000/cgi-bin/test.cgi"

items = [line.strip() for line in open("python-attack").readlines()]

for i in (0, 1, 10, 100, 1000, 2000, 4800, 5000, 10000):
    print "# Parameters in query:", i
    url = URL+"?x"
    url = url + "=1&".join(items[:i])
    time_url(url)
########################################################################

I re-wrote the script to use POST instead of GET, and again ran it on cgi.py and mailman. For some reason, using 0 or 1 items against CGIHTTPServer.py seemed to hang.

########################################################################
# Output 2: Running attack in listing 2 on cgi.py

# Parameters in query: 10
Length of URL: 38
Length of data: 272
Length of contents: 3543
Time for request: 0.314235925674

# Parameters in query: 100
Length of URL: 38
Length of data: 2702
Length of contents: 13894
Time for request: 0.218624949455

# Parameters in query: 1000
Length of URL: 38
Length of data: 27002
Length of contents: 117395
Time for request: 2.20617306232

# Parameters in query: 2000
Length of URL: 38
Length of data: 54002
Length of contents: 232395
Time for request: 9.92248606682

# Parameters in query: 5000
Length of URL: 38
Length of data: 135002
Length of contents: 577396
Time for request: 57.3930220604

# Parameters in query: 10000
Length of URL: 38
Length of data: 270002
Length of contents: 1152396
Time for request: 238.318212986

########################################################################
# Output 3: Running attack in listing 2 on mailman

# Parameters in query: 10
Length of URL: 44
Length of data: 272
Length of contents: 852
Time for request: 0.938691973686

# Parameters in query: 100
Length of URL: 44
Length of data: 2702
Length of contents: 852
Time for request: 0.819067001343

# Parameters in query: 1000
Length of URL: 44
Length of data: 27002
Length of contents: 852
Time for request: 1.13541901112

# Parameters in query: 2000
Length of URL: 44
Length of data: 54002
Length of contents: 852
Time for request: 1.59714698792

# Parameters in query: 5000
Length of URL: 44
Length of data: 135002
Length of contents: 852
Time for request: 3.12452697754

# Parameters in query: 10000
Length of URL: 44
Length of data: 270002
Length of contents: 852
Time for request: 5.72900700569

########################################################################
# Listing 2: attack program using POST for longer URLs

import urllib2, time

def time_url(url, data):
    t = time.time()
    u = urllib2.urlopen(url, data)
    contents = u.read()
    t1 = time.time()
    print "Length of URL:", len(url)
    print "Length of data:", len(data)
    print "Length of contents:", len(contents)
    print "Time for request:", t1-t
    print

#URL="http://www.example.com/mailman/subscribe/test"
URL="http://localhost:8000/cgi-bin/test.cgi"

items = [line.strip() for line in open("python-attack").readlines()]

for i in (10, 100, 1000, 2000, 5000, 10000):
    print "# Parameters in query:", i
    data = "x" + "=1&".join(items[:i]) + "\r\n\r\n"
    time_url(URL, data)

From tim.one@comcast.net Sat May 31 19:28:25 2003 From: tim.one@comcast.net (Tim Peters) Date: Sat, 31 May 2003 14:28:25 -0400 Subject: [Python-Dev] Algoritmic Complexity Attack on Python In-Reply-To: <5.1.0.14.0.20030531090628.02567ad0@mail.telecommunity.com> Message-ID: [Phillip J. Eby] > ... > or at least warning the developers of alternatives to CGI (e.g. Zope, > Quixote, SkunkWeb, CherryPy, etc.) that alternate hashes might be a good > idea. Don't know about SkunkWeb or CherryPy etc, but Zope and Quixote apps can use ZODB's BTrees for mappings.
Insertion and lookup in a BTree have worst-case log-time behavior, and no "bad" sets of keys exist for them. The Buckets forming the leaves of BTrees are vulnerable, though: provoking quadratic-time behavior in a Bucket only requires inserting keys in reverse-sorted order, and sometimes apps use Buckets directly when they should be using BTrees. Using a data structure appropriate for the job at hand is usually a good idea . From python@rcn.com Sat May 31 19:34:13 2003 From: python@rcn.com (Raymond Hettinger) Date: Sat, 31 May 2003 14:34:13 -0400 Subject: [Python-Dev] Algoritmic Complexity Attack on Python References: <200305310041.h4V0fsF20796@pcp02138704pcs.reston01.va.comcast.net> <20030531130503.GA16185@unpythonic.net> <200305311655.h4VGtLk21998@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <003b01c327a3$3c30f5e0$125ffea9@oemcomputer> > > On Fri, May 30, 2003 at 08:41:54PM -0400, Guido van Rossum wrote: > > > Of course, such programs are already vulnerable to changes in the hash > > > implementation between Python versions (which has happened before). > > > > Is there at least a guarantee that the hashing algorithm won't change in a > > bugfix release? For instance, can I depend that > > python222 -c 'print hash(1), hash("a")' > > python223 -c 'print hash(1), hash("a")' > > will both output the same thing, even if > > python23 -c 'print hash(1), hash("a")' > > and > > python3000 -c 'print hash(1), hash("a")' > > may print something different? > > That's a reasonable assumption, yes. We realize that changing the > hash algorithm is a feature change, even if it is a very subtle one. For Scott's proposal to work, it would have to change the hash value on every invocation of Python. If not, colliding keys can be found with a Monte Carlo method. 
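[Editorial aside: Raymond's Monte Carlo point can be sketched directly. If the hash function is fixed across runs, or its values leak, an attacker can birthday-search for colliding inputs offline; the sketch below collides only a truncated 20-bit hash so the search stays cheap (the string set and bit width are illustrative, not Scott's construction):]

```python
import random
import string

def find_truncated_collision(bits=20, length=8, seed=12345):
    """Randomly probe strings until two distinct ones agree in the low
    `bits` bits of hash() -- a birthday search needs ~2**(bits/2) tries."""
    mask = (1 << bits) - 1
    seen = {}  # truncated hash -> first string seen with it
    rng = random.Random(seed)
    while True:
        s = ''.join(rng.choice(string.ascii_letters) for _ in range(length))
        h = hash(s) & mask
        if h in seen and seen[h] != s:
            return seen[h], s
        seen[h] = s

a, b = find_truncated_collision()
print(a, b)  # two distinct strings sharing their low 20 hash bits
```

Modern CPython randomizes string hashes per process (PYTHONHASHSEED), which is the eventual outcome of this thread: the search above still succeeds within one process, but the collisions it finds are worthless against a fresh process with a different seed.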
Raymond Hettinger From tim.one@comcast.net Sat May 31 19:50:01 2003 From: tim.one@comcast.net (Tim Peters) Date: Sat, 31 May 2003 14:50:01 -0400 Subject: [Python-Dev] Algoritmic Complexity Attack on Python In-Reply-To: <20030531130503.GA16185@unpythonic.net> Message-ID: [Jeff Epler] > Is there at least a guarantee that the hashing algorithm won't change > in a bugfix release? Guido said "yes" weakly, but the issue hasn't come up in recent times. In the past we've changed the hash functions for at least strings, tuples, and floats, based on systematic weaknesses uncovered by real-life ordinary data. OTOH, there's a requirement that, for objects of types that can be used as dict keys, two objects that compare equal must deliver equal hash codes, so people creating (mostly) number-like or (not sure if anyone does this) string-like types have to duplicate the hash codes Python delivers for builtin numbers and strings that compare equal to objects of their types. For example, the author of a Rational class should arrange for hash(Rational(42, 1)) to deliver the same result as hash(42) == hash(42L) == hash(42.0) == hash(complex(42.0, 0.0)) Such code would break if we changed the int/long/float/complex hashes for inputs that compare equal to integers. Tedious exercise for the reader: find a set of bad datetime objects in 2.3 ("bad" in the sense of their hash codes colliding; on a box where hash() returns a 32-bit int, there must be collisions, since datetime objects have way more than 32 independent bits of state). From tim.one@comcast.net Sat May 31 20:27:29 2003 From: tim.one@comcast.net (Tim Peters) Date: Sat, 31 May 2003 15:27:29 -0400 Subject: [Python-Dev] Algoritmic Complexity Attack on Python In-Reply-To: Message-ID: [Scott Crosby] > ... > Also, I'd like to thank Tim Peters for telling me about the potential > of degradation that regular expressions may offer. I'm acutely aware of that one because it burns people regularly. 
These aren't cases of hostile input, they're cases of innocently "erroneous" input. After maybe a year of experience, people using a backtracking regexp engine usually figure out how to write a regexp that doesn't go resource-crazy when parsing strings that *do* match. Those are the inputs the program expects. But all inputs can suffer errors, and a regexp that works well when the input matches can still go nuts trying to match a non-matching string, consuming an exponential amount of time trying an exponential number of futile backtracking possibilities. Here's an unrealistic but tiny example, to get the flavor across:

"""
import re
pat = re.compile('(x+x+)+y')
from time import clock as now
for n in range(10, 100):
    print n,
    start = now()
    pat.search('x' * n + 'y')
    print now() - start,
    start = now()
    pat.search('x' * n)
    print now() - start
"""

The fixed regexp here is

    (x+x+)+y

and we search strings of the form

    xxx...xxxy  which do match
    xxx...xxx   which don't match

The matching cases take time linear in the length of the string, but it's so fast it's hard to see the time going up at all until the string gets very large. The failing cases take time exponential in the length of the string. Here's sample output:

10 0.000155885951826 0.00068891533549
11 1.59238337887e-005 0.0013736401884
12 1.76000268191e-005 0.00268777552423
13 2.43047989406e-005 0.00609379976198
14 2.51428954558e-005 0.0109438642954
15 3.4361957123e-005 0.0219815954005
16 3.10095710622e-005 0.0673058549423
17 3.26857640926e-005 0.108308050755
18 3.35238606078e-005 0.251965336328
19 3.68762466686e-005 0.334131480581
20 3.68762466685e-005 0.671073936875
21 3.60381501534e-005 1.33723327578
22 3.60381501534e-005 2.68076149449
23 3.6038150153e-005 5.37420757974
24 3.6038150153e-005 10.7601803584
25 3.52000536381e-005

I killed the program then, as I didn't want to wait 20+ seconds for the 25-character string to fail to match.
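[Editorial aside: the usual cure, not spelled out in the thread, is to rewrite the ambiguous pattern so no input can force exponential backtracking. Since `(x+x+)+y` just means "two or more x's followed by y", an unambiguous equivalent is `xx+y`; this rewrite is my suggestion, not Tim's:]

```python
import re
import time

slow = re.compile(r'(x+x+)+y')  # ambiguous: exponential backtracking on failure
fast = re.compile(r'xx+y')      # same language: two or more x's, then y

# Both patterns accept exactly the same strings (checked on short inputs
# only -- feeding a long non-matching string to `slow` would never return).
for probe in ('xy', 'xxy', 'xxxxxy', 'x' * 40 + 'y', 'y', ''):
    assert bool(slow.search(probe)) == bool(fast.search(probe))

# The rewritten pattern rejects a long all-x string quickly; its failure
# scan is at worst quadratic here, versus exponential for the original.
# (Python 3.11's possessive quantifier, r'xx++y', would make it linear.)
start = time.perf_counter()
assert fast.search('x' * 2000) is None
print('rejected in %.4f seconds' % (time.perf_counter() - start))
```

The general rule: when two adjacent quantified subpatterns can divide the same characters between them in many ways, the engine will try every division before giving up.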
The horrid problem here is that it takes a highly educated eye to look at that regexp and see in advance that it's going to have "fast when it matches, possibly horrid when it doesn't match" behavior -- and this is a dead easy case to analyze. In a regexp that slobbers on across multiple lines, with 5 levels of group nesting, my guess is that no more than 1 programmer in 1000 has even a vague idea how to start looking for such problems. From tim.one@comcast.net Sat May 31 21:21:44 2003 From: tim.one@comcast.net (Tim Peters) Date: Sat, 31 May 2003 16:21:44 -0400 Subject: [Python-Dev] Algoritmic Complexity Attack on Python In-Reply-To: Message-ID: [Tim] >> ... >> Overall, though, I'd welcome a faster string hash, and I agree that >> Python's isn't particularly zippy. [Scott Crosby] > Actually, at least on x86, it is faster than perl. On other platforms, > it may be somewhat slower. Perl's isn't particularly zippy either. I believe that, given heroic coding effort, a good universal hash designed for speed can get under 1 cycle per byte hashed on a modern Pentium. Python's and Perl's string hashes aren't really in the same ballpark. > ... > Yes, I am aware of the problems with the UHASH code. Unfortunately, I > am not a hash function designer, that code is not mine, and I only use > it as a black box. > > I also consider all code, until verified otherwise, to potentially > suffer from endianness, alignment, and 32/64 bit issues. Excluding > alignment issues (which I'm not sure whether to say that its OK to > fail on strange alignments or not) it has passed *my* self-tests on > big endian and 64 bit. Then who's going to vet the code on a Cray T3 (etc, etc, etc, etc)? This isn't a nag, it cuts to the heart of what a language like Python can do: the x-platform behavior of Python's current string hash is easy to understand, relying only on what the C standard demands. 
It's doing (only) one evil thing, relying on the unspecified (by C) semantics of what happens when a signed long multiply overflows. Python runs on just about every platform on Earth, and that hasn't been a problem so far. If it becomes a problem, we can change the accumulator to unsigned long, and then C would specify what happens. There ends the exhaustive discussion of all portability questions about our current code. > ... >> + Some Python apps run for months, and universal hashing doesn't >> remove the possibility of quadratic-time behavior. If I can poke >> at a long-running app and observe its behavior, over time I can >> deduce a > I argued on linux-kernel with someone else that this was extremely > unlikely. It requires the latency of a collision/non-collision being > noticeable over a noisy system, network stack, and system. So you have in mind only apps accessed across a network? And then, for example, discounting the possibility that a bitter sysadmin opens an interactive Python shell on the box, prints out a gazillion (i, hash(i)) pairs and mails them to himself for future mischief? > In almost all cases, for short inputs, the cost of a single L2 cache > miss far exceeds that of hashing. If the user is restricted to short inputs, provoking quadratic-time behavior doesn't matter. > A more serious danger is an application that leaks actual hash values. So ever-more operations become "dangerous". Programmers will never keep this straight, and Python isn't a straitjacket language. I still vote that apps for which this matters use an *appropriate* data structure instead -- Python isn't an application, it's a language for programming applications. > ... > Agreed, many have realized over the years that hash tables can have > quadratic behavior in an adversarial environment. People from real-time backgrounds are much more paranoid than that, and perhaps the security-conscious could learn a lot from them.
For example, you're not going to find a dynamic hash table in a real-time app, because they can't take a chance on bad luck either. To them, "an adversary" is just an unlucky roll of the dice, and they can't tolerate it. > It isn't hidden. Cormen, Leiserson, and Rivest even warn about this in > their seminal algorithms textbook in 1991. Knuth warned about it a lot earlier than that. > It *is* obvious when thought of, but the reason I was able to ship out > so many vulnerability reports yesterday was because few actually *have* > thought of that deterministic worst-case when writing their programs. I > predict this trend to continue. I appreciate that, and I expect it to continue too. I expect a better solution would be for more languages to offer a choice of containers with different O() behaviors. In C it's hard to make progress because the standard language comes with so little, and so many "portable" C libraries aren't. The C++ world is in better shape, constrained more by the portability of standard C++ itself. There's less excuse for Python or Perl programmers to screw up in these areas, because libraries written in Python and Perl are very portable, and there are a lot of 'em to choose from. > I like hash tables a lot, with UH, their time bounds are randomized, > but are pretty tight and the constant factors far exceed those of > balanced binary trees. Probably, but have you used a tree implementation into which the same heroic level of analysis and coding effort has been poured? The typical portable-C balanced tree implementation should be viewed as a worst-case bound on how fast balanced trees can actually work. Recently, heroic efforts have been poured into Judy tries, which may be both faster and more memory-efficient than hash tables in many kinds of apps: http://judy.sourceforge.net/ The code for Judy tries makes UHASH look trivial, though.
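[Editorial aside: for concreteness, the multiply-and-xor string hash Tim defends above looks roughly like this when transliterated from the C of that era into Python, with an explicit 32-bit mask standing in for C's overflow behavior; the constants are from my reading of the historical implementation, so treat the details as approximate:]

```python
def string_hash(s, mask=0xFFFFFFFF):
    """Sketch of CPython's classic (pre-randomization) string hash."""
    if not s:
        return 0
    x = ord(s[0]) << 7          # seed with the first character
    for ch in s:
        x = ((1000003 * x) ^ ord(ch)) & mask  # multiply, xor, wrap at 32 bits
    return x ^ len(s)           # fold in the length

print(hex(string_hash('python')))
```

It is deterministic, fast, and well-spread on ordinary data, but it keys on nothing secret, so colliding inputs can be computed offline once and reused forever -- which is exactly the attack under discussion, and what universal hashing's random key would prevent.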
OTOH, provided the Judy code actually works on your box, and there aren't bugs hiding in its thousands of lines of difficult code, relying on a Judy "array" for good worst-case behavior isn't a matter of luck. From dave@boost-consulting.com Sat May 31 23:11:05 2003 From: dave@boost-consulting.com (David Abrahams) Date: Sat, 31 May 2003 18:11:05 -0400 Subject: [Python-Dev] more-precise instructions for "Python.h first"? In-Reply-To: (Martin v. Löwis's message of "31 May 2003 18:28:25 +0200") References: Message-ID: martin@v.loewis.de (Martin v. Löwis) writes: > David Abrahams writes: > >> Anyway, the point is that I'd like to have the rule changed to "You >> have to include Python.h or xxxx.h before any system header" where >> xxxx.h is one of the other existing headers #included in Python.h that >> is responsible for setting up whatever macros cause this >> inclusion-order requirement in the first place (preferably not >> LongObject.h!) > > If I understand correctly, you want to follow the rule "I want to > change things as long it continues to work for me". Then you don't understand correctly. > For that, you don't need any permission. If it works for you, you > can ignore any rules you feel uncomfortable with. > > The rule is there for people who don't want to understand the > specific details of system configuration. If you manage to get a > consistent configuration in a different way, just go for it. You > should make sure then that your users can't run into problems, > though. I can't make sure that my users can't run into problems without understanding everything about Python and Posix which causes the rule to exist in the first place (and I don't), and continuously monitoring Python into the future to make sure that the distribution of Posix configuration information across its headers doesn't change in a way that invalidates previous assumptions.
The current rule doesn't work for me, but I'd like to be following _some_ sanctioned rule to reduce the chance of problems today and in the future. I'm making an educated guess that the rule is much more sweeping than Python development needs it to be. Isn't there some Python internal configuration header which can be #included first and which will accomplish all the same things as far as system-header inclusion order is concerned? -- Dave Abrahams Boost Consulting www.boost-consulting.com