From tim.one@comcast.net Thu May 1 03:13:46 2003 From: tim.one@comcast.net (Tim Peters) Date: Wed, 30 Apr 2003 22:13:46 -0400 Subject: [Python-Dev] Dictionary tuning In-Reply-To: <001101c30f2a$216954a0$b1b3958d@oemcomputer> Message-ID: [Raymond Hettinger] > ... > I worked on similar approaches last month and found them wanting. > The concept was that a 64-byte cache line held 5.3 dict entries and > that probing those was much less expensive than making a random > probe into memory outside of the cache. > > The first thing I learned was that the random probes were necessary > to reduce collisions. Checking the adjacent space is like a single > step of linear chaining; it increases the number of collisions. Yes, I believe that any regularity will. > That would be fine if the cost were offset by decreased memory > access time; however, for small dicts, the whole dict is already > in cache and having more collisions degrades performance > with no compensating gain. > > The next bright idea was to have a separate lookup function for > small dicts and for larger dictionaries. I set the large dict lookup > to search adjacent entries. The good news is that an artificial > test of big dicts showed a substantial improvement (around 25%). > The bad news is that real programs were worse off than before. You should qualify that to "some real programs", or perhaps "all real programs I've tried". On the other side, I have real programs that access large dicts in random order, so if you tossed those into your mix, a 25% gain on those would more than wipe out the 1-2% losses you saw elsewhere. > A day of investigation showed the cause. The artificial test > accessed keys randomly and showed the anticipated benefit. However, > real programs access some keys more frequently than others > (I believe Zipf's law applies). Some real programs do, and, for all I know, most real programs. It's not the case that all real programs do.
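[Editorial aside: the "random probe" sequence being discussed can be modeled in pure Python. This is a simplified sketch of the perturb loop in dictobject.c -- the function name and table size here are illustrative, and the real code also handles free/dummy slots and key comparison:]

```python
def probe_slots(h, mask, nprobes=8):
    """Yield the first nprobes table slots Python's dict visits for
    hash value h -- a simplified model of the loop in dictobject.c."""
    i = h & mask
    perturb = h
    for _ in range(nprobes):
        yield i & mask
        i = (i << 2) + i + perturb + 1   # i.e. i = i*5 + perturb + 1
        perturb >>= 5                    # PERTURB_SHIFT

# In an 8-slot table (mask 7), hash 10 probes slots 2, 5, 2, 3, 0, 1, 6, 7 --
# a scattered walk rather than a linear scan of adjacent slots.
print(list(probe_slots(10, 7)))
```

Because the high bits of the hash feed in through perturb, keys that land in the same first slot soon take different paths, which is the regularity-avoidance Tim alludes to.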
The counterexamples that sprang instantly to my mind are those using dicts to accumulate stats for testing random number generators. Those have predictable access patterns only when the RNG they're testing sucks. > Those keys *and* their collision chains are likely already in the cache. > So, big dicts had the same limitation as small dicts: You always lose > when you accept more collisions in return for exploiting cache locality. Apart from that "always" ain't always so, I really like that as a summary! > The conclusion was clear: the best way to gain performance > was to have fewer collisions in the first place. Hence, I > resumed experiments on sparsification. How many collisions are you seeing? For int-keyed dicts, all experiments I ran said Python's dicts collided less than a perfectly random hash table would collide (the reason for that is explained in dictobject.c: int-keyed dicts tend to use a contiguous range of ints as keys). For string-keyed dicts, extensive experiments said collision behavior was indistinguishable from a perfectly random hash table. I never cared enough about other kinds of keys to time 'em, at least not since systematic flaws were fixed in the tuple and float hash functions (e.g., the tuple hash function used to xor the tuple's elements' hash codes, so that all permutations of a given tuple had the same hash code; that's necessary for unordered sets, but tuples are ordered).

>> If someone wants to experiment with that in lookdict_string(), stick a new
>>
>>     ++i;
>>
>> before the for loop, and move the existing
>>
>>     i = (i << 2) + i + perturb + 1;
>>
>> to the bottom of that loop. Likewise for lookdict().

> PyStone gains 1%.
> PyBench loses 1%.
> timecell gains 2% (spreadsheet benchmark)
> timemat loses 2% (pure Python matrix package benchmark)
> timepuzzle loses 1% (class-based graph traverser)

You'll forgive me if I'm skeptical: they're such small differences that, if I saw them, I'd consider them to be a wash -- in the noise. What kind of platform are you running on that has timing behavior repeatable enough to believe 1-2% blips? > P.S. There is one other way to improve cache behavior > but it involves touching code throughout dictobject.c. Heh -- that wouldn't even be considered a minor nuisance to the truly obsessed. > Move the entry values into a separate array from the > key/hash pairs. That way, you get 8 entries per cache line. What use case would that help? You said before that small dicts are all in cache anyway, so it wouldn't help those. The jumps in large dicts are so extreme that it doesn't seem to matter if the cache line size du jour holds 1 slot or 100. To the contrary, at the end of the large dict lookup, it sounds like it would incur an additional cache miss to fetch the value after the key was found (since that value would no longer ever ride along with the key and hashcode). I can think of a different reason for considering this: sets have no use for the value slot, and wouldn't need to allocate space for 'em. > P.P.S. One other idea is to use a different search pattern > for small dictionaries. Store entries in a self-organizing list > with no holes. Dummy fields aren't needed, which saves > a test in the linear search loop. When an entry is found, > move it one closer to the head of the list so that the most > common entries get found instantly. I don't see how more than just one can be found instantly; if "instantly" here means "in no more than a few tries", that's usually true of dicts too -- but is still an odd meaning for "instantly". > Since there are no holes, all eight cells can be used instead of the > current maximum of five.
> Like the current arrangement, the whole small dict fits into just two cache lines.

Neil Schemenauer suggested that a few years ago, but I don't recall it going anywhere. I don't know how to predict for which apps it would be faster. If people are keen to optimize very small search tables, think about schemes that avoid the expense of hashing too.

From tim.one@comcast.net Thu May 1 03:36:22 2003 From: tim.one@comcast.net (Tim Peters) Date: Wed, 30 Apr 2003 22:36:22 -0400 Subject: [Python-Dev] Dictionary tuning In-Reply-To: <002301c30f55$245394c0$125ffea9@oemcomputer> Message-ID: [Raymond Hettinger] > ... > I'm going to write up an informational PEP to summarize the > results of research to date. I'd suggest instead a text file checked into the Objects directory, akin to the existing listsort.txt -- it's only of interest to the small fraction of hardcore developers with an optimizing bent. > After the first draft, I'm sure the other experimenters will each have > lessons to share. In addition, I'll attach a benchmarking suite and > dictionary simulator (fully instrumented). That way, future generations > can reproduce the results and pick up where we left off. They probably won't, though. The kind of people attracted to this kind of micro-level fiddling revel in recreating this kind of stuff themselves. For example, you didn't look hard enough to find the sequence of dict simulators Christian posted last time he got obsessed with this. On the chance that they might, a plain text file-- or a Wiki page! --is easier to update than a PEP over time. The benchmarking suite should also be checked in, and should be very welcome. Perhaps it's time for a "benchmark" subdirectory under Lib/test? It doesn't make much sense even now that pystone and sortperf live directly in the test directory. > I've decided that this new process should have a name, > something pithy, yet magical sounding, so it shall be > dubbed SCIENCE. LOL!
But I'm afraid it's not real science unless you first write grant proposals, and pay a Big Name to agree to be named as Principal Investigator. I'll write Uncle Don a letter on your behalf.

From tim_one@email.msn.com Thu May 1 04:50:32 2003 From: tim_one@email.msn.com (Tim Peters) Date: Wed, 30 Apr 2003 23:50:32 -0400 Subject: [Python-Dev] New thread death in test_bsddb3 In-Reply-To: Message-ID: [Thomas Heller] > ... > So is the policy now that it is no longer *allowed* to create another > thread state, while in previous versions there wasn't any choice, > because there existed no way to get the existing one? You can still create all the thread states you like; the new check is in PyThreadState_Swap(), not in PyThreadState_New(). There was always a choice, but previously Python provided no *help* in keeping track of whether a thread already had a thread state associated with it. That didn't stop careful apps from providing their own mechanisms to do so. About policy, yes, it appears to be so now, else Mark wouldn't be raising a fatal error. I view it as having always been the policy (from a good-faith reading of the previous code), just a policy that was too expensive for Python to enforce. There are many policies like that, such as not passing goofy arguments to macros, and not letting references leak. Python doesn't currently enforce them because it's currently too expensive to enforce them. Over time that can change. > IMO a fatal error is very harsh, especially as there's no problem > continuing execution - exactly what happens in a release build. There may or may not be a problem with continued execution -- if you've associated more than one living thread state with a thread, your app may very well be fatally confused in a way that's very difficult to diagnose without this error. Clearly, I like having fatal errors for dubious things in debug builds. Debug builds are supposed to help you debug.
If the fatal error here drives you insane, and you don't want to repair the app code, you're welcome to change #if defined(Py_DEBUG) to #if 0 in your debug build. > Not that I am misunderstood: I very much appreciate the work Mark has > done, and look forward to using it to its fullest extent. In what way is this error a genuine burden to you? The only time I've seen it trigger is in the Berkeley database wrapper, where it pointed out a fine opportunity to simplify some obscure hand-rolled callback tomfoolery -- and pointed out that the thread in question did in fact already have a thread state. Whether that was correct in all cases is something I don't know -- and don't have to worry about anymore, since the new code reuses the thread state the thread already had. The lack of errors in a debug run assures me that's in good shape now.

From drifty@alum.berkeley.edu Thu May 1 05:38:39 2003 From: drifty@alum.berkeley.edu (Brett Cannon) Date: Wed, 30 Apr 2003 21:38:39 -0700 (PDT) Subject: [Python-Dev] Dictionary tuning In-Reply-To: References: Message-ID: [Tim Peters] > The benchmarking suite should also be checked in, and should be very > welcome. Perhaps it's time for a "benchmark" subdirectory under Lib/test? > It doesn't make much sense even now that pystone and sortperf live directly > in the test directory. > Works for me. Can we perhaps decide whether we want to do this in the near future? I am going to be writing up module docs for the test package and if we are going to end up moving them I would like to get this written into the docs the first time through. -Brett

From martin@v.loewis.de Thu May 1 05:10:24 2003 From: martin@v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=) Date: 01 May 2003 06:10:24 +0200 Subject: [Python-Dev] Initialization hook for extenders In-Reply-To: <3EB04B03.887CDF7B@llnl.gov> References: <3EB04B03.887CDF7B@llnl.gov> Message-ID: "Patrick J.
Miller" writes: > I actually want this to do some MPI initialization to set up a > single user prompt with broadcast which has to run after > Py_Initialize() but before the import of readline. -1. It is easy enough to copy the code of Py_Main, and customize it for special requirements. The next user may want to have a hook to put additional command line options into Py_Main; YAGNI. Regards, Martin

From tim_one@email.msn.com Thu May 1 07:13:51 2003 From: tim_one@email.msn.com (Tim Peters) Date: Thu, 1 May 2003 02:13:51 -0400 Subject: [Python-Dev] Re: heaps In-Reply-To: Message-ID: [Raymond Hettinger] >> I'm quite pleased with the version already in CVS. It is a small >> masterpiece of exposition, sophistication, simplicity, and speed. >> A class based interface is not necessary for every algorithm. [David Eppstein] > It has some elegance, but omits basic operations that are necessary for > many heap-based algorithms and are not provided by this interface. I think Raymond was telling you it isn't intended to be "an interface", rather it's quite up-front about being a collection of functions that operate directly on a Python list, implementing a heap in a very straightforward way, and deliberately not trying to hide any of that. IOW, it's a concrete data type, not an abstract one. I asked, and it doesn't feel like apologizing for being what it is. That's not to say Python couldn't benefit from providing an abstract heap API too, and umpteen different implementations specialized to different kinds of heap applications. It is saying that heapq isn't trying to be that, so pointing out that it isn't falls kinda flat.
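[Editorial aside: the "concrete data type" point is easy to see in the kind of job heapq does well -- an N-best list kept as a bounded min-heap. A sketch; the helper name n_best is an invention for illustration, not part of the module:]

```python
import heapq

def n_best(iterable, n):
    """Return the n largest items, largest first, using a min-heap of
    size n: the worst of the current winners always sits at heap[0]."""
    heap = []
    for item in iterable:
        if len(heap) < n:
            heapq.heappush(heap, item)
        elif item > heap[0]:
            heapq.heapreplace(heap, item)  # pop the loser, push the new item
    return sorted(heap, reverse=True)

print(n_best([5, 1, 9, 3, 7, 8], 3))  # [9, 8, 7]
```

The heap stays a plain Python list throughout, which is exactly the concreteness being defended here; later Python versions grew heapq.nlargest() for this same idiom.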
> Specifically, the three algorithms that use heaps in my upper-division > undergraduate algorithms classes are heapsort (for which heapq works > fine, but you would generally want to use L.sort() instead), Dijkstra's > algorithm (and its relatives such as A* and Prim), which needs the > ability to decrease keys, and event-queue-based plane sweep algorithms > (e.g. for finding all crossing pairs in a set of line segments) which > need the ability to delete items from other than the top. Then some of those will want a different implementation of a heap. The algorithms in heapq are still suitable for many heap applications, such as maintaining an N-best list (like retaining only the 10 best-scoring items in a long sequence), and A* on a search tree (when there's only one path to a node, decrease-key isn't needed; A* on a graph is harder). > To see how important the lack of these operations is, I decided to > compare two implementations of Dijkstra's algorithm. I don't think anyone claimed-- or would claim --that a heapq is suitable for all heap purposes. > The priority-dict implementation from > http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/119466 takes as > input a graph, coded as nested dicts {vertex: {neighbor: edge length}}. > This is a variation of a graph coding suggested in one of Guido's essays > that, as Raymond suggests, avoids using a separate class based interface.
>
> Here's a simplification of my dictionary-based Dijkstra implementation:
>
> def Dijkstra(G,start,end=None):
>     D = {}                      # dictionary of final distances
>     P = {}                      # dictionary of predecessors
>     Q = priorityDictionary()    # est.dist. of non-final vert.
>     Q[start] = 0
>     for v in Q:
>         D[v] = Q[v]
>         for w in G[v]:
>             vwLength = D[v] + G[v][w]
>             if w not in D and (w not in Q or vwLength < Q[w]):
>                 Q[w] = vwLength
>                 P[w] = v
>     return (D,P)
>
> Here's a translation of the same implementation to heapq (untested since I'm not running 2.3). Since there is no decrease in heapq, nor any way to find and remove old keys,

A heapq *is* a list, so you could loop over the list to find an old object. I wouldn't recommend that in general, but it's easy, and if the need is rare then the advertised fact that a heapq is a plain list can be very convenient. Deleting an object from "the interior" still isn't supported directly, of course. It's possible to do so efficiently with this implementation of a heap, but since it doesn't support an efficient way to find an old object to begin with, there seemed little point to providing an efficient delete-by-index function. Here's one such:

    import heapq

    def delete_obj_at_index(heap, pos):
        lastelt = heap.pop()
        if pos >= len(heap):
            return
        # The rest is a lightly fiddled variant of heapq._siftup.
        endpos = len(heap)
        # Bubble up the smaller child until hitting a leaf.
        childpos = 2*pos + 1    # leftmost child position
        while childpos < endpos:
            # Set childpos to index of smaller child.
            rightpos = childpos + 1
            if rightpos < endpos and heap[rightpos] <= heap[childpos]:
                childpos = rightpos
            # Move the smaller child up.
            heap[pos] = heap[childpos]
            pos = childpos
            childpos = 2*pos + 1
        # The leaf at pos is empty now.  Put lastelt there, and bubble
        # it up to its final resting place (by sifting its parents down).
        heap[pos] = lastelt
        heapq._siftdown(heap, 0, pos)

> I changed the algorithm to add new tuples for each new key, leaving the old tuples in place until they bubble up to the top of the heap.
>
> def Dijkstra(G,start,end=None):
>     D = {}                    # dictionary of final distances
>     P = {}                    # dictionary of predecessors
>     Q = [(0,None,start)]      # heap of (est.dist., pred., vert.)
>     while Q:
>         dist,pred,v = heappop(Q)
>         if v in D:
>             continue          # tuple outdated by decrease-key, ignore
>         D[v] = dist
>         P[v] = pred
>         for w in G[v]:
>             heappush(Q, (D[v] + G[v][w], v, w))
>     return (D,P)
>
> My analysis of the differences between the two implementations: > > - The heapq version is slightly complicated (the two lines > if...continue) by the need to explicitly ignore tuples with outdated > priorities. This need for inserting low-level data structure > maintenance code into higher-level algorithms is intrinsic to using > heapq, since its data is not structured in a way that can support > efficient decrease key operations. It surprised me that you tried using heapq at all for this algorithm. I was also surprised that you succeeded <0.9 wink>. > - Since the heap version had no way to determine when a new key was > smaller than an old one, the heapq implementation needed two separate > data structures to maintain predecessors (middle elements of tuples for > items in queue, dictionary P for items already removed from queue). In > the dictionary implementation, both types of items stored their > predecessors in P, so there was no need to transfer this information > from one structure to another. > > - The dictionary version is slightly complicated by the need to look up > old heap keys and compare them with the new ones instead of just > blasting new tuples onto the heap. So despite the more-flexible heap > structure of the dictionary implementation, the overall code complexity > of both implementations ends up being about the same. > > - Heapq forced me to build tuples of keys and items, while the > dictionary based heap did not have the same object-creation overhead > (unless it's hidden inside the creation of dictionary entries). Rest easy, it's not. > On the other hand, since I was already building tuples, it was > convenient to also store predecessors in them instead of in some > other structure. > > - The heapq version uses significantly more storage than the dictionary: > proportional to the number of edges instead of the number of vertices. > > - The changes I made to Dijkstra's algorithm in order to use heapq might > not have been obvious to a non-expert; more generally I think this lack > of flexibility would make it more difficult to use heapq for > cookbook-type implementation of textbook algorithms. Depends on the specific algorithms in question, of course. No single heap implementation is the best choice for all algorithms, and heapq would be misleading people if, e.g., it did offer a decrease_key function -- it doesn't support an efficient way to do that, and it doesn't pretend to. > - In Dijkstra's algorithm, it was easy to identify and ignore outdated > heap entries, sidestepping the inability to decrease keys. I'm not > convinced that this would be as easy in other applications of heaps. All that is explaining why this specific implementation of a heap isn't suited to the task at hand. I don't believe that was at issue, though. An implementation of a heap that is suited for this task may well be less suited for other tasks. > - One of the reasons to separate data structures from the algorithms > that use them is that the data structures can be replaced by ones with > equivalent behavior, without changing any of the algorithm code. The > heapq Dijkstra implementation is forced to include code based on the > internal details of heapq (specifically, the line initializing the heap > to be a one element list), making it less flexible for some uses. > The usual reason one might want to replace a data structure is for > efficiency, but there are others: for instance, I teach various > algorithms classes and might want to use an implementation of Dijkstra's > algorithm as a testbed for learning about different priority queue data > structures. I could do that with the dictionary-based implementation > (since it shows nothing of the heap details) but not the heapq one. You can wrap any interface you like around heapq (that's very easy to do in Python), but it won't change that heapq's implementation is poorly suited to this application. priorityDictionary looks like an especially nice API for this specific algorithm, but, e.g., impossible to use directly for maintaining an N-best queue (priorityDictionary doesn't support multiple values with the same priority, right? if we're trying to find the 10,000 poorest people in America, counting only one as dead broke would be too Republican for some people's tastes). OTOH, heapq is easy and efficient for *that* class of heap application. > Overall, while heapq was usable for implementing Dijkstra, I think it > has significant shortcomings that could be avoided by a more > well-thought-out interface that provided a little more functionality and > a little clearer separation between interface and implementation. heapq isn't trying to separate them at all -- quite the contrary! It's much like the bisect module that way. They find very good uses in practice. I should note that I objected to heapq at the start, because there are several important heap implementation techniques, and just one doesn't fit everyone all the time. My objection went away when Guido pointed out how much like bisect it is: since it doesn't pretend one whit to generality or opaqueness, it can't be taken as promising more than it actually does, nor can it interfere with someone (so inclined) defining a general heap API: it's not even a class, just a handful of functions. Useful, too, just as it is. A general heap API would be nice, but it wouldn't have much (possibly nothing) to do with heapq.
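[Editorial aside: the heapq translation quoted in this exchange was posted untested; for readers who want to run the lazy-deletion pattern, here is a self-contained version. The nested-dict graph encoding follows the thread; the function name and the small example graph are illustrative:]

```python
import heapq

def dijkstra(G, start):
    """Dijkstra with lazy deletion: instead of decreasing keys, push a
    fresh tuple and skip any popped entry whose vertex is already final.

    G is a nested dict {vertex: {neighbor: edge_length}}.
    """
    D = {}                    # final distances
    P = {}                    # predecessors
    Q = [(0, None, start)]    # heap of (est. dist., pred., vertex)
    while Q:
        dist, pred, v = heapq.heappop(Q)
        if v in D:
            continue          # outdated tuple, ignore
        D[v] = dist
        P[v] = pred
        for w, length in G[v].items():
            heapq.heappush(Q, (dist + length, v, w))
    return D, P

G = {'a': {'b': 1, 'c': 4}, 'b': {'c': 2}, 'c': {}}
print(dijkstra(G, 'a'))  # ({'a': 0, 'b': 1, 'c': 3}, {'a': None, 'b': 'a', 'c': 'b'})
```

As the analysis above notes, the heap can hold one tuple per edge rather than one per vertex, which is the storage cost of skipping decrease-key.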
From eppstein@ics.uci.edu Thu May 1 07:36:17 2003 From: eppstein@ics.uci.edu (David Eppstein) Date: Wed, 30 Apr 2003 23:36:17 -0700 Subject: [Python-Dev] Re: heaps In-Reply-To: References: Message-ID: <5841710.1051745776@[10.0.1.2]> On 5/1/03 2:13 AM -0400 Tim Peters wrote: > It surprised me that you tried using heapq at all for this algorithm. I > was also surprised that you succeeded <0.9 wink>. Wink noted, but it surprised me too, a little. I had thought decrease key was a necessary part of the algorithm, not something that could be finessed like that. > You can wrap any interface you like around heapq (that's very easy to do > in Python), but it won't change that heapq's implementation is poorly > suited to this application. priorityDictionary looks like an especially > nice API for this specific algorithm, but, e.g., impossible to use > directly for maintaining an N-best queue (priorityDictionary doesn't > support multiple values with the same priority, right? if we're trying > to find the 10,000 poorest people in America, counting only one as dead > broke would be too Republican for some peoples' tastes ). OTOH, > heapq is easy and efficient for *that* class of heap application. I agree with your main points (heapq's inability to handle certain priority queue applications doesn't mean it's useless, and its implementation-specific API helps avoid fooling programmers into thinking it's any more than what it is). But I am confused at this example. Surely it's just as easy to store (income,identity) tuples in either data structure. If you mean, you want to find the 10k smallest income values (rather than the people having those incomes), then it may be that a better data structure would be a simple list L in which the value of L[i] is the count of people with income i. -- David Eppstein http://www.ics.uci.edu/~eppstein/ Univ. 
of California, Irvine, School of Information & Computer Science

From guido@python.org Thu May 1 14:28:24 2003 From: guido@python.org (Guido van Rossum) Date: Thu, 01 May 2003 09:28:24 -0400 Subject: [Python-Dev] Dictionary tuning In-Reply-To: Your message of "Wed, 30 Apr 2003 21:38:39 PDT." References: Message-ID: <200305011328.h41DSPq05585@odiug.zope.com> [Tim] > > The benchmarking suite should also be checked in, and should be > > very welcome. Perhaps it's time for a "benchmark" subdirectory > > under Lib/test? It doesn't make much sense even now that pystone > > and sortperf live directly in the test directory. > Works for me. Can we perhaps decide whether we want to do this in the > near future? I am going to be writing up module docs for the test package > and if we are going to end up moving them I would like to get this > written into the docs the first time through. > > -Brett Should the benchmarks directory be part of the distribution, or should it be in the nondist part of the CVS tree? --Guido van Rossum (home page: http://www.python.org/~guido/)

From mwh@python.net Thu May 1 15:07:37 2003 From: mwh@python.net (Michael Hudson) Date: Thu, 01 May 2003 15:07:37 +0100 Subject: [Python-Dev] Dictionary tuning In-Reply-To: <200305011328.h41DSPq05585@odiug.zope.com> (Guido van Rossum's message of "Thu, 01 May 2003 09:28:24 -0400") References: <200305011328.h41DSPq05585@odiug.zope.com> Message-ID: <2m65ouahg6.fsf@starship.python.net> Guido van Rossum writes: > [Tim] >> > The benchmarking suite should also be checked in, and should be >> > very welcome. Perhaps it's time for a "benchmark" subdirectory >> > under Lib/test? It doesn't make much sense even now that pystone >> > and sortperf live directly in the test directory. > >> Works for me. Can we perhaps decide whether we want to do this in the >> near future?
I am going to be writing up module docs for the test package >> and if we are going to end up moving them I would like to get this >> written into the docs the first time through. >> >> -Brett > > Should the benchmarks directory be part of the distribution, or should > it be in the nondist part of the CVS tree? I can't think why you'd want it in nondist, unless they depend on huge input files or something. Cheers, M. -- Those who have deviant punctuation desires should take care of their own perverted needs. -- Erik Naggum, comp.lang.lisp

From guido@python.org Thu May 1 15:18:50 2003 From: guido@python.org (Guido van Rossum) Date: Thu, 01 May 2003 10:18:50 -0400 Subject: [Python-Dev] Dictionary tuning In-Reply-To: Your message of "Thu, 01 May 2003 15:07:37 BST." <2m65ouahg6.fsf@starship.python.net> References: <200305011328.h41DSPq05585@odiug.zope.com> <2m65ouahg6.fsf@starship.python.net> Message-ID: <200305011418.h41EIoF07682@odiug.zope.com> > > Should the benchmarks directory be part of the distribution, or should > > it be in the nondist part of the CVS tree? > > I can't think why you'd want it in nondist, unless they depend on huge > input files or something. OK. --Guido van Rossum (home page: http://www.python.org/~guido/)

From aahz@pythoncraft.com Thu May 1 15:20:35 2003 From: aahz@pythoncraft.com (Aahz) Date: Thu, 1 May 2003 10:20:35 -0400 Subject: [Python-Dev] Dictionary tuning In-Reply-To: <200305011328.h41DSPq05585@odiug.zope.com> References: <200305011328.h41DSPq05585@odiug.zope.com> Message-ID: <20030501142034.GA28364@panix.com> On Thu, May 01, 2003, Guido van Rossum wrote: > > Should the benchmarks directory be part of the distribution, or should > it be in the nondist part of the CVS tree? Given the constant number of arguments in c.l.py about speed, I'd keep it in the distribution unless/until it gets large.
-- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "In many ways, it's a dull language, borrowing solid old concepts from many other languages & styles: boring syntax, unsurprising semantics, few automatic coercions, etc etc. But that's one of the things I like about it." --Tim Peters on Python, 16 Sep 93

From tjreedy@udel.edu Thu May 1 16:27:34 2003 From: tjreedy@udel.edu (Terry Reedy) Date: Thu, 1 May 2003 11:27:34 -0400 Subject: [Python-Dev] Re: Dictionary tuning References: <200305011328.h41DSPq05585@odiug.zope.com> <2m65ouahg6.fsf@starship.python.net> Message-ID: From my curious user viewpoint ... "Michael Hudson" wrote in message news:2m65ouahg6.fsf@starship.python.net... > Guido van Rossum writes: > > > [Tim] > >> > The benchmarking suite should also be checked in, and should be > >> > very welcome. Perhaps it's time for a "benchmark" subdirectory > >> > under Lib/test? It doesn't make much sense even now that pystone > >> > and sortperf live directly in the test directory. + 1 on a separate subdirectory (there are two others already) to make these easier to find (or ignore). > >> Works for me. Can we perhaps decide whether we want to do this in the > >> near future? I am going to be writing up module docs for the test package > >> and if we are going to end up moving them I would like to get this > >> written into the docs the first time through. + 1 on doing so by 2.3 final if not before > > Should the benchmarks directory be part of the distribution, or should > > it be in the nondist part of the CVS tree? > > I can't think why you'd want it in nondist, unless they depend on huge > input files or something. + 1 on keeping these with the standard distribution. Sortperf.py is a great example of random + systematic corner case testing extended to something more complicated than binary ops. Besides that, I expect to actually use it, with minor mods, sometime later this year. I am more than happy to give it the 4K bytes it uses. Terry J.
Reedy

From patmiller@llnl.gov Thu May 1 16:23:07 2003 From: patmiller@llnl.gov (Patrick J. Miller) Date: Thu, 01 May 2003 08:23:07 -0700 Subject: [Python-Dev] Initialization hook for extenders References: <3EB04B03.887CDF7B@llnl.gov> Message-ID: <3EB13BDB.808E749@llnl.gov> "Martin v. Löwis" wrote: > -1. It is easy enough to copy the code of Py_Main, and customize it > for special requirements. The next user may want to have a hook to put > additional command line options into Py_Main; YAGNI. It's not easy. Not if you simply want to link against an installed Python. Nor so if you want to build against 2.1, 2.2, and 2.3 ... libraries. There are subtle changes that bite you in the ass if you don't physically copy the right source forward. We did copy forward main.c, but found that every time we updated Python, we had to "rehack" main to make sure we had all the options and flags and initialization straight. I think the hook is extremely cheap, very short, looks almost exactly like Py_AtExit() and solves the problem directly. Pat -- Patrick Miller | (925) 423-0309 | http://www.llnl.gov/CASC/people/pmiller You can discover more about a person in an hour of play than in a year of discussion. -- Plato, philosopher (427-347 BCE)

From glyph@twistedmatrix.com Thu May 1 16:59:41 2003 From: glyph@twistedmatrix.com (Glyph Lefkowitz) Date: Thu, 1 May 2003 10:59:41 -0500 Subject: [Python-Dev] Re: Python-Dev digest, Vol 1 #3221 - 4 msgs In-Reply-To: <200304282102.h3SL2rW18842@odiug.zope.com> Message-ID: -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Monday, April 28, 2003, at 04:02 PM, Guido van Rossum wrote: >> Why is the Python development team introducing bugs into Python and >> then expecting the user community to fix things that used to work? > > I resent your rhetoric, Glyph.
Had you read the rest of this thread, > you would have seen that the performance regression only happens for > sending data at maximum speed over the loopback device, and is > negligible when receiving e.g. data over a LAN. You would also have > seen that I have already suggested two different simple fixes. I apologize. I did not seriously mean this as an indictment of the entire Python development team or process. I would have responded to this effect sooner, but I've been swamped with work. >> I could understand not wanting to put a lot of effort into >> correcting obscure or difficult-to-find performance problems that >> only a few people care about, but the obvious thing to do in this >> case is simply to change the default behavior. > > It can and will be fixed. I just don't have the time to fix it > myself. I noticed your comment about the checkin. Thanks to the dev team for fixing it so promptly. >> I think this should be in the release notes for 2.3. "Python is 10% >> faster, unless you use sockets, in which case it is much, much slower. >> Do the following in order to regain lost performance and retain the >> same semantics:" > > That is total bullshit, Glyph, and you know it. Please pardon the exaggeration. I forget that sarcasm does not come across as well on e-mail as it does on IRC. I appreciate that the performance drop wasn't really that serious. On a more positive note, looking at performance numbers got us thinking about increasing performance in Twisted. Anthony Baxter has been very helpful with profiling information, Itamar's already written some benchmarking tests, and I finished up a logging infrastructure that is more amenable to metrics gathering last night. (It's also less completely awful than the one we had before and should hook up to the new logging.py gracefully.)
We already have an always-on multi-platform regression test suite for Twisted (not the snake farm): http://www.twistedmatrix.com/users/warner.twistd/ If we get this reporting some performance numbers as well, it would be pretty easy to turn it into a regression/performance test for Python by tweaking a few variables -- probably, just 'cvs update; make' in the Python directory instead of the Twisted one. Is there interest in seeing these kinds of numbers generated regularly? What kind of numbers would be interesting on the Python side? -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.1 (Darwin) iD8DBQE+sUSIvVGR4uSOE2wRAmJDAJ9dRfcX8zPYUvExUtvpxTpQlg2GhwCfde5B C7bsGc8YSwp5aN1vJ6BSiGU= =/c5y -----END PGP SIGNATURE----- From martin@v.loewis.de Thu May 1 16:46:05 2003 From: martin@v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Thu, 01 May 2003 17:46:05 +0200 Subject: [Python-Dev] Initialization hook for extenders In-Reply-To: <3EB13BDB.808E749@llnl.gov> References: <3EB04B03.887CDF7B@llnl.gov> <3EB13BDB.808E749@llnl.gov> Message-ID: <3EB1413D.7080604@v.loewis.de> Patrick J. Miller wrote: > It's not easy. > > Not if you simply want to link against an installed Python. Why not? Just don't call the function Py_Main. > Nor so if you want to build against 2.1 2.2 and 2.3 ... libraries. Again, I can't see a reason why that is. > There are subtle changes that bite you in the ass if you don't > physically copy the right source forward. For example? > We did copy forward main.c, but found that every time we updated > Python, we had to "rehack" main to make sure we had all the options > and flags and initialization straight. That is not necessary. What would be the problem if you just left your function as it was in Python 2.1? > I think the hook is extremely cheap, very short, looks almost exactly > like Py_AtExit() and solves the problem directly. Unfortunately, the problem is one that almost nobody ever has, and supporting that API adds a maintenance burden. 
It is better if the maintenance burden is on your side than on the Python core. If you think you really need this, write a PEP, ask the community, and wait for BDFL pronouncement. I'm still -1. Regards, Martin From patmiller@llnl.gov Thu May 1 17:15:09 2003 From: patmiller@llnl.gov (Patrick J. Miller) Date: Thu, 01 May 2003 09:15:09 -0700 Subject: [Python-Dev] Initialization hook for extenders References: <3EB04B03.887CDF7B@llnl.gov> <3EB13BDB.808E749@llnl.gov> <3EB1413D.7080604@v.loewis.de> Message-ID: <3EB1480D.EE379EC6@llnl.gov> Martin, Sorry you disagree. I think that the issue is still important and other pieces of the API already point in this direction. For instance, there is no need to have PyImport_AppendInittab because you can hack config.c (which you can get from $prefix/lib/pythonx.x/config/config.c) and in fact many people did exactly that, but it made for a messy extension until the API call made it clean and direct. You don't need Py_AtExit() because you can call through to atexit.register() to put the function in. The list goes on... I still think that Py_AtInit() is clean, symmetric with Py_AtExit(), and solves a big problem for extenders who wish to address localization from within C (as opposed to sitecustomize.py). This is a 10 line patch with 0 runtime impact that requires no maintenance to move forward with new versions. If it were more than that, I could better understand your objections. Hope that I can get you to at least vote 0 instead of -1. Cheers, Pat -- Patrick Miller | (925) 423-0309 | http://www.llnl.gov/CASC/people/pmiller You can never solve a problem on the level on which it was created. -- Albert Einstein, physicist, Nobel laureate (1879-1955) From guido@python.org Thu May 1 17:32:31 2003 From: guido@python.org (Guido van Rossum) Date: Thu, 01 May 2003 12:32:31 -0400 Subject: [Python-Dev] Re: Python-Dev digest, Vol 1 #3221 - 4 msgs In-Reply-To: Your message of "Thu, 01 May 2003 10:59:41 CDT."
References: Message-ID: <200305011632.h41GWVu08466@odiug.zope.com> Apologies accepted, Glyph. --Guido van Rossum (home page: http://www.python.org/~guido/) From drifty@alum.berkeley.edu Fri May 2 00:22:34 2003 From: drifty@alum.berkeley.edu (Brett Cannon) Date: Thu, 1 May 2003 16:22:34 -0700 (PDT) Subject: [Python-Dev] python-dev Summary for 2003-04-16 through 2003-04-30 Message-ID: Yes, I am actually getting the rough draft out the day after its coverage ends. Perk of being done with grad school apps. =) You guys have until Sunday (busy Friday and Saturday) to show me why I should try proof-reading one of these days. =) And I did leave the one thread out that Guido asked not to be spread around so I guess this summary is not as "complete" as previous ones. -------------------------------- +++++++++++++++++++++++++++++++++++++++++++++++++++++ python-dev Summary for 2003-04-16 through 2003-04-30 +++++++++++++++++++++++++++++++++++++++++++++++++++++ This is a summary of traffic on the `python-dev mailing list`_ from April 16, 2003 through April 30, 2003. It is intended to inform the wider Python community of on-going developments on the list and to have an archived summary of each thread started on the list. To comment on anything mentioned here, just post to python-list@python.org or `comp.lang.python`_ with a subject line mentioning what you are discussing. All python-dev members are interested in seeing ideas discussed by the community, so don't hesitate to take a stance on something. And if all of this really interests you then get involved and join `python-dev`_! This is the sixteenth summary written by Brett Cannon (writing history the way I see fit =). All summaries are archived at http://www.python.org/dev/summary/ . Please note that this summary is written using reStructuredText_ which can be found at http://docutils.sf.net/rst.html .
Any unfamiliar punctuation is probably markup for reST_ (otherwise it is probably regular expression syntax or a typo =); you can safely ignore it, although I suggest learning reST; it's simple and is accepted for `PEP markup`__. Also, because of the wonders of programs that like to reformat text, I cannot guarantee you will be able to run the text version of this summary through Docutils_ as-is unless it is from the original text file. __ http://www.python.org/peps/pep-0012.html .. _python-dev: http://www.python.org/dev/ .. _python-dev mailing list: http://mail.python.org/mailman/listinfo/python-dev .. _comp.lang.python: http://groups.google.com/groups?q=comp.lang.python .. _Docutils: http://docutils.sf.net/ .. _reST: .. _reStructuredText: http://docutils.sf.net/rst.html .. contents:: .. _last summary: http://www.python.org/dev/summary/2003-04-01_2003-04-15.html ======================= Summary Announcements ======================= So no one responded to my question last time about whether anyone cared if I stopped linking to files in the Python CVS online through ViewCVS. So silence equals whatever answer makes my life easier, so I won't link to files anymore. .. _ViewCVS: http://viewcvs.sf.net/ ================== `2.3b1 release`__ ================== __ http://mail.python.org/pipermail/python-dev/2003-April/034682.html Splinter threads: - `Masks in getargs.c `__ - `CALL_ATTR patch `__ - `Built-in functions as methods `__ - `Tagging the tree `__ - `RELEASED: Python 2.3b1 `__ Guido announced he wanted to get `Python 2.3b1`_ out the door by Friday, April 25 (which he did). He also said if something urgently needed to get in before then to set the priority on the item to 7. The rule for betas is that you can apply bug fixes (that is the point of the releases).
New unit tests can also be added as long as the entire regression testing suite passes with them in there; since this is a beta, any bugs found should be patched along with adding the tests. This led to some patches coming up that some people would like to see get into b1. One was Thomas Heller's patch at http://www.python.org/sf/595026 which adds new argument masks for PyArg_ParseTuple(). Thomas' patch adds two new masks ('k' and 'K') and modifies some others so that their range checking (if they kept any) was more reasonable. This is when Jack Jansen chimed in saying that he didn't notice any mask that worked between 2.2 and 2.3 that converts 32 bit values without throwing a fit. Basically the changes to the 'h' mask left all of the Mac modules broken. The change was backed out, though, and the issue was solved. Martin v. Löwis wanted to get IDNA (International Domain Name Addressing) in (which he did). UnixWare was (and as of this writing still is) broken. It's being worked on, though, by Tim Rice. The CALL_ATTR patch that Thomas Wouters and I worked on at PyCon came up. We were trying to come up with an opcode to replace the common ``LOAD_ATTR; CALL_FUNCTION`` opcode pair that happens whenever you call a method. The hope was to short-circuit pushing the method object onto the stack, since it gets popped off immediately by CALL_FUNCTION. Initially the patch only worked for classic classes but Thomas has since cleaned it up and added support for new-style classes. To help out Thomas, Guido gave an overview of new-style classes and how descriptors work. Basically a descriptor is what exists in a class' __dict__ and "primarily affects instance attribute lookup". When the attribute lookup finds the descriptor it wants, it calls its __get__ method (the tp_descrget slot for C types). The lookup then "binds" this to the instance; this is what separates a bound method from a function since functions are also descriptors.
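The "binding" step can be seen from pure Python (a minimal sketch using modern syntax rather than the 2.3-era C-level view; the class and names here are invented for illustration):

```python
class C:
    def greet(self, name):
        return "hello, " + name

obj = C()

# Plain functions are descriptors: fetching one through __get__
# performs the "binding", yielding a bound method.
func = C.__dict__["greet"]       # the raw function object
bound = func.__get__(obj, C)     # what instance attribute lookup does
assert bound("world") == obj.greet("world") == "hello, world"
```

This is exactly why ``obj.greet`` gives you a bound method while ``C.__dict__["greet"]`` gives you a plain function.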
Properties are just descriptors whose __get__ calls whatever you passed for the fget argument. Class attribute lookup also calls __get__, but the instance argument is None (or NULL for C code). __set__ is called on a descriptor for instance attribute assignment but not for class attribute assignment. Guido clarified this later with an example: given a descriptor f, calling ``f.__get__(obj)`` returns a function g which acts like a curried function (read the Python Cookbook if you don't know what currying_ is). Now when you call ``g(arg1, ...)`` you are basically doing ``f(obj, arg1, ...)``; so this all turns into ``f.__get__(obj)(arg1, ...)``. The problem with the CALL_ATTR patch is that it is turning out to have zero benefit beyond a nicer opcode for a common operation once the code for working with new-style classes is in. This could be from cache misses because of the increased size of the interpreter loop or just too many branches to possibly take. As of now the patch is still on SF and has not been applied. .. _Python 2.3b1: http://www.python.org/2.3/ .. _currying: http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/52549 ========================= `Super and properties`__ ========================= __ http://mail.python.org/pipermail/python-dev/2003-April/034338.html This thread was initially covered in the `last summary`_. Guido ended up explaining why super() does not work for properties. super() does not stop on the first hit of finding something when that "something" is a data descriptor; it ignores it and just keeps on looking. Now super() does this so that it doesn't end up looking like something it isn't. Think of the case of __class__; if super() returned what the object's __class__ returned, it would cause super to look like something it isn't.
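For concreteness, here is a sketch of the case under discussion, using a property (a data descriptor) in an inheritance chain. It runs on modern Pythons, where super() does find data descriptors, and uses decorator syntax that did not yet exist in 2.3; the class names are invented for illustration:

```python
class Base:
    @property
    def label(self):
        return "base"

class Derived(Base):
    @property
    def label(self):
        # Once super() stops skipping data descriptors, it finds
        # Base's property here instead of continuing up the MRO.
        return "derived wraps " + super().label

assert Derived().label == "derived wraps base"
```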
Guido figured people wouldn't want to override data descriptors anyway, so this made sense. But now there is a use case for this, so Guido is changing this for Python 2.3 so that data descriptors are properly hit in the inheritance chain by super(). ===================== `Final PEP 311 run`__ ===================== __ http://mail.python.org/pipermail/python-dev/2003-April/034705.html Mark Hammond's `PEP 311`_ has now been implemented! What Mark has done is implement two functions in C: PyGILState_Ensure() and PyGILState_Restore(). Call the first one to get control of the GIL, without having to know its current state, to allow you to use the Python API safely. The second releases the GIL when you are done making calls out to Python. This is a much simpler interface than what was previously needed when you did not need a very fancy threading interface with Python and just needed to hold the GIL. As always, read the PEP to get the full details. .. _PEP 311: http://www.python.org/peps/pep-0311.html =============================================== `summing a bunch of numbers (or "whatevers")`__ =============================================== __ http://mail.python.org/pipermail/python-dev/2003-April/034767.html Splinter threads: - `stats.py `__ - `''.join() again `__ How would you sum a list of numbers? Traditionally there have been two answers. One is to use the operator module and 'reduce' in the idiomatic ``reduce(operator.add, list_of_numbers)``. The other is to do a simple loop::

    running_sum = 0
    for num in list_of_numbers:
        running_sum += num

A common complaint against the 'reduce' solution is that it is just ugly.
People don't like the loop solution because it is long for such a simple operation. And a knock against both is that new users of Python do not necessarily think of either solution initially. So, what to do? Well, Alex Martelli to the rescue. Alex proposed adding a new built-in, 'sum', that would take a list of numbers and return the sum of those numbers. Originally Alex also wanted to special-case the handling of a list of strings so as to prevent having to tell new Python programmers that ``"".join(list_of_strings)`` is the best way to concatenate a bunch of strings and that looping over them is *really* bad (the amount of copying done in the loop kills performance). But this special-casing was shot down because it seemed rather magical, and the join idiom can still be taught to beginners easily enough ('reduce' tends to require an understanding of functional programming). But the function still got added for numbers. So, as of Python 2.3b1, there is a built-in named 'sum' that has the parameter list "sum(list_of_numbers [, start=0]) -> sum of the numbers in list_of_numbers". The 'start' parameter allows you to specify the initial value for your running sum. And since this is a function with a very specific use it is the fastest way you can sum a list of numbers. The question of adding a statistics module came up during this discussion. The thought was presented to come up with a good, basic stats module to have in the stdlib. The argument against this was that there are already several good stats modules out there, so why bother including one with Python? It would cause some overshadowing of any 3rd-party stats modules. Eventually the "nays" had it and the idea was dropped. And for all his work Alex got CVS commit privileges. Python, the gift that keeps on giving you more responsibility.
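The idioms being compared can be put side by side (a small sketch; note that in 2.3 ``reduce`` was a built-in, whereas in Python 3 it lives in ``functools``):

```python
import operator
from functools import reduce  # a built-in back in 2.3

list_of_numbers = [3, 1, 4, 1, 5]

# Old idiom #1: reduce with operator.add.
via_reduce = reduce(operator.add, list_of_numbers)

# Old idiom #2: an explicit loop.
running_sum = 0
for num in list_of_numbers:
    running_sum += num

# The 2.3 built-in, with the optional 'start' seed value.
assert sum(list_of_numbers) == via_reduce == running_sum == 14
assert sum(list_of_numbers, 100) == 114
```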
=) ================================== `When is it okay to cvs remove?`__ ================================== __ http://mail.python.org/pipermail/python-dev/2003-April/035011.html Related threads: - `Rules of a beta release? `__ Being probably the most inexperienced person with CVS commit privileges on Python, I am continuing with my newbie questions in terms of applying patches to the CVS tree (and since I control the Summary I am going to document the answers I get here so I don't have to write them down somewhere else =). This time I asked about when it was appropriate to use ``cvs remove``, specifically if it was reasonable if a file was completely rewritten. The answer was to not bother with it unless you are actually removing the file forever; don't bother if you are just rewriting the file. Also, don't bother with changing the version number when doing a complete rewrite; just make sure to mention in the CVS commit message that it is a rewrite. I also learned that the basic guideline to follow in terms of whether a patch should be put up on SF_ or just committed directly is that if you are unsure about the usefulness or correctness then you should post it on SF. But if you don't think there is anyone who can answer it on SF it will just languish there for eternity. Also learned the rules of a beta release. Basically no changes that would cause someone's code to not work the same way as when the beta was released can be checked in. New tests are okay, though. .. _SF: http://www.sf.net/ ========= Quickies ========= `3-way result of PyObject_IsTrue() considered PITA`__ Raymond Hettinger discovered that PyObject_IsTrue() was documented as never returning an error, which is not how the function behaves. Raymond fixed the docs to match the code. ..
__: http://mail.python.org/pipermail/python-dev/2003-April/034658.html `Python dies upon printing UNICODE using UTF-8`__ Windows NT 4's support of UTF-8 is broken. .. __: http://mail.python.org/pipermail/python-dev/2003-April/034666.html `shellwords`__ Gustavo Niemeyer asked if there was any chance of getting shellwords_ into the stdlib so as to be able to have POSIX command line word parsing. The basic response was that shlex_ should be enhanced to do what Gustavo wanted. He has since written `patch #722686`_ that implements the features he wanted. It was also discovered that distutils.util.split_quoted comes close. If someone wants to document Distutils utilities it would be greatly appreciated. .. __: http://mail.python.org/pipermail/python-dev/2003-April/034670.html .. _shellwords: http://www.crazy-compilers.com/py-lib/shellwords.html .. _shlex: http://www.python.org/dev/doc/devel/lib/module-shlex.html .. _patch #722686: http://www.python.org/sf/722686 `Changes to gettext.py for Python 2.3`__ This thread was originally covered in the `last summary`_. Barry Warsaw and Martin v. Löwis discussed the gettext_ module and whether there should be a way to coerce strings to other encodings. They ended up agreeing on defaulting to Unicode for storing the strings and having .gettext() coerce to an 8-bit string while .ugettext() returns the original Unicode string. .. __: http://mail.python.org/pipermail/python-dev/2003-April/034511.html .. _gettext: http://www.python.org/dev/doc/devel/lib/module-gettext.html `Stackless 3.0 alpha 1 at blinding speed`__ Christian Tismer has done it again; he improved Stackless_ and has now merged the abilities of Stackless 1 with 2, which has led to 3a. .. __: http://mail.python.org/pipermail/python-dev/2003-April/034708.html .. _Stackless: http://www.stackless.com `Build errors under RH9`__ Python was not building under Red Hat 9, but Martin v. Löwis checked in a fix. ..
__: http://mail.python.org/pipermail/python-dev/2003-April/034724.html `Wrappers and keywords`__ Matt LeBlanc asked why there wasn't a nice syntax for doing properties, staticmethods, and classmethods. The answer is that it was felt more important to get the ability to use those new descriptors out there instead of letting a syntax debate hold them up. .. __: http://mail.python.org/pipermail/python-dev/2003-April/034715.html `Startup overhead due to codec usage`__ MA Lemburg and Martin v. Löwis discussed startup time taken up by seeing what encoding is used by the local filesystem. .. __: http://mail.python.org/pipermail/python-dev/2003-April/034742.html `test_pwd failing`__ Initially covered in the `last summary`_. test_grp was failing for the same reasons test_pwd was failing. It has been fixed. .. __: http://mail.python.org/pipermail/python-dev/2003-April/034626.html `Evil setattr hack`__ Don't mess with an instance's __dict__ directly; we will let you, but if you get burned it's your own fault. .. __: http://mail.python.org/pipermail/python-dev/2003-April/034633.html `heapq`__ Splinter threads: - `FIFO data structure? `__ - `heaps `__ The idea of turning the heapq_ module into a class came up, and later led to the idea of having a more proper FIFO (First In, First Out) data structure. Both ideas were shot down. The reason for this was that the stdlib does not need to try to grow every single possible data structure in programming. Guido's design philosophy is to have a few very powerful data structures that other ones can be built off of. This is why the bisect_ and heapq modules just work on standard lists instead of defining a new class. Queue_ is an exception, but it is designed to mediate messages between threads instead of being a general implementation of a queue. .. __: http://mail.python.org/pipermail/python-dev/2003-April/034768.html .. _heapq: http://www.python.org/dev/doc/devel/lib/module-heapq.html ..
_bisect: http://www.python.org/dev/doc/devel/lib/module-bisect.html .. _Queue: http://www.python.org/dev/doc/devel/lib/module-Queue.html `New re failures on Windows`__ Splinter threads: - `sre vs gcc `__ The re_ module was failing after some changes were made to it. The pain of it all was that it was failing only on certain platforms running gcc_. Initial attempts were to make it "just work", but then it was stressed that it is more important to find the offending C code and figure out why gcc on certain platforms was compiling bad assembly. .. __: http://mail.python.org/pipermail/python-dev/2003-April/034776.html .. _re: http://www.python.org/dev/doc/devel/lib/module-re.html .. _gcc: http://gcc.gnu.org/ `os.path.walk() lacks 'depth first' option`__ Someone requested that os.path.walk support depth-first walking. The request was deemed not important enough to bother implementing, but Tim Peters did implement a new function named os.walk that is a much improved replacement for os.path.walk. .. __: http://mail.python.org/pipermail/python-dev/2003-April/034792.html `Weekly Python Bug/Patch Summary`__ Skip Montanaro's weekly reminder that there is work to be done! Summary for week 2 can be found `here `__. .. __: http://mail.python.org/pipermail/python-dev/2003-April/034797.html `Hook Extension Module Import?`__ Want to do something that requires a special import hook in C? Then override the __import__ built-in with what you need. .. __: http://mail.python.org/pipermail/python-dev/2003-April/034804.html `Bug/feature/patch policy for optparse.py`__ Greg Ward asked if it would be okay to keep the official version of optparse_ at http://optik.sf.net/ . Guido said sure. The justification for this is that Greg wants Optik to be available to people for use in earlier versions of Python. .. __: http://mail.python.org/pipermail/python-dev/2003-April/034833.html ..
_optparse: http://www.python.org/dev/doc/devel/lib/module-optparse.html `LynxOS4 dynamic loading with dlopen() and -ldl`__ LynxOS4 does not like dynamic linking. .. __: http://mail.python.org/pipermail/python-dev/2003-April/034878.html `Embedded python on Win2K, import failures`__ I don't like Windows. And no, this has nothing to do with this single email that is a short continuation of one covered in the `last summary`_. .. __: http://mail.python.org/pipermail/python-dev/2003-April/034506.html `New thread death in test_bsddb3`__ After Mark Hammond's new thread code got checked in, the bsddb module broke. Mark went in, though, and using the wonders of the C preprocessor and NEW_PYGILSTATE_API_EXISTS, fixed the code to use the new PyGILState API as covered in `PEP 311`_ when possible and to use the old solution when needed. .. __: http://mail.python.org/pipermail/python-dev/2003-April/034901.html `Magic number needs upgrade`__ Guido noticed that the PYC magic number needed to be incremented to handle Raymond Hettinger's new bytecode optimizations. But then Guido questioned the need of Raymond's changes. Basically Raymond's changes didn't speed anything up but cleaned up the emitted bytecode. Guido didn't like the idea of adding more code without an actual speed improvement. Since neither this code nor any of the other proposed speedup changes (CALL_ATTR and caching attribute lookup results) are panning out, Guido questioned why Raymond's should get in. Guido suggested rewriting the interpreter from scratch since all new changes seem to be breaking some delicate balance that has developed in it. He also thought about putting effort into other things like psyco_. Eventually Raymond's changes were backed out. .. __: http://mail.python.org/pipermail/python-dev/2003-April/034905.html .. _psyco: http://psyco.sf.net/ `draft PEP: Trace and Profile Support for Threads`__ Jeremy Hylton has a draft PEP on how to add hooks for profile and trace code in threads.
.. __: http://mail.python.org/pipermail/python-dev/2003-April/034909.html `Data Descriptors on module objects`__ Never going to happen. .. __: http://mail.python.org/pipermail/python-dev/2003-April/035025.html `Metatype conflict among bases?`__ "The metaclass [of a class] must be a subclass of the metaclass of all the bases" of that class. .. __: http://mail.python.org/pipermail/python-dev/2003-April/034910.html `okay to beef up tests on the maintenance branch?`__ Answer: yes! .. __: http://mail.python.org/pipermail/python-dev/2003-April/034939.html `Cryptographic stuff for 2.3`__ AM Kuchling wanted to add an implementation of the AES_ encryption algorithm to the stdlib. After a long discussion the idea was shot down because having crypto that strong in the stdlib would cause export issues for Python. .. __: http://mail.python.org/pipermail/python-dev/2003-April/034957.html .. _AES: http://csrc.nist.gov/encryption/aes/ `vacation`__ Neal Norwitz is on vacation from April 26 till May 6. He pointed out some nagging errors coming up from the `Snake Farm`_ that could use some work. .. __: http://mail.python.org/pipermail/python-dev/2003-April/034942.html .. _Snake Farm: http://www.lysator.liu.se/xenofarm/python/latest.html `test_getargs2 failures`__ Not anymore. .. __: http://mail.python.org/pipermail/python-dev/2003-April/034944.html `Democracy`__ Guido pointed out a paper on democracy (in the ancient Athenian sense) and the organization of groups at http://www.acm.org/ubiquity/interviews/b_manville_1.html that was interesting. It sparked some discussion on proper comparisons to open source projects and such. .. __: http://mail.python.org/pipermail/python-dev/2003-April/034946.html `Updating PEP 246 for type/class unification, 2.2+, etc.`__ Phillip Eby proposed some changes to `PEP 246`_. .. __: http://mail.python.org/pipermail/python-dev/2003-April/034955.html ..
_PEP 246: http://www.python.org/peps/pep-0246.html `why is test_socketserver in expected skips?`__ Skip Montanaro noticed that socketserver was listed as an expected test to be skipped on all platforms sans os2emx even though it works on all platforms with networking (basically all of them). So it was removed from the expected skip list. Skip also tweaked test_support.requires to always pass when the caller is __main__. .. __: http://mail.python.org/pipermail/python-dev/2003-April/034973.html `netrc.py`__ Bram Moolenaar, author of the `greatest editor in the world`_ and AAP_, requested a change to netrc_ that got implemented. .. __: http://mail.python.org/pipermail/python-dev/2003-April/034983.html .. _greatest editor in the world: http://www.vim.org/ .. _AAP: http://www.a-a-p.org/ .. _netrc: http://www.python.org/dev/doc/devel/lib/module-netrc.html `PyRun_* functions`__ They take FILE* arguments and it is going to stay that way. Just make sure the files are opened with the same library as being built against. .. __: http://mail.python.org/pipermail/python-dev/2003-April/034990.html `Python Developers`__ Related threads: - `Getting mouse position interms of canvas unit. `__ - `2.3b1, and object() `__ Posted to the wrong email list. .. __: http://mail.python.org/pipermail/python-dev/2003-April/034969.html `New test failure on Windows`__ re_ was failing but got fixed. .. __: http://mail.python.org/pipermail/python-dev/2003-April/035009.html `More new Windos test failures`__ Just before `Python 2.3b1`_ got pushed out the door, some last-minute test failures cropped up (some of them were my fault). But they got fixed. .. __: http://mail.python.org/pipermail/python-dev/2003-April/035047.html `should sre.Scanner be exposed through re and documented?`__ re.Scanner shall remain undocumented. .. __: http://mail.python.org/pipermail/python-dev/2003-April/035066.html `LynxOS4 port: need pre-ncurses curses!`__ The LynxOS port is hoping curses will go away. ..
__: http://mail.python.org/pipermail/python-dev/2003-April/035052.html `test_s?re merge`__ test_re and test_sre have been merged and moved over to unittest_ thanks to Skip Montanaro. .. __: http://mail.python.org/pipermail/python-dev/2003-April/035067.html .. _unittest: http://www.python.org/dev/doc/devel/lib/module-unittest.html `test_ossaudiodev hanging again`__ Some people are still having issues with ossaudiodev tests hanging. .. __: http://mail.python.org/pipermail/python-dev/2003-April/035056.html `bz2 module fails to compile on Solaris 8`__ The joys of being cross-platform. .. __: http://mail.python.org/pipermail/python-dev/2003-April/035068.html `test_logging hangs on Solaris 8`__ Splinter threads: - `test_logging hangs on OS X `__ - `test_logging hangs on Solaris 8 (and 9) `__ The joys of threading and trying to avoid deadlock. A fix has been checked in that seems to fix this on OS X; don't know about Solaris yet. .. __: http://mail.python.org/pipermail/python-dev/2003-April/035065.html `Python 2.3b1 documentation`__ Fred L. Drake, Jr. posted the documentation for Python 2.3b1. .. __: http://mail.python.org/pipermail/python-dev/2003-April/035064.html `Accepted PEPs?`__ Splinter threads: - `Reminder to PEP authors `__ - `proposed amendments to PEP 1 `__ The status of some PEPs got updated along with some proposed changes to `PEP 1`_. .. __: http://mail.python.org/pipermail/python-dev/2003-April/035104.html .. _PEP 1: http://www.python.org/peps/pep-0001.html `Problems w/ Python 2.2-maint and Redhat 9`__ Dealing with some issues of Python 2.2-maint and linking against a dbm. .. __: http://mail.python.org/pipermail/python-dev/2003-April/035120.html `Why doesn't the uu module give you the filename?`__ Someone wanted the uu_ module to let you know what the name of the encoded file is. Was told to post a patch. .. __: http://mail.python.org/pipermail/python-dev/2003-April/035129.html ..
_uu: http://www.python.org/dev/doc/devel/lib/module-uu.html `Antigen found CorruptedCompressedUuencodeFile virus`__ The joys of having to watch out for viruses in emails and getting false positives. .. __: http://mail.python.org/pipermail/python-dev/2003-April/035130.html `Python 2.3b1 has 20% slower networking?`__ Splinter threads: - `Python-Dev digest, Vol 1 #3221 - 4 msgs `__ Networking throughput in a loop did not reach as high a maximum as before. It has been fixed, though. .. __: http://mail.python.org/pipermail/python-dev/2003-April/035132.html `cvs socketmodule.c and IPV6 disabled`__ Discovered some code that couldn't compile because a test for a specific C function was not specific enough. .. __: http://mail.python.org/pipermail/python-dev/2003-April/035146.html `Introduction :)`__ Someone else with the first name of Brett introduced themselves to the list (Brett Kelly). You can tell us apart because I am taller. =) .. __: http://mail.python.org/pipermail/python-dev/2003-April/035162.html `Dictionary tuning`__ Splinter threads: - `Dictionary tuning upto 100,000 entries `__ Raymond Hettinger did a bunch of attempted tuning of dictionary accesses and came up with one solution that managed to be beneficial for large dictionaries and not detrimental for small ones. He basically just caused dictionary sizes to grow by a factor of 4 instead of 2 so as to lower the number of collisions. The objection that came up was that some dictionaries would be larger than they were previously. It looks like it will be applied, but Raymond's notes on everything will most likely end up as a text file in Python. .. __: http://mail.python.org/pipermail/python-dev/2003-April/035151.html `Thoughts on -O`__ It was suggested to change what the -O and -OO command-line switches do since at this moment they don't do much (Guido has even suggested eliminating -O). But the discussion has been partially put on hold until development for Python 2.4 starts. ..
__: http://mail.python.org/pipermail/python-dev/2003-April/035165.html `Initialization hook for extenders`__ It has been suggested to add a Py_AtInit() hook to Python to be symmetric with Py_AtExit(). The debate over this is still going. .. __: http://mail.python.org/pipermail/python-dev/2003-April/035226.html From Raymond Hettinger: NOTES ON OPTIMIZING DICTIONARIES ================================ Principal Use Cases for Dictionaries ------------------------------------ Passing keyword arguments Typically, one read and one write for 1 to 3 elements. Occurs frequently in normal Python code. Class method lookup Dictionaries vary in size with 8 to 16 elements being common. Usually written once with many lookups. When base classes are used, there are many failed lookups followed by a lookup in a base class. Instance attribute lookup and Global variables Dictionaries vary in size. 4 to 10 elements are common. Both reads and writes are common. Builtins Frequent reads. Almost never written. Size: 126 interned strings (as of Py2.3b1). A few keys are accessed much more frequently than others. Uniquification Dictionaries of any size. Bulk of work is in creation. Repeated writes to a smaller set of keys. Single read of each key. * Removing duplicates from a sequence. dict.fromkeys(seqn).keys() * Counting elements in a sequence. for e in seqn: d[e] = d.get(e, 0) + 1 * Accumulating items in a dictionary of lists. for k, v in itemseqn: d.setdefault(k, []).append(v) Membership Testing Dictionaries of any size. Created once and then rarely changed. Single write to each key. Many calls to __contains__() or has_key(). Similar access patterns occur with replacement dictionaries such as with the % formatting operator. Data Layout (assuming a 32-bit box with 64 bytes per cache line) ---------------------------------------------------------------- Small dicts (8 entries) are attached to the dictobject structure and the whole group nearly fills two consecutive cache lines.
Larger dicts use the first half of the dictobject structure (one cache line) and a separate, contiguous block of entries (at 12 bytes each for a total of 5.333 entries per cache line). Tunable Dictionary Parameters ----------------------------- * PyDict_MINSIZE. Currently set to 8. Must be a power of two. New dicts have to zero-out every cell. Each additional 8 entries consumes 1.5 cache lines. Increasing it improves the sparseness of small dictionaries but costs time to read in the additional cache lines if they are not already in cache. That case is common when keyword arguments are passed. * Maximum dictionary load in PyDict_SetItem. Currently set to 2/3. Increasing this ratio makes dictionaries more dense, resulting in more collisions. Decreasing it improves sparseness at the expense of spreading entries over more cache lines and at the cost of total memory consumed. The load test occurs in highly time-sensitive code. Efforts to make the test more complex (for example, varying the load for different sizes) have degraded performance. * Growth rate upon hitting maximum load. Currently set to *2. Raising this to *4 results in half the number of resizes, less effort to resize, better sparseness for some (but not all) dict sizes, and potentially double memory consumption depending on the size of the dictionary. Setting it to *4 eliminated every other resize step. Tune-ups should be measured across a broad range of applications and use cases. A change to any parameter will help in some situations and hurt in others. The key is to find settings that help the most common cases and do the least damage to the less common cases. Results will vary dramatically depending on the exact number of keys, whether the keys are all strings, whether reads or writes dominate, and the exact hash values of the keys (some sets of values have fewer collisions than others). Any one test or benchmark is likely to prove misleading.
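[Editor's note: the *2-versus-*4 growth tradeoff above is easy to model. The sketch below is a toy simulation of the stated policy (resize whenever the load would exceed the 2/3 maximum), not CPython's actual dictresize() code; the function name is invented for illustration.]

```python
def count_resizes(n_keys, growth=2, minsize=8, max_load=2.0 / 3.0):
    """Count table resizes while inserting n_keys distinct keys.

    Toy model of the policy described above: whenever the number of
    used slots would exceed max_load of the table size, grow the
    table by the given growth factor.  Illustration only -- not
    CPython's real resize logic.
    """
    size, resizes = minsize, 0
    for used in range(1, n_keys + 1):
        if used > size * max_load:
            while used > size * max_load:
                size *= growth
            resizes += 1
    return resizes
```

Under this model, filling 100,000 keys triggers 15 resizes with *2 growth but only 8 with *4 growth, consistent with the "half the number of resizes" observation.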
Results of Cache Locality Experiments ------------------------------------- When an entry is retrieved from memory, 4.333 adjacent entries are also retrieved into a cache line. Since accessing items in cache is *much* cheaper than a cache miss, an enticing idea is to probe the adjacent entries as a first step in collision resolution. Unfortunately, the introduction of any regularity into collision searches results in more collisions than the current random chaining approach. Exploiting cache locality at the expense of additional collisions fails to pay off when the entries are already loaded in cache (the expense is paid with no compensating benefit). This occurs in small dictionaries, where the whole dictionary fits into a pair of cache lines. It also occurs frequently in large dictionaries, which have a common access pattern where some keys are accessed much more frequently than others. The more popular entries *and* their collision chains tend to remain in cache. To exploit cache locality, change the collision resolution section in lookdict() and lookdict_string(). Set i^=1 at the top of the loop and move the i = (i << 2) + i + perturb + 1 to an unrolled version of the loop. This optimization strategy can be leveraged in several ways: * If the dictionary is kept sparse (through the tunable parameters), then the occurrence of additional collisions is lessened. * If lookdict() and lookdict_string() are specialized for small dicts and for large dicts, then the versions for large dicts can be given the alternate search without increasing the collisions in small dicts, which already have the maximum benefit of cache locality. * If the use case for the dictionary is known to have a random key access pattern (as opposed to a more common pattern with a Zipf's law distribution), then there will be more benefit for large dictionaries because any given key is no more likely than another to already be in cache.
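[Editor's note: the "random chaining" referred to above is the perturb-based recurrence from dictobject.c. Here is a pure-Python sketch of that probe sequence (assumes a non-negative hash value; slot masking follows the C code's i & mask idiom).]

```python
PERTURB_SHIFT = 5  # same constant as in dictobject.c

def probe_sequence(h, mask, count=8):
    """Yield the first `count` slots that lookdict() would probe for a
    key whose (non-negative) hash is h, in a table of size mask+1.

    Pure-Python sketch of the C collision-resolution loop: the slot
    is always i & mask, while i itself grows via i = 5*i + perturb + 1
    and perturb shifts toward zero.
    """
    i = h & mask
    perturb = h
    for _ in range(count):
        yield i & mask
        i = (i << 2) + i + perturb + 1   # i = i*5 + perturb + 1
        perturb >>= PERTURB_SHIFT
```

Once perturb shifts down to zero, the recurrence degenerates to i = 5*i + 1 (mod 2**k), which cycles through every slot, so a probe chain can never get stuck: for hash 0 in an 8-slot table the sequence is 0, 1, 6, 7, 4, 5, 2, 3 -- all eight slots.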
Optimizing the Search of Small Dictionaries ------------------------------------------- If lookdict() and lookdict_string() are specialized for smaller dictionaries, then a custom search approach can be implemented that exploits the small search space and cache locality. * The simplest example is a linear search of contiguous entries. This is simple to implement, guaranteed to terminate rapidly, and precludes the need to check for dummy entries. * A more advanced example is a self-organizing search so that the most frequently accessed entries get probed first. The organization adapts if the access pattern changes over time. * Also, small dictionaries may be made more dense, perhaps filling all eight cells to take the maximum advantage of two cache lines. Strategy Pattern ---------------- Consider allowing the user to set the tunable parameters or to select a particular search method. Since some dictionary use cases have known sizes and access patterns, the user may be able to provide useful hints. 1) For example, if membership testing or lookup dominates runtime and memory is not at a premium, the user may benefit from setting the maximum load ratio at 5% or 10% instead of the usual 66.7%. This will sharply curtail the number of collisions. 2) Dictionary creation time can be shortened in cases where the ultimate size of the dictionary is known in advance. The dictionary can be pre-sized so that *no* resize operations are required during creation. Not only does this save resizes, but the key insertion will go more quickly because the first half of the keys will be inserted into a more sparse environment than before. The preconditions for this strategy arise whenever a dictionary is created from a key or item sequence of known length. 3) If the key space is large and the access pattern is known to be random, then search strategies exploiting cache locality can be fruitful. The preconditions for this strategy arise in simulations and numerical analysis. 
4) If the keys are fixed and the access pattern strongly favors some of the keys, then the entries can be stored consecutively and accessed with a linear search. This exploits knowledge of the data, cache locality, and a simplified search routine. It also eliminates the need to test for dummy entries on each probe. The preconditions for this strategy arise in symbol tables and in the builtin dictionary. From martin@v.loewis.de Fri May 2 07:40:21 2003 From: martin@v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=) Date: 02 May 2003 08:40:21 +0200 Subject: [Python-Dev] python-dev Summary for 2003-04-16 through 2003-04-30 In-Reply-To: References: Message-ID: Brett Cannon writes: > IDNA (International Domain Name Addressing) Funnily, the "A" is for "in applications" (as opposed to "in the nameserver"/"on the wire"). Explaining the acronym as "internationalized domain names" should be sufficient. Regards, Martin From Anthony Baxter Fri May 2 09:20:17 2003 From: Anthony Baxter (Anthony Baxter) Date: Fri, 02 May 2003 18:20:17 +1000 Subject: [Python-Dev] _socket efficiencies ideas In-Reply-To: <200304091441.h39EfnU25347@odiug.zope.com> Message-ID: <200305020820.h428KIU24126@localhost.localdomain> >>> Guido van Rossum wrote > Hey, I just figured it out. The old socket module (Python 2.1 and > before) *did* special-case \d+\.\d+\.\d+\.\d+! This code was somehow > lost when the IPv6 support was added. I propose to put it back in, at > least for IPv4 (AF_INET). Patch anyone? https://sourceforge.net/tracker/index.php?func=detail&aid=731209&group_id=5470&atid=305470 Unfortunately the code still goes through the idna encoding module - this is some overhead that it would be nice to avoid for all-numeric addresses. 
Anthony From martin@v.loewis.de Fri May 2 09:58:23 2003 From: martin@v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Fri, 02 May 2003 10:58:23 +0200 Subject: [Python-Dev] _socket efficiencies ideas In-Reply-To: <200305020820.h428KIU24126@localhost.localdomain> References: <200305020820.h428KIU24126@localhost.localdomain> Message-ID: <3EB2332F.70900@v.loewis.de> Anthony Baxter wrote: > https://sourceforge.net/tracker/index.php?func=detail&aid=731209&group_id=5470&atid=305470 > > Unfortunately the code still goes through the idna encoding module - this > is some overhead that it would be nice to avoid for all-numeric addresses. That happens only if the argument is a Unicode string, no? Regards, Martin From Anthony Baxter Fri May 2 10:19:48 2003 From: Anthony Baxter (Anthony Baxter) Date: Fri, 02 May 2003 19:19:48 +1000 Subject: [Python-Dev] _socket efficiencies ideas In-Reply-To: <3EB2332F.70900@v.loewis.de> Message-ID: <200305020919.h429Jmp24632@localhost.localdomain> >>> =?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?= wrote > > Unfortunately the code still goes through the idna encoding module - this > > is some overhead that it would be nice to avoid for all-numeric addresses. > > That happens only if the argument is a Unicode string, no? Ah. That could be the case - I think I'm loading the address from an XML file in the test case I used... will fix that. Anthony From martin@v.loewis.de Fri May 2 10:55:42 2003 From: martin@v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Fri, 02 May 2003 11:55:42 +0200 Subject: [Python-Dev] _socket efficiencies ideas In-Reply-To: <200305020919.h429Jmp24632@localhost.localdomain> References: <200305020919.h429Jmp24632@localhost.localdomain> Message-ID: <3EB2409E.8000403@v.loewis.de> Anthony Baxter wrote: > Ah. That could be the case - I think I'm loading the address from an > XML file in the test case I used... will fix that. 
If you mean "I'll fix the test case to not use XML anymore" - that might be reasonable. If you mean "I'll fix the test case to convert the Unicode arguments to byte strings before passing them to the socket module", I suggest that this should not be needed: the IDNA codec should complete quickly if the Unicode string is ASCII only (perhaps not as fast as converting the string to ASCII beforehand, but not significantly slower). Regards, Martin From Jack.Jansen@cwi.nl Fri May 2 13:45:34 2003 From: Jack.Jansen@cwi.nl (Jack Jansen) Date: Fri, 2 May 2003 14:45:34 +0200 Subject: [Python-Dev] Demos and Tools in binary distributions Message-ID: There's a suggestion over on pythonmac-sig that I add the Demos and Tools directories to a binary installer for MacPython for OSX. For MacPython-OS9 I've always included these, as the OS9 installed tree was really the same layout as the source tree. But I don't really know where I should put them for OSX. How is this handled in binary installers for other platforms? I.e. if you install Python on Windows, do you get Demos and Tools? Where? And if you install an RPM or something similar on Linux? -- Jack Jansen, , http://www.cwi.nl/~jack If I can't dance I don't want to be part of your revolution -- Emma Goldman From thomas@xs4all.net Fri May 2 14:02:14 2003 From: thomas@xs4all.net (Thomas Wouters) Date: Fri, 2 May 2003 15:02:14 +0200 Subject: [Python-Dev] Demos and Tools in binary distributions In-Reply-To: References: Message-ID: <20030502130214.GG26254@xs4all.nl> On Fri, May 02, 2003 at 02:45:34PM +0200, Jack Jansen wrote: > How is this handled in binary installers for other platforms? I.e. if > you install Python on Windows, do you get Demos and Tools? Where? And > if you install an RPM or something similar on Linux? The Debian packages include Demo and Tools in /usr/share/doc/python/examples/; this is practically mandated by the Debian policy ;) -- Thomas Wouters Hi! I'm a .signature virus! 
copy me into your .signature file to help me spread! From guido@python.org Fri May 2 16:01:16 2003 From: guido@python.org (Guido van Rossum) Date: Fri, 02 May 2003 11:01:16 -0400 Subject: [Python-Dev] Demos and Tools in binary distributions In-Reply-To: "Your message of Fri, 02 May 2003 14:45:34 +0200." References: Message-ID: <200305021501.h42F1Ga02666@pcp02138704pcs.reston01.va.comcast.net> > There's a suggestion over on pythonmac-sig that I add the Demos and > Tools directories to a binary installer for MacPython for OSX. For > MacPython-OS9 I've always included these, as the OS9 installed tree > was really the same layout as the source tree. But I don't really > know where I should put them for OSX. > > How is this handled in binary installers for other platforms? I.e. if > you install Python on Windows, do you get Demos and Tools? Where? And > if you install an RPM or something similar on Linux? On Windows, you get a small selection of tools (i18n, idle, pynche, scripts, versioncheck and webchecker) but no demos, alas. --Guido van Rossum (home page: http://www.python.org/~guido/) From mwh@python.net Fri May 2 16:03:52 2003 From: mwh@python.net (Michael Hudson) Date: Fri, 02 May 2003 16:03:52 +0100 Subject: [Python-Dev] Demos and Tools in binary distributions In-Reply-To: (Jack Jansen's message of "Fri, 2 May 2003 14:45:34 +0200") References: Message-ID: <2mof2l8k6f.fsf@starship.python.net> Jack Jansen writes: > There's a suggestion over on pythonmac-sig that I add the Demos and > Tools directories to a binary installer for MacPython for OSX. For > MacPython-OS9 I've always included these, as the OS9 installed tree > was really the same layout as the source tree. But I don't really > know where I should put them for OSX. Surely this is more a question about OSX than Python? I.e. the examples should go where the user expects them. /Developer/Examples/Python? Of course, not everyone who installs Python will have the dev tools... Cheers, M. 
-- Need to Know is usually an interesting UK digest of things that happened last week or might happen next week. [...] This week, nothing happened, and we don't care. -- NTK Now, 2000-12-29, http://www.ntk.net/ From logistix@cathoderaymission.net Fri May 2 16:12:32 2003 From: logistix@cathoderaymission.net (logistix) Date: Fri, 2 May 2003 10:12:32 -0500 (CDT) Subject: [Python-Dev] Demos and Tools in binary distributions In-Reply-To: <2mof2l8k6f.fsf@starship.python.net> Message-ID: On Fri, 2 May 2003, Michael Hudson wrote: > Jack Jansen writes: > > > There's a suggestion over on pythonmac-sig that I add the Demos and > > Tools directories to a binary installer for MacPython for OSX. For > > MacPython-OS9 I've always included these, as the OS9 installed tree > > was really the same layout as the source tree. But I don't really > > know where I should put them for OSX. > > Surely this is more a question about OSX than Python? I.e. the > examples should go where the user expects them. > /Developer/Examples/Python? Of course, not everyone who installs > Python will have the dev tools... > > Cheers, > M. > Are there currently any make targets for 'tools' and 'demos'? Adding them might be a way to gently influence where they get installed when all the different distros build their packages. From skip@pobox.com Fri May 2 16:30:53 2003 From: skip@pobox.com (Skip Montanaro) Date: Fri, 2 May 2003 10:30:53 -0500 Subject: [Python-Dev] updated notes about building bsddb185 module Message-ID: <16050.36653.443229.45811@montanaro.dyndns.org> Folks, A recent thread on c.l.py about the old bsddb module and new bsddb package convinced me to add more verbiage about building the old version. If you have a moment, please take a look at http://www.python.org/2.3/highlights.html and/or README at the top of the source tree. (Search for "bsddb".) I modified them to include a brief note about building the bsddb185 module and making it appear as the default when people "import bsddb".
Feedback appreciated. Thanks, Skip From barry@python.org Fri May 2 16:44:39 2003 From: barry@python.org (Barry Warsaw) Date: 02 May 2003 11:44:39 -0400 Subject: [Python-Dev] updated notes about building bsddb185 module In-Reply-To: <16050.36653.443229.45811@montanaro.dyndns.org> References: <16050.36653.443229.45811@montanaro.dyndns.org> Message-ID: <1051890279.29805.0.camel@barry> On Fri, 2003-05-02 at 11:30, Skip Montanaro wrote: > Folks, > > An recent thread on c.l.py about the old bsddb module and new bsddb package > convinced me to add more verbiage about building the old version. If you > have a moment, please take a look at > > http://www.python.org/2.3/highlights.html > > and/or README at the top of the source tree. (Search for "bsddb".) I > modified them to include a brief note about building the bsddb185 module and > making it appear as the default when people "import bsddb". Without actually trying the recipe, the instructions seem reasonable. -Barry From dberlin@dberlin.org Fri May 2 16:58:03 2003 From: dberlin@dberlin.org (Daniel Berlin) Date: Fri, 2 May 2003 11:58:03 -0400 Subject: [Python-Dev] 2.3 broke email date parsing Message-ID: Parsing dates in emails is broken in 2.3 compared to 2.2.2. Changing parsedate_tz back to what it was in 2.2.2 fixes it. I'm not sure who or why this change was made, but it clearly doesn't handle cases it used to: (oldparseaddr is the 2.3 version with the patch at the bottom applied, which reverts it to what it was in 2.2.2) >>> import _parseaddr >>> _parseaddr.parsedate_tz("3 Mar 2001 02:04:50 -0000") >>> import oldparseaddr >>> oldparseaddr.parsedate_tz("3 Mar 2001 02:04:50 -0000") (2001, 3, 3, 2, 4, 50, 0, 0, 0, 0) >>> The problem is obvious from looking at the new code: The old version would only care if it actually found something it needed to delete. The new version assumes there *must* be a comma in the date if there is no dayname, and if there isn't, returns nothing. 
I wanted to know if this was a mistake, or done on purpose. If it's a mistake, I'll submit a patch to sourceforge to fix it.

Index: _parseaddr.py
===================================================================
RCS file: /cvsroot/python/python/dist/src/Lib/email/_parseaddr.py,v
retrieving revision 1.5
diff -u -3 -p -r1.5 _parseaddr.py
--- _parseaddr.py	17 Mar 2003 18:35:42 -0000	1.5
+++ _parseaddr.py	2 May 2003 15:42:30 -0000
@@ -49,14 +49,9 @@ def parsedate_tz(data):
     data = data.split()
     # The FWS after the comma after the day-of-week is optional, so search and
     # adjust for this.
-    if data[0].endswith(',') or data[0].lower() in _daynames:
+    if data[0][-1] in (',', '.') or data[0].lower() in _daynames:
         # There's a dayname here. Skip it
         del data[0]
-    else:
-        i = data[0].rfind(',')
-        if i < 0:
-            return None
-        data[0] = data[0][i+1:]
     if len(data) == 3:  # RFC 850 date, deprecated
         stuff = data[0].split('-')
         if len(stuff) == 3:

From just@letterror.com Fri May 2 17:20:40 2003 From: just@letterror.com (Just van Rossum) Date: Fri, 2 May 2003 18:20:40 +0200 Subject: [Python-Dev] Demos and Tools in binary distributions In-Reply-To: <2mof2l8k6f.fsf@starship.python.net> Message-ID: Michael Hudson wrote: > Surely this is more a question about OSX than Python? I.e. the > examples should go where the user expects them. > /Developer/Examples/Python? Of course, not everyone who installs > Python will have the dev tools... Actually, I didn't know until recently that 3rd party stuff sometimes gets installed there (eg. the PyObjC doco). I would actually expect it in /Applications/MacPython-2.3/..., as that's where the apps get installed. I guess /Developer/... would make sense if the Python apps got installed in /Developer/Applications/, which they don't.
Just From theller@python.net Fri May 2 17:35:36 2003 From: theller@python.net (Thomas Heller) Date: 02 May 2003 18:35:36 +0200 Subject: [Python-Dev] New thread death in test_bsddb3 In-Reply-To: References: Message-ID: "Tim Peters" writes: > [Thomas Heller] > > ... > > So is the policy now that it is no longer *allowed* to create another > > thread state, while in previous versions there wasn't any choice, > > because there existed no way to get the existing one? > > You can still create all the thread states you like; the new check is in > PyThreadState_Swap(), not in PyThreadState_New(). So you can create them, but are not allowed to use them? (Should there be a smiley here, or not, I'm not sure) > > There was always a choice, but previously Python provided no *help* in > keeping track of whether a thread already had a thread state associated with > it. That didn't stop careful apps from providing their own mechanisms to do > so. > > About policy, yes, it appears to be so now, else Mark wouldn't be raising a > fatal error . I view it as having always been the policy (from a > good-faith reading of the previous code), just a policy that was too > expensive for Python to enforce. There are many policies like that, such as > not passing goofy arguments to macros, and not letting references leak. > Python doesn't currently enforce them because it's currently too expensive > to enforce them. Over time that can change. I'm confused: what *is* the policy now? And: Has the policy *changed*, or was it simply not checked before? Since I don't know the policy, I can only guess if the fatal error is appropriate or not. If it is, there should be a 'recipe' what to do (even if it is 'use the approach outlined in PEP311'). If it is not, the error should be removed (IMO). > Clearly, I like having fatal errors for dubious things in debug builds. > Debug builds are supposed to help you debug. 
If the fatal error here drives > you insane, and you don't want to repair the app code, No, not at all. Thanks, Thomas From martin@v.loewis.de Fri May 2 18:01:51 2003 From: martin@v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Fri, 02 May 2003 19:01:51 +0200 Subject: [Python-Dev] Re: [Pydotorg] updated notes about building bsddb185 module In-Reply-To: <16050.36653.443229.45811@montanaro.dyndns.org> References: <16050.36653.443229.45811@montanaro.dyndns.org> Message-ID: <3EB2A47F.8000706@v.loewis.de> Skip Montanaro wrote: > Feedback appreciated. I think we need to build bsddb185 automatically under certain conditions. I have encouraged a user to submit a patch in that direction. Regards, Martin From skip@pobox.com Fri May 2 18:34:40 2003 From: skip@pobox.com (Skip Montanaro) Date: Fri, 2 May 2003 12:34:40 -0500 Subject: [Python-Dev] Re: [Pydotorg] updated notes about building bsddb185 module In-Reply-To: <3EB2A47F.8000706@v.loewis.de> References: <16050.36653.443229.45811@montanaro.dyndns.org> <3EB2A47F.8000706@v.loewis.de> Message-ID: <16050.44080.588636.503705@montanaro.dyndns.org> Martin> Skip Montanaro wrote: >> Feedback appreciated. Martin> I think we need to build bsddb185 automatically under certain Martin> conditions. I have encouraged a user to submit a patch in that Martin> direction. I suppose that's an alternative; however, it is complicated by a couple of issues: * The bsddb185 module would have to be built as bsddb (not a big deal in and of itself). * The current bsddb package directory would have to be renamed or not installed to avoid name clashes. I don't think there's a precedent for the second issue. The make install target installs everything in Lib. I think the decision about whether the package or the module gets installed would be made in setup.py. The coupling between the two increases the complexity of the process. I smell an ugly hack in the offing.
Skip From tim.one@comcast.net Fri May 2 18:55:05 2003 From: tim.one@comcast.net (Tim Peters) Date: Fri, 02 May 2003 13:55:05 -0400 Subject: [Python-Dev] New thread death in test_bsddb3 In-Reply-To: Message-ID: [Thomas Heller] >>> ... >>> So is the policy now that it is no longer *allowed* to create another >>> thread state, while in previous versions there wasn't any choice, >>> because there existed no way to get the existing one? [Tim] >> You can still create all the thread states you like; the new check is >> in PyThreadState_Swap(), not in PyThreadState_New(). [Thomas] > So you can create them, Yes. > but are not allowed to use them? Currently, no more than one at a time per thread. The API doesn't appear to preclude using multiple thread states with a single thread if the right dances are performed. Offhand I don't know why someone would want to, but people want to do a lot of silly things . > (Should there be a smiley here, or not, I'm not sure) No. > ... > I'm confused: what *is* the policy now? > And: Has the policy *changed*, or was it simply not checked before? I already gave you my best guesses about those (no, yes). > Since I don't know the policy, I can only guess if the fatal error is > appropriate or not. Ditto (yes). > If it is, there should be a 'recipe' what to do (even if it is 'use the > approach outlined in PEP311'). Additions to NEWS and the PEP would be fine by me. > If it is not, the error should be removed (IMO). Sure. From tim.one@comcast.net Fri May 2 20:28:41 2003 From: tim.one@comcast.net (Tim Peters) Date: Fri, 02 May 2003 15:28:41 -0400 Subject: [Python-Dev] Draft of dictnotes.txt [Please Comment] In-Reply-To: <000901c3106b$0d549d20$125ffea9@oemcomputer> Message-ID: [Raymond Hettinger] > NOTES ON OPTIMIZING DICTIONARIES > ================================ > ... Very nice! Please check it in. 
From tim.one@comcast.net Fri May 2 20:59:40 2003 From: tim.one@comcast.net (Tim Peters) Date: Fri, 02 May 2003 15:59:40 -0400 Subject: [Python-Dev] python-dev Summary for 2003-04-16 through 2003-04-30 In-Reply-To: Message-ID: [Brett Cannon] > ... > But the function still got added for numbers. So, as of Python 2.3b1, > there is a built-in named 'sum' that has the parameter list > "sum(list_of_numbers [, start=0]) -> sum of the numbers in > list_of_numbers". The 'start' parameter allows you to specify where to > start in the list for your running sum. list_of_numbers is really any iterable producing numbers. All the numbers are added ("start" doesn't affect that), as if via

    def sum(seq, start=0):
        result = start
        for x in seq:
            result += x
        return result

The best use for start is if you're summing a sequence of number-like arguments that can't be added to the integer 0 (datetime.timedelta is an example). > ... > Python, the gift that keeps on giving you more responsibility. =) Speaking of which, your PSF dues for April are overdue. > ... > `os.path.walk() lacks 'depth first' option`__ > Someone requested that os.path.walk support depth-first walking. This was a terminology confusion: os.path.walk() always did depth-first walking, and so does the new os.walk(). The missing bit was an option to control whether directories are delivered in preorder ("top down") or postorder ("bottom up") during the depth-first walk. > The request was deemed not important enough to bother implementing, A topdown flag is implemented in os.walk(). From Jack.Jansen@oratrix.com Fri May 2 22:52:08 2003 From: Jack.Jansen@oratrix.com (Jack Jansen) Date: Fri, 2 May 2003 23:52:08 +0200 Subject: [Python-Dev] Demos and Tools in binary distributions In-Reply-To: Message-ID: <51F6BD90-7CE8-11D7-A7DC-000A27B19B96@oratrix.com> On vrijdag, mei 2, 2003, at 18:20 Europe/Amsterdam, Just van Rossum wrote: > Michael Hudson wrote: > >> Surely this is more a question about OSX than Python? I.e.
the >> examples should go where the user expects them. >> /Developer/Examples/Python? Of course, not everyone who installs >> Python will have the dev tools... > > Actually, I didn't know until recently that 3rd party stuff sometimes > gets installed there (eg. the PyObjC doco). I would actually expect it > in /Applications/MacPython-2.3/..., as that's where the apps get > installed. I guess /Developer/... would make sense if the Python apps > got installed in /Developer/Applications/, which they don't. I'm also tempted to go with /Applications/MacPython-2.3/Demo and .../Tools. That is what a lot of Mac applications do. It has a slight problem, though: it would look unintuitive to a pure-unix user. But as there isn't a standard location for this on unix anyway: who cares. A slightly more serious problem is that the READMEs in Tools and Demo aren't really meant for the 100%-novice, and a prominent location at the top of the /Applications/MacPython-2.3 folder will make it almost-100%-certain that these files are going to be among the first they read. I could put Demo and Tools one level deeper (in an Extras folder?) and provide a readme there explaining that these demos and tools are for all Pythons on all platforms, so may not work and/or may not be intelligible in the first place. -- - Jack Jansen http://www.cwi.nl/~jack - - If I can't dance I don't want to be part of your revolution -- Emma Goldman - From just@letterror.com Fri May 2 23:08:48 2003 From: just@letterror.com (Just van Rossum) Date: Sat, 3 May 2003 00:08:48 +0200 Subject: [Python-Dev] Demos and Tools in binary distributions In-Reply-To: <51F6BD90-7CE8-11D7-A7DC-000A27B19B96@oratrix.com> Message-ID: Jack Jansen wrote: > I could put Demo and Tools one level deeper (in an Extras folder?) > and provide a readme there explaining that these demos and tools are > for all Pythons on all platforms, so may not work and/or may not be > intelligible in the first place.
+1 Just From martin@v.loewis.de Sat May 3 00:39:51 2003 From: martin@v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 03 May 2003 01:39:51 +0200 Subject: [Python-Dev] New thread death in test_bsddb3 In-Reply-To: References: Message-ID: <3EB301C7.5000508@v.loewis.de> Tim Peters wrote: > Currently, no more than one at a time per thread. The API doesn't appear to > preclude using multiple thread states with a single thread if the right > dances are performed. Offhand I don't know why someone would want to, but > people want to do a lot of silly things . There are many good reasons; here is one scenario: Application A calls embedded Python. It creates thread state T1 to do so. Python calls library L1, which releases GIL. L1 calls L2. L2 calls back into Python. To do so, it allocates a new thread state, and acquires the GIL. All in one thread. L2 has no idea that A has already allocated a thread state for this thread. With the new API, L2 does not need any longer to create a thread state. However, in older Python releases, this was necessary, so libraries do such things. It is unfortunate that these libraries now break, and I wish the new API would not be enforced so strictly yet. > I already gave you my best guesses about those (no, yes). I think your guess is wrong: In the past, it was often *necessary* to have multiple thread states allocated for a single thread. There was simply no other option. So it can't be that this was not allowed. Regards, Martin From skip@pobox.com Sat May 3 00:49:31 2003 From: skip@pobox.com (Skip Montanaro) Date: Fri, 2 May 2003 18:49:31 -0500 Subject: [Python-Dev] removing csv directory from nondist/sandbox - how? Message-ID: <16051.1035.821998.148196@montanaro.dyndns.org> Taking a cue from Raymond's sandbox/itertools cleanup, I cvs removed the contents of sandbox/csv just now. How do I get rid of the sandbox/csv directory itself? I see that the itertools directory remains as well, even though I executed "cvs -dP ." 
from the sandbox directory. Skip From martin@v.loewis.de Sat May 3 00:32:24 2003 From: martin@v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 03 May 2003 01:32:24 +0200 Subject: [Python-Dev] Re: [Pydotorg] updated notes about building bsddb185 module In-Reply-To: <16050.44080.588636.503705@montanaro.dyndns.org> References: <16050.36653.443229.45811@montanaro.dyndns.org> <3EB2A47F.8000706@v.loewis.de> <16050.44080.588636.503705@montanaro.dyndns.org> Message-ID: <3EB30008.4010603@v.loewis.de> Skip Montanaro wrote: [building bsddb185] > I suppose that's an alternative, however, it is complicated by a couple > issues: > > * The bsddb185 module would have to be built as bsddb (not a big deal in > and of itself). Why is that? I propose to build the bsddb185 module as bsddb185. It does not support being built as bsddb[module]. > * The current bsddb package directory would have to be renamed or not > installed to avoid name clashes. I suggest no such thing, and I agree that this would not be desirable. Regards, Martin From skip@pobox.com Sat May 3 01:11:53 2003 From: skip@pobox.com (Skip Montanaro) Date: Fri, 2 May 2003 19:11:53 -0500 Subject: [Python-Dev] Re: [Pydotorg] updated notes about building bsddb185 module In-Reply-To: <3EB30008.4010603@v.loewis.de> References: <16050.36653.443229.45811@montanaro.dyndns.org> <3EB2A47F.8000706@v.loewis.de> <16050.44080.588636.503705@montanaro.dyndns.org> <3EB30008.4010603@v.loewis.de> Message-ID: <16051.2377.270099.748537@montanaro.dyndns.org> Skip> I suppose that's an alternative, however, it is complicated by a Skip> couple issues: Skip> Skip> * The bsddb185 module would have to be built as bsddb (not a big Skip> deal in and of itself). Martin> Why is that? I propose to build the bsddb185 module as Martin> bsddb185. It does not support being built as bsddb[module]. Skip> * The current bsddb package directory would have to be renamed or Skip> not installed to avoid name clashes. 
Martin> I suggest no such thing, and I agree that this would not be Martin> desirable. My apologies, Martin. I guess I misunderstood what you suggested. (I suspect Nick Vargish may have as well.) My interpretation of his complaint is that he doesn't have a functioning bsddb module and wants the old module back. He wants to be able to install Python and have "bsddb" be the module. As currently constituted, I think Modules/bsddbmodule.c can only be built as "bsddb185" because of the symbols in the file. How can Nick build that as "bsddb"? Furthermore, how can you guarantee that the bsddb package directory won't be found before the bsddb module during a module search (short, perhaps of statically linking the module into the interpreter)? Skip From pje@telecommunity.com Sat May 3 01:29:12 2003 From: pje@telecommunity.com (Phillip J. Eby) Date: Fri, 02 May 2003 20:29:12 -0400 Subject: [Python-Dev] removing csv directory from nondist/sandbox - how? In-Reply-To: <16051.1035.821998.148196@montanaro.dyndns.org> Message-ID: <5.1.0.14.0.20030502202821.02563020@mail.telecommunity.com> At 06:49 PM 5/2/03 -0500, Skip Montanaro wrote: >Taking a cue from Raymond's sandbox/itertools cleanup, I cvs removed the >contents of sandbox/csv just now. How do I get rid of the sandbox/csv >directory itself? I see that the itertools directory remains as well, even >though I executed "cvs -dP ." from the sandbox directory. You can't remove directories from a CVS server unless you have direct access to it. And if you remove the directory, its history goes with it. 
From martin@v.loewis.de Sat May 3 01:25:03 2003 From: martin@v.loewis.de ("Martin v. Löwis") Date: Sat, 03 May 2003 02:25:03 +0200 Subject: [Python-Dev] Re: [Pydotorg] updated notes about building bsddb185 module In-Reply-To: <16051.2377.270099.748537@montanaro.dyndns.org> References: <16050.36653.443229.45811@montanaro.dyndns.org> <3EB2A47F.8000706@v.loewis.de> <16050.44080.588636.503705@montanaro.dyndns.org> <3EB30008.4010603@v.loewis.de> <16051.2377.270099.748537@montanaro.dyndns.org> Message-ID: <3EB30C5F.90801@v.loewis.de> Skip Montanaro wrote: > My apologies, Martin. I guess I misunderstood what you suggested. (I > suspect Nick Vargish may have as well.) My interpretation of his complaint > is that he doesn't have a functioning bsddb module and wants the old module > back. That's the larger of his complaints. There is also a subcomplaint: Building the new bsddb185 module is not automatic, so he has to give explicit instructions to his admins. > He wants to be able to install Python and have "bsddb" be the module. He would want it that way. However, he could also accept importing bsddb185 as bsddb. He cannot accept having to edit Modules/Setup, and he cannot accept building Sleepycat [34].x. > As currently constituted, I think Modules/bsddbmodule.c can only be built as > "bsddb185" because of the symbols in the file. How can Nick build that as > "bsddb"? He can't. He can build it as bsddb185. However, his complaint is that setup.py doesn't do that for him. > Furthermore, how can you guarantee that the bsddb package > directory won't be found before the bsddb module during a module search > (short, perhaps of statically linking the module into the interpreter)? I don't think the module should be bsddb; I renamed the init function on purpose. All I'm suggesting is that it be automatically built by setup.py. People can accept changing their Python code. They cannot accept having to ask more favours from their sysadmins.
Regards, Martin From andymac@bullseye.apana.org.au Fri May 2 23:45:27 2003 From: andymac@bullseye.apana.org.au (Andrew MacIntyre) Date: Sat, 3 May 2003 09:45:27 +1100 (edt) Subject: [Python-Dev] Re: [Pydotorg] updated notes about building bsddb185 module In-Reply-To: <16050.44080.588636.503705@montanaro.dyndns.org> Message-ID: On Fri, 2 May 2003, Skip Montanaro wrote: > Martin> Skip Montanaro wrote: > >> Feedback appreciated. > > Martin> I think we need to build bsddb185 automatically under certain > Martin> conditions. I have encouraged a user to submit a patch in that > Martin> direction. > > I suppose that's an alternative, however, it is complicated by a couple > issues: > > * The bsddb185 module would have to be built as bsddb (not a big deal in > and of itself). > > * The current bsddb package directory would have to be renamed or not > installed to avoid name clashes. > > I don't think there's a precedent for the second issue. The make install > target installs everything in Lib. I think the decision about whether the > package or the module gets installed would be made in setup.py. The > coupling between the two increases the complexity of the process. I smell > an ugly hack in the offing. Could you not have the following? - build bsddb if the Sleepycat libraries are found; - build bsddb185 if the DB 1.85 libraries can be found; - where bsddb is imported, try importing bsddb, and if that fails try importing bsddb185 as bsddb (or as * inside the bsddb pkg). -- Andrew I MacIntyre "These thoughts are mine alone..."
E-mail: andymac@bullseye.apana.org.au | Snail: PO Box 370 andymac@pcug.org.au | Belconnen ACT 2616 Web: http://www.andymac.org/ | Australia From barry@python.org Sat May 3 03:02:45 2003 From: barry@python.org (Barry Warsaw) Date: 02 May 2003 22:02:45 -0400 Subject: [Python-Dev] Re: [Pydotorg] updated notes about building bsddb185 module In-Reply-To: <3EB30008.4010603@v.loewis.de> References: <16050.36653.443229.45811@montanaro.dyndns.org> <3EB2A47F.8000706@v.loewis.de> <16050.44080.588636.503705@montanaro.dyndns.org> <3EB30008.4010603@v.loewis.de> Message-ID: <1051927365.4302.3.camel@anthem> On Fri, 2003-05-02 at 19:32, "Martin v. Löwis" wrote: > [building bsddb185] > > I suppose that's an alternative, however, it is complicated by a couple > > issues: > > > > * The bsddb185 module would have to be built as bsddb (not a big deal in > > and of itself). > > Why is that? I propose to build the bsddb185 module as bsddb185. It does > not support being built as bsddb[module]. > > > * The current bsddb package directory would have to be renamed or not > > installed to avoid name clashes. > > I suggest no such thing, and I agree that this would not be desirable. I totally agree with Martin. Make bsddb185 explicit and do not masquerade it as bsddb by default. -Barry From barry@python.org Sat May 3 03:04:17 2003 From: barry@python.org (Barry Warsaw) Date: 02 May 2003 22:04:17 -0400 Subject: [Python-Dev] removing csv directory from nondist/sandbox - how? In-Reply-To: <16051.1035.821998.148196@montanaro.dyndns.org> References: <16051.1035.821998.148196@montanaro.dyndns.org> Message-ID: <1051927457.4302.5.camel@anthem> On Fri, 2003-05-02 at 19:49, Skip Montanaro wrote: > Taking a cue from Raymond's sandbox/itertools cleanup, I cvs removed the > contents of sandbox/csv just now. How do I get rid of the sandbox/csv > directory itself? I see that the itertools directory remains as well, even > though I executed "cvs -dP ." from the sandbox directory.
Check to make sure you don't have any dot-files left in the directory. -P should definitely zap it if there's nothing in there. You really don't want to remove the directory from the repository (for a number of reasons). -Barry From tim.one@comcast.net Sat May 3 03:49:04 2003 From: tim.one@comcast.net (Tim Peters) Date: Fri, 02 May 2003 22:49:04 -0400 Subject: [Python-Dev] New thread death in test_bsddb3 In-Reply-To: <3EB301C7.5000508@v.loewis.de> Message-ID: [Martin v. Löwis] > There are many good reasons; here is one scenario: > > Application A calls embedded Python. It creates thread state T1 to do > so. Python calls library L1, which releases the GIL. L1 calls L2. L2 calls > back into Python. To do so, it allocates a new thread state, and > acquires the GIL. All in one thread. > > L2 has no idea that A has already allocated a thread state for this > thread. With the new API, L2 no longer needs to create a thread > state. However, in older Python releases, this was necessary, so > libraries do such things. I understand that some people did this (we've bumped into two so far, right?), but don't agree it was necessary: the thrust of Mark's new code is to make this easy to do in a uniform way, but people could (and did) build their own layers of TLS-based Python wrappers before this (Mark is one of them; a former employer of mine is another). AFAIK, though, these were cases where multiple libraries agreed to cooperate. I don't really care anymore, since there's a standard way to do this now. > It is unfortunate that these libraries now break, and I wish the new > API would not be enforced so strictly yet. If it were enforced in a release build I'd agree, but it isn't -- a release build enforces nothing new here, and I want to be punched in the groin when a debug build spots dubious practice. >> I already gave you my best guesses about those (no, yes).
> I think your guess is wrong: In the past, it was often *necessary* to > have multiple thread states allocated for a single thread. There was > simply no other option. So it can't be that this was not allowed. It's a new world now -- let's get on with it. Fighting for the right to retain lame code (judged by current stds, whether or not it was lame before) isn't a cause I'll sign up for, and especially not when it's in an extremely error-prone area of the C API, and certainly not when it's so easy to repair too. But if you're determined to let slop slide in the debug build, check in a change to stop the warning -- it's not important enough to me to keep arguing about it. I don't think you'd be doing anyone a real favor, and I'll leave it at that. From martin@v.loewis.de Sat May 3 02:52:41 2003 From: martin@v.loewis.de ("Martin v. Löwis") Date: Sat, 03 May 2003 03:52:41 +0200 Subject: [Python-Dev] Re: [Pydotorg] updated notes about building bsddb185 module In-Reply-To: References: Message-ID: <3EB320E9.4020409@v.loewis.de> Andrew MacIntyre wrote: > Could you not have the following? > - build bsddb if the Sleepycat libraries are found; That is happening now. > - build bsddb185 if the DB 1.85 libraries can be found; That is what I'm proposing. Volunteers should step forward. > - where bsddb is imported, try importing bsddb, and if that > fails try importing bsddb185 as bsddb (or as * inside the bsddb pkg). I'm strongly opposed to that. Users of bsddb185 need to make an explicit choice that they want to use that library. Otherwise, we would have to deal with the bug reports resulting from the brokenness of the library forever. Regards, Martin From tim.one@comcast.net Sat May 3 04:16:35 2003 From: tim.one@comcast.net (Tim Peters) Date: Fri, 02 May 2003 23:16:35 -0400 Subject: [Python-Dev] removing csv directory from nondist/sandbox - how?
In-Reply-To: <16051.1035.821998.148196@montanaro.dyndns.org> Message-ID: [Skip Montanaro] > Taking a cue from Raymond's sandbox/itertools cleanup, I cvs removed > the contents of sandbox/csv just now. How do I get rid of the > sandbox/csv directory itself? I see that the itertools directory > remains as well, even though I executed "cvs -dP ." from the sandbox > directory. -P won't remove a directory if there's any file remaining in the directory that wasn't checked in. This includes dot files (as Barry said), .rej files left behind by old rejected patches, temp scripts or output files you may have created, or a build directory created by setup.py. I had to get rid of all of those before CVS deleted my csv directory (normally I just do deltree (rm -rf) on a dead directory, and CVS won't recreate it then, but I did it by hand this time just to verify how -P works). From noah@noah.org Sat May 3 11:36:53 2003 From: noah@noah.org (Noah Spurrier) Date: Sat, 03 May 2003 03:36:53 -0700 Subject: [Python-Dev] posixmodule.c patch to support forkpty (patch against posixmodule.c Revision 2.241.2.1) Message-ID: <3EB39BC5.50702@noah.org> This is a multi-part message in MIME format. --------------020003030503000503080704 Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit Hi, I have been taking a hard look at Python 2.3b and support for pseudo-ttys seems to be much better. It looks like os.openpty() was updated to provide support for a wider range of pseudo ttys. Unfortunately os.forkpty() was not also updated. I am attaching a patch that allows os.forkpty() to run on the same platforms that os.openpty supports. In other words, os.forkpty() will use os.fork() and os.openpty() for platforms that don't already have forkpty(). Note that since pty module calls os.forkpty this patch will also allow pty.fork() to work properly on more platforms. Most importantly to me, this patch will allow os.forkpty() to work with Solaris. 
This patch was diffed against posixmodule.c Revision 2.241.2.1 Python 2.3b. This patch moves most of the logic out of the posix_openpty() C function into a function that can be shared by both posix_openpty() and posix_forkpty(). Although the posix_openpty() logic was moved it was unchanged. I think I kept the code neat despite all the messy #if's that always accompany pty code. I am also attaching a test script, test_forkpty.py (based on test_openpty.py), that tests the basic ability to fork and read and write a pty. I am testing it with my Pexpect module which makes heavy use of the pty module. With the patch Pexpect passes all my unit tests on Solaris. Pexpect has been tested on Linux, OpenBSD, Solaris, and Cygwin. I'm looking for an OS X server to test with. Yours, Noah --------------020003030503000503080704 Content-Type: text/plain; name="test_forkpty.py" Content-Transfer-Encoding: 7bit Content-Disposition: inline; filename="test_forkpty.py" #!/usr/bin/env python2.3 import os, sys, time verbose = 1 try: if verbose: print "Calling os.forkpty()" pid, fd = os.forkpty() if verbose: print "(pid, fd) = (%d, %d)"%(pid, fd) except AttributeError: raise TestSkipped, "No forkpty() available." if pid == 0: # child print "I am not a robot!" 
sys.stdout.flush(0) else: time.sleep(1) print "The robot says: ", os.read(fd,100) os.close(fd) --------------020003030503000503080704 Content-Type: text/plain; name="posixmodule.c.patch" Content-Transfer-Encoding: 7bit Content-Disposition: inline; filename="posixmodule.c.patch" *** posixmodule.c Tue Apr 22 22:39:17 2003 --- new.posixmodule.c Sat May 3 06:11:04 2003 *************** *** 2597,2685 **** #endif /* defined(HAVE_OPENPTY) || defined(HAVE_FORKPTY) || defined(HAVE_DEV_PTMX */ #if defined(HAVE_OPENPTY) || defined(HAVE__GETPTY) || defined(HAVE_DEV_PTMX) - PyDoc_STRVAR(posix_openpty__doc__, - "openpty() -> (master_fd, slave_fd)\n\n\ - Open a pseudo-terminal, returning open fd's for both master and slave end.\n"); - static PyObject * ! posix_openpty(PyObject *self, PyObject *noargs) { ! int master_fd, slave_fd; #ifndef HAVE_OPENPTY ! char * slave_name; #endif #if defined(HAVE_DEV_PTMX) && !defined(HAVE_OPENPTY) && !defined(HAVE__GETPTY) ! PyOS_sighandler_t sig_saved; #ifdef sun ! extern char *ptsname(); #endif #endif #ifdef HAVE_OPENPTY ! if (openpty(&master_fd, &slave_fd, NULL, NULL, NULL) != 0) ! return posix_error(); #elif defined(HAVE__GETPTY) ! slave_name = _getpty(&master_fd, O_RDWR, 0666, 0); ! if (slave_name == NULL) ! return posix_error(); ! slave_fd = open(slave_name, O_RDWR); ! if (slave_fd < 0) ! return posix_error(); #else ! master_fd = open(DEV_PTY_FILE, O_RDWR | O_NOCTTY); /* open master */ ! if (master_fd < 0) ! return posix_error(); ! sig_saved = signal(SIGCHLD, SIG_DFL); ! /* change permission of slave */ ! if (grantpt(master_fd) < 0) { ! signal(SIGCHLD, sig_saved); ! return posix_error(); ! } ! /* unlock slave */ ! if (unlockpt(master_fd) < 0) { ! signal(SIGCHLD, sig_saved); ! return posix_error(); ! } ! signal(SIGCHLD, sig_saved); ! slave_name = ptsname(master_fd); /* get name of slave */ ! if (slave_name == NULL) ! return posix_error(); ! slave_fd = open(slave_name, O_RDWR | O_NOCTTY); /* open slave */ ! if (slave_fd < 0) ! 
return posix_error(); #if !defined(__CYGWIN__) && !defined(HAVE_DEV_PTC) ! ioctl(slave_fd, I_PUSH, "ptem"); /* push ptem */ ! ioctl(slave_fd, I_PUSH, "ldterm"); /* push ldterm */ #ifndef __hpux ! ioctl(slave_fd, I_PUSH, "ttcompat"); /* push ttcompat */ #endif /* __hpux */ #endif /* HAVE_CYGWIN */ #endif /* HAVE_OPENPTY */ ! return Py_BuildValue("(ii)", master_fd, slave_fd); } #endif /* defined(HAVE_OPENPTY) || defined(HAVE__GETPTY) || defined(HAVE_DEV_PTMX) */ ! #ifdef HAVE_FORKPTY PyDoc_STRVAR(posix_forkpty__doc__, "forkpty() -> (pid, master_fd)\n\n\ Fork a new process with a new pseudo-terminal as controlling tty.\n\n\ Like fork(), return 0 as pid to child process, and PID of child to parent.\n\ To both, return fd of newly opened pseudo-terminal.\n"); - static PyObject * posix_forkpty(PyObject *self, PyObject *noargs) { ! int master_fd, pid; ! pid = forkpty(&master_fd, NULL, NULL, NULL); ! if (pid == -1) ! return posix_error(); ! if (pid == 0) ! PyOS_AfterFork(); ! return Py_BuildValue("(ii)", pid, master_fd); } #endif --- 2597,2784 ---- #endif /* defined(HAVE_OPENPTY) || defined(HAVE_FORKPTY) || defined(HAVE_DEV_PTMX */ #if defined(HAVE_OPENPTY) || defined(HAVE__GETPTY) || defined(HAVE_DEV_PTMX) static PyObject * ! __shared_openpty (int * out_master_fd, int * out_slave_fd) { ! int master_fd, slave_fd; #ifndef HAVE_OPENPTY ! char * slave_name; #endif #if defined(HAVE_DEV_PTMX) && !defined(HAVE_OPENPTY) && !defined(HAVE__GETPTY) ! PyOS_sighandler_t sig_saved; #ifdef sun ! extern char *ptsname(); #endif #endif #ifdef HAVE_OPENPTY ! if (openpty(&master_fd, &slave_fd, NULL, NULL, NULL) != 0) ! return posix_error(); #elif defined(HAVE__GETPTY) ! slave_name = _getpty(&master_fd, O_RDWR, 0666, 0); ! if (slave_name == NULL) ! return posix_error(); ! slave_fd = open(slave_name, O_RDWR); ! if (slave_fd < 0) ! return posix_error(); #else ! master_fd = open(DEV_PTY_FILE, O_RDWR | O_NOCTTY); ! if (master_fd < 0){ ! return posix_error(); ! } ! 
sig_saved = signal(SIGCHLD, SIG_DFL); ! /* change permission of slave */ ! if (grantpt(master_fd) < 0) { ! signal(SIGCHLD, sig_saved); ! return posix_error(); ! } ! /* unlock slave */ ! if (unlockpt(master_fd) < 0) { ! signal(SIGCHLD, sig_saved); ! return posix_error(); ! } ! signal(SIGCHLD, sig_saved); ! slave_name = ptsname(master_fd); ! if (slave_name == NULL){ ! return posix_error(); ! } ! slave_fd = open(slave_name, O_RDWR | O_NOCTTY); ! if (slave_fd < 0){ ! return posix_error(); ! } #if !defined(__CYGWIN__) && !defined(HAVE_DEV_PTC) ! ioctl(slave_fd, I_PUSH, "ptem"); /* push ptem */ ! ioctl(slave_fd, I_PUSH, "ldterm"); /* push ldterm */ #ifndef __hpux ! ioctl(slave_fd, I_PUSH, "ttcompat"); /* push ttcompat */ #endif /* __hpux */ #endif /* HAVE_CYGWIN */ #endif /* HAVE_OPENPTY */ ! *out_master_fd = master_fd; ! *out_slave_fd = slave_fd; ! return Py_BuildValue("(ii)", master_fd, slave_fd); ! } + PyDoc_STRVAR(posix_openpty__doc__, + "openpty() -> (master_fd, slave_fd)\n\n\ + Open a pseudo-terminal, returning open fd's for both master and slave end.\n"); + static PyObject * + posix_openpty(PyObject *self, PyObject *noargs) + { + int master_fd; + int slave_fd; + + return __shared_openpty (& master_fd, & slave_fd); } #endif /* defined(HAVE_OPENPTY) || defined(HAVE__GETPTY) || defined(HAVE_DEV_PTMX) */ ! /* Use forkpty if available. For platform that don't have it I try to define it. */ ! #if defined(HAVE_FORKPTY) || (defined(HAVE_FORK) && (defined(HAVE_OPENPTY) || defined(HAVE__GETPTY) || defined(HAVE_DEV_PTMX))) PyDoc_STRVAR(posix_forkpty__doc__, "forkpty() -> (pid, master_fd)\n\n\ Fork a new process with a new pseudo-terminal as controlling tty.\n\n\ Like fork(), return 0 as pid to child process, and PID of child to parent.\n\ To both, return fd of newly opened pseudo-terminal.\n"); static PyObject * posix_forkpty(PyObject *self, PyObject *noargs) { ! #ifdef HAVE_FORKPTY /* The easy one */ ! int master_fd, pid; ! pid = forkpty(&master_fd, NULL, NULL, NULL); ! 
#else /* The hard one */ ! int master_fd, pid; ! int slave_fd; ! char * slave_name; ! int fd; ! ! __shared_openpty (& master_fd, & slave_fd); ! if (master_fd < 0 || slave_fd < 0) ! { ! return posix_error(); ! } ! slave_name = ptsname(master_fd); ! pid = fork(); ! switch (pid) { ! case -1: ! return posix_error(); ! case 0: /* Child */ ! ! #ifdef TIOCNOTTY ! /* Explicitly close the old controlling terminal. ! Some platforms require an explicit detach of the current controlling tty ! before we close stdin, stdout, stderr. ! OpenBSD says that this is obsolete, but doesn't hurt. */ ! fd = open("/dev/tty", O_RDWR | O_NOCTTY); ! if (fd >= 0) { ! (void) ioctl(fd, TIOCNOTTY, (char *)0); ! close(fd); ! } ! #endif /* TIOCNOTTY */ ! ! /* The setsid() system call will place the process into its own session ! which has the effect of disassociating it from the controlling terminal. ! This is known to be true for OpenBSD. ! */ ! if (setsid() < 0){ ! return posix_error(); ! } ! ! ! /* Verify that we are disconnected from the controlling tty. */ ! fd = open("/dev/tty", O_RDWR | O_NOCTTY); ! if (fd >= 0) { ! close(fd); ! return posix_error(); ! } ! ! #ifdef TIOCSCTTY ! /* Make the pseudo terminal the controlling terminal for this process ! (the process must not currently have a controlling terminal). ! */ ! if (ioctl(slave_fd, TIOCSCTTY, (char *)0) < 0){ ! return posix_error(); ! } ! #endif /* TIOCSCTTY */ ! ! /* Verify that we can open to the slave pty file. */ ! fd = open(slave_name, O_RDWR); ! if (fd < 0){ ! return posix_error(); ! } ! else ! close(fd); ! ! /* Verify that we now have a controlling tty. */ ! fd = open("/dev/tty", O_WRONLY); ! if (fd < 0){ ! return posix_error(); ! } ! else { ! close(fd); ! } ! ! (void) close(master_fd); ! (void) dup2(slave_fd, 0); ! (void) dup2(slave_fd, 1); ! (void) dup2(slave_fd, 2); ! if (slave_fd > 2) ! (void) close(slave_fd); ! pid = 0; ! break; ! default: ! /* PARENT */ ! (void) close(slave_fd); ! } ! #endif ! ! if (pid == -1) ! 
return posix_error(); ! if (pid == 0) ! PyOS_AfterFork(); ! return Py_BuildValue("(ii)", pid, master_fd); } #endif *************** *** 6994,7000 **** #if defined(HAVE_OPENPTY) || defined(HAVE__GETPTY) || defined(HAVE_DEV_PTMX) {"openpty", posix_openpty, METH_NOARGS, posix_openpty__doc__}, #endif /* HAVE_OPENPTY || HAVE__GETPTY || HAVE_DEV_PTMX */ ! #ifdef HAVE_FORKPTY {"forkpty", posix_forkpty, METH_NOARGS, posix_forkpty__doc__}, #endif /* HAVE_FORKPTY */ #ifdef HAVE_GETEGID --- 7093,7099 ---- #if defined(HAVE_OPENPTY) || defined(HAVE__GETPTY) || defined(HAVE_DEV_PTMX) {"openpty", posix_openpty, METH_NOARGS, posix_openpty__doc__}, #endif /* HAVE_OPENPTY || HAVE__GETPTY || HAVE_DEV_PTMX */ ! #if defined(HAVE_FORKPTY) || (defined(HAVE_FORK) && (defined(HAVE_OPENPTY) || defined(HAVE__GETPTY) || defined(HAVE_DEV_PTMX))) {"forkpty", posix_forkpty, METH_NOARGS, posix_forkpty__doc__}, #endif /* HAVE_FORKPTY */ #ifdef HAVE_GETEGID --------------020003030503000503080704-- From martin@v.loewis.de Sat May 3 13:23:09 2003 From: martin@v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=) Date: 03 May 2003 14:23:09 +0200 Subject: [Python-Dev] posixmodule.c patch to support forkpty (patch against posixmodule.c Revision 2.241.2.1) In-Reply-To: <3EB39BC5.50702@noah.org> References: <3EB39BC5.50702@noah.org> Message-ID: Noah Spurrier writes: > I am attaching a patch Please see http://www.python.org/dev/devfaq.html#a2 Please don't post patches to python-dev. > This patch was diffed against posixmodule.c Revision 2.241.2.1 Python 2.3b. Please generate patches against the mainline, not against branches. Kind regards, Martin From noah@noah.org Sat May 3 15:10:12 2003 From: noah@noah.org (Noah Spurrier) Date: Sat, 03 May 2003 07:10:12 -0700 Subject: [Python-Dev] posixmodule.c patch to support forkpty (patch against posixmodule.c Revision 2.241.2.1) In-Reply-To: References: <3EB39BC5.50702@noah.org> Message-ID: <3EB3CDC4.4020306@noah.org> Sorry... my first patch :-) Yours, Noah Martin v. 
Löwis wrote: > Noah Spurrier writes: > >>I am attaching a patch > > Please see > > http://www.python.org/dev/devfaq.html#a2 From skip@pobox.com Sat May 3 15:22:48 2003 From: skip@pobox.com (Skip Montanaro) Date: Sat, 3 May 2003 09:22:48 -0500 Subject: [Python-Dev] Re: [Pydotorg] updated notes about building bsddb185 module In-Reply-To: <1051927365.4302.3.camel@anthem> References: <16050.36653.443229.45811@montanaro.dyndns.org> <3EB2A47F.8000706@v.loewis.de> <16050.44080.588636.503705@montanaro.dyndns.org> <3EB30008.4010603@v.loewis.de> <1051927365.4302.3.camel@anthem> Message-ID: <16051.53432.301308.205335@montanaro.dyndns.org> Barry> I totally agree with Martin. Make bsddb185 explicit and do not Barry> masquerade it as bsddb by default. Okay, that's fine with me. Skip From skip@pobox.com Sat May 3 15:25:28 2003 From: skip@pobox.com (Skip Montanaro) Date: Sat, 3 May 2003 09:25:28 -0500 Subject: [Python-Dev] Re: [Pydotorg] updated notes about building bsddb185 module In-Reply-To: <1051927365.4302.3.camel@anthem> References: <16050.36653.443229.45811@montanaro.dyndns.org> <3EB2A47F.8000706@v.loewis.de> <16050.44080.588636.503705@montanaro.dyndns.org> <3EB30008.4010603@v.loewis.de> <1051927365.4302.3.camel@anthem> Message-ID: <16051.53592.704262.929675@montanaro.dyndns.org> Barry> I totally agree with Martin. Make bsddb185 explicit and do not Barry> masquerade it as bsddb by default. Skip> Okay, that's fine with me. How about http://python.org/sf/727137 then? I think dbhash should consider bsddb185 as a possibility. That would make Nick Vargish's anydbm programs keep running I think.
Skip From skip@pobox.com Sat May 3 15:28:52 2003 From: skip@pobox.com (Skip Montanaro) Date: Sat, 3 May 2003 09:28:52 -0500 Subject: [Python-Dev] Re: [Pydotorg] updated notes about building bsddb185 module In-Reply-To: <3EB320E9.4020409@v.loewis.de> References: <3EB320E9.4020409@v.loewis.de> Message-ID: <16051.53796.553202.289905@montanaro.dyndns.org> >> - where bsddb is imported, try importing bsddb, and if that >> fails try importing bsddb185 as bsddb (or as * inside the bsddb pkg). Martin> I'm strongly opposed to that. Users of bsddb185 need to make an Martin> explicit choice that they want to use that library. Otherwise, Martin> we would have to deal with the bug reports resulting from the Martin> brokenness of the library forever. Yeah, but there are places in the core library (like anydbm via dbhash) which import bsddb and are generally going to be out of control of end users. I think those places need to consider bsddb185 as a possibility. I already posted a link to a SF patch. Skip From dave@boost-consulting.com Sat May 3 17:45:10 2003 From: dave@boost-consulting.com (David Abrahams) Date: Sat, 03 May 2003 12:45:10 -0400 Subject: [Python-Dev] Timbot? Message-ID: This has probably already been spotted, but in case it hasn't... I just googled for Timbot and found: http://www.cse.ogi.edu/~mpj/timbot/#Programming -- Dave Abrahams Boost Consulting www.boost-consulting.com From gward@python.net Sat May 3 20:21:31 2003 From: gward@python.net (Greg Ward) Date: Sat, 3 May 2003 15:21:31 -0400 Subject: [Python-Dev] optparse docs need proofreading Message-ID: <20030503192131.GA4689@cthulhu.gerg.ca> So you're sitting around, wondering what to do with your weekend, and worrying that the Python 2.3 documentation is not perfect yet. Well, you could proofread the documentation for optparse (currently section 6.20 of the "lib" manual), which was converted wholesale from reStructuredText to LaTeX, and still bears some scars. 
Both the DVI/PS/PDF output and HTML bear close examination. I'm working on it now, but will undoubtedly miss stuff, so feel free to email any glitches you notice in the latest CVS version to me. Greg -- Greg Ward http://www.gerg.ca/ From tim.one@comcast.net Sun May 4 06:26:09 2003 From: tim.one@comcast.net (Tim Peters) Date: Sun, 04 May 2003 01:26:09 -0400 Subject: [Python-Dev] Re: heaps In-Reply-To: <5841710.1051745776@[10.0.1.2]> Message-ID: [Tim] >> ... >> priorityDictionary looks like an especially nice API for this specific >> algorithm, but, e.g., impossible to use directly for maintaining an N-best >> queue (priorityDictionary doesn't support multiple values with >> the same priority, right? That was wrong: the dict maps items to priorities, and I read it backwards. Sorry! >> if we're trying to find the 10,000 poorest people in America, counting >> only one as dead broke would be too Republican for some people's tastes >> ). OTOH, heapq is easy and efficient for *that* class of heap >> application. [David Eppstein] > I agree with your main points (heapq's inability to handle > certain priority queue applications doesn't mean it's useless, and > its implementation-specific API helps avoid fooling programmers into > thinking it's any more than what it is). But I am confused at this > example. Surely it's just as easy to store (income,identity) tuples in > either data structure. As above, I was inside out. "Just as easy" can't be answered without trying to write actual code, though. Given that heapq and priorityDictionary are both min-heaps, to avoid artificial pain let's look for the people with the N highest incomes instead.
For an N-best queue using heapq, "the natural" thing is to define people like so: class Person: def __init__(self, income): self.income = income def __cmp__(self, other): return cmp(self.income, other.income) and then the N-best calculation is as follows; it's normal in N-best applications that N is much smaller than the number of items being ranked, and you don't want to consume more than O(N) memory (for example, google wants to show you the best-scoring 25 documents of the 6 million matches it found): """ # N-best queue for people with the N largest incomes. import heapq dummy = Person(-1) # effectively an income of -Inf q = [dummy] * N # it's fine to use the same object N times for person in people: if person > q[0]: heapq.heapreplace(q, person) # The result list isn't sorted. result = [person for person in q if person is not dummy] """ I'm not as experienced with priorityDictionary. For heapq, the natural __cmp__ is the one that compares objects' priorities. For priorityDictionary, we can't use that, because Person instances will be used as dict keys, and then two Persons with the same income couldn't be in the queue at the same time. So Person.__cmp__ will have to change in such a way that distinct Persons never compare equal. I also have to make sure that a Person is hashable. I see there's another subtlety, apparent only from reading the implementation code: in the heap maintained alongside the dict, it's actually (priority, object) tuples that get compared. Since I expect to see Persons with equal income, when two such tuples get compared, they'll tie on the priority, and go on to compare the Persons. So I have to be sure too that comparing two Persons is cheap. Pondering all that for a while, it seems best to make sure Person doesn't define __cmp__ or __hash__ at all.
Then instances will get compared by memory address, distinct Persons will never compare equal, comparing Persons is cheap, and hashing by memory address is cheap too:

    class Person:
        def __init__(self, income):
            self.income = income

The N-best code is then:

    """
    q = priorityDictionary()
    for dummy in xrange(N):
        q[Person(-1)] = -1  # have to ensure these are distinct Persons
    for person in people:
        if person.income > q.smallest().income:
            del q[q.smallest()]
            q[person] = person.income
    # The result list is sorted.
    result = [person for person in q if person.income != -1]
    """

Perhaps paradoxically, I had to know essentially everything about how priorityDictionary is implemented to write a correct and efficient algorithm here. That was true of heapq too, of course, but there were fewer subtleties to trip over there, and heapq isn't trying to hide its implementation. BTW, there's a good use of heapq for you: you could use it to maintain the under-the-covers heap inside priorityDictionary! It would save much of the code, and possibly speed it too (heapq's implementation of popping usually requires substantially fewer comparisons than priorityDictionary.smallest uses; this is explained briefly in the comments before _siftup, deferring to Knuth for the gory details). > If you mean, you want to find the 10k smallest income values (rather than > the people having those incomes), then it may be that a better data > structure would be a simple list L in which the value of L[i] is > the count of people with income i. Well, leaving pennies out of it, incomes in the USA span 9 digits, so something taking O(N) memory would still be most attractive.
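[Editorial note: Tim's "BTW" — maintaining priorityDictionary's internal heap with heapq — can be sketched as follows. This is a hypothetical minimal class, not Eppstein's actual priorityDictionary; the id(key) tiebreaker keeps keys from ever being compared, sidestepping the cheap-comparison subtlety discussed above, and stale heap entries are discarded lazily.]

```python
import heapq

class MinPriorityDict(dict):
    """Maps items to priorities; smallest() returns an item of minimum
    priority.  The internal heap is maintained with heapq and cleaned lazily."""
    def __init__(self):
        super().__init__()
        self._heap = []
    def __setitem__(self, key, priority):
        super().__setitem__(key, priority)
        # id(key) breaks priority ties, so key objects are never compared.
        heapq.heappush(self._heap, (priority, id(key), key))
    def smallest(self):
        while True:
            priority, _, key = self._heap[0]
            if self.get(key) == priority:
                return key
            heapq.heappop(self._heap)  # stale: key deleted or re-prioritized

q = MinPriorityDict()
q["alice"] = 3
q["bob"] = 1
q["carol"] = 2
print(q.smallest())  # bob
del q["bob"]         # plain dict deletion; the heap entry goes stale
print(q.smallest())  # carol
```

Each key may leave behind stale heap entries, but each is popped at most once, so the amortized cost stays O(log n) per operation.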
From eppstein@ics.uci.edu Sun May 4 06:46:58 2003 From: eppstein@ics.uci.edu (David Eppstein) Date: Sat, 03 May 2003 22:46:58 -0700 Subject: [Python-Dev] Re: heaps In-Reply-To: References: Message-ID: <17342817.1052002018@[10.0.1.2]> On 5/4/03 1:26 AM -0400 Tim Peters wrote: > it's normal in N-best applications that N is much smaller than the number > of items being ranked, and you don't want to consume more than O(N) > memory (for example, google wants to show you the best-scoring 25 > documents of the 6 million matches it found): Ok, I think you're right, for this sort of thing heapq is better. One could extend my priorityDictionary code to limit memory like this but it would be unnecessary work when the extra features it has over heapq are not used for this sort of algorithm. On the other hand, if you really want to find the n best items in a data stream large enough that you care about using only space O(n), it might also be preferable to take constant amortized time per item rather than the O(log n) that heapq would use, and it's not very difficult nor does it require any fancy data structures. Some time back I needed some Java code for this, haven't had an excuse to port it to Python. In case anyone's interested, it's online at . Looking at it now, it seems more complicated than it needs to be, but maybe that's just the effect of writing in Java instead of Python (I've seen an example of a three-page Java implementation of an algorithm in a textbook that could easily be done in a dozen Python lines). -- David Eppstein http://www.ics.uci.edu/~eppstein/ Univ. 
of California, Irvine, School of Information & Computer Science From eppstein@ics.uci.edu Sun May 4 08:54:21 2003 From: eppstein@ics.uci.edu (David Eppstein) Date: Sun, 04 May 2003 00:54:21 -0700 Subject: [Python-Dev] Re: heaps References: <17342817.1052002018@[10.0.1.2]> Message-ID: In article <17342817.1052002018@[10.0.1.2]>, David Eppstein wrote: > On the other hand, if you really want to find the n best items in a data > stream large enough that you care about using only space O(n), it might > also be preferable to take constant amortized time per item rather than the > O(log n) that heapq would use, and it's not very difficult nor does it > require any fancy data structures. Some time back I needed some Java code > for this, haven't had an excuse to port it to Python. In case anyone's > interested, it's online at > . BTW, the central idea here is to use a random quicksort pivot to shrink the list, when it grows too large. In python, this could be done without randomization as simply as

    def addToNBest(L, x, N):
        L.append(x)
        if len(L) > 2*N:
            L.sort()
            del L[N:]

It's not constant amortized time due to the sort, but that's probably more than made up for due to the speed of compiled sort versus interpreted randomized pivot. -- David Eppstein http://www.ics.uci.edu/~eppstein/ Univ.
of California, Irvine, School of Information & Computer Science From skip@mojam.com Sun May 4 13:00:24 2003 From: skip@mojam.com (Skip Montanaro) Date: Sun, 4 May 2003 07:00:24 -0500 Subject: [Python-Dev] Weekly Python Bug/Patch Summary Message-ID: <200305041200.h44C0OY12616@manatee.mojam.com>

Bug/Patch Summary
-----------------

423 open / 3606 total bugs (+17)
137 open / 2130 total patches (+10)

New Bugs
--------

mmap's resize method resizes the file in win32 but not unix (2003-04-27) http://python.org/sf/728515
Long file names in osa suites (2003-04-27) http://python.org/sf/728574
ConfigurePython gives depreaction warning (2003-04-27) http://python.org/sf/728608
super bug (2003-04-28) http://python.org/sf/729103
building readline module fails on Irix 6.5 (2003-04-28) http://python.org/sf/729236
What's new in Python2.3b1 HTML generation. (2003-04-28) http://python.org/sf/729297
comparing versions - one a float (2003-04-28) http://python.org/sf/729317
rexec not listed as dead (2003-04-29) http://python.org/sf/729817
MacPython-OS9 eats CPU while waiting for I/O (2003-04-29) http://python.org/sf/729871
metaclasses, __getattr__, and special methods (2003-04-29) http://python.org/sf/729913
socketmodule.c: inet_pton() expects 4-byte packed_addr (2003-04-30) http://python.org/sf/730222
Unexpected Changes in list Iterator (2003-04-30) http://python.org/sf/730296
Not detecting AIX_GENUINE_CPLUSPLUS (2003-04-30) http://python.org/sf/730467
Python 2.3 bsddb docs need update (2003-05-01) http://python.org/sf/730938
HTTPRedirectHandler variable out of scope (2003-05-01) http://python.org/sf/730963
urllib2 raises AttributeError on redirect (2003-05-01) http://python.org/sf/731116
test_tarfile writes in Lib/test directory (2003-05-02) http://python.org/sf/731403
Importing anydbm generates exception if _bsddb unavailable (2003-05-02) http://python.org/sf/731501
Pimp needs to be able to update itself (2003-05-02) http://python.org/sf/731626
OSX installer .pkg file permissions (2003-05-02) http://python.org/sf/731631
Package Manager needs Help menu (2003-05-02) http://python.org/sf/731635
IDE "lookup in documentation" doesn't work in interactive wi (2003-05-02) http://python.org/sf/731643
GIL not released around getaddrinfo() (2003-05-02) http://python.org/sf/731644
An extended definition of "non-overlapping" would save time. (2003-05-04) http://python.org/sf/732120
Clarification of "pos" and "endpos" for match objects. (2003-05-04) http://python.org/sf/732124

New Patches
-----------

Fixes for setup.py in Mac/OSX/Docs (2003-04-27) http://python.org/sf/728744
test_timeout updates (2003-04-28) http://python.org/sf/728815
Compiler warning on Solaris 8 (2003-04-28) http://python.org/sf/729305
Dictionary tuning (2003-04-29) http://python.org/sf/729395
Add Py_AtInit() startup hook for extenders (2003-04-30) http://python.org/sf/730473
assert from longobject.c, line 1215 (2003-04-30) http://python.org/sf/730594
RTEMS does not have a popen (2003-04-30) http://python.org/sf/730597
socketmodule inet_ntop built when IPV6 is disabled (2003-04-30) http://python.org/sf/730603
pimp.py has old URL for default database (2003-05-01) http://python.org/sf/731151
redirect fails in urllib2 (2003-05-01) http://python.org/sf/731153
AssertionError when building rpm under RedHat 9.1 (2003-05-02) http://python.org/sf/731328
make threading join() method return a value (2003-05-02) http://python.org/sf/731607
SpawnedGenerator class for threading module (2003-05-02) http://python.org/sf/731701
find correct socklen_t type (2003-05-03) http://python.org/sf/731991
exit status of latex2html "ignored" (2003-05-04) http://python.org/sf/732143

Closed Bugs
-----------

"es#" parser marker leaks memory (2002-01-10) http://python.org/sf/501716
math.fabs documentation is misleading (2003-03-22) http://python.org/sf/708205
Lineno calculation sometimes broken (2003-03-24) http://python.org/sf/708901
Put a reference to print in the Library Reference, please. (2003-04-17) http://python.org/sf/723136
imaplib should convert line endings to be rfc2822 complient (2003-04-18) http://python.org/sf/723962
socketmodule doesn't compile on strict POSIX systems (2003-04-20) http://python.org/sf/724588
SRE bug with capturing groups in alternatives in repeats (2003-04-21) http://python.org/sf/725106
valgrind python fails (2003-04-24) http://python.org/sf/727051
tmpnam problems on windows 2.3b, breaks test.test_os (2003-04-26) http://python.org/sf/728097

Closed Patches
--------------

fix for bug 501716 (2003-02-11) http://python.org/sf/684981
OpenVMS complementary patches (2003-03-23) http://python.org/sf/708495
unchecked return values - compile.c (2003-03-23) http://python.org/sf/708604
Cause pydoc to show data descriptor __doc__ strings (2003-03-29) http://python.org/sf/711902
timeouts for FTP connect (and other supported ops) (2003-04-03) http://python.org/sf/714592
Modules/addrinfo.h patch (2003-04-22) http://python.org/sf/725942
Remove extra line ending in CGI XML-RPC responses (2003-04-25) http://python.org/sf/727805

From m@moshez.org Sun May 4 19:55:44 2003 From: m@moshez.org (Moshe Zadka) Date: 4 May 2003 18:55:44 -0000 Subject: [Python-Dev] Distutils using apply Message-ID: <20030504185544.6010.qmail@green.zadka.com> Hi! I haven't seen this come up yet -- why is distutils still using apply? It causes warnings to be emitted when building packages with Python 2.3 and -Wall, and is altogether unclean. Is this just a matter of checking in a patch? Or submitting one to SF? Or is there a real desire to be compatible to Python 1.5.2? Thanks, Moshe -- Moshe Zadka -- http://moshez.org/ Buffy: I don't like you hanging out with someone that... short. Riley: Yeah, a lot of young people nowadays are experimenting with shortness.
Agile Programming Language -- http://www.python.org/ From goodger@python.org Sun May 4 20:18:04 2003 From: goodger@python.org (David Goodger) Date: Sun, 04 May 2003 15:18:04 -0400 Subject: [Python-Dev] Distutils using apply In-Reply-To: <20030504185544.6010.qmail@green.zadka.com> References: <20030504185544.6010.qmail@green.zadka.com> Message-ID: <3EB5676C.1000900@python.org> Moshe Zadka wrote: > Or is there a real desire to be compatible to Python 1.5.2? PEP 291 lists distutils as requiring 1.5.2 compatibility. -- David Goodger From Raymond Hettinger" After more dictionary sparseness experiments, I've become convinced that the ideal settings are better left up to the user who is in a better position to know:

* anticipated dictionary size
* overall application memory issues
* characteristic access patterns (stores vs. reads vs. deletions vs. iteration)
* when the dictionary is growing, shrinking, or stabilized
* whether many deletions have taken place

I have two competing proposals to expose dictresize():

1) d.resize(minsize=0)

The first approach allows a user to trigger a resize(). This is handy after deletions have taken place and dictionary contents have become stable. It allows the dictionary to be rebuilt without dummy entries. If the minsize factor is specified, then the dictionary will be built to the specified size or larger if needed to achieve a power of two or to accommodate existing entries. That is handy when building a dictionary whose approximate size is known in advance because it eliminates all of the intermediate resizes during construction. For instance, the builtin dictionary can be pre-sized for the 126 entries and it will build more quickly. It is also useful after dictionary contents have stabilized and the user wants improved lookup time at the expense of additional memory and slower iteration time. For instance, the builtin dictionary can be resized to 500 entries making it so sparse that the lookups will typically hit on the first try.
This API requires a little user sophistication because the effects get wiped out during the next automatic resize (when the dict is two-thirds full).

2) d.setsparsity(factor=1)

The second approach does not allow dictionaries to be pre-sized, but the effects do not get wiped out by normal dictionary activity. It is handy when a particular dictionary's lookup/insertion time is more important than iteration time or space considerations. For instance, the builtin dictionary can be set to a sparsity factor of four so that lookups are more rapid. Raymond Hettinger From drifty@alum.berkeley.edu Mon May 5 00:27:16 2003 From: drifty@alum.berkeley.edu (Brett Cannon) Date: Sun, 4 May 2003 16:27:16 -0700 (PDT) Subject: [Python-Dev] Dictionary sparseness In-Reply-To: <001301c31290$fcea25e0$125ffea9@oemcomputer> References: <001301c31290$fcea25e0$125ffea9@oemcomputer> Message-ID: [Raymond Hettinger] > I have two competing proposals to expose dictresize(): > > 1) d.resize(minsize=0) > > The first approach allows a user to trigger a resize(). This is handy > after deletions have taken place and dictionary contents have become > stable. It allows the dictionary to be rebuilt without dummy entries. The issue I see with this is people going overboard with calls to this. I can easily imagine a new Python programmer calling this after every insertion or deletion into the dictionary. I can even see experienced programmers getting trapped into this by coming up with a size and then coding themselves into a corner by trying to maintain the size. I also see people coding a size that is optimal and then changing their code but forgetting to change the value passed to the method, thus negating the perk of having this option set. > 2) d.setsparsity(factor=1) > > The second approach does not allow dictionaries to be pre-sized, > but the effects do not get wiped out by normal dictionary activity. This is more reasonable.
Since it is a factor, it will make sense to beginners who view it as a sliding scale, and it also allows more experienced programmers to set it to where they know they want the performance. And setting the value will more than likely be good no matter how the code is changed since the use of the dictionary will most likely stay consistent. Does either hinder dictionary performance just by introducing the possible functionality? I am -1 on 'resize' and +0, teetering on +1, for setsparsity. I will kick over to +1 if someone else out there with more experience with newbies can say strongly that they don't see them messing up with this option. -Brett P.S.: Thanks, Raymond, for doing all of this work and documenting it so well. From guido@python.org Mon May 5 01:34:33 2003 From: guido@python.org (Guido van Rossum) Date: Sun, 04 May 2003 20:34:33 -0400 Subject: [Python-Dev] Dictionary sparseness In-Reply-To: "Your message of Sun, 04 May 2003 18:59:46 EDT." <001301c31290$fcea25e0$125ffea9@oemcomputer> References: <001301c31290$fcea25e0$125ffea9@oemcomputer> Message-ID: <200305050034.h450YXx23808@pcp02138704pcs.reston01.va.comcast.net>

> After more dictionary sparseness experiments, I've become
> convinced that the ideal settings are better left up to the user
> who is in a better position to know:
>
> * anticipated dictionary size
> * overall application memory issues
> * characteristic access patterns (stores vs. reads vs. deletions
>   vs. iteration)
> * when the dictionary is growing, shrinking, or stablized.
> * whether many deletions have taken place

Hm. Maybe so, but it *is* a feature that there are no user controls over dictionary behavior, based on the observation that for every user who knows enough about the dict implementation to know how to tweak it, there are at least 1000 who don't, and the latter, in their ill-advised quest for more speed, will use the tweakage API to their detriment.
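[Editorial note: to make the sparseness/first-probe discussion concrete, here is a toy open-addressing table. Everything in it is made up for illustration — `toy_hash` and the table code are not CPython's dict implementation; only the probe recurrence mimics the shape of the one in dictobject.c. The point it demonstrates is the one under debate: the same keys in a sparser table yield more first-probe hits.]

```python
def toy_hash(s):
    # Deterministic stand-in for hash(), so runs are repeatable.
    x = 0
    for ch in s:
        x = (x * 31 + ord(ch)) & 0xFFFFFFFF
    return x

def first_probe_hits(keys, size):
    """Insert keys into an open-addressing table of the given power-of-two
    size, then count how many lookups succeed on the very first probe."""
    mask = size - 1
    table = [None] * size
    for k in keys:
        i = toy_hash(k) & mask
        perturb = toy_hash(k)
        while table[i] is not None:          # collision: follow probe sequence
            i = ((i << 2) + i + perturb + 1) & mask
            perturb >>= 5
        table[i] = k
    return sum(1 for k in keys if table[toy_hash(k) & mask] == k)

keys = ["key%d" % n for n in range(100)]
for size in (128, 512, 4096):
    print(size, first_probe_hits(keys, size))
```

Running it shows the first-probe hit count rising as the table gets sparser, which is the effect Raymond's sparsity proposals aim to buy, at the memory and iteration cost Guido weighs against it.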
--Guido van Rossum (home page: http://www.python.org/~guido/) From skip@pobox.com Mon May 5 03:06:26 2003 From: skip@pobox.com (Skip Montanaro) Date: Sun, 4 May 2003 21:06:26 -0500 Subject: [Python-Dev] Distutils using apply In-Reply-To: <3EB5676C.1000900@python.org> References: <20030504185544.6010.qmail@green.zadka.com> <3EB5676C.1000900@python.org> Message-ID: <16053.50978.292901.471132@montanaro.dyndns.org> >> Or is there a real desire to be compatible to Python 1.5.2? David> PEP 291 lists distutils as requiring 1.5.2 compatibility. Then should distutils be suppressing those warnings? Skip From tim.one@comcast.net Mon May 5 03:20:09 2003 From: tim.one@comcast.net (Tim Peters) Date: Sun, 04 May 2003 22:20:09 -0400 Subject: [Python-Dev] Re: heaps In-Reply-To: <17342817.1052002018@[10.0.1.2]> Message-ID: This is a multi-part message in MIME format. --Boundary_(ID_19ec9GZ0WH09Yh6NFIvyew) Content-type: text/plain; charset=iso-8859-1 Content-transfer-encoding: 7BIT [David Eppstein] > Ok, I think you're right, for this sort of thing heapq is better. > One could extend my priorityDictionary code to limit memory like > this but it would be unnecessary work when the extra features it > has over heapq are not used for this sort of algorithm. I don't believe memory usage was an issue here. Take a look at the code again (comments removed):

    """
    q = priorityDictionary()
    for dummy in xrange(N):
        q[Person(-1)] = -1
    for person in people:
        if person.income > q.smallest().income:
            del q[q.smallest()]
            q[person] = person.income
    """

q starts with N entries. Each trip around the loop either leaves the q contents alone, or both removes and adds an entry. So the size of the dict is a loop invariant, len(q) == N. In the cases where it does remove an entry, it always removes the smallest entry, and the entry being added is strictly larger than that, so calling q.smallest() at the start of the next loop trip finds the just-deleted smallest entry still in self.__heap[0], and removes it.
So the internal list may grow to N+1 entries immediately following del q[q.smallest()] but by the time we get to that line again it should be back to N entries again. The reasons I found heapq easier to live with in this specific app had more to do with the subtleties involved in sidestepping potential problems with __hash__, __cmp__, and the speed of tuple comparison when the first tuple elements tie. heapq also supplies a "remove current smallest and replace with a new value" primitive, which happens to be just right for this app (that isn't an accident ):

    """
    dummy = Person(-1)
    q = [dummy] * N
    for person in people:
        if person > q[0]:
            heapq.heapreplace(q, person)
    """

> On the other hand, if you really want to find the n best items in a data > stream large enough that you care about using only space O(n), it might > also be preferable to take constant amortized time per item rather than > the O(log n) that heapq would use, In practice, it's usually much faster than that. Over time, it gets rarer and rarer for person > q[0] to be true (the new person has to be larger than the N-th largest seen so far, and that bar gets raised whenever a new person manages to hurdle it), and the vast majority of sequence elements are disposed of via that single Python statement (the ">" test fails, and we move on to the next element with no heap operations). In the simplest case, if N==1, the incoming data is randomly ordered, and the incoming sequence has M elements, the if-test is true (on average) only ln(M) times (the expected number of left-to-right maxima). The order statistics get more complicated as N increases, of course, but in practice it remains very fast, and doing a heapreplace() on every incoming item is the worst case (achieved if the items come in sorted order; the best case is when they come in reverse-sorted order, in which case min(M, N) heapreplace() operations are done).
Some time back I needed some Java code > for this, haven't had an excuse to port it to Python. In case anyone's > interested, it's online at > . > Looking at it now, it seems more complicated than it needs to be, but > maybe that's just the effect of writing in Java instead of Python > (I've seen an example of a three-page Java implementation of an > algorithm in a textbook that could easily be done in a dozen Python > lines). Cool! I understood the thrust but not the details -- and I agree Java must be making it harder than it should be . > In python, this could be done without randomization as simply as

> def addToNBest(L,x,N):
>     L.append(x)
>     if len(L) > 2*N:
>         L.sort()
>         del L[N:]

> It's not constant amortized time due to the sort, but that's probably > more than made up for due to the speed of compiled sort versus > interpreted randomized pivot. I'll attach a little timing script. addToNBest is done inline there, some low-level tricks were played to speed it, and it was changed to be a max N-best instead of a min N-best. Note that the list sort in 2.3 has a real advantage over Pythons before 2.3 here, because it recognizes (in linear time) that the first half of the list is already in sorted order (on the second & subsequent sorts), and leaves it alone until a final merge step with the other half of the array. The relative speed (compared to the heapq code) varies under 2.3, seeming to depend mostly on M/N. The test case is set up to find the 1000 largest of a million random floats. In that case the sorting method takes about 3.4x longer than the heapq approach. As N gets closer to M, the sorting method eventually wins; when M and N are both a million, the sorting method is 10x faster. For most N-best apps, N is much smaller than M, and the heapq code should be quicker unless the data is already in order.
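[Editorial note: Tim's "usually much faster" claim is easy to check empirically. With N == 1, the guard fires exactly once per left-to-right maximum, and the expected number of those in a random sequence of length M is about ln(M). A quick simulation, not from the thread:]

```python
import math
import random

random.seed(12345)          # arbitrary seed, for repeatability
M, trials = 10000, 200
total = 0
for _ in range(trials):
    best = -1.0
    for x in (random.random() for _ in range(M)):
        if x > best:        # with N == 1 the heap is touched only here
            best = x
            total += 1
average = total / float(trials)
print(average, math.log(M))  # both land in the same ballpark (roughly 9-10)
```

So out of 10000 elements, only about ten ever reach the heap; everything else is rejected by the single comparison, which is why the heapq approach beats its O(M log N) worst case so comfortably on random data.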
--Boundary_(ID_19ec9GZ0WH09Yh6NFIvyew) Content-type: text/plain; name=timeq.py Content-transfer-encoding: 7BIT Content-disposition: attachment; filename=timeq.py

def one(seq, N):
    from heapq import heapreplace
    L = [-1] * N
    for x in seq:
        if x > L[0]:
            heapreplace(L, x)
    L.sort()
    return L

def two(seq, N):
    L = []
    push = L.append
    twoN = 2*N
    for x in seq:
        push(x)
        if len(L) > twoN:
            L.sort()
            del L[:-N]
    L.sort()
    del L[:-N]
    return L

def timeit(seq, N):
    from time import clock as now
    s = now()
    r1 = one(seq, N)
    t = now()
    e1 = t - s
    s = now()
    r2 = two(seq, N)
    t = now()
    e2 = t - s
    print len(seq), N, e1, e2
    assert r1 == r2

def tryone(M, N):
    from random import random
    seq = [random() for dummy in xrange(M)]
    timeit(seq, N)

for i in range(10):
    tryone(1000000, 1000)

--Boundary_(ID_19ec9GZ0WH09Yh6NFIvyew)-- From python@rcn.com Mon May 5 03:22:08 2003 From: python@rcn.com (Raymond Hettinger) Date: Sun, 4 May 2003 22:22:08 -0400 Subject: [Python-Dev] Dictionary sparseness References: <001301c31290$fcea25e0$125ffea9@oemcomputer> <200305050034.h450YXx23808@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <003301c312ad$2113e520$125ffea9@oemcomputer>

> > After more dictionary sparseness experiments, I've become
> > convinced that the ideal settings are better left up to the user
> > who is in a better position to know:
> >
> > * anticipated dictionary size
> > * overall application memory issues
> > * characteristic access patterns (stores vs. reads vs. deletions
> >   vs. iteration)
> > * when the dictionary is growing, shrinking, or stablized.
> > * whether many deletions have taken place
>
> Hm. Maybe so, but it *is* a feature that there are no user controls
> over dictionary behavior, based on the observation that for every user
> who knows enough about the dict implementation to know how to tweak
> it, there are at least 1000 who don't, and the latter, in their
> ill-advised quest for more speed, will use the tweakage API to their
> detriment.
Perhaps there should be safety-belts and kindergarten controls: d.pack(fat=False) --> None. Reclaims deleted entries. If optional fat argument is true, the internal size is doubled resulting in potentially faster lookups at the expense of slower iteration and more memory. This ought to be both safe and simple. Raymond Hettinger P.S. Also, I think it worthwhile to at least transform dictresize() into PyDict_Resize() so that C extensions will have some control. This would make it possible for us to add a single line making the builtin dictionary more sparse and providing a 75% first probe hit rate. From skip@pobox.com Mon May 5 03:24:31 2003 From: skip@pobox.com (Skip Montanaro) Date: Sun, 4 May 2003 21:24:31 -0500 Subject: [Python-Dev] Dictionary sparseness In-Reply-To: <001301c31290$fcea25e0$125ffea9@oemcomputer> References: <001301c31290$fcea25e0$125ffea9@oemcomputer> Message-ID: <16053.52063.690466.272706@montanaro.dyndns.org> Raymond> After more dictionary sparseness experiments, I've become Raymond> convinced that the ideal settings are better left up to the Raymond> user who is in a better position to know: Speaking as a moderately sophisticated Python programmer, I can tell you I wouldn't have the slightest idea what the properties of my applications' dictionary usage is. Unless I'm going to get a major league speedup (like factor of two or greater) tweaking these settings, I don't see that they'd benefit me. Skip From python@rcn.com Mon May 5 03:26:47 2003 From: python@rcn.com (Raymond Hettinger) Date: Sun, 4 May 2003 22:26:47 -0400 Subject: [Python-Dev] Re: heaps References: Message-ID: <003f01c312ad$c7277580$125ffea9@oemcomputer> > The relative speed (compared to the heapq code) varies under 2.3, seeming to > depend mostly on M/N. The test case is set up to find the 1000 largest of a > million random floats. In that case the sorting method takes about 3.4x > longer than the heapq approach. 
> As N gets closer to M, the sorting method eventually wins; when M and N
> are both a million, the sorting method is 10x faster. For most N-best
> apps, N is much smaller than M, and the heapq code should be quicker
> unless the data is already in order.

FWIW, there is a C implementation of heapq at: http://zhar.net/projects/python/ Raymond Hettinger From tim.one@comcast.net Mon May 5 04:00:09 2003 From: tim.one@comcast.net (Tim Peters) Date: Sun, 04 May 2003 23:00:09 -0400 Subject: [Python-Dev] Dictionary sparseness In-Reply-To: <003301c312ad$2113e520$125ffea9@oemcomputer> Message-ID: [Raymond Hettinger] > ... > P.S. Also, I think it worthwhile to at least transform dictresize() > into PyDict_Resize() so that C extensions will have some control. > This would make it possible for us to add a single line making > the builtin dictionary more sparse and providing a 75% first probe > hit rate. The dynamic hit rate is the one that counts, and, e.g., it's not going to speed anything to remove the current lowest-8-but-not-lowest-9-bits collision between 'ArithmeticError' and 'reload' (I've never seen the former used, and the latter is expensive). IOW, measuring the dynamic first-probe hit rate is a prerequisite to selling this idea; a stronger prerequisite is demonstrating actual before-and-after speedups. I agree with Guido that giving people controls they're ill-equipped to understand will do more harm than good. Even when they manage to stumble into a small speedup, that will often become counterproductive over time, as the characteristics of their ever-growing app change, and the Speed Weenie who got the 2% speedup left, or moved on to some other project. Or somebody corrects the option name from 'smalest' to 'smallest', and suddenly the only dict entry that mattered doesn't collide anymore -- but the mystery knob boosting the dict size "because it sped things up" forever more wastes half the space for a reason nobody ever understood.
Or we change Python's string hash to use addition instead of xor to merge in the next character (a change that may actually help a bit -- addition is a little better at scrambling the bits). Etc. it's-python-it's-supposed-to-be-slow-ly y'rs - tim From eppstein@ics.uci.edu Mon May 5 04:26:29 2003 From: eppstein@ics.uci.edu (David Eppstein) Date: Sun, 04 May 2003 20:26:29 -0700 Subject: [Python-Dev] Re: heaps In-Reply-To: References: Message-ID: <20414172.1052079989@[10.0.1.2]> On 5/4/03 10:20 PM -0400 Tim Peters wrote: > In practice, it's usually much faster than that. Over time, it gets rarer > and rarer for > > person > q[0] > > to be true (the new person has to be larger than the N-th largest seen so > far, and that bar gets raised whenever a new person manages to hurdle it), Good point. If any permutation of the input sequence is equally likely, and you're selecting the best k out of n items, the expected number of times you have to hit the data structure in your heapq solution is roughly k ln n, so the total expected time is O(n + k log k log n), with a really small constant factor on the O(n) term. The sorting solution I suggested has total time O(n log k), and even though sorting is built-in and fast it can't compete when k is small. Random pivoting is O(n + k), but with a larger constant factor, so your heapq solution looks like a winner. For fairness, it might be interesting to try another run of your test in which the input sequence is sorted in increasing order rather than random. I.e., replace the random generation of seq by

    seq = range(M)

I'd try it myself, but I'm still running python 2.2 and haven't installed heapq. I'd have to know more about your application to have an idea whether the sorted or randomly-permuted case is more representative. -- David Eppstein http://www.ics.uci.edu/~eppstein/ Univ.
of California, Irvine, School of Information & Computer Science From oren-py-d@hishome.net Mon May 5 06:23:35 2003 From: oren-py-d@hishome.net (Oren Tirosh) Date: Mon, 5 May 2003 01:23:35 -0400 Subject: [Python-Dev] Distutils using apply In-Reply-To: <20030504185544.6010.qmail@green.zadka.com> References: <20030504185544.6010.qmail@green.zadka.com> Message-ID: <20030505052335.GA37311@hishome.net> On Sun, May 04, 2003 at 06:55:44PM -0000, Moshe Zadka wrote: > Hi! > I haven't seen this come up yet -- why is distutils still using apply? > It causes warnings to be emitted when building packages with Python 2.3 > and -Wall, and is altogether unclean. > > Is this just a matter of checking in a patch? Or submitting one to SF? > Or is there a real desire to be compatible to Python 1.5.2? I was wondering if a milder form of deprecation may be appropriate for some features such as the apply builtin:

1. Add a notice in docstring 'not recommended for new code'
2. Move to 'obsolete' or 'backward compatibility' section in manual
3. Do NOT produce a warning (pychecker may still do that)
4. Do NOT plan removal of feature in a specific future release

Oren From martin@v.loewis.de Mon May 5 06:55:56 2003 From: martin@v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=) Date: 05 May 2003 07:55:56 +0200 Subject: [Python-Dev] Distutils using apply In-Reply-To: <16053.50978.292901.471132@montanaro.dyndns.org> References: <20030504185544.6010.qmail@green.zadka.com> <3EB5676C.1000900@python.org> <16053.50978.292901.471132@montanaro.dyndns.org> Message-ID: Skip Montanaro writes: > David> PEP 291 lists distutils as requiring 1.5.2 compatibility. > > Then should distutils be suppressing those warnings? This isn't trivial: the warnings module is not available in Python 1.5.2.
Regards, Martin From m@moshez.org Mon May 5 07:40:51 2003 From: m@moshez.org (Moshe Zadka) Date: 5 May 2003 06:40:51 -0000 Subject: [Python-Dev] Distutils using apply In-Reply-To: References: , <20030504185544.6010.qmail@green.zadka.com> <3EB5676C.1000900@python.org> <16053.50978.292901.471132@montanaro.dyndns.org> Message-ID: <20030505064051.29353.qmail@green.zadka.com> [Trimming CC list] On 05 May 2003, martin@v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=) wrote: > This isn't trivial: the warnings module is not available in Python > 1.5.2. Yes it is (trivial, not in 1.5.2):

    try:
        import warnings
    except ImportError:
        pass
    else:
        ...disable warnings...

Thanks, Moshe -- Moshe Zadka -- http://moshez.org/ Buffy: I don't like you hanging out with someone that... short. Riley: Yeah, a lot of young people nowadays are experimenting with shortness. Agile Programming Language -- http://www.python.org/ From mal@lemburg.com Mon May 5 08:41:05 2003 From: mal@lemburg.com (M.-A. Lemburg) Date: Mon, 05 May 2003 09:41:05 +0200 Subject: [Python-Dev] Distutils using apply In-Reply-To: <20030504185544.6010.qmail@green.zadka.com> References: <20030504185544.6010.qmail@green.zadka.com> Message-ID: <3EB61591.5070204@lemburg.com> Moshe Zadka wrote: > Hi! > I haven't seen this come up yet -- why is distutils still using apply? > It causes warnings to be emitted when building packages with Python 2.3 > and -Wall, and is altogether unclean. Could someone please explain why apply() was marked deprecated ? The only reference I can find is in PEP 290 and that merely reports this "fact". I'm -1 on deprecating apply(). Not only because it introduces yet another incompatibility between Python versions, but also because it is still useful in the context of having a function which mimics a function call, e.g. for map() and other instances where you pass around functions as operators. > Is this just a matter of checking in a patch? Or submitting one to SF?
> Or is there a real desire to be compatible to Python 1.5.2? Yes. It was decided that Python 2.3 will ship with the last version of distutils that is Python 1.5.2 compatible. After that it may drop that compatibility and become Python 2.0 compatible. -- Marc-Andre Lemburg eGenix.com Professional Python Software directly from the Source (#1, May 05 2003) >>> Python/Zope Products & Consulting ... http://www.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ EuroPython 2003, Charleroi, Belgium: 50 days left From python@rcn.com Mon May 5 10:28:51 2003 From: python@rcn.com (Raymond Hettinger) Date: Mon, 5 May 2003 05:28:51 -0400 Subject: [Python-Dev] Dictionary sparseness References: Message-ID: <001501c312e8$bd892420$125ffea9@oemcomputer> > it's-python-it's-supposed-to-be-slow-ly y'rs - tim Oh, now you tell me. I've got about a hundred failed experiments that provide slowdowns ranging from modest to excruciating. Take your pick. My favorite: Eliminating the test for dummy entry re-use ended up hurting every benchmark and completely destroying a couple of them. Raymond From guido@python.org Mon May 5 12:47:39 2003 From: guido@python.org (Guido van Rossum) Date: Mon, 05 May 2003 07:47:39 -0400 Subject: [Python-Dev] Dictionary sparseness In-Reply-To: "Your message of Sun, 04 May 2003 22:22:08 EDT." <003301c312ad$2113e520$125ffea9@oemcomputer> References: <001301c31290$fcea25e0$125ffea9@oemcomputer> <200305050034.h450YXx23808@pcp02138704pcs.reston01.va.comcast.net> <003301c312ad$2113e520$125ffea9@oemcomputer> Message-ID: <200305051147.h45Bldw24692@pcp02138704pcs.reston01.va.comcast.net> > > Hm. 
Maybe so, but it *is* a feature that there are no user controls > > over dictionary behavior, based on the observation that for every user > > who knows enough about the dict implementation to know how to tweak > > it, there are at least 1000 who don't, and the latter, in their > > ill-advised quest for more speed, will use the tweakage API to their > > detriment. > > Perhaps there should be safety-belts and kindergarten controls: > > d.pack(fat=False) --> None. Reclaims deleted entries. > If optional fat argument is true, the internal size is doubled > resulting in potentially faster lookups at the expense of > slower iteration and more memory. > > This ought to be both safe and simple. And a waste of time except in the most rare circumstances. > Raymond Hettinger > > > P.S. Also, I think it worthwhile to at least transform dictresize() > into PyDict_Resize() so that C extensions will have some control. > This would make it possible for us to add a single line making > the builtin dictionary more sparse and providing a 75% first probe > hit rate. And that would give *how much* of a performance improvement of typical applications? Sorry, I really think that you're complexificating APIs here without sufficient gain. I really value the work you've done on figuring out how to improve dicts, but I think you've come to know the code too well to see the other side of the coin. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Mon May 5 13:02:08 2003 From: guido@python.org (Guido van Rossum) Date: Mon, 05 May 2003 08:02:08 -0400 Subject: [Python-Dev] Distutils using apply In-Reply-To: "Your message of Mon, 05 May 2003 09:41:05 +0200." <3EB61591.5070204@lemburg.com> References: <20030504185544.6010.qmail@green.zadka.com> <3EB61591.5070204@lemburg.com> Message-ID: <200305051202.h45C28E24795@pcp02138704pcs.reston01.va.comcast.net> > Could someone please explain why apply() was marked deprecated ? 
Because it's more readable, more efficient, and more flexible to write f(x, y, *t) than apply(f, (x, y) + t). > The only reference I can find is in PEP 290 and that merely > reports this "fact". > > I'm -1 on deprecating apply(). Not only because it introduces yet > another incompatibility between Python versions, but also because it > is still useful in the context of having a function which mimics > a function call, e.g. for map() and other instances where you > pass around functions as operators. Then maybe we should add something like operator.__call__. OTOH, you're lucky that map isn't deprecated yet in favor of list comprehensions; I expect that Python 3.0 won't have map or filter either. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Mon May 5 13:03:58 2003 From: guido@python.org (Guido van Rossum) Date: Mon, 05 May 2003 08:03:58 -0400 Subject: [Python-Dev] Distutils using apply In-Reply-To: "Your message of Mon, 05 May 2003 01:23:35 EDT." <20030505052335.GA37311@hishome.net> References: <20030504185544.6010.qmail@green.zadka.com> <20030505052335.GA37311@hishome.net> Message-ID: <200305051204.h45C3w424811@pcp02138704pcs.reston01.va.comcast.net> > I was wondering if a milder form of deprecation may be appropriate for > some features such as the apply builtin: > > 1. Add a notice in docstring 'not recommended for new code' > 2. Move to 'obsolete' or 'backward compatibility' section in manual > 3. Do NOT produce a warning (pychecker may still do that) > 4. Do NOT plan removal of feature in a specific future release The form of deprecation used for apply() is already very mild (you don't get a warning unless you do -Wall). I don't think Moshe's use case is important enough to care; if Moshe cares, he can easily construct a command line argument or warnings.filterwarnings() call to suppress the warnings he doesn't care about.
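[Editor's note: the per-message filter Guido refers to can be sketched as below. The message pattern is an assumption for illustration; since apply() and its DeprecationWarning are long gone, the sketch raises its own warnings to show the filter working.]

```python
import warnings

# The kind of filter Guido suggests Moshe could install himself:
# ignore DeprecationWarnings whose message mentions apply(),
# while letting all other warnings through.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    warnings.filterwarnings("ignore", message=".*apply.*",
                            category=DeprecationWarning)
    warnings.warn("apply() is deprecated", DeprecationWarning)   # filtered out
    warnings.warn("something else is deprecated", DeprecationWarning)

print(len(caught))  # only the unfiltered warning is recorded
```

The same filter can equally be installed from the command line with `-W "ignore::DeprecationWarning"`, which is the "command line argument" alternative Guido mentions.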
--Guido van Rossum (home page: http://www.python.org/~guido/) From mal@lemburg.com Mon May 5 13:30:30 2003 From: mal@lemburg.com (M.-A. Lemburg) Date: Mon, 05 May 2003 14:30:30 +0200 Subject: [Python-Dev] Distutils using apply In-Reply-To: <200305051202.h45C28E24795@pcp02138704pcs.reston01.va.comcast.net> References: <20030504185544.6010.qmail@green.zadka.com> <3EB61591.5070204@lemburg.com> <200305051202.h45C28E24795@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <3EB65966.6090005@lemburg.com> Guido van Rossum wrote: >>Could someone please explain why apply() was marked deprecated ? > > Because it's more readable, more efficient, and more flexible to write > f(x, y, *t) than apply(f, (x, y) + t). True, but it's in wide use out there, so it shouldn't go until Python 3 is out the door. BTW, shouldn't these deprecations be listed in, e.g., PEP 4 ? There doesn't seem to be a single place to look for deprecated features and APIs (PEP 4 only lists modules). I find it rather troublesome that deprecation seems to be using stealth mode of operation in Python development -- discussions about it rarely surface until someone complains about a warning relating to it. There should be open discussions about whether or not to deprecate functionality. >>The only reference I can find is in PEP 290 and that merely >>reports this "fact". >> >>I'm -1 on deprecating apply(). Not only because it introduces yet >>another incompatibility between Python versions, but also because it >>is still useful in the context of having a function which mimics >>a function call, e.g. for map() and other instances where you >>pass around functions as operators. > > Then maybe we should add something like operator.__call__. Why remove a common API and reinvent it somewhere else ? > OTOH, you're lucky that map isn't deprecated yet in favor of list > comprehensions; I expect that Python 3.0 won't have map or filter either.
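[Editor's note: Guido's "something like operator.__call__" did eventually materialize, as operator.call() in Python 3.11. For older interpreters the equivalent is a two-line helper; the sketch below uses a fallback so it runs anywhere, and shows the "function as operator" use with map() that MAL describes.]

```python
import operator

# operator.call exists on Python 3.11+; the lambda is a stand-in
# with the same behavior for older versions.
call = getattr(operator, "call", None) or (lambda f, *a, **kw: f(*a, **kw))

# Passing a callable "function call" around, e.g. to map():
result = list(map(call, [abs, len, max], [-3, "spam", (1, 9, 2)]))
print(result)  # [3, 4, 9]
```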
-- Marc-Andre Lemburg eGenix.com Professional Python Software directly from the Source (#1, May 05 2003) >>> Python/Zope Products & Consulting ... http://www.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ EuroPython 2003, Charleroi, Belgium: 50 days left From oren-py-d@hishome.net Mon May 5 13:50:07 2003 From: oren-py-d@hishome.net (Oren Tirosh) Date: Mon, 5 May 2003 08:50:07 -0400 Subject: [Python-Dev] Distutils using apply In-Reply-To: <200305051204.h45C3w424811@pcp02138704pcs.reston01.va.comcast.net> References: <20030504185544.6010.qmail@green.zadka.com> <20030505052335.GA37311@hishome.net> <200305051204.h45C3w424811@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <20030505125007.GA20312@hishome.net> On Mon, May 05, 2003 at 08:03:58AM -0400, Guido van Rossum wrote: > > I was wondering if a milder form of deprecation may be appropriate for > > some features such as the apply builtin: > > > > 1. Add a notice in docstring 'not recommended for new code' > > 2. Move to 'obsolete' or 'backward compatibility' section in manual > > 3. Do NOT produce a warning (pychecker may still do that) > > 4. Do NOT plan removal of feature in a specific future release > > The form of deprecation used for apply() is already very mild (you > don't get a warning unless you do -Wall). I don't think Moshe's use > case is important enough to care; if Moshe cares, he can easily > construct a command line argument or warnings.filterwarning() call to > suppress the warnings he doesn't care about. My comment was not specifically about Moshe's use case - it's about the meaning of deprecation in Python. Does it always have to mean "start replacing because it *will* go away" as seems to be implied by PEP 5 or perhaps in some cases it could just mean "please don't use this in new code, okay" ? 
Oren From guido@python.org Mon May 5 14:47:12 2003 From: guido@python.org (Guido van Rossum) Date: Mon, 05 May 2003 09:47:12 -0400 Subject: [Python-Dev] Distutils using apply In-Reply-To: Your message of "Mon, 05 May 2003 14:30:30 +0200." <3EB65966.6090005@lemburg.com> References: <20030504185544.6010.qmail@green.zadka.com> <3EB61591.5070204@lemburg.com> <200305051202.h45C28E24795@pcp02138704pcs.reston01.va.comcast.net> <3EB65966.6090005@lemburg.com> Message-ID: <200305051347.h45DlCp30562@odiug.zope.com> > Guido van Rossum wrote: > >>Could someone please explain why apply() was marked deprecated ? > > > > Becase it's more readable, more efficient, and more flexible to write > > f(x, y, *t) than apply(f, (x, y) + t). > > True, but it's in wide use out there, so it shouldn't go until > Python 3 is out the door. And it won't. But that doesn't mean we can't add a PendingDeprecation warning for it. > BTW, shouldn't these deprecations be listed in e.g PEP 4 ? > > There doesn't seem to be a single place to look for deprecated > features and APIs (PEP 4 only lists modules). That's a problem indeed. > I find it rather troublesome that deprecation seems to be using > stealth mode of operation in Python development -- discussions > about it rarely surface until someone complains about a warning > relating to it. There should be open discussions about whether > or not to deprecate functionality. I believe the discussions are open enough (things like this are never decided at PythonLabs, but always brought out on python-dev). But it's easy to miss these discussions, and the records aren't always clear. > > Then maybe we should add something like operator.__call__. > > Why remove a common API and reinvent it somewhere else ? To reflect its demoted status. 
--Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Mon May 5 14:50:05 2003 From: guido@python.org (Guido van Rossum) Date: Mon, 05 May 2003 09:50:05 -0400 Subject: [Python-Dev] Distutils using apply In-Reply-To: Your message of "Mon, 05 May 2003 08:50:07 EDT." <20030505125007.GA20312@hishome.net> References: <20030504185544.6010.qmail@green.zadka.com> <20030505052335.GA37311@hishome.net> <200305051204.h45C3w424811@pcp02138704pcs.reston01.va.comcast.net> <20030505125007.GA20312@hishome.net> Message-ID: <200305051350.h45Do5c30595@odiug.zope.com> > My comment was not specifically about Moshe's use case - it's about > the meaning of deprecation in Python. > > Does it always have to mean "start replacing because it *will* go > away" as seems to be implied by PEP 5 or perhaps in some cases it > could just mean "please don't use this in new code, okay" ? I think that can be safely left up to the individual programmer, who has a better idea (hopefully) on the life expectancy of his code. We try to give guidance about the urgency of the deprecation e.g. in PEPs or by using the normally-silent PendingDeprecation (which suggests it's not urgent :-). --Guido van Rossum (home page: http://www.python.org/~guido/) From aahz@pythoncraft.com Mon May 5 14:52:01 2003 From: aahz@pythoncraft.com (Aahz) Date: Mon, 5 May 2003 09:52:01 -0400 Subject: [Python-Dev] Dictionary sparseness In-Reply-To: <001501c312e8$bd892420$125ffea9@oemcomputer> References: <001501c312e8$bd892420$125ffea9@oemcomputer> Message-ID: <20030505135201.GA14870@panix.com> How about this: when we create read-only dicts, you add an optional argument that re-packs the dict and optimizes for space or speed. That way, the dict can be analyzed to provide appropriate results. 
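[Editor's note: the "re-pack" Aahz describes never got a public API, but its observable effect is available today by simply rebuilding the dict, which allocates a fresh internal table with no deleted-entry dummies. A small sketch; the exact table sizes are CPython internals, so only the visible contract is checked.]

```python
import sys

# Build a dict, delete most keys (leaving dummy slots in the internal
# table, which CPython does not shrink on deletion), then "pack" it.
d = {i: str(i) for i in range(1000)}
for i in range(990):
    del d[i]

packed = dict(d)  # fresh table sized for the 10 surviving entries

print(packed == d)                                  # same contents
print(sys.getsizeof(packed) <= sys.getsizeof(d))    # never a bigger table
```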
-- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "In many ways, it's a dull language, borrowing solid old concepts from many other languages & styles: boring syntax, unsurprising semantics, few automatic coercions, etc etc. But that's one of the things I like about it." --Tim Peters on Python, 16 Sep 93 From skip@pobox.com Mon May 5 15:34:14 2003 From: skip@pobox.com (Skip Montanaro) Date: Mon, 5 May 2003 09:34:14 -0500 Subject: [Python-Dev] How to test this? Message-ID: <16054.30310.489999.134263@montanaro.dyndns.org> I just added a patch file to . It doesn't include any test cases, since that requires an old db hash v2 file present. Is it okay to check in a dummy file to Lib/test for this purpose? Thanks, Skip From BPettersen@NAREX.com Mon May 5 15:55:02 2003 From: BPettersen@NAREX.com (Bjorn Pettersen) Date: Mon, 5 May 2003 08:55:02 -0600 Subject: [Python-Dev] Windows installer request... Message-ID: <60FB8BB7F0EFC7409B75EEEC13E20192022DE1FD@admin56.narex.com> Would it be possible for the Windows installer to use $SYSTEMDRIVE$ as the default installation drive instead of C:? (On my XP box, C: is my zip-drive, and E: is my SYSTEMDRIVE(*) -- I'm now re-installing :-) If it's considered a good idea, and someone can point me to where the change has to be made, I'd be more than willing to produce a patch... -- bjorn (*) Don't ask, MS wisdom I guess. Oh, and if you don't have a C: drive, all your WinExplorer icons disappear (subst C: a: and it works :-) In any case, I'm not brave enough to try to change it .
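[Editor's note: the value Bjorn wants the installer to use is exposed on NT-family Windows as the SystemDrive environment variable. A sketch, falling back to the hardcoded C: he complains about when the variable is unset (e.g. on non-Windows boxes); the Python23 directory name is a hypothetical stand-in for the installer's default.]

```python
import os

# NT/2000/XP set SystemDrive; fall back to the Wise script's hardcoded C:
system_drive = os.environ.get("SystemDrive", "C:")
default_install_dir = system_drive + r"\Python23"  # hypothetical target name

print(default_install_dir.endswith(r"\Python23"))
```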
From oren-py-d@hishome.net Mon May 5 15:58:06 2003 From: oren-py-d@hishome.net (Oren Tirosh) Date: Mon, 5 May 2003 10:58:06 -0400 Subject: [Python-Dev] Distutils using apply In-Reply-To: <200305051350.h45Do5c30595@odiug.zope.com> References: <20030504185544.6010.qmail@green.zadka.com> <20030505052335.GA37311@hishome.net> <200305051204.h45C3w424811@pcp02138704pcs.reston01.va.comcast.net> <20030505125007.GA20312@hishome.net> <200305051350.h45Do5c30595@odiug.zope.com> Message-ID: <20030505145806.GA46311@hishome.net> On Mon, May 05, 2003 at 09:50:05AM -0400, Guido van Rossum wrote: > > My comment was not specifically about Moshe's use case - it's about > > the meaning of deprecation in Python. > > > > Does it always have to mean "start replacing because it *will* go > > away" as seems to be implied by PEP 5 or perhaps in some cases it > > could just mean "please don't use this in new code, okay" ? > > I think that can be safely left up to the individual programmer, who > has a better idea (hopefully) on the life expectancy of his code. We > try to give guidance about the urgency of the deprecation e.g. in PEPs > or by using the normally-silent PendingDeprecation (which suggests > it's not urgent :-). I'm afraid this is too subtle for me. I'll ask my question a third time, hoping for an answer that a mere mortal can understand: Are all deprecated features on death row or are some of them merely serving a life sentence? Oren "Do not meddle in the affairs of BDFLs, for they are subtle and quick to anger" From guido@python.org Mon May 5 16:10:43 2003 From: guido@python.org (Guido van Rossum) Date: Mon, 05 May 2003 11:10:43 -0400 Subject: [Python-Dev] Distutils using apply In-Reply-To: Your message of "Mon, 05 May 2003 10:58:06 EDT." 
<20030505145806.GA46311@hishome.net> References: <20030504185544.6010.qmail@green.zadka.com> <20030505052335.GA37311@hishome.net> <200305051204.h45C3w424811@pcp02138704pcs.reston01.va.comcast.net> <20030505125007.GA20312@hishome.net> <200305051350.h45Do5c30595@odiug.zope.com> <20030505145806.GA46311@hishome.net> Message-ID: <200305051510.h45FAhY31026@odiug.zope.com> > Are all deprecated features on death row or are some of them merely > serving a life sentence? They are all slated to go away. --Guido van Rossum (home page: http://www.python.org/~guido/) From tim.one@comcast.net Mon May 5 16:14:07 2003 From: tim.one@comcast.net (Tim Peters) Date: Mon, 05 May 2003 11:14:07 -0400 Subject: [Python-Dev] Windows installer request... In-Reply-To: <60FB8BB7F0EFC7409B75EEEC13E20192022DE1FD@admin56.narex.com> Message-ID: [Bjorn Pettersen] > Would it be possible for the windows installer to use $SYSTEMDRIVE$ as > the default installation drive instead of C:? (On my XP box, C: is my > zip-drive, and E: is my SYSTEMDRIVE(*) -- I'm now re-installing :-) Are you saying that the "Select Destination Directory" dialog box doesn't allow you to select your E: drive? Or just that you'd rather not need to select the drive you want? > If it's considered a good idea, and someone can point me to where the > change has to be made, I'd be more than willing to produce a patch... I apparently left this comment in the Wise script: Note from Tim: doesn't seem to be a way to get the true boot drive, the Wizard hardcodes "C". So, AFAIK, there isn't a straightforward way to get Wise 8.14 to suggest a drive other than C:. Perhaps it would work better for you if I removed the Wizard-generated hardcoded "C:" (I don't know which drive Wise would pick then), but since yours is the only complaint about this I've seen, and I have no way to test such a change, I'm very reluctant to fiddle with it. 
From Jack.Jansen@oratrix.com Mon May 5 16:35:55 2003 From: Jack.Jansen@oratrix.com (Jack Jansen) Date: Mon, 5 May 2003 17:35:55 +0200 Subject: [Python-Dev] Dictionary sparseness In-Reply-To: <001301c31290$fcea25e0$125ffea9@oemcomputer> Message-ID: <42A588E4-7F0F-11D7-B65D-003065517236@oratrix.com> I sort-of agree with Guido that any calls to optimize dictionaries may do more good than bad, but I think that if we make the interface sufficiently abstract we may have something that may work. I was thinking of something analogous to madvise(): the user can specify high level access patterns. For Python dictionaries the access patterns would probably be - I'm going to write a lot of stuff - I'm done writing, and from now on I'm mainly going to read - I haven't a clue what I'm going to do Especially the "I'm going to read from now on" could be put to good use, for instance after completing the dictionary of a class. -- - Jack Jansen http://www.cwi.nl/~jack - - If I can't dance I don't want to be part of your revolution -- Emma Goldman - From skip@pobox.com Mon May 5 16:43:55 2003 From: skip@pobox.com (Skip Montanaro) Date: Mon, 5 May 2003 10:43:55 -0500 Subject: [Python-Dev] Dictionary sparseness In-Reply-To: <42A588E4-7F0F-11D7-B65D-003065517236@oratrix.com> References: <001301c31290$fcea25e0$125ffea9@oemcomputer> <42A588E4-7F0F-11D7-B65D-003065517236@oratrix.com> Message-ID: <16054.34491.64051.134832@montanaro.dyndns.org> Jack> I was thinking of something analogous to madvise(): ... Quick, everyone who's used madvise() please raise your hand... I'll bet a beer most people (even on this list) have never put it to good use. We all know Tim probably has just because he's Tim, and apparently Jack has. Anyone else? Guido, have you ever been tempted? 
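[Editor's note: Jack's madvise()-style interface for dicts, with its three access patterns, could look roughly like the toy below. All names are hypothetical; nothing like this exists in CPython. As with madvise(), advice is allowed to be a no-op, which is part of what makes the idea safe to expose.]

```python
class AdvisableDict(dict):
    """A dict accepting the three advisory access patterns Jack lists."""
    WRITE_HEAVY = "write-heavy"
    READ_MOSTLY = "read-mostly"
    UNKNOWN = "unknown"

    def advise(self, pattern):
        if pattern == self.READ_MOSTLY:
            # "Done writing": rebuild so deleted-entry dummies are dropped
            # and subsequent lookups walk shorter collision chains.
            items = list(self.items())
            self.clear()
            self.update(items)
        # WRITE_HEAVY and UNKNOWN are deliberately no-ops here.

d = AdvisableDict((i, i * i) for i in range(100))
for i in range(90):
    del d[i]
d.advise(AdvisableDict.READ_MOSTLY)
print(len(d), d[95])  # 10 9025
```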
Skip From guido@python.org Mon May 5 16:58:44 2003 From: guido@python.org (Guido van Rossum) Date: Mon, 05 May 2003 11:58:44 -0400 Subject: [Python-Dev] Dictionary sparseness In-Reply-To: Your message of "Mon, 05 May 2003 10:43:55 CDT." <16054.34491.64051.134832@montanaro.dyndns.org> References: <001301c31290$fcea25e0$125ffea9@oemcomputer> <42A588E4-7F0F-11D7-B65D-003065517236@oratrix.com> <16054.34491.64051.134832@montanaro.dyndns.org> Message-ID: <200305051558.h45Fwi531325@odiug.zope.com> > Jack> I was thinking of something analogous to madvise(): ... > > Quick, everyone who's used madvise() please raise your hand... I'll bet a > beer most people (even on this list) have never put it to good use. We all > know Tim probably has just because he's Tim, and apparently Jack has. > Anyone else? Guido, have you ever been tempted? What's madvise()? :-) --Guido van Rossum (home page: http://www.python.org/~guido/) From paul@pfdubois.com Mon May 5 17:13:17 2003 From: paul@pfdubois.com (Paul Dubois) Date: Mon, 5 May 2003 09:13:17 -0700 Subject: [Python-Dev] Election of Todd Miller as head of numpy team Message-ID: <000001c31321$3dc958c0$6801a8c0@NICKLEBY> Todd Miller has been elected as the new Head of the Numeric Python development team. I am still an active developer, but it was time to rotate responsibilities. We especially need help with Numeric maintenance while Todd is working on Numarray. Thanks to all of you who helped me during my tenure. Remember, when you see Todd, the expected greeting to the NummieHead is a salute with more than one finger, accompanied by the cry, "Ni Ni Numpy!". See the file DEVELOPERS in the distribution for our "constitution". 
Paul From aleax@aleax.it Mon May 5 17:42:02 2003 From: aleax@aleax.it (Alex Martelli) Date: Mon, 5 May 2003 18:42:02 +0200 Subject: [Python-Dev] Dictionary sparseness In-Reply-To: <16054.34491.64051.134832@montanaro.dyndns.org> References: <001301c31290$fcea25e0$125ffea9@oemcomputer> <42A588E4-7F0F-11D7-B65D-003065517236@oratrix.com> <16054.34491.64051.134832@montanaro.dyndns.org> Message-ID: <200305051842.02937.aleax@aleax.it> On Monday 05 May 2003 05:43 pm, Skip Montanaro wrote: > Jack> I was thinking of something analogous to madvise(): ... > > Quick, everyone who's used madvise() please raise your hand... I'll bet a > beer most people (even on this list) have never put it to good use. We all > know Tim probably has just because he's Tim, and apparently Jack has. I used madvise extensively (and quite successfully) back when I was the senior software consultant responsible for the lower-levels of a variety of Unix-system ports of a line of mechanical CAD products. And I loved and still love the general concept -- let me advise an optimizer (so it can do whatever -- be it a little or a lot -- rather than spend energy trying to guess what in blazes I may be doing:-). Alex From guido@python.org Mon May 5 17:47:06 2003 From: guido@python.org (Guido van Rossum) Date: Mon, 05 May 2003 12:47:06 -0400 Subject: [Python-Dev] Dictionary sparseness In-Reply-To: Your message of "Mon, 05 May 2003 18:42:02 +0200." <200305051842.02937.aleax@aleax.it> References: <001301c31290$fcea25e0$125ffea9@oemcomputer> <42A588E4-7F0F-11D7-B65D-003065517236@oratrix.com> <16054.34491.64051.134832@montanaro.dyndns.org> <200305051842.02937.aleax@aleax.it> Message-ID: <200305051647.h45Gl6N04048@odiug.zope.com> > I used madvise extensively (and quite successfully) back when I was > the senior software consultant responsible for the lower-levels of a > variety of Unix-system ports of a line of mechanical CAD products. 
> And I loved and still love the general concept -- let me advise an > optimizer (so it can do whatever -- be it a little or a lot -- > rather than spend energy trying to guess what in blazes I may be > doing:-). Hm. How do you know that you were successful? I could think of an implementation that's similar to those "press to cross" buttons you see at some intersections, and which seem to have no effect whatsoever on the traffic lights. :-) --Guido van Rossum (home page: http://www.python.org/~guido/) From jeremy@zope.com Mon May 5 18:11:50 2003 From: jeremy@zope.com (Jeremy Hylton) Date: 05 May 2003 13:11:50 -0400 Subject: [Python-Dev] Dictionary sparseness In-Reply-To: <200305051842.02937.aleax@aleax.it> References: <001301c31290$fcea25e0$125ffea9@oemcomputer> <42A588E4-7F0F-11D7-B65D-003065517236@oratrix.com> <16054.34491.64051.134832@montanaro.dyndns.org> <200305051842.02937.aleax@aleax.it> Message-ID: <1052154710.12534.14.camel@slothrop.zope.com> On Mon, 2003-05-05 at 12:42, Alex Martelli wrote: > On Monday 05 May 2003 05:43 pm, Skip Montanaro wrote: > > Jack> I was thinking of something analogous to madvise(): ... > > > > Quick, everyone who's used madvise() please raise your hand... I'll bet a > > beer most people (even on this list) have never put it to good use. We all > > know Tim probably has just because he's Tim, and apparently Jack has. > > I used madvise extensively (and quite successfully) back when I was the > senior software consultant responsible for the lower-levels of a variety of > Unix-system ports of a line of mechanical CAD products. And I loved and > still love the general concept -- let me advise an optimizer (so it can do > whatever -- be it a little or a lot -- rather than spend energy trying to > guess what in blazes I may be doing:-). Have you seen the work on gray-box systems? http://www.cs.wisc.edu/graybox/ The philosophy of this project seems to be "You can observe an awful lot just by watching." (Apologies to Yogi.)
The approach is to learn how a particular service is implemented, e.g. what buffer-replacement algorithm is used, by observing its behavior. Then write an application that exploits that knowledge to drive the system into optimized behavior for the application. No madvise() necessary. I wonder if the same can be done for dicts? My first guess would be no, because the sparseness is a fixed policy. Jeremy From aleax@aleax.it Mon May 5 18:22:53 2003 From: aleax@aleax.it (Alex Martelli) Date: Mon, 5 May 2003 19:22:53 +0200 Subject: [Python-Dev] Dictionary sparseness In-Reply-To: <200305051647.h45Gl6N04048@odiug.zope.com> References: <001301c31290$fcea25e0$125ffea9@oemcomputer> <200305051842.02937.aleax@aleax.it> <200305051647.h45Gl6N04048@odiug.zope.com> Message-ID: <200305051922.53855.aleax@aleax.it> On Monday 05 May 2003 06:47 pm, Guido van Rossum wrote: > > I used madvise extensively (and quite successfully) back when I was > > the senior software consultant responsible for the lower-levels of a > > variety of Unix-system ports of a line of mechanical CAD products. > > And I loved and still love the general concept -- let me advise an > > optimizer (so it can do whatever -- be it a little or a lot -- > > rather than spend energy trying to guess what in blazes I may be > > doing:-). > > Hm. How do you know that you were succesful? I could think of an By measuring applications' performance on important benchmarks (mostly not artificial ones, but rather actual benchmarks used in the past by some customers to help them choose which CAD package to buy -- we treasured those, at that firm, and had built up quite a portfolio of them over the years). 
As CPUs and floating-point units became fast enough, more and more of the speed issues with so-called "CPU intensive" bottlenecks in mechanical-engineering CAD actually became related to memory-access patterns (a phenomenon I had already observed when I worked on IBM multi-CPU mainframes with vector-units, being sold as "supercomputers" but in fact still having complex and deep memory hierarchies -- Cray guys of the time such as Tim no doubt had it easier!-). > implementation that's similar to those "press to cross" buttons you > see at some intersections, and which seem to have no effect whatsoever > on the traffic lights. :-) Yes, there were a few of those, too. That's part of what's cool about an "advise" operation: it IS quite OK to implement it as a no-op, both in the early times when you're moving an existing API to some new platform, AND in (hypothetical:-) late maturity when your optimizer's pattern-detector has become able to outsmart the programmer on a regular basis. C's "register" keyword is a familiar example: it was quite precious in very early compilers with nearly nonexistent optimizers, it was regularly ignored in new compilers for very limited (and particularly register-limited) platforms, and it's invariably ignored now that optimizers have become able to allocate registers better than most programmers. (It should probably have been a #pragma rather than eat up a reserved word, but that's just syntactic-level hindsight:-). Alex From aleax@aleax.it Mon May 5 18:36:20 2003 From: aleax@aleax.it (Alex Martelli) Date: Mon, 5 May 2003 19:36:20 +0200 Subject: [Python-Dev] Dictionary sparseness In-Reply-To: <1052154710.12534.14.camel@slothrop.zope.com> References: <001301c31290$fcea25e0$125ffea9@oemcomputer> <200305051842.02937.aleax@aleax.it> <1052154710.12534.14.camel@slothrop.zope.com> Message-ID: <200305051936.20078.aleax@aleax.it> On Monday 05 May 2003 07:11 pm, Jeremy Hylton wrote: ... > Have you seen the work on gray-box systems? 
> > http://www.cs.wisc.edu/graybox/ > > The philosophy of this project seems to be "You can observe an awful lot > just by watching." (Apologies to Yogi.) The approach is to learn how a > particular service is implemented, e.g. what buffer-replacement > algorithm is used, by observing its behavior. Then write an application > that exploits that knowledge to drive the system into optimized behavior > for the application. No madvise() necessary. Haven't read that URL, but this seems to summarize the way we had to work with Fortran compilers on 3090-VF's back in the late '80s -- no way to explicitly advise the compiler about what and how to vectorize, so, lots of experimentation and tweaking to find out what (expletive deleted) heuristics the GD beast was using, and how to outsmart it and get it to vectorize what *WE* wanted rather than what *IT* thought was good for us. What fun! And of course we got to redo it all over again when a new compiler release came out. No thanks. I've paid my dues and I hope I will *NEVER* again have to work with a system that thinks it's so smart it doesn't need my advisory input -- or at least not on anything that's as performance-crucial as those Fortran programs were (most of my work in IBM Research I did with Rexx -- that's when I learned to love scripting! -- but then and again we did have to crunch really huge batches of numbers). Alex From jeremy@zope.com Mon May 5 18:41:35 2003 From: jeremy@zope.com (Jeremy Hylton) Date: 05 May 2003 13:41:35 -0400 Subject: [Python-Dev] Dictionary sparseness In-Reply-To: <200305051936.20078.aleax@aleax.it> References: <001301c31290$fcea25e0$125ffea9@oemcomputer> <200305051842.02937.aleax@aleax.it> <1052154710.12534.14.camel@slothrop.zope.com> <200305051936.20078.aleax@aleax.it> Message-ID: <1052156494.12531.27.camel@slothrop.zope.com> On Mon, 2003-05-05 at 13:36, Alex Martelli wrote: > No thanks.
I've paid my dues and I hope I will *NEVER* again have to > work with a system that thinks it's so smart it doesn't need my advisory > input -- or at least not on anything that's as performance-crucial as > those Fortran programs were (most of my work in IBM Research in > did with Rexx -- that's when I learned to love scripting! -- but then and > again we did have to crunch really huge batches of numbers). I think the graybox project is assuming that few people will have the luxury of working with a system that accepts useful advisory input. Given that hypothesis, they built a tool for identifying what algorithm is being used so that it can be tweaked appropriately. Jeremy From guido@python.org Mon May 5 18:46:28 2003 From: guido@python.org (Guido van Rossum) Date: Mon, 05 May 2003 13:46:28 -0400 Subject: [Python-Dev] Dictionary sparseness In-Reply-To: Your message of "Mon, 05 May 2003 19:36:20 +0200." <200305051936.20078.aleax@aleax.it> References: <001301c31290$fcea25e0$125ffea9@oemcomputer> <200305051842.02937.aleax@aleax.it> <1052154710.12534.14.camel@slothrop.zope.com> <200305051936.20078.aleax@aleax.it> Message-ID: <200305051746.h45HkS009569@odiug.zope.com> > No thanks. I've paid my dues and I hope I will *NEVER* again have to > work with a system that thinks it's so smart it doesn't need my advisory > input -- or at least not on anything that's as performance-crucial as > those Fortran programs were [...] I severely doubt that any Python apps are as performance-critical as those Fortran programs were. --Guido van Rossum (home page: http://www.python.org/~guido/) From tim.one@comcast.net Mon May 5 18:40:18 2003 From: tim.one@comcast.net (Tim Peters) Date: Mon, 05 May 2003 13:40:18 -0400 Subject: [Python-Dev] Dictionary sparseness In-Reply-To: <1052154710.12534.14.camel@slothrop.zope.com> Message-ID: [Jeremy Hylton] > Have you seen the work on gray-box systems? 
> > http://www.cs.wisc.edu/graybox/ > > The philosophy of this project seems to be "You can observe an awful lot > just by watching." (Apologies to Yogi.) The approach is to learn how a > particular service is implemented, e.g. what buffer-replacement > algorithm is used, by observing its behavior. Then write an application > that exploits that knowledge to drive the system into optimized behavior > for the application. No madvise() necessary. > > I wonder if the same can be done for dicts? My first guess would be no, > because the sparseness is a fixed policy. Well, a dict suffers damaging collisions or it doesn't. If it does, the best thing a user can do is rebuild the dict from scratch, inserting keys by decreasing order of access frequency. Then the most frequently accessed keys come earliest in their collision chains. Collisions simply don't matter for rarely referenced keys. (And, for example, if there *are* any truly damaging collisions in __builtin__.__dict__, I expect this gimmick would remove the damage.) The size of the dict can be forced larger by inserting artificial keys, if a user is insane . It's always been possible to eliminate dummy entries by doing "dict = dict.copy()". Note that because Python exposes the hash function used by dicts, you can write a faithful low-level dict emulator in Python, and deduce what effects a sequence of dict inserts and deletes will have. So, overall, I expect there's more you *could* do to speed dict access (in the handful of bad cases it's not already good enough) yourself than Python could do for you. You'd have to be nuts, though -- or writing papers on gray-box systems. From tim.one@comcast.net Mon May 5 18:54:41 2003 From: tim.one@comcast.net (Tim Peters) Date: Mon, 05 May 2003 13:54:41 -0400 Subject: [Python-Dev] Dictionary sparseness In-Reply-To: <200305051922.53855.aleax@aleax.it> Message-ID: [Alex Martelli] > ... 
> As CPUs and floating-point units became fast enough, more and more of > the speed issues with so-called "CPU intensive" bottlenecks in > mechanical-engineering CAD actually became related to memory-access > patterns (a phenomenon I had already observed when I worked on IBM > multi-CPU mainframes with vector-units, being sold as "supercomputers" > but in fact still having complex and deep memory hierarchies -- Cray guys > of the time such as Tim no doubt had it easier!-). Indeed, Seymour Cray used to say a supercomputer is a machine that transforms a CPU-bound program into an I/O-bound program, and didn't want anything "in between" complicating that view. As a result, optimizing programs to run on Crays was, while still arbitrarily difficult, generally a monotonic process, rarely beset by "mysterious regressions" along the way. Now that gigabyte+ RAM boxes are becoming common, I wonder when someone will figure out that the VM machinery is just slowing them down <0.9 wink>.

From aleax@aleax.it Mon May 5 18:57:25 2003 From: aleax@aleax.it (Alex Martelli) Date: Mon, 5 May 2003 19:57:25 +0200 Subject: [Python-Dev] Dictionary sparseness In-Reply-To: <200305051746.h45HkS009569@odiug.zope.com> References: <001301c31290$fcea25e0$125ffea9@oemcomputer> <200305051936.20078.aleax@aleax.it> <200305051746.h45HkS009569@odiug.zope.com> Message-ID: <200305051957.25403.aleax@aleax.it> On Monday 05 May 2003 07:46 pm, Guido van Rossum wrote: > > No thanks. I've paid my dues and I hope I will *NEVER* again have to > > work with a system that thinks it's so smart it doesn't need my advisory > > input -- or at least not on anything that's as performance-crucial as > > those Fortran programs were [...] > > I severely doubt that any Python apps are as performance-critical as > those Fortran programs were. Yes, this may well be correct.
My only TRUE wish for tuning performance of Python applications is to have SOME ways to measure memory footprints with sensible guesses about where they come from -- THAT is where I might gain hugely (by fighting excessive working sets through selective flushing of caches, freelists, etc). Alex From jeremy@zope.com Mon May 5 19:12:12 2003 From: jeremy@zope.com (Jeremy Hylton) Date: 05 May 2003 14:12:12 -0400 Subject: [Python-Dev] Dictionary sparseness In-Reply-To: <200305051957.25403.aleax@aleax.it> References: <001301c31290$fcea25e0$125ffea9@oemcomputer> <200305051936.20078.aleax@aleax.it> <200305051746.h45HkS009569@odiug.zope.com> <200305051957.25403.aleax@aleax.it> Message-ID: <1052158331.12534.31.camel@slothrop.zope.com> On Mon, 2003-05-05 at 13:57, Alex Martelli wrote: > On Monday 05 May 2003 07:46 pm, Guido van Rossum wrote: > > > No thanks. I've paid my dues and I hope I will *NEVER* again have to > > > work with a system that thinks it's so smart it doesn't need my advisory > > > input -- or at least not on anything that's as performance-crucial as > > > those Fortran programs were [...] > > > > I severely doubt that any Python apps are as performance-critical as > > those Fortran programs were. > > Yes, this may well be correct. My only TRUE wish for tuning performance > of Python applications is to have SOME ways to measure memory > footprints with sensible guesses about where they come from -- THAT > is where I might gain hugely (by fighting excessive working sets through > selective flushing of caches, freelists, etc). Any idea how to actually do this? Jeremy From python@rcn.com Mon May 5 19:12:53 2003 From: python@rcn.com (Raymond Hettinger) Date: Mon, 5 May 2003 14:12:53 -0400 Subject: [Python-Dev] Dictionary sparseness References: Message-ID: <000101c31339$791838c0$125ffea9@oemcomputer> > the best thing a user can do is rebuild the dict from scratch, inserting keys by > decreasing order of access frequency. 
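Tim's earlier remark that Python exposes the hash function, so a faithful low-level dict emulator can be written in Python, pairs naturally with the rebuild trick quoted above. The sketch below is a toy, not CPython's implementation: the probe recurrence is the one described in dictobject.c's comments, the names (`EmulatedDict`, `probe_sequence`) are invented for illustration, and there is no resizing.

```python
PERTURB_SHIFT = 5
MASK64 = (1 << 64) - 1           # emulate the unsigned arithmetic C uses

def probe_sequence(key, mask):
    # Slot sequence assumed from dictobject.c's commentary:
    # j = (5*j + 1 + perturb) % size; perturb >>= PERTURB_SHIFT.
    h = hash(key) & MASK64
    perturb = h
    j = h & mask
    while True:
        yield j
        perturb >>= PERTURB_SHIFT
        j = (5 * j + 1 + perturb) & mask

class EmulatedDict:
    """Toy open-addressing table; reports how many slots a key needs."""
    def __init__(self, size=8):          # size must be a power of two
        self.mask = size - 1
        self.slots = [None] * size       # each slot: None or (key, value)

    def probes(self, key):
        """(number of slots inspected, slot index) for `key`."""
        for n, j in enumerate(probe_sequence(key, self.mask), 1):
            slot = self.slots[j]
            if slot is None or slot[0] == key:
                return n, j

    def __setitem__(self, key, value):   # no resizing: keep it under-full
        _, j = self.probes(key)
        self.slots[j] = (key, value)

# hash(0) == 0 and hash(8) == 8, so both keys contend for slot 0 of 8.
hot_first = EmulatedDict(); hot_first[0] = "hot"; hot_first[8] = "cold"
hot_last = EmulatedDict(); hot_last[8] = "cold"; hot_last[0] = "hot"
assert hot_first.probes(0)[0] == 1   # inserted first: found immediately
assert hot_last.probes(0)[0] == 2    # inserted second: one collision first
```

Inserting the frequently accessed key first shortens its collision chain, which is exactly why rebuilding a dict in decreasing order of access frequency can help.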
Then a periodic resize comes along, re-inserting everything in a different order. > The size of the dict can be forced larger by > inserting artificial keys, if a user is insane. Uh oh: http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/198157 > You'd have to be nuts, though That explains a lot ;) Does the *4 patch (amended to have an upper bound) have a chance? It's automatic, simple, benefits some cases while not harming others. Raymond

From aleax@aleax.it Mon May 5 20:21:28 2003 From: aleax@aleax.it (Alex Martelli) Date: Mon, 5 May 2003 21:21:28 +0200 Subject: [Python-Dev] Dictionary sparseness In-Reply-To: <1052158331.12534.31.camel@slothrop.zope.com> References: <001301c31290$fcea25e0$125ffea9@oemcomputer> <200305051957.25403.aleax@aleax.it> <1052158331.12534.31.camel@slothrop.zope.com> Message-ID: <200305052121.28017.aleax@aleax.it> On Monday 05 May 2003 08:12 pm, Jeremy Hylton wrote: > On Mon, 2003-05-05 at 13:57, Alex Martelli wrote: > > On Monday 05 May 2003 07:46 pm, Guido van Rossum wrote: > > > > No thanks. I've paid my dues and I hope I will *NEVER* again have to > > > > work with a system that thinks it's so smart it doesn't need my > > > > advisory input -- or at least not on anything that's as > > > > performance-crucial as those Fortran programs were [...] > > > > > > I severely doubt that any Python apps are as performance-critical as > > > those Fortran programs were. > > > > Yes, this may well be correct. My only TRUE wish for tuning performance > > of Python applications is to have SOME ways to measure memory > > footprints with sensible guesses about where they come from -- THAT > > is where I might gain hugely (by fighting excessive working sets through > > selective flushing of caches, freelists, etc). > > Any idea how to actually do this? Not really, even though I've been thinking about it for a while -- pymalloc's the only "hook" that comes to mind so far.
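For the record, a hook of exactly the kind Alex asks for did eventually exist: the tracemalloc module (standard library in much later Pythons) attributes live allocations to the source lines that made them. A minimal sketch, offered for contrast rather than as anything available in the interpreter under discussion:

```python
import tracemalloc

tracemalloc.start()                            # begin tracing allocations

data = [bytearray(1000) for _ in range(100)]   # deliberate footprint

snapshot = tracemalloc.take_snapshot()
for stat in snapshot.statistics("lineno")[:3]:
    # Each stat names a file:line plus the total size and count allocated
    # there -- a "sensible guess about where the memory comes from".
    print(stat)
```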
Alex

From skip@pobox.com Mon May 5 20:35:23 2003 From: skip@pobox.com (Skip Montanaro) Date: Mon, 5 May 2003 14:35:23 -0500 Subject: Where'd my memory go? (was Re: [Python-Dev] Dictionary sparseness) In-Reply-To: <200305051957.25403.aleax@aleax.it> References: <001301c31290$fcea25e0$125ffea9@oemcomputer> <200305051936.20078.aleax@aleax.it> <200305051746.h45HkS009569@odiug.zope.com> <200305051957.25403.aleax@aleax.it> Message-ID: <16054.48379.533379.672799@montanaro.dyndns.org>

Alex> My only TRUE wish for tuning performance of Python applications is
Alex> to have SOME ways to measure memory footprints with sensible
Alex> guesses about where they come from

Here's a thought. Debug builds appear to now add a getobjects method to sys. Would it be possible to also add another method to sys (also only available on debug builds) which knows just enough about basic builtin object types to say a little about how much space it's consuming? For example, I could do something like this:

    allocdict = {}
    for o in sys.getobjects(0):
        allocsize = sys.get_object_allocation_size(o)
        # I'm not a fan of {}.setdefault()
        alloc = allocdict.get(type(o), [])
        alloc.append(allocsize)  # or alloc.append((allocsize, o))
        allocdict[type(o)] = alloc

Once the list is traversed you can poke around in allocdict figuring out where your memory went (other than to allocdict itself!). (I was tempted to suggest another method, but I fear that would just spread the mess around. That may also be a viable option though.) Skip

From tim@zope.com Mon May 5 20:34:58 2003 From: tim@zope.com (Tim Peters) Date: Mon, 5 May 2003 15:34:58 -0400 Subject: [Python-Dev] Dictionary sparseness In-Reply-To: <000101c31339$791838c0$125ffea9@oemcomputer> Message-ID:

[Tim]
>> the best thing a user can do is rebuild the dict from scratch,
>> inserting keys by decreasing order of access frequency.

[Raymond Hettinger]
> Then a periodic resize comes along, re-inserting everything
> in a different order.
Sure -- micro-optimizations are always fragile. This kind of thing will be done by someone who's certain the dict is henceforth read-only, and who thinks it's worth the risk and obscurity to get some small speedup. They're probably right at the time they do it, too, and probably wrong over time. Same thing goes for, e.g., an madvise() call claiming a current truth that changes over time. > ... > Does the *4 patch (amended to have an upper bound) have a chance? > It's automatic, simple, benefits some cases while not harming others, It would be nice if more people tried it and added their results to the patch report: http://www.python.org/sf/729395 Right now, we just have Guido's comment saying that he no longer sees the Zope3 startup speedup he thought he saw earlier. Small percentage speedups are like that, alas. The patch is OK by me. From tim.one@comcast.net Mon May 5 20:56:41 2003 From: tim.one@comcast.net (Tim Peters) Date: Mon, 05 May 2003 15:56:41 -0400 Subject: Where'd my memory go? (was Re: [Python-Dev] Dictionary sparseness) In-Reply-To: <16054.48379.533379.672799@montanaro.dyndns.org> Message-ID: [Skip Montanaro] > Here's a thought. Debug builds appear to now add a getobjects method to > sys. Yes, although that isn't new -- it's been there forever (read Misc/SpecialBuilds.txt). > Would it be possible to also add another method to sys (also only > available on debug builds) which knows just enough about basic builtin > object types to say a little about how much space it's consuming? Marc-Andre has something like that in mxTools already (his sizeof() function). Note also the COUNT_ALLOCS special build, which saves info about total # of allocations, deallocations, and highwater mark per type, made available via sys.getcounts(). The nifty thing about COUNT_ALLOCS is that you can enable it in a release build (it doesn't rely on the debug-build changes to the layout of PyObject). 
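Tim's COUNT_ALLOCS description can be turned into a small reporting helper. The function name below is ours, not the stdlib's; sys.getcounts() exists only in a COUNT_ALLOCS special build, so on an ordinary interpreter this degrades gracefully:

```python
import sys

def top_alloc_types(n=10):
    """Top n types by live objects, from COUNT_ALLOCS counters (if any)."""
    if not hasattr(sys, "getcounts"):
        return None                    # not a COUNT_ALLOCS build
    # Each entry is assumed to be (tp_name, tp_allocs, tp_frees,
    # tp_maxalloc), as described in Misc/SpecialBuilds.txt.
    counts = sorted(sys.getcounts(), key=lambda t: t[1] - t[2], reverse=True)
    return [(name, allocs - frees, maxalloc)
            for name, allocs, frees, maxalloc in counts[:n]]

print(top_alloc_types() or "not a COUNT_ALLOCS build")
```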
Stuff all these things miss (even pymalloc, because it isn't asked for the memory) include the immortal and unbounded int freelist, the I&U float FL, and the immortal but bounded frameobject FL. Do, e.g., range(2000000) (as someone did on c.l.py last week), and about 24MB "goes missing" until the program shuts down (it's sitting in the int FL). Note that pymalloc never returns its "arenas" to the system either.

From zooko@zooko.com Mon May 5 21:02:08 2003 From: zooko@zooko.com (Zooko) Date: Mon, 05 May 2003 16:02:08 -0400 Subject: [Python-Dev] Re: heaps In-Reply-To: Message from "Raymond Hettinger" of "Sun, 04 May 2003 22:26:47 EDT." <003f01c312ad$c7277580$125ffea9@oemcomputer> References: <003f01c312ad$c7277580$125ffea9@oemcomputer> Message-ID:

From heapq.py:

"""
Usage:

heap = []            # creates an empty heap
heappush(heap, item) # pushes a new item on the heap
item = heappop(heap) # pops the smallest item from the heap
item = heap[0]       # smallest item on the heap without popping it

...

[It is] possible to view the heap as a regular Python list without surprises: heap[0] is the smallest item, and heap.sort() maintains the heap invariant!
"""

Shouldn't heapq be a subclass of list? Then it would read:

"""
heap = heapq()       # creates an empty heap
heap.push(item)      # pushes a new item on the heap
item = heap.pop()    # pops the smallest item from the heap
item = heap[0]       # smallest item on the heap without popping it
"""

In addition to nicer syntax, this would give you the option to forbid invariant-breaking alterations. Although you could also choose to allow invariant-breaking alterations, just as the current heapq does. One thing I don't know how to implement is:

    # This changes mylist itself into a heapq -- it doesn't make a copy of mylist!
    makeheapq(mylist)

Perhaps this is a limitation of the current object model? Or is there a way to change an object's type at runtime?
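Zooko's sketch runs today as a thin list subclass over the existing heapq functions, and the closing question has a concrete answer: CPython refuses `__class__` assignment on a plain list instance, which is exactly the object-model limitation suspected here. A sketch:

```python
import heapq

class Heap(list):
    """The list subclass sketched above, delegating to heapq."""
    def push(self, item):
        heapq.heappush(self, item)
    def pop(self):
        return heapq.heappop(self)

heap = Heap()
heap.push(3); heap.push(1); heap.push(2)
assert heap[0] == 1            # smallest item without popping
assert heap.pop() == 1

# makeheapq(mylist) without copying is where the object model says no:
mylist = [3, 1, 2]
try:
    mylist.__class__ = Heap    # re-type the object in place?
except TypeError:
    pass                       # rejected for instances of builtin types
```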
Regards, Zooko http://zooko.com/

From agthorr@barsoom.org Mon May 5 21:14:16 2003 From: agthorr@barsoom.org (Agthorr) Date: Mon, 5 May 2003 13:14:16 -0700 Subject: [Python-Dev] Dictionary sparseness In-Reply-To: References: <000101c31339$791838c0$125ffea9@oemcomputer> Message-ID: <20030505201416.GB17384@barsoom.org> On Mon, May 05, 2003 at 03:34:58PM -0400, Tim Peters wrote: > Sure -- micro-optimizations are always fragile. This kind of thing will be > done by someone who's certain the dict is henceforth read-only, and who > thinks it's worth the risk and obscurity to get some small speedup. An alternate optimization would be the addition of an immutable dictionary type to the language, initialized from a mutable dictionary type. Upon creation, this dictionary would optimize itself, in a manner similar to the "gperf" program which creates (nearly) minimal zero-collision hash tables. On the plus side, this would form a nice symmetry with the existing mutable vs immutable types. Also, it would be proof against bit-rot, since either: a) the user changes the mutable dictionary before it is optimized. In this case, the optimizer will simply optimize the new dictionary, or b) the user attempts to modify the immutable dictionary, which will fail with an error. -- Agthorr

From skip@pobox.com Mon May 5 21:24:39 2003 From: skip@pobox.com (Skip Montanaro) Date: Mon, 5 May 2003 15:24:39 -0500 Subject: Where'd my memory go? (was Re: [Python-Dev] Dictionary sparseness) In-Reply-To: References: <16054.48379.533379.672799@montanaro.dyndns.org> Message-ID: <16054.51335.255858.381526@montanaro.dyndns.org>

Tim> Stuff all these things miss (even pymalloc, because it isn't asked
Tim> for the memory) include the immortal and unbounded int freelist,
Tim> the I&U float FL, and the immortal but bounded frameobject FL. Do,
Tim> e.g., range(2000000) (as someone did on c.l.py last week), and
Tim> about 24MB "goes missing" until the program shuts down (it's
Tim> sitting in the int FL).
Note that pymalloc never returns its Tim> "arenas" to the system either. These shortcomings could be remedied by suitable inspection functions added to sys for debug builds. This leads me to wonder, has anyone measured the cost of deleting the int and float free lists when pymalloc is enabled? I wonder how unbearable it would be. Skip From martin@v.loewis.de Mon May 5 21:39:40 2003 From: martin@v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=) Date: 05 May 2003 22:39:40 +0200 Subject: [Python-Dev] How to test this? In-Reply-To: <16054.30310.489999.134263@montanaro.dyndns.org> References: <16054.30310.489999.134263@montanaro.dyndns.org> Message-ID: Skip Montanaro writes: > I just added a patch file to . It > doesn't include any test cases, since that requires an old db hash > v2 file present. Is it okay to check in a dummy file to Lib/test > for this purpose? Make sure you use -kb in the cvs add. Apart from that, it would be fine by me - except that I recall that the file format is endianness-sensitive, so you should make sure that the test passes on machines of both endiannesses before adding the file. Regards, Martin From aleax@aleax.it Mon May 5 21:42:40 2003 From: aleax@aleax.it (Alex Martelli) Date: Mon, 5 May 2003 22:42:40 +0200 Subject: [Python-Dev] Dictionary sparseness In-Reply-To: <20030505201416.GB17384@barsoom.org> References: <000101c31339$791838c0$125ffea9@oemcomputer> <20030505201416.GB17384@barsoom.org> Message-ID: <200305052242.40380.aleax@aleax.it> On Monday 05 May 2003 10:14 pm, Agthorr wrote: > On Mon, May 05, 2003 at 03:34:58PM -0400, Tim Peters wrote: > > Sure -- micro-optimizations are always fragile. This kind of thing will > > be done by someone who's certain the dict is henceforth read-only, and > > who thinks it's worth the risk and obscurity to get some small speedup. 
> > An alternate optimization would be the addition of an immutable > dictionary type to the language, initialized from a mutable dictionary I'd love a read-only dictionary (AND a read-only list) for reasons having little to do with optimization, actually -- ease of use as dict keys and/or set members, plus, occasional help in catching errors (for the latter use it would be wonderful if read-only dictionaries could be actually substituted in place of such things as instance and class dictionaries). Tuples are no substitutes for read-only lists because they lack many useful "read-only" methods of lists (and won't grow them, as the BDFL has abundantly made clear, as he sees tuples as drastically different from lists). Neither, even more clearly, are e.g. tuples of pairs a good substitute for read-only dictionaries. I've played with adding more selective "locking" to dicts but I was unable to do it without a performance hit. If wholesale "RO-ness" can in fact *increase* performance in some cases, so much the better. "RO lists" could probably save a little memory compared to normal ones since they would need no "spare space" for growing. Alex

From martin@v.loewis.de Mon May 5 21:43:17 2003 From: martin@v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=) Date: 05 May 2003 22:43:17 +0200 Subject: [Python-Dev] Windows installer request... In-Reply-To: References: Message-ID: Tim Peters writes: > Are you saying that the "Select Destination Directory" dialog box doesn't > allow you to select your E: drive? Or just that you'd rather not need to > select the drive you want? I second the second; I noticed that Python installed on the "wrong drive" (i.e. the W9x installation) only after installation was complete. I don't know (and can't check at the moment) whether it offered to let me pick e:. It probably did, but I don't know for sure. > So, AFAIK, there isn't a straightforward way to get Wise 8.14 to suggest a > drive other than C:.
Perhaps it would work better for you if I removed the > Wizard-generated hardcoded "C:" (I don't know which drive Wise would pick > then), but since yours is the only complaint about this I've seen, and I > have no way to test such a change, I'm very reluctant to fiddle with it. I have the same complaint, and I'd happily test any updated installer. Regards, Martin

From zooko@zooko.com Mon May 5 21:58:21 2003 From: zooko@zooko.com (Zooko) Date: Mon, 05 May 2003 16:58:21 -0400 Subject: [Python-Dev] Dictionary sparseness In-Reply-To: Message from Alex Martelli of "Mon, 05 May 2003 22:42:40 +0200." <200305052242.40380.aleax@aleax.it> References: <000101c31339$791838c0$125ffea9@oemcomputer> <20030505201416.GB17384@barsoom.org> <200305052242.40380.aleax@aleax.it> Message-ID: Alex Martelli wrote: > > I'd love a read-only dictionary (AND a read-only list) for reasons having > little to do with optimization, actually [...] Me too! It would be very useful for secure Python -- I could pass my list to someone without risking that they mutate my list. Without RO-lists I have to make a copy of my list every time I want to show it to someone. Regards, Zooko http://zooko.com/

From tim.one@comcast.net Mon May 5 22:00:16 2003 From: tim.one@comcast.net (Tim Peters) Date: Mon, 05 May 2003 17:00:16 -0400 Subject: Where'd my memory go? (was Re: [Python-Dev] Dictionary sparseness) In-Reply-To: <16054.51335.255858.381526@montanaro.dyndns.org> Message-ID: [Skip Montanaro] [on assorted freelists] > These shortcomings could be remedied by suitable inspection > functions added to sys for debug builds. If someone cares enough, sure. > This leads me to wonder, has anyone measured the cost of deleting the > int and float free lists when pymalloc is enabled? I wonder how > unbearable it would be. Vladimir did when he was first developing pymalloc, and left the free lists in deliberately. I haven't tried it.
pymalloc is a bit faster since then, but will always have the additional overhead of needing to figure out *which* freelist to look in (pymalloc's free lists are segregated by block size), and, because it recycles empty pools among different block sizes too, the overhead on free of checking for pool emptiness. The int free list is faster in part because it's so damn Narcissistic <0.7 wink>. From skip@pobox.com Mon May 5 22:24:59 2003 From: skip@pobox.com (Skip Montanaro) Date: Mon, 5 May 2003 16:24:59 -0500 Subject: [Python-Dev] How to test this? In-Reply-To: References: <16054.30310.489999.134263@montanaro.dyndns.org> Message-ID: <16054.54955.226043.202262@montanaro.dyndns.org> Martin> Make sure you use -kb in the cvs add. Thanks, I'd forgotten about that. Martin> Apart from that, it would be fine by me - except that I recall Martin> that the file format is endianness-sensitive, so you should make Martin> sure that the test passes on machines of both endiannesses Martin> before adding the file. It appears the database itself accounts for the endianness of the file. I copied my test db file from my Mac to a Linux PC. struct.unpack("=l", f.read(4)) showed different values on the two systems (0x61561 vs 0x61150600) but bsddb185 on both systems could read the file. This is a very nice property of Berkeley DB in general. I copy db files from the spambayes project all the time. rsync(1) sure beats the heck out of dumping and reloading a 20+MB file all the time. Skip From martin@v.loewis.de Mon May 5 23:03:00 2003 From: martin@v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=) Date: 06 May 2003 00:03:00 +0200 Subject: [Python-Dev] How to test this? In-Reply-To: <16054.54955.226043.202262@montanaro.dyndns.org> References: <16054.30310.489999.134263@montanaro.dyndns.org> <16054.54955.226043.202262@montanaro.dyndns.org> Message-ID: Skip Montanaro writes: > It appears the database itself accounts for the endianness of the file. 
I > copied my test db file from my Mac to a Linux PC. struct.unpack("=l", > f.read(4)) showed different values on the two systems (0x61561 vs > 0x61150600) but bsddb185 on both systems could read the file. This is a > very nice property of Berkeley DB in general. That's good to hear. I thought I understood a report on the Subversion mailing list that you can't move databases across endianesses, but that might have been an unrelated issue. Regards, Martin From tim.one@comcast.net Tue May 6 01:33:39 2003 From: tim.one@comcast.net (Tim Peters) Date: Mon, 05 May 2003 20:33:39 -0400 Subject: [Python-Dev] Re: heaps In-Reply-To: <003f01c312ad$c7277580$125ffea9@oemcomputer> Message-ID: [Raymond Hettinger] > FWIW, there is C implementation of heapq at: > http://zhar.net/projects/python/ Cool! I thought the code was remarkably clear, until I realized it never checked for errors (e.g., PyList_Append() can run out of memory, and PyObject_RichCompareBool() can raise any exception). Those would have to be repaired, and doing so would slow it some. If the heapq module survives with the same API for a release or two, it would be a fine candidate to move into C, or maybe Pyrex (fiddly little integer arithmetic interspersed via if/then/else with trivial array indexing aren't Python's strong suits). From tim.one@comcast.net Tue May 6 03:35:28 2003 From: tim.one@comcast.net (Tim Peters) Date: Mon, 05 May 2003 22:35:28 -0400 Subject: [Python-Dev] Re: heaps In-Reply-To: <20414172.1052079989@[10.0.1.2]> Message-ID: [David Eppstein, on the bar-raising behavior of person > q[0] ] > Good point. If any permutation of the input sequence is equally likely, > and you're selecting the best k out of n items, the expected number of > times you have to hit the data structure in your heapq solution > is roughly k ln n, so the total expected time is O(n + k log k log n), > with a really small constant factor on the O(n) term. 
The sorting > solution I suggested has total time O(n log k), and even though sorting > is built-in and fast it can't compete when k is small. Random pivoting > is O(n + k), but with a larger constant factor, so your heapq solution > looks like a winner. In real Python Life, it's the fastest way I know (depending ...). > For fairness, it might be interesting to try another run of your test > in which the input sequence is sorted in increasing order rather > than random. Comparing the worst case of one against the best case of the other isn't my idea of fairness , but sure. On the best-1000 of a million floats test, and sorting the floats first, the heap method ran about 30x slower than on random data, and the sort method ran significantly faster than on random data (a factor of 1.3x faster). OTOH, if I undo my speed tricks and call a function in the sort method (instead of doing it all micro-optimized inline), that slows the sort method by a bit over a factor of 2. > I.e., replace the random generation of seq by > seq = range(M) > I'd try it myself, but I'm still running python 2.2 and haven't > installed heapq. I'd have to know more about your application to > have an idea whether the sorted or randomly-permuted case is more > representative. Of course -- so would I . Here's a surprise: I coded a variant of the quicksort-like partitioning method, at the bottom of this mail. On the largest-1000 of a million random-float case, times were remarkably steady across trials (i.e., using a different set of a million random floats each time): heapq 0.96 seconds sort (micro-optimized) 3.4 seconds KBest (below) 2.6 seconds The KBest code creates new lists with wild abandon. I expect it does better than the sort method anyway because it gets to exploit its own form of "raise the bar" behavior as more elements come in. 
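The heapq method being timed here is the usual "raise the bar" loop (essentially what later grew into heapq.nlargest); a minimal sketch of the pattern:

```python
import heapq

def k_largest(iterable, k):
    """Largest k items: keep a min-heap of the best k seen so far."""
    heap = []
    for x in iterable:
        if len(heap) < k:
            heapq.heappush(heap, x)
        elif x > heap[0]:          # the bar: smallest of the current best k
            heapq.heapreplace(heap, x)
    return sorted(heap, reverse=True)

assert k_largest(range(1000), 3) == [999, 998, 997]
```

Once the heap is full, most items fail the heap[0] comparison and never touch the heap at all -- and ascending sorted input, where every item raises the bar, is this method's worst case, matching the 30x slowdown reported above.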
For example, on the first run, len(buf) exceeded 3000 only 14 times, and the final pivot value each time is used by put() as an "ignore the input unless it's bigger than that" cutoff:

    pivoted w/ 0.247497558554
    pivoted w/ 0.611006884768
    pivoted w/ 0.633565558936
    pivoted w/ 0.80516673256
    pivoted w/ 0.814304890889
    pivoted w/ 0.884660572175
    pivoted w/ 0.89986744075
    pivoted w/ 0.946575251872
    pivoted w/ 0.980386533221
    pivoted w/ 0.983743795382
    pivoted w/ 0.992381911217
    pivoted w/ 0.994243625292
    pivoted w/ 0.99481443021
    pivoted w/ 0.997044443344

The already-sorted case is also a bad case for this method, because then the pivot is never big enough to trigger the early exit in put().

def split(seq, pivot):
    lt, eq, gt = [], [], []
    lta, eqa, gta = lt.append, eq.append, gt.append
    for x in seq:
        c = cmp(x, pivot)
        if c < 0:
            lta(x)
        elif c:
            gta(x)
        else:
            eqa(x)
    return lt, eq, gt

# KBest(k, minusinf) remembers the largest k objects
# from a sequence of objects passed one at a time to
# put().  minusinf must be smaller than any object
# passed to put().  After feeding in all the objects,
# call get() to retrieve a list of the k largest (or
# as many as were passed to put(), if put() was called
# fewer than k times).

class KBest(object):
    __slots__ = 'k', 'buflim', 'buf', 'cutoff'

    def __init__(self, k, minusinf):
        self.k = k
        self.buflim = 3*k
        self.buf = []
        self.cutoff = minusinf

    def put(self, obj):
        if obj <= self.cutoff:
            return
        buf = self.buf
        buf.append(obj)
        if len(buf) <= self.buflim:
            return
        # Reduce len(buf) by at least one, by retaining
        # at least k, and at most len(buf)-1, of the
        # largest objects in buf.
        from random import choice
        sofar = []
        k = self.k
        while len(sofar) < k:
            pivot = choice(buf)
            buf, eq, gt = split(buf, pivot)
            sofar.extend(gt)
            if len(sofar) < k:
                sofar.extend(eq[:k - len(sofar)])
        self.buf = sofar
        self.cutoff = pivot

    def get(self):
        from random import choice
        buf = self.buf
        k = self.k
        if len(buf) <= k:
            return buf
        # Retain only the k largest.
        sofar = []
        needed = k
        while needed:
            pivot = choice(buf)
            lt, eq, gt = split(buf, pivot)
            if len(gt) <= needed:
                sofar.extend(gt)
                needed -= len(gt)
                if needed:
                    takefromeq = min(len(eq), needed)
                    sofar.extend(eq[:takefromeq])
                    needed -= takefromeq
                    # If we still need more, they have to
                    # come out of things < pivot.
                    buf = lt
            else:
                # gt alone is too large.
                buf = gt
        assert len(sofar) == k
        self.buf = sofar
        return sofar

From BPettersen@NAREX.com Tue May 6 05:40:04 2003 From: BPettersen@NAREX.com (Bjorn Pettersen) Date: Mon, 5 May 2003 22:40:04 -0600 Subject: [Python-Dev] Windows installer request... Message-ID: <60FB8BB7F0EFC7409B75EEEC13E20192022DE23A@admin56.narex.com>

> From: Tim Peters [mailto:tim.one@comcast.net]
>
> [Bjorn Pettersen]
> > Would it be possible for the windows installer to use
> > $SYSTEMDRIVE$ as the default installation drive instead
> > of C:? [...]
> Are you saying that the "Select Destination Directory" dialog
> box doesn't allow you to select your E: drive? Or just
> that you'd rather not need to select the drive you want?

Most installers default to the system drive, so I didn't even look the first time. I am able to change it manually.

> > If it's considered a good idea, and someone can point me to
> > where the change has to be made, I'd be more than willing to
> > produce a patch...
>
> I apparently left this comment in the Wise script:
>
>     Note from Tim: doesn't seem to be a way to get the true
>     boot drive, the Wizard hardcodes "C".
>
> So, AFAIK, there isn't a straightforward way to get Wise 8.14
It should be as easy as (platforms that don't have %systemdrive% could only install to C:):

  item: Get Environment Variable
    Variable=OSDRIVE
    Environment=SystemDrive
    Default=C:
  end

However, you might have to do

  item: Get Registry Key Value
    Variable=OSDRIVE
    Key=System\CurrentControlSet\Control\Session Manager\Environment
    Value Name=SystemDrive
    Flags=00000100
    Default=C:
  end

(not sure about the Flags parameter) I couldn't find much documentation, and the example I'm looking at is a little "divided" about which it should use... I think it tries the first one, and falls back on the second(?) (http://ibinstall.defined.net/dl_scripts.htm, script_6016.zip/IBWin32Setup.wse). Also, it looks like you want to use %SYS32% to get to the windows system directory (on WinXP, it's c:\windows\system32, which doesn't seem to be listed anywhere...) I can't figure out how you're building the installer however. If you can point me in the right direction I can test it on my special WinXP, regular WinXP, Win98, Win2k, and maybe WinNT4 (I think we still have one around :-). -- bjorn

From eppstein@ics.uci.edu Tue May 6 07:00:24 2003 From: eppstein@ics.uci.edu (David Eppstein) Date: Mon, 05 May 2003 23:00:24 -0700 Subject: [Python-Dev] Re: heaps References: <20414172.1052079989@[10.0.1.2]> Message-ID: In article , Tim Peters wrote: > > For fairness, it might be interesting to try another run of your test > > in which the input sequence is sorted in increasing order rather > > than random. > > Comparing the worst case of one against the best case of the other isn't my > idea of fairness, but sure. Well, it doesn't seem any fairer to use random data to compare an algorithm with an average time bound that depends on an assumption of randomness in the data...anyway, the point was more to understand the limiting cases.
If one algorithm is usually 3x faster than the other, and is never more than 10x slower, that's better than being usually 3x faster but sometimes 1000x slower, for instance. > > I'd have to know more about your application to > > have an idea whether the sorted or randomly-permuted case is more > > representative. > > Of course -- so would I . My Java KBest code was written to make data subsets for a half-dozen web pages (same data selected according to different criteria). Of these six instances, one is presented the data in roughly ascending order, one in descending order, and the other four are less clear but probably not random. Robustness in the face of this sort of variation is why I prefer any average-case assumptions in my code's performance to depend only on randomness from a random number generator, and not arbitrariness in the actual input. But I'm not sure I'd usually be willing to pay a 3x penalty for that robustness. > Here's a surprise: I coded a variant of the quicksort-like partitioning > method, at the bottom of this mail. On the largest-1000 of a million > random-float case, times were remarkably steady across trials (i.e., using a > different set of a million random floats each time): > > heapq 0.96 seconds > sort (micro-optimized) 3.4 seconds > KBest (below) 2.6 seconds Huh. You're almost convincing me that asymptotic analysis works even in the presence of Python's compiled-vs-interpreted anomalies. The other surprise is that (unlike, say, the sort or heapq versions) your KBest doesn't look significantly more concise than my earlier Java implementation. -- David Eppstein http://www.ics.uci.edu/~eppstein/ Univ. of California, Irvine, School of Information & Computer Science From harri.pasanen@trema.com Tue May 6 09:55:27 2003 From: harri.pasanen@trema.com (Harri Pasanen) Date: Tue, 6 May 2003 10:55:27 +0200 Subject: Where'd my memory go? 
(was Re: [Python-Dev] Dictionary sparseness) In-Reply-To: <16054.48379.533379.672799@montanaro.dyndns.org> References: <001301c31290$fcea25e0$125ffea9@oemcomputer> <200305051957.25403.aleax@aleax.it> <16054.48379.533379.672799@montanaro.dyndns.org> Message-ID: <200305061055.27898.harri.pasanen@trema.com> Speaking of memory consumption, has the memory footprint of Python changed significantly from 2.2 to 2.3? I've been toying with the idea of making a small python ever since I compiled Python 1.0 for an MS-DOS box with 512Kb of memory. I've scanned the Palm Python stuff, but I did not have a clear picture of whether they really did everything possible to make it small, including changing the representation of internal structs, or did they just chop away the complex type, parser, compiler, etc? Regards, Harri From mal@lemburg.com Tue May 6 11:03:15 2003 From: mal@lemburg.com (M.-A. Lemburg) Date: Tue, 06 May 2003 12:03:15 +0200 Subject: Where'd my memory go? (was Re: [Python-Dev] Dictionary sparseness) In-Reply-To: References: Message-ID: <3EB78863.5070105@lemburg.com> Tim Peters wrote: > [Skip Montanaro] > > [on assorted freelists] > >>These shortcomings could be remedied by suitable inspection >>functions added to sys for debug builds. > > If someone cares enough, sure. > >>This leads me to wonder, has anyone measured the cost of deleting the >>int and float free lists when pymalloc is enabled? I wonder how >>unbearable it would be. > > Vladimir did when he was first developing pymalloc, and left the free lists > in deliberately. I haven't tried it. pymalloc is a bit faster since then, > but will always have the additional overhead of needing to figure out > *which* freelist to look in (pymalloc's free lists are segregated by block > size), and, because it recycles empty pools among different block sizes too, > the overhead on free of checking for pool emptiness. The int free list is > faster in part because it's so damn Narcissistic <0.7 wink>.
If someone really cares, I suppose that the garbage collector could do an occasional scan of the int free list and chop off the tail after a certain number of entries. FWIW, Unicode free lists have a cap to limit the number of entries in the list to 1024. -- Marc-Andre Lemburg eGenix.com Professional Python Software directly from the Source (#1, May 06 2003) >>> Python/Zope Products & Consulting ... http://www.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ EuroPython 2003, Charleroi, Belgium: 49 days left From guido@python.org Tue May 6 13:07:54 2003 From: guido@python.org (Guido van Rossum) Date: Tue, 06 May 2003 08:07:54 -0400 Subject: [Python-Dev] Startup time Message-ID: <200305061207.h46C7sj26462@pcp02138704pcs.reston01.va.comcast.net> While Python's general speed has gone up, its startup speed has slowed down! I timed this two different ways. The first way is to run python -c "import time; print time.clock()" On Unix, this prints the CPU time used since the process was created. The second way is to run time python -c pass which shows CPU and real time to complete running the process. I did this on a 633 MHz PC running Red Hat Linux 7.3. The Python builds were standard non-debug builds. I tried with and without the -S option, which is supposed to suppress loading of site.py and hence most startup overhead; it didn't exist in Python 1.3 and 1.4. Results for the first way are pretty inaccurate because it's such a small number and is only measured in 1/100 of a second, yet revealing.
Some times are printed as two values; I didn't do enough runs to compute a careful average, so I'm just showing the range:

Version   CPU Time    CPU Time with -S
1.3       0.00        N/A
1.4       0.00        N/A
1.5.2     0.01        0.00
2.0       0.01-0.02   0.00
2.1       0.01-0.02   0.00
2.2       0.02        0.00
2.3       0.04        0.03-0.04

Now using time:

Version   CPU Time    CPU Time with -S
1.3       0.004       N/A
1.4       0.004       N/A
1.5       0.018       0.006
2.0       0.021       0.006
2.1       0.018       0.004
2.2       0.025       0.004
2.3       0.045       0.045

Note two things: (a) the start time goes up over time, and (b) for Python 2.3, -S doesn't make any difference. Given that we often run very short Python programs, and e.g. Python's popularity for CGI scripts, I find this increase in startup time very worrisome, and worthy of our attention (more than gaining nanoseconds on dict operations or even socket I/O speed). My goal: I'd like Python 2.3(final) to start up at least as fast as Python 2.2, and I'd like the much faster startup time back with -S. I have no time to investigate the cause right now, although I have a suspicion that the problem might be in loading too much of the encoding framework at start time (I recall Marc-Andre and Martin debating this earlier). --Guido van Rossum (home page: http://www.python.org/~guido/)

From mcherm@mcherm.com Tue May 6 13:32:03 2003
From: mcherm@mcherm.com (Michael Chermside)
Date: Tue, 6 May 2003 05:32:03 -0700
Subject: [Python-Dev] Re: heaps
Message-ID: <1052224323.3eb7ab43530a5@mcherm.com>

Zooko writes:
> Shouldn't heapq be a subclass of list? [...]
> One thing I don't know how to implement is:
>
> # This changes mylist itself into a heapq -- it doesn't make a copy of mylist!
> makeheapq(mylist)
>
> Perhaps this is a limitation of the current object model? Or is there a way
> to change an object's type at runtime.

To change an object's CLASS, sure, but its TYPE -- seems impossible to me on the face of it since a different type may have a different C layout.
Now in THIS case there's no need for a different C layout, so perhaps there's some weird trick I don't know, but I wouldn't think so. As to your FIRST point though... the choice seems to be between making heapq a subclass of list or a module for operating on a list. You argue that the syntax will be cleaner, but comparing your examples:

> heap = []
> heappush(heap, item)
> item = heappop(heap)
> item = heap[0]

> heap = heapq()
> heap.push(item)
> item = heap.pop()
> item = heap[0]

I honestly see little meaningful difference. Since (as per earlier discussion) heapq is NOT intended to be an abstract heap data type, I tend to prefer the simpler solution (using a list instead of subclassing). -- Michael Chermside

From tim.one@comcast.net Tue May 6 16:47:46 2003
From: tim.one@comcast.net (Tim Peters)
Date: Tue, 06 May 2003 11:47:46 -0400
Subject: Where'd my memory go? (was Re: [Python-Dev] Dictionary sparseness)
In-Reply-To: <3EB78863.5070105@lemburg.com>
Message-ID:

[M.-A. Lemburg]
> If someone really cares, I suppose that the garbage collector could
> do an occasional scan of the int free list and chop off the tail
> after a certain number of entries.

Int objects aren't allocated individually; malloc() is used to get single "int blocks", which contain room for about 1000 ints at a time, and these blocks are carved up internally by intobject.c. So it isn't possible to reclaim the space for a single int, and "tail" doesn't mean anything useful in this context.

> FWIW, Unicode free lists have a cap to limit the number of entries
> in the list to 1024.

The Unicode freelist is more like the frameobject freelist that way (it is possible to reclaim the space for an individual Unicode string or frame object).
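For readers following the heapq subthread: the list-subclass alternative Michael compares above can be sketched in a few lines. The class name `Heap` and its methods are hypothetical (they are not part of the actual heapq module), and since the heapq functions rearrange their list argument in place, Zooko's "change mylist itself into a heap" falls out of heapify for free:

```python
import heapq

class Heap(list):
    """Hypothetical list subclass that maintains the heap invariant (a sketch)."""

    def __init__(self, iterable=()):
        list.__init__(self, iterable)
        heapq.heapify(self)  # O(n); rearranges this very list, no copy made

    def push(self, item):
        heapq.heappush(self, item)

    def pop(self):
        # Smallest item first, same as heapq.heappop on a plain list.
        return heapq.heappop(self)
```

Because the heapq functions accept any mutable sequence, such an object can still be passed to heappush/heappop directly; the subclass buys only the dotted syntax, which is Michael's point about the two spellings being nearly equivalent.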
From skip@pobox.com Tue May 6 16:55:25 2003 From: skip@pobox.com (Skip Montanaro) Date: Tue, 6 May 2003 10:55:25 -0500 Subject: [Python-Dev] testing with and without pyc files present Message-ID: <16055.56045.277686.400944@montanaro.dyndns.org> The test targets in the Makefile first delete any .py[co] files, then run the test suite twice. I know there must be a reason for this, but isn't there a less sledgehammer-like and more explicit way to test whatever this is trying to test? Skip From guido@python.org Tue May 6 17:04:20 2003 From: guido@python.org (Guido van Rossum) Date: Tue, 06 May 2003 12:04:20 -0400 Subject: [Python-Dev] testing with and without pyc files present In-Reply-To: Your message of "Tue, 06 May 2003 10:55:25 CDT." <16055.56045.277686.400944@montanaro.dyndns.org> References: <16055.56045.277686.400944@montanaro.dyndns.org> Message-ID: <200305061604.h46G4KR25972@odiug.zope.com> > The test targets in the Makefile first delete any .py[co] files, then run > the test suite twice. I know there must be a reason for this, but isn't > there a less sledgehammer-like and more explicit way to test whatever this > is trying to test? In the past, we've had problems where bugs in the marshalling or elsewhere caused bytecode read from .pyc files to behave differently than bytecode generated directly from a .py source file. Sometimes the bytecode read from a .pyc file had the bug, sometimes the directly generated bytecode. This is sometimes a very shy bug needing a lot of sample data. How else would you propose to test this? --Guido van Rossum (home page: http://www.python.org/~guido/) From mal@lemburg.com Tue May 6 17:12:32 2003 From: mal@lemburg.com (M.-A. Lemburg) Date: Tue, 06 May 2003 18:12:32 +0200 Subject: Where'd my memory go? (was Re: [Python-Dev] Dictionary sparseness) In-Reply-To: References: Message-ID: <3EB7DEF0.4020105@lemburg.com> Tim Peters wrote: > [M.-A.
Lemburg] > >>If someone really cares, I suppose that the garbage collector could >>do an occasional scan of the int free list and chop off the tail >>after a certain number of entries. > > Int objects aren't allocated individually; malloc() is used to get single > "int blocks", which contain room for about 1000 ints at a time, and these > blocks are carved up internally by intobject.c. So it isn't possible to > reclaim the space for a single int, so "tail" doesn't mean anything useful > in this context. Hmm, looking at the code it seems that the different blocks are not referencing each other. Wouldn't it be possible to link them together as a list of blocks? This list could then be used for the review operation. >>FWIW, Unicode free lists have a cap to limit the number of entries >>in the list to 1024. > > The Unicode freelist is more like the frameobject freelist that way (it is > possible to reclaim the space for an individual Unicode string or frame > object). Probably :-) Would using the block technique from the int implementation make a difference for the frame objects? I would guess that a typical Python program rarely has more than 100 frames alive at any one time. These could be placed into such a block to make setting them up faster, possibly making Python function calls a tad snappier. -- Marc-Andre Lemburg eGenix.com Professional Python Software directly from the Source (#1, May 06 2003) >>> Python/Zope Products & Consulting ... http://www.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ...
http://python.egenix.com/ ________________________________________________________________________ EuroPython 2003, Charleroi, Belgium: 49 days left From skip@pobox.com Tue May 6 17:16:45 2003 From: skip@pobox.com (Skip Montanaro) Date: Tue, 6 May 2003 11:16:45 -0500 Subject: [Python-Dev] testing with and without pyc files present In-Reply-To: <200305061604.h46G4KR25972@odiug.zope.com> References: <16055.56045.277686.400944@montanaro.dyndns.org> <200305061604.h46G4KR25972@odiug.zope.com> Message-ID: <16055.57325.88910.417060@montanaro.dyndns.org> Guido> Sometimes the bytecode read from a .pyc file had the bug, Guido> sometimes the directly generated bytecode. This is sometimes a Guido> very shy bug needing a lot of sample data. How else would you Guido> propose to test this? I have no idea, but the reason for the two test runs should probably be documented somewhere. I just embellished the comment in Makefile.pre.in which precedes the test targets. Skip From skip@pobox.com Tue May 6 17:20:34 2003 From: skip@pobox.com (Skip Montanaro) Date: Tue, 6 May 2003 11:20:34 -0500 Subject: [Python-Dev] pyconfig.h not regenerated by "config.status --recheck" Message-ID: <16055.57554.364845.689049@montanaro.dyndns.org> I decided to investigate why the resource module wasn't getting built on my Mac today. A quick check showed that build.opt/pyconfig.h didn't include this stanza: /* Define if you have the 'getpagesize' function. */ #define HAVE_GETPAGESIZE 1 although pyconfig.h.in contained this stanza: /* Define if you have the 'getpagesize' function. */ #undef HAVE_GETPAGESIZE The date on pyconfig.h.in was May 5. The date on build.opt/pyconfig.h was Feb 27. Executing ./config.status --recheck in my build.opt tree doesn't regenerate pyconfig.h. I then tried executing ../configure --prefix=/Users/skip/local This generated pyconfig.h. It would thus appear that config.status shouldn't be used by developers.
Apparently one of the other flags it appends to the generated configure command suppresses generation of pyconfig.h (and maybe other files). Skip

From jepler@unpythonic.net Tue May 6 17:21:27 2003
From: jepler@unpythonic.net (Jeff Epler)
Date: Tue, 6 May 2003 11:21:27 -0500
Subject: [Python-Dev] Startup time
In-Reply-To: <200305061207.h46C7sj26462@pcp02138704pcs.reston01.va.comcast.net>
References: <200305061207.h46C7sj26462@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <20030506162127.GC12791@unpythonic.net>

Comparing 2.2 and 2.3, there are a lot of files opened in 2.3 that aren't in 2.2. The comparison is not fully valid because I'm running 2.3 from the compilation directory, while 2.2 is being run from /usr/bin. Results:

# Number of attempts to open a file
# Python-2.3b1 compiled with no special flags
$ strace -e open ./python -S -c pass 2>&1 | wc -l
249
# RedHat 9's /usr/bin/python (based on 2.2.2)
$ strace -e open python -S -c pass 2>&1 | wc -l
9

# Number of attempts to open an existing file
$ strace -e open python -S -c pass 2>&1 | grep -v ENOENT | wc -l
8
$ strace -e open ./python -S -c pass 2>&1 | grep -v ENOENT | wc -l
46

The modules imported in 2.3 are: warnings re sre sre_compile sre_constants sre_parse string copy_reg types linecache os posixpath stat UserDict codecs encodings.__init__ encodings.utf_8

I'm crossing my fingers that the time to reload(m) is similar to the time to import it in the first place, which gives these maybe-helpful stats:

$ for i in warnings re sre sre_compile sre_constants sre_parse string copy_reg types linecache os posixpath stat UserDict codecs encodings.__init__ encodings.utf_8; do echo -n "reload of module $i: "; ./python Lib/timeit.py -s "import $i" "reload($i)"; done
reload of module warnings: 1000 loops, best of 3: 495 usec per loop
reload of module re: 10000 loops, best of 3: 80.3 usec per loop
reload of module sre: 1000 loops, best of 3: 575 usec per loop
reload of module sre_compile: 1000 loops, best of 3: 503 usec per loop
reload of module sre_constants: 1000 loops, best of 3: 380 usec per loop
reload of module sre_parse: 1000 loops, best of 3: 701 usec per loop
reload of module string: 1000 loops, best of 3: 465 usec per loop
reload of module copy_reg: 10000 loops, best of 3: 200 usec per loop
reload of module types: 10000 loops, best of 3: 180 usec per loop
reload of module linecache: 10000 loops, best of 3: 156 usec per loop
reload of module os: 1000 loops, best of 3: 1.53e+03 usec per loop
reload of module posixpath: 1000 loops, best of 3: 403 usec per loop
reload of module stat: 10000 loops, best of 3: 157 usec per loop
reload of module UserDict: 1000 loops, best of 3: 454 usec per loop
reload of module codecs: 1000 loops, best of 3: 852 usec per loop
reload of module encodings.__init__: 1000 loops, best of 3: 244 usec per loop
reload of module encodings.utf_8: 10000 loops, best of 3: 132 usec per loop

These times seem pretty low, but maybe they're accurate. "os" is the worst of the lot (1530us) and the total comes to 7507us (7.5ms). On my system [2.4GHz Pentium4], this is a typical output of 'time' on python:

$ time ./python -S -c pass
real    0m0.249s
user    0m0.020s
sys     0m0.000s

$ time python -S -c pass
real    0m0.043s
user    0m0.010s
sys     0m0.000s

so the time to import these 17 modules does account for 3/4 of the additional user time between 2.2.2 and 2.3. (Do you care about the 200ms increase in "real" time, or just the user time?) I tried compiling 2.3 with profiling, but gprof sees no samples ("Each sample counts as 0.01 seconds. no time accumulated"). I don't have the capability to try oprofile right now either. Jeff

From tim.one@comcast.net Tue May 06 17:30:14 2003
From: tim.one@comcast.net (Tim Peters)
Date: Tue, 06 May 2003 12:30:14 -0400
Subject: Where'd my memory go? (was Re: [Python-Dev] Dictionary sparseness)
In-Reply-To: <3EB7DEF0.4020105@lemburg.com>
Message-ID:

[M.-A.
Lemburg] > Hmm, looking at the code it seems that the different blocks > are not referencing each other. Wouldn't it be possible to link > them together as a list of blocks? This list could then be used > for the review operation. The blocks are linked together; that's what the _intblock.next pointer does. See PyInt_Fini(). > Would using the block technique from the int implementation > make a difference for the frame objects? I would guess that a > typical Python program rarely has more than 100 frames alive > at any one time. These could be placed into such a block to > make setting them up faster, possibly making Python function > calls a tad snappier. frame objects have variable size; int objects have fixed size; variable size objects don't play nice with fixed block sizes. Note that the frame allocation code already tries to reuse whatever initialization it can left over from the frame object it (normally) pulls off the frame free list. From info@nyc-search.com Tue May 6 17:57:56 2003 From: info@nyc-search.com (NYC-SEARCH) Date: Tue, 6 May 2003 12:57:56 -0400 Subject: [Python-Dev] Python Technical Lead, New York, NY - 80-85k Message-ID: <01fd01c313f0$a41abb80$e0bfef18@earthlink.net> Python Technical Lead, New York, NY - 80-85k - IMMEDIATE HIRE http://www.nyc-search.com/jobs/python.html From martin@v.loewis.de Tue May 6 18:35:40 2003 From: martin@v.loewis.de (Martin v. Löwis) Date: 06 May 2003 19:35:40 +0200 Subject: [Python-Dev] Startup time In-Reply-To: <200305061207.h46C7sj26462@pcp02138704pcs.reston01.va.comcast.net> References: <200305061207.h46C7sj26462@pcp02138704pcs.reston01.va.comcast.net> Message-ID: Guido van Rossum writes: > While Python's general speed has gone up, its startup speed has slowed > down! Hear hear!
I always thought you didn't care about startup time at all :-) > I have no time to investigate the cause right now, although I have a > suspicion that the problem might be in loading too much of the > encoding framework at start time (I recall Marc-Andre and Martin > debating this earlier). That would be easy to determine: Just disable the block #if defined(Py_USING_UNICODE) && defined(HAVE_LANGINFO_H) && defined(CODESET) in pythonrun.c, and see whether it changes anything. To my knowledge, this is the only cause of loading encodings during startup on Unix. Regards, Martin From martin@v.loewis.de Tue May 6 18:37:46 2003 From: martin@v.loewis.de (Martin v. Löwis) Date: 06 May 2003 19:37:46 +0200 Subject: [Python-Dev] Startup time In-Reply-To: <20030506162127.GC12791@unpythonic.net> References: <200305061207.h46C7sj26462@pcp02138704pcs.reston01.va.comcast.net> <20030506162127.GC12791@unpythonic.net> Message-ID: Jeff Epler writes: > Comparing 2.2 and 2.3, there are a lot of files opened in 2.3 that > aren't in 2.2. Very interesting. Could you also try to find out the difference in terms of stat calls? > I'm crossing my fingers that the time to reload(m) is similar to the > time to import it in the first place, which gives these maybe-helpful > stats: That is, unfortunately, not the case: reloading a dynamic module is a no-op. Regards, Martin From martin@v.loewis.de Tue May 6 18:39:21 2003 From: martin@v.loewis.de (Martin v. Löwis) Date: 06 May 2003 19:39:21 +0200 Subject: [Python-Dev] pyconfig.h not regenerated by "config.status --recheck" In-Reply-To: <16055.57554.364845.689049@montanaro.dyndns.org> References: <16055.57554.364845.689049@montanaro.dyndns.org> Message-ID: Skip Montanaro writes: > This generated pyconfig.h. It would thus appear that config.status > shouldn't be used by developers.
Apparently one of the other flags it > appends to the generated configure command suppresses generation of > pyconfig.h (and maybe other files). Can you find out whether this is related to the fact that you are building in a separate build directory? Regards, Martin From aleax@aleax.it Tue May 6 18:49:52 2003 From: aleax@aleax.it (Alex Martelli) Date: Tue, 6 May 2003 19:49:52 +0200 Subject: [Python-Dev] Startup time In-Reply-To: <20030506162127.GC12791@unpythonic.net> References: <200305061207.h46C7sj26462@pcp02138704pcs.reston01.va.comcast.net> <20030506162127.GC12791@unpythonic.net> Message-ID: <200305061949.52953.aleax@aleax.it> On Tuesday 06 May 2003 06:21 pm, Jeff Epler wrote: ... > # Number of attempts to open an existing file > $ strace -e open python -S -c pass 2>&1 | grep -v ENOENT | wc -l > 8 > $ strace -e open ./python -S -c pass 2>&1 | grep -v ENOENT | wc -l > 46 Yes, same here (2.2.2 and 2.3 from CVS both built locally with Mdk 9.0). Besides the .py and .pyc for all the modules, there's a few more files that 2.3 is opening and 2.2 isn't: early on: open("/usr/lib/libstdc++.so.5", O_RDONLY) = 3 open("/lib/libgcc_s.so.1", O_RDONLY) = 3 in the midst of the imports (just before encodings/__init__.py): open("/usr/share/locale/locale.alias", O_RDONLY) = 3 open("/usr/share/locale/en_US/LC_CTYPE", O_RDONLY) = 3 Alex From aleax@aleax.it Tue May 6 19:20:42 2003 From: aleax@aleax.it (Alex Martelli) Date: Tue, 6 May 2003 20:20:42 +0200 Subject: [Python-Dev] Startup time In-Reply-To: References: <200305061207.h46C7sj26462@pcp02138704pcs.reston01.va.comcast.net> <20030506162127.GC12791@unpythonic.net> Message-ID: <200305062020.42734.aleax@aleax.it> On Tuesday 06 May 2003 07:37 pm, Martin v. Löwis wrote: > Jeff Epler writes: > > Comparing 2.2 and 2.3, there are a lot of files opened in 2.3 that > > aren't in 2.2. > > Very interesting. Could you also try to find out the difference in > terms of stat calls? 
In general:

[alex@lancelot blm]$ strace -e stat64 python2.2 -S -c pass 2>&1 | wc -l
18
[alex@lancelot blm]$ strace -e stat64 python2.3 -S -c pass 2>&1 | wc -l
71
[alex@lancelot blm]$ strace -e fstat64 python2.2 -S -c pass 2>&1 | wc -l
8
[alex@lancelot blm]$ strace -e fstat64 python2.3 -S -c pass 2>&1 | wc -l
71
[alex@lancelot blm]$

Of the stat64 calls, the found-files only:

[alex@lancelot blm]$ strace -e stat64 python2.2 -S -c pass 2>&1 | grep -v ENOENT | wc -l
4
[alex@lancelot blm]$ strace -e stat64 python2.3 -S -c pass 2>&1 | grep -v ENOENT | wc -l
12

Alex

From guido@python.org Tue May 6 19:26:07 2003
From: guido@python.org (Guido van Rossum)
Date: Tue, 06 May 2003 14:26:07 -0400
Subject: [Python-Dev] MS VC 7 offer
Message-ID: <200305061826.h46IQ7605750@odiug.zope.com>

A month ago at Python UK in Oxford (which was colocated with C and C++ standardization meetings as well as a general C and C++ users conference) I met with some folks from Microsoft's VC development team, including the project lead, Nick Hodapp. I told Nick that Python for Windows was still built using VC 6. He pointed out that the actual compilers (not the GUI) from VC 7 are freely downloadable. More recently, Nick sent me an email offering to donate copies of VC 7 to the "key developers". I count Tim, myself and Mark Hammond among the key developers. Is there anyone else who would count themselves among those? I presume he's offering the pro version, which has a real optimizer, unlike the "standard" version that was kindly donated by Bjorn Pettersen. I can see advantages and disadvantages of moving to VC 7; I'm sure the VC 7 compiler is more standard-compliant and generates faster code, but a disadvantage is that you can't apparently link binaries built with VC 6 to a program built with VC 7, meaning that 3rd party extensions will have to be recompiled with VC 7 as well. I have no idea how many projects this will affect (don't worry about Zope Corp :-).
Maybe we should try to include those 3rd party developers in the deal. (I think Robin Dunn would be affected, wxPython has a Windows distribution.) If you think this is a bad idea or if you would like to qualify for a compiler donation, please follow up! --Guido van Rossum (home page: http://www.python.org/~guido/) From martin@v.loewis.de Tue May 6 19:27:53 2003 From: martin@v.loewis.de (Martin v. Löwis) Date: 06 May 2003 20:27:53 +0200 Subject: [Python-Dev] Startup time In-Reply-To: <200305061949.52953.aleax@aleax.it> References: <200305061207.h46C7sj26462@pcp02138704pcs.reston01.va.comcast.net> <20030506162127.GC12791@unpythonic.net> <200305061949.52953.aleax@aleax.it> Message-ID: Alex Martelli writes: > in the midst of the imports (just before encodings/__init__.py): > open("/usr/share/locale/locale.alias", O_RDONLY) = 3 > open("/usr/share/locale/en_US/LC_CTYPE", O_RDONLY) = 3 That is the effect of nl_langinfo(CODESET). Regards, Martin From jepler@unpythonic.net Tue May 6 19:36:00 2003 From: jepler@unpythonic.net (Jeff Epler) Date: Tue, 6 May 2003 13:36:00 -0500 Subject: [Python-Dev] Startup time In-Reply-To: References: <200305061207.h46C7sj26462@pcp02138704pcs.reston01.va.comcast.net> <20030506162127.GC12791@unpythonic.net> Message-ID: <20030506183600.GA27125@unpythonic.net> On Tue, May 06, 2003 at 07:37:46PM +0200, Martin v. Löwis wrote: > Jeff Epler writes: > > > Comparing 2.2 and 2.3, there are a lot of files opened in 2.3 that > > aren't in 2.2. > > Very interesting. Could you also try to find out the difference in > terms of stat calls? # redhat's 9 2.2.2 $ strace -e stat64 python -S -c pass 2>&1 | wc -l 11 # python.org's 2.3b1 $ strace -e stat64 ./python -S -c pass 2>&1 | wc -l 72 By the way, I was able to account for the wall-time difference I saw due to the fact that my PYTHONPATH contains some directories on NFS, and so the attempted open()s and stat()s of standard modules did take measurable wall time.
With no PYTHONPATH variable set, these are the startup timings I see:

# 2.2.2
real    0m0.005s
user    0m0.000s
sys     0m0.000s

# 2.3b2
real    0m0.044s
user    0m0.020s
sys     0m0.020s

By the way, I wouldn't be too excited about trusting this Python --

./python -c "import random"
Illegal instruction

I wonder what's gone wrong...

(gdb) run -c "import random"
Starting program: /usr/src/Python-2.3b1/python -c "import random"
[New Thread 1074963072 (LWP 28408)]

Program received signal SIGILL, Illegal instruction.
[Switching to Thread 1074963072 (LWP 28408)]
0x08109aa0 in subtype_getsets_full ()
(gdb) where
#0  0x08109aa0 in subtype_getsets_full ()
#1  0x4001c743 in random_new (type=0x4001c738, args=0x4012c02c, kwds=0x0) at /usr/src/Python-2.3b1/Modules/_randommodule.c:439
(gdb) ptype subtype_getsets_full
type = struct PyGetSetDef { [...]

I'm recompiling now to see if it was just a bogon strike.. surely somebody else has tested on redhat9! nope, recompiled and I still have the problem. and I can't get the debugger to stop at the top of random_new either. jeff

From neal@metaslash.com Tue May 6 19:37:21 2003
From: neal@metaslash.com (Neal Norwitz)
Date: Tue, 06 May 2003 14:37:21 -0400
Subject: [Python-Dev] Startup time
In-Reply-To: <200305062020.42734.aleax@aleax.it>
References: <200305061207.h46C7sj26462@pcp02138704pcs.reston01.va.comcast.net> <20030506162127.GC12791@unpythonic.net> <200305062020.42734.aleax@aleax.it>
Message-ID: <20030506183721.GC1340@epoch.metaslash.com>

On Tue, May 06, 2003 at 08:20:42PM +0200, Alex Martelli wrote:
> On Tuesday 06 May 2003 07:37 pm, Martin v. Löwis wrote:
> > Jeff Epler writes:
> > > Comparing 2.2 and 2.3, there are a lot of files opened in 2.3 that
> > > aren't in 2.2.
>
> [alex@lancelot blm]$ strace -e stat64 python2.2 -S -c pass 2>&1 | wc -l
> 18
> [alex@lancelot blm]$ strace -e stat64 python2.3 -S -c pass 2>&1 | wc -l
> 71

I think many of the extra stat/open calls are due to zipimports.
I don't have python23.zip, but it's still looking for a bunch of extra files that can't exist (in python23.zip). Perhaps if the zip file doesn't exist, we can short circuit the remaining calls to open()?

stat64("/home/neal/local/lib/python23.zip/warnings", 0xbfffebc0) = -1 ENOENT (No such file or directory)
open("/home/neal/local/lib/python23.zip/warnings.so", O_RDONLY|O_LARGEFILE) = -1 ENOENT (No such file or directory)
open("/home/neal/local/lib/python23.zip/warningsmodule.so", O_RDONLY|O_LARGEFILE) = -1 ENOENT (No such file or directory)
open("/home/neal/local/lib/python23.zip/warnings.py", O_RDONLY|O_LARGEFILE) = -1 ENOENT (No such file or directory)
open("/home/neal/local/lib/python23.zip/warnings.pyc", O_RDONLY|O_LARGEFILE) = -1 ENOENT (No such file or directory)

Neal

From nas@python.ca Tue May 6 19:41:07 2003
From: nas@python.ca (Neil Schemenauer)
Date: Tue, 6 May 2003 11:41:07 -0700
Subject: [Python-Dev] MS VC 7 offer
In-Reply-To: <200305061826.h46IQ7605750@odiug.zope.com>
References: <200305061826.h46IQ7605750@odiug.zope.com>
Message-ID: <20030506184107.GA21470@glacier.arctrix.com>

Guido van Rossum wrote:
> I can see advantages and disadvantages of moving to VC 7; I'm sure the
> VC 7 compiler is more standard-compliant and generates faster code,
> but a disadvantage is that you can't apparently link binaries built
> with VC 6 to a program built with VC 7, meaning that 3rd party
> extensions will have to be recompiled with VC 7 as well.

Can distutils use (or be made to use) the free command line VC 7 tools? Also, does this affect whether extensions can be compiled by Mingw? It would be nice if people could continue building extensions on Windows using free tools. Neil

From guido@python.org Tue May 6 19:45:50 2003
From: guido@python.org (Guido van Rossum)
Date: Tue, 06 May 2003 14:45:50 -0400
Subject: [Python-Dev] MS VC 7 offer
In-Reply-To: Your message of "Tue, 06 May 2003 11:41:07 PDT."
<20030506184107.GA21470@glacier.arctrix.com> References: <200305061826.h46IQ7605750@odiug.zope.com> <20030506184107.GA21470@glacier.arctrix.com> Message-ID: <200305061845.h46Ijo106044@odiug.zope.com> > > I can see advantages and disadvantages of moving to VC 7; I'm sure the > > VC 7 compiler is more standard-compliant and generates faster code, > > but a disadvantage is that you can't apparently link binaries built > > with VC 6 to a program built with VC 7, meaning that 3rd party > > extensions will have to be recompiled with VC 7 as well. > > Can distutils use (or be made to use) the free command line VC 7 tools? That would be a project, but his implication was that the compilers are usable as command line tools, so I'm confident it can be done. > Also, does this affect whether extensions can be compiled by Mingw? > It would be nice if people could continue building extensions on > Windows using free tools. I know nothing about Mingw. Anyone who does please speak up if this would affect them or not. --Guido van Rossum (home page: http://www.python.org/~guido/) From phil@riverbankcomputing.co.uk Tue May 6 19:48:03 2003 From: phil@riverbankcomputing.co.uk (Phil Thompson) Date: Tue, 6 May 2003 19:48:03 +0100 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <200305061826.h46IQ7605750@odiug.zope.com> References: <200305061826.h46IQ7605750@odiug.zope.com> Message-ID: <200305061948.03757.phil@riverbankcomputing.co.uk> On Tuesday 06 May 2003 7:26 pm, Guido van Rossum wrote: > A month ago at Python UK in Oxford (which was colocated with C and C++ > standardization meetings as well as a general C and C++ users > conference) I met with some folks from Microsoft's VC development > team, including the project lead, Nick Hodapp. I told Nick that > Python for Windows was still built using VC 6. He pointed out that > the actual compilers (not the GUI) from VC 7 are freely downloadable.
> > More recently, Nick sent me an email offering to donate copies of VC 7 > to the "key developers". I count Tim, myself and Mark Hammond among > the key developers. Is there anyone else who would count themselves > among those? > > I presume he's offering the pro version, which has a real optimizer, > unlike the "standard" version that was kindly donated by Bjorn > Pettersen. > > I can see advantages and disadvantages of moving to VC 7; I'm sure the > VC 7 compiler is more standard-compliant and generates faster code, > but a disadvantage is that you can't apparently link binaries built > with VC 6 to a program built with VC 7, meaning that 3rd party > extensions will have to be recompiled with VC 7 as well. I have no > idea how many projects this will affect (don't worry about Zope Corp > > :-). Maybe we should try to include those 3rd party developers in the > > deal. (I think Robin Dunn would be affected, wxPython has a Windows > distribution.) > > If you think this is a bad idea or if you would like to qualify for a > compiler donation, please follow up! How do we get hold of the free VC 7 compilers? Phil From theller@python.net Tue May 6 19:48:12 2003 From: theller@python.net (Thomas Heller) Date: 06 May 2003 20:48:12 +0200 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <20030506184107.GA21470@glacier.arctrix.com> References: <200305061826.h46IQ7605750@odiug.zope.com> <20030506184107.GA21470@glacier.arctrix.com> Message-ID: Neil Schemenauer writes: > Guido van Rossum wrote: > > I can see advantages and disadvantages of moving to VC 7; I'm sure the > > VC 7 compiler is more standard-compliant and generates faster code, > > but a disadvantage is that you can't apparently link binaries built > > with VC 6 to a program built with VC 7, meaning that 3rd party > > extensions will have to be recompiled with VC 7 as well. > > Can distutils use (or be made to use) the free command line VC 7 tools? 
The only problem distutils has is to find the compiler and the environment it needs. Currently it relies on (undocumented) registry entries (for VC6), and there's a patch somewhere on SF for the registry entries for VC7. I like the idea of using VC7 (as much as I dislike the VC7 gui itself). 'Professional' windows developers have VC7 anyway, it's included in MSDN professional. Thomas From logistix@cathoderaymission.net Tue May 6 19:52:32 2003 From: logistix@cathoderaymission.net (logistix) Date: Tue, 6 May 2003 13:52:32 -0500 (CDT) Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <200305061826.h46IQ7605750@odiug.zope.com> Message-ID: On Tue, 6 May 2003, Guido van Rossum wrote: > A month ago at Python UK in Oxford (which was colocated with C and C++ > standardization meetings as well as a general C and C++ users > conference) I met with some folks from Microsoft's VC development > team, including the project lead, Nick Hodapp. I told Nick that > Python for Windows was still built using VC 6. He pointed out that > the actual compilers (not the GUI) from VC 7 are freely downloadable. > > More recently, Nick sent me an email offering to donate copies of VC 7 > to the "key developers". I count Tim, myself and Mark Hammond among > the key developers. Is there anyone else who would count themselves > among those? > > I presume he's offering the pro version, which has a real optimizer, > unlike the "standard" version that was kindly donated by Bjorn > Pettersen. > > I can see advantages and disadvantages of moving to VC 7; I'm sure the > VC 7 compiler is more standard-compliant and generates faster code, > but a disadvantage is that you can't apparently link binaries built > with VC 6 to a program built with VC 7, meaning that 3rd party > extensions will have to be recompiled with VC 7 as well. I have no > idea how many projects this will affect (don't worry about Zope Corp > :-). Maybe we should try to include those 3rd party developers in the > deal. 
(I think Robin Dunn would be affected, wxPython has a Windows > distribution.) > > If you think this is a bad idea or if you would like to qualify for a > compiler donation, please follow up! > > --Guido van Rossum (home page: http://www.python.org/~guido/) > > _______________________________________________ > Python-Dev mailing list > Python-Dev@python.org > http://mail.python.org/mailman/listinfo/python-dev > Visual Studio 2003 came out a few weeks ago. I honestly don't know if it's considered VC8 or just VC7.1 with the same backend compilers. But if you're going to upgrade, you might as well go all the way. Also, I'm assuming 2.3 will still be compiled on 6.0, right? From guido@python.org Tue May 6 19:55:15 2003 From: guido@python.org (Guido van Rossum) Date: Tue, 06 May 2003 14:55:15 -0400 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: Your message of "Tue, 06 May 2003 19:48:03 BST." <200305061948.03757.phil@riverbankcomputing.co.uk> References: <200305061826.h46IQ7605750@odiug.zope.com> <200305061948.03757.phil@riverbankcomputing.co.uk> Message-ID: <200305061855.h46ItFZ06217@odiug.zope.com> > How do we get hold of the free VC 7 compilers? Here's the info Nick sent me: | We offer as part of the .NET Framework SDK each of the compilers that | comprise our Visual Studio tool - including C++. The caveat here is | that we don't yet ship the full CRT or STL with this distribution - | this will be changing. Also, the 64bit C++ compilers ship for free as | part of the Windows Platform SDK. All of this is available on | msdn.microsoft.com. [...] | Here are the links to the SDKs. But so you aren't surprised, these are | NOT low-overhead downloads or installs...
| | .NET Framework 1.1 | | http://www.msdn.microsoft.com/netframework/downloads/howtoget.aspx | | Platform SDK | | http://msdn.microsoft.com/library/default.asp?url=/library/en-us/sdkintro/sdkintro/obtaining_the_complete_sdk.asp --Guido van Rossum (home page: http://www.python.org/~guido/) From tim.one@comcast.net Tue May 6 19:56:52 2003 From: tim.one@comcast.net (Tim Peters) Date: Tue, 06 May 2003 14:56:52 -0400 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <200305061948.03757.phil@riverbankcomputing.co.uk> Message-ID: [Phil Thompson] > How do we get hold of the free VC 7 compilers? Part of the 100+ MB .NET Framework 1.1 SDK: http://www.msdn.microsoft.com/netframework/downloads/howtoget.aspx Note that this requires Win2K minimum. From jepler@unpythonic.net Tue May 6 19:57:50 2003 From: jepler@unpythonic.net (Jeff Epler) Date: Tue, 6 May 2003 13:57:50 -0500 Subject: RedHat 9 _random failure under -pg (was Re: [Python-Dev] Startup time) Message-ID: <20030506185750.GB27125@unpythonic.net> On Tue, May 06, 2003 at 01:36:00PM -0500, Jeff Epler wrote: > (gdb) run -c "import random" > Starting program: /usr/src/Python-2.3b1/python -c "import random" > [New Thread 1074963072 (LWP 28408)] > > Program received signal SIGILL, Illegal instruction. > [Switching to Thread 1074963072 (LWP 28408)] > 0x08109aa0 in subtype_getsets_full () > (gdb) where > #0 0x08109aa0 in subtype_getsets_full () > #1 0x4001c743 in random_new (type=0x4001c738, args=0x4012c02c, kwds=0x0) > at /usr/src/Python-2.3b1/Modules/_randommodule.c:439 > (gdb) ptype subtype_getsets_full > type = struct PyGetSetDef { > [...] gcc is generating plainly bogus code for this simple function random_new: 00001738 <random_new>: 1738: 55 push %ebp 1739: 89 e5 mov %esp,%ebp 173b: 56 push %esi 173c: 53 push %ebx 173d: ff 93 7c 00 00 00 call *0x7c(%ebx) (for those of you who don't read x86 assembly, the first 4 instructions are part of a standard function prologue.
The fifth instruction is a call through a function pointer, but the register's value at this point is undefined. This is not the call to type->tp_alloc(), correct code for that is just below) Well, this may have been a false alarm -- when I removed -pg from OPT in the Makefile, './python -c "import random"' works. So this is a problem only when profiling is enabled. Is this intended to work? In any case, the fact that the disassembly is so plainly bogus tends to imply that this is a gcc bug, not anything that Python can fix. Jeff From just@letterror.com Tue May 6 19:59:02 2003 From: just@letterror.com (Just van Rossum) Date: Tue, 6 May 2003 20:59:02 +0200 Subject: [Python-Dev] Startup time In-Reply-To: <20030506183721.GC1340@epoch.metaslash.com> Message-ID: Neal Norwitz wrote: > I think many of the extra stat/open calls are due to zipimports. > > I don't have python23.zip, but it's still looking for a bunch > of extra files that can't exist (in python23.zip). Perhaps > if the zip file doesn't exist, we can short circuit the remaining > calls to open()? I think we should, although I wouldn't know off hand how to do that. There's still some nice-to-have PEP302 stuff that remains to be implemented, that could actually help solve this problem. Currently there are no real importer objects for the builtin import mechanisms: a value of None for a path item in sys.path_importer_cache means: use the builtin importer. If there _was_ a true builtin importer object, None could mean: no importer can handle this path item, skip it. See also python.org/sf/692884. I hope to be able to work on this before 2.3b2.
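[Editorial note: the importer-cache idea Just describes can be sketched as follows. The `NullImporter` class and `importer_for` helper are hypothetical illustrations of the scheme, not part of Python 2.3's actual machinery.]

```python
import os

class NullImporter(object):
    """Hypothetical 'no importer' object: caching one of these for a
    nonexistent sys.path entry lets later imports skip the entry
    without issuing any further stat()/open() calls."""
    def find_module(self, fullname, path=None):
        return None  # this path entry can never satisfy an import

def importer_for(path_item, _cache={}):
    # Sketch of the lookup Just describes: consult the cache first;
    # map entries that don't exist to a NullImporter exactly once,
    # so a missing python23.zip is probed only one time.
    try:
        return _cache[path_item]
    except KeyError:
        if os.path.exists(path_item):
            importer = None  # None still means "use the builtin machinery"
        else:
            importer = NullImporter()
        _cache[path_item] = importer
        return importer
```

Under this scheme a nonexistent zip file on sys.path would cost one stat() total, rather than one stat() plus four open() probes per imported module.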
> stat64("/home/neal/local/lib/python23.zip/warnings", 0xbfffebc0) = -1 ENOENT (No such file or > directory) > open("/home/neal/local/lib/python23.zip/warnings.so", O_RDONLY|O_LARGEFILE) = -1 ENOENT (No > such file or directory) > open("/home/neal/local/lib/python23.zip/warningsmodule.so", O_RDONLY|O_LARGEFILE) = -1 ENOENT > (No such file or directory) > open("/home/neal/local/lib/python23.zip/warnings.py", O_RDONLY|O_LARGEFILE) = -1 ENOENT (No > such file or directory) > open("/home/neal/local/lib/python23.zip/warnings.pyc", O_RDONLY|O_LARGEFILE) = -1 ENOENT (No > such file or directory) You could try editing site.py so it (as it used to) removes path items that don't exist on the file system. Except this probably only helps if you'd do this _before_ os.py is imported, as os.py pulls in quite a few modules. Hm, chicken and egg... Or disable the code that adds the zipfile to sys.path in Modules/getpath.c, and compare the number of stat calls. Just From tim@zope.com Tue May 6 20:00:03 2003 From: tim@zope.com (Tim Peters) Date: Tue, 6 May 2003 15:00:03 -0400 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: Message-ID: [logistix] > ... > Also, I'm assuming 2.3 will still be compiled on 6.0, right? The PythonLabs 2.3 Windows distribution will be compiled with MSVC 6, barring an unbroken chain of miracles. From guido@python.org Tue May 6 20:01:01 2003 From: guido@python.org (Guido van Rossum) Date: Tue, 06 May 2003 15:01:01 -0400 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: Your message of "Tue, 06 May 2003 13:52:32 CDT." References: Message-ID: <200305061901.h46J11306259@odiug.zope.com> > Visual Studio 2003 came out a few weeks ago. I honestly don't know if > it's considered VC8 or just VC7.1 with the same backend compilers. But if > you're going to upgrade, you might as well go all the way. Good question. > Also, I'm assuming 2.3 will still be compiled on 6.0, right?
Hm, I was thinking that 2.3 final could be built using 7.x if Nick can get us the donated copies fast enough. --Guido van Rossum (home page: http://www.python.org/~guido/) From skip@pobox.com Tue May 6 20:12:13 2003 From: skip@pobox.com (Skip Montanaro) Date: Tue, 6 May 2003 14:12:13 -0500 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <200305061901.h46J11306259@odiug.zope.com> References: <200305061901.h46J11306259@odiug.zope.com> Message-ID: <16056.2317.124886.963460@montanaro.dyndns.org> >> Also, I'm assuming 2.3 will still be compiled on 6.0, right? Guido> Hm, I was thinking that 2.3 final could be built using 7.x if Guido> Nick can get us the donated copies fast enough. I can see the downside (next to no experience with 7.x, and perhaps none before the final release). What's the upside? Skip From skip@pobox.com Tue May 6 20:18:24 2003 From: skip@pobox.com (Skip Montanaro) Date: Tue, 6 May 2003 14:18:24 -0500 Subject: [Python-Dev] pyconfig.h not regenerated by "config.status --recheck" In-Reply-To: References: <16055.57554.364845.689049@montanaro.dyndns.org> Message-ID: <16056.2688.72423.251200@montanaro.dyndns.org> >> This generated pyconfig.h. It would thus appear that config.status >> shouldn't be used by developers. Apparently one of the other flags >> it appends to the generated configure command suppresses generation >> of pyconfig.h (and maybe other files). Martin> Can you find out whether this is related to the fact that you Martin> are building in a separate build directory? I just confirmed that it's not related to the separate build directory. When you run config.status --recheck it reruns your latest configure command with the extra flags --no-create and --no-recursion. Without rummaging around in the configure file my guess is the --no-create flag is the culprit. So, a word to the wise: avoid config.status --recheck. 
Skip From tim.one@comcast.net Tue May 6 20:17:28 2003 From: tim.one@comcast.net (Tim Peters) Date: Tue, 06 May 2003 15:17:28 -0400 Subject: [Python-Dev] Windows installer request... In-Reply-To: <60FB8BB7F0EFC7409B75EEEC13E20192022DE23A@admin56.narex.com> Message-ID: [Bjorn Pettersen] > Most installers default to the system drive, so I didn't even look the > first time. I am able to change it manually. > ... > It should be as easy as (platforms that don't have %systemdrive% could > only install to C:): > > item: Get Environment Variable > Variable=OSDRIVE > Environment=SystemDrive > Default=C: > end > > However, you might have to do > > item: Get Registry Key Value > Variable=OSDRIVE > Key=System\CurrentControlSet\Control\Session Manager\Environment > Value Name=SystemDrive > Flags=00000100 > Default=C: > end > > (not sure about the Flags parameter) I couldn't find much documentation, > and the example I'm looking at is a little "divided" about which it > should use... I think it tries the first one, and falls back on the > second(?) (http://ibinstall.defined.net/dl_scripts.htm, > script_6016.zip/IBWin32Setup.wse). > > Also, it looks like you want to use %SYS32% to get to the windows system > directory (on WinXP, it's c:\windows\system32, which doesn't seem to be > listed anywhere...) Enough already: I don't have time to try umpteen different things here, or really even one. What I did do is build an installer *just* removing the hard-coded Wizard-generated "C:" prefix. Martin tried that and said it worked for him. It doesn't hurt me. If it works for you too, I'll commit the change: ftp://ftp.python.org/pub/tmp/experimental.exe Please give that a try. It's an incoherent mix of files, so please use a junk name for the installation directory and program startup group (or simply abort the install after you see whether it suggested a drive you approve of). > I can't figure out how you're building the installer however.
If you can > point me in the right direction I can test it on my special WinXP, > regular WinXP, Win98, Win2k, and maybe WinNT4 (I think we still have one > around :-). .wse files aren't intended to be edited by hand (although we all do, sometimes). Instead, they're input to Wise's commercial GUI, which displays their contents in a nice block-indented, color-coded way. "flags" aren't documented, and the GUI never shows them to you -- they correspond to the on/off status of various checkboxes in various GUI dialogs. We use Wise 8.14 to build the installer. If you have Wise, you open the python20.wse file using it, and click the "Compile" button in the GUI. If you don't have Wise, I suppose you guess what Wise would do if you did have it. From brian@sweetapp.com Tue May 6 20:24:31 2003 From: brian@sweetapp.com (Brian Quinlan) Date: Tue, 06 May 2003 12:24:31 -0700 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <16056.2317.124886.963460@montanaro.dyndns.org> Message-ID: <007e01c31405$1ea52fc0$21795418@dell1700> > I can see the downside (next to no experience with 7.x, and perhaps none > before the final release). What's the upside? It's free and more standards compliant. Cheers, Brian From martin@v.loewis.de Tue May 6 20:21:41 2003 From: martin@v.loewis.de (Martin v. Löwis) Date: 06 May 2003 21:21:41 +0200 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <200305061826.h46IQ7605750@odiug.zope.com> References: <200305061826.h46IQ7605750@odiug.zope.com> Message-ID: Guido van Rossum writes: > More recently, Nick sent me an email offering to donate copies of VC 7 > to the "key developers". I count Tim, myself and Mark Hammond among > the key developers. Is there anyone else who would count themselves > among those? Does he already have the copies, or would he purchase them/donate the money? > If you think this is a bad idea or if you would like to qualify for a > compiler donation, please follow up!
If the money isn't spent yet, I think it would be better spent for copies of VC 7.1 (aka .NET 2003). Reportedly, this compiler fixes a number of bugs of the 7.0 release, i.e. it crashes less frequently. I'm still uncertain what the binary compatibility issues are, but I have reason to assume that 7.0 and 7.1 are binary compatible. Before getting multiple copies of the compiler, you should double check that you can actually produce a Windows installer for that compiler. Notice that there is a particular problem hidden here: You will have to ship the C runtime (MSVCR7.DLL) with the installer. However, Microsoft does *not* give you permission to include the DLL file. Instead, they provide a Windows installer snippet which you must "use" (I believe in the sense of "execute on the target machine"). The installer snippet will check for versions, deal with DLL caches, etc. Microsoft has a procedure for combining installer snippets into full installer files. They acknowledge the existence of other tools that make installable binaries, but mandate that these tools perform the same procedures. So you should check whether your copy of Wise can deal with these issues. If you find it could actually work, I'm +0 on accepting the donation (though I won't need a copy myself). You have to switch sooner or later, anyway, so you might as well switch now instead of later. The advantage I see for Python itself is that IPv6 would now work on Windows. The disadvantage I see is that distutils would need to get updated. If you think that 2.3 won't be built with 7.x anyway, you might as well reject the donation, and hope the donor will still be there to offer VC 7.2/8.0.
Regards, Martin From tim.one@comcast.net Tue May 6 20:20:34 2003 From: tim.one@comcast.net (Tim Peters) Date: Tue, 06 May 2003 15:20:34 -0400 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <200305061901.h46J11306259@odiug.zope.com> Message-ID: [Guido] > Hm, I was thinking that 2.3 final could be built using 7.x if Nick can > get us the donated copies fast enough. As I said, the PythonLabs Windows 2.3 installer will be compiled using MSVC 6, barring an unbroken chain of miracles. From martin@v.loewis.de Tue May 6 20:25:20 2003 From: martin@v.loewis.de (Martin v. Löwis) Date: 06 May 2003 21:25:20 +0200 Subject: RedHat 9 _random failure under -pg (was Re: [Python-Dev] Startup time) In-Reply-To: <20030506185750.GB27125@unpythonic.net> References: <20030506185750.GB27125@unpythonic.net> Message-ID: Jeff Epler writes: > Well, this may have been a false alarm -- when I removed -pg from OPT in > the Makefile, './python -c "import random"' works. So this is a problem > only when profiling is enabled. Is this intended to work? You mean, is the gcc option -pg supposed to work? As a Python developer: How am I supposed to know? As a gcc developer: yes, certainly. > In any case, the fact that the disassembly is so plainly bogus tends to > imply that this is a gcc bug, not anything that Python can fix. That seems to be the case, yes. Python can only work around it, but in this case, the work-around seems trivial.
Regards, Martin From gtalvola@nameconnector.com Tue May 6 20:27:11 2003 From: gtalvola@nameconnector.com (Geoffrey Talvola) Date: Tue, 6 May 2003 15:27:11 -0400 Subject: [Python-Dev] MS VC 7 offer Message-ID: <61957B071FF421419E567A28A45C7FE59AF419@mailbox.nameconnector.com> Guido van Rossum wrote: > I can see advantages and disadvantages of moving to VC 7; I'm sure the > VC 7 compiler is more standard-compliant and generates faster code, > but a disadvantage is that you can't apparently link binaries built > with VC 6 to a program built with VC 7, meaning that 3rd party > extensions will have to be recompiled with VC 7 as well. If that's really true then my vote would be against switching to VC 7. My company uses VC 6 extensively and we have no plans to upgrade to VC 7. Our Python programs make extensive use of .pyd's compiled with VC6, and we also embed the Python interpreter within our C++ programs. It would be _very_ painful for us to upgrade our world to VC7, and if Python switched to VC 7, we'd probably be forced to simply compile our own custom version of Python (and the 3rd-party extension DLLs we use) with VC6. So there's one data point for you... - Geoff From skip@pobox.com Tue May 6 20:31:16 2003 From: skip@pobox.com (Skip Montanaro) Date: Tue, 6 May 2003 14:31:16 -0500 Subject: [Python-Dev] SF CVS offline Message-ID: <16056.3460.431223.466945@montanaro.dyndns.org> It appears SF CVS is offline (as of 2:30PM Central Daylight Time). I noticed this when I was prompted for a CVS password for the first time in ages (and which I can't remember). I went poking around for help and came across this page: http://sourceforge.net/docman/display_doc.php?docid=2352&group_id=1 which says, in part: Project CVS Services: Offline; unplanned maintenance (follow-up from 2003-05-05) in-progress FYI. 
Skip From skip@pobox.com Tue May 6 20:33:22 2003 From: skip@pobox.com (Skip Montanaro) Date: Tue, 6 May 2003 14:33:22 -0500 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <007e01c31405$1ea52fc0$21795418@dell1700> References: <16056.2317.124886.963460@montanaro.dyndns.org> <007e01c31405$1ea52fc0$21795418@dell1700> Message-ID: <16056.3586.553248.689395@montanaro.dyndns.org> >> I can see the downside (next to no experience with 7.x, and perhaps >> none before the final release). What's the upside? Brian> It's free and more standards compliant. Then I suggest we have at least one beta which is built using it. Skip From brian@sweetapp.com Tue May 6 20:53:56 2003 From: brian@sweetapp.com (Brian Quinlan) Date: Tue, 06 May 2003 12:53:56 -0700 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <16056.3586.553248.689395@montanaro.dyndns.org> Message-ID: <007f01c31409$3a5f9670$21795418@dell1700> Brian> It's free and more standards compliant. > > Then I suggest we have at least one beta which is built using it. Don't get me wrong: I think that moving to VC7 for Python 2.3 would be a mistake if VC6 compiled extension modules are not binary compatible. My understanding was that static libraries are not compatible but that dynamic ones are. I spent a few minutes with google but wasn't able to find out. Assuming that VC6 and VC7 are not binary compatible, here are my concerns: 1. 3rd party extension developers will have to switch very quickly to be ready for the 2.3 release 2. Some 3rd party extension developers may have already released binaries for Python 2.3, based on the understanding that there won't be any additional API changes after the first beta (barring a disaster). 3. I believe that the installer normally preserves site-packages when doing an upgrade? If so, the user is going to be left with extension modules that won't work. Cheers, Brian From fdrake@acm.org Tue May 6 20:57:04 2003 From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Tue, 6 May 2003 15:57:04 -0400 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <007f01c31409$3a5f9670$21795418@dell1700> References: <16056.3586.553248.689395@montanaro.dyndns.org> <007f01c31409$3a5f9670$21795418@dell1700> Message-ID: <16056.5008.75631.677019@grendel.zope.com> Brian Quinlan writes: > 1. 3rd party extension developers will have to switch very quickly to be > ready for the 2.3 release A very real issue, to be sure. > 2. Some 3rd party extension developers may have already released > binaries for Python 2.3, based on the understanding that there won't > be any additional API changes after the first beta (barring a > disaster). I'm not convinced that's a huge problem, though it could be an annoyance. > 3. I believe that the installer normally preserves site-packages when > doing an upgrade? If so, the user is going to be left with extension > modules that won't work. Yes, but site-packages is specific to the major.minor version of Python, so it would only bite people going from an alpha/beta to a final release, not from major.minor-1. Is this really an issue? -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From pje@telecommunity.com Tue May 6 20:58:34 2003 From: pje@telecommunity.com (Phillip J. Eby) Date: Tue, 06 May 2003 15:58:34 -0400 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <200305061845.h46Ijo106044@odiug.zope.com> References: <200305061826.h46IQ7605750@odiug.zope.com> <20030506184107.GA21470@glacier.arctrix.com> Message-ID: <5.1.1.6.0.20030506155456.01f9c220@telecommunity.com> At 02:45 PM 5/6/03 -0400, Guido van Rossum wrote: > > Also, does this affect whether extensions can be compiled by Mingw? > > It would be nice if people could continue building extensions on > > Windows using free tools. > >I know nothing about Mingw. Anyone who does please speak up if this >would affect them or not. I build my extensions on Windows 98 with MinGW. I don't know if VC6 vs. VC7 makes a difference or not, since I don't own either one.
I think someone said something about the free VC7 requiring Win2K? That seems to me like a dealbreaker for switching from MinGW to VC7, even if the VC7 is free-as-in-beer. From skip@pobox.com Tue May 6 21:01:51 2003 From: skip@pobox.com (Skip Montanaro) Date: Tue, 6 May 2003 15:01:51 -0500 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <007f01c31409$3a5f9670$21795418@dell1700> References: <16056.3586.553248.689395@montanaro.dyndns.org> <007f01c31409$3a5f9670$21795418@dell1700> Message-ID: <16056.5295.907462.399304@montanaro.dyndns.org> Brian> Assuming that VC6 and VC7 are not binary compatible, here are my Brian> concerns: ... Sounds to me like the switch to VC7 will have to happen with a long lead time, similar to what one might expect if Guido decided to deprecate the sys module. ;-) Skip From aleax@aleax.it Tue May 6 21:12:07 2003 From: aleax@aleax.it (Alex Martelli) Date: Tue, 6 May 2003 22:12:07 +0200 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <007f01c31409$3a5f9670$21795418@dell1700> References: <007f01c31409$3a5f9670$21795418@dell1700> Message-ID: <200305062212.07539.aleax@aleax.it> On Tuesday 06 May 2003 09:53 pm, Brian Quinlan wrote: > Brian> It's free and more standards compliant. > > > Then I suggest we have at least one beta which is built using it. > > Don't get me wrong: I think that moving to VC7 for Python 2.3 would be a > mistake if VC6 compiled extension modules are not binary compatible. My > understanding was that static libraries are not compatible but that > dynamic ones are. I spent a few minutes with google but wasn't able to > find out. When we discussed VC versions (back when we met in Oxford during PythonUK/ACCU), David Abrahams seemed adamant that VC7 and VC6 are indeed compatible -- as he has first-hand experience while I just have horror stories from ex-coworkers I suspect he's likelier to be right. Anyway, I'm CC'ing him since I do suspect he has relevant input and might not be following python-dev right now...
Alex From martin@v.loewis.de Tue May 6 21:22:26 2003 From: martin@v.loewis.de (Martin v. Löwis) Date: Tue, 06 May 2003 22:22:26 +0200 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <007f01c31409$3a5f9670$21795418@dell1700> References: <007f01c31409$3a5f9670$21795418@dell1700> Message-ID: <3EB81982.8070600@v.loewis.de> Brian Quinlan wrote: > Don't get me wrong: I think that moving to VC7 for Python 2.3 would be a > mistake if VC6 compiled extension modules are not binary compatible. My > understanding was that static libraries are not compatible but that > dynamic ones are. I spent a few minutes with google but wasn't able to > find out. Please rest assured that they are definitely incompatible. People have been trying to combine VC7 extension modules with VC6, and got consistent crashes. The crashes occur as you pass FILE* across libraries: Neither C library can deal with FILE* (such as stdout) received from the other library. > 1. 3rd party extension developers will have to switch very quickly to be > > ready for the 2.3 release True. > 2. Some 3rd party extension developers may have already released > binaries for Python 2.3, based on the understanding that there won't > be any additional API changes after the first beta (barring a > disaster). There won't be any. That's an ABI change. > 3. I believe that the installer normally preserves site-packages when > doing an upgrade? If so, the user is going to be left with extension > modules that won't work. Users installing betas should still expect such things. Uninstallation before upgrading to the final release is strongly advised. Regards, Martin From guido@python.org Tue May 6 21:23:18 2003 From: guido@python.org (Guido van Rossum) Date: Tue, 06 May 2003 16:23:18 -0400 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: Your message of "Tue, 06 May 2003 22:12:07 +0200."
<200305062212.07539.aleax@aleax.it> References: <007f01c31409$3a5f9670$21795418@dell1700> <200305062212.07539.aleax@aleax.it> Message-ID: <200305062023.h46KNI907721@odiug.zope.com> I should mention that on re-reading Nick's email, it's clear that he's offering to donate copies of Visual C++ 2003, so that's the latest. I've invited him to respond directly to the comments and questions. In any case, it looks like it may be best to wait until after 2.3 is released, although if there's time I wouldn't mind playing a bit with 2003. (Hmm... if it really doesn't work on Win98 I have a problem.) --Guido van Rossum (home page: http://www.python.org/~guido/) From martin@v.loewis.de Tue May 6 21:26:51 2003 From: martin@v.loewis.de (Martin v. Löwis) Date: Tue, 06 May 2003 22:26:51 +0200 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <200305062212.07539.aleax@aleax.it> References: <007f01c31409$3a5f9670$21795418@dell1700> <200305062212.07539.aleax@aleax.it> Message-ID: <3EB81A8B.9090603@v.loewis.de> Alex Martelli wrote: > When we discussed VC versions (back when we met in Oxford during > PythonUK/ACCU), David Abrahams seemed adamant that VC7 and VC6 > are indeed compatible I doubt he said this in this generality: he surely knows that you cannot mix C++ object files on the object file level between those compilers, as they implement completely different ABIs. For Python, the biggest problem is that you cannot pass FILE* from one C library to the other, because of some stupid locking test in the C library. This does cause crashes when you try to use Python extension modules compiled with the wrong compiler.
Regards, Martin From aleax@aleax.it Tue May 6 21:34:12 2003 From: aleax@aleax.it (Alex Martelli) Date: Tue, 6 May 2003 22:34:12 +0200 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <200305062023.h46KNI907721@odiug.zope.com> References: <007f01c31409$3a5f9670$21795418@dell1700> <200305062212.07539.aleax@aleax.it> <200305062023.h46KNI907721@odiug.zope.com> Message-ID: <200305062234.12363.aleax@aleax.it> On Tuesday 06 May 2003 10:23 pm, Guido van Rossum wrote: > I should mention that on re-reading Nick's email, it's clear that he's > offering to donate copies of Visual C++ 2003, so that's the latest. > I've invited him to respond directly to the comments and questions. > > In any case, it looks like it may be best to wait until after 2.3 is > released, although if there's time I wouldn't mind playing a bit with > 2003. (Hmm... if it really doesn't work on Win98 I have a problem.) Me too -- a BAD one, since I do just about all of my "windows" work these days with win4lin under Linux on my desktop box (cheap, fast, convenient), or on an old Acer Travelmate 345T laptop, and both only support Win98 -- the only "modern" Windows version I have around is in the dualboot of a far-too-heavy Dell laptop which came with Win/XP (so I didn't entirely remove it when installing Linux as the main OS, just shrank it as much as I could in case I ever needed something in it)... It WOULD be deucedly inconvenient to have to install Win/XP and keep it booted just to be able to build Python extension binaries for Windows...!-( Why a command-line compiler shouldn't be able to run on just about any version of its OS really escapes me. 
Maybe a clever move to force us laggards to upgrade whether we want to or not...?-( Alex From skip@pobox.com Tue May 6 21:50:31 2003 From: skip@pobox.com (Skip Montanaro) Date: Tue, 6 May 2003 15:50:31 -0500 Subject: [Python-Dev] bsddb185 module changes checked in Message-ID: <16056.8215.274307.904009@montanaro.dyndns.org> The various bits necessary to implement the "build bsddb185 when appropriate" have been checked in. I'm pretty sure I don't have the best possible test for the existence of a db library, but it will have to do for now. I suspect others can clean it up later during the beta cycle. The current detection code in setup.py should work for Nick on OSF/1 and for platforms which don't require a separate db library. I'd appreciate some extra pounding on this code. Thanks, Skip From tim.one@comcast.net Tue May 6 21:50:18 2003 From: tim.one@comcast.net (Tim Peters) Date: Tue, 06 May 2003 16:50:18 -0400 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <3EB81A8B.9090603@v.loewis.de> Message-ID: [Alex Martelli] > When we discussed VC versions (back when we met in Oxford during > PythonUK/ACCU), David Abrahams seemed adamant that VC7 and VC6 > are indeed compatible [Martin v. Lowis] > I doubt he said this in this generality: he surely knows that you > cannot mix C++ object files on the object file level between those > compilers, as they implement completely different ABIs. > > For Python, the biggest problem is that you cannot pass FILE* from one C > library to the other, because of some stupid locking test in the C > library. This does cause crashes when you try to use Python extension > modules compiled with the wrong compiler. And not the only problem. Review the "PyObject_New vs PyObject_NEW" thread from python-dev in March. This snippet sums it up: [David Abrahams] > Python was compiled with vc6, the rest with vc7. I test this > combination regularly and have never seen a problem. [Tim] You have now .
From brian@sweetapp.com Tue May 6 22:15:35 2003 From: brian@sweetapp.com (Brian Quinlan) Date: Tue, 06 May 2003 14:15:35 -0700 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <3EB81982.8070600@v.loewis.de> Message-ID: <008601c31414$a26d0120$21795418@dell1700> > Please rest assured that they are definitely incompatible. People have > been trying to combine VC7 extension modules with VC6, and got > consistent crashes. The crashes occur as you pass FILE* across > libraries: Neither C library can deal with FILE* (such as stdout) > received from the other library. Wouldn't this only affect extension modules using PyFile_FromFile and PyFile_AsFile? And a little hackery could make those routines generate exceptions if called from an incompatible VC version. > > 2. Some 3rd party extension developers may have already released > > binaries for Python 2.3, based on the understanding that there > > won't be any additional API changes after the first beta (barring > > a disaster). > > There won't be any. That's any ABI change. Isn't the ABI dependent on the API and linker? The API is supposed to be stable at this point. I would imagine that most extension developers would assume that the build environment is also stable at this point. Cheers, Brian From jepler@unpythonic.net Tue May 6 21:57:33 2003 From: jepler@unpythonic.net (Jeff Epler) Date: Tue, 6 May 2003 15:57:33 -0500 Subject: RedHat 9 _random failure under -pg (was Re: [Python-Dev] Startup time) In-Reply-To: References: <20030506185750.GB27125@unpythonic.net> Message-ID: <20030506205733.GE27125@unpythonic.net> On Tue, May 06, 2003 at 09:25:20PM +0200, Martin v. Löwis wrote: > Jeff Epler writes: > > > Well, this may have been false alarm -- when I removed -pg from OPT in > > the Makefile, './python -c "import random"' works. So this is a problem > > only when profiling is enabled. Is this intended to work? > > You mean, is the gcc option -pg supposed to work?
As a Python > developer: How am I supposed to know? As a gcc developer: yes, > certainly. I didn't know you were a gcc developer. In any case, I've distilled this down to a small testcase and was working on preparing a bug report for their gnats database. The testcase is about as simple as it gets: /* compile with -pg -fPIC -O */ typedef struct { void *(*f)(void *, int); } T; void *g(T *t) { return t->f(t, 0); } however, I checked 3.2.3 and this bug is fixed, so I guess I don't need to do that. Jeff From lists@morpheus.demon.co.uk Tue May 6 22:19:31 2003 From: lists@morpheus.demon.co.uk (Paul Moore) Date: Tue, 06 May 2003 22:19:31 +0100 Subject: [Python-Dev] MS VC 7 offer References: <200305061901.h46J11306259@odiug.zope.com> Message-ID: Guido van Rossum writes: >> Visual Studio 2003 came out a few weeks ago. I honestly don't know if >> it's considered VC8 or just VC7.1 with the same backend compilers. But if >> you're going to upgrade, you might as well go all the way. > > Good question. > >> Also, I'm assuming 2.3 will still be compiled on 6.0, right? > > Hm, I was thinking that 2.3 final could be built using 7.x if Nick can > get us the donated copies fast enough. If this means that those of us with VC6, and with no plans/reasons to upgrade can no longer build our own extensions, this would be a disaster. Surely VC7-compiled C programs can be built in such a way as to be link-compatible with VC6-compiled extensions??? (Wait, this is Microsoft...) Please *don't* build 2.3 final with VC7. If you're going to switch, give users more warning, and test builds - I would need at least to find out if I could build extensions against a VC7-compiled Python using mingw... Paul.
-- This signature intentionally left blank From lists@morpheus.demon.co.uk Tue May 6 22:20:32 2003 From: lists@morpheus.demon.co.uk (Paul Moore) Date: Tue, 06 May 2003 22:20:32 +0100 Subject: [Python-Dev] MS VC 7 offer References: <200305061948.03757.phil@riverbankcomputing.co.uk> Message-ID: Tim Peters writes: > [Phil Thompson] >> How do we get hold of the free VC 7 compilers? > > Part of the 100+ MB .NET Framework 1.1 SDK: > > http://www.msdn.microsoft.com/netframework/downloads/howtoget.aspx > > Note that this requires Win2K minimum. Note that these have no optimiser, as I understand it. Paul. -- This signature intentionally left blank From martin@v.loewis.de Tue May 6 22:45:28 2003 From: martin@v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 06 May 2003 23:45:28 +0200 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <008601c31414$a26d0120$21795418@dell1700> References: <008601c31414$a26d0120$21795418@dell1700> Message-ID: <3EB82CF8.4030005@v.loewis.de> Brian Quinlan wrote: > Wouldn't this only affect extension modules using PyFile_FromFile and > PyFile_AsFile? That might be the case. However, notice that there might be other incompatibilities which we might discover by chance only - Microsoft hasn't documented any of this. >>There won't be any. That's any ABI change. > > > Isn't the ABI dependent on the API and linker? And the compiler, and the operating system, and the microprocessor. > The API is supposed to be > stable at this point. I would imagine that most extension developers > would assume that the build environment is also stable at this point. Yes, some are certainly assuming that. Some are sincerely hoping, or even expecting, that Python 2.3 is released with VC7, so that they can embed Python in their VC7-based application without having to recompile it. No matter what the choice is, somebody will be unhappy.
Regards, Martin From dave@boost-consulting.com Tue May 6 23:01:28 2003 From: dave@boost-consulting.com (David Abrahams) Date: Tue, 06 May 2003 18:01:28 -0400 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <3EB81A8B.9090603@v.loewis.de> (Martin v. =?iso-8859-1?q?L=F6wis's?= message of "Tue, 06 May 2003 22:26:51 +0200") References: <007f01c31409$3a5f9670$21795418@dell1700> <200305062212.07539.aleax@aleax.it> <3EB81A8B.9090603@v.loewis.de> Message-ID: "Martin v. Löwis" writes: > Alex Martelli wrote: > >> When we discussed VC versions (back when we met in Oxford during >> PythonUK/ACCU), David Abrahams seemed adamant that VC7 and VC6 >> are indeed compatible > > I doubt he said this in this generality Actually, I did. I may have overstated the case slightly, but not by much. > he surely knows that you cannot mix C++ object files on the object > file level between those compilers, as they implement completely > different ABIs. They implement substantially similar ABIs. Here are the facts in full, glorious/gory detail from a member of Microsoft's compiler team. I quote: The bottom line: the ABI is backwards compatible. We do require using the linker that matches the newest compiler used in a set of .obj files. There were some incompatible name decoration changes (function templates) b/w VC7 and VC7.1. Most people should never notice this one, though I know of at least 1 customer that did. Another name decoration change was made b/w VC6 and VC7, but nobody should notice that change, since they were hitting a broken construct anyway. There was a SP of VC6 that is incompatible with VC7 and other builds of VC6, I forget which exactly, maybe SP4, or maybe it was the processor pack. It only involved pointer to members, but we were layout incompatible. The only other issues I can think of are related to __declspec(align(N)) and __unaligned (IA64 only, really.)
> For Python, the biggest problem is that you cannot pass FILE* from > one C library to the other, because of some stupid locking test in > the C library. This does cause crashes when you try to use Python > extension modules compiled with the wrong compiler. Assuming you are passing availability of FILE*s across the extension module boundary and the extension module author is using the VC7 libraries instead of those that ship with VC6 (using the VC6 libraries with VC7 would be a trick)... then yes. In practice, making sure that resources are only used by the appropriate 'C' library is not too difficult, but requires a level of attention that I wouldn't want to demand of newbies. I certainly build all kinds of Boost.Python extension modules with VC7 and test them without problems using a VC6 build of Python. HTH, -- Dave Abrahams Boost Consulting www.boost-consulting.com From guido@python.org Tue May 6 23:06:11 2003 From: guido@python.org (Guido van Rossum) Date: Tue, 06 May 2003 18:06:11 -0400 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: Your message of "Tue, 06 May 2003 22:19:31 BST." References: <200305061901.h46J11306259@odiug.zope.com> Message-ID: <200305062206.h46M6BP08306@odiug.zope.com> > If this means that those of us with VC6, and with no plans/reasons to > upgrade can no longer build our own extensions, this would be a > disaster. Part of the offer was: | Potentially we can even figure out how to enable anyone to | build Python using the freely downloadable compilers I mentioned | above...
--Guido van Rossum (home page: http://www.python.org/~guido/) From martin@v.loewis.de Tue May 6 23:17:14 2003 From: martin@v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 07 May 2003 00:17:14 +0200 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: References: <200305061901.h46J11306259@odiug.zope.com> Message-ID: <3EB8346A.1000907@v.loewis.de> Paul Moore wrote: > If this means that those of us with VC6, and with no plans/reasons to > upgrade can no longer build our own extensions, this would be a > disaster. Using VC7 would be a disaster for those required to use VC6. Using VC6 is a disaster for those required to use VC7. Somebody will be unhappy. > Surely VC7-compiled C programs can be built in such a way as to be > link-compatible with VC6-compiled extensions??? It probably works in many cases, but it is known to fail in certain cases. Regards, Martin From tim.one@comcast.net Tue May 6 23:17:22 2003 From: tim.one@comcast.net (Tim Peters) Date: Tue, 06 May 2003 18:17:22 -0400 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <3EB82CF8.4030005@v.loewis.de> Message-ID: [Martin v. Lowis] > ... > Some are sincerely hoping, or even expecting, that Python 2.3 is > released with VC7, so that they can embed Python in their VC7-based > application without having to recompile it. > > No matter what the choice is, somebody will be unhappy. OTOH, I don't see anything to stop releasing VC6 and VC7 versions of Python, except for the absence of a volunteer to do it. While the Wise installer is proprietary, there's nothing hidden about what goes into a release, there are several free installers people *could* use instead, and the build process for the 3rd-party components is pretty exhaustively documented. Speaking of which, presumably Tcl/Tk and SSL and etc on Windows should also be compiled under VC7 then.
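[Since the thread keeps coming back to whether an extension's compiler matches the interpreter's, a small illustrative sketch may help; this is not anything proposed in the thread, just one way a packager could detect the mismatch up front. Microsoft compilers embed an `MSC v.NNNN` tag in `sys.version` (1200 for VC6, 1300 for VC7, 1310 for VC7.1), which can be parsed before trying to load a binary extension:]

```python
import re
import sys

# _MSC_VER values as embedded in sys.version by MSVC builds of Python.
# (1200 = VC6, 1300 = VC7 / .NET 2002, 1310 = VC7.1 / .NET 2003)
MSC_NAMES = {1200: "VC6", 1300: "VC7", 1310: "VC7.1"}

def msc_version(version_string):
    """Return the MSC version number found in a sys.version-style
    string, or None for non-MSVC builds (gcc, mingw, etc.)."""
    m = re.search(r"MSC v\.(\d+)", version_string)
    if m:
        return int(m.group(1))
    return None

# On a Windows VC6 build, sys.version looks something like:
#   '2.3b1 (#40, Apr 25 2003, ...) [MSC v.1200 32 bit (Intel)]'
print(msc_version('2.3b1 (#40) [MSC v.1200 32 bit (Intel)]'))  # 1200
print(msc_version(sys.version))  # None unless this interpreter is an MSVC build
```

A distutils-style build script could compare this number against the compiler it is about to invoke and warn on a mismatch, rather than letting users discover the FILE* crashes at runtime.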
From cnetzer@mail.arc.nasa.gov Tue May 6 23:37:30 2003 From: cnetzer@mail.arc.nasa.gov (Chad Netzer) Date: 06 May 2003 15:37:30 -0700 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <200305062206.h46M6BP08306@odiug.zope.com> References: <200305061901.h46J11306259@odiug.zope.com> <200305062206.h46M6BP08306@odiug.zope.com> Message-ID: <1052260650.529.14.camel@sayge.arc.nasa.gov> On Tue, 2003-05-06 at 15:06, Guido van Rossum wrote: > Part of the offer was: > > | Potentially we can even figure out how to enable anyone to > | build Python using the freely downloadable compilers I mentioned > | above... Which would seem to exclude building on Win98 machines (or WinME *snort*, or even Win NT 4). Those platforms still have a huge installed base, and I would assume a not insignificant developer base. Is offering a MSVC6 version along with a more recent compiler version an option? -- Chad Netzer (any opinion expressed is my own and not NASA's or my employer's) From martin@v.loewis.de Tue May 6 23:47:01 2003 From: martin@v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 07 May 2003 00:47:01 +0200 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: References: <007f01c31409$3a5f9670$21795418@dell1700> <200305062212.07539.aleax@aleax.it> <3EB81A8B.9090603@v.loewis.de> Message-ID: <3EB83B65.8070900@v.loewis.de> David Abrahams wrote: > Actually, I did. I may have overstated the case slightly, but not by > much. Hmm. While this is certainly off-topic for python-dev, I'm still curious. So I just did this: 1. Create a library project with VC6. Put a single class into a single translation unit #include struct X:public CObject{ X(); }; 2. Compile this library with vc6. 3. Create an MFC application with VC7. Instantiate X somewhere. Try to link. This gives the error message LINK : fatal error LNK1104: cannot open file 'mfc42d.lib' Sure enough, VC7 does not come with that library. 
So it seems very clear to me that the libraries shipped are incompatible in a way that does not allow mixing object files from different compilers. Did I do something wrong here? Regards, Martin From martin@v.loewis.de Tue May 6 23:50:44 2003 From: martin@v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 07 May 2003 00:50:44 +0200 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: References: Message-ID: <3EB83C44.20706@v.loewis.de> Tim Peters wrote: > Speaking of which, presumably Tcl/Tk and SSL and etc on Windows should also > be compiled under VC7 then. That is certainly the case (not to forget bsddb, zlib, and bzip2). This will require quite some volunteer time. Regards, Martin From dave@boost-consulting.com Wed May 7 00:05:41 2003 From: dave@boost-consulting.com (David Abrahams) Date: Tue, 06 May 2003 19:05:41 -0400 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <3EB83B65.8070900@v.loewis.de> (Martin v. =?iso-8859-1?q?L=F6wis's?= message of "Wed, 07 May 2003 00:47:01 +0200") References: <007f01c31409$3a5f9670$21795418@dell1700> <200305062212.07539.aleax@aleax.it> <3EB81A8B.9090603@v.loewis.de> <3EB83B65.8070900@v.loewis.de> Message-ID: "Martin v. Löwis" writes: > David Abrahams wrote: > >> Actually, I did. I may have overstated the case slightly, but not by >> much. > > Hmm. While this is certainly off-topic for python-dev, I'm still > curious. So I just did this: > > 1. Create a library project with VC6. Put a single class into > a single translation unit > > #include > > struct X:public CObject{ > X(); > }; > > 2. Compile this library with vc6. > > 3. Create an MFC application with VC7. Instantiate X somewhere. > Try to link. This gives the error message > > LINK : fatal error LNK1104: cannot open file 'mfc42d.lib' > > Sure enough, VC7 does not come with that library. > So it seems very clear to me that the libraries shipped are > incompatible in a way that does not allow mixing object files > from different compilers.
> Did I do something wrong here? I normally don't think of the contents (or naming) of a non-standard library like MFC that just happens to ship with the compiler as being something that affects object-code compatibility. *If* you accept the way I see that term, your test doesn't say anything about it. Certainly for any accepted definition of "ABI", it's hard to connect your test with the claim that "they implement completely different ABIs". You could make a reasonable argument that differences in the standard 'C' or C++ library affect object code compatibility; frankly I have avoided that area so I don't know whether there are problems with the 'C' library but I know the C++ library underwent a major overhaul, so I wouldn't place any bets. Regardless, when I say "object code compatibility", I'm talking about what's traditionally thought of as the ABI: the layout of objects, calling convention, mechanics of the runtime, etc., all of which are basically library-independent issues. HTH2, -- Dave Abrahams Boost Consulting www.boost-consulting.com From mhammond@skippinet.com.au Wed May 7 00:06:38 2003 From: mhammond@skippinet.com.au (Mark Hammond) Date: Wed, 7 May 2003 09:06:38 +1000 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <200305061826.h46IQ7605750@odiug.zope.com> Message-ID: <03dc01c31424$26758500$530f8490@eden> [Guido] > I can see advantages and disadvantages of moving to VC 7; I'm sure the > VC 7 compiler is more standard-compliant and generates faster code, > but a disadvantage is that you can't apparently link binaries built > with VC 6 to a program built with VC 7, meaning that 3rd party > extensions will have to be recompiled with VC 7 as well. Actually, I think this need not be true. I have MSVC7, not currently installed, but when it was I did manage to mix and match compilers for Python and extensions without problem. I am happy to play with this, but am short on time for a week or so. Another thing to consider is the "make" environment.
If we don't use DevStudio, then presumably our existing project files will become useless. Not a huge problem, but a real one. MSVC exported makefiles are not designed to be maintained. I'm having good success with autoconf and Python on other projects, but that would raise the barrier to including cygwin in your build environment. Then-just-one-step-from-gcc ly, Mark. From mhammond@skippinet.com.au Wed May 7 00:19:50 2003 From: mhammond@skippinet.com.au (Mark Hammond) Date: Wed, 7 May 2003 09:19:50 +1000 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <3EB83B65.8070900@v.loewis.de> Message-ID: <03e201c31425$feeb3a00$530f8490@eden> > > Actually, I did. I may have overstated the case slightly, > but not by > > much. > > Hmm. While this is certainly off-topic for python-dev, I'm still > curious. So I just did this: What you did is to create a library using a specific version of an "external" library (MFC - shipped with MS as part of MSVC, but as external as any other .lib you may use from anywhere) You then upgrade to a newer version of the library, and attempted to link code built using an earlier one. So this has nothing to do with MSVC as such, only with MFC. It is somewhat similar to trying to use a Python 1.x extension with Python 2.x, or, assuming it was possible, using the same MSVCx with 2 discrete MFC versions. Mark. From phil@riverbankcomputing.co.uk Wed May 7 00:46:06 2003 From: phil@riverbankcomputing.co.uk (Phil Thompson) Date: Wed, 7 May 2003 00:46:06 +0100 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: References: Message-ID: <200305070046.06725.phil@riverbankcomputing.co.uk> On Tuesday 06 May 2003 7:56 pm, Tim Peters wrote: > [Phil Thompson] > > > How do we get hold of the free VC 7 compilers? > > Part of the 100+ MB .NET Framework 1.1 SDK: > > http://www.msdn.microsoft.com/netframework/downloads/howtoget.aspx > > Note that this requires Win2K minimum. Does it generate binaries that will run under Win9x? 
Phil From pje@telecommunity.com Wed May 7 00:52:39 2003 From: pje@telecommunity.com (Phillip J. Eby) Date: Tue, 06 May 2003 19:52:39 -0400 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <03dc01c31424$26758500$530f8490@eden> References: <200305061826.h46IQ7605750@odiug.zope.com> Message-ID: <5.1.1.6.0.20030506194806.01fc05b0@telecommunity.com> At 09:06 AM 5/7/03 +1000, Mark Hammond wrote: >Another thing to consider is the "make" environment. If we don't use >DevStudio, then presumably our existing project files will become useless. >Not a huge problem, but a real one. MSVC exported makefiles are not >designed to be maintained. I'm having good success with autoconf and Python >on other projects, but that would raise the barrier to including cygwin in >your build environment. > >Then-just-one-step-from-gcc ly, Just out of curiosity, what is it that MSVC adds to the picture over gcc anyway? Has anybody ever tried making a MinGW-only build of Python on Windows? From phil@riverbankcomputing.co.uk Wed May 7 00:57:10 2003 From: phil@riverbankcomputing.co.uk (Phil Thompson) Date: Wed, 7 May 2003 00:57:10 +0100 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <007e01c31405$1ea52fc0$21795418@dell1700> References: <007e01c31405$1ea52fc0$21795418@dell1700> Message-ID: <200305070057.10872.phil@riverbankcomputing.co.uk> On Tuesday 06 May 2003 8:24 pm, Brian Quinlan wrote: > > I can see the downside (next to no experience with 7.x, and perhaps > > none > > > before the final release). What's the upside? > > It's free and more standards compliant. This is what I'm struggling with. If it's free, why pay any attention to the offer of a donation of a GUI frontend? (With a certain amount of irony, I don't attach any value to a GUI frontend to a compiler.) If it is free using some Microsoft definition of the word (eg. users have to upgrade to Win2K, or some other "read the small print" reason) then my vote is -1. 
If it is really free then submit a PEP and factor it into the normal review/development process. I don't understand the apparent urgency. Phil From mhammond@skippinet.com.au Wed May 7 01:08:25 2003 From: mhammond@skippinet.com.au (Mark Hammond) Date: Wed, 7 May 2003 10:08:25 +1000 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <5.1.1.6.0.20030506194806.01fc05b0@telecommunity.com> Message-ID: <03f801c3142c$c8603290$530f8490@eden> > Just out of curiosity, what is it that MSVC adds to the > picture over gcc > anyway? Has anybody ever tried making a MinGW-only build of > Python on Windows? Now or then? . "Then" it was the simple matter of no gcc available for Windows. Now, it is a combination of no one driving it, and the simple fact that msvc will almost certainly generate better code and work with almost every library on Windows worth talking to. However, until the "no one driving it" part is solved, the latter, including the impact of mingw, won't be able to be measured. Mark. From gh@ghaering.de Wed May 7 01:31:04 2003 From: gh@ghaering.de (=?ISO-8859-1?Q?Gerhard_H=E4ring?=) Date: Wed, 07 May 2003 02:31:04 +0200 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <5.1.1.6.0.20030506194806.01fc05b0@telecommunity.com> References: <200305061826.h46IQ7605750@odiug.zope.com> <5.1.1.6.0.20030506194806.01fc05b0@telecommunity.com> Message-ID: <3EB853C8.70100@ghaering.de> Phillip J. Eby wrote: > Just out of curiosity, what is it that MSVC adds to the picture over gcc > anyway? Has anybody ever tried making a MinGW-only build of Python on > Windows? I'm working (as time and enthusiasm permits) on making this happen. For this project, I even got commit privileges by the powers that be :-) Getting as far as: C:\src\python\dist\src>python 'import site' failed; use -v for traceback Python 2.3a2+ (#27, Apr 23 2003, 21:13:49) [GCC 3.2.2 (mingw special 20030208-1)] on mingw32_nt-5.11 Type "help", "copyright", "credits" or "license" for more information. >>> isn't much of a problem.
This is a statically linked python.exe built with the autoconf-based build process, msys, mingw and my patches, mostly for posixmodule.c. The difficult part is figuring out the autoconf stuff and distutils, so that the rest of the modules can be built. I didn't get very far on this side, yet :-/ OTOH I'm pretty sure that a mingw build would be much easier if I just wrote my own Makefiles, but that's probably unlikely to ever be merged. At least that was my experience when making PostgreSQL's client code compile with mingw. Their answer was "we don't want to maintain yet another set of proprietary Makefiles", which is a good argument. -- Gerhard From tim.one@comcast.net Wed May 7 01:43:59 2003 From: tim.one@comcast.net (Tim Peters) Date: Tue, 06 May 2003 20:43:59 -0400 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <200305070046.06725.phil@riverbankcomputing.co.uk> Message-ID: [Phil Thompson, on http://www.msdn.microsoft.com/netframework/downloads/howtoget.aspx ] Follow the link, please. I haven't tried it myself, and you've already proved you can read too . From skip@pobox.com Wed May 7 01:52:13 2003 From: skip@pobox.com (Skip Montanaro) Date: Tue, 6 May 2003 19:52:13 -0500 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <3EB853C8.70100@ghaering.de> References: <200305061826.h46IQ7605750@odiug.zope.com> <5.1.1.6.0.20030506194806.01fc05b0@telecommunity.com> <3EB853C8.70100@ghaering.de> Message-ID: <16056.22717.877383.95261@montanaro.dyndns.org> Gerhard> OTOH I'm pretty sure that a mingw build would be much easier if Gerhard> I just wrote my own Makefiles, but that's probably unlikely to Gerhard> ever be merged. At least that was my experience when making Gerhard> PostgreSQL's client code compile with mingw. I suggest you go ahead with whatever is easiest for you. At least you will be able to focus on actually solving the MinGW-related problems. Others can chip in on the autoconf problems.
As a starter perhaps a Makefile.mingw file can be added to the PCBuild directory. At a later date the interim makefile can be removed to the attic. Skip From tim.one@comcast.net Wed May 7 02:17:49 2003 From: tim.one@comcast.net (Tim Peters) Date: Tue, 06 May 2003 21:17:49 -0400 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <200305070057.10872.phil@riverbankcomputing.co.uk> Message-ID: [Phil Thompson] > ... > If it's free, why pay any attention to the offer of a donation of a GUI > frontend? (With a certain amount of irony, I don't attach any > value to a GUI frontend to a compiler.) The GUI isn't just the compiler, it's also the automated dependency analysis, a make system, and a (very good) debugger. > If it is free using some Microsoft definition of the word (eg. > users have to upgrade to Win2K, or some other "read the small print" > reason) then my vote is -1. Guido asked who would want one. You don't, but you don't get to vote that nobody else does either. From BPettersen@NAREX.com Wed May 7 02:21:11 2003 From: BPettersen@NAREX.com (Bjorn Pettersen) Date: Tue, 6 May 2003 19:21:11 -0600 Subject: [Python-Dev] Windows installer request... Message-ID: <60FB8BB7F0EFC7409B75EEEC13E20192022DE2A8@admin56.narex.com> > From: Tim Peters [mailto:tim.one@comcast.net] > > [Bjorn Pettersen] [...] > > item: Get Environment Variable > > Variable=OSDRIVE > > Environment=SystemDrive > > Default=C: > > end [...] > Enough already : I don't have time to try > umpteen different things here, or really even one. Thank you for doing it anyway then . > What I did do is build an installer *just* removing the hard-coded > Wizard-generated "C:" prefix. Martin tried that and said it > worked for him. It doesn't hurt me. If it works for you too, > I'll commit the change: Works like a charm. Tested on Win98, Win2k, WinXP Pro (regular), and my "special" XP. (NT4 seems to have died a silent death, so I couldn't test it there...) > Please give that a try.
It's an incoherent mix of files, so > please use a junk name for the installation directory and program > startup group (or simply abort the install after you see whether > it suggested a drive you approve of). I went all the way through (all files seems to have gone in correctly), and as expected it shadowed my original install of 2.3b1 in the Add/Remove Programs window. Surprisingly however, the original came back after this one was removed. Who'd have thought.. ;-) [.. xx.wse needs the Wise GUI to create an installer..] Thought it might be that way... FWIW, re: the MSVC7 debate, the "Microsoft Development Environment" (DevStudio), comes with five different "Setup and Deployment projects". I've never used any of them, nor Wise (obviously :-), but it could potentially get you out of the loop... . Thanks again! -- bjorn From tim.one@comcast.net Wed May 7 02:23:55 2003 From: tim.one@comcast.net (Tim Peters) Date: Tue, 06 May 2003 21:23:55 -0400 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <3EB83C44.20706@v.loewis.de> Message-ID: [Tim] > Speaking of which, presumably Tcl/Tk and SSL and etc on Windows > should also be compiled under VC7 then. [Martin v. Löwis] > That is certainly the case (not to forget bsddb, zlib, and bzip2). > This will require quite some volunteer time. Amplifying a little, the Python code base required some changes before it would compile under VC 7 (I didn't make these changes, and don't recall any details apart from changes in MS's LONG_INTEGER APIs). There's no reason to believe that other code bases are immune from needing changes too. At present, we don't maintain any patches to any external code base in order to build the Windows release. If we needed to make changes to them for VC 7, that would probably change, and should really be done by the packages' primary (non-Python) maintainers.
From tim.one@comcast.net Wed May 7 02:40:49 2003 From: tim.one@comcast.net (Tim Peters) Date: Tue, 06 May 2003 21:40:49 -0400 Subject: [Python-Dev] Windows installer request... In-Reply-To: <60FB8BB7F0EFC7409B75EEEC13E20192022DE2A8@admin56.narex.com> Message-ID: [Tim] >> Enough already : I don't have time to try >> umpteen different things here, or really even one. [Bjorn Pettersen] > Thank you for doing it anyway then . You're welcome! I took the "C:" out on Monday, when I had just enough spare time to delete one byte, and took the rest out of sleep. > ... > Works like a charm. Tested on Win98, Win2k, WinXP Pro (regular), and my > "special" XP. (NT4 seems to have died a silent death, so I couldn't test > it there...) Thanks! I'll check it in ... Thursday. >> Please give that a try. It's an incoherent mix of files, so >> please use a junk name for the installation directory and program >> startup group (or simply abort the install after you see whether >> it suggested a drive you approve of). > I went all the way through (all files seems to have gone in correctly), > and as expected it shadowed my original install of 2.3b1 in the > Add/Remove Programs window. Surprisingly however, the original came back > after this one was removed. Who'd have thought.. ;-) The rollback features in Wise 8.14-generated installers are pretty good (esp. if you check the "make backups" option when installing). Uninstall/rollback will even restore start menu groups and file associations. I don't trust it enough to recommend it, though (I haven't really beat on it). Something fun to waste time: in the very last "Installation Completed!" install dialog, click "Cancel" instead of "Finish". It will then roll back all the changes it made, leaving things as they were before you started the installer. > ... FWIW, re: the MSVC7 debate, the "Microsoft Development Environment" > (DevStudio), comes with five different "Setup and Deployment projects".
> I've never used any of them, nor Wise (obviously :-), but it could > potentially get you out of the loop... Thanks, but I'm not sure even death has that kind of power. From dave@boost-consulting.com Wed May 7 03:16:25 2003 From: dave@boost-consulting.com (David Abrahams) Date: Tue, 06 May 2003 22:16:25 -0400 Subject: [Python-Dev] Re: MS VC 7 offer References: <008601c31414$a26d0120$21795418@dell1700> <3EB82CF8.4030005@v.loewis.de> Message-ID: "Martin v. Löwis" writes: > Brian Quinlan wrote: > >> Wouldn't this only affect extension modules using PyFile_FromFile and >> PyFile_AsFile? > > That might be the case. However, notice that there might be other > incompatibilities which we might discover by chance only - Microsoft > hasn't documented any of this. They pretty much told you the exact score, through me. More details are available if necessary. -- Dave Abrahams Boost Consulting www.boost-consulting.com From dave@boost-consulting.com Wed May 7 03:20:59 2003 From: dave@boost-consulting.com (David Abrahams) Date: Tue, 06 May 2003 22:20:59 -0400 Subject: [Python-Dev] Re: MS VC 7 offer References: <16056.2317.124886.963460@montanaro.dyndns.org> <007e01c31405$1ea52fc0$21795418@dell1700> Message-ID: Brian Quinlan writes: >> I can see the downside (next to no experience with 7.x, and perhaps >> none >> before the final release). What's the upside? > > It's free and more standards compliant. That compliance means a lot to C++ programmers. It takes MSVC from being a real PITA to do any serious C++ in (VC7.0 was worse than 6 in some ways) to being a first-class contender among quality C++ implementations.
I'm not sure whether that should have any effect on decisions made about Python development, though ;-) -- Dave Abrahams Boost Consulting www.boost-consulting.com From tim.one@comcast.net Wed May 7 03:42:29 2003 From: tim.one@comcast.net (Tim Peters) Date: Tue, 06 May 2003 22:42:29 -0400 Subject: [Python-Dev] Re: heaps In-Reply-To: Message-ID: [David Eppstein] >>> For fairness, it might be interesting to try another run of your test >>> in which the input sequence is sorted in increasing order rather >>> than random. [Tim] >> Comparing the worst case of one against the best case of the >> other isn't my idea of fairness, but sure. [David] > Well, it doesn't seem any fairer to use random data to compare an > algorithm with an average time bound that depends on an assumption of > randomness in the data...anyway, the point was more to understand the > limiting cases. If one algorithm is usually 3x faster than the other, > and is never more than 10x slower, that's better than being usually 3x > faster but sometimes 1000x slower, for instance. Sure. In practice you need to know time distributions when using an algorithm -- best, expected, worst, and how likely each are under a variety of expected conditions. > My Java KBest code was written to make data subsets for a half-dozen web > pages (same data selected according to different criteria). Of these > six instances, one is presented the data in roughly ascending order, one > in descending order, and the other four are less clear but probably not > random. > > Robustness in the face of this sort of variation is why I prefer any > average-case assumptions in my code's performance to depend only on > randomness from a random number generator, and not arbitrariness in the > actual input. But I'm not sure I'd usually be willing to pay a 3x > penalty for that robustness. Most people aren't, until they hit a bad case <0.5 wink>. So "pure" algorithms rarely survive in the face of a large variety of large problem instances.
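The robustness David describes, where the average-case guarantee rests on randomness the algorithm injects itself rather than on the order of the caller's input, can be sketched like this (a hypothetical illustration of the idea, not David's Java KBest):

```python
import random

def n_largest(seq, n):
    """Return the n largest items of seq, sorted ascending.

    Shuffling a copy first makes the expected running time depend only
    on our own random number generator, never on how the input happens
    to be ordered (ascending, descending, or adversarial).
    """
    items = list(seq)
    if n <= 0:
        return []
    if n >= len(items):
        return sorted(items)
    random.shuffle(items)            # inject our own randomness
    k = len(items) - n               # final index of the cutoff element
    lo, hi = 0, len(items) - 1
    while lo < hi:
        # Lomuto partition around items[hi]
        pivot = items[hi]
        store = lo
        for i in range(lo, hi):
            if items[i] < pivot:
                items[store], items[i] = items[i], items[store]
                store += 1
        items[store], items[hi] = items[hi], items[store]
        if store == k:
            break
        elif store < k:
            lo = store + 1
        else:
            hi = store - 1
    return sorted(items[k:])         # the n largest, in ascending order
```

The shuffle costs O(len(seq)) up front; whether that insurance is worth a constant-factor penalty is exactly the trade-off being weighed here.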
The monumental complications Python's list.sort() endures to work well under many conditions (both friendly and hostile) are a good example of that. In industrial real life, I expect an all-purpose N-Best queue would need to take a hybrid approach, monitoring its fast-path gimmick in some cheap way in order to fall back to a more defensive algorithm when the fast-path gimmick isn't paying.

>> Here's a surprise: I coded a variant of the quicksort-like
>> partitioning method, at the bottom of this mail. On the largest-1000
>> of a million random-float case, times were remarkably steady across
>> trials (i.e., using a different set of a million random floats each
>> time):
>>
>> heapq                  0.96 seconds
>> sort (micro-optimized) 3.4  seconds
>> KBest (below)          2.6  seconds

> Huh. You're almost convincing me that asymptotic analysis works even in > the presence of Python's compiled-vs-interpreted anomalies. Indeed, you can't fight the math! It often takes a large problem for better O() behavior to overcome a smaller constant in a worse O() approach, and especially in Python. For example, I once wrote and tuned and timed an O(N) worst-case rank algorithm in Python ("find the k'th smallest item in a sequence"), using the median-of-medians-of-5 business. I didn't have enough RAM at the time to create a list big enough for it to beat "seq.sort(); return seq[k]". By playing lots of tricks, and boosting it to median-of-medians-of-11, IIRC I eventually got it to run faster than sorting on lists with "just" a few hundred thousand elements. But in *this* case I'm not sure that the only thing we're really measuring isn't:

1. Whether an algorithm has an early-out gimmick.

2. How effective that early-out gimmick is.

and

3. How expensive it is to *try* the early-out gimmick.

The heapq method Rulz on random data because its answers then are "yes, very, dirt cheap".
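For reference, the heap-based N-best scheme being timed here can be sketched as follows (a minimal stand-in using the standard heapq module, not the actual benchmark code); the `x > heap[0]` comparison is the inline early-out test under discussion:

```python
import heapq

def n_best(seq, n):
    """Return the n largest items of seq in ascending order."""
    heap = []                        # min-heap holding the current n best
    for x in seq:
        if len(heap) < n:
            heapq.heappush(heap, x)
        elif x > heap[0]:            # early-out: one cheap comparison
            heapq.heapreplace(heap, x)   # drop current minimum, add x
    return sorted(heap)
```

On random data the test fails for almost every item once the heap has warmed up, so most items cost a single comparison: the "yes, very, dirt cheap" behavior described above.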
I wrote the KBest test like so:

    def three(seq, N):
        NBest = KBest(N, -1e200)
        for x in seq:
            NBest.put(x)
        L = NBest.get()
        L.sort()
        return L

(the sort at the end is just so the results can be compared against the other methods, to ensure they all get the same answer). If I break into the abstraction and change the test like so:

    def three(seq, N):
        NBest = KBest(N, -1e200)
        cutoff = -1e200
        for x in seq:
            if x > cutoff:
                NBest.put(x)
                cutoff = NBest.cutoff
        L = NBest.get()
        L.sort()
        return L

then KBest is about 20% *faster* than heapq on random data. Doing the comparison inline avoids a method call when early-out pays, early-out pays more and more as the sequence nears its end, and simply avoiding the method call then makes the overall algorithm 3X faster. So O() analysis may triumph when equivalent low-level speed tricks are played (the heapq method did its early-out test inline too), but get swamped before doing so.

> The other surprise is that (unlike, say, the sort or heapq versions)
> your KBest doesn't look significantly more concise than my earlier Java
> implementation.

The only thing I was trying to minimize was my time in whipping up something correct to measure. Still, I count 107 non-blank, non-comment lines of Java, and 59 of Python. Java gets unduly penalized for curly braces, Python for tedious tricks like

    buf = self.buf
    k = self.k

to make locals for speed, and that I never put dependent code on the same line as an "if" or "while" test (while you do). Note that it's not quite the same algorithm: the Python version isn't restricted to ints, and in particular doesn't assume it can do arithmetic on a key to get "the next larger" key. Instead it does 3-way partitioning to find the items equal to the pivot. The greater generality may make the Python a little windier. BTW, the heapq code isn't really more compact than C, if you count the implementation code in heapq.py too: it's all low-level small-int arithmetic and array indexing.
The only real advantage Python has over assembler for code like that is that we can grow the list/heap dynamically without any coding effort. From martin@v.loewis.de Wed May 7 06:21:46 2003 From: martin@v.loewis.de (Martin v. Löwis) Date: 07 May 2003 07:21:46 +0200 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: References: Message-ID: Tim Peters writes: > There's no reason to believe that other code bases are immune from > needing changes too. OTOH, there is reason to believe that for many of these packages, the required changes have been made already, at least for those that get regular updates (Tcl/Tk, bsddb). Regards, Martin From martin@v.loewis.de Wed May 7 06:23:31 2003 From: martin@v.loewis.de (Martin v. Löwis) Date: 07 May 2003 07:23:31 +0200 Subject: [Python-Dev] Re: MS VC 7 offer In-Reply-To: References: <008601c31414$a26d0120$21795418@dell1700> <3EB82CF8.4030005@v.loewis.de> Message-ID: David Abrahams writes: > They pretty much told you the exact score, through me. More details > are available if necessary. That is information about the core ABI. I do need to be concerned about changes in the libraries, as well, in particular about incompatibilities resulting from multiple copies of the C library. You said you don't know much about that. Regards, Martin From paoloinvernizzi@dmsware.com Wed May 7 08:12:34 2003 From: paoloinvernizzi@dmsware.com (Paolo Invernizzi) Date: Wed, 07 May 2003 09:12:34 +0200 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <03dc01c31424$26758500$530f8490@eden> References: <03dc01c31424$26758500$530f8490@eden> Message-ID: <3EB8B1E2.2050108@dmsware.com> Mark Hammond wrote: >Another thing to consider is the "make" environment. If we don't use >DevStudio, then presumably our existing project files will become useless. >Not a huge problem, but a real one. MSVC exported makefiles are not >designed to be maintained.
>I'm having good success with autoconf and Python >on other projects, but that would raise the barrier to including cygwin in >your build environment. > I think scons (www.scons.org) will have in its next release full support for building targets using VC6 *project* files, and full support for VC7. Actually it has support also for cygwin and mingw... So I think it is possible to have an automated way for building VC7 python based only on some scons script and VC6 project files... The possible goal is to keep working with the VC6 IDE as now, and have a simple build script able to automatically build the VC7 version tracking changes.. I've inserted Greg Spencer, who I know is working on this... surely he can bring us more details. --- Paolo Invernizzi. From phil@riverbankcomputing.co.uk Wed May 7 09:02:46 2003 From: phil@riverbankcomputing.co.uk (Phil Thompson) Date: Wed, 7 May 2003 09:02:46 +0100 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: References: Message-ID: <200305070902.46353.phil@riverbankcomputing.co.uk> On Wednesday 07 May 2003 2:17 am, Tim Peters wrote: > [Phil Thompson] > > > ... > > If it's free, why pay any attention to the offer of a donation of a GUI > > frontend? (With a certain amount of irony, I don't attach any > > value to a GUI frontend to a compiler.) > > The GUI isn't just the compiler, it's also the automated dependency > analysis, a make system, and a (very good) debugger. > > > If it is free using some Microsoft definition of the word (eg. > > users have to upgrade to Win2K, or some other "read the small print" > > reason) then my vote is -1. > > Guido asked who would want one. You don't, but you don't get to vote that > nobody else does either. That's not the point I'm trying to make. If there is a cost to *users* of a change then that change must be managed properly. The statement on Microsoft's web page says... "Non-developers need to install the .NET Framework 1.1 to run applications developed using the .NET Framework 1.1."
The impression I'm getting is that a quick switchover to VC 7 is being suggested - that's what I'm "voting" against. Phil From harri.pasanen@trema.com Wed May 7 09:31:08 2003 From: harri.pasanen@trema.com (Harri Pasanen) Date: Wed, 7 May 2003 10:31:08 +0200 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <3EB81A8B.9090603@v.loewis.de> References: <007f01c31409$3a5f9670$21795418@dell1700> <200305062212.07539.aleax@aleax.it> <3EB81A8B.9090603@v.loewis.de> Message-ID: <200305071031.08474.harri.pasanen@trema.com> On Tuesday 06 May 2003 22:26, Martin v. Löwis wrote: > Alex Martelli wrote: > > When we discussed VC versions (back when we met in Oxford during > > PythonUK/ACCU), David Abrahams seemed adamant that VC7 and VC6 > > are indeed compatible > > I doubt he said this in this generality: he surely knows that you > cannot mix C++ object files on the object file level between those > compilers, as they implement completely different ABIs. > > For Python, the biggest problem is that you cannot pass FILE* from > one C library to the other, because of some stupid locking test in > the C library. This does cause crashes when you try to use Python > extension modules compiled with the wrong compiler. > One known failure case from the real world is the OmniORB free CORBA ORB. The omniidl parser, which is implemented as a mixture of python and C++, requires that python is compiled with the same VC version as you are compiling OmniORB with. So if you are using VC7 to compile OmniORB, you cannot use the binary Python 2.2.2 from pythonlabs for it, you need to compile your own python using VC7. I believe it is the FILE * that is causing the problem here. If I recall correctly, the size of the underlying FILE struct is different in msvcrt.dll and msvcrt7.dll. I don't know the gory details, I just know the cure. This issue was also discussed on the omniORB mailing list. For our own product we have to support both VC6 and VC7.
For our development version we have actually imported python 2.3 to our CVS, and we are compiling it with VC7.1. Our previous release continues to rely on VC6, and Python 2.2.2, so each developer actually has both VC6 and VC7.1 installed on their machine, and correspondingly both python 2.2.2 and python 2.3. Just another datapoint. -Harri From sjoerd@acm.org Wed May 7 09:36:46 2003 From: sjoerd@acm.org (Sjoerd Mullender) Date: Wed, 07 May 2003 10:36:46 +0200 Subject: [Python-Dev] pyconfig.h not regenerated by "config.status --recheck" In-Reply-To: <16056.2688.72423.251200@montanaro.dyndns.org> References: <16055.57554.364845.689049@montanaro.dyndns.org> <16056.2688.72423.251200@montanaro.dyndns.org> Message-ID: <20030507083646.6305F74230@indus.ins.cwi.nl> On Tue, May 6 2003 Skip Montanaro wrote: > > >> This generated pyconfig.h. It would thus appear that config.status > >> shouldn't be used by developers. Apparently one of the other flags > >> it appends to the generated configure command suppresses generation > >> of pyconfig.h (and maybe other files). > > Martin> Can you find out whether this is related to the fact that you > Martin> are building in a separate build directory? > > I just confirmed that it's not related to the separate build directory. > When you run config.status --recheck it reruns your latest configure command > with the extra flags --no-create and --no-recursion. Without rummaging > around in the configure file my guess is the --no-create flag is the > culprit. > > So, a word to the wise: avoid config.status --recheck. I don't agree. Just run ./config.status without arguments after running ./config.status --recheck. That *will* regenerate all files.
-- Sjoerd Mullender From Paul.Moore@atosorigin.com Wed May 7 11:49:36 2003 From: Paul.Moore@atosorigin.com (Moore, Paul) Date: Wed, 7 May 2003 11:49:36 +0100 Subject: [Python-Dev] MS VC 7 offer Message-ID: <16E1010E4581B049ABC51D4975CEDB88619A64@UKDCX001.uk.int.atosorigin.com> From: Guido van Rossum [mailto:guido@python.org] > > If this means that those of us with VC6, and with no plans/reasons to > > upgrade can no longer build our own extensions, this would be a > > disaster. > Part of the offer was: > | Potentially we can even figure out how to enable anyone to > | build Python using the freely downloadable compilers I mentioned > | above... Which is good news (don't get me wrong, I'm glad to see Microsoft supporting open source projects in this way). But wouldn't that imply unoptimised builds? I just checked:

>cl /O2
Microsoft (R) 32-bit C/C++ Standard Compiler Version 13.00.9466 for 80x86
Copyright (C) Microsoft Corporation 1984-2001. All rights reserved.
cl : Command line warning D4029 : optimization is not available in the standard edition compiler

So, specifically, if PythonLabs releases Python 2.3 built with MSVC7, and I want to build the latest version of PIL (maybe because Fredrik hasn't released a binary version yet), do I have no way of getting an optimised build (I pick PIL deliberately, because I guess that image processing would benefit from optimisation, and in the past, PIL binaries have been relatively hard to obtain at times)? That's the problem I see, personally. I have VC6 because my employer uses Visual Studio for Visual Basic development. But VB has changed so much in the transition to .NET, that I don't believe they will ever go to VS7. So I will have to remain with VS6 (I'm never going to buy VS7 myself, just for this sort of job). Paul.
From mhammond@skippinet.com.au Wed May 7 11:52:21 2003 From: mhammond@skippinet.com.au (Mark Hammond) Date: Wed, 07 May 2003 20:52:21 +1000 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <200305070902.46353.phil@riverbankcomputing.co.uk> Message-ID: <046701c31486$bdac6170$530f8490@eden> > "Non-developers need to install the .NET Framework 1.1 to run > applications developed using the .NET Framework 1.1." MSVC7 is not the .NET framework. Let's just relax a little and have some faith in the people making these decisions. Mark. From mhammond@skippinet.com.au Wed May 7 11:55:48 2003 From: mhammond@skippinet.com.au (Mark Hammond) Date: Wed, 7 May 2003 20:55:48 +1000 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <16E1010E4581B049ABC51D4975CEDB88619A64@UKDCX001.uk.int.atosorigin.com> Message-ID: <046a01c31487$399d3390$530f8490@eden> > That's the problem I see, personally. I have VC6 because my > employer uses Visual Studio for Visual Basic development. > But VB has changed so much in > the transition to .NET, that I don't believe they will ever > going to VS7. So I will have to remain with VS6 (I'm never > going to buy VS7 myself, just for this sort of job). I must say that anecdotally, I find this to be true. Developers are *not* flocking to VC7. I wonder if that fact has anything to do with MS offering free compilers? Maybe we could get 100 free versions out of them. Mark.
From dave@boost-consulting.com Wed May 7 12:06:22 2003 From: dave@boost-consulting.com (David Abrahams) Date: Wed, 07 May 2003 07:06:22 -0400 Subject: [Python-Dev] Re: MS VC 7 offer In-Reply-To: (Martin v. Löwis's message of "07 May 2003 07:23:31 +0200") References: <008601c31414$a26d0120$21795418@dell1700> <3EB82CF8.4030005@v.loewis.de> Message-ID: martin@v.loewis.de (Martin v. Löwis) writes: > David Abrahams writes: > >> They pretty much told you the exact score, through me. More details >> are available if necessary. > > That is information about the core ABI. I do need to be concerned > about changes in the libraries, as well, in particular about > incompatibilities resulting from multiple copies of the C library. You > said you don't know much about that. I can find out almost as easily, if you have specific questions.
Just let me know, -- Dave Abrahams Boost Consulting www.boost-consulting.com From mwh@python.net Wed May 7 12:31:58 2003 From: mwh@python.net (Michael Hudson) Date: Wed, 07 May 2003 12:31:58 +0100 Subject: [Python-Dev] pyconfig.h not regenerated by "config.status --recheck" In-Reply-To: <16056.2688.72423.251200@montanaro.dyndns.org> (Skip Montanaro's message of "Tue, 6 May 2003 14:18:24 -0500") References: <16055.57554.364845.689049@montanaro.dyndns.org> <16056.2688.72423.251200@montanaro.dyndns.org> Message-ID: <2m65on6lht.fsf@starship.python.net> Skip Montanaro writes: > So, a word to the wise: avoid config.status --recheck. I don't know if I'm wise or not but I do tend to go for rm -rf build && mkdir build && cd build && ../configure -q && make -s for most rebuilds... I guess I should trust my tools a bit more. Cheers, M. -- The meaning of "brunch" is as yet undefined. -- Simon Booth, ucam.chat From skip@pobox.com Wed May 7 12:42:21 2003 From: skip@pobox.com (Skip Montanaro) Date: Wed, 7 May 2003 06:42:21 -0500 Subject: [Python-Dev] pyconfig.h not regenerated by "config.status --recheck" In-Reply-To: <2m65on6lht.fsf@starship.python.net> References: <16055.57554.364845.689049@montanaro.dyndns.org> <16056.2688.72423.251200@montanaro.dyndns.org> <2m65on6lht.fsf@starship.python.net> Message-ID: <16056.61725.602991.181703@montanaro.dyndns.org> >> So, a word to the wise: avoid config.status --recheck. Michael> I don't know if I'm wise or not but I do tend to go for Michael> rm -rf build && mkdir build && cd build && ../configure -q && make -s Michael> for most rebuilds... I guess I should trust my tools a bit Michael> more. I got in the habit of using config.status --recheck because it allowed me to only remember a single configure-like command for most packages I build/install using configure. I only had to figure out what flags to pass to configure once, then later typing "C-r rech" in bash was sufficient to reconfigure the package.
It would be nice if config.status had a flag which actually executed configure without the --no-create and --no-recursion flags. Someone mentioned invoking config.status without the --recheck flag. I don't think that's wise in a development environment since that doesn't actually run configure. Since we're talking about building Python in a development environment, I find it hard to believe you'd want to skip configure altogether. Skip From Jack.Jansen@cwi.nl Wed May 7 14:08:44 2003 From: Jack.Jansen@cwi.nl (Jack Jansen) Date: Wed, 7 May 2003 15:08:44 +0200 Subject: [Python-Dev] bsddb185 module changes checked in In-Reply-To: <16056.8215.274307.904009@montanaro.dyndns.org> Message-ID: <085D82A5-808D-11D7-A6E2-0030655234CE@cwi.nl> On Tuesday, May 6, 2003, at 22:50 Europe/Amsterdam, Skip Montanaro wrote: > > The various bits necessary to implement the "build bsddb185 when > appropriate" have been checked in. I'm pretty sure I don't have the > best > possible test for the existence of a db library, but it will have to > do for > now. I suspect others can clean it up later during the beta cycle. > The > current detection code in setup.py should work for Nick on OSF/1 and > for > platforms which don't require a separate db library. > > I'd appreciate some extra pounding on this code. On SGI Irix 6.5 (MIPSpro Compilers: Version 7.2.1) it tries to build it, and fails. It complains about "u_int" and such not being defined. There's magic at the top of /usr/include/db.h for defining various types optionally, and that's as far as my understanding went. 
-- Jack Jansen, , http://www.cwi.nl/~jack If I can't dance I don't want to be part of your revolution -- Emma Goldman From andymac@bullseye.apana.org.au Wed May 7 10:18:41 2003 From: andymac@bullseye.apana.org.au (Andrew MacIntyre) Date: Wed, 7 May 2003 20:18:41 +1100 (edt) Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <3EB853C8.70100@ghaering.de> Message-ID: On Wed, 7 May 2003, Gerhard Häring wrote: > OTOH I'm pretty sure that a mingw build would be much easier if I just > wrote my own Makefiles, but that's probably unlikely to ever be merged. I'm maintaining the EMX port in a subdirectory of the PC directory (in CVS), and it is (basically) the way the MSVC build is being maintained - if you consider Visual Studio project files as abstract makefiles. -- Andrew I MacIntyre "These thoughts are mine alone..." E-mail: andymac@bullseye.apana.org.au | Snail: PO Box 370 andymac@pcug.org.au | Belconnen ACT 2616 Web: http://www.andymac.org/ | Australia From duncan@rcp.co.uk Wed May 7 14:30:14 2003 From: duncan@rcp.co.uk (Duncan Booth) Date: Wed, 7 May 2003 14:30:14 +0100 Subject: [Python-Dev] Microsoft speedup Message-ID: I was just playing around with the compiler options using Microsoft VC6 and I see that adding the option /Ob2 speeds up pystone by about 2.5% (/Ob2 is the option to automatically inline functions where the compiler thinks it is worthwhile.) The downside is that it increases the size of python23.dll by about 13%. It's not a phenomenal speedup, but it should be pretty low impact if the extra size is considered a worthwhile tradeoff. I haven't checked yet with VC7, but the compiler options are the same so the effect should also be similar. -- Duncan Booth duncan@rcp.co.uk int month(char *p){return(124864/((p[0]+p[1]-p[2]&0x1f)+1)%12)["\5\x8\3" "\6\7\xb\1\x9\xa\2\0\4"];} // Who said my code was obscure?
From sjoerd@acm.org Wed May 7 15:14:19 2003 From: sjoerd@acm.org (Sjoerd Mullender) Date: Wed, 07 May 2003 16:14:19 +0200 Subject: [Python-Dev] pyconfig.h not regenerated by "config.status --recheck" In-Reply-To: <16056.61725.602991.181703@montanaro.dyndns.org> References: <16055.57554.364845.689049@montanaro.dyndns.org> <16056.2688.72423.251200@montanaro.dyndns.org> <2m65on6lht.fsf@starship.python.net> <16056.61725.602991.181703@montanaro.dyndns.org> Message-ID: <20030507141419.B87AA74230@indus.ins.cwi.nl> On Wed, May 7 2003 Skip Montanaro wrote: > > >> So, a word to the wise: avoid config.status --recheck. > > Michael> I don't know if I'm wise or not but I do tend to go for > > Michael> rm -rf build && mkdir build && cd build && ../configure -q && make -s > > Michael> for most rebuilds... I guess I should trust my tools a bit > Michael> more. > > I got in the habit of using config.status --recheck because it allowed me to > only remember a single configure-like command for most packages I > build/install using configure. I only had to figure out what flags to pass > to configure once, then later typing "C-r rech" in bash was sufficient to > reconfigure the package. It would be nice if config.status had a flag which > actually executed configure without the --no-create and --no-recursion > flags. > > Someone mentioned invoking config.status without the --recheck flag. I > don't think that's wise in a development environment since that doesn't > actually run configure. Since we're talking about building Python in a > development environment, I find it hard to believe you'd want to skip > configure altogether. I mentioned that. But I also said to do that after running with the --recheck flag. In fact, I use the bit

Makefile: Makefile.in config.h.in config.status
	./config.status

config.status: configure
	./config.status --recheck

in some of my makefiles. I just type "make Makefile" and it does all it needs to do.
-- Sjoerd Mullender From skip@pobox.com Wed May 7 15:24:46 2003 From: skip@pobox.com (Skip Montanaro) Date: Wed, 7 May 2003 09:24:46 -0500 Subject: [Python-Dev] odd interpreter feature Message-ID: <16057.5934.556547.671279@montanaro.dyndns.org> I was editing the tutorial just now and noticed the secondary prompt (...) in a situation where I didn't think it was appropriate: >>> # The argument of repr() may be any Python object: ... repr(x, y, ('spam', 'eggs')) "(32.5, 40000, ('spam', 'eggs'))" It's caused by the trailing colon at the end of the comment. I verified it using current CVS: >>> hello = 'hello, world\n' hellos = repr(hello) print hellos 'hello, world\n' >>> # hello: ... >>> Shouldn't the trailing colon be ignored in comments? Bug, feature or wart? Skip From mwh@python.net Wed May 7 15:37:37 2003 From: mwh@python.net (Michael Hudson) Date: Wed, 07 May 2003 15:37:37 +0100 Subject: [Python-Dev] odd interpreter feature In-Reply-To: <16057.5934.556547.671279@montanaro.dyndns.org> (Skip Montanaro's message of "Wed, 7 May 2003 09:24:46 -0500") References: <16057.5934.556547.671279@montanaro.dyndns.org> Message-ID: <2mwuh26cwe.fsf@starship.python.net> Skip Montanaro writes: > I was editing the tutorial just now and noticed the secondary prompt (...) > in a situation where I didn't think it was appropriate: > > >>> # The argument of repr() may be any Python object: > ... repr(x, y, ('spam', 'eggs')) > "(32.5, 40000, ('spam', 'eggs'))" > > It's caused by the trailing colon at the end of the comment. Python 2.3b1+ (#1, May 6 2003, 18:00:11) [GCC 2.96 20000731 (Red Hat Linux 7.2 2.96-112.7.2)] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> # no it's not ... Cheers, M. -- The Internet is full. Go away. -- http://www.disobey.com/devilshat/ds011101.htm From amk@amk.ca Wed May 7 15:35:39 2003 From: amk@amk.ca (A.M. 
Kuchling) Date: Wed, 07 May 2003 10:35:39 -0400 Subject: [Python-Dev] Re: odd interpreter feature In-Reply-To: <16057.5934.556547.671279@montanaro.dyndns.org> References: <16057.5934.556547.671279@montanaro.dyndns.org> Message-ID: Skip Montanaro wrote: > It's caused by the trailing colon at the end of the comment. No, it's just the comment. >>> # hello ... print 'foo' foo >>> --amk From tim.one@comcast.net Wed May 7 15:42:22 2003 From: tim.one@comcast.net (Tim Peters) Date: Wed, 07 May 2003 10:42:22 -0400 Subject: [Python-Dev] odd interpreter feature In-Reply-To: <16057.5934.556547.671279@montanaro.dyndns.org> Message-ID: [Skip Montanaro] > I was editing the tutorial just now and noticed the secondary prompt (...) > in a situation where I didn't think it was appropriate: > > >>> # The argument of repr() may be any Python object: > ... repr(x, y, ('spam', 'eggs')) > "(32.5, 40000, ('spam', 'eggs'))" > > It's caused by the trailing colon at the end of the comment. I > verified it using current CVS: > > >>> hello = 'hello, world\n' hellos = repr(hello) print hellos > 'hello, world\n' > >>> # hello: > ... > >>> > > Shouldn't the trailing colon be ignored in comments? Bug, > feature or wart? This changed at some very early point in Python's life. I don't think the trailing colon is relevant: >>> 1+2 3 >>> # hello ... >>> From skip@pobox.com Wed May 7 15:51:10 2003 From: skip@pobox.com (Skip Montanaro) Date: Wed, 7 May 2003 09:51:10 -0500 Subject: [Python-Dev] odd interpreter feature In-Reply-To: <2mwuh26cwe.fsf@starship.python.net> References: <16057.5934.556547.671279@montanaro.dyndns.org> <2mwuh26cwe.fsf@starship.python.net> Message-ID: <16057.7518.148868.168522@montanaro.dyndns.org> >>> # no it's not ... Damn, thanks... I guess the question still remains though, should the secondary prompt be issued after a comment? Skip From fdrake@acm.org Wed May 7 15:55:45 2003 From: fdrake@acm.org (Fred L. Drake, Jr.) 
Date: Wed, 7 May 2003 10:55:45 -0400 Subject: [Python-Dev] odd interpreter feature In-Reply-To: References: <16057.5934.556547.671279@montanaro.dyndns.org> Message-ID: <16057.7793.975960.566995@grendel.zope.com> Tim Peters writes: > This changed at some very early point in Python's life. I don't think the > trailing colon is relevant: > > >>> 1+2 > 3 > >>> # hello > ... > >>> I think this is also a point on which Python and Jython differ, but I don't have Jython installed anywhere nearby to test with. -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From guido@python.org Wed May 7 16:02:21 2003 From: guido@python.org (Guido van Rossum) Date: Wed, 07 May 2003 11:02:21 -0400 Subject: [Python-Dev] odd interpreter feature In-Reply-To: "Your message of Wed, 07 May 2003 09:24:46 CDT." <16057.5934.556547.671279@montanaro.dyndns.org> References: <16057.5934.556547.671279@montanaro.dyndns.org> Message-ID: <200305071502.h47F2LK03176@pcp02138704pcs.reston01.va.comcast.net> > I was editing the tutorial just now and noticed the secondary prompt (...) > in a situation where I didn't think it was appropriate: > > >>> # The argument of repr() may be any Python object: > ... repr(x, y, ('spam', 'eggs')) > "(32.5, 40000, ('spam', 'eggs'))" > > It's caused by the trailing colon at the end of the comment. I verified it > using current CVS: > > >>> hello = 'hello, world\n' hellos = repr(hello) print hellos > 'hello, world\n' > >>> # hello: > ... > >>> > > Shouldn't the trailing colon be ignored in comments? Bug, feature or wart? It's not the trailing colon. Any line that consists of only a comment does this: >>> >>> # foo ... >>> # foo ... >>> 12 # foo 12 >>> And yes, it's a wart, but I don't know how to fix it. 
--Guido van Rossum (home page: http://www.python.org/~guido/) From thomas@xs4all.net Wed May 7 16:16:20 2003 From: thomas@xs4all.net (Thomas Wouters) Date: Wed, 7 May 2003 17:16:20 +0200 Subject: [Python-Dev] odd interpreter feature In-Reply-To: <16057.7793.975960.566995@grendel.zope.com> References: <16057.5934.556547.671279@montanaro.dyndns.org> <16057.7793.975960.566995@grendel.zope.com> Message-ID: <20030507151620.GI26254@xs4all.nl> On Wed, May 07, 2003 at 10:55:45AM -0400, Fred L. Drake, Jr. wrote: > Tim Peters writes: > > This changed at some very early point in Python's life. I don't think the > > trailing colon is relevant: > > > > >>> 1+2 > > 3 > > >>> # hello > > ... > > >>> > I think this is also a point on which Python and Jython differ, but I > don't have Jython installed anywhere nearby to test with. I do: debian:~ > jython Jython 2.1 on java1.1.8 (JIT: null) Type "copyright", "credits" or "license" for more information. >>> 1+2 3 >>> # hello >>> ^D (This is why I use Debian... 'apt-get install jython' :-) -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From skip@pobox.com Wed May 7 16:16:29 2003 From: skip@pobox.com (Skip Montanaro) Date: Wed, 7 May 2003 10:16:29 -0500 Subject: [Python-Dev] odd interpreter feature In-Reply-To: <200305071502.h47F2LK03176@pcp02138704pcs.reston01.va.comcast.net> References: <16057.5934.556547.671279@montanaro.dyndns.org> <200305071502.h47F2LK03176@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <16057.9037.913362.225855@montanaro.dyndns.org> Guido> And yes, it's a wart, but I don't know how to fix it. I did a little digging and noticed this comment dating from v 2.5 (Jul 91): /* Lines with only whitespace and/or comments shouldn't affect the indentation and are not passed to the parser as NEWLINE tokens, except *totally* empty lines in interactive mode, which signal the end of a command group. 
*/ Not surprisingly, given the age of the change, your fingerprints are all over it. ;-) I suspect if the code beneath that comment was executed only when the indentation level is zero we'd be okay, but I don't know if the tokenizer has that sort of information available. I'll do a little more poking around. Skip From akuchlin@mems-exchange.org Wed May 7 16:19:16 2003 From: akuchlin@mems-exchange.org (Andrew Kuchling) Date: Wed, 07 May 2003 11:19:16 -0400 Subject: [Python-Dev] Relying on ReST in the core? Message-ID: For PEP 314, it's been suggested that the Description field be written in RestructuredText. This change doesn't affect the Distutils code, because the Distutils just takes this field and copies it into an output file; programs using the metadata defined in PEP 314 would have to be able to process ReST, though. I know the plan is to eventually add ReST/docutils to the standard library, and that this isn't happening for Python 2.3. Question: is it OK to make something in the core implicitly depend on ReST before ReST is in the core? Until docutils is added, there's always the risk that we decide to never add ReST to the core, or ReST 2.0 changes the format completely, or we decide XYZ is much better, or something like that. --amk (www.amk.ca) IAGO: Poor and content is rich and rich enough. -- _Othello_, III, iii From guido@python.org Wed May 7 16:32:39 2003 From: guido@python.org (Guido van Rossum) Date: Wed, 07 May 2003 11:32:39 -0400 Subject: [Python-Dev] Relying on ReST in the core? In-Reply-To: "Your message of Wed, 07 May 2003 11:19:16 EDT." References: Message-ID: <200305071532.h47FWdX03514@pcp02138704pcs.reston01.va.comcast.net> > For PEP 314, it's been suggested that the Description field > be written in RestructuredText. This change doesn't affect the > Distutils code, because the Distutils just takes this field and > copies it into an output file; programs using the metadata defined > in PEP 314 would have to be able to process ReST, though. 
> > I know the plan is to eventually add ReST/docutils to the standard > library, and that this isn't happening for Python 2.3. Question: is > it OK to make something in the core implicitly depend on ReST before > ReST is in the core? Until docutils is added, there's always the risk > that we decide to never add ReST to the core, or ReST 2.0 changes the > format completely, or we decide XYZ is much better, or something like > that. I think it's okay to make this a recommendation, with the suggestion to be conservative in using reST features. Since a description is usually only a paragraph long, I think that should be okay. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Wed May 7 16:33:31 2003 From: guido@python.org (Guido van Rossum) Date: Wed, 07 May 2003 11:33:31 -0400 Subject: [Python-Dev] odd interpreter feature In-Reply-To: "Your message of Wed, 07 May 2003 10:16:29 CDT." <16057.9037.913362.225855@montanaro.dyndns.org> References: <16057.5934.556547.671279@montanaro.dyndns.org> <200305071502.h47F2LK03176@pcp02138704pcs.reston01.va.comcast.net> <16057.9037.913362.225855@montanaro.dyndns.org> Message-ID: <200305071533.h47FXVf03526@pcp02138704pcs.reston01.va.comcast.net> > Guido> And yes, it's a wart, but I don't know how to fix it. > > I did a little digging and noticed this comment dating from v 2.5 (Jul 91): > > /* Lines with only whitespace and/or comments > shouldn't affect the indentation and are > not passed to the parser as NEWLINE tokens, > except *totally* empty lines in interactive > mode, which signal the end of a command group. */ > > Not surprisingly, given the age of the change, your fingerprints are all > over it. ;-) > > I suspect if the code beneath that comment was executed only when the > indentation level is zero we'd be okay, but I don't know if the tokenizer > has that sort of information available. I'll do a little more poking > around. Please do. 
The indentation level should be easily available, since it is computed by the tokenizer. --Guido van Rossum (home page: http://www.python.org/~guido/) From skip@pobox.com Wed May 7 17:15:35 2003 From: skip@pobox.com (Skip Montanaro) Date: Wed, 7 May 2003 11:15:35 -0500 Subject: [Python-Dev] odd interpreter feature In-Reply-To: <200305071533.h47FXVf03526@pcp02138704pcs.reston01.va.comcast.net> References: <16057.5934.556547.671279@montanaro.dyndns.org> <200305071502.h47F2LK03176@pcp02138704pcs.reston01.va.comcast.net> <16057.9037.913362.225855@montanaro.dyndns.org> <200305071533.h47FXVf03526@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <16057.12583.500034.130135@montanaro.dyndns.org> Guido> Please do. The indentation level should be easily available, Guido> since it is computed by the tokenizer. Alas, it's more complicated than just the indentation level of the current line. I need to know if the previous line was indented, which I don't think the tokenizer knows (at least examining *tok in gdb under various conditions suggests it doesn't). I see the following possible cases (there are perhaps more, but I think they are similar enough to ignore here): >>> if x == y: ... # hello ... pass ... >>> if x == y: ... x = 1 ... # hello ... pass ... >>> x = 1 >>> # hello ... >>> Only the last case should display the primary prompt after the comment is entered. The other two correctly display the secondary prompt. It's distinguishing the second and third cases in the tokenizer without help from the parser that's the challenge. Oh well. Perhaps it's a wart best left alone. Skip From brian@sweetapp.com Wed May 7 18:02:19 2003 From: brian@sweetapp.com (Brian Quinlan) Date: Wed, 07 May 2003 10:02:19 -0700 Subject: [Python-Dev] Re: MS VC 7 offer In-Reply-To: Message-ID: <010501c314ba$6b8dbef0$21795418@dell1700> > > That is information about the core ABI. 
I do need to be concerned
> > about changes in the libraries, as well, in particular about
> > incompatibilities resulting from multiple copies of the C library.
> > You said you don't know much about that.
>
> I can find out almost as easily, if you have specific questions.

But the actual question that we would like to answer is quite broad: what are all of the possible compatibility problems associated with using a VC6 compiled DLL with a VC7 compiled application?

Assuming that only changed runtime data structures are going to be a problem, knowing which ones cannot be passed between the two versions would be nice. Below is a list of the standard types defined by Microsoft's VC6 runtime library (taken from the VC6 docs):

clock_t
_complex
_dev_t
div_t, ldiv_t
_exception
FILE
_finddata_t, _wfinddata_t, _wfinddatai64_t
_FPIEEE_RECORD
fpos_t
_HEAPINFO
jmp_buf
lconv
_off_t
_onexit_t
_PNH
ptrdiff_t
sig_atomic_t
size_t
_stat
time_t
_timeb
tm
_utimbuf
va_list
wchar_t
wctrans_t
wctype_t
wint_t

Cheers, Brian

From greg_spencer@acm.org Wed May 7 17:58:03 2003 From: greg_spencer@acm.org (Greg Spencer) Date: Wed, 7 May 2003 10:58:03 -0600 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <3EB8B1E2.2050108@dmsware.com> Message-ID:

Well, I'm almost done with the SCons integration for both VC6 and VC7. Just some tests to write and integration into the current codeline to do. Paolo, I'm not sure what you mean by "full" support for VC7, but here's what I'm working on:

1) SCons writes out and maintains (as a "product" of the build) a .dsp and .dsw file for VC6, or an .sln and .vcproj file for VC7.

2) The project and solution files contain "External Makefile" targets, which in MSVC means that it will launch an external command when the "build" button is pressed.

3) The project files contain all of the sources configured in the SCons file, and you can include as many additional files as you would like.
The SConscript file that generated the .dsp or .vcproj file is automatically included in the source list so you can edit it from the IDE. With this scheme, you can browse the class hierarchy, edit resource files, build the project, double-click on errors (if any :-), edit source files from the IDE, launch the executable (if any) in the debugger, lather, rinse, repeat. The build is then completely controlled by the Python SConscripts, with the full flexibility that offers, and the project files are now just products of the build that will be blown away and regenerated any time they need to be rebuilt. The only things I've discovered that you can't do with this scheme are insert ActiveX controls (because the menu items are disabled) and build individual object files. At first glance, it seems like the logical choice for VS integration is to build a plugin to Visual Studio, but for VS6, there aren't really enough trigger events to capture the appropriate information at the right times, so it's not really feasible. For VS7, I think things are much more promising in the plugin department, but truthfully, I'm not sure there's much added value. You could insert new ActiveX controls with the wizard and build individual files, sure. But do you really want to change build settings from within the IDE's dialogs? I haven't really decided how this would even work. Probably you'd need a third configuration file that both the VS7 tool and the SConscript could share so that they could get their build setting information. Yet another config file, and now you'd have to keep the .sln and .vcproj files too, making a total of four files that control the build. They'd be in sync, but one file is always better than four. Also, this only works for VS7, and it's complex. I'm still considering a VS7 plugin as a possible future direction, but I need some compelling reasons to do it. 
I've used the "External Makefile" scheme with classic Cons for four years now, and I haven't had any major complaints from anyone -- they're just overjoyed that their build is automated and "just works", and they can still use the IDE for 90% of what they used it for before. Not to mention all the benefits of using a build system like SCons (centralized setting of build parameters for all projects, for instance). I hope that addresses your needs. If you have suggestions or questions, feel free to e-mail me. BTW, I don't subscribe to python-dev, so be sure to CC me in this thread. -Greg. P.S. Thanks for creating a language that a Perl guy can learn in a week. And I thought shifting from classic cons to scons would be hard... :-) -----Original Message----- From: Paolo Invernizzi [mailto:paoloinvernizzi@dmsware.com] Sent: Wednesday, May 07, 2003 1:13 AM To: python-dev@python.org Cc: Mark Hammond; greg_spencer@acm.org Subject: Re: [Python-Dev] MS VC 7 offer Mark Hammond wrote: >Another thing to consider is the "make" environment. If we don't use >DevStudio, then presumably our existing project files will become useless. >Not a huge problem, but a real one. MSVC exported makefiles are not >designed to be maintained. I'm having good success with autoconf and Python >on other projects, but that would raise the barrier to including cygwin in >your build environment. > I think the scons (www.scons.org) will have in its next release full support for building targets using VC6 *project* file, and full support for VC7. Actually it has support also for cygwin and mingw... So I think is possible to have an automated way for building VC7 python based only on some scons script and VC6 project files... The possible goal is to keep working with VC6 IDE as now, and have a simple build script able to automatically build the VC7 version tracking changes.. I've inserted Greg Spencer, who I know is working on this... surely he can bring us more details. --- Paolo Invernizzi. 
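The scheme Greg describes keeps the IDE project files as disposable build products; the build itself is just an ordinary SCons description. As a rough sketch only (the file and target names here are invented, and this is not Greg's actual integration code), the SCons side can be as small as:

```python
# SConstruct -- hypothetical minimal build.  In Greg's scheme, the
# generated .dsp/.vcproj carries an "External Makefile" target that
# simply re-runs `scons` against this file when the IDE's build
# button is pressed.
env = Environment()            # picks up the installed MSVC toolchain
env.Program('myapp', ['main.c', 'util.c'])
```

Because the project files are regenerated from this one description, build settings live in exactly one place, while the IDE keeps class browsing, error navigation, and the debugger.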
From jepler@unpythonic.net Wed May 7 18:06:18 2003 From: jepler@unpythonic.net (Jeff Epler) Date: Wed, 7 May 2003 12:06:18 -0500 Subject: [Python-Dev] Startup time In-Reply-To: References: <200305061207.h46C7sj26462@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <20030507170618.GI27125@unpythonic.net>

On Tue, May 06, 2003 at 07:35:40PM +0200, Martin v. Löwis wrote:
> That would be easy to determine: Just disable the block
>
> #if defined(Py_USING_UNICODE) && defined(HAVE_LANGINFO_H) && defined(CODESET)
>
> in pythonrun.c, and see whether it changes anything. To my knowledge,
> this is the only cause of loading encodings during startup on Unix.

With this change, I typically see

real 0m0.020s
user 0m0.020s
sys 0m0.000s

instead of

real 0m0.022s
user 0m0.020s
sys 0m0.000s

The number of successful open()s decreases, but not by much:

# before change
$ strace -e open ./python-2.3 -S -c pass 2>&1 | grep -v ENOENT | wc -l
46
# after change
$ strace -e open ./python -S -c pass 2>&1 | grep -v ENOENT | wc -l
39

What about this line? It seems to be the cause of a bunch of imports, including the sre stuff:

/* pythonrun.c */
PyModule_WarningsModule = PyImport_ImportModule("warnings");

Jeff

From patmiller@llnl.gov Wed May 7 18:25:57 2003 From: patmiller@llnl.gov (Pat Miller) Date: Wed, 07 May 2003 10:25:57 -0700 Subject: [Python-Dev] odd interpreter feature Message-ID: <3EB941A5.5070003@llnl.gov>

Skip writes:
> >>> # hello:
> ...
> >>>
>
> Shouldn't the trailing colon be ignored in comments? Bug, feature or wart?

I figured it was a feature... Taking the view that any source block asks for continuations seemed natural, so I assumed Guido intended it that way ;-) If the comments were active objects (like doc strings), then it would be the desired association.

>>> # About to do something tricky
...
tricky()
>>>

Pat

-- Patrick Miller | (925) 423-0309 | http://www.llnl.gov/CASC/people/pmiller All you need in this life is ignorance and confidence, and then success is sure. -- Mark Twain

From martin@v.loewis.de Wed May 7 18:48:41 2003 From: martin@v.loewis.de (Martin v. Löwis) Date: 07 May 2003 19:48:41 +0200 Subject: [Python-Dev] bsddb185 module changes checked in In-Reply-To: <085D82A5-808D-11D7-A6E2-0030655234CE@cwi.nl> References: <085D82A5-808D-11D7-A6E2-0030655234CE@cwi.nl> Message-ID:

Jack Jansen writes:
> On SGI Irix 6.5 (MIPSpro Compilers: Version 7.2.1) it tries to build
> it, and fails. It complains about "u_int" and such not being
> defined. There's magic at the top of /usr/include/db.h for defining
> various types optionally, and that's as far as my understanding
> went.

I would not be worried about that too much. An Irix user who cares about that will propose a solution, if there are Irix users who care about that.

Regards, Martin

From greg_spencer@acm.org Wed May 7 19:25:11 2003 From: greg_spencer@acm.org (Greg Spencer) Date: Wed, 7 May 2003 12:25:11 -0600 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <3EB8B1E2.2050108@dmsware.com> Message-ID:

Actually, on re-reading your mail, I realize that you might just be talking about getting VC7 to work well with SCons (since it currently only knows about how to find VC6). I've got that part done, and it'll be in with the project file stuff.

-Greg.

-----Original Message----- From: Paolo Invernizzi [mailto:paoloinvernizzi@dmsware.com] Sent: Wednesday, May 07, 2003 1:13 AM To: python-dev@python.org Cc: Mark Hammond; greg_spencer@acm.org Subject: Re: [Python-Dev] MS VC 7 offer

Mark Hammond wrote:
>Another thing to consider is the "make" environment. If we don't use
>DevStudio, then presumably our existing project files will become useless.
>Not a huge problem, but a real one. MSVC exported makefiles are not
>designed to be maintained.
I'm having good success with autoconf and Python >on other projects, but that would raise the barrier to including cygwin in >your build environment. > I think the scons (www.scons.org) will have in its next release full support for building targets using VC6 *project* file, and full support for VC7. Actually it has support also for cygwin and mingw... So I think is possible to have an automated way for building VC7 python based only on some scons script and VC6 project files... The possible goal is to keep working with VC6 IDE as now, and have a simple build script able to automatically build the VC7 version tracking changes.. I've inserted Greg Spencer, who I know is working on this... surely he can bring us more details. --- Paolo Invernizzi. From jepler@unpythonic.net Wed May 7 19:30:26 2003 From: jepler@unpythonic.net (Jeff Epler) Date: Wed, 7 May 2003 13:30:26 -0500 Subject: [Python-Dev] Startup time In-Reply-To: <20030507170618.GI27125@unpythonic.net> References: <200305061207.h46C7sj26462@pcp02138704pcs.reston01.va.comcast.net> <20030507170618.GI27125@unpythonic.net> Message-ID: <20030507183025.GJ27125@unpythonic.net> On Wed, May 07, 2003 at 12:06:18PM -0500, Jeff Epler wrote: > What about this line? It seems to be the cause of a bunch of imports, > including the sre stuff: > /* pythonrun.c */ > PyModule_WarningsModule = PyImport_ImportModule("warnings"); With this *and* the unicode stuff removed, I see runtimes like this: $ time ./python -S -c pass real 0m0.008s user 0m0.010s sys 0m0.000s and opens are nearly down to 2.2 levels: $ strace -e open ./python -S -c pass 2>&1 | grep -v ENOENT | wc 11 44 489 $ strace -e open /usr/bin/python -S -c pass 2>&1 | grep -v ENOENT | wc 8 32 355 (the differences are libstdc++, libgcc_s, and librt) With *just* the import of warnings removed, I get this: $ time ./python -S -c pass real 0m0.017s user 0m0.010s sys 0m0.010s .. and the input of sre is back. 
I guess it's used in both warnings.py and encodings/__init__.py Jeff From jepler@unpythonic.net Wed May 7 19:52:46 2003 From: jepler@unpythonic.net (Jeff Epler) Date: Wed, 7 May 2003 13:52:46 -0500 Subject: [Python-Dev] Startup time In-Reply-To: <20030507183025.GJ27125@unpythonic.net> References: <200305061207.h46C7sj26462@pcp02138704pcs.reston01.va.comcast.net> <20030507170618.GI27125@unpythonic.net> <20030507183025.GJ27125@unpythonic.net> Message-ID: <20030507185245.GL27125@unpythonic.net> On Wed, May 07, 2003 at 01:30:26PM -0500, Jeff Epler wrote: > .. and the input of sre is back. I guess it's used in both warnings.py > and encodings/__init__.py In encodings.__init__.py, the only use of re is for the normalize_encoding function. It could potentially be replaced with only string operations: # translate all offending characters to whitespace _norm_encoding_trans = string.maketrans(...) def normalize_encoding(encoding): encoding = encoding.translate(_norm_encoding_trans) # let the str.split machinery take care of splitting # only once on repeated whitespace return "_".join(encoding.split()) .. or the import of re could be moved inside normalize_encoding. In warnings.py, re is used in two functions, filterwarnings() and _setoption(). it's probably safe to move 'import re' inside these functions. I'm guessing the 'import lock' warnings.py problem doesn't apply when parsing options or adding new warning filters. Furthermore, filterwarnings() will have to be changed to not use re.compile() when message is "" (the resulting RE is always successfully matched) since several filterwarnings() calls are already performed by default, but always with message="". These changes would prevent the import of 're' at startup time, which appears to be the real killer. 
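Jeff's string-only replacement for normalize_encoding can be fleshed out along these lines. This is a hedged sketch, not the actual encodings module code: it assumes the normalization rule is "every non-alphanumeric character is a separator" (the real re-based pattern may treat some characters differently), and it is written in later-Python idiom rather than the string.maketrans style of the era:

```python
def normalize_encoding(encoding):
    # Treat every non-alphanumeric character as a separator, then let
    # str.split() collapse runs of separators, so the result never has
    # doubled or trailing underscores.
    cleaned = ''.join(c if c.isalnum() else ' ' for c in encoding)
    return '_'.join(cleaned.split())

print(normalize_encoding('UTF-8'))       # -> UTF_8
print(normalize_encoding('iso 8859-1'))  # -> iso_8859_1
```

The point of the exercise is exactly what Jeff says: a version like this needs no `import re` at interpreter startup.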
(see my module import timings in an earlier message) From skip@pobox.com Wed May 7 20:05:06 2003 From: skip@pobox.com (Skip Montanaro) Date: Wed, 7 May 2003 14:05:06 -0500 Subject: [Python-Dev] Startup time In-Reply-To: <20030507185245.GL27125@unpythonic.net> References: <200305061207.h46C7sj26462@pcp02138704pcs.reston01.va.comcast.net> <20030507170618.GI27125@unpythonic.net> <20030507183025.GJ27125@unpythonic.net> <20030507185245.GL27125@unpythonic.net> Message-ID: <16057.22754.582272.377803@montanaro.dyndns.org> Jeff> In encodings.__init__.py, the only use of re is for the Jeff> normalize_encoding function. It could potentially be replaced with only Jeff> string operations: ... Jeff> .. or the import of re could be moved inside normalize_encoding. I don't know if this still holds true, but at one point during the 2.x series I think it was pretty expensive to perform imports inside functions, much more expensive than in 1.5.2 at least (maybe right after nested scopes were introduced?). If that is still true, moving the import might be false economy. Skip "Some people, when confronted with a problem, think 'I know, I'll use regular expressions.' Now they have two problems." 
-- Jamie Zawinski From jepler@unpythonic.net Wed May 7 20:42:17 2003 From: jepler@unpythonic.net (Jeff Epler) Date: Wed, 7 May 2003 14:42:17 -0500 Subject: [Python-Dev] Startup time In-Reply-To: <16057.22754.582272.377803@montanaro.dyndns.org> References: <200305061207.h46C7sj26462@pcp02138704pcs.reston01.va.comcast.net> <20030507170618.GI27125@unpythonic.net> <20030507183025.GJ27125@unpythonic.net> <20030507185245.GL27125@unpythonic.net> <16057.22754.582272.377803@montanaro.dyndns.org> Message-ID: <20030507194215.GM27125@unpythonic.net> On Wed, May 07, 2003 at 02:05:06PM -0500, Skip Montanaro wrote: > I don't know if this still holds true, but at one point during the 2.x > series I think it was pretty expensive to perform imports inside functions, > much more expensive than in 1.5.2 at least (maybe right after nested scopes > were introduced?). If that is still true, moving the import might be false > economy. $ ./python Lib/timeit.py -s "def f(): import sys" "f()" 100000 loops, best of 3: 3.34 usec per loop $ ./python Lib/timeit.py -s "def f(): pass" "import sys; f()" 100000 loops, best of 3: 3.3 usec per loop $ ./python Lib/timeit.py -s "def f(): pass" "f()" 1000000 loops, best of 3: 0.451 usec per loop $ ./python Lib/timeit.py 'import sys' 100000 loops, best of 3: 2.88 usec per loop About 2.8usec would be added to each invocation of the functions in question, about the same as the cost of a global-scope import. This means that you lose overall as soon as the function is called twice. .. but this was about speeding python startup, not just speeding python. 
<.0375 wink> Jeff From aleax@aleax.it Wed May 7 21:57:26 2003 From: aleax@aleax.it (Alex Martelli) Date: Wed, 7 May 2003 22:57:26 +0200 Subject: [Python-Dev] Startup time In-Reply-To: <16057.22754.582272.377803@montanaro.dyndns.org> References: <200305061207.h46C7sj26462@pcp02138704pcs.reston01.va.comcast.net> <20030507185245.GL27125@unpythonic.net> <16057.22754.582272.377803@montanaro.dyndns.org> Message-ID: <200305072257.26085.aleax@aleax.it> On Wednesday 07 May 2003 09:05 pm, Skip Montanaro wrote: > Jeff> In encodings.__init__.py, the only use of re is for the > Jeff> normalize_encoding function. It could potentially be replaced > with only Jeff> string operations: > ... > Jeff> .. or the import of re could be moved inside normalize_encoding. > > I don't know if this still holds true, but at one point during the 2.x > series I think it was pretty expensive to perform imports inside functions, > much more expensive than in 1.5.2 at least (maybe right after nested scopes > were introduced?). If that is still true, moving the import might be false > economy. 
Doesn't seem to be true in 2.3, if I understand what you're saying: [alex@lancelot src]$ python Lib/timeit.py -s'def f(): pass' 'import math; f()' 100000 loops, best of 3: 4.04 usec per loop [alex@lancelot src]$ python Lib/timeit.py -s'def f(): import math' 'pass; f()' 100000 loops, best of 3: 4.05 usec per loop or even [alex@lancelot src]$ python Lib/timeit.py -s'import math' -s'def f(): pass' 'reload(math); f()' 10000 loops, best of 3: 168 usec per loop [alex@lancelot src]$ python Lib/timeit.py -s'import math' -s'def f(): reload(math)' 'pass; f()' 10000 loops, best of 3: 169 usec per loop Alex From skip@pobox.com Wed May 7 22:16:28 2003 From: skip@pobox.com (Skip Montanaro) Date: Wed, 7 May 2003 16:16:28 -0500 Subject: [Python-Dev] Startup time In-Reply-To: <200305072257.26085.aleax@aleax.it> References: <200305061207.h46C7sj26462@pcp02138704pcs.reston01.va.comcast.net> <20030507185245.GL27125@unpythonic.net> <16057.22754.582272.377803@montanaro.dyndns.org> <200305072257.26085.aleax@aleax.it> Message-ID: <16057.30636.403064.675001@montanaro.dyndns.org> >> I don't know if this still holds true, but at one point during the >> 2.x series I think it was pretty expensive to perform imports inside >> functions, much more expensive than in 1.5.2 at least (maybe right >> after nested scopes were introduced?). Alex> Doesn't seem to be true in 2.3, if I understand what you're saying: Alex> [alex@lancelot src]$ python Lib/timeit.py -s'def f(): pass' 'import math; f()' Alex> 100000 loops, best of 3: 4.04 usec per loop Alex> [alex@lancelot src]$ python Lib/timeit.py -s'def f(): import math' 'pass; f()' Alex> 100000 loops, best of 3: 4.05 usec per loop Yes, you're correct. Guess I could have run that myself had I been thinking. (My sleeping cap wasn't on much last night, so my thinking cap hasn't been on much today.) 
Guido, any chance you can quickly run the above two through the thirty-leven versions of Python you have laying about so we can narrow this down or refute my faulty memory? I've seen some recent posts by you which had performance data as far back as 1.3. I tried with 2.1, 2.2 and CVS but saw no discernable differences within versions: % python ~/local/bin/timeit.py -s'def f(): pass' 'import math; f()' 100000 loops, best of 3: 7.44 usec per loop % python ~/local/bin/timeit.py -s'def f(): import math' 'f()' 100000 loops, best of 3: 7.6 usec per loop % python2.2 ~/local/bin/timeit.py -s'def f(): pass' 'import math; f()' 100000 loops, best of 3: 9.19 usec per loop % python2.2 ~/local/bin/timeit.py -s'def f(): import math' 'f()' 100000 loops, best of 3: 9.05 usec per loop % python2.1 ~/local/bin/timeit.py -s'def f(): pass' 'import math; f()' 100000 loops, best of 3: 9.16 usec per loop % python2.1 ~/local/bin/timeit.py -s'def f(): import math' 'f()' 100000 loops, best of 3: 9.12 usec per loop Maybe it was 2.0? Thx, Skip From drifty@alum.berkeley.edu Wed May 7 23:16:50 2003 From: drifty@alum.berkeley.edu (Brett Cannon) Date: Wed, 7 May 2003 15:16:50 -0700 (PDT) Subject: [Python-Dev] Make _strptime only time.strptime implementation? Message-ID: Someone filed a bug report wanting it to be mentioned that most libc implementations of strptime don't handle %Z. Michael asked whether _strptime was going to become the permanent version of time.strptime or not. This was partially discussed back when Guido used his amazing time machine to make time.strptime use _strptime exclusively for testing purposes. I vaguely remember Tim saying he supported moving to _strptime, but I don't remember Guido having an opinion. If this is going to happen for 2.3 I would like to know so as to fix the documentation to be better. 
-Brett

From python@rcn.com Thu May 8 00:55:03 2003 From: python@rcn.com (Raymond Hettinger) Date: Wed, 7 May 2003 19:55:03 -0400 Subject: [Python-Dev] Startup time References: <200305061207.h46C7sj26462@pcp02138704pcs.reston01.va.comcast.net> <20030507185245.GL27125@unpythonic.net> <16057.22754.582272.377803@montanaro.dyndns.org> <200305072257.26085.aleax@aleax.it> <16057.30636.403064.675001@montanaro.dyndns.org> Message-ID: <003e01c314f4$1414f780$125ffea9@oemcomputer>

> Guido, any chance you can quickly run the above two through the thirty-leven
> versions of Python you have laying about so we can narrow this down or
> refute my faulty memory? I've seen some recent posts by you which had
> performance data as far back as 1.3. I tried with 2.1, 2.2 and CVS but saw
> no discernable differences within versions:
>
> % python ~/local/bin/timeit.py -s'def f(): pass' 'import math; f()'
> 100000 loops, best of 3: 7.44 usec per loop
> % python ~/local/bin/timeit.py -s'def f(): import math' 'f()'
> 100000 loops, best of 3: 7.6 usec per loop
>
> % python2.2 ~/local/bin/timeit.py -s'def f(): pass' 'import math; f()'
> 100000 loops, best of 3: 9.19 usec per loop
> % python2.2 ~/local/bin/timeit.py -s'def f(): import math' 'f()'
> 100000 loops, best of 3: 9.05 usec per loop
>
> % python2.1 ~/local/bin/timeit.py -s'def f(): pass' 'import math; f()'
> 100000 loops, best of 3: 9.16 usec per loop
> % python2.1 ~/local/bin/timeit.py -s'def f(): import math' 'f()'
> 100000 loops, best of 3: 9.12 usec per loop

I don't think timeit.py helps here. It works by substituting *both* the setup and statement inside a compiled function.

So, *none* of the above timings show the effect of a top level import versus one that is inside a function. It does compare 1 deep nesting to 2 levels deep.

So, you'll likely have to roll your own miniature timer if you want a straight answer.
Raymond Hettinger From jepler@unpythonic.net Thu May 8 01:53:44 2003 From: jepler@unpythonic.net (Jeff Epler) Date: Wed, 7 May 2003 19:53:44 -0500 Subject: [Python-Dev] Startup time In-Reply-To: <003e01c314f4$1414f780$125ffea9@oemcomputer> References: <200305061207.h46C7sj26462@pcp02138704pcs.reston01.va.comcast.net> <20030507185245.GL27125@unpythonic.net> <16057.22754.582272.377803@montanaro.dyndns.org> <200305072257.26085.aleax@aleax.it> <16057.30636.403064.675001@montanaro.dyndns.org> <003e01c314f4$1414f780$125ffea9@oemcomputer> Message-ID: <20030508005342.GA3634@unpythonic.net> On Wed, May 07, 2003 at 07:55:03PM -0400, Raymond Hettinger wrote: > I don't think timeit.py helps here. It works by substituting *both* > the setup and statement inside a compiled function. > > So, *none* of the above timings show the effect of a top level import > versus one that is inside a function. It does compare 1 deep nesting > to 2 levels deep. This program prints clock() times for 4e6 imports, first at global and then at function scope. Function scope wins a little bit, possibly due to the speed of STORE_FAST instead of STORE_GLOBAL (or would it be STORE_NAME?) ######################################################################## # (on a different machine than my earlier timeit results, running 2.2.2) # time for global import 30.21 # time for function import 27.31 import time, sys t0 = time.clock() for i in range(1e6): import sys; import sys; import sys; import sys; t1 = time.clock() print "time for global import", t1-t0 def f(): for i in range(1e6): import sys; import sys; import sys; import sys; t0 = time.clock() f() t1 = time.clock() print "time for function import", t1-t0 ######################################################################## If Skip is thinking of a slowdown for import and function scope, could it be the {LOAD,STORE}_FAST performance killer 'import *'? (wow, LOAD_NAME isn't as much slower than LOAD_FAST as you might expect..) 
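[Jeff's STORE_FAST guess is easy to confirm with the dis module: the same "import sys" statement binds its result via STORE_NAME when compiled at module scope, but via STORE_FAST inside a function. A sketch follows; the exact opcode names are a CPython implementation detail and have varied across versions.]

```python
import dis

def f():
    import sys      # function scope: the bound name is a fast local
    return sys

# Compile the identical statement as module-level code for comparison
mod_code = compile("import sys", "<module>", "exec")
mod_ops = {ins.opname for ins in dis.get_instructions(mod_code)}
fun_ops = {ins.opname for ins in dis.get_instructions(f)}

print("module scope uses STORE_NAME:", "STORE_NAME" in mod_ops)
print("function scope uses STORE_FAST:", "STORE_FAST" in fun_ops)
```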
######################################################################## # time for 27.9 # time for 37.94 import new, sys, time m = new.module('m') sys.modules['m'] = m m.__dict__.update({'__all__': ['x'], 'x': None}) def f(): from m import x x; x; x; x; x; x; x; x; x; x x; x; x; x; x; x; x; x; x; x x; x; x; x; x; x; x; x; x; x x; x; x; x; x; x; x; x; x; x x; x; x; x; x; x; x; x; x; x def g(): from m import * x; x; x; x; x; x; x; x; x; x x; x; x; x; x; x; x; x; x; x x; x; x; x; x; x; x; x; x; x x; x; x; x; x; x; x; x; x; x x; x; x; x; x; x; x; x; x; x for fn in f, g: t0 = time.clock() for i in range(1e6): fn() t1 = time.clock() print "time for", fn, t1-t0 ######################################################################## From dave@boost-consulting.com Thu May 8 03:02:34 2003 From: dave@boost-consulting.com (David Abrahams) Date: Wed, 07 May 2003 22:02:34 -0400 Subject: [Python-Dev] Re: MS VC 7 offer In-Reply-To: <010501c314ba$6b8dbef0$21795418@dell1700> (Brian Quinlan's message of "Wed, 07 May 2003 10:02:19 -0700") References: <010501c314ba$6b8dbef0$21795418@dell1700> Message-ID: Brian Quinlan writes: >> > That is information about the core ABI. I do need to be concerned >> > about changes in the libraries, as well, in particular about >> > incompatibilities resulting from multiple copies of the C library. >> > You said you don't know much about that. >> >> I can find out almost as easily, if you have specific questions. > > But the actual question that we would like to answer is quite broad: > what are all of the possible compatibility problems associated with > using a VC6 compiled DLL with a VC7 compiled application? > > Assuming that only changed runtime data structures are going to be a > problem, knowing which ones cannot be passed between the two versions > would be nice. 
Below is a list of the standard types defined by > Microsoft's VC6 runtime library (taken from the VC6 docs): > > clock_t > _complex > _dev_t > div_t, ldiv_t > _exception > FILE > _finddata_t, _wfinddata_t, _wfinddatai64_t > _FPIEEE_RECORD > fpos_t > _HEAPINFO > jmp_buf > lconv > _off_t > _onexit_t > _PNH > ptrdiff_t > sig_atomic_t > size_t > _stat > time_t > _timeb > tm > _utimbuf > va_list > wchar_t > wctrans_t > wctype_t > wint_t So do you want me to ask what all the possible compatibility problems are, or do you want me to ask which of the above structures cannot be passed between the two versions (or neither)? -- Dave Abrahams Boost Consulting www.boost-consulting.com From logistix@cathoderaymission.net Thu May 8 03:48:50 2003 From: logistix@cathoderaymission.net (logistix) Date: Wed, 7 May 2003 22:48:50 -0400 Subject: [Python-Dev] Building Python with .NET 2003 SDK Message-ID: <000201c3150c$5b294cd0$20bba8c0@XP> I decided to see if you really could build Python with the .NET compiler. I just got a preliminary build done that passed 67 tests (and failed 17). Two big gotchas: 1) You also need to install the "Platform SDK". This one makes the .NET SDK download seem fast. 2) VC6-generated makefiles include references to a few .lib files that aren't included. They don't seem to be needed, either. The offending libraries are largeint.lib, odbc32.lib, and odbccp32.lib. More detailed notes on what had to be done to get it working can be found here: http://www.cathoderaymission.net/~logistix/python/buildingPythonWithDotNet.html Enjoy! -Grant From skip@pobox.com Thu May 8 04:23:58 2003 From: skip@pobox.com (Skip Montanaro) Date: Wed, 7 May 2003 22:23:58 -0500 Subject: [Python-Dev] local import cost Message-ID: <16057.52686.475079.530463@montanaro.dyndns.org> Thanks to Raymond H for pointing out the probable fallacy in my original timeit runs.
Here's a simple timer which I think gets at what I'm after: import time import math import sys N = 500000 def fmath(): import math def fpass(): pass v = sys.version.split()[0] t = time.clock() for i in xrange(N): fmath() fmathcps = N/(time.clock()-t) t = time.clock() for i in xrange(N): fpass() fpasscps = N/(time.clock()-t) print "%s fpass/fmath: %.1f" % (v, fpasscps/fmathcps) On my Mac I get these outputs: 2.1.3 fpass/fmath: 5.0 2.2.2 fpass/fmath: 5.6 2.3b1+ fpass/fmath: 5.3 Naturally, I expect fpass() to run a lot faster than fmath(). If my presumption is correct though, there will be a sharp increase in the ratio, maybe in 2.0 or 2.1, or whenever nested scopes were first introduced. I can't run anything earlier than 2.1.x (I'll see about building 2.1) on my Mac. I'd have to break out my Linux laptop and do a bunch of downloading and compiling to get earlier results. From brian@sweetapp.com Thu May 8 08:08:48 2003 From: brian@sweetapp.com (Brian Quinlan) Date: Thu, 08 May 2003 00:08:48 -0700 Subject: [Python-Dev] Re: MS VC 7 offer In-Reply-To: Message-ID: <002f01c31530$ac4b5ee0$21795418@dell1700> > So do you want me to ask what all the possible compatibility problems > are, or do you want me to ask which of the above structures cannot be > passed between the two versions (or neither)? The former question would be best as the later would seem to be a subset. 
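[Skip's ratio can also be reproduced with timeit itself, provided the function *call* (not the import) is what goes in the statement position, which sidesteps Raymond's objection about the setup being compiled into the same function. A sketch against the current timeit API; absolute numbers obviously won't match 2003 hardware.]

```python
import timeit

# Each setup defines the function under test; the statement only calls it.
setup_math = "def f():\n    import math"
setup_pass = "def f():\n    pass"

t_math = min(timeit.repeat("f()", setup=setup_math, number=100_000, repeat=3))
t_pass = min(timeit.repeat("f()", setup=setup_pass, number=100_000, repeat=3))
print("fpass/fmath: %.1f" % (t_math / t_pass))
```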
Cheers, Brian From sjoerd@acm.org Thu May 8 09:11:15 2003 From: sjoerd@acm.org (Sjoerd Mullender) Date: Thu, 08 May 2003 10:11:15 +0200 Subject: [Python-Dev] odd interpreter feature In-Reply-To: <16057.12583.500034.130135@montanaro.dyndns.org> References: <16057.5934.556547.671279@montanaro.dyndns.org> <200305071502.h47F2LK03176@pcp02138704pcs.reston01.va.comcast.net> <16057.9037.913362.225855@montanaro.dyndns.org> <200305071533.h47FXVf03526@pcp02138704pcs.reston01.va.comcast.net> <16057.12583.500034.130135@montanaro.dyndns.org> Message-ID: <20030508081115.79E3174230@indus.ins.cwi.nl> Isn't it the case that you should only get a secondary prompt after the comment line if the comment line *itself* had a secondary prompt? On Wed, May 7 2003 Skip Montanaro wrote: > > Guido> Please do. The indentation level should be easily available, > Guido> since it is computed by the tokenizer. > > Alas, it's more complicated than just the indentation level of the current > line. I need to know if the previous line was indented, which I don't think > the tokenizer knows (at least examining *tok in gdb under various conditions > suggests it doesn't). > > I see the following possible cases (there are perhaps more, but I think they > are similar enough to ignore here): > > >>> if x == y: > ... # hello > ... pass > ... > >>> if x == y: > ... x = 1 > ... # hello > ... pass > ... > >>> x = 1 > >>> # hello > ... > >>> > > Only the last case should display the primary prompt after the comment is > entered. The other two correctly display the secondary prompt. It's > distinguishing the second and third cases in the tokenizer without help from > the parser that's the challenge. > > Oh well. Perhaps it's a wart best left alone. -- Sjoerd Mullender From mal@lemburg.com Thu May 8 11:38:44 2003 From: mal@lemburg.com (M.-A. 
Lemburg) Date: Thu, 08 May 2003 12:38:44 +0200 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: References: Message-ID: <3EBA33B4.3080601@lemburg.com> Tim Peters wrote: > [Martin v. Lowis] > >>... >>Some are sincerely hoping, or even expecting, that Python 2.3 is >>released with VC7, so that they can embed Python in their VC7-based >>application without having to recompile it. >> >>No matter what the choice is, somebody will be unhappy. > > OTOH, I don't see anything to stop releasing VC6 and VC7 versions of Python, > except for the absence of a volunteer to do it. While the Wise installer is > proprietary, there's nothing hidden about what goes into a release, there > are several free installers people *could* use instead, and the build > process for the 3rd-party components is pretty exhaustively documented. > > Speaking of which, presumably Tcl/Tk and SSL and etc on Windows should also > be compiled under VC7 then. I'm sure commercial players like e.g. ActiveState will happily provide Windows installers for both versions. Personally I don't think that people will switch to VC7 all that soon -- the .NET libs are still far from being stable and as I read the quotes on the VC compiler included in the .NET SDK, it will only generate code that runs with the .NET libs installed. Could be wrong, though. Given that tools like distutils probably don't work out of the box with the VC7 compiler suite, I'd wait at least another release before making VC7 binaries the default on Windows. -- Marc-Andre Lemburg eGenix.com Professional Python Software directly from the Source (#1, May 08 2003) >>> Python/Zope Products & Consulting ... http://www.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... 
http://python.egenix.com/ ________________________________________________________________________ EuroPython 2003, Charleroi, Belgium: 47 days left From mwh@python.net Thu May 8 11:54:21 2003 From: mwh@python.net (Michael Hudson) Date: Thu, 08 May 2003 11:54:21 +0100 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Doc/ext noddy2.c,NONE,1.1 noddy3.c,NONE,1.1 newtypes.tex,1.21,1.22 In-Reply-To: <3EBA342A.7020006@zope.com> (Jim Fulton's message of "Thu, 08 May 2003 06:40:42 -0400") References: <2mu1c568eq.fsf@starship.python.net> <3EBA342A.7020006@zope.com> Message-ID: <2mr879674y.fsf@starship.python.net> [Ccing python-dev because of the last paragraph] Jim Fulton writes: > Michael Hudson wrote: >> dcjim@users.sourceforge.net writes: >> >>>Update of /cvsroot/python/python/dist/src/Doc/ext >>>In directory sc8-pr-cvs1:/tmp/cvs-serv13294 >>> >>>Modified Files: >>> newtypes.tex Added Files: >>> noddy2.c noddy3.c Log Message: >>>Rewrote the basic section of the chapter on defining new types. >> As the original author of this section, thank you! > > You're welcome. :) > > My main reason for doing this was to learn the material myself. > (I had the luxury of sitting next to Guido as I did it and bugging > him with questions. :) That would help :-) >> Do you mention anywhere that this only works for 2.2 and up? That >> might be an idea. > > OK, I'll add that in the introduction. It was *already* dependent on > Python 2.3 due to the use of PyMODINIT_FUNC as the type of the init > function. Yes. That wasn't me, and whoever changed it didn't keep the .c file in sync with the bits of it quoted in the .tex file, grumble. > I'm not sure why this is needed rather than void. Maybe I should change this > so it works with Python 2.2. I'll talk to Guido and Fred about this. The Py_MODINIT()/DL_IMPORT() thing is an annoying incompatibility-causer ... perhaps something to deal with this could be added to pymemcompat.h? (in which case it's misnamed...) Cheers, M. 
-- ARTHUR: Ford, you're turning into a penguin, stop it. -- The Hitch-Hikers Guide to the Galaxy, Episode 2 From paoloinvernizzi@dmsware.com Thu May 8 12:08:41 2003 From: paoloinvernizzi@dmsware.com (Paolo Invernizzi) Date: Thu, 08 May 2003 13:08:41 +0200 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <3EBA33B4.3080601@lemburg.com> References: <3EBA33B4.3080601@lemburg.com> Message-ID: <3EBA3AB9.8020305@dmsware.com> M.-A. Lemburg wrote: > as I read > the quotes on the VC compiler included in the .NET SDK, it will only > generate code that runs with the .NET libs installed. Could be wrong, > though. Uh? The VC compiler included with the .NET SDK can only generate managed code? I don't think so... > Given that tools like distutils probably don't work > out of the box with the VC7 compiler suite, I'd wait at least > another release before making VC7 binaries the default on > Windows. Actually I have VC6 *and* VC7 in my at-work machine, python22 (Standard distribution, VC6 based), python 23b1 (Standard, VC6 based) and python cvs, which I manually build with VC7. I can build/install distutils packages choosing which environment to use (6 or 7) and python to use (22, 23b1, 23 head). So I think this is a no-problem... But isn't it possible, at least, to have a 'not-default' release compiled with VC7? It can be a boost for having other *complicated* packages released with VC7 along with VC6 (I'm thinking of wxPython, and so...) --- Paolo Invernizzi From nhodgson@bigpond.net.au Thu May 8 12:31:20 2003 From: nhodgson@bigpond.net.au (Neil Hodgson) Date: Thu, 08 May 2003 21:31:20 +1000 Subject: [Python-Dev] MS VC 7 offer References: <3EBA33B4.3080601@lemburg.com> Message-ID: <000d01c31555$59222800$3da48490@neil> M.-A.
Lemburg: > Personally I don't think that people will switch to VC7 all that > soon -- the .NET libs are still far from being stable and as I read > the quotes on the VC compiler included in the .NET SDK, it will only > generate code that runs with the .NET libs installed. Could be wrong, > though. VC7 can produce stand-alone binaries that do not need the .NET framework or even the C runtime DLLs. I have distributed executable versions of my Scintilla and SciTE projects built with VC7 for 9 months now. The executables are quite a bit smaller and faster (average of 10%) over VC6. The link time code generation option which can inline functions at link time rather than compile time is effective. Possible issues with moving to VC7 are ensuring compatibility with extension modules and the End User License Agreement. I looked at the EULA thoroughly before buying VC7 as the license includes some clauses that may cause problems for open source software that may be included in GPLed applications. Redistributing applications compiled with VC7 is OK, but redistributing the runtime DLLs such as msvcr70.dll (which is not already present on pre VC7 versions of Windows) can not be done with GPLed code: """ (ii) not distributing Identified Software in conjunction with the Redistributables or a derivative work thereof; ... Identified Software includes, without limitation, any software that requires as a condition of use, modification and/or distribution of such software that other software incorporated into, derived from or distributed with such software be (1) disclosed or distributed in source code form; (2) be licensed for the purpose of making derivative works; or (3) be redistributable at no charge. """ MS may have come to their senses and dropped this for Visual Studio 2003. It can be quite fun tracking the EULA down and working out which components are licensed under which EULA. 
When downloading .NET before VC7 was available, the web site EULA was different to the installer's version. Neil From mal@lemburg.com Thu May 8 12:37:51 2003 From: mal@lemburg.com (M.-A. Lemburg) Date: Thu, 08 May 2003 13:37:51 +0200 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <3EBA3AB9.8020305@dmsware.com> References: <3EBA33B4.3080601@lemburg.com> <3EBA3AB9.8020305@dmsware.com> Message-ID: <3EBA418F.3020006@lemburg.com> Paolo Invernizzi wrote: > M.-A. Lemburg wrote: > >> as I read >> the quotes on the VC compiler included in the .NET SDK, it will only >> generate code that runs with the .NET libs installed. Could be wrong, >> though. > > Uh? The VC compiler included with the .NET SDK can only generate > managed code? I don't think so... That's what I read in messages on this topic on google groups. I've just downloaded the SDK myself and will probably give it a go later today. >> Given that tools like distutils probably don't work >> out of the box with the VC7 compiler suite, I'd wait at least >> another release before making VC7 binaries the default on >> Windows. > > Actually I have VC6 *and* VC7 in my at-work machine, python22 (Standard > distribution, VC6 based), python 23b1 (Standard, VC6 based) and python > cvs, wich I manually build with VC7. > I can build/install distutils packages choosing wich environment to use > (6 or 7) and python to use (22, 23b1, 23 head). > So I think this is a no-problem... That's good to know (btw, how do you tell distutils which VC version to use ? or does it find out by itself using the Python time machine ;-). > But isn't possible, at least, to have a 'not-default' release compiled > with VC7? > > It can be a boost for having other *complicated* packages released with > VC7 among with VC6 (I'm thinking at wxPython, and so...) If someone volunteers to maintain such a branch, I suppose there's nothing preventing it :-) Perhaps we should look at the offer in a different light... 
What advantage would the move from VC6 to VC7 give Python users ? -- Marc-Andre Lemburg eGenix.com Professional Python Software directly from the Source (#1, May 08 2003) >>> Python/Zope Products & Consulting ... http://www.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ EuroPython 2003, Charleroi, Belgium: 47 days left From Paul.Moore@atosorigin.com Thu May 8 12:53:54 2003 From: Paul.Moore@atosorigin.com (Moore, Paul) Date: Thu, 8 May 2003 12:53:54 +0100 Subject: [Python-Dev] MS VC 7 offer Message-ID: <16E1010E4581B049ABC51D4975CEDB88619A6B@UKDCX001.uk.int.atosorigin.com> From: Neil Hodgson [mailto:nhodgson@bigpond.net.au] > Possible issues with moving to VC7 are ensuring compatibility with > extension modules That's the one that I see as most important. For the PythonLabs distribution to move to VC7, it sounds as if many of the Windows binary extensions in existence will also need to be built with VC7. I've no idea how much of a problem this would be to extension authors, but it would be a problem to end users if extension authors could no longer provide binaries. For reference, extensions I'd be in trouble without include win32all, wxPython, cx_Oracle, pyXML (on occasion), ctypes, PIL, mod_python. I've used many others on occasion, and no VC7 version would be an issue for me. So I guess that's the key issue. Can the majority of extension authors produce VC7-compatible binaries? This probably needs to be asked on comp.lang.python, not just on python-dev. Paul. From paoloinvernizzi@dmsware.com Thu May 8 13:25:21 2003 From: paoloinvernizzi@dmsware.com (Paolo Invernizzi) Date: Thu, 08 May 2003 14:25:21 +0200 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <3EBA418F.3020006@lemburg.com> References: <3EBA33B4.3080601@lemburg.com> <3EBA3AB9.8020305@dmsware.com> <3EBA418F.3020006@lemburg.com> Message-ID: <3EBA4CB1.6020904@dmsware.com> M.-A. 
Lemburg wrote: > That's good to know (btw, how do you tell distutils which VC > version to use ? or does it find out by itself using the > Python time machine ;-). I simply run the right .bat file that sets all the needed variables before running the setup.py ;-) > If someone volunteers to maintain such a branch, I suppose > there's nothing preventing it :-) As I guessed :-) I think that the next release of scons can open new perspectives... (see previous post of Greg Spencer) > Perhaps we should look at the offer in a different light... > > What advantage would the move from VC6 to VC7 give Python users ? I don't know if there are advantages on *moving*... but I'm concerned on *adding*... a VC7 plus a VC6 release... --- Paolo Invernizzi From DavidA@ActiveState.com Thu May 8 17:29:14 2003 From: DavidA@ActiveState.com (David Ascher) Date: Thu, 08 May 2003 09:29:14 -0700 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <3EBA33B4.3080601@lemburg.com> References: <3EBA33B4.3080601@lemburg.com> Message-ID: <3EBA85DA.5050806@ActiveState.com> M.-A. Lemburg wrote: > Tim Peters wrote: > I'm sure commercial players like e.g. ActiveState will happily > provide Windows installers for both versions. We will as soon as our customers ask for it. So far, we've gotten no interest in that direction. --david From brian@sweetapp.com Thu May 8 17:34:10 2003 From: brian@sweetapp.com (Brian Quinlan) Date: Thu, 08 May 2003 09:34:10 -0700 Subject: [Python-Dev] Re: MS VC 7 offer In-Reply-To: Message-ID: <008701c3157f$a7169210$21795418@dell1700> Carl Kleffner referred me to an interesting discussion regarding VC6 and VC7 compatibility: http://tinyurl.com/baok The bottom line seems to be that the C runtime libraries for VC6 and VC7 are currently binary compatible but that might change in the future. And CRT-allocated resources cannot be shared between the two. Cheers, Brian From mal@lemburg.com Thu May 8 17:36:57 2003 From: mal@lemburg.com (M.-A. 
Lemburg) Date: Thu, 08 May 2003 18:36:57 +0200 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <3EBA85DA.5050806@ActiveState.com> References: <3EBA33B4.3080601@lemburg.com> <3EBA85DA.5050806@ActiveState.com> Message-ID: <3EBA87A9.7090805@lemburg.com> David Ascher wrote: > M.-A. Lemburg wrote: > >> Tim Peters wrote: > > >> I'm sure commercial players like e.g. ActiveState will happily >> provide Windows installers for both versions. > > We will as soon as our customers ask for it. So far, we've gotten no > interest in that direction. I suppose that's fair enough :-) -- Marc-Andre Lemburg eGenix.com Professional Python Software directly from the Source (#1, May 08 2003) >>> Python/Zope Products & Consulting ... http://www.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ EuroPython 2003, Charleroi, Belgium: 47 days left From lists@morpheus.demon.co.uk Thu May 8 19:03:32 2003 From: lists@morpheus.demon.co.uk (Paul Moore) Date: Thu, 08 May 2003 19:03:32 +0100 Subject: [Python-Dev] MS VC 7 offer References: <16E1010E4581B049ABC51D4975CEDB88619A64@UKDCX001.uk.int.atosorigin.com> <046a01c31487$399d3390$530f8490@eden> Message-ID: "Mark Hammond" writes: > I must say that anecdotally, I find this to be true. Developers are > *not* flocking to VC7. I wonder if that fact has anything to do with > MS offering free compilers? One further data point - the free mingw gcc compiler generates binaries which depend on msvcrt.dll. So, if the Pythonlabs distribution switches to MSVC7, developers using MSVC6 *and* developers using mingw will be unable to build compatible extensions. The only compatible compiler will be MSVC7 (either the paid for version or the free limited version). Whatever you may think of Microsoft's offer, I feel that this reduction in choice is a bad thing. Paul. 
-- This signature intentionally left blank From bbolli@ymail.ch Thu May 8 21:20:12 2003 From: bbolli@ymail.ch (Beat Bolli) Date: Thu, 8 May 2003 22:20:12 +0200 Subject: [Python-Dev] Subclassing int? [Was: Re: [PEP] += on return of function call result] Message-ID: <20030508202012.GA3809@bolli.homeip.net> Andrew Koenig wrote: > > Why can't you do this? > > foo = log.setdefault(r,'') > > foo += "test %d\n" % t > You can do it, but it's useless! I got bitten by the same problem some time ago. Please let me explain: I needed to count words, using a dict, of course. So, in my first enthusiasm, I wrote: count = {} for word in wordlist: count.setdefault(word, 0) += 1 This, as I soon realized, didn't work, exactly because ints are immutable. So I tried a different track. No problem, I thought, in the new Python object world, the native classes can be subclassed. I imagined I could enhance the int class with an inc() method, thusly: class Counter(int): def inc(self): # to be defined self += 1?? count = {} for word in wordlist: count.setdefault(word, Counter()).inc() As you can see, I have a problem at the comment: how do I access the inherited int value??? I realized that this also wasn't going to work, either. I finally used the perhaps idiomatic count = {} for word in wordlist: count[word] = count.get(word, 0) + 1 which of course is suboptimal, because the lookup is done twice. I decided not to implement a proper Counter class for memory efficiency reasons. The code would have been simple: class Counter: def __init__(self): self.n = 0 def inc(self): self.n += 1 def get(self): return self.n count = {} for word in wordlist: count.setdefault(word, Counter()).inc() But to restate the core question: can class Counter be written as a subclass of int?
Beat Bolli (please CC: me on replies, I'm not on the list) -- mail: `echo '' | sed -e 's/[A-S]//g'` pgp: 0x506A903A; 49D5 794A EA77 F907 764F D89E 304B 93CF 506A 903A icbm: 47° 02' 43.0" N, 07° 16' 17.5" E (WGS84) From lists@morpheus.demon.co.uk Thu May 8 21:05:28 2003 From: lists@morpheus.demon.co.uk (Paul Moore) Date: Thu, 08 May 2003 21:05:28 +0100 Subject: [Python-Dev] MS VC 7 offer References: <3EBA33B4.3080601@lemburg.com> <3EBA85DA.5050806@ActiveState.com> Message-ID: David Ascher writes: >> I'm sure commercial players like e.g. ActiveState will happily >> provide Windows installers for both versions. > > We will as soon as our customers ask for it. So far, we've gotten no > interest in that direction. Is that no interest in a VC7 version? If so, that's probably pretty relevant information... Paul. -- This signature intentionally left blank From tim.one@comcast.net Thu May 8 22:56:32 2003 From: tim.one@comcast.net (Tim Peters) Date: Thu, 08 May 2003 17:56:32 -0400 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <3EBA418F.3020006@lemburg.com> Message-ID: [M.-A. Lemburg] > ... > Perhaps we should look at the offer in a different light... > > What advantage would the move from VC6 to VC7 give Python users ? In general, smaller and faster code is a decent bet. For those who use VC7 already, an easier life. "Move" implies abandoning VC6, though, and I don't think that's a realistic possibility now -- although over time it's inevitable (VC6 is akin to Python 1.5.2 now: beloved by some but unsupported by all ). From tim_one@email.msn.com Fri May 9 05:10:46 2003 From: tim_one@email.msn.com (Tim Peters) Date: Fri, 9 May 2003 00:10:46 -0400 Subject: [Python-Dev] Make _strptime only time.strptime implementation? In-Reply-To: Message-ID: [Brett Cannon] > Someone filed a bug report wanting it to be mentioned that most libc > implementations of strptime don't handle %Z.
Michael asked whether > _strptime was going to become the permanent version of time.strptime or > not. This was partially discussed back when Guido used his amazing time > machine to make time.strptime use _strptime exclusively for testing > purposes. > > I vaguely remember Tim saying he supported moving to _strptime, but I > don't remember Guido having an opinion. If this is going to happen for > 2.3 I would like to know so as to fix the documentation to be better. As we left it, we were going to wait for the 2.3 alpha and beta testers to raise a stink if the new implementation didn't work out for them (you'll recall that the call to the platform strptime() is disabled in 2.3b1, via an unconditional #undef HAVE_STRPTIME in timemodule.c). Nobody has even cut a little gas yet, so I'd proceed under the assumption that nobody will, and that the disabled HAVE_STRPTIME code will be physically deleted. If that turns out to be wrong, big deal, you stay up all night fixing it under intense pressure . From tim_one@email.msn.com Fri May 9 05:17:59 2003 From: tim_one@email.msn.com (Tim Peters) Date: Fri, 9 May 2003 00:17:59 -0400 Subject: [Python-Dev] Microsoft speedup In-Reply-To: Message-ID: [Duncan Booth] > I was just playing around with the compiler options using > Microsoft VC6 and > I see that adding the option /Ob2 speeds up pystone by about 2.5% > (/Ob2 is the option to automatically inline functions where the compiler > thinks it is worthwhile.) > > The downside is that it increases the size of python23.dll by about 13%. > > It's not a phenomenal speedup, but it should be pretty low impact if the > extra size is considered a worthwhile tradeoff. I want to see much broader testing first.
A couple employers ago, we disabled all magical inlining options, because sometimes they made critical loops faster, and sometimes slower, and you couldn't guess which as the code changed, and in that problem domain (speech recognition) the critical loops were truly critical so we were acutely aware of compiled-code speed regressions. So I'm not discouraged that pystone sped up when you tried it, but not particularly encouraged either. I expect it's more worth trying in Python, as hardly any code in Python goes three lines without a function call or conditional branch. From python-list@python.org Fri May 9 08:49:33 2003 From: python-list@python.org (Alex Martelli) Date: Fri, 9 May 2003 09:49:33 +0200 Subject: [Python-Dev] Subclassing int? [Was: Re: [PEP] += on return of function call result] In-Reply-To: <20030508202012.GA3809@bolli.homeip.net> References: <20030508202012.GA3809@bolli.homeip.net> Message-ID: <200305090949.33064.aleax@aleax.it> Followups set to python-list since this is NOT an appropriate subject matter for python-dev. Please continue the discussion on python-list, thanks. On Thursday 08 May 2003 10:20 pm, Beat Bolli wrote: ... > count = {} > for word in wordlist: > count.setdefault(word, 0) += 1 > > This, as I soon realized, didn't work, exactly because ints are immutable. Actually it doesn't work because you cannot assign to a function call; the fact that ints are immutable doesn't enter the picture. > class Counter(int): > def inc(self): > # to be defined > self += 1?? HERE is where the fact that ints are immutable will bite. If += mutated self, this would work -- but it doesn't because ints are immutable. > As you can see, I have a problem at the comment: how do I access the > inherited int value??? I realized that this also wasn't going to work, int(self) will "access the inherited int value" if I understand your meaning. But it doesn't help you here. > either. 
I finally used the perhaps idiomatic > > count = {} > for word in wordlist: > count[word] = count.get(word, 0) + 1 > > which of course is suboptimal, because the lookup is done twice. I decided Yes. > not to implement a proper Counter class for memory efficiency reasons. The __slots__ fix your memory efficiency issues: that's the REASON they exist. However, there's ANOTHER problem...: > code would have been simple: > > class Counter: > def __init__(self): > self.n = 0 > def inc(self): > self.n += 1 > def get(self): > return self.n > > count = {} > for word in wordlist: > count.setdefault(word, Counter()).inc() > > But to restate the core question: can class Counter be written as a > subclass of int? No (not meaningfully). The performance tradeoff is tricky not because of memory considerations (which __slots__ fix) but because you're generating (and often throwing away) a Counter instance EVERY time. Witness: [alex@lancelot Lib]$ python timeit.py -s''' count = {} words = "some are and some are not and some are irksome".split() ''' 'for w in words:' ' count[w]=count.get(w,0)+1' 100000 loops, best of 3: 11.6 usec per loop versus: [alex@lancelot Lib]$ python timeit.py -s''' count = {} words = "some are and some are not and some are irksome".split() class Cnt(object): __slots__=["n"] def __init__(self): self.n=0 def inc(self): self.n+=1 ''' 'for w in words:' ' count.setdefault(w,Cnt()).inc()' 10000 loops, best of 3: 43.4 usec per loop See? It's not a speedup, but a slowdown by about FOUR times in this example. If you want speed, go for speed: [alex@lancelot Lib]$ python timeit.py -s''' count = {} words = "some are and some are not and some are irksome".split() import psyco psyco.full() ''' 'for w in words:' ' count[w]=count.get(w,0)+1' 100000 loops, best of 3: 3.33 usec per loop Now THIS is acceleration -- a speedup of over THREE times. And without any complication nor abandonment of the idiomatic way of expression, too. 
> Beat Bolli (please CC: me on replies, I'm not on the list) Done. But please use python-list for these discussions: python-dev is only for discussion about development of *Python itself*. Alex From duncan@rcp.co.uk Fri May 9 09:19:28 2003 From: duncan@rcp.co.uk (Duncan Booth) Date: Fri, 09 May 2003 09:19:28 +0100 Subject: [Python-Dev] Microsoft speedup In-Reply-To: References: Message-ID: <3EBB72A0.5651.54D47C4@localhost> On 9 May 2003 at 0:17, Tim Peters wrote: > [Duncan Booth] > > It's not a phenomenal speedup, but it should be pretty low impact if the > > extra size is considered a worthwhile tradeoff. > > I want to see much broader testing first. A couple employers ago, we > disabled all magical inlining options, because sometimes they made critical > loops faster, and sometimes slower, and you couldn't guess which as the code > changed, and in that problem domain (speech recognition) the critical loops > were truly critical so we were acutely aware of compiled-code speed > regressions. So I'm not discouraged that pystone sped up when you > tried it, but not particularly encouraged either. I'm not suggesting Guido rush out and change the options right now, but I wanted to know whether it would be worth looking at this further. For all I know it's been discussed and dismissed already, in which case there isn't much point my looking further at it. Also if the main distribution should move to VC7, then it would probably be better to check whether this sort of micro tweaking has any effect there before wasting time on it. I've had plenty of experience myself of changing Microsoft compiler options and finding the code then breaks, so I agree that it would need much more testing. It also needs more testing to see whether it makes any kind of difference to real programs as well as benchmarks.
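The benchmark-vs-real-program caution above is easy to reproduce at the Python level, where the analogue of an un-inlined helper is a per-iteration function call. A hedged sketch (the helper names are invented; absolute and even relative timings depend on the interpreter and machine, which is exactly why broad testing is needed):

```python
import timeit

# Calling a tiny helper per iteration vs. writing the expression inline.
# This measures Python-level call overhead, a stand-in for the C-level
# inlining effects discussed above.
def add1(x):
    return x + 1

def with_call(n=1000):
    total = 0
    for i in range(n):
        total = add1(total)      # one function call per iteration
    return total

def inlined(n=1000):
    total = 0
    for i in range(n):
        total = total + 1        # same work, no call
    return total

assert with_call() == inlined() == 1000
t_call = timeit.timeit(with_call, number=200)
t_inline = timeit.timeit(inlined, number=200)
print("call: %.4fs  inline: %.4fs" % (t_call, t_inline))
```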
If I knew any way to get the compiler to tell me which functions it inlined, then it would probably also be possible to get most of the speedup by explicitly inlining a few functions and avoiding most of the hit on the code size. -- Duncan Booth duncan@rcp.co.uk int month(char *p){return(124864/((p[0]+p[1]- p[2]&0x1f)+1)%12)["\5\x8\3" "\6\7\xb\1\x9\xa\2\0\4"];} // Who said my code was obscure? http://dales.rmplc.co.uk/Duncan From mal@lemburg.com Fri May 9 10:28:37 2003 From: mal@lemburg.com (M.-A. Lemburg) Date: Fri, 09 May 2003 11:28:37 +0200 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: References: Message-ID: <3EBB74C5.7090600@lemburg.com> Tim Peters wrote: > [M.-A. Lemburg] > >>... >>Perhaps we should look at the offer in a different light... >> >>What advantage would the move from VC6 to VC7 give Python users ? > > In general, smaller and faster code is a decent bet. For those who use VC7 > already, an easier life. "Move" implies abandoning VC6, though, and I don't > think that's a realistic possibility now -- although over time it's > inevitable (VC6 is akin to Python 1.5.2 now: beloved by some but > unsupported by all ). True :-) How about adding support for VC7 features in 2.4 and starting the transition in 2.5 ? -- Marc-Andre Lemburg eGenix.com Professional Python Software directly from the Source (#1, May 09 2003) >>> Python/Zope Products & Consulting ... http://www.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ EuroPython 2003, Charleroi, Belgium: 46 days left From mal@lemburg.com Fri May 9 10:29:57 2003 From: mal@lemburg.com (M.-A. Lemburg) Date: Fri, 09 May 2003 11:29:57 +0200 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <3EBB74C5.7090600@lemburg.com> References: <3EBB74C5.7090600@lemburg.com> Message-ID: <3EBB7515.2090709@lemburg.com> M.-A. 
Lemburg wrote: > How about adding support for VC7 features in 2.4 and starting the > transition in 2.5 ? This would also allow MS to ship SP2 for VC7 by then ;-) -- Marc-Andre Lemburg eGenix.com Professional Python Software directly from the Source (#1, May 09 2003) >>> Python/Zope Products & Consulting ... http://www.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ EuroPython 2003, Charleroi, Belgium: 46 days left From drifty@alum.berkeley.edu Fri May 9 10:31:26 2003 From: drifty@alum.berkeley.edu (Brett Cannon) Date: Fri, 9 May 2003 02:31:26 -0700 (PDT) Subject: [Python-Dev] Make _strptime only time.strptime implementation? In-Reply-To: References: Message-ID: [Tim Peters] > As we left it, we were going to wait for the 2.3 alpha and beta testers to > raise a stink if the new implementation didn't work out for them (you'll > recall that the call to the platform strptime() is disabled in 2.3b1, via an > unconditional > > #undef HAVE_STRPTIME > > in timemodule.c). Nobody has even cut a little gas yet, I got a single email from someone asking me to change the functionality so that it would raise an exception if part of the input string was not parsed. Otherwise I found one error and dealt with it. > so I'd proceed under the assumption that nobody will, and that the > disabled HAVE_STRPTIME code will be physically deleted. If that turns out > to be wrong, big deal, you stay up all night fixing it under intense > pressure . > OK. If by 2.3b2 no one has said anything I will go ahead and cut out the C code and update the docs. -Brett From jacobs@penguin.theopalgroup.com Fri May 9 11:30:45 2003 From: jacobs@penguin.theopalgroup.com (Kevin Jacobs) Date: Fri, 9 May 2003 06:30:45 -0400 (EDT) Subject: [Python-Dev] Make _strptime only time.strptime implementation?
In-Reply-To: Message-ID: On Fri, 9 May 2003, Tim Peters wrote: > As we left it, we were going to wait for the 2.3 alpha and beta testers to > raise a stink if the new implementation didn't work out for them (you'll > recall that the call to the platform strptime() is disabled in 2.3b1, via an > unconditional > > #undef HAVE_STRPTIME > > in timemodule.c). Nobody has even cut a little gas yet, so I'd proceed > under the assumption that nobody will, and that the disabled HAVE_STRPTIME > code will be physically deleted. If that turns out to be wrong, big deal, > you stay up all night fixing it under intense pressure . Actually, I did, and on python-dev. strptime did not roundtrip correctly with mktime on Linux. This made my application very unhappy, so I removed all calls to strptime. Right now I don't have a vested interest in shooting holes in the Python strptime, but I can't say I feel any warm fuzzies about it. It seems hard to imagine that others will not run into similar problems, regardless of the lack of specification for exactly how strptime ought to work. -Kevin -- -- Kevin Jacobs The OPAL Group - Enterprise Systems Architect Voice: (216) 986-0710 x 19 E-mail: jacobs@theopalgroup.com Fax: (216) 986-0714 WWW: http://www.theopalgroup.com
From gsw@agere.com Fri May 9 13:47:49 2003 From: gsw@agere.com (Williams, Gerald S (Jerry)) Date: Fri, 9 May 2003 08:47:49 -0400 Subject: [Python-Dev] MS VC 7 offer Message-ID: <937756AF9E0BDC4396C09F32D8B41F2B2FE238@pauex2ku01.agere.com> Paul Moore wrote: > One further data point - the free mingw gcc compiler generates > binaries which depend on msvcrt.dll. So, if the Pythonlabs > distribution switches to MSVC7, developers using MSVC6 *and* > developers using mingw will be unable to build compatible extensions. > The only compatible compiler will be MSVC7 (either the paid for > version or the free limited version). Are there any reasons why we can't just switch to MINGW instead? If the VC7 RT is the way of the future, then presumably MINGW will eventually support it. If not, it might be better to avoid VC7 anyway. :-) gsw From Paul.Moore@atosorigin.com Fri May 9 13:57:58 2003 From: Paul.Moore@atosorigin.com (Moore, Paul) Date: Fri, 9 May 2003 13:57:58 +0100 Subject: [Python-Dev] MS VC 7 offer Message-ID: <16E1010E4581B049ABC51D4975CEDB880113DACF@UKDCX001.uk.int.atosorigin.com> From: Williams, Gerald S (Jerry) [mailto:gsw@agere.com] > Are there any reasons why we can't just switch to MINGW > instead? If the VC7 RT is the way of the future, then > presumably MINGW will eventually support it. If not, it > might be better to avoid VC7 anyway. :-) I've asked on the mingw users list about VC7 compatibility. It's quite possible that the msvcr71.dll EULA conditions will make this a non-starter, though (I don't understand them, but they sound scary...) Paul.
From gh@ghaering.de Fri May 9 14:06:00 2003 From: gh@ghaering.de (Gerhard Häring) Date: Fri, 09 May 2003 15:06:00 +0200 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <937756AF9E0BDC4396C09F32D8B41F2B2FE238@pauex2ku01.agere.com> References: <937756AF9E0BDC4396C09F32D8B41F2B2FE238@pauex2ku01.agere.com> Message-ID: <3EBBA7B8.6030309@ghaering.de> Williams, Gerald S (Jerry) wrote: > Paul Moore wrote: > >>One further data point - the free mingw gcc compiler generates >>binaries which depend on msvcrt.dll. So, if the Pythonlabs >>distribution switches to MSVC7, developers using MSVC6 *and* >>developers using mingw will be unable to build compatible extensions. >>The only compatible compiler will be MSVC7 (either the paid for >>version or the free limited version). > > Are there any reasons why we can't just switch to MINGW > instead? Yes. Several: 1) Python can't be built with MINGW, yet. I'm working on it, and so are other people, apparently (search python-list). 2) The Microsoft IDE is a more productive development environment for those that develop Python on Windows. I'm not sure, but my uneducated guess is that there are only a few Python developers who do any significant work on the win32 side, I only know about Guido, Tim, Mark. Those that actually put Python forward on win32 should decide about their development environment, IMO. My guess is that MINGW will eventually be a supported platform, but not the primary method of building Python. FWIW, Mozilla recently (1.4 beta 1) got compilable with mingw on win32. They're calling mingw a "tier 3" platform, while MSVC is a "tier 1" platform. I haven't looked up the terms, but I guess that "tier 3" means "nice to have" for a release, while "tier 1" means "must have". I reckon the situation will be a similar one for Python once it'll gain mingw support. > If the VC7 RT is the way of the future, then > presumably MINGW will eventually support it. [...] "Eventually" being the keyword here.
-- Gerhard From Paul.Moore@atosorigin.com Fri May 9 14:09:47 2003 From: Paul.Moore@atosorigin.com (Moore, Paul) Date: Fri, 9 May 2003 14:09:47 +0100 Subject: [Python-Dev] MS VC 7 offer Message-ID: <16E1010E4581B049ABC51D4975CEDB880113DAD1@UKDCX001.uk.int.atosorigin.com> From: Moore, Paul > I've asked on the mingw users list about VC7 compatibility. > It's quite possible that the msvcr71.dll EULA conditions > will make this a non-starter, though (I don't understand > them, but they sound scary...) FWIW, I just got a reply from the mingw list. Because msvcrt is distributed with the OS, and msvcr7 is not, GPL compatibility becomes an issue. Specifically, mingw exploits a specific clause in the GPL which allows dependencies on "components of the OS". MSVCRT qualifies here, but MSVCR7 doesn't. So I don't think mingw will support building DLLs which use MSVCR7 for the foreseeable future :-( Paul. From tim@zope.com Fri May 9 15:30:34 2003 From: tim@zope.com (Tim Peters) Date: Fri, 9 May 2003 10:30:34 -0400 Subject: [Python-Dev] Make _strptime only time.strptime implementation? In-Reply-To: Message-ID: [Tim] >> Nobody has even cut a little gas yet, so I'd proceed under the >> assumption that nobody will, and that the disabled HAVE_STRPTIME >> code will be physically deleted. [Kevin Jacobs] > Actually, I did, and on python-dev. Sorry, I meant since 2.3b1 was released. It's the purpose of pre-releases to find problems, and the whineometer gets reset when a new pre-release goes out. > strptime did not roundtrip correctly with mktime on Linux. It was my understanding (possibly wrong, and please correct me if it is) that Brett fixed this. > This made my application very unhappy, so I removed all calls to > strptime. Right now I don't have a vested interest in shooting > holes in the Python strptime, but I can't say I feel any warm > fuzzies about it.
> It seems hard to imagine that others will not run > into similar problems, regardless of the lack of specification for > exactly how strptime ought to work. The primary problem isn't the lack of a crisp spec, although that's the root cause of the real problem: the problem is that how strptime behaves varies in fact across boxes. I don't expect anyone could have felt warm fuzzies about that either, although someone could fool themself into hoping that the platform strptime behavior they happened to get was the only behavior their app would ever see. With a single implementation of strptime across platforms, that pleasant fantasy gets close to becoming the truth. Python is supposed to be a *little* less platform-dependent than C . From guido@python.org Fri May 9 15:38:09 2003 From: guido@python.org (Guido van Rossum) Date: Fri, 09 May 2003 10:38:09 -0400 Subject: [Python-Dev] Make _strptime only time.strptime implementation? In-Reply-To: "Your message of Fri, 09 May 2003 02:31:26 PDT." References: Message-ID: <200305091438.h49Ec9K08904@pcp02138704pcs.reston01.va.comcast.net> > I got a single email from someone asking me to change the > functionality so that it would raise an exception if part of the > input string was not parsed. That sounds like a good idea on the face of it. Or will this break existing code? --Guido van Rossum (home page: http://www.python.org/~guido/) From jacobs@penguin.theopalgroup.com Fri May 9 15:48:28 2003 From: jacobs@penguin.theopalgroup.com (Kevin Jacobs) Date: Fri, 9 May 2003 10:48:28 -0400 (EDT) Subject: [Python-Dev] Make _strptime only time.strptime implementation? In-Reply-To: Message-ID: On Fri, 9 May 2003, Tim Peters wrote: > > strptime did not roundtrip correctly with mktime on Linux. > > It was my understanding (possibly wrong, and please correct me if it is) > that Brett fixed this. I've just retested with my original code and it does look like Brett has indeed fixed it.
Or at least fixed it to the point that mktime doesn't croak on Linux, Solaris, Tru64, and IRIX with our app. > The primary problem isn't the lack of a crisp spec, although that's the root > cause of the real problem: the problem is that how strptime behaves varies > in fact across boxes. Or more importantly that strptime is now standardized in Python, while mktime is not. Given that my previous problems with the Python strptime have been addressed, I am now +1 on using it (although I'm still going to avoid it and mktime in my code as much as possible). -Kevin -- -- Kevin Jacobs The OPAL Group - Enterprise Systems Architect Voice: (216) 986-0710 x 19 E-mail: jacobs@theopalgroup.com Fax: (216) 986-0714 WWW: http://www.theopalgroup.com From gsw@agere.com Fri May 9 17:12:57 2003 From: gsw@agere.com (Williams, Gerald S (Jerry)) Date: Fri, 9 May 2003 12:12:57 -0400 Subject: [Python-Dev] MS VC 7 offer Message-ID: <937756AF9E0BDC4396C09F32D8B41F2B2FE23B@pauex2ku01.agere.com> There'll always be pressure to use VC for interoperability reasons. Some would attribute this to FUD. I'm not ready to go that far. My personal (admittedly probably controversial) preference would be to eventually drop VC support entirely in favor of existing free optimizing compilers. Of course, if Microsoft makes an optimizing compiler available for free to everyone, it would make this position much more difficult to maintain. Surprisingly, it sounds like the latter may be more likely than the former. If Python is moving toward VC7, I'd like to be counted in for a copy. I'd rather not switch, but it sounds like I'd have to, especially if there are legal issues with the VC7 runtime libraries. Gerhard Häring wrote: > 1) Python can't be built with MINGW, yet. I'm working on it, > and so are other people, apparently (search python-list). Good point. We don't know the full extent of the issues yet.
> 2) The Microsoft IDE is a more productive development environment for > those that develop Python on Windows. I'm not going to tell anyone that they can't use their IDE of choice, but keep in mind that IDE != compiler. Setting up project files to use different build tools isn't hard. If you're concerned about VC-specific debug information, you could still use VC for debug builds. gsw From martin@v.loewis.de Fri May 9 17:28:09 2003 From: martin@v.loewis.de (Martin v. Löwis) Date: Fri, 09 May 2003 18:28:09 +0200 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <937756AF9E0BDC4396C09F32D8B41F2B2FE23B@pauex2ku01.agere.com> References: <937756AF9E0BDC4396C09F32D8B41F2B2FE23B@pauex2ku01.agere.com> Message-ID: <3EBBD719.7040209@v.loewis.de> Williams, Gerald S (Jerry) wrote: > My personal (admittedly probably controversial) preference > would be to eventually drop VC support entirely in favor of > existing free optimizing compilers. You make it sound as if compilers are a religion. They are tools, and it matters how well they cooperate with Python on some system. They are not competitors, so you can make Python cooperate with existing free optimizing compilers, and simultaneously support VC. > Of course, if Microsoft > makes an optimizing compiler available for free to everyone, > it would make this position much more difficult to maintain. Your position is already difficult to maintain. He who makes the release chooses the tool. This is free software: If you don't like that release, make a different one. > If Python is moving toward VC7, I'd like to be counted in > for a copy. Python is not moving towards or away from specific product. If it is moving at all, it is moving towards ISO C99. We are talking about the PythonLabs Windows installer, not about "Python". > I'm not going to tell anyone that they can't use their IDE > of choice, but keep in mind that IDE != compiler. Setting > up project files to use different build tools isn't hard.
> If you're concerned about VC-specific debug information, > you could still use VC for debug builds. Somebody will have to maintain the VC makefiles. That somebody won't simultaneously maintain a MingW infrastructure, because of time constraints. So to use a MingW release process, we would need a volunteer to produce such a release. Do you volunteer? Regards, Martin From gsw@agere.com Fri May 9 18:43:29 2003 From: gsw@agere.com (Williams, Gerald S (Jerry)) Date: Fri, 9 May 2003 13:43:29 -0400 Subject: [Python-Dev] MS VC 7 offer Message-ID: <937756AF9E0BDC4396C09F32D8B41F2B2FE23C@pauex2ku01.agere.com> Martin v. Löwis wrote: > You make it sound as if compilers are a religion. Hardly intended, but when Microsoft (and the FSF) are involved, somehow religious wars often pop up. :-) > > If Python is moving toward VC7, [...] > > Python is not moving towards or away from specific product [...] > We are talking about the PythonLabs Windows installer, You are correct. Replace "Python" with "the PythonLabs Windows installer". > Somebody will have to maintain the VC makefiles. That somebody won't > simultaneously maintain a MingW infrastructure, because of time > constraints. So to use a MingW release process, we would need a > volunteer to produce such a release. Do you volunteer? Not today, maybe tomorrow. I'm already maintaining the SWIG package for Cygwin and not putting as much time into that as I should. Plus I have a new public domain project on SourceForge that I'm trying to get off the ground. I appreciate the need for having somebody actually do the work (especially the initial port). I'm glad to hear a few people are already working on this one. gsw From tim.one@comcast.net Fri May 9 20:08:44 2003 From: tim.one@comcast.net (Tim Peters) Date: Fri, 09 May 2003 15:08:44 -0400 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <3EBB74C5.7090600@lemburg.com> Message-ID: [M.-A.
Lemburg] > How about adding support for VC7 features in 2.4 AFAIK, current CVS Python compiles under VC7 now. > and starting the transition in 2.5 ? I expect that mostly depends on who's doing the work. PLabs Windows development is on auto-pilot (aka benign neglect). The first person to volunteer time to do anything here gets to set the policy for the next two decades <0.9 wink>. From mal@lemburg.com Fri May 9 21:15:23 2003 From: mal@lemburg.com (M.-A. Lemburg) Date: Fri, 09 May 2003 22:15:23 +0200 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: References: Message-ID: <3EBC0C5B.7030101@lemburg.com> Tim Peters wrote: > [M.-A. Lemburg] > >>How about adding support for VC7 features in 2.4 > > AFAIK, current CVS Python compiles under VC7 now. That's nice :-) I meant: adding features from VC7 to Python. >>and starting the transition in 2.5 ? > > I expect that mostly depends on who's doing the work. PLabs Windows > development is on auto-pilot (aka benign neglect). The first person to > volunteer time to do anything here gets to set the policy for the next two > decades <0.9 wink>. How come ? I always thought that Zope's main deployment platform is Windows.... -- Marc-Andre Lemburg eGenix.com Professional Python Software directly from the Source (#1, May 09 2003) >>> Python/Zope Products & Consulting ... http://www.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ EuroPython 2003, Charleroi, Belgium: 46 days left From guido@python.org Fri May 9 21:32:05 2003 From: guido@python.org (Guido van Rossum) Date: Fri, 09 May 2003 16:32:05 -0400 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: Your message of "Tue, 06 May 2003 18:06:11 EDT." Message-ID: <200305092032.h49KW5B11937@pcp02138704pcs.reston01.va.comcast.net> Here's a reply from Nick Hodapp. 
In a later email he also said: > And I wouldn't dream of giving you Standard ;) > > Today the C++ optimizer is NOT in the freely available tools. We > are fixing this, but the timeframe is uncertain. My own suggestion: let's ask for copies for the lead Windows developers who are distributing Windows binaries of core Python (that would be Tim & me) or major addons (several have been mentioned here). Then we can see how well this works, and together we can agree on a "sunset date" for the VC6-based installer. If you feel you qualify or you know someone who you think qualify, send me *private* email. --Guido van Rossum (home page: http://www.python.org/~guido/) > From: "Nick Hodapp" > To: "Guido van Rossum" > Date: Thu, 8 May 2003 17:28:45 -0700 > > Guido -- > > I read much of the archived thread. I'll respond here in email: > > 1) There was confusion about which version of Visual C++. The version > that I'm willing to donate to core Python developers is the most recent, > Visual C++ .NET 2003, aka. VC 7.1. I don't fully understand how many > "core developers" there are, but let's cap my gift at 10 copies. > > 2) There was a question about how quickly I could provide the licenses, > and whether I would give media or cash. I'd be providing boxed copies > of the product (likely our "Professional" edition) and recipients would > likely have to wait a month or so since we're not yet stocked > internally. > > 3) There was a question about redistributing the C-runtime DLLs. While > Microsoft recommends redistributing these using "merge modules" to > prevent versioning issues, this is not mandatory. From the product > documentation: > > "Visual Studio .NET provides its redistributable files in the form of > merge modules. These merge modules encapsulate the redistributable DLLs > and can be used by setup projects or other redistribution tools. Using > the merge modules ensures that the correct files are redistributed with > an application. 
However, if your installer does not support distributing > merge modules, you can redistribute the DLLs embedded in the merge > modules. You need to either extract the DLLs from the merge modules or > get them from the product CD or DVD. Do not copy files from your hard > disk." > > Also, you can statically bind to the C-runtime to avoid this issue > entirely. > > 3) Several questions regarding the build system. What features you > make use of are entirely up to you. Know that VC7 and VC7.1 do not > support the "export makefile" feature that is in VC6. My recommendation > would be to use the VC build system, but that is personal taste. Allow > me to hint that a command-line-tool version of the build/project system > is likely to be made available for free in the near future. But that is > just a hint, not a promise. > > 4) Several questions about binary compatibility of object files. I > don't believe we broke binary compatibility for linking (you should be > able to link a VC6 object file with a VC7.1 generated object module). > I'll follow up and get a confirmation. We did break binary > compatibility -- on purpose -- for some of the libraries, including MFC. > I doubt you guys use MFC. > > > The kind of feedback on the thread you sent is great -- I can use it as > input for how we design and package future product. My sole intent here > is to provide our new tool to some influential C++ developers in the > community. I've made the same offer to the Boost community. > > I'm also willing to help figure out if we can build Python completely > with the freely available SDK tools I mentioned. I don't know if this > is possible, but it would be fun to try -- and a plus for your community > if we succeed. 
> > Nick From martin@v.loewis.de Fri May 9 21:54:25 2003 From: martin@v.loewis.de (Martin v. Löwis) Date: Fri, 09 May 2003 22:54:25 +0200 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <3EBC0C5B.7030101@lemburg.com> References: <3EBC0C5B.7030101@lemburg.com> Message-ID: <3EBC1581.3060303@v.loewis.de> M.-A. Lemburg wrote: >> AFAIK, current CVS Python compiles under VC7 now. > > That's nice :-) I meant: adding features from VC7 to Python. That is done as well. There is quite some conditional code that selects features available only in VC7, such as usage of getaddrinfo in the socket module. Regards, Martin From tim.one@comcast.net Fri May 9 21:58:28 2003 From: tim.one@comcast.net (Tim Peters) Date: Fri, 09 May 2003 16:58:28 -0400 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <3EBC0C5B.7030101@lemburg.com> Message-ID: [MAL] >>> How about adding support for VC7 features in 2.4 [Tim] >> AFAIK, current CVS Python compiles under VC7 now. [MAL] > That's nice :-) I meant: adding features from VC7 to Python. Umm, could you name one? VC7 is a compiler to me. I don't know what it means to add a compiler feature to Python, so I transformed the suggestion into one I understood. >>> and starting the transition in 2.5 ? >> I expect that mostly depends on who's doing the work. PLabs Windows >> development is on auto-pilot (aka benign neglect). The first person to >> volunteer time to do anything here gets to set the policy for >> the next two decades <0.9 wink>. > How come ? I always thought that Zope's main deployment platform > is Windows.... I don't know, but doubt it, and Windows users are conspicuous by absence on the public Zope dev mailing lists. Regardless, Zope strives to be a platform-neutral application, so I've never been surprised that Zope Corp's interest in Windows-specific Python work has been undetectable.
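Martin's getaddrinfo example is conditional C (an #ifdef in the socket module's source); the same feature-selection pattern at the Python level is usually written with hasattr(). A sketch only: resolve() is an invented helper for illustration, not anything in the stdlib.

```python
import socket

def resolve(host, port):
    # Prefer getaddrinfo when the build provides it (all modern builds do);
    # fall back to the older IPv4-only lookup otherwise. This mirrors the
    # C-level "#ifdef HAVE_GETADDRINFO" selection Martin describes.
    if hasattr(socket, "getaddrinfo"):
        info = socket.getaddrinfo(host, port, 0, socket.SOCK_STREAM)
        return info[0][4]  # first candidate's sockaddr, e.g. ('127.0.0.1', 80)
    return (socket.gethostbyname(host), port)

addr = resolve("localhost", 80)
print(addr)
```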
From drifty@alum.berkeley.edu Sat May 10 00:41:25 2003 From: drifty@alum.berkeley.edu (Brett Cannon) Date: Fri, 9 May 2003 16:41:25 -0700 (PDT) Subject: [Python-Dev] Make _strptime only time.strptime implementation? In-Reply-To: <200305091438.h49Ec9K08904@pcp02138704pcs.reston01.va.comcast.net> References: <200305091438.h49Ec9K08904@pcp02138704pcs.reston01.va.comcast.net> Message-ID: [Guido van Rossum] > > I got a single email from someone asking me to change the > > functionality so that it would raise an exception if part of the > > input string was not parsed. > > That sounds like a good idea on the face of it. Or will this break > existing code? > Maybe. If they depend on some specific behavior on a platform that offers it, then yes, there could be issues. But since the docs are so vague if it does break code it will most likely be because someone didn't follow the warnings in the spec. And while we are on this subject, does anyone have any issues if I cause _strptime to recognize UTC and GMT as timezones? The Solaris box I always use to do libc strptime comparisons to does not recognize it as an acceptable value for %Z, but since it is a known fact that neither have daylight savings I feel _strptime should recognize this fact and set the daylight savings value to 0 instead of raising an error saying it doesn't know about that timezone. Any objections to the change? -Brett From guido@python.org Sat May 10 01:35:46 2003 From: guido@python.org (Guido van Rossum) Date: Fri, 09 May 2003 20:35:46 -0400 Subject: [Python-Dev] Make _strptime only time.strptime implementation? In-Reply-To: "Your message of Fri, 09 May 2003 16:41:25 PDT." References: <200305091438.h49Ec9K08904@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <200305100035.h4A0Zk812664@pcp02138704pcs.reston01.va.comcast.net> [Brett] > > > I got a single email from someone asking me to change the > > > functionality so that it would raise an exception if part of the > > > input string was not parsed.
> > [Guido van Rossum] > > That sounds like a good idea on the face of it. Or will this break > > existing code? [Brett] > Maybe. If they depend on some specific behavior on a platform that offers > it, then yes, there could be issues. But since the docs are so vague if > it does break code it will most likely be because someone didn't follow > the warnings in the spec. If you add some flag to control this behavior, defaulting to strict, then at least people who rely on the old (non-strict) behavior can use the flag rather than redesign their application. > And while we are on this subject, does anyone have any issues if I cause > _strptime to recognize UTC and GMT as timezones? The Solaris box I always > use to do libc strptime comparisons to does not recognize it as an > acceptable value for %Z, but since it is a known fact that neither have > daylight savings I feel _strptime should recognize this fact and set the > daylight savings value to 0 instead of raising an error saying it > doesn't know about that timezone. > > Any objections to the change? Go for it. --Guido van Rossum (home page: http://www.python.org/~guido/) From drifty@alum.berkeley.edu Sat May 10 03:13:35 2003 From: drifty@alum.berkeley.edu (Brett C.) Date: Fri, 09 May 2003 19:13:35 -0700 Subject: [Python-Dev] Make _strptime only time.strptime implementation? In-Reply-To: <200305100035.h4A0Zk812664@pcp02138704pcs.reston01.va.comcast.net> References: <200305091438.h49Ec9K08904@pcp02138704pcs.reston01.va.comcast.net> <200305100035.h4A0Zk812664@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <3EBC604F.7040306@ocf.berkeley.edu> Guido van Rossum wrote: > [Brett] > >>>>I got a single email from someone asking me to change the >>>>functionality so that it would raise an exception if part of the >>>>input string was not parsed. >>> > [Guido van Rossum] > >>>That sounds like a good idea on the face of it. Or will this break >>>existing code? > > > [Brett] > >>Maybe.
If they depend on some specific behavior on a platform that offers >>it, then yes, there could be issues. But since the docs are so vague if >>it does break code it will most likely be because someone didn't follow >>the warnings in the spec. > > > If you add some flag to control this behavior, defaulting to strict, > then at least people who rely on the old (non-strict) behavior can use > the flag rather than redesign their application. > But the problem is that I have no idea what the old behavior is. Since the spec is so vague and open I have no clue what all the various libc versions do. I have just been patching strptime the best I can to handle strange edge cases that pop up and work as people like Kevin need it to. Unless you are suggesting a flag that when set controls whether the Python version or a libc version if available is used, which I guess could work as a transition to get people to move over. Is this what you are getting at, Guido? And if it is, do you want it at the function or module level? I say function, but that is because it would be easier to code. =) >>And while we are on this subject, does anyone have any issues if I cause >>_strptime to recognize UTC and GMT as timezones? The Solaris box I always >>use to do libc strptime comparisons to does not recognize it as an >>acceptable value for %Z, but since it is a known fact that neither have >>daylight savings I feel _strptime should recognize this fact and set the >>daylight savings value to 0 instead of raising an error saying it >>doesn't know about that timezone. >> >>Any objections to the change? > > > Go for it. > Great. Once we have settled on this possible strict flag I will make the change to _strptime. -Brett From mal@lemburg.com Sat May 10 08:32:38 2003 From: mal@lemburg.com (M.-A.
Lemburg) Date: Sat, 10 May 2003 09:32:38 +0200 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <3EBC1581.3060303@v.loewis.de> References: <3EBC0C5B.7030101@lemburg.com> <3EBC1581.3060303@v.loewis.de> Message-ID: <3EBCAB16.3080007@lemburg.com> Martin v. Löwis wrote: > M.-A. Lemburg wrote: > >>> AFAIK, current CVS Python compiles under VC7 now. >> >> >> That's nice :-) I meant: adding features from VC7 to Python. > > That is done as well. There is quite some conditional code that > selects features available only in VC7, such as usage of > getaddrinfo in the socket module. Cool, so the time machine has worked again :-) -- Marc-Andre Lemburg eGenix.com Professional Python Software directly from the Source (#1, May 10 2003) >>> Python/Zope Products & Consulting ... http://www.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ EuroPython 2003, Charleroi, Belgium: 45 days left From mal@lemburg.com Sat May 10 08:35:44 2003 From: mal@lemburg.com (M.-A. Lemburg) Date: Sat, 10 May 2003 09:35:44 +0200 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: References: Message-ID: <3EBCABD0.7050700@lemburg.com> Tim Peters wrote: > [MAL] > >>>>How about adding support for VC7 features in 2.4 > > [Tim] > >>>AFAIK, current CVS Python compiles under VC7 now. > > [MAL] > >>That's nice :-) I meant: adding features from VC7 to Python. > > Umm, could you name one? VC7 is a compiler to me. I don't know what it means to > add a compiler feature to Python, so I transformed the suggestion into one I > understood. There must be some or else why would someone want to buy VC7 (apart from trying to be hype-compliant) ? >>>>and starting the transition in 2.5 ? > >>>I expect that mostly depends on who's doing the work. PLabs Windows >>>development is on auto-pilot (aka benign neglect).
The first person to >>>volunteer time to do anything here gets to set the policy for >>>the next two decades <0.9 wink>. > >>How come ? I always thought that Zope's main deployment platform >>is Windows.... > > I don't know, but doubt it, and Windows users are conspicuous by absence on > the public Zope dev mailing lists. Regardless, Zope strives to be a > platform-neutral application, so I've never been surprised that Zope Corp's > interest in Windows-specific Python work has been undetectable. Interesting. I find that most downloads for our Zope software tend to be for the win32 platform. -- Marc-Andre Lemburg eGenix.com Professional Python Software directly from the Source (#1, May 10 2003) >>> Python/Zope Products & Consulting ... http://www.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ EuroPython 2003, Charleroi, Belgium: 45 days left From martin@v.loewis.de Sat May 10 08:53:26 2003 From: martin@v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=) Date: 10 May 2003 09:53:26 +0200 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <3EBCABD0.7050700@lemburg.com> References: <3EBCABD0.7050700@lemburg.com> Message-ID: "M.-A. Lemburg" writes: > There must be some or else why would someone want to buy VC7 > (apart from trying to be hype-compliant) ? There are many reasons to buy VC7. I assume the typical reasons are - if you never had a Microsoft compiler, and buy one now, it will be 7.x (you may not even get VC6 anymore) - the C++ compiler has much improved - debugging was improved - it includes a more recent Windows SDK, exposing functions available on W2k+ - it supports C# and .NET development Of those reasons, few are relevant for Python, except that some people are now using VC7 exclusively for other reasons, and want a VC7-built python to better integrate their extensions. 
Regards, Martin From guido@python.org Sat May 10 18:42:51 2003 From: guido@python.org (Guido van Rossum) Date: Sat, 10 May 2003 13:42:51 -0400 Subject: [Python-Dev] Make _strptime only time.strptime implementation? In-Reply-To: "Your message of Fri, 09 May 2003 19:13:35 PDT." <3EBC604F.7040306@ocf.berkeley.edu> References: <200305091438.h49Ec9K08904@pcp02138704pcs.reston01.va.comcast.net> <200305100035.h4A0Zk812664@pcp02138704pcs.reston01.va.comcast.net> <3EBC604F.7040306@ocf.berkeley.edu> Message-ID: <200305101742.h4AHgpv13692@pcp02138704pcs.reston01.va.comcast.net> > > [Brett] > > > >>>>I got a single email from someone asking me to change the > >>>>functionality so that it would raise an exception if part of the > >>>>input string was not parsed. > >>> > > [Guido van Rossum] > > > >>>That sounds like a good idea on the face of it. Or will this break > >>>existing code? > > > > > > [Brett] > > > >>Maybe. If they depend on some specific behavior on a platform that offers > >>it, then yes, there could be issues. But since the docs are so vague if > >>it does break code it will most likely be because someone didn't follow > >>the warnings in the spec. > > > > > > If you add some flag to control this behavior, defaulting to strict, > > then at least people who rely on the old (non-strict) behavior can use > > the flag rather than redesign their application. > > > > But the problem is that I have no idea what the old behavior is. Since > the spec is so vague and open I have no clue what all the various libc > versions do. I have just been patching strptime the best I can to > handle strange edge cases that pop up and work as people like Kevin need > it to. OK. Maybe I misunderstood (I've now got to admit that I've never tried strptime myself). 
From your initial message (still quoted above) I thought that it was a simple case of strptime parsing as much as it could and then giving up (sort of like sscanf), and that the suggestion you received was to make it insist on parsing everything or fail. I still think that would be a clear improvement. But if the original situation wasn't as clear-cut, maybe I should have stayed out of this... > Unless you are suggesting a flag that when set controls whether the > Python version or a libc version if available is used, which I guess > could work as a transition to get people to move over. Is this what you > are getting at, Guido? And if it is, do you want it at the function or > module level? I say function, but that is because it would be easier to > code. =) No, that's not what I was going for at all -- I think that would be a mistake that would just cause people to worry needlessly about which strptime version they should use. --Guido van Rossum (home page: http://www.python.org/~guido/) From drifty@alum.berkeley.edu Sat May 10 19:29:07 2003 From: drifty@alum.berkeley.edu (Brett C.) Date: Sat, 10 May 2003 11:29:07 -0700 Subject: [Python-Dev] Make _strptime only time.strptime implementation? In-Reply-To: <200305101742.h4AHgpv13692@pcp02138704pcs.reston01.va.comcast.net> References: <200305091438.h49Ec9K08904@pcp02138704pcs.reston01.va.comcast.net> <200305100035.h4A0Zk812664@pcp02138704pcs.reston01.va.comcast.net> <3EBC604F.7040306@ocf.berkeley.edu> <200305101742.h4AHgpv13692@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <3EBD44F3.9020904@ocf.berkeley.edu> Guido van Rossum wrote: >>>[Brett] >>> >>> >>>>>>I got a single email from someone asking me to change the >>>>>>functionality so that it would raise an exception if part of the >>>>>>input string was not parsed. >>>>> >>>[Guido van Rossum] >>> >>> >>>>>That sounds like a good idea on the face of it. Or will this break >>>>>existing code? >>> >>> >>>[Brett] >>> >>> >>>>Maybe.
If they depend on some specific behavior on a platform that offers >>>>it, then yes, there could be issues. But since the docs are so vague if >>>>it does break code it will most likely be because someone didn't follow >>>>the warnings in the spec. >>> >>> >>>If you add some flag to control this behavior, defaulting to strict, >>>then at least people who rely on the old (non-strict) behavior can use >>>the flag rather than redesign their application. >>> >> >>But the problem is that I have no idea what the old behavior is. Since >>the spec is so vague and open I have no clue what all the various libc >>versions do. I have just been patching strptime the best I can to >>handle strange edge cases that pop up and work as people like Kevin need >>it to. > > > OK. Maybe I misunderstood (I've now got to admit that I've never > tried strptime myself). From your initial message (still quoted > above) I thought that it was a simple case of strptime parsing as much > as it could and then giving up (sort of like sscanf), and that the > suggestion you received was to make it insist on parsing everything or > fail. I still think that would be a clear improvement. But if the > original situation wasn't as clear-cut, maybe I should have stayed out > of this... > I wasn't clear enough. I already patched strptime to raise an error if there is anything left that was not parsed (my first CVS checkin actually); this functionality is already there. So I think we just talked ourselves in a circle. =) > >>Unless you are suggesting a flag that when set controls whether the >>Python version or a libc version if available is used, which I guess >>could work as a transition to get people to move over. Is this what you >>are getting at, Guido? And if it is, do you want it at the function or >>module level? I say function, but that is because it would be easier to >>code. 
=) > > > No, that's not what I was going for at all -- I think that would be a > mistake that would just cause people to worry needlessly about which > strptime version they should use. > Well, now that I think we have the whole strict parsing cleared up, I assume we don't need this anymore. Are there any other worries? -Brett From tim@zope.com Sun May 11 03:46:18 2003 From: tim@zope.com (Tim Peters) Date: Sat, 10 May 2003 22:46:18 -0400 Subject: [Python-Dev] Make _strptime only time.strptime implementation? In-Reply-To: Message-ID: [Kevin Jacobs, on strptime] > I've just retested with my original code and it does look like Brett has > indeed fixed it.
Or at least fixed it to the point that mktime doesn't > croak on Linux, Solaris, Tru64, and IRIX with our app. Great! Hats off to Brett. >> ... the problem is that how strptime behaves varies in fact across >> boxes. > Or more importantly that strptime is now standardized in Python, while > mktime is not. Ya, that one's a real problem. The new-in-2.3 datetime module supplies a saner way to deal with dates & times, but is new, and is probably lacking some features some people need. The problem with mktime() is that Python also wants nice ways to play with random C libraries on your platform, and platform mktime() implementations are *really* different (they vary in their beliefs about when "the epoch" begins, what the first representable year is, what the last representable year is, and whether leap seconds exist; POSIX gives clear answers to the first and last, explicitly gives up on the middle two, and not all Python platforms try to follow POSIX anyway). So I expect mktime() will remain a cross-platform mess forever -- else Python wouldn't play nice with the mess that is your platform <0.9 wink>. > Given that my previous problems with the Python strptime have been > addressed, I am now +1 on using it (although I'm still going to > avoid it and mktime in my code as much as possible). Unfortunately, datetime doesn't supply a wholly sane way to do strftime yet, and no way to do strptime. The ISO formats are very easy (by design) to parse, so those might be best to use in portable code.
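[Archive note: the strict-parsing behavior discussed in this thread, and the ISO formats Tim recommends for portable code, can be sketched in modern Python. This is an illustration only, written long after the thread; `datetime.fromisoformat` is Python 3.7+ and did not exist in 2003.]

```python
import time
from datetime import datetime

# Strict parsing: time.strptime raises ValueError if any part of the
# input string is left unparsed -- the behavior Brett checked in.
try:
    time.strptime("2003-05-10 leftover", "%Y-%m-%d")
    strict = False
except ValueError:
    strict = True

# ISO 8601 timestamps sidestep platform mktime() quirks entirely;
# datetime.fromisoformat (Python 3.7+) parses them portably.
moment = datetime.fromisoformat("2003-05-10T22:46:18")
```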
From skip@mojam.com Sun May 11 13:00:24 2003 From: skip@mojam.com (Skip Montanaro) Date: Sun, 11 May 2003 07:00:24 -0500 Subject: [Python-Dev] Weekly Python Bug/Patch Summary Message-ID: <200305111200.h4BC0Ot21778@manatee.mojam.com> Bug/Patch Summary ----------------- 415 open / 3632 total bugs (-7) 137 open / 2144 total patches (+2) New Bugs -------- bsddb.*open mode should default to 'r' rather than 'c' (2003-05-05) http://python.org/sf/732951 Need an easy way to check the version (2003-05-06) http://python.org/sf/733231 kwargs handled incorrectly (2003-05-06) http://python.org/sf/733667 PackMan recursive/force fails on pseudo packages (2003-05-07) http://python.org/sf/733819 Function for creating/extracting CoreFoundation types (2003-05-08) http://python.org/sf/734695 telnetlib.read_until: float req'd for timeout (2003-05-08) http://python.org/sf/734806 pyxml setup error on Mac OS X (2003-05-08) http://python.org/sf/734844 Lambda functions in list comprehensions (2003-05-08) http://python.org/sf/734869 Mach-O gcc optimisation flag can boost performance up to 10% (2003-05-09) http://python.org/sf/735110 urllib2 parse_http_list wrong return (2003-05-09) http://python.org/sf/735248 FILEMODE not honoured (2003-05-09) http://python.org/sf/735274 Command line timeit.py sets sys.path badly (2003-05-09) http://python.org/sf/735293 urllib / urllib2 should cache 301 redirections (2003-05-09) http://python.org/sf/735515 cStringIO.StringIO (2003-05-09) http://python.org/sf/735535 libwinsound.tex is missing MessageBeep() description (2003-05-10) http://python.org/sf/735674 New Patches ----------- build of html docs broken (liboptparse.tex) (2003-05-04) http://python.org/sf/732174 Docs for test package (2003-05-04) http://python.org/sf/732394 Allows os.forkpty to work on more platforms (Solaris!) 
(2003-05-04) http://python.org/sf/732401 Make Tkinter.py's nametowidget work with cloned menu widgets (2003-05-07) http://python.org/sf/734176 time.tzset documentation (2003-05-08) http://python.org/sf/735051 Python2.3b1 makefile improperly installs IDLE (2003-05-10) http://python.org/sf/735613 Python makefile may install idle in the wrong place (2003-05-10) http://python.org/sf/735614 Pydoc.py fixes links (2003-05-10) http://python.org/sf/735694 Closed Bugs ----------- textwrap has problems wrapping hyphens (2002-08-17) http://python.org/sf/596434 httplib HEAD request fails - keepalive (2002-10-11) http://python.org/sf/622042 new.function ignores keyword arguments (2003-02-25) http://python.org/sf/692959 Mention gmtime in Chapter 6.9 "Time access and conversions" (2003-03-05) http://python.org/sf/697983 Clarify timegm documentation (2003-03-05) http://python.org/sf/697986 Clarify daylight variable meaning (2003-03-05) http://python.org/sf/697988 Clarify mktime semantics (2003-03-05) http://python.org/sf/697989 Problems building python with tkinter on HPUX... 
(2003-03-17) http://python.org/sf/704919 OpenBSD 3.2: make altinstall dumps core (2003-03-29) http://python.org/sf/712056 cPickle fails to pickle inf (2003-04-03) http://python.org/sf/714733 urlopen(url_to_a_non-existing-domain) raises gaierror (2003-04-18) http://python.org/sf/723831 textwrap.wrap infinite loop (2003-04-23) http://python.org/sf/726446 use bsddb185 if necessary in dbhash (2003-04-24) http://python.org/sf/727137 email parsedate still wrong (PATCH) (2003-04-25) http://python.org/sf/727719 Tools/msgfmt.py results in two warnings under Python 2.3b1 (2003-04-26) http://python.org/sf/728277 setup.py breaks during build of Python-2.3b1 (2003-04-27) http://python.org/sf/728322 Long file names in osa suites (2003-04-27) http://python.org/sf/728574 ConfigurePython gives depreaction warning (2003-04-27) http://python.org/sf/728608 Unexpected Changes in list Iterator (2003-04-30) http://python.org/sf/730296 HTTPRedirectHandler variable out of scope (2003-05-01) http://python.org/sf/730963 urllib2 raises AttributeError on redirect (2003-05-01) http://python.org/sf/731116 IDE "lookup in documentation" doesn't work in interactive wi (2003-05-02) http://python.org/sf/731643 GIL not released around getaddrinfo() (2003-05-02) http://python.org/sf/731644 Closed Patches -------------- textwrap.dedent, inspect.getdoc-ish (2002-08-21) http://python.org/sf/598163 release GIL around getaddrinfo() (2002-09-03) http://python.org/sf/604210 Allow more Unicode on sys.stdout (2002-09-21) http://python.org/sf/612627 MSVC 7.0 compiler support (2002-09-25) http://python.org/sf/614770 Port tests to unittest (2003-01-05) http://python.org/sf/662807 Optimize dictionary resizing (2003-01-20) http://python.org/sf/671454 Dictionary tuning (2003-04-29) http://python.org/sf/729395 assert from longobject.c, line 1215 (2003-04-30) http://python.org/sf/730594 From tim@multitalents.net Sun May 11 19:49:13 2003 From: tim@multitalents.net (Tim Rice) Date: Sun, 11 May 2003 11:49:13 -0700 (PDT) 
Subject: [Python-Dev] patch 718286 Message-ID: It would be nice to see patch 718286 (DESTDIR variable patch) applied. It would make package builder's life easier. -- Tim Rice Multitalents (707) 887-1469 tim@multitalents.net From martin@v.loewis.de Sun May 11 21:53:33 2003 From: martin@v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=) Date: 11 May 2003 22:53:33 +0200 Subject: [Python-Dev] patch 718286 In-Reply-To: References: Message-ID: Tim Rice writes: > It would be nice to see patch 718286 (DESTDIR variable patch) applied. > It would make package builder's life easier. Done. Martin From drifty@alum.berkeley.edu Mon May 12 00:33:52 2003 From: drifty@alum.berkeley.edu (Brett C.) Date: Sun, 11 May 2003 16:33:52 -0700 Subject: [Python-Dev] Need some patches checked Message-ID: <3EBEDDE0.3040308@ocf.berkeley.edu> Since I am trying to tackle patches that were not written by me for the first time I need someone to check that I am doing the right thing. http://www.python.org/sf/649742 is a patch to make adding headers to urllib2's Request object have consistent case. I cleaned up the patch and everything seems reasonable and I don't see how doing this will hurt backwards-compatibility short of code that tried to add multiple headers of the same name with different case which is not legal anyway for HTTP. http://www.python.org/sf/639139 is a patch wanting to remove an isinstance assertion. Raymond initially suggested weakening the assertion to doing attribute checks. I personally see no reason we can't just take the check out entirely since the code does not appear to have any place where it will mask an AttributeError exception and the comment for the assert says it is just for checking the interface. But since Raymond initially wanted to go another direction I need someone to step in and give me some advice (or Raymond can look at it again; patch is old). -Brett From guido@python.org Mon May 12 01:20:18 2003 From: guido@python.org (Guido van Rossum) Date: Sun, 11 May 2003 20:20:18 -0400 Subject: [Python-Dev] Need some patches checked In-Reply-To: "Your message of Sun, 11 May 2003 16:33:52 PDT." <3EBEDDE0.3040308@ocf.berkeley.edu> References: <3EBEDDE0.3040308@ocf.berkeley.edu> Message-ID: <200305120020.h4C0KIm29011@pcp02138704pcs.reston01.va.comcast.net> > Since I am trying to tackle patches that were not written by me for the > first time I need someone to check that I am doing the right thing. > > http://www.python.org/sf/649742 is a patch to make adding headers to > urllib2's Request object have consistent case.
I cleaned up the patch > and everything seems reasonable and I don't see how doing this will hurt > backwards-compatibility short of code that tried to add multiple headers > of the same name with different case which is not legal anyway for HTTP. Good! I just noticed with disgust that the headers dict is currently case-sensitive, so that if I want to change the Content-type header, I have to use the exact case used in the source. I can't imagine b/w compatibility issues with this. > http://www.python.org/sf/639139 is a patch wanting to remove an > isinstance assertion. Raymond initially suggested weakening the > assertion to doing attribute checks. I personally see no reason we > can't just take the check out entirely since the code does not appear to > have any place where it will mask an AttributeError exception and the > comment for the assert says it is just for checking the interface. But > since Raymond initially wanted to go another direction I need someone to > step in and give me some advice (or Raymond can look at it again; patch > is old). The advantage of the assert (or some other check) is to catch a type error early, rather than 4 call levels deeper, where the source of the AttributeError may not be obvious when it happens. But I agree that that is a minor issue, and for correct code removing the assert is fine. Checking exactly for the attributes that are (or may be) used is probably overly expensive. --Guido van Rossum (home page: http://www.python.org/~guido/) From greg@cosc.canterbury.ac.nz Mon May 12 01:38:01 2003 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Mon, 12 May 2003 12:38:01 +1200 (NZST) Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <16E1010E4581B049ABC51D4975CEDB880113DAD1@UKDCX001.uk.int.atosorigin.com> Message-ID: <200305120038.h4C0c1F23070@oma.cosc.canterbury.ac.nz> > Specifically, mingw exploits a specific clause in the GPL which allows > dependencies on "components of the OS". MSVCRT qualifies here, but
MSVCRT qualifies here, but > MSVCR7 doesn't. But surely any GPL issues with using mingw apply only to libraries that mingw *itself* depends on, and then only if one is redistributing a work derived from mingw -- not to anything *created* with mingw? Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From gh@ghaering.de Mon May 12 02:26:38 2003 From: gh@ghaering.de (=?ISO-8859-1?Q?Gerhard_H=E4ring?=) Date: Mon, 12 May 2003 03:26:38 +0200 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <200305120038.h4C0c1F23070@oma.cosc.canterbury.ac.nz> References: <200305120038.h4C0c1F23070@oma.cosc.canterbury.ac.nz> Message-ID: <3EBEF84E.4090704@ghaering.de> Greg Ewing wrote: >>Specifically, mingw exploits a specific clause in the GPL which allows >>dependencies on "components of the OS". MSVCRT qualifies here, but >>MSVCR7 doesn't. > > But surely any GPL issues with using mingw apply only > to libraries that mingw *itself* depends on, and then > only if one is redistributing a work derived from > mingw -- not to anything *created* with mingw? That's my interpretation of the GPL, as well. -- Gerhard From tim.one@comcast.net Mon May 12 02:47:33 2003 From: tim.one@comcast.net (Tim Peters) Date: Sun, 11 May 2003 21:47:33 -0400 Subject: [Python-Dev] Dictionary sparseness In-Reply-To: <20030505201416.GB17384@barsoom.org> Message-ID: [Agthorr] > An alternate optimization would be the additional of an immutable > dictionary type to the language, initialized from a mutable dictionary > type. Upon creation, this dictionary would optimize itself, in a > manner similar to "gperf" program which creates (nearly) minimal > zero-collision hash tables. Possibly, but it's fraught with difficulties. 
For example, Python dicts can be indexed by lots of things besides 8-bit strings, and you generally need to know a great deal about the internal structure of a key type to generate a sensible hash function. A more fundamental problem is that minimality can be harmful when failing lookups are frequent: a sparse table has a good chance of hitting a null entry immediately then, but a minimal table never does. In the former case full-blown key comparison can be skipped when a null entry is hit, in the latter case full-blown key comparison is always needed on a failing lookup. For symbol table apps, Bentley & Sedgewick rehabilitated the idea of ternary search trees a few years ago, and I think you'd enjoy reading their papers: http://www.cs.princeton.edu/~rs/strings/ In particular, they're faster than hashing in the failing-lookup case. From tim.one@comcast.net Mon May 12 03:38:25 2003 From: tim.one@comcast.net (Tim Peters) Date: Sun, 11 May 2003 22:38:25 -0400 Subject: [Python-Dev] os.path.walk() lacks 'depth first' option In-Reply-To: <2mist7nd62.fsf@starship.python.net> Message-ID: [Jeremy Fincher] >>> On another related front, sets (in my Python 2.3a2) raise KeyError on a >>> .remove(elt) when elt isn't in the set. Since sets aren't mappings, >>> should that be a ValueError (like list raises) instead? [Tim] >> Since sets aren't sequences either, why should sets raise the >> same exception lists raise? It's up to the type to use whichever >> fool exceptions it chooses. This doesn't always make life easy for >> users, alas -- there's not much consistency in exception behavior >> across packages. In this case, a user would be wise to avoid >> expecting IndexError or KeyError, and catch their common base class >> (LookupError) instead. The distinction between IndexError and KeyError >> isn't really useful (IMO; LookupError was injected as a base class >> recently in Python's life). [Michael Hudson] > Without me noticing, too! 
Well, I knew there was a lookup error that > you get when failing to find a codec, but I didn't know IndexError and > KeyError derived from it... > > Also note that Jeremy was suggesting *ValueError*, not IndexError... Oops! So he was -- I spaced out on that. > that any kind of index-or-key-ing is going on is trivia of the > implementation, surely? Sure. I don't care for ValueError in this context, though -- there's nothing wrong with the value I'm testing for set membership, after all. Of course I never cared for ValueError on a failing list.remove() either. I like ValueError best when an input is of the right type but outside the defined domain of a function, like math.sqrt(-1.0) or chr(500). Failing to find something feels more like a (possibly proper subclass of) LookupError to me. But I'd hate to create even more useless distinctions among different kinds of lookup failures, so am vaguely happy reusing the KeyError flavor of LookupError. In any case, I'm not unhappy enough with it to do something about it. I nevertheless agree Jerry raised a good point, and maybe somebody else is unhappy enough with it to change it? From agthorr@barsoom.org Mon May 12 04:28:18 2003 From: agthorr@barsoom.org (Agthorr) Date: Sun, 11 May 2003 20:28:18 -0700 Subject: [Python-Dev] Dictionary sparseness In-Reply-To: References: <20030505201416.GB17384@barsoom.org> Message-ID: <20030512032817.GA31824@barsoom.org> On Sun, May 11, 2003 at 09:47:33PM -0400, Tim Peters wrote: > Possibly, but it's fraught with difficulties. For example, Python dicts can > be indexed by lots of things besides 8-bit strings, and you generally need > to know a great deal about the internal structure of a key type to generate > a sensible hash function. > A more fundamental problem is that minimality can be harmful when > failing lookups are frequent: a sparse table has a good chance of > hitting a null entry immediately then, but a minimal table never > does. 
In the former case full-blown key comparison can be skipped > when a null entry is hit, in the latter case full-blown key > comparison is always needed on a failing lookup. Both good observations. > For symbol table apps, Bentley & Sedgewick rehabilitated the idea of ternary > search trees a few years ago, and I think you'd enjoy reading their papers: > > http://www.cs.princeton.edu/~rs/strings/ > > In particular, they're faster than hashing in the failing-lookup case. hhmmm.. yes, those are interesting. Thanks :-) A few months ago I implemented suffix trees for fun and practice. Suffix trees are based on tries, and I used a binary-tree for each node to keep track of its children (which the papers point out is an equivalent way of doing ternary trees). (Suffix trees let you input a set of strings of total length n. This has a cost of O(n) time and O(n) memory. Then, you can look to see if a string of length m is a substring of any of the strings in the set in O(m) time; this is impressive since the number and size of the set of strings only matters for the setup operation; it has no effect on the lookup speed whatsoever.) Ternary search trees seem like a good approach for string-only dictionaries. These seem like an inelegant optimization that might yield performance improvements for places where non-string keys are syntactically disallowed anyway (such as the members of a class or module). 
-- Agthorr From jack@performancedrivers.com Mon May 12 07:16:57 2003 From: jack@performancedrivers.com (Jack Diederich) Date: Mon, 12 May 2003 02:16:57 -0400 Subject: [Python-Dev] Dictionary sparseness In-Reply-To: ; from tim.one@comcast.net on Sun, May 11, 2003 at 09:47:33PM -0400 References: <20030505201416.GB17384@barsoom.org> Message-ID: <20030512021657.C951@localhost.localdomain> On Sun, May 11, 2003 at 09:47:33PM -0400, Tim Peters wrote: > [Agthorr] > > An alternate optimization would be the addition of an immutable > > dictionary type to the language, initialized from a mutable dictionary > > type. Upon creation, this dictionary would optimize itself, in a > > manner similar to the "gperf" program which creates (nearly) minimal > > zero-collision hash tables. > > For symbol table apps, Bentley & Sedgewick rehabilitated the idea of ternary > search trees a few years ago, and I think you'd enjoy reading their papers: > > http://www.cs.princeton.edu/~rs/strings/ > > In particular, they're faster than hashing in the failing-lookup case. They nest well too. And you can do some caching if the higher-level trees are unchanging (local scope can shortcut into builtins). I have a pure-python ternary tree and a C w/python wrappers of ternary trees lying around. They were written with symbol tables in mind; I haven't touched em since my presentation proposal on the topic [ternary trees in general, replacing python symbol dict w/ t-trees as the closing example] was declined for the Portland OReilly thingy (bruised ego, sour grapes, et al). Cut-n-paste from an off-list mail for this undying thread below. Hettinger's idea of treaps is a good one. A ternary-treap would also be possible. -jack [Raymond] > My thought is to use a treap. The binary search side would scan the > hash values while the heap part would organize from most frequent to > least frequently accessed key. It could even be dynamic and re-arrange > the heap according to usage patterns.
[me] treaps would probably be a better fit than ternary trees, especially for builtins for the reasons you mention. A good default ordering would go a long way. [me, about ternary trees] They nest nicely, a valid 'next' node can be another ternary tree, so pseudo code for import would be newmodule = __import__('mymodule') # assume __module_symdict__ is the module's symbol table __module_symdict__['mymodule.'] = newmodule.__module_symdict__ a lookup for 'mymodule.some_function' would happily run from the current module's tree into the 'mymodule' tree. The '.' separator would only remain special from a user's point of view. If symbols don't share leading characters, ternary trees are just binary trees that require additional bookkeeping. This is probably the case, so ternary trees become less neat [even if they do make for prettier pictures]. From drifty@alum.berkeley.edu Mon May 12 09:09:12 2003 From: drifty@alum.berkeley.edu (Brett C.) Date: Mon, 12 May 2003 01:09:12 -0700 Subject: [Python-Dev] Random SF tracker ettiquete questions Message-ID: <3EBF56A8.8090603@ocf.berkeley.edu> First, do we care about closing RFEs? I realized that Skip does not keep count of them in his weekly summary so I am not sure how much we care about them. Should I waste my time wading through them to close them? Second, when is it okay to reassign a tracker item to yourself or close an item that is assigned to another person? I ask this because Fred has some patches assigned to him that I think I can close myself, but I don't want to step on his toes since they are assigned to him. Third, when does someone warrant being mentioned in the ACKS.txt file? Only when they have done some significant body of work? Or does committing even a one-line patch warrant inclusion? -Brett P.S.: Python got to #6 on SF's most active projects. Maybe I am overdoing the comments on patches.
=) From walter@livinglogic.de Mon May 12 10:56:25 2003 From: walter@livinglogic.de (Walter Dörwald) Date: Mon, 12 May 2003 11:56:25 +0200 Subject: [Python-Dev] Dictionary sparseness In-Reply-To: References: Message-ID: <3EBF6FC9.3090805@livinglogic.de> Tim Peters wrote: > [...] > For symbol table apps, Bentley & Sedgewick rehabilitated the idea of ternary > search trees a few years ago, and I think you'd enjoy reading their papers: > > http://www.cs.princeton.edu/~rs/strings/ > > In particular, they're faster than hashing in the failing-lookup case. The digital search tries mentioned in the article seem to use the same fundamental approach as state machines, i.e. while traversing the string, remember the string prefix that has already been recognized. Digital search tries traverse the tree and the memory is in the path that has been traversed. State machines traverse a transition table and the memory is the current state. Digital search tries seem to be easy to update, while state machines are not. Has anybody tried state machines for symbol tables in Python? The size of the transition table might be a problem and any attempt to reduce the size might kill performance in the inner loop. Performance-wise, stringobject.c/string_hash() is hard to beat (especially when the hash value is already cached). Bye, Walter Dörwald From mwh@python.net Mon May 12 11:35:25 2003 From: mwh@python.net (Michael Hudson) Date: Mon, 12 May 2003 11:35:25 +0100 Subject: [Python-Dev] Random SF tracker ettiquete questions In-Reply-To: <3EBF56A8.8090603@ocf.berkeley.edu> ("Brett C."'s message of "Mon, 12 May 2003 01:09:12 -0700") References: <3EBF56A8.8090603@ocf.berkeley.edu> Message-ID: <2m3cjk78r6.fsf@starship.python.net> "Brett C." writes: > First, do we care about closing RFEs? I realized that Skip does not > keep count of them in his weekly summary so I am not sure how much we > care about them. Should I waste my time wading through them to close > them?
I remember Martin pointing out that we should prioritize patches over bug reports, as for a patch someone has actually put some work in. By this light RFEs are at the bottom of the pile. > Second, when is it okay to reassign a tracker item to yourself or > close an item that is assigned to another person? I ask this because > Fred has some patches assigned to him that I think I can close myself, > but I don't want to step on his toes since they are assigned to him. All doc bugs get assigned to Fred by default, IIRC. This means he probably doesn't feel too attached to them... > Third, when does someone warrant being mentioned in the ACKS.txt file? > Only when they have done some significant body of work? Or does > committing even a one-line patch warrant inclusion? I err on the side of adding people. It's a judgement call. IMHO pointing out a typo in the docs isn't sufficient, but just about anything that involves thinking is. > P.S.: Python got to #6 on SF's most active projects. Maybe I am > overdoing the comments on patches. =) Nonsense! Cheers, M. -- Good? Bad? Strap him into the IETF-approved witch-dunking apparatus immediately! -- NTK now, 21/07/2000 From fdrake@acm.org Mon May 12 11:54:11 2003 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Mon, 12 May 2003 06:54:11 -0400 Subject: [Python-Dev] Random SF tracker ettiquete questions In-Reply-To: <3EBF56A8.8090603@ocf.berkeley.edu> References: <3EBF56A8.8090603@ocf.berkeley.edu> Message-ID: <16063.32083.158258.152293@grendel.zope.com> Brett C. writes: > Second, when is it okay to reassign a tracker item to yourself or close > an item that is assigned to another person? I ask this because Fred has > some patches assigned to him that I think I can close myself, but I > don't want to step on his toes since they are assigned to him. If you have the time to review them, please feel free to reassign to yourself. I'm really busy with other projects at the moment, so I'm unlikely to get to them real soon. Thanks!
-Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From guido@python.org Mon May 12 12:58:19 2003 From: guido@python.org (Guido van Rossum) Date: Mon, 12 May 2003 07:58:19 -0400 Subject: [Python-Dev] os.path.walk() lacks 'depth first' option In-Reply-To: "Your message of Sun, 11 May 2003 22:38:25 EDT." References: Message-ID: <200305121158.h4CBwKD29921@pcp02138704pcs.reston01.va.comcast.net> > I like ValueError best when an input is of the right type but > outside the defined domain of a function, like math.sqrt(-1.0) or > chr(500). Failing to find something feels more like a (possibly > proper subclass of) LookupError to me. Yeah, [].remove(42) raising ValueError is a bit weird. It was put in before we had the concept of LookupError, and the rationale for using ValueError was that the *value* is not found -- can't use IndexError because the value is chosen from a different set than the index, can't use KeyError because lists don't have a concept of key. In retrospect, it would have been better to define a SearchError, subclassing LookupError. OTOH there's something to say for fewer errors, not more; e.g. sometimes I wish AttributeError and TypeError were unified, because AttributeError usually means that an object isn't of the expected type. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Mon May 12 13:11:22 2003 From: guido@python.org (Guido van Rossum) Date: Mon, 12 May 2003 08:11:22 -0400 Subject: [Python-Dev] Random SF tracker ettiquete questions In-Reply-To: "Your message of Mon, 12 May 2003 01:09:12 PDT." <3EBF56A8.8090603@ocf.berkeley.edu> References: <3EBF56A8.8090603@ocf.berkeley.edu> Message-ID: <200305121211.h4CCBNK29988@pcp02138704pcs.reston01.va.comcast.net> > First, do we care about closing RFEs? I realized that Skip does not > keep count of them in his weekly summary so I am not sure how much > we care about them. Should I waste my time wading through them to > close them? 
I don't think that's a waste of time. It's good to keep track of how much progress we make in any dimension. > Second, when is it okay to reassign a tracker item to yourself or > close an item that is assigned to another person? I ask this > because Fred has some patches assigned to him that I think I can > close myself, but I don't want to step on his toes since they are > assigned to him. All doc issues are automatically assigned to Fred (maybe this needs to be revised); I don't think he'll be offended if you take some work off his chest. > Third, when does someone warrant being mentioned in the ACKS.txt > file? Only when they have done some significant body of work? Or > does committing even a one-line patch warrant inclusion? I tend to add people to Misc/ACKS for any code contribution whatsoever, including one-liners. > -Brett > > P.S.: Python got to #6 on SF's most active projects. Maybe I am > overdoing the comments on patches. =) No, please keep it up. :-) --Guido van Rossum (home page: http://www.python.org/~guido/) From fdrake@acm.org Mon May 12 14:47:27 2003 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Mon, 12 May 2003 09:47:27 -0400 Subject: [Python-Dev] Random SF tracker ettiquete questions In-Reply-To: <200305121211.h4CCBNK29988@pcp02138704pcs.reston01.va.comcast.net> References: <3EBF56A8.8090603@ocf.berkeley.edu> <200305121211.h4CCBNK29988@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <16063.42479.84510.486834@grendel.zope.com> Guido van Rossum writes: > All doc issues are automatically assigned to Fred (maybe this needs to > be revised); I don't think he'll be offended if you take some work off > his chest. Is this still the case? I'm fairly certain I changed that, given the amount of time I haven't been able to spend over the past several months. (I'm pretty sure I actually changed that last year... I'd check but the SF website is down.) -Fred -- Fred L. Drake, Jr.
PythonLabs at Zope Corporation From pedronis@bluewin.ch Mon May 12 15:48:01 2003 From: pedronis@bluewin.ch (Samuele Pedroni) Date: Mon, 12 May 2003 16:48:01 +0200 Subject: [Python-Dev] codeop: small details (Q); commit priv request Message-ID: <5.2.1.1.0.20030512140727.02362ab0@localhost> 1) Python 2.3b1 (#40, Apr 25 2003, 19:06:24) [MSC v.1200 32 bit (Intel)] on win32 Type "help", "copyright", "credits" or "license" for more information. >>> import codeop >>> codeop.compile_command("",symbol="eval") Traceback (most recent call last): File "", line 1, in ? File "s:\transit\py23\lib\codeop.py", line 129, in compile_command return _maybe_compile(_compile, source, filename, symbol) File "s:\transit\py23\lib\codeop.py", line 106, in _maybe_compile raise SyntaxError, err1 File "", line 1 pass ^ SyntaxError: invalid syntax the error is basically an artifact of the logic that enforces: compile_command("",symbol="single") === compile_command("pass",symbol="single") (this makes typing enter immediately after the prompt at a simulated shell a nop as expected) I would expect compile_command("",symbol="eval") to return None, i.e. to simply signal an incomplete expression (that is what would happen if the code for the "eval" case avoided the cited logic). 2) symbol = "exec" is silently accepted but the documentation intentionally only refers to "eval" and "single" as valid values for symbol. Maybe a ValueError should be raised. Context: I was working on improving Jython codeop compatibility with CPython codeop.
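For contrast, the documented symbol="single" behaviour Samuele describes is easy to see with the public compile_command() API alone (a quick sketch; the contested "eval" case from the traceback above is deliberately omitted):

```python
import codeop

# Hitting Enter at an empty simulated prompt: with the default
# symbol="single", "" is treated as "pass", so a real (no-op) code
# object comes back rather than None.
code = codeop.compile_command("")
assert code is not None

# A syntactically incomplete statement signals "keep typing" by
# returning None -- this is what drives the "..." continuation prompt.
assert codeop.compile_command("if x:") is None
```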
Btw, as considered here by Guido http://sourceforge.net/tracker/index.php?func=detail&aid=645404&group_id=5470&atid=305470 I would ask to have commit privileges for CPython regards From tino.lange@isg.de Mon May 12 15:49:37 2003 From: tino.lange@isg.de (Tino Lange) Date: Mon, 12 May 2003 16:49:37 +0200 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <200305092032.h49KW5B11937@pcp02138704pcs.reston01.va.comcast.net> References: <200305092032.h49KW5B11937@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <3EBFB481.4070907@isg.de> Hi! It's still not clear to me after reading this thread: Besides the optimizer - is it really possible to build a python.exe that *doesn't* depend on the .NET framework with the free download combination of ".NET 1.1" / "latest SDK". Of course you can build it with the VC7.1 Pro, there's a framework-compiler and a standalone vc++-compiler included. But also in the free download edition? I thought this is only the several framework compilers? Thanks for giving me a hint. Best regards Tino From barry@python.org Mon May 12 16:15:14 2003 From: barry@python.org (Barry Warsaw) Date: 12 May 2003 11:15:14 -0400 Subject: [Python-Dev] codeop: small details (Q); commit priv request In-Reply-To: <5.2.1.1.0.20030512140727.02362ab0@localhost> References: <5.2.1.1.0.20030512140727.02362ab0@localhost> Message-ID: <1052752514.22883.16.camel@barry> On Mon, 2003-05-12 at 10:48, Samuele Pedroni wrote: > Btw, as considered here by Guido > http://sourceforge.net/tracker/index.php?func=detail&aid=645404&group_id=5470&atid=305470 > I would ask to have commit privileges for CPython Done! 
-Barry From mwh@python.net Mon May 12 16:22:22 2003 From: mwh@python.net (Michael Hudson) Date: Mon, 12 May 2003 16:22:22 +0100 Subject: [Python-Dev] codeop: small details (Q); commit priv request In-Reply-To: <5.2.1.1.0.20030512140727.02362ab0@localhost> (Samuele Pedroni's message of "Mon, 12 May 2003 16:48:01 +0200") References: <5.2.1.1.0.20030512140727.02362ab0@localhost> Message-ID: <2mvfwg5gwh.fsf@starship.python.net> Samuele Pedroni writes: > 1) > > Python 2.3b1 (#40, Apr 25 2003, 19:06:24) [MSC v.1200 32 bit (Intel)] on win32 > Type "help", "copyright", "credits" or "license" for more information. > >>> import codeop > >>> codeop.compile_command("",symbol="eval") > Traceback (most recent call last): > File "", line 1, in ? > File "s:\transit\py23\lib\codeop.py", line 129, in compile_command > return _maybe_compile(_compile, source, filename, symbol) > File "s:\transit\py23\lib\codeop.py", line 106, in _maybe_compile > raise SyntaxError, err1 > File "", line 1 > pass > ^ > SyntaxError: invalid syntax > > > the error is basically an artifact of the logic that enforces: > > compile_command("",symbol="single") === compile_command("pass",symbol="single") > > (this makes typing enter immediately after the prompt at a simulated > shell a nop as expected) > > I would expect > > compile_command("",symbol="eval") > > to return None, i.e. to simply signal an incomplete expression (that > is what would happen if the code for "eval" case would avoid the cited > logic). OK, but I think you should preserve the existing behaviour for symbol="single". Cheers, M. -- I also feel it essential to note, [...], that Description Logics, non-Monotonic Logics, Default Logics and Circumscription Logics can all collectively go suck a cow. Thank you. 
-- http://advogato.org/person/Johnath/diary.html?start=4 From harri.pasanen@trema.com Mon May 12 17:13:36 2003 From: harri.pasanen@trema.com (Harri Pasanen) Date: Mon, 12 May 2003 18:13:36 +0200 Subject: [Python-Dev] Python 2.3b1 _XOPEN_SOURCE value from configure.in Message-ID: <200305121813.36383.harri.pasanen@trema.com> Seems that Python 2.3b1 hardcodes the value of _XOPEN_SOURCE to 600 on Solaris. In configure.in: if test $define_xopen_source = yes then AC_DEFINE(_XOPEN_SOURCE, 600, Define to the level of X/Open that your system supports) Now the correct value for Solaris 2.7 in our case is 500, which is defined in the system headers, and boost config picks up the correct value. So when compiling boost-python, there are a zillion warning messages about redefinition of _XOPEN_SOURCE. Is this a problem people are aware of, and is someone fixing it as I write, or is a volunteer needed? -Harri From pedronis@bluewin.ch Mon May 12 17:31:13 2003 From: pedronis@bluewin.ch (Samuele Pedroni) Date: Mon, 12 May 2003 18:31:13 +0200 Subject: [Python-Dev] codeop: small details (Q); commit priv request In-Reply-To: <1052752514.22883.16.camel@barry> References: <5.2.1.1.0.20030512140727.02362ab0@localhost> <5.2.1.1.0.20030512140727.02362ab0@localhost> Message-ID: <5.2.1.1.0.20030512183026.01cef8c8@localhost> At 11:15 12.05.2003 -0400, Barry Warsaw wrote: >On Mon, 2003-05-12 at 10:48, Samuele Pedroni wrote: > > > Btw, as considered here by Guido > > > http://sourceforge.net/tracker/index.php?func=detail&aid=645404&group_id=5470&atid=305470 > > I would ask to have commit privileges for CPython >Done! >-Barry Thanks.
From pedronis@bluewin.ch Mon May 12 17:34:21 2003 From: pedronis@bluewin.ch (Samuele Pedroni) Date: Mon, 12 May 2003 18:34:21 +0200 Subject: [Python-Dev] codeop: small details (Q); commit priv request In-Reply-To: <2mvfwg5gwh.fsf@starship.python.net> References: <5.2.1.1.0.20030512140727.02362ab0@localhost> <5.2.1.1.0.20030512140727.02362ab0@localhost> Message-ID: <5.2.1.1.0.20030512183320.01d012c8@localhost> At 16:22 12.05.2003 +0100, Michael Hudson wrote: > > > > I would expect > > > > compile_command("",symbol="eval") > > > > to return None, i.e. to simply signal an incomplete expression (that > > is what would happen if the code for "eval" case would avoid the cited > > logic). > >OK, but I think you should preserve the existing behaviour for >symbol="single". Of course, I didn't mean otherwise. I can prepare a patch and I have also a somewhat beefed-up test_codeop. From tim.one@comcast.net Mon May 12 21:58:27 2003 From: tim.one@comcast.net (Tim Peters) Date: Mon, 12 May 2003 16:58:27 -0400 Subject: [Python-Dev] Dictionary sparseness In-Reply-To: <3EBF6FC9.3090805@livinglogic.de> Message-ID: [Walter Dörwald] > ... > Has anybody tried state machines for symbol tables in Python? Not that I know of. > The size of the transition table might be a problem and any attempt > to reduce the size might kill performance in the inner loop. > Performancewise stringobject.c/string_hash() is hard to > beat (especially when the hash value is already cached). Which is why, if nobody ever did or ever does try alternative approaches, I would be neither surprised nor disappointed <0.9 wink>. From martin@v.loewis.de Mon May 12 22:04:06 2003 From: martin@v.loewis.de (Martin v.
Löwis) Date: 12 May 2003 23:04:06 +0200 Subject: [Python-Dev] Python 2.3b1 _XOPEN_SOURCE value from configure.in In-Reply-To: <200305121813.36383.harri.pasanen@trema.com> References: <200305121813.36383.harri.pasanen@trema.com> Message-ID: Harri Pasanen writes: > So when compiling boost-python, there are zillion warning messages > about redefinition of _XOPEN_SOURCE. [...] > Is this a problem people are aware of, and is someone fixing it as I > write, or is a volunteer needed? I'm not aware of this problem specifically, but of the problem in general. I'd claim that this is a bug in Boost. Python.h should be the first header file included, before any header file from the application or the system, so it gets to define the value of _XOPEN_SOURCE. This is documented in the extensions manual. Of course, it would be sufficient to set it to a smaller value on systems that support only older X/Open issues; I think I'd accept a patch that changes this (if the patch is correct, of course). Regards, Martin From barry@barrys-emacs.org Mon May 12 22:37:35 2003 From: barry@barrys-emacs.org (Barry Scott) Date: Mon, 12 May 2003 22:37:35 +0100 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: References: <3EBCABD0.7050700@lemburg.com> <3EBCABD0.7050700@lemburg.com> Message-ID: <5.1.1.6.0.20030512222353.022ade78@torment.chelsea.private> Did I miss the answer to why bother to move to VC7? As a C project I know of very little to recommend VC7 or VC7.1. As a C++ developer I've decided that VC7 is little more than a broken VC6. Maybe Jesse Lipcon (who works for MS now) has managed to make VC7.1 more standards-compatible for C++ work, which would recommend it to C++ developers. Note that wxPython claims that it will not compile correctly with VC7 unless you add a workaround for a bug in the code generator.
Barry From nhodgson@bigpond.net.au Mon May 12 22:36:00 2003 From: nhodgson@bigpond.net.au (Neil Hodgson) Date: Tue, 13 May 2003 07:36:00 +1000 Subject: [Python-Dev] MS VC 7 offer References: <200305092032.h49KW5B11937@pcp02138704pcs.reston01.va.comcast.net> <3EBFB481.4070907@isg.de> Message-ID: <00c701c318ce$7b249330$3da48490@neil> Tino Lange: > Besides the optimizer - is it really possible to build a python.exe that > *doesn't* depend on the .NET framework with the free download > combination of ".NET 1.1" / "latest SDK". The C++ compiler in the free .NET SDK download can create executables with no dependence on the .NET runtime. There is only a small set of headers and libraries in the .NET SDK download but a full set is in the free Platform SDK download. IIRC, the first public beta of .NET even included the optimizer but that was swiftly removed. Neil From logistix@cathoderaymission.net Mon May 12 23:06:53 2003 From: logistix@cathoderaymission.net (logistix) Date: Mon, 12 May 2003 18:06:53 -0400 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <5.1.1.6.0.20030512222353.022ade78@torment.chelsea.private> Message-ID: <000001c318d2$cbf3b0d0$20bba8c0@XP> > -----Original Message----- > From: python-dev-admin@python.org > [mailto:python-dev-admin@python.org] On Behalf Of Barry Scott > Sent: Monday, May 12, 2003 5:38 PM > To: 'python-dev' > Subject: Re: [Python-Dev] MS VC 7 offer > > > Did I miss the answer to why bother to move to VC7? 
> From logistix@cathoderaymission.net Mon May 12 23:08:56 2003 From: logistix@cathoderaymission.net (logistix) Date: Mon, 12 May 2003 18:08:56 -0400 Subject: [Python-Dev] MS VC 7 offer In-Reply-To: <5.1.1.6.0.20030512222353.022ade78@torment.chelsea.private> Message-ID: <000101c318d3$1583c870$20bba8c0@XP> > -----Original Message----- > From: python-dev-admin@python.org > [mailto:python-dev-admin@python.org] On Behalf Of Barry Scott > Sent: Monday, May 12, 2003 5:38 PM > To: 'python-dev' > Subject: Re: [Python-Dev] MS VC 7 offer > > > Did I miss the answer to why bother to move to VC7? > Here's one reason. You can't buy VC6.0 anymore. I can't find any indication of an Official End-of-life on MS's site though. http://msdn.microsoft.com/vstudio/previous/downgrade.aspx From tdelaney@avaya.com Tue May 13 01:00:00 2003 From: tdelaney@avaya.com (Delaney, Timothy C (Timothy)) Date: Tue, 13 May 2003 10:00:00 +1000 Subject: [Python-Dev] os.path.walk() lacks 'depth first' option Message-ID: <338366A6D2E2CA4C9DAEAE652E12A1DE57F1E2@au3010avexu1.global.avaya.com> > From: Guido van Rossum [mailto:guido@python.org] > > OTOH there's something to say for fewer errors, not more; > e.g. sometimes I wish AttributeError and TypeError were unified, > because AttributeError usually means that an object isn't of the > expected type. Hmm ... I was going to ask if there was any reason not to make AttributeError a subclass of TypeError, but that would mean that code like: try: ... except TypeError: ... would also catch all AttributeErrors. Maybe we should have a __future__ directive and phase it in starting in 2.4? I wouldn't suggest making AttributeError and TypeError be synonyms though ... I think it is useful to distinguish the situations. I can't think of any case in *my* code where I would want to distinguish between a TypeError and an AttributeError - usually I end up having: try: ... except (TypeError, AttributeError): ...
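A small, invented example of the kind of code Tim means, where either exception reduces to "this object can't give me what I need" (the names here are illustrative, not from the thread):

```python
def label(obj):
    """Render obj's name, treating 'missing' and 'wrong type' alike."""
    try:
        return 'name: ' + obj.name
    except (TypeError, AttributeError):
        # AttributeError: obj has no .name attribute at all
        # TypeError: obj.name exists but isn't string-like
        return 'name: <unknown>'

class Named:
    name = 'guido'

class Anonymous:
    pass

class BadName:
    name = 42
```

label(Named()) gives 'name: guido', while label(Anonymous()) and label(BadName()) both give 'name: <unknown>': from this caller's point of view the two failure modes are indistinguishable, which is exactly why the combined except clause is natural here.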
Tim Delaney From pje@telecommunity.com Tue May 13 01:34:21 2003 From: pje@telecommunity.com (Phillip J. Eby) Date: Mon, 12 May 2003 20:34:21 -0400 Subject: [Python-Dev] os.path.walk() lacks 'depth first' option In-Reply-To: <338366A6D2E2CA4C9DAEAE652E12A1DE57F1E2@au3010avexu1.global .avaya.com> Message-ID: <5.1.1.6.0.20030512202835.03078bb0@telecommunity.com> At 10:00 AM 5/13/03 +1000, Delaney, Timothy C (Timothy) wrote: >I can't think of any case in *my* code where I would want to distinguish >between a TypeError and an AttributeError - usually I end up having: > > try: > ... > except (TypeError, AttributeError): > ... How odd. I was going to say the reverse; that I *always* want to distinguish between the two, because TypeError almost invariably is a programming error of some kind, while AttributeError is nearly always an error that I'm checking in order to have a fallback. E.g.: try: foo = thingy.foo except AttributeError: # default case else: foo() However, if 'thingy.foo' were to raise any other kind of error, such as a TypeError, it'd probably mean that thingy had a broken 'foo' descriptor that I'd want to know about. From guido@python.org Tue May 13 02:44:41 2003 From: guido@python.org (Guido van Rossum) Date: Mon, 12 May 2003 21:44:41 -0400 Subject: [Python-Dev] os.path.walk() lacks 'depth first' option In-Reply-To: "Your message of Mon, 12 May 2003 20:34:21 EDT." <5.1.1.6.0.20030512202835.03078bb0@telecommunity.com> References: <5.1.1.6.0.20030512202835.03078bb0@telecommunity.com> Message-ID: <200305130144.h4D1ifq30620@pcp02138704pcs.reston01.va.comcast.net> > How odd. I was going to say the reverse; that I *always* want to > distinguish between the two, because TypeError almost invariably is a > programming error of some kind, while AttributeError is nearly always an > error that I'm checking in order to have a fallback. 
E.g.: > > try: > foo = thingy.foo > except AttributeError: > # default case > else: > foo() > > However, if 'thingy.foo' were to raise any other kind of error, such as a > TypeError, it'd probably mean that thingy had a broken 'foo' descriptor > that I'd want to know about. This sounds like a much more advanced use, typical to a certain style of programming. Others would do this using hasattr() or three-argument getattr(); some will argue that you should have a base class that handles the default case so you don't need to handle that case separately at all (though that may not always be possible, e.g. when dealing with objects created by a 3rd party library). Your example argues for allowing to distinguish between AttributeError and TypeError, but doesn't convince me that they are totally different beasts. --Guido van Rossum (home page: http://www.python.org/~guido/) From pje@telecommunity.com Tue May 13 04:02:24 2003 From: pje@telecommunity.com (Phillip J. Eby) Date: Mon, 12 May 2003 23:02:24 -0400 Subject: [Python-Dev] os.path.walk() lacks 'depth first' option In-Reply-To: <200305130144.h4D1ifq30620@pcp02138704pcs.reston01.va.comca st.net> References: <"Your message of Mon, 12 May 2003 20:34:21 EDT." <5.1.1.6.0.20030512202835.03078bb0@telecommunity.com> <5.1.1.6.0.20030512202835.03078bb0@telecommunity.com> Message-ID: <5.1.0.14.0.20030512224621.03ef8500@mail.telecommunity.com> At 09:44 PM 5/12/03 -0400, Guido van Rossum wrote: >This sounds like a much more advanced use, typical to a certain style >of programming. Framework programming, for maximal adaptability of third-party code, yes. >Others would do this using hasattr() or three-argument >getattr() I use three-argument getattr() most of the time, actually. However, doesn't 'getattr()' rely on catching AttributeError? I just wanted my example to be explicit. >Your example argues for allowing to distinguish between >AttributeError and TypeError, but doesn't convince me that they are >totally different beasts.
Sure. My point is more that using exceptions to indicate failed lookups is a tricky business. I almost wish there was a way to declare the "normal" exceptions raised by an operation; or perhaps to easily query where an exception was raised. Nowadays, when designing interfaces that need to signal some kind of exceptional condition, I tend to want to have them return sentinel values rather than raise exceptions, in order to distinguish between "failed" and "broken". I'm sure that this is an issue specific to framework programming and to large team-built systems, though, and not something that bothers the mythical "average developer" a bit. :) From tanzer@swing.co.at Tue May 13 06:33:27 2003 From: tanzer@swing.co.at (Christian Tanzer) Date: Tue, 13 May 2003 07:33:27 +0200 Subject: [Python-Dev] os.path.walk() lacks 'depth first' option In-Reply-To: Your message of "Tue, 13 May 2003 10:00:00 +1000." <338366A6D2E2CA4C9DAEAE652E12A1DE57F1E2@au3010avexu1.global.avaya.com> Message-ID: "Delaney, Timothy C (Timothy)" wrote: > > From: Guido van Rossum [mailto:guido@python.org] > > > > OTOH there's something to say for fewer errors, not more; > > e.g. sometimes I wish AttributeError and TypeError were unified, > > because AttributeError usually means that an object isn't of the > > expected type. > > Hmm ... I was going to ask if there was any reason not to make > AttributeError a subclass of TypeError, but that would mean that code > like: > > try: > ... > except TypeError: > ... > > would also catch all AttributeErrors. > > Maybe we should have a __future__ directive and phase it in starting > in 2.4? > > I wouldn't suggest making AttributeError and TypeError be synonyms > though ... I think it is useful to distinguish the situations. > > I can't think of any case in *my* code where I would want to > distinguish between a TypeError and an AttributeError - usually I end > up having: > > try: > ... > except (TypeError, AttributeError): > ... More hmmm... 
Just grepped over my source tree (1293 .py files, ~ 300000 lines): - 45 occurrences of `except AttributeError` with no mention of `TypeError` - 16 occurrences of `except TypeError` with no mention of `AttributeError` - 3 occurrences of `except (AttributeError, TypeError)` Works well enough for me. Deriving both AttributeError and TypeError from a common base would make sense to me. Merging them wouldn't. PS: As that was my first post here, a short introduction. I'm a consultant using Python since early 1998. Since then the percentage of C/C++ use in my daily work steadily shrank. Nowadays, using C normally means generating C code from Python. -- Christian Tanzer tanzer@swing.co.at From mrussell@verio.net Tue May 13 10:52:43 2003 From: mrussell@verio.net (Mark Russell) Date: Tue, 13 May 2003 10:52:43 +0100 Subject: [Python-Dev] os.walk() silently ignores errors Message-ID: I've just noticed that os.walk() silently skips unreadable directories. I think this is surprising behaviour, which at least should be documented (there is a comment explaining this in source, but nothing in the doc string). Is it too late to add an optional callback argument to handle unreadable directories, so the caller could log them, raise an exception or whatever? I think the default behaviour should still be to silently ignore them, but it would be nice to have a way to override it. Mark Russell From Raymond Hettinger" Was there a reason that __slots__ makes initialized variables read-only?
It would be useful to have overridable default values (even if it entailed copying them into an instance's slots): class Pane(object): __slots__ = ('background', 'foreground', 'size', 'content') background = 'black' foreground = 'white' size = (80, 25) p = Pane() p.background = 'light blue' # override the default assert p.foreground == 'white' # other defaults still in-place Raymond Hettinger --------------------------- >>> class A(object): __slots__ = ('x',) x = 1 >>> class B(object): __slots__ = ('x',) >>> A().x = 2 Traceback (most recent call last): File "", line 1, in ? A().x = 2 AttributeError: 'A' object attribute 'x' is read-only >>> B().x = 2 From guido@python.org Tue May 13 14:51:33 2003 From: guido@python.org (Guido van Rossum) Date: Tue, 13 May 2003 09:51:33 -0400 Subject: [Python-Dev] os.walk() silently ignores errors In-Reply-To: Your message of "Tue, 13 May 2003 10:52:43 BST." References: Message-ID: <200305131351.h4DDpXG30768@odiug.zope.com> > I've just noticed that os.walk() silently skips unreadable > directories. I think this is surprising behaviour, which at least > should be documented (there is a comment explaining this is source, > but nothing in the doc string). Is it too late to add an optional > callback argument to handle unreadable directories, so the caller > could log them, raise an exception or whatever? I think the default > behaviour should still be to silently ignore them, but it would be > nice to have a way to override it. Ignoring is definitely the right thing to do by default, as otherwise the existence of a single unreadable directory would cause your entire walk to fail. What's your use case for wanting to do something else? --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Tue May 13 14:57:46 2003 From: guido@python.org (Guido van Rossum) Date: Tue, 13 May 2003 09:57:46 -0400 Subject: [Python-Dev] __slots__ and default values In-Reply-To: Your message of "Tue, 13 May 2003 09:23:49 EDT." 
<000601c31954$728e9500$32b02c81@oemcomputer> References: <000601c31954$728e9500$32b02c81@oemcomputer> Message-ID: <200305131357.h4DDvkJ31195@odiug.zope.com> > Was there a reason that __slots__ makes initialized > variables read-only? It would be useful to have > overridable default values (even if it entailed copying > them into an instance's slots): > > class Pane(object): > __slots__ = ('background', 'foreground', 'size', 'content') > background = 'black' > foreground = 'white' > size = (80, 25) > > p = Pane() > p.background = 'light blue' # override the default > assert p.foreground == 'white' # other defaults still in-place You can't do that. The class variable 'background' overrides the descriptor created by __slots__. background now appears read-only because there is no instance dict. --Guido van Rossum (home page: http://www.python.org/~guido/) From jacobs@penguin.theopalgroup.com Tue May 13 15:07:45 2003 From: jacobs@penguin.theopalgroup.com (Kevin Jacobs) Date: Tue, 13 May 2003 10:07:45 -0400 (EDT) Subject: [Python-Dev] __slots__ and default values In-Reply-To: <000601c31954$728e9500$32b02c81@oemcomputer> Message-ID: On Tue, 13 May 2003, Raymond Hettinger wrote: > Was there a reason that __slots__ makes initialized > variables read-only? It would be useful to have > overridable default values (even if it entailed copying > them into an instance's slots): > > class Pane(object): > __slots__ = ('background', 'foreground', 'size', 'content') > background = 'black' > foreground = 'white' > size = (80, 25) > > p = Pane() > p.background = 'light blue' # override the default > assert p.foreground == 'white' # other defaults still in-place Those attributes are read-only, because there is a name collision between the slot descriptors for 'background' and 'foreground', so the class favors the class variables. 
Thus, no slots are allocated for 'background' and 'foreground', so the instance, not having an instance dictionary, correctly reports that those attributes are indeed read-only. Also, slots are not automatically initialized from class variables, though one can easily write a metaclass to do so. (Actually, it is only easy to a first approximation; it is quite tricky to get 100% correct.) -Kevin -- -- Kevin Jacobs The OPAL Group - Enterprise Systems Architect Voice: (216) 986-0710 x 19 E-mail: jacobs@theopalgroup.com Fax: (216) 986-0714 WWW: http://www.theopalgroup.com From aahz@pythoncraft.com Tue May 13 15:17:19 2003 From: aahz@pythoncraft.com (Aahz) Date: Tue, 13 May 2003 10:17:19 -0400 Subject: [Python-Dev] __slots__ and default values In-Reply-To: <000601c31954$728e9500$32b02c81@oemcomputer> References: <000601c31954$728e9500$32b02c81@oemcomputer> Message-ID: <20030513141719.GA12321@panix.com> On Tue, May 13, 2003, Raymond Hettinger wrote: > > Was there a reason that __slots__ makes initialized variables > read-only? It would be useful to have overridable default values > (even if it entailed copying them into an instance's slots): > > class Pane(object): > __slots__ = ('background', 'foreground', 'size', 'content') > background = 'black' > foreground = 'white' > size = (80, 25) > > p = Pane() > p.background = 'light blue' # override the default > assert p.foreground == 'white' # other defaults still in-place Why not do the initializing in __init__? -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "In many ways, it's a dull language, borrowing solid old concepts from many other languages & styles: boring syntax, unsurprising semantics, few automatic coercions, etc etc. But that's one of the things I like about it."
--Tim Peters on Python, 16 Sep 93 From jeremy@zope.com Tue May 13 15:40:39 2003 From: jeremy@zope.com (Jeremy Hylton) Date: 13 May 2003 10:40:39 -0400 Subject: [Python-Dev] Need some patches checked In-Reply-To: <200305120020.h4C0KIm29011@pcp02138704pcs.reston01.va.comcast.net> References: <3EBEDDE0.3040308@ocf.berkeley.edu> <200305120020.h4C0KIm29011@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <1052836839.973.6.camel@slothrop.zope.com> On Sun, 2003-05-11 at 20:20, Guido van Rossum wrote: > > Since I am trying to tackle patches that were not written by me for the > > first time I need someone to check that I am doing the right thing. There are a bunch of open bugs and patches for urllib2. I've cleaned up a few things lately. We might make a concerted effort to close them all for 2.3b2. Whole-scale refactoring can be more effective than a large set of small fixes. Jeremy From harri.pasanen@trema.com Tue May 13 16:06:27 2003 From: harri.pasanen@trema.com (Harri Pasanen) Date: Tue, 13 May 2003 17:06:27 +0200 Subject: [Python-Dev] os.walk() silently ignores errors In-Reply-To: <200305131351.h4DDpXG30768@odiug.zope.com> References: <200305131351.h4DDpXG30768@odiug.zope.com> Message-ID: <200305131706.27202.harri.pasanen@trema.com> On Tuesday 13 May 2003 15:51, Guido van Rossum wrote: > > I've just noticed that os.walk() silently skips unreadable > > directories. I think this is surprising behaviour, which at > > least should be documented (there is a comment explaining this in > > the source, but nothing in the doc string). Is it too late to add an > > optional callback argument to handle unreadable directories, so > > the caller could log them, raise an exception or whatever? I > > think the default behaviour should still be to silently ignore > > them, but it would be nice to have a way to override it. > > Ignoring is definitely the right thing to do by default, as > otherwise the existence of a single unreadable directory would > cause your entire walk to fail.
What's your use case for wanting > to do something else? Sometimes I'm looking for something in files in a directory tree, forgetting I don't have access permissions to a particular subdirectory by default. So the search can silently fail, and I'm left with the wrong idea that what I was looking for is not there. Ideally, I'd like the possibility to have my script remind me to login as root prior to running it. I know I could do some defensive programming in the walker function to get around this, but this would likely imply more stat calls and impact performance. I've been bitten by this a couple of times, so I thought I'd pipe in. -Harri From duncan@rcp.co.uk Tue May 13 16:20:30 2003 From: duncan@rcp.co.uk (Duncan Booth) Date: Tue, 13 May 2003 16:20:30 +0100 Subject: [Python-Dev] __slots__ and default values References: <000601c31954$728e9500$32b02c81@oemcomputer> <20030513141719.GA12321@panix.com> Message-ID: Aahz wrote in news:20030513141719.GA12321@panix.com: > On Tue, May 13, 2003, Raymond Hettinger wrote: >> >> Was there a reason that __slots__ makes initialized variables >> read-only? It would be useful to have overridable default values >> (even if it entailed copying them into an instance's slots): >> >> class Pane(object): >> __slots__ = ('background', 'foreground', 'size', 'content') >> background = 'black' >> foreground = 'white' >> size = (80, 25) >> >> p = Pane() >> p.background = 'light blue' # override the default >> assert p.foreground == 'white' # other defaults still in-place > > Why not do the initializing in __init__?
The following works, but I can't remember whether you're supposed to be able to use a dict in __slots__ or if it just happens to be allowed: >>> class Pane(object): __slots__ = { 'background': 'black', 'foreground': 'white', 'size': (80, 25) } def __init__(self): for k, v in self.__slots__.iteritems(): setattr(self, k, v) >>> p = Pane() >>> p.background = 'blue' >>> p.background, p.foreground ('blue', 'white') >>> -- Duncan Booth duncan@rcp.co.uk int month(char *p){return(124864/((p[0]+p[1]-p[2]&0x1f)+1)%12)["\5\x8\3" "\6\7\xb\1\x9\xa\2\0\4"];} // Who said my code was obscure? From mcherm@mcherm.com Tue May 13 16:25:18 2003 From: mcherm@mcherm.com (Michael Chermside) Date: Tue, 13 May 2003 08:25:18 -0700 Subject: [Python-Dev] Re: __slots__ and default values Message-ID: <1052839518.3ec10e5e3f7c0@mcherm.com> Raymond Hettinger wrote: > class Pane(object): > __slots__ = ('background', 'foreground', 'size', 'content') > background = 'black' > foreground = 'white' > size = (80, 25) ...which doesn't work since the class variable overwrites the __slots__ descriptor. Aahz replies: > Why not do the initializing in __init__? I presume that Raymond's concern was not that there wouldn't be a way to do initialization, but that this would become a new c.l.p FAQ and point of confusion for newbies. Unfortunately, I fear that it will. Already I am seeing that people are "discovering" class variables as a sort of "initialized instance variable" instead of using __init__ as they "ought" to. Of course, it's NOT an initialized instance variable, but newbies stumble across it and seem to prefer it to using __init__. Combine this with the fact that newbies from statically typed languages tend to think of __slots__ as "practically mandatory" (because it prevents the use of instance variables not pre-declared, which they erroneously think is a good thing) rather than the special-purpose performance hack that it REALLY is, and you have a recipe for trouble.
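For reference, the __init__ route Aahz suggests sidesteps the descriptor collision entirely; a minimal sketch (mine, not from the thread):

```python
class Pane(object):
    # Only __slots__ at class level; the defaults live in __init__,
    # so nothing shadows the slot descriptors the way class-level
    # default values do.
    __slots__ = ('background', 'foreground', 'size', 'content')

    def __init__(self):
        self.background = 'black'
        self.foreground = 'white'
        self.size = (80, 25)
        self.content = None

p = Pane()
p.background = 'light blue'      # per-instance override works
assert p.foreground == 'white'   # other defaults still in place
```

The cost is naming each attribute once more in __init__; the benefit is that instances stay dict-less and every slot is writable.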
I'm not quite sure how to present things so as to steer them right, but there's definitely a potential pitfall here. -- Michael Chermside From aleax@aleax.it Tue May 13 16:29:54 2003 From: aleax@aleax.it (Alex Martelli) Date: Tue, 13 May 2003 17:29:54 +0200 Subject: [Python-Dev] os.walk() silently ignores errors In-Reply-To: <200305131706.27202.harri.pasanen@trema.com> References: <200305131351.h4DDpXG30768@odiug.zope.com> <200305131706.27202.harri.pasanen@trema.com> Message-ID: <200305131729.54759.aleax@aleax.it> On Tuesday 13 May 2003 05:06 pm, Harri Pasanen wrote: ... > > Ignoring is definitely the right thing to do by default, as > > otherwise the existence of a single unreadable directory would > > cause your entire walk to fail. What's your use case for wanting > > to do something else? > > Sometimes I'm looking for something in files in a directory tree, > forgetting I don't have access permissions to a particular > subdirectory by default. So the search can silently fail, and I'm > left with the wrong idea that what I was looking for is not there. > > Ideally, I'd like the possibility to have my script remind me to login as > root prior to running it. Seconded! The default of ignoring errors is just fine, but it WOULD be nice to optionally get a callback on errors so as to be able to raise warnings or exceptions. "Errors should never pass silently unless explicitly silenced" would argue for stronger diagnostic behavior, but compatibility surely constrains the default behavior -- BUT, an easy way to get non-silent behavior would be something I'd end up using, roughly, in 50% of my tree-walking scripts. Alex From mrussell@verio.net Tue May 13 16:40:07 2003 From: mrussell@verio.net (Mark Russell) Date: Tue, 13 May 2003 16:40:07 +0100 Subject: [Python-Dev] os.walk() silently ignores errors In-Reply-To: Your message of "Tue, 13 May 2003 09:59:01 EDT."
<20030513135901.5867.87468.Mailman@mail.python.org> Message-ID: >Ignoring is definitely the right thing to do by default, as otherwise >the existence of a single unreadable directory would cause your entire >walk to fail. What's your use case for wanting to do something else? I was using os.walk() to copy a directory tree with some modifications - I assumed that as no exceptions had been raised the tree had been copied successfully. It was only when I diffed the original and copy trees that I found some directories had been skipped because they were unreadable. Had I not checked I would have silently lost data - not behaviour I expect from a python script :-) Mark From guido@python.org Tue May 13 16:40:53 2003 From: guido@python.org (Guido van Rossum) Date: Tue, 13 May 2003 11:40:53 -0400 Subject: [Python-Dev] os.path.walk() lacks 'depth first' option In-Reply-To: Your message of "Mon, 12 May 2003 23:02:24 EDT." <5.1.0.14.0.20030512224621.03ef8500@mail.telecommunity.com> References: <"Your message of Mon, 12 May 2003 20:34:21 EDT." <5.1.1.6.0.20030512202835.03078bb0@telecommunity.com> <5.1.1.6.0.20030512202835.03078bb0@telecommunity.com> <5.1.0.14.0.20030512224621.03ef8500@mail.telecommunity.com> Message-ID: <200305131540.h4DFerS05699@odiug.zope.com> How about this patch? Index: os.py =================================================================== RCS file: /cvsroot/python/python/dist/src/Lib/os.py,v retrieving revision 1.70 diff -c -c -r1.70 os.py *** os.py 25 Apr 2003 07:11:48 -0000 1.70 --- os.py 13 May 2003 15:40:21 -0000 *************** *** 203,209 **** __all__.extend(["makedirs", "removedirs", "renames"]) ! def walk(top, topdown=True): """Directory tree generator. For each directory in the directory tree rooted at top (including top --- 203,209 ---- __all__.extend(["makedirs", "removedirs", "renames"]) ! def walk(top, topdown=True, onerror=None): """Directory tree generator. 
For each directory in the directory tree rooted at top (including top *************** *** 232,237 **** --- 232,243 ---- dirnames have already been generated by the time dirnames itself is generated. + By default errors from the os.listdir() call are ignored. If + optional arg 'onerror' is specified, it should be a function; + it will be called with one argument, an exception instance. It + can report the error to continue with the walk, or raise the + exception to abort the walk. + Caution: if you pass a relative pathname for top, don't change the current working directory between resumptions of walk. walk never changes the current directory, and assumes that the client doesn't *************** *** 259,265 **** # Note that listdir and error are globals in this module due # to earlier import-*. names = listdir(top) ! except error: return dirs, nondirs = [], [] --- 265,273 ---- # Note that listdir and error are globals in this module due # to earlier import-*. names = listdir(top) ! except error, err: ! if onerror is not None: ! onerror(err) return dirs, nondirs = [], [] *************** *** 274,280 **** for name in dirs: path = join(top, name) if not islink(path): ! for x in walk(path, topdown): yield x if not topdown: yield top, dirs, nondirs --- 282,288 ---- for name in dirs: path = join(top, name) if not islink(path): ! 
for x in walk(path, topdown, onerror): yield x if not topdown: yield top, dirs, nondirs --Guido van Rossum (home page: http://www.python.org/~guido/) From theller@python.net Tue May 13 16:45:34 2003 From: theller@python.net (Thomas Heller) Date: 13 May 2003 17:45:34 +0200 Subject: [Python-Dev] Re: __slots__ and default values In-Reply-To: <1052839518.3ec10e5e3f7c0@mcherm.com> References: <1052839518.3ec10e5e3f7c0@mcherm.com> Message-ID: Michael Chermside writes: > Combine this with the fact that newbies from statically typed > languages tend to think of __slots__ as "practically mandatory" > (because it prevents the use of instance variables not pre-declared, > which they erroneously think is a good thing) rather than the > special-purpose performance hack that it REALLY is, and you have > a recipe for trouble. Unrelated to *this* topic, but Andrew's "What's new in Python 2.2" still presents __slots__ as a way to constrain the instance variables: A new-style class can define a class attribute named __slots__ to constrain the list of legal attribute names. http://www.python.org/doc/current/whatsnew/sect-rellinks.html#SECTION000340000000000000000 This should probably be fixed. Thomas From walter@livinglogic.de Tue May 13 17:14:51 2003 From: walter@livinglogic.de (Walter Dörwald) Date: Tue, 13 May 2003 18:14:51 +0200 Subject: [Python-Dev] os.path.walk() lacks 'depth first' option In-Reply-To: <200305131540.h4DFerS05699@odiug.zope.com> References: <"Your message of Mon, 12 May 2003 20:34:21 EDT." <5.1.1.6.0.20030512202835.03078bb0@telecommunity.com> <5.1.1.6.0.20030512202835.03078bb0@telecommunity.com> <5.1.0.14.0.20030512224621.03ef8500@mail.telecommunity.com> <200305131540.h4DFerS05699@odiug.zope.com> Message-ID: <3EC119FB.5000302@livinglogic.de> Guido van Rossum wrote: > How about this patch? I like the increased flexibility. But how about the following version?
--- def walk(top, order=".d", recursive=True, onerror=None): from os.path import join, isdir, islink, normpath try: names = listdir(top) except error, err: if onerror is not None: onerror(err) return dirs, nondirs = [], [] for name in names: if isdir(join(top, name)): dirs.append(name) else: nondirs.append(name) for c in order: if c==".": yield top, dirs, nondirs elif c=="f": for nd in nondirs: yield normpath(join(top, nd)), [], [] elif c=="d": for name in dirs: path = join(top, name) if not islink(path): if recursive: for x in walk(path, order, recursive, onerror): yield (normpath(x[0]), x[1], x[2]) else: yield path else: raise ValueError, "unknown order %r" % c --- It combines recursive and non-recursive walks, topdown and bottomup walks, walks with and without files or directories. E.g. getting a list of all files, topdown: [x[0] for x in os.walk(top, order="fd")] or a list of directories bottom up: [x[0] for x in os.walk(top, order="d.")] or a list of files and directories, topdown, with files before subdirectories: [x[0] for x in os.walk(top, order=".fd")] Bye, Walter Dörwald From walter@livinglogic.de Tue May 13 17:36:18 2003 From: walter@livinglogic.de (Walter Dörwald) Date: Tue, 13 May 2003 18:36:18 +0200 Subject: [Python-Dev] os.path.walk() lacks 'depth first' option In-Reply-To: <200305131620.h4DGKMi08267@odiug.zope.com> References: <"Your message of Mon, 12 May 2003 20:34:21 EDT." <5.1.1.6.0.20030512202835.03078bb0@telecommunity.com> <5.1.1.6.0.20030512202835.03078bb0@telecommunity.com> <5.1.0.14.0.20030512224621.03ef8500@mail.telecommunity.com> <200305131540.h4DFerS05699@odiug.zope.com> <3EC119FB.5000302@livinglogic.de> <200305131620.h4DGKMi08267@odiug.zope.com> Message-ID: <3EC11F02.2060301@livinglogic.de> Guido van Rossum wrote: >> I like the increased flexibility. But how about the following >> version? > > > I don't think there's any need for such increased flexibility. Let's > stop while we're ahead.
Fixing the silent errors case is important > (see various posts here). Your generalization is a YAGNI though. True, getting a list of files in the current directory even works with the current os.walk: sum([[os.path.join(x[0], f) for f in x[2]] for x in os.walk(".")], []) Bye, Walter Dörwald From tim@zope.com Tue May 13 18:19:48 2003 From: tim@zope.com (Tim Peters) Date: Tue, 13 May 2003 13:19:48 -0400 Subject: [Python-Dev] os.path.walk() lacks 'depth first' option In-Reply-To: <3EC11F02.2060301@livinglogic.de> Message-ID: [Walter Dörwald] > True, getting a list of files in the current directory even works > with the current os.walk: > > sum([[os.path.join(x[0], f) for f in x[2]] for x in os.walk(".")], []) Convoluted one-liners are poor Python style, IMO. That walks the entire tree, too. If you want the files in just the current directory, for root, dirs, files in os.walk('.'): break print files or if clarity is disturbing: files = os.walk('.').next()[-1] From paul@pfdubois.com Tue May 13 18:25:38 2003 From: paul@pfdubois.com (Paul Dubois) Date: Tue, 13 May 2003 10:25:38 -0700 Subject: [Python-Dev] Inplace multiply Message-ID: <000401c31974$ac31c730$6801a8c0@NICKLEBY> My "masked array" class MA has a problem that I don't know how to solve. The inplace multiply function def __imul__ (self, other) is not getting called while my other input operations do work. The scenario is x = MA.array(...) x *= c If c is an int, this works correctly, calling MA.__imul__. Otherwise, I get a message from the Python runtime saying it can't multiply a sequence by a non-int. But change MA to Numeric, it works. Numeric is an extension type and MA is a (new style) class. MA defines __len__ as well as all the math operators. From guido@python.org Tue May 13 18:44:37 2003 From: guido@python.org (Guido van Rossum) Date: Tue, 13 May 2003 13:44:37 -0400 Subject: [Python-Dev] Inplace multiply In-Reply-To: Your message of "Tue, 13 May 2003 10:25:38 PDT."
<000401c31974$ac31c730$6801a8c0@NICKLEBY> References: <000401c31974$ac31c730$6801a8c0@NICKLEBY> Message-ID: <200305131744.h4DHibn13549@odiug.zope.com> > My "masked array" class MA has a problem that I don't know how to solve. The > inplace multiply function > > def __imul__ (self, other) > > is not getting called while my other input operations do work. The scenario > is > > x = MA.array(...) > > x *= c > > If c is an int, this works correctly, calling MA.__imul__. Otherwise, I get > a message from the Python runtime saying it can't multiply a sequence by a > non-int. But change MA to Numeric, it works. > > Numeric is an extension type and MA is a (new style) class. MA defines > __len__ as well as all the math operators. We won't be able to help without seeing your code. --Guido van Rossum (home page: http://www.python.org/~guido/) From jepler@unpythonic.net Tue May 13 19:31:45 2003 From: jepler@unpythonic.net (Jeff Epler) Date: Tue, 13 May 2003 13:31:45 -0500 Subject: [Python-Dev] Inplace multiply In-Reply-To: <000401c31974$ac31c730$6801a8c0@NICKLEBY> References: <000401c31974$ac31c730$6801a8c0@NICKLEBY> Message-ID: <20030513183144.GH11289@unpythonic.net> There must be something more to your problem than what you described. 
The following executes just fine for me (ditto if NewKoke is a subclass of object instead of list, and no matter whether I define __getitem__ or not [a guess based on your remark about 'multiply a sequence']): $ python dubois.py sausages vegetable-style breakfast patty sausages vegetable-style breakfast patty class Klassic: def __imul__(self, other): return "sausages" def __getitem__(self, i): return None class NewKoke(list): def __imul__(self, other): return "vegetable-style breakfast patty" def __getitem__(self, i): return None k = Klassic() o = NewKoke() k *= 1 o *= 1 print k, o k = Klassic() o = NewKoke() k *= "spam" o *= "spam" print k, o From tim.one@comcast.net Tue May 13 20:03:58 2003 From: tim.one@comcast.net (Tim Peters) Date: Tue, 13 May 2003 15:03:58 -0400 Subject: [Python-Dev] os.path.walk() lacks 'depth first' option In-Reply-To: Message-ID: [Christian Tanzer] > More hmmm... > > Just grepped over my source tree (1293 .py files, ~ 300000 lines): > > - 45 occurrences of `except AttributeError` with no mention of > `TypeError` > > - 16 occurrences of `except TypeError` with no mention of > `AttributeError` > > - 3 occurrences of `except (AttributeError, TypeError)` > > Works well enough for me. With a fixed release of Python, it would be hard not to work well enough. I have to point out, though, that *across* Python releases, a frequent kind of patch made to the Python test suite is changing former TypeError occurrences to AttributeError, or vice versa. I'm not sure which direction is most common overall, and it's often unclear which is more appropriate. For example, >>> d = {} >>> d.update('abc') Traceback (most recent call last): File "", line 1, in ? AttributeError: keys >>> I wouldn't be surprised if that changed to TypeError someday. > Deriving both AttributeError and TypeError from a common base would > make sense to me. Merging them wouldn't. Yes -- and we should derive all exceptions from LookupError . 
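The blurriness Tim describes is easy to reproduce: the same duck-typed call can surface either exception depending on how the wrong object trips up. A small illustration (modern Python, not from the thread):

```python
def shout(obj):
    # Duck-typed: anything with a usable .lower() works.  A wrong
    # argument fails with AttributeError (no such attribute) or
    # TypeError (attribute exists but the call is malformed), which
    # is why defensive callers often catch both, as Delaney does.
    try:
        return obj.lower() + '!'
    except (TypeError, AttributeError):
        return None

shout('ABC')   # 'abc!'
shout(42)      # None -- int has no .lower, so AttributeError
shout(str)     # None -- str.lower() missing its argument, so TypeError
```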
From tjreedy@udel.edu Tue May 13 20:12:50 2003 From: tjreedy@udel.edu (Terry Reedy) Date: Tue, 13 May 2003 15:12:50 -0400 Subject: [Python-Dev] Re: Inplace multiply References: <000401c31974$ac31c730$6801a8c0@NICKLEBY> <20030513183144.GH11289@unpythonic.net> Message-ID: "Jeff Epler" wrote in message news:20030513183144.GH11289@unpythonic.net... > There must be something more to your problem than what you described. > > The following executes just fine for me (ditto if NewKoke is a subclass > of object instead of list, and no matter whether I define __getitem__ or > not [a guess based on your remark about 'multiply a sequence']): > > $ python dubois.py > sausages vegetable-style breakfast patty > sausages vegetable-style breakfast patty On Win98 2.2.1, cut and paste into interactive window outputs sausages vegetable-style breakfast patty sausages [] > class Klassic: > def __imul__(self, other): > return "sausages" > def __getitem__(self, i): return None > > class NewKoke(list): > def __imul__(self, other): > return "vegetable-style breakfast patty" > def __getitem__(self, i): return None > > k = Klassic() > o = NewKoke() > > k *= 1 > o *= 1 > > print k, o > > k = Klassic() > o = NewKoke() > > k *= "spam" > o *= "spam" Because line above gives TypeError: can't multiply sequence to non-int > print k, o Maybe something has been 'fixed' since then. Terry J. Reedy From drifty@alum.berkeley.edu Tue May 13 20:44:48 2003 From: drifty@alum.berkeley.edu (Brett C.) Date: Tue, 13 May 2003 12:44:48 -0700 Subject: [Python-Dev] Need some patches checked In-Reply-To: <1052836839.973.6.camel@slothrop.zope.com> References: <3EBEDDE0.3040308@ocf.berkeley.edu> <200305120020.h4C0KIm29011@pcp02138704pcs.reston01.va.comcast.net> <1052836839.973.6.camel@slothrop.zope.com> Message-ID: <3EC14B30.1000102@ocf.berkeley.edu> Jeremy Hylton wrote: > There are a bunch of open bugs and patches for urllib2. I've cleaned up > a few things lately. 
We might make a concerted effort to close them all > for 2.3b2. Whole-scale refactoring can be more effective than a large > set of small fixes. > Aw, but Jeremy, I wanted to start working on the AST branch after I finished going through the open bugs and patches one time! =) We can, and I am willing to help, but I suspect it would be best to first get the test suite rewritten; it is severely lacking. Which reminds me, I need to finish writing urllib's tests by doing the network-requiring ones. -Brett From marc@informatik.uni-bremen.de Tue May 13 20:58:45 2003 From: marc@informatik.uni-bremen.de (Marc Recht) Date: Tue, 13 May 2003 21:58:45 +0200 Subject: [Python-Dev] Python 2.3b1 _XOPEN_SOURCE value from configure.in In-Reply-To: References: <200305121813.36383.harri.pasanen@trema.com> Message-ID: <8560000.1052855925@leeloo.intern.geht.de> > Of course, it would be sufficient to set it to a smaller value on > systems that support only older X/Open issues; I think I'd accept a > patch that changes this (if the patch is correct, of course). Defining __EXTENSIONS__ could also help. IIRC it works just like _GNU_SOURCE/_NETBSD_SOURCE. Regards, Marc mundus es fabula From mrussell@verio.net Tue May 13 21:08:25 2003 From: mrussell@verio.net (Mark Russell) Date: Tue, 13 May 2003 21:08:25 +0100 Subject: [Python-Dev] os.path.walk() lacks 'depth first' option In-Reply-To: Your message of "Tue, 13 May 2003 12:00:08 EDT."
<20030513160008.1261.22301.Mailman@mail.python.org> Message-ID: >How about this patch? Thanks - I tried that with my script and it worked nicely. Hopefully this is as complex as os.walk() needs to get. Mark From jepler@unpythonic.net Tue May 13 21:15:50 2003 From: jepler@unpythonic.net (Jeff Epler) Date: Tue, 13 May 2003 15:15:50 -0500 Subject: [Python-Dev] Re: Inplace multiply In-Reply-To: References: <000401c31974$ac31c730$6801a8c0@NICKLEBY> <20030513183144.GH11289@unpythonic.net> Message-ID: <20030513201549.GJ11289@unpythonic.net> On Tue, May 13, 2003 at 03:12:50PM -0400, Terry Reedy wrote: > On Win98 2.2.1, cut and paste into interactive window outputs > TypeError: can't multiply sequence to non-int > > > print k, o > > Maybe something has been 'fixed' since then. using RedHat9's "2.2.2-26" here. Jeff From martin@v.loewis.de Tue May 13 21:57:22 2003 From: martin@v.loewis.de (Martin v. Löwis) Date: 13 May 2003 22:57:22 +0200 Subject: [Python-Dev] Python 2.3b1 _XOPEN_SOURCE value from configure.in In-Reply-To: <8560000.1052855925@leeloo.intern.geht.de> References: <200305121813.36383.harri.pasanen@trema.com> <8560000.1052855925@leeloo.intern.geht.de> Message-ID: Marc Recht writes: > Defining __EXTENSIONS__ could also help. IIRC it works just like > _GNU_SOURCE/_NETBSD_SOURCE. We do define __EXTENSIONS__, this is not the issue. We define _XOPEN_SOURCE to 600 on all systems, because that is the highest value specified by any X/Open spec today. The system may not support all of the latest Posix features; this is not a problem because we autoconfiscate them. The problem really only occurs if somebody thinks they need to define _XOPEN_SOURCE to some other value; the compiler will then complain.
Regards, Martin From tdelaney@avaya.com Tue May 13 22:32:49 2003 From: tdelaney@avaya.com (Delaney, Timothy C (Timothy)) Date: Wed, 14 May 2003 07:32:49 +1000 Subject: [Python-Dev] __slots__ and default values Message-ID: <338366A6D2E2CA4C9DAEAE652E12A1DE57F30C@au3010avexu1.global.avaya.com> > From: Raymond Hettinger [mailto:raymond.hettinger@verizon.net] > > class Pane(object): > __slots__ = ('background', 'foreground', 'size', 'content') > background = 'black' > foreground = 'white' > size = (80, 25) > > p = Pane() > p.background = 'light blue' # override the default > assert p.foreground == 'white' # other defaults still in-place Wow - I hadn't realised that. I would prefer to think of this as a useful feature rather than a wart. Finally we can have true constants! class Pane (module): # Or whatever - you get the idea ;) __slots__ = ('background', 'foreground', 'size', 'content', '__name__', '__file__') __name__ = globals()['__name__'] __file__ = globals()['__file__'] background = 'black' foreground = 'white' size = (80, 25) import sys sys.modules[__name__] = Pane() OK - so you could get around it by getting the class of the "module" and then modifying that ... but it's the best yet. It even tells you that the attribute is read-only! Tim Delaney From guido@python.org Tue May 13 22:38:10 2003 From: guido@python.org (Guido van Rossum) Date: Tue, 13 May 2003 17:38:10 -0400 Subject: [Python-Dev] __slots__ and default values In-Reply-To: Your message of "Wed, 14 May 2003 07:32:49 +1000." <338366A6D2E2CA4C9DAEAE652E12A1DE57F30C@au3010avexu1.global.avaya.com> References: <338366A6D2E2CA4C9DAEAE652E12A1DE57F30C@au3010avexu1.global.avaya.com> Message-ID: <200305132138.h4DLcA930451@odiug.zope.com> > I would prefer to think of this as a useful feature rather than a > wart. Finally we can have true constants! Yuck. If you want that, define a property-like class that doesn't allow setting.
These "constants" of yours are easily subverted by defining a subclass which adds an instance __dict__ (any subclass that doesn't define __slots__ of its own does this). --Guido van Rossum (home page: http://www.python.org/~guido/) From tdelaney@avaya.com Tue May 13 22:40:10 2003 From: tdelaney@avaya.com (Delaney, Timothy C (Timothy)) Date: Wed, 14 May 2003 07:40:10 +1000 Subject: [Python-Dev] __slots__ and default values Message-ID: <338366A6D2E2CA4C9DAEAE652E12A1DE57F30D@au3010avexu1.global.avaya.com> > From: Delaney, Timothy C (Timothy) > > class Pane (module): # Or whatever - you get the idea ;) > > __slots__ = ('background', 'foreground', 'size', > 'content', '__name__', '__file__') > __name__ = globals()['__name__'] > __file__ = globals()['__file__'] > > background = 'black' > foreground = 'white' > size = (80, 25) > > import sys > sys.modules[__name__] = Pane() Hmm ... I was just wondering if we could use this technique to gain the advantages of fast lookup in other modules automatically (but optionally). The idea is, the last line of your module includes a call which transforms your module into a module subclass instance with slots. The functions in the module become methods which access the *class* instance variables (so they can modify them) but other modules can't. A fair bit of work, and probably not worthwhile, but it's interesting to think about ;) Tim Delaney From tdelaney@avaya.com Tue May 13 22:42:51 2003 From: tdelaney@avaya.com (Delaney, Timothy C (Timothy)) Date: Wed, 14 May 2003 07:42:51 +1000 Subject: [Python-Dev] __slots__ and default values Message-ID: <338366A6D2E2CA4C9DAEAE652E12A1DE57F30E@au3010avexu1.global.avaya.com> > From: Guido van Rossum [mailto:guido@python.org] > > > I would prefer to think of this as a useful feature rather than a > > wart. Finally we can have true constants! > > Yuck. If you want that, define a property-like class that doesn't > allow setting.
> > These "constants" of yours are easily subverted by defining a subclass > which adds an instance __dict__ (any subclass that doesn't define > __slots__ of its own does this). Sorry - missing smiley there ;) But see my other post for a potentially useful side-effect of this. Not something I think should be done, but fun to think about. The idea is that you shouldn't be able to create a subclass without some really nasty work, as the only way to get it is to determine what module the module subclass is defined in, then grab the class out of that. But it's a horrible hack and I should never have suggested it ;) Tim Delaney From lkcl@samba-tng.org Tue May 13 23:57:40 2003 From: lkcl@samba-tng.org (Luke Kenneth Casson Leighton) Date: Tue, 13 May 2003 22:57:40 +0000 Subject: [Python-Dev] sf.net/708007: expectlib.py telnetlib.py split Message-ID: <20030513225740.GH2305@localhost> [i am not on the python-dev list but i check the archives, please cc me] approximately two years ago i needed the functionality outlined in the present telnetlib.py for several other remote protocols, most notably commands (including ssh and bash) and also HTTP. i figure that this functionality should be more than invaluable to other python developers. for example even the existing python libraries such as the ftp client ftplib.py, which makes extensive use of regular expressions to parse commands, could possibly benefit from rewrites using the "new" expectlib.py. also i believe it's the sort of thing that the twisted crowd should already have invented, and they're mad if they haven't already got something similar. this message is therefore just a polite ping to the regular python developers that the above referenced patch appears not to have yet been looked at or assigned to anybody.
that having been said (perhaps with unintended implications of criticism that the regular python developers are slackers, which i most certainly am NOT saying!): the expectlib.py / telnetlib.py split is not exactly a top priority - just the sort of thing that one python fanatic would classify as "nice to have". l. -- -- expecting email to be received and understood is a bit like picking up the telephone and immediately dialing without checking for a dial-tone; speaking immediately without listening for either an answer or ring-tone; hanging up immediately and then expecting someone to call you (and to be able to call you). -- every day, people send out email expecting it to be received without being tampered with, read by other people, delayed or simply - without prejudice but lots of incompetence - destroyed. -- please therefore treat email more like you would a CB radio to communicate across the world (via relaying stations): ask and expect people to confirm receipt; send nothing that you don't mind everyone in the world knowing about... From python@rcn.com Wed May 14 02:12:03 2003 From: python@rcn.com (Raymond Hettinger) Date: Tue, 13 May 2003 21:12:03 -0400 Subject: [Python-Dev] Re: __slots__ and default values References: <1052839518.3ec10e5e3f7c0@mcherm.com> Message-ID: <002301c319b5$d44b8580$febd958d@oemcomputer> > Aahz replies: > > Why not do the initializing in __init__? > > Michael: > I presume that Raymond's concern was not that there wouldn't be > a way to do initialization, but that this would become a new c.l.p > FAQ and point of confusion for newbies. Unfortunately, I fear > that it will. Yes. Since it "works" with classic classes and unslotted newstyle classes, it isn't terribly unreasonable to believe it would work with __slots__. Further, there is no reason it couldn't work (either through an autoinit upon instance creation or through a default entry that references the class variable). 
> Michael: > Already I am seeing that people are "discovering" > class variables as a sort of "initialized instance variable" > instead of using __init__ as they "ought" to. Of course, it's NOT > an initialized instance variable, but newbies stumble across it > and seem to prefer it to using __init__. Perhaps I skipped school the day they taught that was bad, but it seems perfectly reasonable to me and I'm sure it is a common practice. I even find it to be clearer and more maintainable than using __init__. The only downside I see is that self.classvar += 1 reads from and writes to a different place. So, a reworded version of my question is "why not?". What is the downside of providing behavior that is similar to non-slotted classes? What is gained by this blocking of an assignment and reporting a read-only error? When I find an existing class can be made lighter by using __slots__, it would be nice to transform it with a single line. From: class Tree(object): left = None right = None def __init__(self, value): self.value = value adding only one line: __slots__ = ('left', 'right', 'value') It would be a bummer to also have to move the left/right = None into __init__ and transform them into self.left = self.right = None. > Duncan Booth: > The following works, but I can't remember whether you're supposed to be > able to use a dict in __slots__ or if it just happens to be allowed: > > >>> class Pane(object): > __slots__ = { 'background': 'black', 'foreground': 'white', > 'size': (80, 25) } > def __init__(self): > for k, v in self.__slots__.iteritems(): > setattr(self, k, v) __slots__ accepts any iterable. So, yes, you're allowed even though that particular use was not intended. There are several possible workarounds including metaclasses. My question is why there needs to be a workaround at all.
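Duncan's pattern still works in modern Python (items() rather than the old iteritems()); since __slots__ only needs an iterable of names, the defaults can live in a plain dict and be copied into the slots at instance creation. A minimal sketch, with names taken from the example above:

```python
class Pane(object):
    _defaults = {'background': 'black', 'foreground': 'white',
                 'size': (80, 25)}
    __slots__ = tuple(_defaults)        # any iterable of names will do

    def __init__(self):
        # Copy each default into its slot at instance-creation time.
        for name, default in Pane._defaults.items():
            setattr(self, name, default)

p = Pane()
p.background = 'light blue'             # slots stay writable, unlike the
                                        # class-variable-shadowing trick
```

Instances still have no per-instance __dict__, so the memory savings of __slots__ are kept while every attribute gets a per-instance default.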
> Thomas Heller: > Unrelated to *this* topic, but Andrew's "What's new in Python 2.2" still > presents __slots__ as a way to constrain the instance variables: > > A new-style class can define a class attribute named __slots__ to > constrain the list of legal attribute names. Though I think of __slots__ as a way to make lighter weight instances, constraining instance variables is also one of its functions. Raymond Hettinger From guido@python.org Wed May 14 02:38:27 2003 From: guido@python.org (Guido van Rossum) Date: Tue, 13 May 2003 21:38:27 -0400 Subject: [Python-Dev] Re: __slots__ and default values In-Reply-To: "Your message of Tue, 13 May 2003 21:12:03 EDT." <002301c319b5$d44b8580$febd958d@oemcomputer> References: <1052839518.3ec10e5e3f7c0@mcherm.com> <002301c319b5$d44b8580$febd958d@oemcomputer> Message-ID: <200305140138.h4E1cRx02045@pcp02138704pcs.reston01.va.comcast.net> > Yes. Since it "works" with classic classes and unslotted newstyle > classes, it isn't terribly unreasonable to believe it would work > with __slots__. Further, there is no reason it couldn't work (either > through an autoinit upon instance creation or through a default > entry that references the class variable). Really? The metaclass would have to copy the initializers to a safekeeping place, because they compete with the slot descriptors. Don't forget that when you write __slots__ = ['a', 'b', 'c'], descriptors named a, b and c are inserted into the class dict by the metaclass. And then the metaclass would have to add a hidden initializer that initializes the slot. Very messy... > > Michael: > > Already I am seeing that people are "discovering" > > class variables as a sort of "initialized instance variable" > > instead of using __init__ as they "ought" to. Of course, it's NOT > > an initialized instance variable, but newbies stumble across it > > and seem to prefer it to using __init__. 
> > Perhaps I skipped school the day they taught that was bad, but > it seems perfectly reasonable to me and I'm sure it is a common > practice. I even find it to be clearer and more maintainable than > using __init__. The only downside I see is that self.classvar += 1 > reads from and writes to a different place. It's also a bad idea for initializers that aren't immutable, because the initial values are shared between all instances (another example of the "aliasing" problem, also known from default argument values). > So, a reworded version of my question is "why not?". What is > the downside of providing behavior that is similar to non-slotted > classes? What is gained by this blocking of an assignment and > reporting a read-only error? It's not like I did any work to prevent what you want from working. Rather, what you seem to want would be hard to implement (see above). > When I find an existing class can be made lighter by using __slots__, > it would be nice to transform it with a single line. From: > > class Tree(object): > left = None > right = None > def __init__(self, value): > self.value = value > > adding only one line: > __slots__ = ('left', 'right', 'value') > > It would be a bummer to also have to move the left/right = None > into __init__ and transform them into self.left = self.right = None. Maybe I should remove slots from the language? <0.5 wink> They seem to be the most widely misunderstood feature of Python 2.2. If you don't understand how they work, please don't use them. > > Duncan Booth: > > The following works, but I can't remember whether you're supposed to be > > able to use a dict in __slots__ or if it just happens to be allowed: > > > > >>> class Pane(object): > > __slots__ = { 'background': 'black', 'foreground': 'white', > > 'size': (80, 25) } > > def __init__(self): > > for k, v in self.__slots__.iteritems(): > > setattr(self, k, v) > > __slots__ accepts any iterable.
So, yes, you're allowed > even though that particular use was not intended. This loophole was intentionally left for people to find a good use for. > There are several possible workarounds including metaclasses. > My question is why there needs to be a workaround at all. I hope that has been answered by now. > > Thomas Heller: > > Unrelated to *this* topic, but Andrew's "What's new in Python 2.2" still > > presents __slots__ as a way to constrain the instance variables: > > > > A new-style class can define a class attribute named __slots__ to > > constrain the list of legal attribute names. > > Though I think of __slots__ as a way to make lighter weight instances, > constraining instance variables is also one of its functions. Not true. That is at best an unintended side effect of slots. And there's nothing against having __slots__ include __dict__, so your instance has a __dict__ as well as slots. --Guido van Rossum (home page: http://www.python.org/~guido/) From python@rcn.com Wed May 14 07:51:24 2003 From: python@rcn.com (Raymond Hettinger) Date: Wed, 14 May 2003 02:51:24 -0400 Subject: [Python-Dev] Re: __slots__ and default values References: <1052839518.3ec10e5e3f7c0@mcherm.com> <002301c319b5$d44b8580$febd958d@oemcomputer> <200305140138.h4E1cRx02045@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <003e01c319e5$3d044100$8427c797@oemcomputer> > It's also a bad idea for initializers that aren't immutable, because > the initial values are shared between all instances (another example > of the "aliasing" problem, also known from default argument values). Right. I once knew that and had forgotten. > > Though I think of __slots__ as a way to make lighter weight instances, > > constraining instance variables is also one of its functions. > > Not true. That is at best an unintended side effect of slots. And > there's nothing against having __slots__ include __dict__, so your > instance has a __dict__ as well as slots.
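The __dict__-in-__slots__ escape hatch Guido describes (new in 2.3 at the time, and still how it works today) can be checked directly; the class name here is illustrative:

```python
class Point(object):
    # 'x' and 'y' live in fixed slots; listing '__dict__' as well restores
    # the per-instance dict, so arbitrary extra attributes still work.
    __slots__ = ('x', 'y', '__dict__')

pt = Point()
pt.x = 1             # stored in a slot, not in the dict
pt.label = 'origin'  # not a slot name: lands in the instance __dict__
```

So slots constrain nothing once __dict__ is listed; they only guarantee fast, fixed storage for the named attributes.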
That's something I never knew but wish I had known (and I *have* read the source). Live and learn. Raymond From mwh@python.net Wed May 14 10:53:04 2003 From: mwh@python.net (Michael Hudson) Date: Wed, 14 May 2003 10:53:04 +0100 Subject: [Python-Dev] Inplace multiply In-Reply-To: <000401c31974$ac31c730$6801a8c0@NICKLEBY> ("Paul Dubois"'s message of "Tue, 13 May 2003 10:25:38 -0700") References: <000401c31974$ac31c730$6801a8c0@NICKLEBY> Message-ID: <2mvfwdsvlr.fsf@starship.python.net> "Paul Dubois" writes: > My "masked array" class MA has a problem that I don't know how to solve. The > inplace multiply function > > def __imul__ (self, other) > > is not getting called while my other input operations do work. The scenario > is > > x = MA.array(...) > > x *= c > > If c is an int, this works correctly, calling MA.__imul__. Otherwise, I get > a message from the Python runtime saying it can't multiply a sequence by a > non-int. But change MA to Numeric, it works. > > Numeric is an extension type and MA is a (new style) class. MA defines > __len__ as well as all the math operators. What version of Python? This smells like a bug that has been (thought) fixed. Cheers, M. -- The ability to quote is a serviceable substitute for wit. -- W. Somerset Maugham From guido@python.org Wed May 14 15:03:08 2003 From: guido@python.org (Guido van Rossum) Date: Wed, 14 May 2003 10:03:08 -0400 Subject: [Python-Dev] Re: __slots__ and default values In-Reply-To: Your message of "Wed, 14 May 2003 02:51:24 EDT." <003e01c319e5$3d044100$8427c797@oemcomputer> References: <1052839518.3ec10e5e3f7c0@mcherm.com> <002301c319b5$d44b8580$febd958d@oemcomputer> <200305140138.h4E1cRx02045@pcp02138704pcs.reston01.va.comcast.net> <003e01c319e5$3d044100$8427c797@oemcomputer> Message-ID: <200305141403.h4EE38F25569@odiug.zope.com> > > Not true. That is at best an unintended side effect of slots. 
And > > there's nothing against having __slots__ include __dict__, so your > instance has a __dict__ as well as slots. > > That's something I never knew but wish I had known (and I *have* > read the source). Live and learn. Actually I think that's new in 2.3. --Guido van Rossum (home page: http://www.python.org/~guido/) From paul@pfdubois.com Wed May 14 16:12:35 2003 From: paul@pfdubois.com (Paul Dubois) Date: Wed, 14 May 2003 08:12:35 -0700 Subject: [Python-Dev] inplace multiply problem was a bug that has been fixed Message-ID: <000101c31a2b$4048e1e0$6801a8c0@NICKLEBY> My question about inplace multiply was answered by Todd Miller: it was a bug in Python that is now fixed. I upgraded and my problem went away. From jeremy@zope.com Wed May 14 16:55:57 2003 From: jeremy@zope.com (Jeremy Hylton) Date: 14 May 2003 11:55:57 -0400 Subject: [Python-Dev] Startup time In-Reply-To: References: Message-ID: <1052927757.7258.38.camel@slothrop.zope.com> I don't know if this thread came to any conclusion. I did the same strace that everyone else has reported on, and I've included a summary of the results here. I have one directory in my PYTHONPATH, which affects the number of directories that are searched for each imported module. Comparing Python 2.3 current CVS with Python 2.2 CVS, I see the following system call counts (limited to top 6 in 2.3).

                2.3    2.2
    open        305    104
    stat64      102     44
    fstat64      74     34
    read         71     30
    rt_sig...    69     68
    brk          62     74

When a single module is imported from the standard library, Python 2.2 looks in 10 different places. Specifically, it looks for five different files in two different directories -- PYTHONPATH and the std library directory. For files that aren't found (e.g. sitecustomize), it looks in 25 places (5 files x 5 directories). Interesting to note that PYTHONPATH directory is not searched for sitecustomize.
In Python 2.3, the standard library module requires 15 lookups because /usr/local/lib/python23.zip is added to the path before the std library directory. The failed lookup of sitecustomize takes 35 lookups, because PYTHONPATH and python23.zip are now on the path. The list of attempted imports is much larger in 2.3 than in 2.2. -- 2.3 -- -- 2.2 -- __future__ codecs copy_reg copy_reg encodings encodings/__init__ encodings/aliases encodings/iso_8859_15 exceptions linecache os os posixpath posixpath re site site sitecustomize sitecustomize sre sre_compile sre_constants sre_parse stat stat string strop types types UserDict UserDict warnings 22 total 7 total The increase in open, stat64, and fstat64 all seem consistent with a 3x increase in the number of modules searched for. The use of re in the warnings module seems the primary culprit, since it pulls in re, sre and friends, string, and strop. Jeremy From noah@noah.org Wed May 14 17:34:32 2003 From: noah@noah.org (Noah Spurrier) Date: Wed, 14 May 2003 09:34:32 -0700 Subject: [Python-Dev] Re: sf.net/708007: expectlib.py telnetlib.py split Message-ID: <3EC27018.9020503@noah.org> Tell me more about expectlib.py. Is it pty based? I ask because I wonder if it's like my pexpect module: http://pexpect.sourceforge.net/ Yours, Noah From python@rcn.com Wed May 14 17:55:22 2003 From: python@rcn.com (Raymond Hettinger) Date: Wed, 14 May 2003 12:55:22 -0400 Subject: [Python-Dev] Re: sf.net/708007: expectlib.py telnetlib.py split References: <3EC27018.9020503@noah.org> Message-ID: <003001c31a39$9c554d80$7c22a044@oemcomputer> > Tell me more about expectlib.py. Is it pty based? 
> I ask because I wonder if it's like my pexpect module: > http://pexpect.sourceforge.net/ Hello Noah, Googling for "expect.py" gives several useful hits: http://www.google.com/search?sourceid=navclient&q=expect%2Epy If those don't answer your question, I recommend posting to the comp.lang.python newsgroup where you can benefit from the experiences of hundreds of users. The python-dev list isn't a good place to follow up because these kinds of questions are not the primary focus here. Raymond Hettinger From skip@pobox.com Wed May 14 18:35:13 2003 From: skip@pobox.com (Skip Montanaro) Date: Wed, 14 May 2003 12:35:13 -0500 Subject: [Python-Dev] Startup time In-Reply-To: <1052927757.7258.38.camel@slothrop.zope.com> References: <1052927757.7258.38.camel@slothrop.zope.com> Message-ID: <16066.32337.635236.691405@montanaro.dyndns.org> Jeremy> I don't know if this thread came to any conclusion. I don't think so. I think it bogged down about the time I suggested that executing import from within a function might slow things down. Jeremy> The use of re in the warnings module seems the primary culprit, Jeremy> since it pulls in re, sre and friends, string, and strop. I just peeked at warnings.py. None of the uses of re.* in there seem like they'd be in time-critical functions. The straightforward change (migrate "import re" into the functions which use the module) worked for me, so I went ahead and checked it in. Skip From guido@python.org Wed May 14 18:37:22 2003 From: guido@python.org (Guido van Rossum) Date: Wed, 14 May 2003 13:37:22 -0400 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Lib warnings.py,1.19,1.20 In-Reply-To: Your message of "Wed, 14 May 2003 10:33:55 PDT." References: Message-ID: <200305141737.h4EHbMv06730@odiug.zope.com> > Modified Files: > warnings.py > Log Message: > defer re module imports to help improve interpreter startup Are you sure that's going to help?
"import warnings" calls _processoptions() and makes a few calls to filterwarnings() which brings in the re module anyway... --Guido van Rossum (home page: http://www.python.org/~guido/) From jeremy@zope.com Wed May 14 18:45:33 2003 From: jeremy@zope.com (Jeremy Hylton) Date: 14 May 2003 13:45:33 -0400 Subject: [Python-Dev] Startup time In-Reply-To: <16066.32337.635236.691405@montanaro.dyndns.org> References: <1052927757.7258.38.camel@slothrop.zope.com> <16066.32337.635236.691405@montanaro.dyndns.org> Message-ID: <1052934332.7260.45.camel@slothrop.zope.com> On Wed, 2003-05-14 at 13:35, Skip Montanaro wrote: > Jeremy> I don't know if this thread came to any conclusion. > > I don't think so. I think it bogged down about the time I suggested that > executing import from within a function might slow things down. > > Jeremy> The use of re in the warnings module seems the primary culprit, > Jeremy> since it pulls in re, sre and friends, string, and strop. > > I just peeked at warnings.py. None of the uses of re.* in there seem like > they'd be in time-critical functions. The straightforward change (migrate > "import re" into the functions which use the module) worked for me, so I > went ahead and checked it in. Guido and I looked at that briefly. It doesn't make any difference, does it? The functions that use re are called when the module is imported. Jeremy From skip@pobox.com Wed May 14 19:02:05 2003 From: skip@pobox.com (Skip Montanaro) Date: Wed, 14 May 2003 13:02:05 -0500 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Lib warnings.py,1.19,1.20 In-Reply-To: <200305141737.h4EHbMv06730@odiug.zope.com> References: <200305141737.h4EHbMv06730@odiug.zope.com> Message-ID: <16066.33949.903064.834797@montanaro.dyndns.org> >> defer re module imports to help improve interpreter startup Guido> Are you sure that's going to help? "import warnings" calls Guido> _processoptions() and makes a few calls to filterwarnings() which Guido> brings in the re module anyway...
Apparently not. :-( The call to _processoptions() won't hurt unless the user invokes the interpreter with a -W arg. Not much we can do there. I think the import in filterwarnings can be avoided by deferring the re compilation until warn_explicit. I'll see what I can come up with and submit a patch. Skip From skip@pobox.com Wed May 14 19:02:42 2003 From: skip@pobox.com (Skip Montanaro) Date: Wed, 14 May 2003 13:02:42 -0500 Subject: [Python-Dev] Startup time In-Reply-To: <1052934332.7260.45.camel@slothrop.zope.com> References: <1052927757.7258.38.camel@slothrop.zope.com> <16066.32337.635236.691405@montanaro.dyndns.org> <1052934332.7260.45.camel@slothrop.zope.com> Message-ID: <16066.33986.472857.935825@montanaro.dyndns.org> Jeremy> Guido and I looked at that briefly. It doesn't make any Jeremy> difference does it? The functions that use re are called when Jeremy> the module is imported. You're right. I'll come up with something. Skip From jepler@unpythonic.net Wed May 14 19:08:03 2003 From: jepler@unpythonic.net (Jeff Epler) Date: Wed, 14 May 2003 13:08:03 -0500 Subject: [Python-Dev] Startup time In-Reply-To: <16066.33986.472857.935825@montanaro.dyndns.org> References: <1052927757.7258.38.camel@slothrop.zope.com> <16066.32337.635236.691405@montanaro.dyndns.org> <1052934332.7260.45.camel@slothrop.zope.com> <16066.33986.472857.935825@montanaro.dyndns.org> Message-ID: <20030514180801.GN11289@unpythonic.net> On Wed, May 14, 2003 at 01:02:42PM -0500, Skip Montanaro wrote: > > Jeremy> Guido and I looked at that briefly. It doesn't make any > Jeremy> difference does it? The functions that use re are called when > Jeremy> the module is imported. > > You're right. I'll come up with something. I'd suggested (or I think I suggested) that re needs to only be imported when message != "" in filterwarnings. The other use, in _processoptions -> _setoption, is only hit when sys.warnoptions has a non-empty value. 
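The deferral being discussed is the standard move-the-import-into-the-function pattern; a generic sketch (the function name is illustrative, not from warnings.py):

```python
# Module-level "import re" would be paid by every importer of this
# module at startup, pulling in the whole sre machinery.

def filter_matches(pattern, text):
    # Function-level import: the cost of loading re is deferred until
    # the first call; subsequent calls hit the sys.modules cache.
    import re
    return re.match(pattern, text) is not None
```

The trade-off noted earlier in the thread still applies: each call pays a small sys.modules lookup, and function-level imports can deadlock the import lock in threaded code (the bug 683658 caveat that warnings.py itself documents).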
This still leaves the usage of re in encodings.__init__.normalize_encoding() which I also suggested moving into the function -- however, I never checked when .normalize_encoding() is called, so it might always be hit at startup anyway. This could also be rewritten as string operations, too. Jeff From skip@pobox.com Wed May 14 19:14:43 2003 From: skip@pobox.com (Skip Montanaro) Date: Wed, 14 May 2003 13:14:43 -0500 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Lib warnings.py,1.19,1.20 In-Reply-To: <16066.33949.903064.834797@montanaro.dyndns.org> References: <200305141737.h4EHbMv06730@odiug.zope.com> <16066.33949.903064.834797@montanaro.dyndns.org> Message-ID: <16066.34707.101894.297890@montanaro.dyndns.org> Skip> I'll see what I can come up with and submit a patch. Okay, this seems too simple. ;-) There's no need to compile the message and module arguments to filterwarnings. Just store them as strings and call re.match() later with the appropriate args. That takes care of that one. I think use of the -W command line flag is infrequent enough that it doesn't really matter that _processoptions, _setoption and _getcategory might get called at startup. Most of the time sys.warnoptions will be an empty list, so _setoption and _getcategory won't be called. Aside: Why are message and module names given on the command line treated as literal strings while message and module names which are passed directly to filterwarnings() treated as regular expressions? If they were treated as regular expressions, the calls to re.escape() could be removed and _setoption wouldn't use re either. Skip From guido@python.org Wed May 14 19:23:16 2003 From: guido@python.org (Guido van Rossum) Date: Wed, 14 May 2003 14:23:16 -0400 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Lib warnings.py,1.19,1.20 In-Reply-To: Your message of "Wed, 14 May 2003 13:14:43 CDT."
<16066.34707.101894.297890@montanaro.dyndns.org> References: <200305141737.h4EHbMv06730@odiug.zope.com> <16066.33949.903064.834797@montanaro.dyndns.org> <16066.34707.101894.297890@montanaro.dyndns.org> Message-ID: <200305141823.h4EINGt15343@odiug.zope.com> > Skip> I'll see what I can come up with and submit a patch. > > Okay, this seems too simple. ;-) There's no need to compile the message and > module arguments to filterwarnings. Just store them as strings and call > re.match() later with the appropriate args. That takes care of that one. I > think use of the -W command line flag is infrequent enough that it doesn't > really matter that _processoptions, _setoption and _getcategory might get > called at startup. Most of the time sys.warnoptions will be an empty list, > so _setoption and _getcategory won't be called. OK. Please report old and new startup times! > Aside: Why are message and module names given on the command line treated as > literal strings while message and module names which are passed directly to > filterwarnings() treated as regular expressions? If they were treated as > regular expressions, the calls to re.escape() could be removed and > _setoption wouldn't use re either. Um, I don't remember. It would seem to be useful, wouldn't it? The only reason I can come up with is that for dotted names, the dot would have to be escaped on the command line, and escaping something on the command line is painful because \ is also a shell escape character, so you'd have to escape the escape.
--Guido van Rossum (home page: http://www.python.org/~guido/) From skip@pobox.com Wed May 14 20:32:23 2003 From: skip@pobox.com (Skip Montanaro) Date: Wed, 14 May 2003 14:32:23 -0500 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Lib warnings.py, 1.19, 1.20 In-Reply-To: <200305141823.h4EINGt15343@odiug.zope.com> References: <200305141737.h4EHbMv06730@odiug.zope.com> <16066.33949.903064.834797@montanaro.dyndns.org> <16066.34707.101894.297890@montanaro.dyndns.org> <200305141823.h4EINGt15343@odiug.zope.com> Message-ID: <16066.39367.822086.836812@montanaro.dyndns.org> Guido> OK. Please report old and new startup times! That seems to be a little harder than you'd think. First, I had to fix encodings/__init__.py as well. Here are some numbers. In all cases the interpreter is as I built it from CVS on May 6. PYTHONSTARTUP is not defined.

"import re" at top level:

    Without -S (best real time of 5 runs):
    % time python -c pass
    real 0m0.169s  user 0m0.100s  sys 0m0.030s

    With -S (best of 5):
    % time python -S -c pass
    real 0m0.125s  user 0m0.020s  sys 0m0.080s

Proposed mods to warnings.py (and provisional replacement of re with string translation tables in encodings/__init__.py):

    Without -S (best of 5):
    % time python -c pass
    real 0m0.187s  user 0m0.120s  sys 0m0.030s

    With -S (best of 5):
    % time python -S -c pass
    real 0m0.118s  user 0m0.020s  sys 0m0.020s

Not too exciting. I verified using -v that these modules are imported in 2.3 with no PYTHONSTARTUP and the -S flag after my changes: UserDict copy_reg linecache os posix posixpath stat types warnings zipimport Without -S (and my sitecustomize.py file moved) I get these: UserDict copy_reg linecache os posix posixpath site stat types warnings zipimport I've got to get back to some paying work, so I can't pursue this more at the moment.
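The "string translation table" replacement mentioned above presumably works along these lines -- a modern-syntax sketch of the idea, not the attached Python-2 patch itself:

```python
import string

# Characters the old regular expression [^a-zA-Z0-9.] would have kept.
_KEEP = set(string.ascii_letters + string.digits + '.')

# A 256-entry table mapping every other character to '_'.
_NORM_TABLE = {i: (chr(i) if chr(i) in _KEEP else '_') for i in range(256)}

def normalize_encoding(encoding):
    # Same effect as '_'.join(re.split('[^a-zA-Z0-9.]', encoding)),
    # but without importing re at interpreter startup.
    return encoding.translate(_NORM_TABLE)
```

Building the table once at module load is cheap; the per-call work is a single C-level translate pass instead of a regex split and join.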
Attached are my current diffs for warnings.py and encodings/__init__.py if someone has a few moments to look at it. Skip [Attachment: nore.diff -- base64-encoded context diff against Lib/warnings.py and Lib/encodings/__init__.py, omitted here] From dave@boost-consulting.com Thu May 15 01:22:46 2003 From: dave@boost-consulting.com (David Abrahams) Date: Wed, 14 May 2003 20:22:46 -0400 Subject: [Python-Dev] Re: MS VC 7 offer References: <3EBCABD0.7050700@lemburg.com> <5.1.1.6.0.20030512222353.022ade78@torment.chelsea.private> Message-ID: Barry Scott writes: > Did I miss the answer to why bother to move to VC7? > > As a C project I know of very little to recommend VC7 or VC7.1. > As a C++ developer I've decided that VC7 is little more than a broken > VC6. That was roughly my experience, support for template template arguments notwithstanding. > Maybe Jesse Lipcon (who works for MS now) has managed to > make VC7.1 more standards compatible for C++ work, which would > recommend it to C++ developers. That's not a maybe. As a hard-core C++-head, I can tell you that it's like night and day. VC7.1 is very, very good.
> Note that wxPython claims that it will not compile correctly with
> VC7 unless you add a workaround for a bug in the code generator.

It's very unlikely that this bug survived the VC7.1 release, but I suppose it's possible.

--
Dave Abrahams
Boost Consulting
www.boost-consulting.com

From tim.one@comcast.net Thu May 15 04:08:18 2003
From: tim.one@comcast.net (Tim Peters)
Date: Wed, 14 May 2003 23:08:18 -0400
Subject: [Python-Dev] Simple dicts
Message-ID:

Behind the scenes, Damien Morton has been playing with radically different designs for "small" dicts. This reminded me that, in previous lives, I always used chaining ("linked lists") for collision resolution in hash tables. I don't have time to pursue this now, but it would make a nice experiment for someone who wants to try a non-trivial but still-simple and interesting project. It may even pay off, but that shouldn't be a motivation.

Design notes:

PyDictEntry would grow a new

    PyDictEntry *next;

slot, boosting it from 12 to 16 bytes on a 32-bit box. This wasn't reasonable when Python was designed, but pymalloc allocates 16-byte chunks with virtually no wasted space.

PyDictObject would lose the ma_fill and ma_table and ma_smalltable members, and gain a pointer to a variable-size vector of pointers

    PyDictEntry **first;

For a hash table with 2**n slots, this vector holds 2**n pointers, memset to NULL initially. The hash chain for an object with hash code h starts at first[h & ma_mask], and is linked together via PyDictEntry.next pointers. Hash chains per hash code are independent.

There's no use for the "dummy" state. There's no *logical* need for the "unused" state either, although a micro-optimizer may want to retain that in some form.

Memory use isn't nearly as bad as it may first appear -- to the contrary, it's probably better on average! Assuming a 32-bit box:

Current: tables are normally 1/3 to 2/3 full.
If there are N active objects in the table, at 1/3 full the table contains 3*N PyDictEntries, and at 2/3 full it contains 1.5*N PyDictEntries, for a total of (multiplying by 12 bytes per PyDictEntry) 18*N to 36*N bytes.

Chaining: assuming tables are still 1/3 to 2/3 full. At 1/3 full there are 3*N first pointers and at 2/3 full there are 1.5*N first pointers, for a total of 6*N to 12*N bytes for first pointers. Independent of load factor, 16*N bytes are consumed by the larger PyDictEntry structs. Adding, that's 22*N to 28*N bytes. This relies on pymalloc's tiny wastage when allocating 16-byte chunks (under 1%).

The worst case is worse than the current scheme, and the best case is better. The average is probably better.

Note that "full" is a misnomer here. A chained table with 2**i slots can actually hold any number of objects, even if i==0; on average, each hash chain contains N/2**i PyDictEntry structs. Note that a small load factor is less important with chained resolution than with open addressing, because collisions at different hash codes can't interfere with each other (IOW, an object in slot #i never slows access to an object in the slot #j collision list, whenever i != j; "breathing room" to ease cross-hashcode collision pressure isn't needed; primary collisions are all that exist).

Collision resolution code: Just a list walk. For example, lookdict_string could be, in its entirety:

    static dictentry *
    lookdict_string(dictobject *mp, PyObject *key, register long hash)
    {
        dictentry *p = mp->first[hash & mp->ma_mask];
        if (PyString_CheckExact(key)) {
            for (; p != NULL; p = p->next) {
                if (p->me_key == key ||
                    (p->me_hash == hash && PyString_Eq(p->me_key, key)))
                    return p;
            }
            return NULL;
        }
        mp->ma_lookup = lookdict;
        return lookdict(mp, key, hash);
    }

Resizing: Probably much faster. The vector of first pointers can be realloc'ed, and sometimes benefit from the platform malloc extending it in-place. No other memory allocation operation is needed on a resize.
Instead about half the PyDictEntry structs will need to move to "the other half" of the table (the structs themselves don't get copied or moved; they just get relinked via their next pointers). Copying: Probably slower, due to needing a PyObject_Malloc() call for each key/value pair. Building a dict up: Probably slower, again due to repeated PyObject_Malloc() calls. Referencing a dict: Probably a wash, although because the code can be so much simpler compilers may do a better job of optimizing it, and no tests are needed to distinguish among three kinds of states. Out-of-cache dicts are killers either way. Also see next point. Optimizations: The cool algorithmic thing about chaining is that self-organizing gimmicks (like swap-toward-front (or move-to-front) on reference) are easy to code and run fast (again, the dictentry structs don't move, you just need to fiddle a few next pointers). When collision chains can collide, dynamic table reorganization is so complicated and expensive that nobody has even thought about trying it in Python. When they can't collide, it's simple. Note too that since the memory burden per unused slot falls from 12 to 4 bytes, sparser tables are less painful to contemplate. Small dicts: There's no gimmick here to favor them. From cgw@alum.mit.edu Thu May 15 05:38:26 2003 From: cgw@alum.mit.edu (Charles G Waldman) Date: Wed, 14 May 2003 23:38:26 -0500 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Lib warnings.py,1.19,1.20 In-Reply-To: <20030515030905.382.60542.Mailman@mail.python.org> References: <20030515030905.382.60542.Mailman@mail.python.org> Message-ID: <16067.6594.326128.398884@nyx.dyndns.org> GvR> only reason I can come up with is that for dotted names, the dot would GvR> have to be escaped on the command line, and escaping something on the GvR> command line is painful because \ is also a shell escape character, so GvR> you'd have to escape the escape. 
I'm afraid I must be missing something terribly obvious here, but why would you need to escape a dot on a command line? None of the shells I'm familiar with treat dot as a metacharacter. Isn't `?' the standard shell metacharacter for "any character"? Filename patterns on the shell command line are "glob patterns", not RE's. But, like I said, I'm probably missing something. I think I'll go back into the shadows to lurk some more now.... Charles From fdrake@acm.org Thu May 15 05:43:57 2003 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Thu, 15 May 2003 00:43:57 -0400 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Lib warnings.py,1.19,1.20 In-Reply-To: <16067.6594.326128.398884@nyx.dyndns.org> References: <20030515030905.382.60542.Mailman@mail.python.org> <16067.6594.326128.398884@nyx.dyndns.org> Message-ID: <16067.6925.897980.796041@grendel.zope.com> Charles G Waldman writes: > I'm afraid I must be missing something terribly obvious here, but why > would you need to escape a dot on a command line? None of the shells > I'm familiar with treat dot as a metacharacter. Isn't `?' the > standard shell metacharacter for "any character"? Filename patterns > on the shell command line are "glob patterns", not RE's. It's not the shell that treats it as a metacharacter, but the RE syntax. Preventing "." from being treated as an RE metacharacter would be done by inserting a "\" character, which is a shell metacharacter, and would need another "\" to escape that, so that one of the "\" would end up in the RE. Of course, my favorite way of dealing with this is to use single quotes around the argument rather than backslashes; that works fine in sh-syntax shells, and doesn't require doubling-up backslashes. -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From drifty@alum.berkeley.edu Thu May 15 07:45:54 2003 From: drifty@alum.berkeley.edu (Brett C.) 
Date: Wed, 14 May 2003 23:45:54 -0700 Subject: [Python-Dev] Simple dicts In-Reply-To: References: Message-ID: <3EC337A2.7060702@ocf.berkeley.edu> Tim Peters wrote: > Behind the scenes, Damien Morton has been playing with radically different > designs for "small" dicts. This reminded me that, in previous lives, I > always used chaining ("linked lists") for collision resolution in hash > tables. I don't have time to pursue this now, but it would make a nice > experiment for someone who wants to try a non-trivial but still-simple and > interesting project. It may even pay off, but that shouldn't be a > motivation . > When I took data structures I was taught that chaining was actually the easiest way to do hash tables and they still had good performance compared to open addressing. Because of this taught bias I always wondered why Python used open addressing; can someone tell me? I am interested in seeing how this would pan out, but I am unfortunately going to be busy for the next three days (if anyone is going to be at E3 Thursday or Friday for some odd reason let me know since I will be there). If someone takes this up please let me know; I am interested in helping if I can. Perhaps this should be a sandbox thing? -Brett From lkcl@samba-tng.org Thu May 15 09:59:27 2003 From: lkcl@samba-tng.org (Luke Kenneth Casson Leighton) Date: Thu, 15 May 2003 08:59:27 +0000 Subject: [Python-Dev] Re: sf.net/708007: expectlib.py telnetlib.py split In-Reply-To: <20030513225740.GH2305@localhost> References: <20030513225740.GH2305@localhost> Message-ID: <20030515085927.GD908@localhost> raymond, regarding expect.py which you give a link to: - expect.py is extremely basic, offering pretty much only read and write. what it _actually_ offers is an advantage over the python distribution's popen2/3 because it catches ptys (stdin) even on ssh and passwd. - expectlib.py [new] _is_ telnetlib.py [old] - with over-rideable read, write, open and close methods. - pexpect is like... 
an independently developed version of the above, with all of the above functionality And Then Some - including an ANSI screen emulator should an application developer choose to use it. what i figure is a sensible roadmap to suggest / propose to people: - telnetlib.py [old] gets split into telnetlib.py [patched] plus expectlib.py [patched]. - noah investigates expectlib.py and a) works some magic on it b) uses it in pexpect. - someone independently investigates expect.py's popen2 c-code capability to see if it can be merged into the python distribution. i do not know if it is a "bug" that python's popen functions cannot capture ssh / passwd but it would certainly appear to be sensible to have an option to allow ALL user input to be captured. certainly i found it a total pain two years ago to have to patch ssh to allow a user password to be accepted on the command-line! [i didn't know about expect.py then] last time i spoke to guido about the telnetlib.py/expectlib.py patch, he a) wasn't so madly busy as he is now, b) rejected the then-patch because it wasn't clean c) acknowledged that telnetlib.py is a mess and needed a complete rewrite. since that time, i notice that telnetlib.py has had a control-char handling function, which alleviates some of the need for a complete rewrite. l. On Tue, May 13, 2003 at 10:57:40PM +0000, Luke Kenneth Casson Leighton wrote: > [i am not on the python-dev list but i check the archives, please cc me] > > approximately two years ago i needed the functionality outlined > in the present telnetlib.py for several other remote protocols, > most notably commands (including ssh and bash) and also HTTP. > > i figure that this functionality should be more than invaluable > to other python developers. 
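[The pty behaviour Luke describes, that programs like ssh and passwd prompt on the controlling terminal rather than on stdin/stdout so that plain popen() pipes never see the prompt, can be demonstrated with the standard-library pty module. This is a modern, Unix-only editor's sketch, not part of any patch under discussion:

```python
import os
import pty

# pty.fork() makes the new pseudo-terminal the child's controlling tty,
# so even output the child sends to /dev/tty (as passwd does for its
# prompt) shows up on the master side, where the parent can capture it.
pid, master_fd = pty.fork()
if pid == 0:
    # Child: write a "prompt" straight to the controlling tty, bypassing
    # stdout, just as password-reading programs do.
    with open("/dev/tty", "w") as tty:
        tty.write("Password: ")
    os._exit(0)

captured = os.read(master_fd, 1024).decode()
os.waitpid(pid, 0)
os.close(master_fd)
```

A pipe-based popen() would capture nothing here, which is the whole point of routing the child through a pty.]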
From walter@livinglogic.de Thu May 15 11:05:22 2003 From: walter@livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=) Date: Thu, 15 May 2003 12:05:22 +0200 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Lib warnings.py, 1.19, 1.20 In-Reply-To: <16066.39367.822086.836812@montanaro.dyndns.org> References: <200305141737.h4EHbMv06730@odiug.zope.com> <16066.33949.903064.834797@montanaro.dyndns.org> <16066.34707.101894.297890@montanaro.dyndns.org> <200305141823.h4EINGt15343@odiug.zope.com> <16066.39367.822086.836812@montanaro.dyndns.org> Message-ID: <3EC36662.5070706@livinglogic.de> This is a multi-part message in MIME format. --------------020507060202080607040303 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 8bit Skip Montanaro wrote: > [...] > I've got to get back to some paying work, so I can't pursue this more at the > moment. Attached are my current diffs for warnings.py and encodings/ > __init__.py if someone has a few moments to look at it. Your normalize_encoding() doesn't preserve the "." and it doesn't collapse consecutive non-alphanumeric characters. Furthermore it imports the string module. How about the attached patch? Constructing the translation string might be bad for startup time. 
Bye, Walter Dörwald --------------020507060202080607040303 Content-Type: text/plain; name="diff.txt" Content-Transfer-Encoding: 7bit Content-Disposition: inline; filename="diff.txt" Index: Lib/encodings/__init__.py =================================================================== RCS file: /cvsroot/python/python/dist/src/Lib/encodings/__init__.py,v retrieving revision 1.18 diff -u -r1.18 __init__.py --- Lib/encodings/__init__.py 24 Apr 2003 16:02:49 -0000 1.18 +++ Lib/encodings/__init__.py 15 May 2003 10:00:52 -0000 @@ -32,7 +32,13 @@ _cache = {} _unknown = '--unknown--' _import_tail = ['*'] -_norm_encoding_RE = re.compile('[^a-zA-Z0-9.]') +_norm_encoding_trans = [] +for i in xrange(128): + c = chr(i) + if not c.isalnum() and not c==".": + c = "_" + _norm_encoding_trans.append(c) +_norm_encoding_trans = "".join(_norm_encoding_trans) + "_"*128 class CodecRegistryError(exceptions.LookupError, exceptions.SystemError): @@ -48,7 +54,7 @@ becomes '_'. """ - return '_'.join(_norm_encoding_RE.split(encoding)) + return '_'.join(filter(None, encoding.translate(_norm_encoding_trans).split("_"))) def search_function(encoding): --------------020507060202080607040303-- From guido@python.org Thu May 15 12:07:04 2003 From: guido@python.org (Guido van Rossum) Date: Thu, 15 May 2003 07:07:04 -0400 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Lib warnings.py,1.19,1.20 In-Reply-To: "Your message of Thu, 15 May 2003 00:43:57 EDT." <16067.6925.897980.796041@grendel.zope.com> References: <20030515030905.382.60542.Mailman@mail.python.org> <16067.6594.326128.398884@nyx.dyndns.org> <16067.6925.897980.796041@grendel.zope.com> Message-ID: <200305151107.h4FB75O17039@pcp02138704pcs.reston01.va.comcast.net> > Of course, my favorite way of dealing with this is to use single > quotes around the argument rather than backslashes; that works fine in > sh-syntax shells, and doesn't require doubling-up backslashes. 
Agreed, but you're still using two levels of quoting, and with anything less, "foo.bar" will also match a module named "foolbar". --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Thu May 15 12:10:37 2003 From: guido@python.org (Guido van Rossum) Date: Thu, 15 May 2003 07:10:37 -0400 Subject: [Python-Dev] Simple dicts In-Reply-To: "Your message of Wed, 14 May 2003 23:45:54 PDT." <3EC337A2.7060702@ocf.berkeley.edu> References: <3EC337A2.7060702@ocf.berkeley.edu> Message-ID: <200305151110.h4FBAb717062@pcp02138704pcs.reston01.va.comcast.net> > When I took data structures I was taught that chaining was actually the > easiest way to do hash tables and they still had good performance > compared to open addressing. Because of this taught bias I always > wondered why Python used open addressing; can someone tell me? It was my choice, but I don't recall why. Probably because Knuth said so. Or because it's simpler to implement with a single allocated block (I think I was aware of the cost of malloc(), or else tuples and strings would have used two blocks. BTW, why don't Unicode objects use this trick?) --Guido van Rossum (home page: http://www.python.org/~guido/) From fdrake@acm.org Thu May 15 13:09:57 2003 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Thu, 15 May 2003 08:09:57 -0400 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Lib warnings.py,1.19,1.20 In-Reply-To: <200305151107.h4FB75O17039@pcp02138704pcs.reston01.va.comcast.net> References: <20030515030905.382.60542.Mailman@mail.python.org> <16067.6594.326128.398884@nyx.dyndns.org> <16067.6925.897980.796041@grendel.zope.com> <200305151107.h4FB75O17039@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <16067.33685.659075.459229@grendel.zope.com> Guido van Rossum writes: > Agreed, but you're still using two levels of quoting, and with > anything less, "foo.bar" will also match a module named "foolbar". Agreed. 
"foo\.bar" will match "foolbar" as well, but 'foo\.bar' only matches "foo.bar". The advantage of single quotes is that you're not escaping the escape characters with themselves; what's inside the quotes is simple RE syntax, so you only need to think about one of the layers at a time. Either approach works, of course. -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From aahz@pythoncraft.com Thu May 15 13:44:34 2003 From: aahz@pythoncraft.com (Aahz) Date: Thu, 15 May 2003 08:44:34 -0400 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Lib warnings.py,1.19,1.20 In-Reply-To: <16067.33685.659075.459229@grendel.zope.com> References: <20030515030905.382.60542.Mailman@mail.python.org> <16067.6594.326128.398884@nyx.dyndns.org> <16067.6925.897980.796041@grendel.zope.com> <200305151107.h4FB75O17039@pcp02138704pcs.reston01.va.comcast.net> <16067.33685.659075.459229@grendel.zope.com> Message-ID: <20030515124434.GA20979@panix.com> On Thu, May 15, 2003, Fred L. Drake, Jr. wrote: > Guido van Rossum writes: >> >> Agreed, but you're still using two levels of quoting, and with >> anything less, "foo.bar" will also match a module named "foolbar". > > Agreed. "foo\.bar" will match "foolbar" as well, but 'foo\.bar' only > matches "foo.bar". The advantage of single quotes is that you're not > escaping the escape characters with themselves; what's inside the > quotes is simple RE syntax, so you only need to think about one of the > layers at a time. The point is that with current behavior you can use foo.bar on the command line and not worry, because "." is a meta character in neither shell nor Python. -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "In many ways, it's a dull language, borrowing solid old concepts from many other languages & styles: boring syntax, unsurprising semantics, few automatic coercions, etc etc. But that's one of the things I like about it." 
--Tim Peters on Python, 16 Sep 93 From gward@python.net Thu May 15 14:39:27 2003 From: gward@python.net (Greg Ward) Date: Thu, 15 May 2003 09:39:27 -0400 Subject: [Python-Dev] Simple dicts In-Reply-To: <3EC337A2.7060702@ocf.berkeley.edu> References: <3EC337A2.7060702@ocf.berkeley.edu> Message-ID: <20030515133927.GA15523@cthulhu.gerg.ca> On 14 May 2003, Brett C. said: > When I took data structures I was taught that chaining was actually the > easiest way to do hash tables and they still had good performance > compared to open addressing. Because of this taught bias I always > wondered why Python used open addressing; can someone tell me? If your nodes are small, chaining has a huge overhead -- an extra pointer for each node in a chain. You can play around with glomming several nodes together to amortize the cost of those pointers, but ISTR the win isn't that big. Open addressing is more memory-efficient, but when the hash table fills (or gets close to full), you absolutely positively have to rehash. (Back in January, I played around with writing a custom hash table for keeping ZODB indexes in memory without using a Python dict, so that's why I'm fairly fresh on hash table minutiae.) Greg -- Greg Ward http://www.gerg.ca/ NOBODY expects the Spanish Inquisition! 
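[For readers following the thread, the chained scheme Tim and Greg discuss above is easy to sketch in Python. This toy is an editor's illustration, not anyone's proposed implementation; it shows the shape of the thing: a vector of "first" pointers, per-slot linked lists, and no forced rehash when chains grow, at the cost of one link field per node, which is exactly the overhead Greg mentions:

```python
class ChainedDict:
    """Toy hash table using chaining, in the style of Tim's design notes.

    Each slot heads a linked list of [key, value, next] nodes; the slot
    vector is the analogue of the proposed PyDictEntry **first.
    """

    def __init__(self, nslots=8):
        self.first = [None] * nslots   # one "first pointer" per slot
        self.mask = nslots - 1

    def insert(self, key, value):
        i = hash(key) & self.mask
        node = self.first[i]
        while node is not None:        # walk this slot's chain
            if node[0] == key:
                node[1] = value        # key already present: overwrite
                return
            node = node[2]
        # Prepend a new node.  Chains can grow without bound, so unlike
        # open addressing there is never a *forced* resize.
        self.first[i] = [key, value, self.first[i]]

    def lookup(self, key):
        node = self.first[hash(key) & self.mask]
        while node is not None:
            if node[0] == key:
                return node[1]
            node = node[2]
        raise KeyError(key)
```

In C the per-node link is the jump from 12-byte to 16-byte PyDictEntry structs that Tim's memory arithmetic accounts for.]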
From skip@pobox.com Thu May 15 15:20:36 2003 From: skip@pobox.com (Skip Montanaro) Date: Thu, 15 May 2003 09:20:36 -0500 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Lib warnings.py,1.19,1.20 In-Reply-To: <16067.33685.659075.459229@grendel.zope.com> References: <20030515030905.382.60542.Mailman@mail.python.org> <16067.6594.326128.398884@nyx.dyndns.org> <16067.6925.897980.796041@grendel.zope.com> <200305151107.h4FB75O17039@pcp02138704pcs.reston01.va.comcast.net> <16067.33685.659075.459229@grendel.zope.com> Message-ID: <16067.41524.485389.151519@montanaro.dyndns.org> Fred> Guido van Rossum writes: >> Agreed, but you're still using two levels of quoting, and with >> anything less, "foo.bar" will also match a module named "foolbar". Fred> Agreed. "foo\.bar" will match "foolbar" as well, but 'foo\.bar' Fred> only matches "foo.bar". Coming back to my original question, does it make sense to allow regular expressions in the message and module fields in a -W command line arg? The complexity of all the shell/re quoting suggests not, but having -W args treated differently than the args to filterwarnings() doesn't seem right. Perhaps this is something that never happens in practice. I've never used -W. Are there people out there who have used it and wished the message and module fields could be regular expressions? Conversely, does anyone make use of the fact that the message and module args to filterwarnings() can be regular expressions? Looking through the Python source I see several examples of filterwarning() where one or the other of the message and module args are regular expressions, so that answers the second question. The first remains open. Skip From guido@python.org Thu May 15 15:27:58 2003 From: guido@python.org (Guido van Rossum) Date: Thu, 15 May 2003 10:27:58 -0400 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Lib warnings.py,1.19,1.20 In-Reply-To: Your message of "Thu, 15 May 2003 09:20:36 CDT." 
<16067.41524.485389.151519@montanaro.dyndns.org> References: <20030515030905.382.60542.Mailman@mail.python.org> <16067.6594.326128.398884@nyx.dyndns.org> <16067.6925.897980.796041@grendel.zope.com> <200305151107.h4FB75O17039@pcp02138704pcs.reston01.va.comcast.net> <16067.33685.659075.459229@grendel.zope.com> <16067.41524.485389.151519@montanaro.dyndns.org> Message-ID: <200305151427.h4FERwL14363@odiug.zope.com> > Fred> Guido van Rossum writes: > >> Agreed, but you're still using two levels of quoting, and with > >> anything less, "foo.bar" will also match a module named "foolbar". > > Fred> Agreed. "foo\.bar" will match "foolbar" as well, but 'foo\.bar' > Fred> only matches "foo.bar". > > Coming back to my original question, does it make sense to allow regular > expressions in the message and module fields in a -W command line arg? The > complexity of all the shell/re quoting suggests not, but having -W args > treated differently than the args to filterwarnings() doesn't seem right. > > Perhaps this is something that never happens in practice. I've never used > -W. Are there people out there who have used it and wished the message and > module fields could be regular expressions? Conversely, does anyone make > use of the fact that the message and module args to filterwarnings() can be > regular expressions? > > Looking through the Python source I see several examples of filterwarning() > where one or the other of the message and module args are regular > expressions, so that answers the second question. The first remains open. I'll call YAGNI on regexps for -W. --Guido van Rossum (home page: http://www.python.org/~guido/) From tim.one@comcast.net Thu May 15 15:33:56 2003 From: tim.one@comcast.net (Tim Peters) Date: Thu, 15 May 2003 10:33:56 -0400 Subject: [Python-Dev] Simple dicts In-Reply-To: <3EC337A2.7060702@ocf.berkeley.edu> Message-ID: [Brett C.] 
> When I took data structures I was taught that chaining was actually the > easiest way to do hash tables and they still had good performance > compared to open addressing. Because of this taught bias I always > wondered why Python used open addressing; can someone tell me? malloc overhead is a major drag; as my msg said, the feasibility depends on pymalloc's very low overhead, and that dictentry nodes are 12 bytes apiece even now; pymalloc didn't exist back then, and Python wasn't originally micro-optimized as it is now (e.g., there wasn't the current zoo of dedicated free-lists, or pre-allocation strategies, and dicts could *only* be indexed by strings so collision only had to worry about the kinds of problems a single known hash function was prone to). > I am interested in seeing how this would pan out, but I am unfortunately > going to be busy for the next three days (if anyone is going to be at E3 > Thursday or Friday for some odd reason let me know since I will be > there). If someone takes this up please let me know; I am interested in > helping if I can. Perhaps this should be a sandbox thing? There's no rush , and I'd be surprised if Python adopted a different scheme in the end anyway. It's likely a just-for-fun to-see-what-happens project. Note one nasty class of problem: in chaining *only* primary collisions exist. The current dict implementation turned the problem of open-addressing's secondary collisions into "a feature", which will become clear when you contemplate dictobject.c's [i << 16 for i in range(20000)] example. Python's original dict design didn't have a problem with this because it used prime numbers for table sizes and reduced 32-bit hashes via mod-by-a-prime. The current scheme of just grabbing the last i bits is both much faster and more delicate than that, and we really rely on the collision resolution strategy now to protect against unlucky bit patterns. 
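[Tim's [i << 16 for i in range(20000)] example is easy to reproduce. With a 2**16-slot table, taking only the last 16 bits of the hash puts every one of those keys in slot 0; it is the perturb recurrence that feeds the high bits into the probe sequence and rescues the case. An editor's Python rendering of the probing logic in dictobject.c, not the real C code:

```python
def place_all(hashes, power):
    """Insert hash codes into an open-addressed table of 2**power slots,
    probing with a Python rendering of dictobject.c's recurrence:

        i = (i << 2) + i + perturb + 1;  perturb >>= 5;
    """
    mask = (1 << power) - 1
    table = [None] * (mask + 1)
    total_probes = 0
    for h in hashes:
        i = h & mask                   # only the last `power` bits
        perturb = h
        while table[i] is not None:    # collision: keep probing
            total_probes += 1
            i = (i * 5 + perturb + 1) & mask
            perturb >>= 5              # let the high bits participate
        table[i] = h
    return table, total_probes

keys = [i << 16 for i in range(20000)]
# Every key has zero low bits, so the first probe is slot 0 for all of them:
assert len({k & 0xffff for k in keys}) == 1
# ... yet the perturbed probe sequence still places every key:
table, probes = place_all(keys, 16)
assert sum(1 for slot in table if slot is not None) == len(keys)
```

Once perturb has been shifted down to zero the recurrence degenerates to i = 5*i + 1 (mod 2**power), which visits every slot, so the loop always terminates.]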
Another way of looking at this is that the current scheme has a way to get all 32 bits of a hash code to participate in which table slot gets selected; mod-by-an-odd-prime also gets all bits into play; peel-off-the-last-i-bits does not. From damien.morton@acm.org Thu May 15 23:25:13 2003 From: damien.morton@acm.org (Damien Morton) Date: Thu, 15 May 2003 18:25:13 -0400 Subject: [Python-Dev] Simple dicts Message-ID: <006301c31b30$da69e8e0$6401a8c0@damien> Im currently working on an open-chaining dict. Between paying work and coming to grips with the python innards, it might take a little while. I was working on an implementation optimised for dicts with <256 entries that attempted to squeeze the most out of memory by using bytes for the 'first' and 'next' pointers. This kind of hashtable can be _extremely_ sparse compared to the current dict implementation. With the byte-oriented open-chaining approach, the break-even point for memory usage (compared to the current approach) happens at a max load factor of about 0.1. Im not sure that alloc()/free() for each dictentry is a win (if only because of pymalloc call overhead), and instead imagine a scheme whereby each dict would pre-alloc() a block of memory and manages its own free-lists. Theoretically, this makes copying and disposing of dicts much easier. It also helps ensure locality of reference. In fact, immediately after a doubling, the open-addressing hashtable scheme still 'uses' (in the sense of potentially addressing) all of the memory allocated to it, whereas the open-chaining approach 'uses' only the first pointers and the actual dictentries in use - about 2/3 of the space the open-addressing scheme uses. On the other hand, as Tim pointed out to me in a private email, there is so much overhead in just getting to the hashtable inner loop, going around that loop one time instead of two or three seems inconsequential. 
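[Damien's break-even figure above can be sanity-checked with back-of-the-envelope arithmetic, using the sizes from Tim's earlier message (12-byte open-addressing entries, 16-byte chained nodes, 4-byte pointers) plus an assumed 13 bytes per entry for the byte-linked small-dict variant; the constants here are an editor's guesses, not Damien's measurements:

```python
def open_addressing_bytes(n, load):
    """Current scheme: n live entries in a table kept `load` full."""
    return 12 * n / load               # 12-byte slots, most of them empty

def chained_bytes(n, load):
    """Pointer-chained: a first-pointer vector plus 16-byte nodes."""
    return 4 * n / load + 16 * n

def byte_chained_bytes(n, load):
    """Byte-linked variant: 1-byte 'first'/'next' indexes (<= 256 entries)."""
    return 1 * n / load + (12 + 1) * n

n = 100
# Tim's 22*N-to-28*N range for pointer chaining at 2/3-to-1/3 full:
assert abs(chained_bytes(n, 2 / 3) - 22 * n) < 1e-6
assert abs(chained_bytes(n, 1 / 3) - 28 * n) < 1e-6
# The byte-linked table can afford to be very sparse: even at 10% load it
# is no bigger than the current scheme at its sparsest (1/3 full) point.
assert byte_chained_bytes(n, 0.10) <= open_addressing_bytes(n, 1 / 3)
```

With these assumed sizes the break-even load factor lands in the 0.05-to-0.2 range depending on which end of the current scheme's 18*N-to-36*N span you compare against, which is consistent with Damien's "about 0.1".]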
On the third hand, first-miss and first-hit lookups are simple enough that they could easily be put into a macro.

I will need to take a closer look at Oren Tirosh's fastnames patch.

I have a question that someone may be able to answer:

There seem to be two different ways to get/set/del from a dictionary.

The first is using PyDict_[Get|Set|Del]Item()

The second is using the embarrassingly named dict_ass_sub() and its partner dict_subscript().

Which of these two access methods is most likely to be used?

From lkcl@samba-tng.org Thu May 15 22:44:18 2003
From: lkcl@samba-tng.org (Luke Kenneth Casson Leighton)
Date: Thu, 15 May 2003 21:44:18 +0000
Subject: [Python-Dev] [PEP] += on return of function call result
In-Reply-To:
References: <20030402090726.GN1048@localhost>
Message-ID: <20030515214417.GF3900@localhost>

On Wed, Apr 02, 2003 at 09:54:35AM -0500, Andrew Koenig wrote:
> Luke> example code:
> Luke> log = {}
>
> Luke> for t in range(5):
> Luke>     for r in range(10):
> Luke>         log.setdefault(r, '') += "test %d\n" % t
>
> Luke> pprint(log)
>
> Luke> instead, as the above is not possible, the following must be used:
>
> Luke> from operator import add
>
> Luke> ...
> Luke> ...
> Luke> ...
>
> Luke> add(log.setdefault(r, ''), "test %d\n" % t)
>
> Luke> ... ARGH! just checked - NOPE! add doesn't work.
> Luke> and there's no function "radd" or "__radd__" in the
> Luke> operator module.
>
> Why can't you do this?
>
> for t in range(5):
>     for r in range(10):
>         foo = log.setdefault(r,'')
>         foo += "test %d\n" % t

after running this code, log = {0: '', 1: '', 2: '', 3: '' ... 9: ''} and foo equals "test 4\n".

if, however, you do this:

    for t in range(5):
        for r in range(10):
            foo = log.setdefault(r,[])
            foo.append("test %d\n" % t)

then empirically i conclude that you DO end up with the expected results (but is this true all of the time?)
the reason why your example, andrew, does not work, is because '' is a string - a basic type to which a pointer is NOT returned i presume that the foo += "test %d"... returns a DIFFERENT result object such that the string in the dictionary is DIFFERENT from the string result of foo being updated. if that makes absolutely no sense whatsoever then think of it being the difference between integers and pointers-to-integers in c. can anyone tell me if there are any PARTICULAR circumstances where foo = log.setdefault(r,[]) foo.append("test %d\n" % t) will FAIL to work as expected? andrew, sorry it took me so long to respond: i initially thought that under all circumstances for all types of foo, your example would work. l. -- -- expecting email to be received and understood is a bit like picking up the telephone and immediately dialing without checking for a dial-tone; speaking immediately without listening for either an answer or ring-tone; hanging up immediately and then expecting someone to call you (and to be able to call you). -- every day, people send out email expecting it to be received without being tampered with, read by other people, delayed or simply - without prejudice but lots of incompetence - destroyed. -- please therefore treat email more like you would a CB radio to communicate across the world (via relaying stations): ask and expect people to confirm receipt; send nothing that you don't mind everyone in the world knowing about... From guido@python.org Fri May 16 01:27:58 2003 From: guido@python.org (Guido van Rossum) Date: Thu, 15 May 2003 20:27:58 -0400 Subject: [Python-Dev] Simple dicts In-Reply-To: "Your message of Thu, 15 May 2003 18:25:13 EDT." <006301c31b30$da69e8e0$6401a8c0@damien> References: <006301c31b30$da69e8e0$6401a8c0@damien> Message-ID: <200305160027.h4G0Rwa17853@pcp02138704pcs.reston01.va.comcast.net> > There seem to be two different ways to get/set/del from a dictionary. 
> > The first is using PyDict_[Get|Set|Del]Item() This is the API that all C code uses (except code that doesn't know whether it's dealing with dicts or some other mapping, which has to use PyObject_GetItem() etc, which is even slower). > The second is using the embarrassingly named dict_ass_sub() and its > partner dict_subscript(). This is what PyObject_GetItem() calls. > Which of these two access methods is most likely to be used? That's a hard question. Maybe a profiler can answer. The thing is, there's a lot of C code that calls PyDict_GetItem() directly, e.g. the attribute lookup code. But of course there's also a lot of Python code using dicts. Yet, I'd bet on PyDict_*(). --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Fri May 16 01:32:19 2003 From: guido@python.org (Guido van Rossum) Date: Thu, 15 May 2003 20:32:19 -0400 Subject: [Python-Dev] Vacation; Python 2.2.3 release. Message-ID: <200305160032.h4G0WJx17890@pcp02138704pcs.reston01.va.comcast.net> I'm going on vacation tomorrow; I'll be in Holland for 10 days and will return to the US on May 26. I expect to have some email access but won't use it much. Now, I'd like Python 2.2.3 to be released soon. Barry has volunteered to be the release manager. I think it's pretty much ready to go out any time, except that Jeremy mentioned that he has a few things he'd like to backport; since Jeremy and Barry share an office I'm sure they can work this out. :-) I won't be disappointed if 2.2.3 hasn't been released yet when I'm back, but I won't be surprised if in fact it does go out while I'm gone -- it's ready, stick a fork in it! :-) --Guido van Rossum (home page: http://www.python.org/~guido/) From cnetzer@mail.arc.nasa.gov Fri May 16 01:56:43 2003 From: cnetzer@mail.arc.nasa.gov (Chad Netzer) Date: 15 May 2003 17:56:43 -0700 Subject: [Python-Dev] Vacation; Python 2.2.3 release.
In-Reply-To: <200305160032.h4G0WJx17890@pcp02138704pcs.reston01.va.comcast.net> References: <200305160032.h4G0WJx17890@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <1053046603.972.31.camel@sayge.arc.nasa.gov> On Thu, 2003-05-15 at 17:32, Guido van Rossum wrote: > Now, I'd like Python 2.2.3 to be released soon. Barry has volunteered > to be the release manager. Stupid question. Where can I get a prerelease (or CVS access) to 2.2.3, or a list of patches/features applied since 2.2.2? I looked around for the info, but apparently not hard enough (or I just don't understand CVS branching well enough). Chad From tim.one@comcast.net Fri May 16 02:06:27 2003 From: tim.one@comcast.net (Tim Peters) Date: Thu, 15 May 2003 21:06:27 -0400 Subject: [Python-Dev] Vacation; Python 2.2.3 release. In-Reply-To: <1053046603.972.31.camel@sayge.arc.nasa.gov> Message-ID: [Chad Netzer] > Stupid question. Where can I get a prerelease (or CVS access) to 2.2.3, > or a list of patches/features applied since 2.2.2? I looked around for > the info, but apparently not hard enough (or I just don't understand CVS > branching well enough). It's not a stupid question, it's a maddening feature of CVS that there's no place to store meta-data about branches. What you want to do is pass this argument to the checkout command: -r release22-maint There's no reasonable way you could have guessed that. The Misc/NEWS file in that branch summarizes the changes since 2.2.2 (at least those fixes that people bothered to make a NEWS entry for ). From guido@python.org Fri May 16 02:28:01 2003 From: guido@python.org (Guido van Rossum) Date: Thu, 15 May 2003 21:28:01 -0400 Subject: [Python-Dev] codeop: small details (Q); commit priv request In-Reply-To: "Your message of Mon, 12 May 2003 16:48:01 +0200."
<5.2.1.1.0.20030512140727.02362ab0@localhost> References: <5.2.1.1.0.20030512140727.02362ab0@localhost> Message-ID: <200305160128.h4G1S1U18083@pcp02138704pcs.reston01.va.comcast.net> > 1) > > Python 2.3b1 (#40, Apr 25 2003, 19:06:24) [MSC v.1200 32 bit (Intel)] on win32 > Type "help", "copyright", "credits" or "license" for more information. > >>> import codeop > >>> codeop.compile_command("",symbol="eval") > Traceback (most recent call last): > File "", line 1, in ? > File "s:\transit\py23\lib\codeop.py", line 129, in compile_command > return _maybe_compile(_compile, source, filename, symbol) > File "s:\transit\py23\lib\codeop.py", line 106, in _maybe_compile > raise SyntaxError, err1 > File "", line 1 > pass > ^ > SyntaxError: invalid syntax > > > the error is basically an artifact of the logic that enforces: > > compile_command("",symbol="single") === compile_command("pass",symbol="single") > > (this makes typing enter immediately after the prompt at a simulated shell > a nop as expected) > > I would expect > > compile_command("",symbol="eval") > > to return None, i.e. to simply signal an incomplete expression (that is > what would happen if the code for "eval" case would avoid the cited logic). Thanks for reporting this. I've fixed this by avoiding the change to "pass" when symbol == "eval". > 2) symbol = "exec" is silently accepted but the documentation intentionally > only refers to "exec" and "single" as valid values for symbol. Maybe a > ValueError should be raised. I don't know that that is intentional. I'd say that, like for the built-in compile(), the valid values for symbol should be "eval", "exec", and "single", and the docs ought to be updated (I didn't fix this). > Context: I was working on improving Jython codeop compatibility with > CPython codeop. Cool. 
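[Editor's note: the behaviour under discussion is easy to probe with the codeop module; the sketch below shows the eval-symbol contract as it stands in modern CPython, which includes the fix Guido describes — a complete expression compiles, an unfinished one signals "incomplete" by returning None, and the blank-line-means-pass logic applies only to "single".]

```python
import codeop

# A complete expression compiles to an evaluable code object...
code = codeop.compile_command("1 + 1", symbol="eval")
assert code is not None
assert eval(code) == 2

# ...while an obviously unfinished one returns None ("keep typing").
assert codeop.compile_command("(1 +", symbol="eval") is None

# For the default "single" symbol, a blank line compiles as a no-op,
# so hitting Enter at a simulated shell prompt does nothing.
assert codeop.compile_command("") is not None
```

This None-means-incomplete protocol is what code.InteractiveInterpreter relies on to decide between the `>>>` and `...` prompts.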
> Btw, as considered here by Guido > http://sourceforge.net/tracker/index.php?func=detail&aid=645404&group_id=5470&atid=305470 > I would ask to have commit privileges for CPython Barry has sworn you in by now. Welcome to the club! --Guido van Rossum (home page: http://www.python.org/~guido/) From Anthony Baxter Fri May 16 02:47:11 2003 From: Anthony Baxter (Anthony Baxter) Date: Fri, 16 May 2003 11:47:11 +1000 Subject: [Python-Dev] Vacation; Python 2.2.3 release. In-Reply-To: <200305160032.h4G0WJx17890@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <200305160147.h4G1lCa09066@localhost.localdomain> >>> Guido van Rossum wrote > Now, I'd like Python 2.2.3 to be released soon. Barry has volunteered > to be the release manager. I think it's pretty much ready to go out > any time, except that Jeremy mentioned that he has a few things he'd > like to backport; since Jeremy and Barry share an office I'm sure they > can work this out. :-) There's a bunch of cvs commit messages I've saved off as "potential branch-patches". I might try to get to them this weekend. -- Anthony Baxter It's never too late to have a happy childhood. From barry@python.org Fri May 16 03:04:57 2003 From: barry@python.org (Barry Warsaw) Date: 15 May 2003 22:04:57 -0400 Subject: [Python-Dev] Vacation; Python 2.2.3 release. In-Reply-To: <200305160032.h4G0WJx17890@pcp02138704pcs.reston01.va.comcast.net> References: <200305160032.h4G0WJx17890@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <1053050696.26479.35.camel@geddy> On Thu, 2003-05-15 at 20:32, Guido van Rossum wrote: > I'm going on vacation tomorrow; I'll be in Holland for 10 days and > will return to the US on May 26. I expect to have some email access > but won't use it much. > > Now, I'd like Python 2.2.3 to be released soon. Barry has volunteered > to be the release manager. 
I think it's pretty much ready to go out > any time, except that Jeremy mentioned that he has a few things he'd > like to backport; since Jeremy and Barry share an office I'm sure they > can work this out. :-) > > I won't be disappointed if 2.2.3 hasn't been released yet when I'm > back, but I won't be surprised if in fact it does go out while I'm > gone -- it's ready, stick a fork in it! :-) FWIW, I'm going to be around, and am fairly free during the US Memorial Day weekend 24th - 26th. Can we shoot for getting a release out that weekend? If we can code freeze by the 22nd, I can throw together a release candidate on Friday (with Tim's help for Windows) and a final by Monday. What do you folks think? -Barry From fdrake@acm.org Fri May 16 03:30:07 2003 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Thu, 15 May 2003 22:30:07 -0400 Subject: [Python-Dev] Vacation; Python 2.2.3 release. In-Reply-To: <1053050696.26479.35.camel@geddy> References: <200305160032.h4G0WJx17890@pcp02138704pcs.reston01.va.comcast.net> <1053050696.26479.35.camel@geddy> Message-ID: <16068.19759.176884.680744@grendel.zope.com> Barry Warsaw writes: > FWIW, I'm going to be around, and am fairly free during the US Memorial > Day weekend 24th - 26th. Can we shoot for getting a release out that > weekend? If we can code freeze by the 22nd, I can throw together a > release candidate on Friday (with Tim's help for Windows) and a final by > Monday. I'll be away that Friday through Tuesday, and don't expect any kind of internet/email access. -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From barry@python.org Fri May 16 03:43:09 2003 From: barry@python.org (Barry Warsaw) Date: 15 May 2003 22:43:09 -0400 Subject: [Python-Dev] Vacation; Python 2.2.3 release. 
In-Reply-To: <16068.19759.176884.680744@grendel.zope.com> References: <200305160032.h4G0WJx17890@pcp02138704pcs.reston01.va.comcast.net> <1053050696.26479.35.camel@geddy> <16068.19759.176884.680744@grendel.zope.com> Message-ID: <1053052989.26479.39.camel@geddy> On Thu, 2003-05-15 at 22:30, Fred L. Drake, Jr. wrote: > Barry Warsaw writes: > > FWIW, I'm going to be around, and am fairly free during the US Memorial > > Day weekend 24th - 26th. Can we shoot for getting a release out that > > weekend? If we can code freeze by the 22nd, I can throw together a > > release candidate on Friday (with Tim's help for Windows) and a final by > > Monday. > > I'll be away that Friday through Tuesday, and don't expect any kind of > internet/email access. So can we have all the doc changes in place before then, or should we freeze on Wednesday? -Barry From fdrake@acm.org Fri May 16 04:03:11 2003 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Thu, 15 May 2003 23:03:11 -0400 Subject: [Python-Dev] Vacation; Python 2.2.3 release. In-Reply-To: <1053052989.26479.39.camel@geddy> References: <200305160032.h4G0WJx17890@pcp02138704pcs.reston01.va.comcast.net> <1053050696.26479.35.camel@geddy> <16068.19759.176884.680744@grendel.zope.com> <1053052989.26479.39.camel@geddy> Message-ID: <16068.21743.727134.815827@grendel.zope.com> Barry Warsaw writes: > So can we have all the doc changes in place before then, or should we > freeze on Wednesday? We can probably have things done; there's only one big thing that needs to be back-ported in the docs. (I'm pretty sure we solved a fonts problem on the trunk; that fix really needs to be back-ported, but I'll have to spend a little time digging it out. This is really reason to separate the documentation processing tools from the doc tree.) Normally, the docs distributed with a release candidate are marked as being for the RC in the versioning; I could build both sets of packages ahead of time if we get the CVS tagging right. 
That would prevent any changes to the docs after the RC, which should be fine. We can deal with the mechanics of that next week, to the extent that anything much needs to happen. -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From mwh@python.net Fri May 16 11:47:51 2003 From: mwh@python.net (Michael Hudson) Date: Fri, 16 May 2003 11:47:51 +0100 Subject: [Python-Dev] Vacation; Python 2.2.3 release. In-Reply-To: <200305160147.h4G1lCa09066@localhost.localdomain> (Anthony Baxter's message of "Fri, 16 May 2003 11:47:11 +1000") References: <200305160147.h4G1lCa09066@localhost.localdomain> Message-ID: <2mn0hnyxpk.fsf@starship.python.net> Anthony Baxter writes: >>>> Guido van Rossum wrote >> Now, I'd like Python 2.2.3 to be released soon. Barry has volunteered >> to be the release manager. I think it's pretty much ready to go out >> any time, except that Jeremy mentioned that he has a few things he'd >> like to backport; since Jeremy and Barry share an office I'm sure they >> can work this out. :-) > > There's a bunch of cvs commit messages I've saved off as "potential > branch-patches". I might try to get to them this weekend. My python-bugfixes mbox is still online: http://starship.python.net/crew/mwh/python-bugfixes Some of it might still be relevant -- I haven't been that conscientious about keeping it up to date. Cheers, M. -- The PROPER way to handle HTML postings is to cancel the article, then hire a hitman to kill the poster, his wife and kids, and fuck his dog and smash his computer into little bits. Anything more is just extremism. -- Paul Tomblin, asr From ark@research.att.com Fri May 16 13:07:23 2003 From: ark@research.att.com (Andrew Koenig) Date: 16 May 2003 08:07:23 -0400 Subject: [Python-Dev] [PEP] += on return of function call result In-Reply-To: <20030515214417.GF3900@localhost> References: <20030402090726.GN1048@localhost> <20030515214417.GF3900@localhost> Message-ID: ark> Why can't you do this?
ark> for t in range(5): ark> for r in range(10): ark> foo = log.setdefault(r,'') ark> foo += "test %d\n" % t Luke> after running this code, Luke> log = {0: '', 1: '', 2:'', 3: '' ... 9: ''} Luke> and foo equals "test 5". Then that is what foo would be if you were able to write log.setdefault(r,'') += "test %d\n" % t as you had wished. Luke> if, however, you do this: Luke> for t in range(5): Luke> for r in range(10): Luke> foo = log.setdefault(r,[]) Luke> foo.append("test %d\n" % t) Luke> then empirically i conclude that you DO end up with the Luke> expected results (but is this true all of the time?) I presume that is because you are now dealing with vectors instead of strings. In that case, you could also have written for t in range(5): for r in range(10): foo = log.setdefault(r,[]) foo += ["test %d\n" % t] with the same effect. Luke> the reason why your example, andrew, does not work, is Luke> because '' is a string - a basic type to which a pointer is Luke> NOT returned i presume that the foo += "test %d"... returns a Luke> DIFFERENT result object such that the string in the dictionary Luke> is DIFFERENT from the string result of foo being updated. Well, yes. But that is what you would have gotten had you been allowed to write log.setdefault(r,"") += in the first place. Luke> if that makes absolutely no sense whatsoever then think of it Luke> being the difference between integers and pointers-to-integers Luke> in c. I think this analogy is pointless, as the only people who will understand it are those who didn't need it in the first place :-) Luke> can anyone tell me if there are any PARTICULAR circumstances where Luke> foo = log.setdefault(r,[]) Luke> foo.append("test %d\n" % t) Luke> will FAIL to work as expected? It will fail if your expectations are incorrect or unrealistic. Luke> andrew, sorry it took me so long to respond: i initially Luke> thought that under all circumstances for all types of foo, Luke> your example would work. But it does!
At least in the sense of the original query. The original query was of the form Why can't I write an expression like f(x) += y? and my answer was, in effect, If you could, it would have the same effect as if you had written foo = f(x) foo += y and then used the value of foo. Perhaps I'm missing something, but I don't think that anything you've said contradicts this answer. -- Andrew Koenig, ark@research.att.com, http://www.research.att.com/info/ark From jepler@unpythonic.net Fri May 16 13:34:42 2003 From: jepler@unpythonic.net (Jeff Epler) Date: Fri, 16 May 2003 07:34:42 -0500 Subject: [Python-Dev] [PEP] += on return of function call result In-Reply-To: References: <20030402090726.GN1048@localhost> <20030515214417.GF3900@localhost> Message-ID: <20030516123440.GA933@unpythonic.net> It seems almost within the bounds of possibility that pychecker could learn to find bugs of the form t = expression # t results from computation t += i # inplace op on (immutable/no-__iadd__) t del t # or t otherwise not used before function return by doing type and liveness analysis on t. (the type analysis being the hard part) Is there any time that the described situation would not be a bug? I can't see it. Jeff From lkcl@samba-tng.org Fri May 16 15:24:51 2003 From: lkcl@samba-tng.org (Luke Kenneth Casson Leighton) Date: Fri, 16 May 2003 14:24:51 +0000 Subject: [Python-Dev] [PEP] += on return of function call result In-Reply-To: References: <20030402090726.GN1048@localhost> <20030515214417.GF3900@localhost> Message-ID: <20030516142451.GI6196@localhost> On Fri, May 16, 2003 at 08:07:23AM -0400, Andrew Koenig wrote: > ark> Why can't you do this? > > ark> for t in range(5): > ark> for r in range(10): > ark> foo = log.setdefault(r,'') > ark> foo += "test %d\n" % t > > Luke> after running this code, > > Luke> log = {0: '', 1: '', 2:'', 3: '' ... 9: ''} > > Luke> and foo equals "test 5". 
> > Then that is what foo would be if you were able to write > > log.setdefault(r,'') += "test %d\n" % t > > as you had wished. hmm... ..mmmm... you're absolutely right!!! > Luke> if, however, you do this: > > Luke> for t in range(5): > Luke> for r in range(10): > Luke> foo = log.setdefault(r,[]) > Luke> foo.append("test %d\n" % t) > > Luke> then empirically i conclude that you DO end up with the > Luke> expected results (but is this true all of the time?) > > I presume that is because you are now dealing with vectors instead > of strings. In that case, you could also have written > > for t in range(5): > for r in range(10): > foo = log.setdefault(r,[]) > foo += ["test %d]n" % t] > > with the same effect. > > Luke> the reason why your example, andrew, does not work, is > Luke> because '' is a string - a basic type to which a pointer is > Luke> NOT returned i presume that the foo += "test %d"... returns a > Luke> DIFFERENT result object such that the string in the dictionary > Luke> is DIFFERENT from the string result of foo being updated. > > Well, yes. But that is what you would have gotten had you been allowed > to write > > log.setdefault(r,"") += > > in the first place. [i oversimplified in the example, leading to all the communication problems. the _actual_ usage i was expecting is based on {}.setdefault(0, []) += [1,2] rather than setdefault(0, '') += 'hh' ] > Luke> can anyone tell me if there are any PARTICULAR circumstances where > > Luke> foo = log.setdefault(r,[]) > Luke> foo.append("test %d\n" % t) > > Luke> will FAIL to work as expected? > > It will fail if your expectations are incorrect or unrealistic. ... that sounds like a philosophical or "undefined" answer rather than the technical one i was seeking ... 
but it is actually quite a _useful_ answer :) to put the question in a different way, or to say again, to put a different, more specific, question: can anyone tell me if there are circumstances under which the second argument from setdefault will SOMETIMES be copied instead of returned and SOMETIMES be returned as-is, such that operations of the type being attempted will SOMETIMES work as expected and SOMETIMES not? > Luke> andrew, sorry it took me so long to respond: i initially > Luke> thought that under all circumstances for all types of foo, > Luke> your example would work. > > But it does! At least in the sense of the original query. where the sense was mistaken; consequently the results are, as you rightly pointed out, not as expected. > The original query was of the form > > Why can't I write an expression like f(x) += y? > > and my answer was, in effect, > > If you could, it would have the same effect as if you had written > > foo = f(x) > foo += y > > and then used the value of foo. > > Perhaps I'm missing something, but I don't think that anything you've said > contradicts this answer. based on this clarification, my queries are two-fold: 1) what is the technical, syntactical or language-specific reason why I can't write an expression like f(x) += y ? 2) the objections that i can see as to why f(x) += y should not be _allowed_ to work are that, as andrew points out, some people's expectations of any_function() += y may be unrealistic, particularly as normally the result of a function is discarded. however, in the case of setdefault() and get() on dictionaries, the result of the function is NOT discarded: in SOME instances, a reference is returned to the dictionary item. under such circumstances, why should the objections - to disallow {}.setdefault(0, []) += [] or {}.get([]) += [] - stand? l.
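[Editor's note: on Luke's specific question — in CPython, setdefault never returns a copy. It always hands back the very object stored in the dictionary, whether it inserted the default just now or found an existing value. The difference in outcomes comes entirely from whether `+=` mutates that object in place (lists) or rebinds the name to a new object (strings). A quick check:]

```python
d = {}

# setdefault returns the object actually stored in the dict, never a copy.
v = d.setdefault('k', [])
assert v is d['k']
assert d.setdefault('k', []) is v     # a second call returns the same object

v += [1]                              # lists support in-place add: mutates d['k']
assert d['k'] == [1]

s = d.setdefault('s', '')
assert s is d['s']
s += 'x'                              # str is immutable: += rebinds s only
assert d['s'] == ''                   # the stored value is unchanged
```

So `foo = log.setdefault(r, []); foo.append(...)` works in all circumstances; it is only the immutable-value pattern that silently goes nowhere.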
From tim@zope.com Fri May 16 15:30:37 2003 From: tim@zope.com (Tim Peters) Date: Fri, 16 May 2003 10:30:37 -0400 Subject: [Python-Dev] test_urllibnet failing on Windows Message-ID: I'm not familiar with this test. In a release build: """ C:\Code\python\PCbuild>python ../lib/test/test_urllibnet.py testURLread (__main__.URLTimeoutTest) ... ok test_bad_address (__main__.urlopenNetworkTests) ... ok test_basic (__main__.urlopenNetworkTests) ... ok test_fileno (__main__.urlopenNetworkTests) ... ERROR test_geturl (__main__.urlopenNetworkTests) ... ok test_info (__main__.urlopenNetworkTests) ... ok test_readlines (__main__.urlopenNetworkTests) ... ok test_basic (__main__.urlretrieveNetworkTests) ... ok test_header (__main__.urlretrieveNetworkTests) ... ok test_specified_path (__main__.urlretrieveNetworkTests) ... ok ====================================================================== ERROR: test_fileno (__main__.urlopenNetworkTests) ---------------------------------------------------------------------- Traceback (most recent call last): File "../lib/test/test_urllibnet.py", line 91, in test_fileno FILE = os.fdopen(fd) OSError: (0, 'Error') ---------------------------------------------------------------------- Ran 10 tests in 7.081s FAILED (errors=1) Traceback (most recent call last): File "../lib/test/test_urllibnet.py", line 149, in ? test_main() File "../lib/test/test_urllibnet.py", line 146, in test_main urlretrieveNetworkTests) File "C:\Code\python\lib\test\test_support.py", line 259, in run_unittest run_suite(suite, testclass) File "C:\Code\python\lib\test\test_support.py", line 247, in run_suite raise TestFailed(err) test.test_support.TestFailed: Traceback (most recent call last): File "../lib/test/test_urllibnet.py", line 91, in test_fileno FILE = os.fdopen(fd) OSError: (0, 'Error') """ In a debug build: """ C:\Code\python\PCbuild>python_d ../lib/test/test_urllibnet.py testURLread (__main__.URLTimeoutTest) ... ok test_bad_address (__main__.urlopenNetworkTests) ... 
ok test_basic (__main__.urlopenNetworkTests) ... ok test_fileno (__main__.urlopenNetworkTests) ... """ and there it dies with an assertion error in the bowels of Microsoft's fdopen.c. That's called by Python's posix_fdopen, here: fp = fdopen(fd, mode); At this point, fd is 436. MS's fdopen is unhappy because only 32 handles actually exist at this point, and 436 is bigger than that. In the release build, the MS assert doesn't (of course) trigger; instead, that 436 >= 32 causes MS's fdopen to return NULL. From skip@pobox.com Fri May 16 15:41:22 2003 From: skip@pobox.com (Skip Montanaro) Date: Fri, 16 May 2003 09:41:22 -0500 Subject: [Python-Dev] test_urllibnet failing on Windows In-Reply-To: References: Message-ID: <16068.63634.915049.706423@montanaro.dyndns.org> Tim> I'm not familiar with this test. Tim> In a release build: ... Brett added a bunch of tests to that file the other day. I imagine he'll take a look when the sun comes up on the west coast. Skip From guido@python.org Fri May 16 16:02:59 2003 From: guido@python.org (Guido van Rossum) Date: Fri, 16 May 2003 11:02:59 -0400 Subject: [Python-Dev] test_urllibnet failing on Windows In-Reply-To: "Your message of Fri, 16 May 2003 10:30:37 EDT." References: Message-ID: <200305161502.h4GF2xk20132@pcp02138704pcs.reston01.va.comcast.net> > I'm not familiar with this test. Me neither, but Brett C should be. :-) > In a debug build: > > """ > C:\Code\python\PCbuild>python_d ../lib/test/test_urllibnet.py > testURLread (__main__.URLTimeoutTest) ... ok > test_bad_address (__main__.urlopenNetworkTests) ... ok > test_basic (__main__.urlopenNetworkTests) ... ok > test_fileno (__main__.urlopenNetworkTests) ... > """ > > and there it dies with an assertion error in the bowels of Microsoft's > fdopen.c. That's called by Python's posix_fdopen, here: > > fp = fdopen(fd, mode); > > At this point, fd is 436. MS's fdopen is unhappy because only 32 > handles actually exist at this point, and 436 is bigger than that. 
> In the release build, the MS assert doesn't (of course) trigger; > instead, that 436 >= 32 causes MS's fdopen to return NULL. The test assumes that the fileno() from a socket object can be passed to os.fdopen(). That works on Unix. But on Windows it cannot: the small ints used to refer to open files are chosen from a different (though potentially overlapping) space than the small ints used to refer to open sockets, and the two cannot be mixed. So the test should be disabled on Windows. I don't know if we can protect os.fdopen() from crashing when passed an out of range number. --Guido van Rossum (home page: http://www.python.org/~guido/) From barry@python.org Fri May 16 16:09:05 2003 From: barry@python.org (Barry Warsaw) Date: 16 May 2003 11:09:05 -0400 Subject: [Python-Dev] Vacation; Python 2.2.3 release. In-Reply-To: <2mn0hnyxpk.fsf@starship.python.net> References: <200305160147.h4G1lCa09066@localhost.localdomain> <2mn0hnyxpk.fsf@starship.python.net> Message-ID: <1053097745.1849.26.camel@barry> On Fri, 2003-05-16 at 06:47, Michael Hudson wrote: > > There's a bunch of cvs commit messages I've saved off as "potential > > branch-patches". I might try to get to them this weekend. > > My python-bugfixes mbox is still online: > > http://starship.python.net/crew/mwh/python-bugfixes > > Some of it might still be relevant -- I haven't been that conscientious > about keeping it up to date. I definitely do not have the time to triage or apply backports. I think you'll have to use your own judgment, tempered by your available time, to decide which patches should be backported. Guido obviously thinks 2.2.3 is ready now so you should prioritize, but be conservative. If you have specific questions, python-dev is the place to ask.
-Barry From tim@zope.com Fri May 16 16:33:24 2003 From: tim@zope.com (Tim Peters) Date: Fri, 16 May 2003 11:33:24 -0400 Subject: [Python-Dev] test_urllibnet failing on Windows In-Reply-To: <200305161502.h4GF2xk20132@pcp02138704pcs.reston01.va.comcast.net> Message-ID: [Guido] > The test assumes that the fileno() from a socket object can be passed > to os.fdopen(). Yup, Jeremy figured that out here. I have a patch waiting to go, but SF isn't cooperating. > That works on Unix. But on Windows it cannot, the small ints used to > refer to open files are chosen from a different (though potentially > overlapping) space than the small ints used to refer to open sockets, > and the two cannot be mixed. Just so. > So the test should be disabled on Windows. > > I don't know if we can protect os.fdopen() from crashing when passed > an out of range number. This is an issue only in the MSVC debug build. The release-build MS libraries *still* explicitly check for out-of-range, and arrange for an error return when it is out of range. I really don't understand why they're asserting in-range in their debug build libraries, because nothing in *their* code assumes the fd is in-range -- their code is defensive enough in the release build that nothing bad will happen even when it is out of range. From barry@python.org Fri May 16 16:44:44 2003 From: barry@python.org (Barry Warsaw) Date: 16 May 2003 11:44:44 -0400 Subject: [Python-Dev] [development doc updates] In-Reply-To: <20030516153518.68B1B18EC13@grendel.zope.com> References: <20030516153518.68B1B18EC13@grendel.zope.com> Message-ID: <1053099884.1849.49.camel@barry> On Fri, 2003-05-16 at 11:35, Fred L. Drake wrote: > The development version of the documentation has been updated: > > http://www.python.org/dev/doc/devel/ > > Various updates, including Jim Fulton's efforts on updating the Extending & > Embedding manual. 
I think this one's gonna make my Python quotes file: "So, if you want to define a new object type, you need to create a new type object." :) -Barry From jim@zope.com Fri May 16 16:46:19 2003 From: jim@zope.com (Jim Fulton) Date: Fri, 16 May 2003 11:46:19 -0400 Subject: [Python-Dev] C new-style classes and GC Message-ID: <3EC507CB.6080502@zope.com> Lately I've been re-learning how to write new types in C. Things changed drastically (for the better) in 2.2. I've been updating the documentation on writing new types as I go: http://www.python.org/dev/doc/devel/ext/defining-new-types.html (I'm also updating modulator.) I'm starting to try to figure out how to integrate support for GC. The current documentation in the section "Supporting the Cycle Collector" doesn't reflect new-style types and is, thus, out of date. Frankly, I'm taking the approach that there is only One Way to create types in C, the new way, based on new-style types as now documented in the manual. I'll also note that most new-style types don't need and thus don't implement custom allocators. They leave the tp_alloc and tp_free slots empty. So given that we have a new style type, to add support for GC, we need to: - Set the Py_TPFLAGS_HAVE_GC type flag, - Provide implementations of tp_traverse and tp_clear, as described in the section "Supporting the Cycle Collector" section of the docs. - Call PyObject_GC_UnTrack at the beginning of the deallocator, before decrefing any members. I think that that is *all* we have to do. In particular, since we have a new style type that inherits the standard allocator, we don't need to fool with PyObject_GC_New, and PyObject_GC_DEL, because the default tp_alloc and tp_free take care of that for us. Similarly, we don't need to call PyObject_GC_Track, because that is done by the default allocator. (Because of that, our traverse function has to check for null object pointers in our object's members.) Did I get this right? 
I intend to update the docs to reflect this understanding (or a corrected one, of course). Jim -- Jim Fulton mailto:jim@zope.com Python Powered! CTO (703) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From mwh@python.net Fri May 16 17:03:10 2003 From: mwh@python.net (Michael Hudson) Date: Fri, 16 May 2003 17:03:10 +0100 Subject: [Python-Dev] [development doc updates] In-Reply-To: <1053099884.1849.49.camel@barry> (Barry Warsaw's message of "16 May 2003 11:44:44 -0400") References: <20030516153518.68B1B18EC13@grendel.zope.com> <1053099884.1849.49.camel@barry> Message-ID: <2m7k8qkhfl.fsf@starship.python.net> Barry Warsaw writes: > On Fri, 2003-05-16 at 11:35, Fred L. Drake wrote: >> The development version of the documentation has been updated: >> >> http://www.python.org/dev/doc/devel/ >> >> Various updates, including Jim Fulton's efforts on updating the Extending & >> Embedding manual. > > I think this one's gonna make my Python quotes file: > > "So, if you want to define a new object type, you need to create a new > type object." > > :) That's been there since rev 1.1, which I actually wrote... Cheers, M. -- For their next act, they'll no doubt be buying a firewall running under NT, which makes about as much sense as building a prison out of meringue. -- -:Tanuki:- -- http://home.xnet.com/~raven/Sysadmin/ASR.Quotes.html From fdrake@acm.org Fri May 16 17:11:58 2003 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Fri, 16 May 2003 12:11:58 -0400 Subject: [Python-Dev] [development doc updates] In-Reply-To: <2m7k8qkhfl.fsf@starship.python.net> References: <20030516153518.68B1B18EC13@grendel.zope.com> <1053099884.1849.49.camel@barry> <2m7k8qkhfl.fsf@starship.python.net> Message-ID: <16069.3534.513541.138020@grendel.zope.com> Michael Hudson writes: > That's been there since rev 1.1, which I actually wrote... That explains why it sounded vaguely familiar. 
;-) I have actually read the version you wrote, and didn't find that sentence in need of changing. Perhaps Barry is too easily amused? ;-) -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From mwh@python.net Fri May 16 17:32:53 2003 From: mwh@python.net (Michael Hudson) Date: Fri, 16 May 2003 17:32:53 +0100 Subject: [Python-Dev] [development doc updates] In-Reply-To: <16069.3534.513541.138020@grendel.zope.com> ("Fred L. Drake, Jr."'s message of "Fri, 16 May 2003 12:11:58 -0400") References: <20030516153518.68B1B18EC13@grendel.zope.com> <1053099884.1849.49.camel@barry> <2m7k8qkhfl.fsf@starship.python.net> <16069.3534.513541.138020@grendel.zope.com> Message-ID: <2m4r3ukg22.fsf@starship.python.net> "Fred L. Drake, Jr." writes: > Michael Hudson writes: > > That's been there since rev 1.1, which I actually wrote... > > That explains why it sounded vaguely familiar. ;-) I have actually > read the version you wrote, and didn't find that sentence in need of > changing. I must have been in a fairly odd mood when I wrote it -- "A PyObject is not a very magnificent object" is one of mine, too. > Perhaps Barry is too easily amused? ;-) I don't think documentation should be disallowed from being entertaining :-) (or short, but that's a different rant) Cheers, M. -- Hey, if I thought I was wrong, I'd change my mind. :) -- Grant Edwards, comp.lang.python From jeremy@zope.com Fri May 16 17:42:03 2003 From: jeremy@zope.com (Jeremy Hylton) Date: 16 May 2003 12:42:03 -0400 Subject: [Python-Dev] C new-style classes and GC In-Reply-To: <3EC507CB.6080502@zope.com> References: <3EC507CB.6080502@zope.com> Message-ID: <1053103323.456.71.camel@slothrop.zope.com> On Fri, 2003-05-16 at 11:46, Jim Fulton wrote: > So given that we have a new style type, to add support for GC, we need > to: > > - Set the Py_TPFLAGS_HAVE_GC type flag, > > - Provide implementations of tp_traverse and tp_clear, as described in > the section "Supporting the Cycle Collector" section of the docs. 
> > - Call PyObject_GC_UnTrack at the beginning of the deallocator, > before decrefing any members. > > I think that that is *all* we have to do. > > In particular, since we have a new style type that inherits the > standard allocator, we don't need to fool with PyObject_GC_New, and > PyObject_GC_DEL, because the default tp_alloc and tp_free take care of > that for us. Similarly, we don't need to call PyObject_GC_Track, > because that is done by the default allocator. (Because of that, our > traverse function has to check for null object pointers in our > object's members.) It depends on how the objects are used in C code. I've upgraded a lot of C extensions to make their types collectable recently. In several cases, it was necessary to change PyObject_New to PyObject_GC_New and add a PyObject_GC_Track. I think the docs ought to explain how to do this. It's not clear to me what the one right way to implement a tp_dealloc slot is. I've seen two common patterns in the Python source: call obj->ob_type->tp_free or call PyObject_GC_Del. The type object initializes tp_free to PyObject_GC_Del, so in most cases the two spellings are equivalent. Calling PyObject_GC_Del feels more straightforward to me. This question isn't specific to GC. Perhaps it's a question of what tp_free is used for and when it should be called. Pure-Python classes and instances have tp_dealloc implementations that call tp_free. I'm not sure if that's a generic recommendation for all types written in C. > Did I get this right? I intend to update the docs to reflect this > understanding (or a corrected one, of course). The three items you listed were sufficient for all the types I've worked on, excepting the issues I noted above.
Jeremy From jim@zope.com Fri May 16 18:08:34 2003 From: jim@zope.com (Jim Fulton) Date: Fri, 16 May 2003 13:08:34 -0400 Subject: [Python-Dev] C new-style classes and GC In-Reply-To: <1053103323.456.71.camel@slothrop.zope.com> References: <3EC507CB.6080502@zope.com> <1053103323.456.71.camel@slothrop.zope.com> Message-ID: <3EC51B12.8070407@zope.com> Jeremy Hylton wrote: > On Fri, 2003-05-16 at 11:46, Jim Fulton wrote: > >>So given that we have a new style type, to add support for GC, we need >>to: >> >>- Set the Py_TPFLAGS_HAVE_GC type flag, >> >>- Provide implementations of tp_traverse and tp_clear, as described in >> the section "Supporting the Cycle Collector" section of the docs. >> >>- Call PyObject_GC_UnTrack at the beginning of the deallocator, >> before decrefing any members. >> >>I think that that is *all* we have to do. >> >>In particular, since we have a new style type that inherits the >>standard allocator, we don't need to fool with PyObject_GC_New, and >>PyObject_GC_DEL, because the default tp_alloc and tp_free take care of >>that for us. Similarly, we don't need to call PyObject_GC_Track, >>because that is done by the default allocator. (Because of that, our >>traverse function has to check for null object pointers in our >>object's members.) > > > It depends on how the objects are used in C code. I've upgraded a lot > of C extensions to make their types collectable recently. In several > cases, it was necessary to change PyObject_New to PyObject_GC_New and > add a PyObject_GC_Track. I think the docs ought to explain how to do > this. If you write types the New Way, there are no PyObject_New calls and no need to call PyObject_GC_Track. > It's not clear to me what the one right way to implement a tp_dealloc > slot is. I've seen two common patterns in the Python source: call > obj->ob_type->tp_free or call PyObject_GC_Del. The type object > initializes tp_free to PyObject_GC_Del, so in most cases the two > spellings are equivalent. 
Calling PyObject_GC_Del feels more > straightforward to me. You need to call obj->ob_type->tp_free to support subclassing. I suggest that every new type should call obj->ob_type->tp_free as a matter of course. Jim -- Jim Fulton mailto:jim@zope.com Python Powered! CTO (703) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From mal@lemburg.com Fri May 16 18:09:48 2003 From: mal@lemburg.com (M.-A. Lemburg) Date: Fri, 16 May 2003 19:09:48 +0200 Subject: [Python-Dev] Startup time In-Reply-To: <1052927757.7258.38.camel@slothrop.zope.com> References: <1052927757.7258.38.camel@slothrop.zope.com> Message-ID: <3EC51B5C.2080307@lemburg.com> Jeremy Hylton wrote: > The use of re in the warnings module seems the primary culprit, since it > pulls in re, sre and friends, string, and strop. FWIW, I've removed the re usage from encodings/__init__.py. Could you check whether this makes a difference in startup time now ? Thanks, -- Marc-Andre Lemburg eGenix.com Professional Python Software directly from the Source (#1, May 16 2003) >>> Python/Zope Products & Consulting ... http://www.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ EuroPython 2003, Charleroi, Belgium: 39 days left From barry@python.org Fri May 16 18:21:04 2003 From: barry@python.org (Barry Warsaw) Date: 16 May 2003 13:21:04 -0400 Subject: [Python-Dev] [development doc updates] In-Reply-To: <2m4r3ukg22.fsf@starship.python.net> References: <20030516153518.68B1B18EC13@grendel.zope.com> <1053099884.1849.49.camel@barry> <2m7k8qkhfl.fsf@starship.python.net> <16069.3534.513541.138020@grendel.zope.com> <2m4r3ukg22.fsf@starship.python.net> Message-ID: <1053105659.2342.2.camel@barry> On Fri, 2003-05-16 at 12:32, Michael Hudson wrote: > I must have been in a fairly odd mood when I wrote it -- "A PyObject > is not a very magnificent object" is one of mine, too. 
That's the other one I enjoyed! > > Perhaps Barry is too easily amused? ;-) That may be true, but it /did/ have a nice lyrical quality to it. I'm not saying it should be changed! In fact we need more documentation like that. -Barry From mal@lemburg.com Fri May 16 18:25:04 2003 From: mal@lemburg.com (M.-A. Lemburg) Date: Fri, 16 May 2003 19:25:04 +0200 Subject: [Python-Dev] test_time fails with current CVS Message-ID: <3EC51EF0.3080701@lemburg.com> Just thought you'd like to know: test test_time failed -- Traceback (most recent call last): File "/home/lemburg/projects/Python/Dev-Python/Lib/test/test_time.py", line 107, in test_tzset self.failUnless(time.tzname[1] == 'AEDT', str(time.tzname[1])) File "/home/lemburg/projects/Python/Dev-Python/Lib/unittest.py", line 268, in failUnless if not expr: raise self.failureException, msg AssertionError: AEST In case it helps: I live on the northern hemisphere :-) BTW, the correct time zone names are: EAST and EADT -- perhaps that's what's causing the problem ? -- Marc-Andre Lemburg eGenix.com Professional Python Software directly from the Source (#1, May 16 2003) >>> Python/Zope Products & Consulting ... http://www.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ EuroPython 2003, Charleroi, Belgium: 39 days left From jeremy@zope.com Fri May 16 18:35:33 2003 From: jeremy@zope.com (Jeremy Hylton) Date: 16 May 2003 13:35:33 -0400 Subject: [Python-Dev] C new-style classes and GC In-Reply-To: <3EC51B12.8070407@zope.com> References: <3EC507CB.6080502@zope.com> <1053103323.456.71.camel@slothrop.zope.com> <3EC51B12.8070407@zope.com> Message-ID: <1053106533.453.78.camel@slothrop.zope.com> On Fri, 2003-05-16 at 13:08, Jim Fulton wrote: > If you write types the New Way, there are no PyObject_New calls and > no need to call PyObject_GC_Track. I don't follow.
There are plenty of types that are garbage collectable that also use PyObject_GC_New. One example is PyDict_New(). If something is widespread in the Python source tree (a common source of example code for programmers), it ought to be documented. > > It's not clear to me what the one right way to implement a tp_dealloc > > slot is. I've seen two common patterns in the Python source: call > > obj->ob_type->tp_free or call PyObject_GC_Del. The type object > > initializes tp_free to PyObject_GC_Del, so in most cases the two > > spellings are equivalent. Calling PyObject_GC_Del feels more > > straightforward to me. > > You need to call obj->ob_type->tp_free to support subclassing. > > I suggest that every new type should call obj->ob_type->tp_free > as a matter of course. If the type is going to support subclassing. Jeremy From guido@python.org Fri May 16 18:37:21 2003 From: guido@python.org (Guido van Rossum) Date: Fri, 16 May 2003 13:37:21 -0400 Subject: [Python-Dev] C new-style classes and GC In-Reply-To: "Your message of 16 May 2003 12:42:03 EDT." <1053103323.456.71.camel@slothrop.zope.com> References: <3EC507CB.6080502@zope.com> <1053103323.456.71.camel@slothrop.zope.com> Message-ID: <200305161737.h4GHbLl06562@pcp02138704pcs.reston01.va.comcast.net> > It's not clear to me what the one right way to implement a tp_dealloc > slot is. I've seen two common patterns in the Python source: call > obj->ob_type->tp_free or call PyObject_GC_Del. The type object > initializes tp_free to PyObject_GC_Del, so in most cases the two > spellings are equivalent. Calling PyObject_GC_Del feels more > straightforward to me. But calling tp_free is more correct. This allows a subclass to change the memory allocation policy. (This is also important if a base class is not collectible but a subclass is -- then it's essential that the base class dealloc handler calls tp_free.) 
--Guido van Rossum (home page: http://www.python.org/~guido/) From jeremy@zope.com Fri May 16 19:12:33 2003 From: jeremy@zope.com (Jeremy Hylton) Date: 16 May 2003 14:12:33 -0400 Subject: [Python-Dev] C new-style classes and GC In-Reply-To: <200305161737.h4GHbLl06562@pcp02138704pcs.reston01.va.comcast.net> References: <3EC507CB.6080502@zope.com> <1053103323.456.71.camel@slothrop.zope.com> <200305161737.h4GHbLl06562@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <1053108753.457.100.camel@slothrop.zope.com> On Fri, 2003-05-16 at 13:37, Guido van Rossum wrote: > > It's not clear to me what the one right way to implement a tp_dealloc > > slot is. I've seen two common patterns in the Python source: call > > obj->ob_type->tp_free or call PyObject_GC_Del. The type object > > initializes tp_free to PyObject_GC_Del, so in most cases the two > > spellings are equivalent. Calling PyObject_GC_Del feels more > > straightforward to me. > > But calling tp_free is more correct. This allows a subclass to change > the memory allocation policy. (This is also important if a base class > is not collectible but a subclass is -- then it's essential that the > base class dealloc handler calls tp_free.) There are dozens of objects in Python that do not call tp_free. For example, range objects have a tp_dealloc that is set to PyObject_Del(). Should we change those? Or should we say that it's okay to call PyObject_Del() and PyObject_GC_Del() from objects that are not intended to be subclassed? (patch pending :-) Jeremy From tim@zope.com Fri May 16 19:12:12 2003 From: tim@zope.com (Tim Peters) Date: Fri, 16 May 2003 14:12:12 -0400 Subject: [Python-Dev] C new-style classes and GC In-Reply-To: <200305161737.h4GHbLl06562@pcp02138704pcs.reston01.va.comcast.net> Message-ID: [Guido] > But calling tp_free is more correct. This allows a subclass to change > the memory allocation policy.
(This is also important if a base class > is not collectible but a subclass is -- then it's essential that the > base class dealloc handler calls tp_free.) I think the scoop is that cyclic gc got added before new-style classes, so at the time new-style classes were introduced *all* tp_dealloc slots for gc'able types called the gc del function directly. After that, I expect the only ones that got changed were those reviewed (lists, tuples, dicts, ...) as part of making test_descr.py's subclass-from-builtin tests work. Jeremy is rummaging thru current CVS now changing the others (frames, functions, ...). Does this count as a bugfix, i.e. should it be backported to 2.2.3? From guido@python.org Fri May 16 19:23:51 2003 From: guido@python.org (Guido van Rossum) Date: Fri, 16 May 2003 14:23:51 -0400 Subject: [Python-Dev] C new-style classes and GC In-Reply-To: "Your message of 16 May 2003 14:12:33 EDT." <1053108753.457.100.camel@slothrop.zope.com> References: <3EC507CB.6080502@zope.com> <1053103323.456.71.camel@slothrop.zope.com> <200305161737.h4GHbLl06562@pcp02138704pcs.reston01.va.comcast.net> <1053108753.457.100.camel@slothrop.zope.com> Message-ID: <200305161823.h4GINpR06675@pcp02138704pcs.reston01.va.comcast.net> > There are dozens of objects in Python that do not call tp_free. For > example, range object's have a tp_dealloc that is set to PyObject_Del(). > Should we change those? Or should we say that it's okay to call > PyObject_Del() and PyObject_GC_Del() from objects that are not intended > to be subclassed? If those types don't have the Py_TPFLAGS_BASETYPE flag set, they're okay. Otherwise they should be fixed. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Fri May 16 19:24:24 2003 From: guido@python.org (Guido van Rossum) Date: Fri, 16 May 2003 14:24:24 -0400 Subject: [Python-Dev] C new-style classes and GC In-Reply-To: "Your message of Fri, 16 May 2003 14:12:12 EDT." 
References: Message-ID: <200305161824.h4GIOOf06688@pcp02138704pcs.reston01.va.comcast.net> > Jeremy is rummaging thru current CVS now changing the others > (frames, functions, ...). Does this count as a bugfix, i.e. should > it be backported to 2.2.3? For the ones that are subclassable in 2.2.3, yes. --Guido van Rossum (home page: http://www.python.org/~guido/) From fdrake@acm.org Fri May 16 19:36:44 2003 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Fri, 16 May 2003 14:36:44 -0400 Subject: [Python-Dev] a strange case Message-ID: <16069.12220.217558.569689@grendel.zope.com> Here's a strange case we just ran across, led along by a typo in an import statement. This is using the head of the 2.2.x maintenance branch; I've not tested this against the trunk yet. >>> import os >>> class Foo(os): ... pass ... >>> Foo I suspect this isn't intentional behavior. ;-) -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From barry@python.org Fri May 16 19:44:39 2003 From: barry@python.org (Barry Warsaw) Date: 16 May 2003 14:44:39 -0400 Subject: [Python-Dev] a strange case In-Reply-To: <16069.12220.217558.569689@grendel.zope.com> References: <16069.12220.217558.569689@grendel.zope.com> Message-ID: <1053110679.2342.4.camel@barry> On Fri, 2003-05-16 at 14:36, Fred L. Drake, Jr. wrote: > Here's a strange case we just ran across, led along by a typo in an > import statement. This is using the head of the 2.2.x maintenance > branch; I've not tested this against the trunk yet. > > >>> import os > >>> class Foo(os): > ... pass > ... > >>> Foo > > > I suspect this isn't intentional behavior. ;-) No, it's not, and in 2.3 you get an error (albeit a TypeError with a rather unhelpful message). I guess the "fix" hasn't been backported. 
-Barry From jeremy@zope.com Fri May 16 19:50:00 2003 From: jeremy@zope.com (Jeremy Hylton) Date: 16 May 2003 14:50:00 -0400 Subject: [Python-Dev] a strange case In-Reply-To: <1053110679.2342.4.camel@barry> References: <16069.12220.217558.569689@grendel.zope.com> <1053110679.2342.4.camel@barry> Message-ID: <1053111000.456.111.camel@slothrop.zope.com> On Fri, 2003-05-16 at 14:44, Barry Warsaw wrote: > No, it's not, and in 2.3 you get an error (albeit a TypeError with a > rather unhelpful message). I guess the "fix" hasn't been backported. I think we decided this wasn't a pure bugfix :-). Some poor soul may have code that relies on being able to subclass a module. Jeremy From fred@zope.com Fri May 16 19:49:38 2003 From: fred@zope.com (Fred L. Drake, Jr.) Date: Fri, 16 May 2003 14:49:38 -0400 Subject: [Python-Dev] a strange case In-Reply-To: <16069.12220.217558.569689@grendel.zope.com> References: <16069.12220.217558.569689@grendel.zope.com> Message-ID: <16069.12994.728753.504190@grendel.zope.com> I wrote: > Here's a strange case we just ran across, led along by a typo in an > import statement. This is using the head of the 2.2.x maintenance > branch; I've not tested this against the trunk yet. Ok, the trunk does a little better, but the error message is a little confusing: Python 2.3b1+ (#2, May 16 2003, 14:42:51) [GCC 2.96 20000731 (Red Hat Linux 7.3 2.96-113)] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> import os >>> class Foo(os): ... pass ... Traceback (most recent call last): File "<stdin>", line 1, in ? TypeError: function takes at most 2 arguments (3 given) >>> class Foo(os, sys): ... pass ... Traceback (most recent call last): File "<stdin>", line 1, in ? TypeError: function takes at most 2 arguments (3 given) -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From fdrake@acm.org Fri May 16 19:56:47 2003 From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Fri, 16 May 2003 14:56:47 -0400 Subject: [Python-Dev] a strange case In-Reply-To: <1053111000.456.111.camel@slothrop.zope.com> References: <16069.12220.217558.569689@grendel.zope.com> <1053110679.2342.4.camel@barry> <1053111000.456.111.camel@slothrop.zope.com> Message-ID: <16069.13423.156806.769779@grendel.zope.com> Jeremy Hylton writes: > I think we decided this wasn't a pure bugfix :-). Some poor soul may > have code that relies on being able to subclass a module. I just played with one of these things; they're as vacuous as modules can possibly be! If anyone depends on this, they're insane. -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From skip@pobox.com Fri May 16 20:00:24 2003 From: skip@pobox.com (Skip Montanaro) Date: Fri, 16 May 2003 14:00:24 -0500 Subject: [Python-Dev] Startup time In-Reply-To: <3EC51B5C.2080307@lemburg.com> References: <1052927757.7258.38.camel@slothrop.zope.com> <3EC51B5C.2080307@lemburg.com> Message-ID: <16069.13640.892428.185711@montanaro.dyndns.org> mal> FWIW, I've removed the re usage from encodings/__init__.py. mal> Could you check whether this makes a difference in startup time mal> now? Well... Not really, but it's not your fault. site.py imports distutils.util which imports re. It does a fair amount of regex compiling, some at the module level, so deferring "import re" may take a couple minutes of work. Hang on... Okay, now re isn't imported. The only runtime difference between the two sets of times below is encodings/__init__.py 1.18 vs 1.19. Each set of times is for this command: % time ./python.exe -c pass Everything was already byte compiled. The times reported were the best of five. I tried to quiet the system as much as possible. Still, since the amount of work being done is minimal, it's tough to get a good feel for any differences.
version 1.18 (w/ re) real 0m0.143s user 0m0.030s sys 0m0.060s version 1.19 (no re) real 0m0.142s user 0m0.040s sys 0m0.060s Note the rather conspicuous lack of any difference. The only modifications to my Lib tree are these: M cgitb.py M warnings.py M distutils/util.py M encodings/__init__.py M test/test_bsddb185.py I verified that site was imported from my Lib tree: % ./python.exe -v -c pass 2>&1 | egrep 'site' # /Users/skip/src/python/head/dist/src/build.opt/../Lib/site.pyc matches /Users/skip/src/python/head/dist/src/build.opt/../Lib/site.py import site # precompiled from /Users/skip/src/python/head/dist/src/build.opt/../Lib/site.pyc # cleanup[1] site It would appear that the encodings stuff isn't getting imported on my platform (Mac OS X): % ./python.exe -v -c pass 2>&1 | egrep 'encoding' % Looking at site.py shows that the encodings package is only imported on win32 and only if the codecs.lookup() call fails. Oh well, we don't care about minority platforms. ;-) More seriously, to test your specific change someone will have to run the check on Windows. To contribute something maybe positive, here's the same timing comparison using my changed version of distutils.util vs CVS: CVS: real 0m0.155s user 0m0.050s sys 0m0.070s Changed (no module-level re import): real 0m0.143s user 0m0.070s sys 0m0.040s It appears eliminating "import re" has only a very small effect for me. It looks like an extra 6 modules are imported (25 vs 19). Skip From barry@python.org Fri May 16 20:09:31 2003 From: barry@python.org (Barry Warsaw) Date: 16 May 2003 15:09:31 -0400 Subject: [Python-Dev] Startup time In-Reply-To: <16069.13640.892428.185711@montanaro.dyndns.org> References: <1052927757.7258.38.camel@slothrop.zope.com> <3EC51B5C.2080307@lemburg.com> <16069.13640.892428.185711@montanaro.dyndns.org> Message-ID: <1053112171.2342.7.camel@barry> On Fri, 2003-05-16 at 15:00, Skip Montanaro wrote: > Well... Not really, but it's not your fault. Skip, you're going about this all wrong.
We already have the technology to start Python up blazingly fast. All you have to do is port XEmacs's unexec code. Then you load up Python with all the modules you think you're going to need, unexec it, then the next time it starts up like lightning. Disk space is cheap! -Barry From jeremy@zope.com Fri May 16 20:07:53 2003 From: jeremy@zope.com (Jeremy Hylton) Date: 16 May 2003 15:07:53 -0400 Subject: [Python-Dev] Startup time In-Reply-To: <16069.13640.892428.185711@montanaro.dyndns.org> References: <1052927757.7258.38.camel@slothrop.zope.com> <3EC51B5C.2080307@lemburg.com> <16069.13640.892428.185711@montanaro.dyndns.org> Message-ID: <1053112072.451.114.camel@slothrop.zope.com> On Fri, 2003-05-16 at 15:00, Skip Montanaro wrote: > mal> FWIW, I've removed the re usage from encodings/__init__.py. > > mal> Could you check whether this makes a difference in startup time > mal> now? > > Well... Not really, but it's not your fault. site.py imports > distutils.util which imports re. It does a fair amount of regex compiling, > some at the module level, so deferring "import re" may take a couple minutes > of work. Hang on... I don't think you need to do anything to distutils. In the case we care about (an installed Python) distutils.util isn't imported. Check this code in site.py: # Append ./build/lib. in case we're running in the build dir # (especially for Guido :-) # XXX This should not be part of site.py, since it is needed even when # using the -S option for Python. See http://www.python.org/sf/586680 if (os.name == "posix" and sys.path and os.path.basename(sys.path[-1]) == "Modules"): from distutils.util import get_platform s = "build/lib.%s-%.3s" % (get_platform(), sys.version) s = os.path.join(os.path.dirname(sys.path[-1]), s) sys.path.append(s) del get_platform, s Jeremy From amk@amk.ca Fri May 16 20:09:51 2003 From: amk@amk.ca (A.M.
Kuchling) Date: Fri, 16 May 2003 15:09:51 -0400 Subject: [Python-Dev] Re: Startup time In-Reply-To: <16069.13640.892428.185711@montanaro.dyndns.org> References: <1052927757.7258.38.camel@slothrop.zope.com> <3EC51B5C.2080307@lemburg.com> <16069.13640.892428.185711@montanaro.dyndns.org> Message-ID: Skip Montanaro wrote: > Well... Not really, but it's not your fault. site.py imports > distutils.util which imports re. Note that this doesn't apply to an installed Python; that import is only done when running the interpreter from the build directory. --amk From skip@pobox.com Fri May 16 20:16:09 2003 From: skip@pobox.com (Skip Montanaro) Date: Fri, 16 May 2003 14:16:09 -0500 Subject: [Python-Dev] a strange case In-Reply-To: <1053111000.456.111.camel@slothrop.zope.com> References: <16069.12220.217558.569689@grendel.zope.com> <1053110679.2342.4.camel@barry> <1053111000.456.111.camel@slothrop.zope.com> Message-ID: <16069.14585.371615.56117@montanaro.dyndns.org> Jeremy> On Fri, 2003-05-16 at 14:44, Barry Warsaw wrote: >> No, it's not, and in 2.3 you get an error (albeit a TypeError with a >> rather unhelpful message). I guess the "fix" hasn't been backported. Jeremy> I think we decided this wasn't a pure bugfix :-). Some poor Jeremy> soul may have code that relies on being able to subclass a Jeremy> module. How about at least deprecating that feature in 2.2.3 and warning about it so that poor soul knows this won't be supported forever? Skip From skip@pobox.com Fri May 16 20:19:03 2003 From: skip@pobox.com (Skip Montanaro) Date: Fri, 16 May 2003 14:19:03 -0500 Subject: [Python-Dev] Startup time In-Reply-To: <1053112072.451.114.camel@slothrop.zope.com> References: <1052927757.7258.38.camel@slothrop.zope.com> <3EC51B5C.2080307@lemburg.com> <16069.13640.892428.185711@montanaro.dyndns.org> <1053112072.451.114.camel@slothrop.zope.com> Message-ID: <16069.14759.862301.686434@montanaro.dyndns.org> Jeremy> I don't think you need to do anything to distutils. 
In the case Jeremy> we care about (an installed Python) distutils.util isn't Jeremy> imported. Check this code in site.py: Ah, thanks. That's the code I saw, but I didn't consider the preface comment. Skip From tim@zope.com Fri May 16 20:29:54 2003 From: tim@zope.com (Tim Peters) Date: Fri, 16 May 2003 15:29:54 -0400 Subject: [Python-Dev] C new-style classes and GC In-Reply-To: <3EC507CB.6080502@zope.com> Message-ID: [Jim Fulton] > ... > I'll also note that most new-style types don't need and thus don't > implement custom allocators. They leave the tp_alloc and tp_free slots > empty. I'm worried about half of that: tp_free is needed to release memory no matter whether obtained in a standard or custom way. I don't think tp_free slots always get filled in to something non-NULL by magic, and in the current Python source almost all new-style C types explicitly define a tp_free function (the exceptions are "strange" in some way). PEP 253 may be partly out of date here -- or not. In the section on creating a subclassable type, it says: """ The base type must do the following: - Add the flag value Py_TPFLAGS_BASETYPE to tp_flags. - Declare and use tp_new(), tp_alloc() and optional tp_init() slots. - Declare and use tp_dealloc() and tp_free(). - Export its object structure declaration. - Export a subtyping-aware type-checking macro. """ This doesn't leave a choice about defining tp_alloc() or tp_free() -- it says both are required. For a subclassable type, I believe both must actually be implemented too. For a non-subclassable type, I expect they're optional. But if you don't define tp_free in that case, then I believe you must also not do the obj->ob_type->tp_free(obj) business in the tp_dealloc slot (else it will segfault).
From jim@ZOPE.COM Fri May 16 20:33:29 2003 From: jim@ZOPE.COM (Jim Fulton) Date: Fri, 16 May 2003 15:33:29 -0400 Subject: [Python-Dev] Re: C new-style classes and GC In-Reply-To: <1053106533.453.78.camel@slothrop.zope.com> References: <3EC507CB.6080502@zope.com> <1053103323.456.71.camel@slothrop.zope.com> <3EC51B12.8070407@zope.com> <1053106533.453.78.camel@slothrop.zope.com> Message-ID: <3EC53D09.3050505@zope.com> Jeremy Hylton wrote: > On Fri, 2003-05-16 at 13:08, Jim Fulton wrote: > >>If you write types the New Way, there are no PyObject_New calls and >>no need to call PyObject_GC_Track. > > > I don't follow. There are plenty of types that are garbage collectable > that also use PyObject_GC_New. One example is PyDict_New(). If > something is widespread in the Python source tree (a common source of > example code for programmers), it ought to be documented. It is documented in the API reference. Perhaps the API reference should explain that there's a preferred way to do things. There should be one preferred way to write types. It just happens that that way is a *new* way and most existing types don't follow that way. In the how-to style manual, we should only document the one preferred way to write new types. We shouldn't describe all of the various obsolete variations. It's unfortunate that there aren't many examples of how to do things the new way, although that's understandable, since the new way wasn't documented until recently. Jim -- Jim Fulton mailto:jim@zope.com Python Powered!
CTO (703) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From skip@pobox.com Fri May 16 20:32:40 2003 From: skip@pobox.com (Skip Montanaro) Date: Fri, 16 May 2003 14:32:40 -0500 Subject: [Python-Dev] Startup time In-Reply-To: <1053112171.2342.7.camel@barry> References: <1052927757.7258.38.camel@slothrop.zope.com> <3EC51B5C.2080307@lemburg.com> <16069.13640.892428.185711@montanaro.dyndns.org> <1053112171.2342.7.camel@barry> Message-ID: <16069.15576.807563.525662@montanaro.dyndns.org> Barry> We already have the technology to start Python up blazingly fast. Barry> All you have to do is port XEmacs's unexec code. So Barry, how far along are you on this? We all know you're the XEmacs whiz of the Python crowd. ;-) DEFUN-ly, yr's, Skip From barry@python.org Fri May 16 20:40:55 2003 From: barry@python.org (Barry Warsaw) Date: 16 May 2003 15:40:55 -0400 Subject: [Python-Dev] Startup time In-Reply-To: <16069.15576.807563.525662@montanaro.dyndns.org> References: <1052927757.7258.38.camel@slothrop.zope.com> <3EC51B5C.2080307@lemburg.com> <16069.13640.892428.185711@montanaro.dyndns.org> <1053112171.2342.7.camel@barry> <16069.15576.807563.525662@montanaro.dyndns.org> Message-ID: <1053114055.2342.10.camel@barry> On Fri, 2003-05-16 at 15:32, Skip Montanaro wrote: > Barry> We already have the technology to start Python up blazingly fast. > Barry> All you have to do is port XEmacs's unexec code. > > So Barry, how far along are you on this? We all know you're the XEmacs whiz > of the Python crowd. ;-) Well, it's actually working pretty well and I'm about to cvs com.... ...oh! The cat's just eaten it. Sorry. 
bad-kitty-ly y'rs, -Barry From jeremy@zope.com Fri May 16 20:42:40 2003 From: jeremy@zope.com (Jeremy Hylton) Date: 16 May 2003 15:42:40 -0400 Subject: [Python-Dev] Re: C new-style classes and GC In-Reply-To: <3EC53D09.3050505@zope.com> References: <3EC507CB.6080502@zope.com> <1053103323.456.71.camel@slothrop.zope.com> <3EC51B12.8070407@zope.com> <1053106533.453.78.camel@slothrop.zope.com> <3EC53D09.3050505@zope.com> Message-ID: <1053114159.457.117.camel@slothrop.zope.com> I'm willing to believe there is a new and better way, but I don't think I know what it is. How do we change this code, written using the old PyObject_GC_New() to do things the new way? Jeremy PyObject * PyDict_New(void) { register dictobject *mp; if (dummy == NULL) { /* Auto-initialize dummy */ dummy = PyString_FromString(""); if (dummy == NULL) return NULL; #ifdef SHOW_CONVERSION_COUNTS Py_AtExit(show_counts); #endif } mp = PyObject_GC_New(dictobject, &PyDict_Type); if (mp == NULL) return NULL; EMPTY_TO_MINSIZE(mp); mp->ma_lookup = lookdict_string; #ifdef SHOW_CONVERSION_COUNTS ++created; #endif _PyObject_GC_TRACK(mp); return (PyObject *)mp; } From jim@zope.com Fri May 16 21:22:53 2003 From: jim@zope.com (Jim Fulton) Date: Fri, 16 May 2003 16:22:53 -0400 Subject: [Python-Dev] Re: C new-style classes and GC In-Reply-To: References: <3EC507CB.6080502@zope.com> Message-ID: <3EC5489D.9070208@zope.com> Tim Peters wrote: > [Jim Fulton] > >>... >>I'll also note that most new-style types don't need and thus don't >>implement custom allocators. They leave the tp_alloc and tp_free slots >>empty. > > > I'm worried about half of that: tp_free is needed to release memory no > matter whether obtained in a standard or custom way. I don't think tp_free > slots always get filled in to something non-NULL by magic, and in the > current Python source almost all new-style C types explicitly define a > tp_free function (the exceptions are "strange" in some way). > > PEP 253 may be partly out of date here -- or not. 
In the section on > creating a subclassable type, it says: > > """ > The base type must do the following: > > - Add the flag value Py_TPFLAGS_BASETYPE to tp_flags. > > - Declare and use tp_new(), tp_alloc() and optional tp_init() > slots. > > - Declare and use tp_dealloc() and tp_free(). > > - Export its object structure declaration. > > - Export a subtyping-aware type-checking macro. > """ > > This doesn't leave a choice about defining tp_alloc() or tp_free() -- it > says both are required. For a subclassable type, I believe both must > actually be implemented too. > > For a non-subclassable type, I expect they're optional. But if you don't > define tp_free in that case, then I believe you must also not do the > > obj->ob_type->tp_free(obj) > > business in the tp_dealloc slot (else it will segfault). Hm, I didn't read the PEP, I just went by what Guido told me. :) I was told that PyType_Ready fills in tp_alloc and tp_free with default values. I updated the noddy example in the docs. In this example, I filled in neither tp_alloc or tp_free. I tested the examples and verified that they work. I just added printf calls to verify that these slots are indeed null before the call to PyType_Ready and non-null afterwards. Jim -- Jim Fulton mailto:jim@zope.com Python Powered! CTO (703) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From jim@zope.com Fri May 16 21:30:47 2003 From: jim@zope.com (Jim Fulton) Date: Fri, 16 May 2003 16:30:47 -0400 Subject: [Python-Dev] Re: C new-style classes and GC In-Reply-To: <1053114159.457.117.camel@slothrop.zope.com> References: <3EC507CB.6080502@zope.com> <1053103323.456.71.camel@slothrop.zope.com> <3EC51B12.8070407@zope.com> <1053106533.453.78.camel@slothrop.zope.com> <3EC53D09.3050505@zope.com> <1053114159.457.117.camel@slothrop.zope.com> Message-ID: <3EC54A77.7090106@zope.com> Jeremy Hylton wrote: > I'm willing to believe there is a new and better way, but I don't think > I know what it is.
You can read the documentation for it here: http://www.python.org/dev/doc/devel/ext/defining-new-types.html :)

> How do we change this code, written using the old
> PyObject_GC_New() to do things the new way?
>
> Jeremy
>
> PyObject *
> PyDict_New(void)
> {
>     register dictobject *mp;
>     if (dummy == NULL) { /* Auto-initialize dummy */
>         dummy = PyString_FromString("");
>         if (dummy == NULL)
>             return NULL;
> #ifdef SHOW_CONVERSION_COUNTS
>         Py_AtExit(show_counts);
> #endif
>     }
>     mp = PyObject_GC_New(dictobject, &PyDict_Type);
>     if (mp == NULL)
>         return NULL;
>     EMPTY_TO_MINSIZE(mp);
>     mp->ma_lookup = lookdict_string;
> #ifdef SHOW_CONVERSION_COUNTS
>     ++created;
> #endif
>     _PyObject_GC_TRACK(mp);
>     return (PyObject *)mp;
> }

See dict_new in the same file. The new way to create instances of types is to call the types. I don't know why PyDict_New doesn't just call the dict type. Maybe doing things in-line like this is just an optimization. Jim -- Jim Fulton mailto:jim@zope.com Python Powered! CTO (703) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From jim@zope.com Fri May 16 22:08:27 2003 From: jim@zope.com (Jim Fulton) Date: Fri, 16 May 2003 17:08:27 -0400 Subject: [Python-Dev] Re: C new-style classes and GC In-Reply-To: <3EC507CB.6080502@zope.com> References: <3EC507CB.6080502@zope.com> Message-ID: <3EC5534B.5000102@zope.com> Jim Fulton wrote: > Lately I've been re-learning how to write new types in C. Things > changed drastically (for the better) in 2.2. I've been updating the > documentation on writing new types as I go: > > http://www.python.org/dev/doc/devel/ext/defining-new-types.html > > (I'm also updating modulator.) > > I'm starting to try to figure out how to integrate support for GC. > The current documentation in the section "Supporting the Cycle > Collector" doesn't reflect new-style types and is, thus, out of date.
> > Frankly, I'm taking the approach that there is only One Way to create > types in C, the new way, based on new-style types as now documented > in the manual. > > I'll also note that most new-style types don't need and thus don't > implement custom allocators. They leave the tp_alloc and tp_free slots > empty. > > So given that we have a new style type, to add support for GC, we need > to: > > - Set the Py_TPFLAGS_HAVE_GC type flag, > > - Provide implementations of tp_traverse and tp_clear, as described in > the section "Supporting the Cycle Collector" section of the docs. > > - Call PyObject_GC_UnTrack at the beginning of the deallocator, > before decrefing any members. > > I think that that is *all* we have to do. It looks like the answer is "no". :) I tried to write a type using this formula and segfaulted. Looking at other types, I found that if I want to support GC and am using the default allocator, which I get for free, I have to fill the tp_free slot with PyObject_GC_Del (_PyObject_GC_Del if I want to support Python 2.2 and 2.3). I *think* this is all I have to do. Jim -- Jim Fulton mailto:jim@zope.com Python Powered! CTO (703) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From tim@zope.com Fri May 16 22:37:07 2003 From: tim@zope.com (Tim Peters) Date: Fri, 16 May 2003 17:37:07 -0400 Subject: [Python-Dev] Re: C new-style classes and GC In-Reply-To: <3EC5489D.9070208@zope.com> Message-ID: [Jim Fulton] > Hm, I didn't read the PEP, I just went by what Guido told me. :) That's a good idea -- I think the PEP is out of date here. > I was told that PyType_Ready fills in tp_alloc and tp_free with > default values. And I finally found the code that does that . > I updated the noddy example in the docs. In this example, I filled > in neither tp_alloc or tp_free. I tested the examples and verified that > they work. 
> > I just added printf calls to verify that these slots are indeed > null before the call to PyType_Ready and non-null afterwards. This is the scoop: if your type does *not* define the tp_base or tp_bases slot, then PyType_Ready() sets your type's tp_base slot to &PyBaseObject_Type by magic (this is the C spelling of the type named "object" in Python), and the tp_bases slot to (object,) by magic. A whole pile of type slots are then inherited from whatever tp_bases points to after that (which is the singleton PyBaseObject_Type if you didn't set tp_base or tp_bases yourself). The tp_alloc slot it inherits from object is PyType_GenericAlloc. The tp_free slot it inherits from object is PyObject_Del. This works, but as we both discovered later, it leads to a segfault if your type participates in cyclic gc too: your type *still* inherits a tp_free of PyObject_Del from object then, but that's the wrong deallocation function for gc'able objects. However, the default tp_alloc is aware of gc, and does the right thing either way. Guido, would you be agreeable to making this magic even more magical? It seems to me that we can know whether the current type intends to participate in cyclic gc, and give it a correct default tp_free value instead if so. The hairier type_new() function already has this extra level of Py_TPFLAGS_HAVE_GC-dependent magic for dynamically created types, setting tp_free to PyObject_Del in one case and to PyObject_GC_Del in the other. PyType_Ready() can supply a wrong deallocation function by default ("explicit is better than implicit" has no force when talking about PyType_Ready() ). From tim@zope.com Fri May 16 22:43:51 2003 From: tim@zope.com (Tim Peters) Date: Fri, 16 May 2003 17:43:51 -0400 Subject: [Python-Dev] Re: C new-style classes and GC In-Reply-To: <3EC54A77.7090106@zope.com> Message-ID: [Jim Fulton] > ... > I don't know why PyDict_New doesn't just call the dict type. > Maybe doing things in-line like this is just an optimization.
All aspects of dict objects are indeed micro-optimized. In the office, Jeremy raised some darned good points about things in dictobject.c that don't make good sense anymore, but I'll skip those here because they don't relate to the topic at hand (in brief, "module initialization" for dictobject.c is still hiding inside PyDict_New, but there's no guarantee that *ever* gets called anymore). From troy@gci.net Fri May 16 22:45:25 2003 From: troy@gci.net (Troy Melhase) Date: Fri, 16 May 2003 13:45:25 -0800 Subject: [Python-Dev] a strange case In-Reply-To: <20030516202402.30333.72761.Mailman@mail.python.org> References: <20030516202402.30333.72761.Mailman@mail.python.org> Message-ID: <200305161345.25415.troy@gci.net> > Jeremy> I think we decided this wasn't a pure bugfix :-). Some poor > Jeremy> soul may have code that relies on being able to subclass a > Jeremy> module. > > How about at least deprecating that feature in 2.2.3 and warning about it > so that poor soul knows this won't be supported forever? I think I'm knocking on the poor-house door. Just last night, it occurred to me that modules could be made callable via subclassing. "Why in the world would you want callable modules you ask?" I don't have a real need, but I often see the line blurred between package, module, and class. Witness:

    from Foo import Bar
    frob = Bar()

If Bar is initially a class, then is reimplemented as a module, client code must change to account for that. If Bar is reimplemented as a callable module, clients remain unaffected. I haven't any code that relies on subclassing the module type, but many times I've gone thru the cycle of coding a class then promoting it to a module as it becomes more complex. I'm certainly not advocating that the module type be subclassable or not, but I did want to point out a possible legitimate need to derive from it. Many apologies if I'm wasting space and time.
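A minimal sketch of the same idea from another angle — an instance of a ModuleType subclass installed directly into sys.modules is both importable and callable, so the `frob = Bar()` call sites never change. The module name 'Bar' and its main() entry point here are hypothetical, for illustration only:

```python
import sys
import types

class CallableModule(types.ModuleType):
    """A module subtype whose instances can be called like a class."""
    def __call__(self, *args, **kw):
        # Delegate to a conventional entry point defined on the module.
        return self.main(*args, **kw)

# Build a stand-in module named 'Bar' (hypothetical) and register it.
bar = CallableModule('Bar')
bar.main = lambda x: x * 2
sys.modules['Bar'] = bar

import Bar       # import finds the instance already in sys.modules
print(Bar(21))   # call sites written against the old class keep working
```

Nothing here subclasses an actual module object; only the module *type* is subclassed, which is the distinction Phillip draws later in this thread.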
-troy

Silly example:

troy@marchhare tmp $ cat foo.py
def op():
    print 'foo op'

def frob():
    print 'foo frob'

def __call__(a, b, c):
    print 'module foo called!', a, b, c

troy@marchhare tmp $ cat bar.py
class ModuleObject(type(__builtins__)):
    def __init__(self, amodule):
        self.amodule = amodule
        self.__name__ = amodule.__name__
        self.__file__ = amodule.__file__
    def __getattr__(self, attr):
        return getattr(self.amodule, attr)
    def __call__(self, *a, **b):
        return self.amodule.__call__(*a, **b)

import foo
foo = ModuleObject(foo)
foo(1,2,3)

troy@marchhare tmp $ python2.3 bar.py
module foo called! 1 2 3

From drifty@alum.berkeley.edu Fri May 16 23:13:34 2003 From: drifty@alum.berkeley.edu (Brett C.) Date: Fri, 16 May 2003 15:13:34 -0700 Subject: [Python-Dev] test_urllibnet failing on Windows In-Reply-To: References: Message-ID: <3EC5628E.5060302@ocf.berkeley.edu> Tim Peters wrote: >>... >>The docs for fdopen say nothing about this restriction. Anyone mind if >>I add to the docs a mention of this limitation? > > > AFAICT, you only asked me, so I'll answer : Joys of missing the "reply all" button. I am cc'ing python-dev on this now. > I think this is better > spelled out in the docs for socket.fileno(). What it says now: > > Return the socket's file descriptor (a small integer). This is > useful with select.select(). > > is correct for Unix, but on Windows it does not return a file descriptor (it > returns a Windows socket handle, which is also "a small integer", and is > also useful with select.select() -- although on both Windows and Unix, > select.select() extracts the fileno() from socket objects automatically, so > there's no *need* to invoke fileno() explicitly in order to call select()). > OK. I will fix those docs. -Brett From drifty@alum.berkeley.edu Sat May 17 00:25:02 2003 From: drifty@alum.berkeley.edu (Brett C.)
Date: Fri, 16 May 2003 16:25:02 -0700 Subject: [Python-Dev] test_bsddb185 failing under OS X Message-ID: <3EC5734E.30209@ocf.berkeley.edu>

======================================================================
FAIL: test_anydbm_create (__main__.Bsddb185Tests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "test_bsddb185.py", line 35, in test_anydbm_create
    self.assertNotEqual(ftype, "bsddb185")
  File "/Users/drifty/cvs_code/lib/python2.3/unittest.py", line 300, in failIfEqual
    raise self.failureException, \
AssertionError: 'bsddb185' == 'bsddb185'

DBs are not my area of expertise so I don't know how to go about to attempt to fix this. -Brett
From tismer@tismer.com Sat May 17 00:52:20 2003 From: tismer@tismer.com (Christian Tismer) Date: Sat, 17 May 2003 01:52:20 +0200 Subject: [Python-Dev] Need advice, maybe support Message-ID: <3EC579B4.9000303@tismer.com> Hi Guido, all, In the last months, I made very much progress with Stackless 3.0 . Finally, I was able to make much more of Python stackless (which means, does not use recursive interpreter calls) than I could achieve with 1.0 . There is one drawback with this, and I need advice: Compared to older Python versions, Py 2.2.2 and up uses more indirection through C function pointers than ever. This blocked my implementation of stackless versions, in the first place. Then the idea hit me like a blizzard: Most problems simply vanish if I add another slot to the PyMethodDef structure, which is NULL by default: ml_meth_nr is a function pointer with the same semantics as ml_meth, but it tries to perform its action without doing a recursive call. It tries instead to push a frame and to return Py_UnwindToken. Doing this change made Stackless crystal clear and simple: A C extension not aware of Stackless does what it does all the time: call ml_meth. Stackless aware C code (like my modified ceval.c code) calls the ml_meth_nr slots, instead, which either defaults to the ml_meth code, or has a special version which avoids recursive interpreter calls. I also added a tp_call_nr slot to typeobject, for similar reasons. While this is just great for me, yielding complete source code compatibility, it is a slight drawback, since almost all extension modules make use of the PyMethodDef structure. Therefore, binary compatibility of Stackless has degraded, dramatically.
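The two-slot dispatch described above can be modeled schematically. This sketch is NOT the real C-level PyMethodDef; it only illustrates the fallback rule: a stackless-aware caller prefers the non-recursive slot when one is present, and legacy entries (with the slot left empty) keep working unchanged:

```python
class MethodDefModel:
    """Schematic stand-in for the proposed extended method-table entry."""
    def __init__(self, name, ml_meth, ml_meth_nr=None):
        self.ml_name = name
        self.ml_meth = ml_meth        # always present; may recurse
        self.ml_meth_nr = ml_meth_nr  # None unless the extension is Stackless-aware

def call_preferring_nr(entry, arg):
    # Stackless-aware callers try ml_meth_nr first, falling back to ml_meth,
    # so extensions that never heard of the new slot are unaffected.
    meth = entry.ml_meth_nr if entry.ml_meth_nr is not None else entry.ml_meth
    return meth(arg)

legacy = MethodDefModel('legacy', lambda x: x + 1)               # old extension
aware = MethodDefModel('aware', lambda x: x + 1, lambda x: x + 1)  # new slot filled

print(call_preferring_nr(legacy, 1), call_preferring_nr(aware, 2))
```

The binary-compatibility problem Christian mentions is exactly that the real structure grows a field, so anything compiled against the old layout must be rebuilt.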
I'm now in some kind of dilemma: On the one side, I'm happy with this solution (while I have to admit that it is not too inexpensive, but well, all the new descriptor objects are also not cheap, but just great), on the other hand, simply replacing python22.dll is no longer sufficient. You need to re-compile everything, which might be a hard thing on Windows (win32 extensions, wxPython). Sure, I would stand this, if there is no alternative, I would have to supply a complete replacement package of everything. Do you (does anybody) have an alternative suggestion how to efficiently maintain a "normal" and a "non-recursive" version of a method without changing the PyMethodDef struc? Alternatively, would it be reasonable to ask the Python core developers, if they would accept to augment PyMethodDef and PyTypeObject with an extra field (default NULL, no maintenance), just for me and Stackless? Many thanks for any reply - sincerely -- chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Johannes-Niemeyer-Weg 9a : *Starship* http://starship.python.net/ 14109 Berlin : PGP key -> http://wwwkeys.pgp.net/ work +49 30 89 09 53 34 home +49 30 802 86 56 pager +49 173 24 18 776 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? http://www.stackless.com/ From pje@telecommunity.com Sat May 17 00:48:21 2003 From: pje@telecommunity.com (Phillip J. Eby) Date: Fri, 16 May 2003 19:48:21 -0400 Subject: [Python-Dev] a strange case In-Reply-To: <200305161345.25415.troy@gci.net> References: <20030516202402.30333.72761.Mailman@mail.python.org> <20030516202402.30333.72761.Mailman@mail.python.org> Message-ID: <5.1.1.6.0.20030516194238.02fb2d90@telecommunity.com> At 01:45 PM 5/16/03 -0800, Troy Melhase wrote: > > Jeremy> I think we decided this wasn't a pure bugfix :-). Some poor > > Jeremy> soul may have code that relies on being able to subclass a > > Jeremy> module. 
> > > > How about at least deprecating that feature in 2.2.3 and warning about it > > so that poor soul knows this won't be supported forever? > >I think I'm knocking on the poor-house door. > >Just last night, it occurred to me that modules could be made callable via >subclassing. This isn't about subclassing the module *type*, but about subclassing *modules*. Subclassing a module doesn't do anything useful. Subclassing the module *type* does, as you demonstrate. Python 2.3 still allows you to subclass the module type, even though it does not allow you to subclass modules. Now, if you *really* want to subclass a *module*, then you should check out PEAK's "module inheritance" technique that lets you define new modules in terms of other modules. It's useful for certain types of AOP/SOP techniques. But it's currently implemented using bytecode hacking, and is therefore evil. ;) Anyway, it doesn't rely on actually *subclassing* modules. Speaking of bytecode hacking, it would be so much easier to implement "portable magic" if there were a fast, easy to use, language-defined intermediate representation for Python code that one could hack with. And don't tell me to "use Lisp", either... ;) From skip@pobox.com Sat May 17 03:52:41 2003 From: skip@pobox.com (Skip Montanaro) Date: Fri, 16 May 2003 21:52:41 -0500 Subject: [Python-Dev] test_bsddb185 failing under OS X In-Reply-To: <3EC5734E.30209@ocf.berkeley.edu> References: <3EC5734E.30209@ocf.berkeley.edu> Message-ID: <16069.41977.390568.226852@montanaro.dyndns.org> Brett> AssertionError: 'bsddb185' == 'bsddb185' Brett> DBs are not my area of expertise so I don't know how to go about Brett> to attempt to fix this. I'll look into it. 
Skip From skip@pobox.com Sat May 17 14:13:05 2003 From: skip@pobox.com (Skip Montanaro) Date: Sat, 17 May 2003 08:13:05 -0500 Subject: [Python-Dev] test_bsddb185 failing under OS X In-Reply-To: <3EC5734E.30209@ocf.berkeley.edu> References: <3EC5734E.30209@ocf.berkeley.edu> Message-ID: <16070.13665.129282.617413@montanaro.dyndns.org> Brett, I goofed a bit in my (private) note to you yesterday. anydbm._name isn't of interest. It's anydbm._defaultmod. On my system, if I mv Lib/bsddb to Lib/bsddb- I no longer have the bsddb package available (as you said you didn't). In that situation, for me, anydbm._defaultmod is the gdbm module. All three tests succeed: % ./python.exe ../Lib/test/test_bsddb185.py test_anydbm_create (__main__.Bsddb185Tests) ... ok test_open_existing_hash (__main__.Bsddb185Tests) ... ok test_whichdb (__main__.Bsddb185Tests) ... ok If I delete gdbm.so I get dbm as anydbm._defaultmod. Again, success: % ./python.exe ../Lib/test/test_bsddb185.py test_anydbm_create (__main__.Bsddb185Tests) ... ok test_open_existing_hash (__main__.Bsddb185Tests) ... ok test_whichdb (__main__.Bsddb185Tests) ... ok Delete dbm.so. Run again. Now dumbdbm is anydbm._defaultmod. Run again. Success again: % ./python.exe ../Lib/test/test_bsddb185.py test_anydbm_create (__main__.Bsddb185Tests) ... ok test_open_existing_hash (__main__.Bsddb185Tests) ... ok test_whichdb (__main__.Bsddb185Tests) ... ok In short, I can't reproduce your error. Can you do some more debugging to see why your anydbm.open seems to be calling bsddb185.open? 
Thx, Skip From jepler@unpythonic.net Sat May 17 16:21:39 2003 From: jepler@unpythonic.net (Jeff Epler) Date: Sat, 17 May 2003 10:21:39 -0500 Subject: [Python-Dev] [PEP] += on return of function call result In-Reply-To: <20030516142451.GI6196@localhost> References: <20030402090726.GN1048@localhost> <20030515214417.GF3900@localhost> <20030516142451.GI6196@localhost> Message-ID: <20030517152137.GA25579@unpythonic.net> I think that looking at the generated bytecode is useful.

# Running with 'python -O'
>>> def f(x): x += 1
>>> dis.dis(f)
  0 LOAD_FAST       0 (x)
  3 LOAD_CONST      1 (1)
  6 INPLACE_ADD
  7 STORE_FAST      0 (x)     ***
 10 LOAD_CONST      0 (None)
 13 RETURN_VALUE

>>> def g(x): x[0] += 1
>>> dis.dis(g)
  0 LOAD_GLOBAL     0 (x)
  3 LOAD_CONST      1 (0)
  6 DUP_TOPX        2
  9 BINARY_SUBSCR
 10 LOAD_CONST      2 (1)
 13 INPLACE_ADD
 14 ROT_THREE
 15 STORE_SUBSCR              ***
 16 LOAD_CONST      0 (None)
 19 RETURN_VALUE

>>> def h(x): x.a += 1
>>> dis.dis(h)
  0 LOAD_GLOBAL     0 (x)
  3 DUP_TOP
  4 LOAD_ATTR       1 (a)
  7 LOAD_CONST      1 (1)
 10 INPLACE_ADD
 11 ROT_TWO
 12 STORE_ATTR      1 (a)     ***
 15 LOAD_CONST      0 (None)
 18 RETURN_VALUE

In each case, there's a STORE step to the inplace statement. In the case of the proposed

def j(x): x() += 1

what STORE instruction would you use?

>>> [opname for opname in dis.opname if opname.startswith("STORE")]
['STORE_SLICE+0', 'STORE_SLICE+1', 'STORE_SLICE+2', 'STORE_SLICE+3',
 'STORE_SUBSCR', 'STORE_NAME', 'STORE_ATTR', 'STORE_GLOBAL', 'STORE_FAST',
 'STORE_DEREF']

If you don't want one from the list, then you're looking at substantial changes to Python..
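The same load / inplace-op / store shape can be checked on a current interpreter with the modern dis API. Individual opcode names have shifted across versions (e.g. INPLACE_ADD later became a generic BINARY_OP), but the trailing store chosen by the target's syntax survives:

```python
import dis

def f(x):
    x += 1
    return x

# An augmented assignment to a plain local still ends in a store whose
# opcode is picked by the target syntax: STORE_FAST for a local name.
ops = [ins.opname for ins in dis.get_instructions(f)]
assert "STORE_FAST" in ops
```

This is the crux of Jeff's question: `f(x) += 1` offers no syntactic target, so there is no store opcode the compiler could emit.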
(and STORE_DEREF probably doesn't do anything that's relevant to this situation, though the name sure sounds promising, doesn't it) Jeff From tjreedy@udel.edu Sat May 17 18:34:05 2003 From: tjreedy@udel.edu (Terry Reedy) Date: Sat, 17 May 2003 13:34:05 -0400 Subject: [Python-Dev] Re: [PEP] += on return of function call result References: <20030402090726.GN1048@localhost> <20030515214417.GF3900@localhost> <20030516142451.GI6196@localhost> Message-ID: "Luke Kenneth Casson Leighton" wrote in message > 1) what is the technical, syntactical or language-specific reason why > I can't write an expression like f(x) += y ? In general, ignoring repetition of side-effects, this translates to f(x) = f(x) + y. Abstractly, the assignment pattern is target(s) = object(s), whereas above is object = object. As some of us have tried to point out to the cbv (call-by-value) proponents on a clp thread, targets are not objects, so object = object is not proper Python. The reason inplace op syntax is possible is that syntax that defines a target on the left instead denotes an object when on the right (of '='), so that syntax to the left of op= does double duty. As Jeff Epler pointed out in his response, the compiler uses that syntax to determine the type of target and thence the appropriate store instruction. But function call syntax only denotes an object and does not define a target and hence cannot do double duty. The exception to all this is listob += seq, which translates to listob.extend(seq). So if f returns a list, f(x) += y could be executed, but only with runtime selection of the appropriate byte code. However, if you know that f is going to return a list, so that f(x)+=y seems sensible, you can write f(x).extend(y) directly (or f(x).append(y) if that is what you actually want). However, since this does not bind the result to anything, even this is pointless unless all f does is to select from lists that you can already access otherwise.
(Example: f(lista,listb,bool_exp).extend(y).) Terry J. Reedy From drifty@alum.berkeley.edu Sat May 17 19:13:15 2003 From: drifty@alum.berkeley.edu (Brett C.) Date: Sat, 17 May 2003 11:13:15 -0700 Subject: [Python-Dev] test_bsddb185 failing under OS X In-Reply-To: <16070.13665.129282.617413@montanaro.dyndns.org> References: <3EC5734E.30209@ocf.berkeley.edu> <16070.13665.129282.617413@montanaro.dyndns.org> Message-ID: <3EC67BBB.4090003@ocf.berkeley.edu> Skip Montanaro wrote: > Brett, > > I goofed a bit in my (private) note to you yesterday. anydbm._name isn't of > interest. It's anydbm._defaultmod. >>> anydbm._defaultmod > On my system, if I mv Lib/bsddb to > Lib/bsddb- I no longer have the bsddb package available (as you said you > didn't). In that situation, for me, anydbm._defaultmod is the gdbm module. > All three tests succeed: > > % ./python.exe ../Lib/test/test_bsddb185.py > test_anydbm_create (__main__.Bsddb185Tests) ... ok > test_open_existing_hash (__main__.Bsddb185Tests) ... ok > test_whichdb (__main__.Bsddb185Tests) ... ok > > If I delete gdbm.so I get dbm as anydbm._defaultmod. Again, success: > > % ./python.exe ../Lib/test/test_bsddb185.py > test_anydbm_create (__main__.Bsddb185Tests) ... ok > test_open_existing_hash (__main__.Bsddb185Tests) ... ok > test_whichdb (__main__.Bsddb185Tests) ... ok > > Delete dbm.so. Run again. Now dumbdbm is anydbm._defaultmod. Run again. > Success again: > No success for me when it is using dumbdbm: ====================================================================== ERROR: test_anydbm_create (__main__.Bsddb185Tests) ---------------------------------------------------------------------- Traceback (most recent call last): File "Lib/test/test_bsddb185.py", line 39, in test_anydbm_create os.rmdir(tmpdir) OSError: [Errno 66] Directory not empty: '/tmp/tmpkiVKcZ' ---------------------------------------------------------------------- Looks like foo.dat and foo.dir are left (files used by the DB?). 
I will fix the test again to be more aggressive about deleting files. ... done. Just used shutil.rmtree instead of the nested 'try' statements that called os.unlink and os.rmdir . Now the tests pass for dumbdbm. So it seems to be dbm.so for some reason. I will see what I can figure out or at least get as much info as I can that I think can help in debugging this. -Brett From drifty@alum.berkeley.edu Sat May 17 19:22:00 2003 From: drifty@alum.berkeley.edu (Brett C.) Date: Sat, 17 May 2003 11:22:00 -0700 Subject: [Python-Dev] test_bsddb185 failing under OS X In-Reply-To: <3EC67BBB.4090003@ocf.berkeley.edu> References: <3EC5734E.30209@ocf.berkeley.edu> <16070.13665.129282.617413@montanaro.dyndns.org> <3EC67BBB.4090003@ocf.berkeley.edu> Message-ID: <3EC67DC8.4050809@ocf.berkeley.edu> Brett C. wrote: > No success for me when it is using dumbdbm: > > ====================================================================== > ERROR: test_anydbm_create (__main__.Bsddb185Tests) > ---------------------------------------------------------------------- > Traceback (most recent call last): > File "Lib/test/test_bsddb185.py", line 39, in test_anydbm_create > os.rmdir(tmpdir) > OSError: [Errno 66] Directory not empty: '/tmp/tmpkiVKcZ' > > ---------------------------------------------------------------------- > > Looks like foo.dat and foo.dir are left (files used by the DB?). I will > fix the test again to be more aggressive about deleting files. > > ... done. Just used shutil.rmtree instead of the nested 'try' > statements that called os.unlink and os.rmdir . Now the tests pass for > dumbdbm. So it seems to be dbm.so for some reason. > But then Skip checked in the exact change I was going to I think almost simultaneously. And guess what? Now the darned tests pass using dbm! I am going to do a completely clean compile and test again to make sure this is not a fluke since the only change was ``cvs update`` for test_bsddb185.py and that only changed how files were deleted.
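The cleanup fix amounts to one shutil.rmtree call in place of the nested try/os.unlink/os.rmdir dance. A small sketch of why it helps when the DB leaves stray files like foo.dat behind:

```python
import os
import shutil
import tempfile

# Simulate the failure mode: a temp dir with leftover DB files
# (the names foo.dat / foo.dir are taken from the report in this thread).
tmpdir = tempfile.mkdtemp()
for name in ('foo.dat', 'foo.dir'):
    open(os.path.join(tmpdir, name), 'w').close()

# os.rmdir(tmpdir) would raise "Directory not empty" at this point;
# shutil.rmtree removes the directory along with anything still inside it.
shutil.rmtree(tmpdir)
assert not os.path.exists(tmpdir)
```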
Ah, the joys of coding. -Brett From drifty@alum.berkeley.edu Sat May 17 20:18:23 2003 From: drifty@alum.berkeley.edu (Brett C.) Date: Sat, 17 May 2003 12:18:23 -0700 Subject: [Python-Dev] test_bsddb185 failing under OS X In-Reply-To: <3EC67DC8.4050809@ocf.berkeley.edu> References: <3EC5734E.30209@ocf.berkeley.edu> <16070.13665.129282.617413@montanaro.dyndns.org> <3EC67BBB.4090003@ocf.berkeley.edu> <3EC67DC8.4050809@ocf.berkeley.edu> Message-ID: <3EC68AFF.3020900@ocf.berkeley.edu> Brett C. wrote: > > But then Skip checked in the exact change I was going to I think almost > simultaneously. And guess what? Now the darned tests pass using dbm! I > am going to do a completely clean compile and test again to make sure > this is not a fluke since the only change was ``cvs update`` for > test_bsddb185.py and that only changed how files were deleted. > Well, I recompiled and the test is still passing. The only thing I am aware of that changed between the tests failing and passing was me changing the test to use shutil.rmtree to clean up after itself and renaming dbm.so and then putting its name back. I have no idea why it is working now, but it is. -Brett
If you wish to "OPT-OUT" from this mailing as well as the lists of thousands of other email providers please visit http://www.a6zing29.com/1/ +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ ofqjfdllcqzd kxjg e shuvrmthvfwfi --CFA0__4ED39__0CEF44-- From skip@pobox.com Sat May 17 23:03:20 2003 From: skip@pobox.com (Skip Montanaro) Date: Sat, 17 May 2003 17:03:20 -0500 Subject: [Python-Dev] test_bsddb185 failing under OS X In-Reply-To: <3EC67BBB.4090003@ocf.berkeley.edu> References: <3EC5734E.30209@ocf.berkeley.edu> <16070.13665.129282.617413@montanaro.dyndns.org> <3EC67BBB.4090003@ocf.berkeley.edu> Message-ID: <16070.45480.640314.144944@montanaro.dyndns.org> Brett> No success for me when it is using dumbdbm: Brett> ====================================================================== Brett> ERROR: test_anydbm_create (__main__.Bsddb185Tests) Brett> ---------------------------------------------------------------------- Brett> Traceback (most recent call last): Brett> File "Lib/test/test_bsddb185.py", line 39, in test_anydbm_create Brett> os.rmdir(tmpdir) Brett> OSError: [Errno 66] Directory not empty: '/tmp/tmpkiVKcZ' This problem is fixed in CVS. Have you updated? Brett> ... done. Just used shutil.rmtree instead of the nested 'try' Brett> statements that called os.unlink and os.rmdir . Now the tests Brett> pass for dumbdbm. So it seems to be dbm.so for some reason. This is just what I checked in. Skip From tim.one@comcast.net Sun May 18 02:12:12 2003 From: tim.one@comcast.net (Tim Peters) Date: Sat, 17 May 2003 21:12:12 -0400 Subject: [Python-Dev] Simple dicts In-Reply-To: <006301c31b30$da69e8e0$6401a8c0@damien> Message-ID: [Damien Morton] > ... > On the other hand, as Tim pointed out to me in a private email, there is > so much overhead in just getting to the hashtable inner loop, going > around that loop one time instead of two or three seems inconsequential. For in-cache tables. For out-of-cache tables, each trip around the loop is deadly. 
Since heavily used portions of small dicts are likely to be in-cache no matter how they're implemented, that's what makes me dubious about pouring lots of effort into reducing collisions for small dicts specifically. > ... > There seem to be two different ways to get/set/del from a dictionary. > > The first is using PyDict_[Get|Set|Del]Item() > > The second is using the embarrassingly named dict_ass_sub() and its > partner dict_subscript(). > > Which of these two access methods is most likely to be used? My guess matches Guido's: PyDict_*, except in programs making heavy use of explicit Python dicts. All programs use dicts under the covers for namespace mapping, and, e.g., instance.attr and module.attr end up calling PyDict_GetItem() directly. Python-level explicit dict subscripting ends up calling dict_*, essentially because Python has no idea at compile-time whether the x in x[y] *is* a dict, so generates code that goes thru the all-purpose type-dispatch machinery. On the third hand, some explicit-dict slinging code seems to use x = somedict.get(y) everywhere, and dict_get() doesn't call PyDict_GetItem() or dict_subscript(). From Raymond Hettinger" When I look at www.python.org/sf/732174 , there is no Submit button on the screen. But I see it for other docs and patches. Is anyone else having the same issue? Without a submit button, it is darned difficult to mark the bug as fixed and close it. --R From tim.one@comcast.net Sun May 18 02:52:18 2003 From: tim.one@comcast.net (Tim Peters) Date: Sat, 17 May 2003 21:52:18 -0400 Subject: [Python-Dev] SF oddity In-Reply-To: <007101c31cdd$e4423440$125ffea9@oemcomputer> Message-ID: [Raymond Hettinger] > When I look at www.python.org/sf/732174 , there is no Submit > button on the screen. > But I see it for other docs and patches. Is anyone else having > the same issue? > Without a submit button, it is darned difficult to mark the bug > as fixed and close it.
Look at the state of your browser's horizontal scrollbar, and scroll waaaaaay to the right. There's a very long line in this item's description, and that pushes the Submit button off the edge of anything less than a 161-inch monitor . From python@rcn.com Sun May 18 03:01:13 2003 From: python@rcn.com (Raymond Hettinger) Date: Sat, 17 May 2003 22:01:13 -0400 Subject: [Python-Dev] SF oddity References: Message-ID: <000201c31cf2$5a08dee0$125ffea9@oemcomputer> > [Raymond Hettinger] > > When I look at www.python.org/sf/732174 , there is no Submit > > button on the screen. > > But I see it for other docs and patches. Is anyone else having > > the same issue? > > Without a submit button, it is darned difficult to mark the bug > > as fixed and close it. [Timbot] > Look at the state of your browser's horizontal scrollbar, and scroll > waaaaaay to the right. There's a very long line in this item's description, > and that pushes the Submit button off the edge of anything less than a > 161-inch monitor . Hmphh! 
################################################################# ################################################################# ################################################################# ##### ##### ##### ################################################################# ################################################################# ################################################################# From skip@mojam.com Sun May 18 13:00:26 2003 From: skip@mojam.com (Skip Montanaro) Date: Sun, 18 May 2003 07:00:26 -0500 Subject: [Python-Dev] Weekly Python Bug/Patch Summary Message-ID: <200305181200.h4IC0Qa21569@manatee.mojam.com> Bug/Patch Summary ----------------- 403 open / 3646 total bugs (-11) 141 open / 2161 total patches (+6) New Bugs -------- Problem With email.MIMEText Package (2003-05-12) http://python.org/sf/736407 allow HTMLParser error recovery (2003-05-12) http://python.org/sf/736428 markupbase parse_declaration cannot recognize comments (2003-05-12) http://python.org/sf/736659 forcing function to act like an unbound method dumps core (2003-05-13) http://python.org/sf/736892 CGIHTTPServer does not handle scripts in sub-dirs (2003-05-13) http://python.org/sf/737202 os.symlink docstring is ambiguous. 
(2003-05-13) http://python.org/sf/737291 need doc for new trace module (2003-05-14) http://python.org/sf/737734 Failed assert in stringobject.c (2003-05-14) http://python.org/sf/737947 Interpreter crash: sigfpe on Alpha (2003-05-14) http://python.org/sf/738066 Section 13.3: htmllib.HTMLParser constructor definition amen (2003-05-15) http://python.org/sf/738090 pdb doesn't find some source files (2003-05-15) http://python.org/sf/738154 crash error in glob.glob; directories with brackets (2003-05-15) http://python.org/sf/738361 csv.Sniffer docs need updating (2003-05-15) http://python.org/sf/738471 On Windows, os.listdir() throws incorrect exception (2003-05-15) http://python.org/sf/738617 urllib2 CacheFTPHandler doesn't work on multiple dirs (2003-05-16) http://python.org/sf/738973 array.insert and negative indices (2003-05-17) http://python.org/sf/739313 New Patches ----------- Mutable PyCObject (2001-11-02) http://python.org/sf/477441 Improvement of cgi.parse_qsl function (2002-01-25) http://python.org/sf/508665 CGIHTTPServer execfile should save cwd (2002-01-25) http://python.org/sf/508730 rlcompleter does not expand on [ ] (2002-04-22) http://python.org/sf/547176 ConfigParser.read() should return list of files read (2003-01-30) http://python.org/sf/677651 DESTDIR improvement (2003-05-12) http://python.org/sf/736413 Put DEFS back to Makefile.pre.in (2003-05-12) http://python.org/sf/736417 Trivial improvement to NameError message (2003-05-12) http://python.org/sf/736730 interpreter final destination location (2003-05-12) http://python.org/sf/736857 docs for interpreter final destination location (2003-05-12) http://python.org/sf/736859 Port tests to unittest (Part 2) (2003-05-13) http://python.org/sf/736962 traceback module caches sources invalid (2003-05-13) http://python.org/sf/737473 minor codeop fixes (2003-05-14) http://python.org/sf/737999 for i in range(N) optimization (2003-05-15) http://python.org/sf/738094 fix for glob with directories which contain 
brackets (2003-05-15) http://python.org/sf/738389 Add use_default_colors support to curses module. (2003-05-17) http://python.org/sf/739124 Closed Bugs ----------- Regular expression tests: SEGV on Mac OS (2001-04-16) http://python.org/sf/416526 CGIHTTPServer crashes Explorer in WinME (2001-05-31) http://python.org/sf/429193 MacPy21: sre "recursion limit" bug (2001-06-29) http://python.org/sf/437472 provide a documented serialization func (2001-10-02) http://python.org/sf/467384 Security review of pickle/marshal docs (2001-10-16) http://python.org/sf/471893 Improvement of cgi.parse_qsl function (2002-01-25) http://python.org/sf/508665 CGIHTTPServer execfile should save cwd (2002-01-25) http://python.org/sf/508730 metaclasses and 2.2 highlights (2002-02-08) http://python.org/sf/515137 bsddb keys corruption (2002-02-25) http://python.org/sf/522780 test_pyclbr: bad dependency for input (2002-03-12) http://python.org/sf/529135 Wrong exception from re.compile() (2002-04-18) http://python.org/sf/545855 regex segfault on Mac OS X (2002-04-19) http://python.org/sf/546059 rlcompleter does not expand on [ ] (2002-04-22) http://python.org/sf/547176 os.spawnv() fails with underscores (2002-06-30) http://python.org/sf/575770 Print line number of string if at EOF (2002-07-04) http://python.org/sf/577295 Build error using make VPATH feature (2002-10-22) http://python.org/sf/626926 Have exception arguments keep their type (2003-01-27) http://python.org/sf/675928 No documentation of static/dynamic python modules. 
(2003-03-12) http://python.org/sf/702157 Distutils documentation amputated (2003-04-01) http://python.org/sf/713722 datetime types don't work as bases (2003-04-13) http://python.org/sf/720908 _winreg doesn't handle NULL bytes in value names (2003-04-16) http://python.org/sf/722413 add timeout support in socket using modules (2003-04-17) http://python.org/sf/723287 Minor /Tools/Scripts/crlf.py bugs (2003-04-20) http://python.org/sf/724767 rexec not listed as dead (2003-04-29) http://python.org/sf/729817 Clarification of "pos" and "endpos" for match objects. (2003-05-04) http://python.org/sf/732124 telnetlib.read_until: float req'd for timeout (2003-05-08) http://python.org/sf/734806 cStringIO.StringIO (2003-05-09) http://python.org/sf/735535 libwinsound.tex is missing MessageBeep() description (2003-05-10) http://python.org/sf/735674 Closed Patches -------------- xmlrpclib: Optional 'nil' support (2002-10-24) http://python.org/sf/628208 Remove type-check from urllib2 (2002-11-15) http://python.org/sf/639139 urllib2.Request's headers are case-sens. (2002-12-06) http://python.org/sf/649742 has_function() method for CCompiler (2003-04-07) http://python.org/sf/717152 DESTDIR variable patch (2003-04-09) http://python.org/sf/718286 socketmodule inet_ntop built when IPV6 is disabled (2003-04-30) http://python.org/sf/730603 make threading join() method return a value (2003-05-02) http://python.org/sf/731607 exit status of latex2html "ignored" (2003-05-04) http://python.org/sf/732143 build of html docs broken (liboptparse.tex) (2003-05-04) http://python.org/sf/732174 Docs for test package (2003-05-04) http://python.org/sf/732394 Python2.3b1 makefile improperly installs IDLE (2003-05-10) http://python.org/sf/735613 Python makefile may install idle in the wrong place (2003-05-10) http://python.org/sf/735614
From jim@zope.com Sun May 18 19:28:55 2003 From: jim@zope.com (Jim Fulton) Date: Sun, 18 May 2003 14:28:55 -0400 Subject: [Python-Dev] doctest extensions Message-ID: <3EC7D0E7.9000705@zope.com> I've written some doctest extensions to: - Generate a unittest (pyunit) test suite from a module with doctest tests. Each doc string containing one or more doctest tests becomes a test case. If a test fails, an error message is included in the unittest output that has the module file name and the approximate line number of the docstring containing the failed test formatted in a way understood by emacs error parsing. This is important. ;) - Debug doctest tests. Normally, doctest tests can't be debugged with pdb because, while they are running, doctest has taken over standard output. This tool extracts the tests in a doc string into a separate script and runs pdb on it. - Extract a doctest doc string into a script file. I think that these would be good additions to doctest and propose to add them. The current source can be found here: http://cvs.zope.org/Zope3/src/zope/testing/doctestunit.py?rev=HEAD&content-type=text/vnd.viewcvs-markup I ended up using a slightly different (and simpler) strategy for finding docstrings than doctest uses. This might be an issue. Jim -- Jim Fulton mailto:jim@zope.com Python Powered! CTO (703) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From guido@python.org Sun May 18 20:02:55 2003 From: guido@python.org (Guido van Rossum) Date: Sun, 18 May 2003 15:02:55 -0400 Subject: [Python-Dev] C new-style classes and GC In-Reply-To: "Your message of Fri, 16 May 2003 15:29:54 EDT." References: Message-ID: <200305181902.h4IJ2ti17624@pcp02138704pcs.reston01.va.comcast.net> > PEP 253 may be partly out of date here -- or not.
In the section on > creating a subclassable type, it says: > > """ > The base type must do the following: > > - Add the flag value Py_TPFLAGS_BASETYPE to tp_flags. > > - Declare and use tp_new(), tp_alloc() and optional tp_init() > slots. > > - Declare and use tp_dealloc() and tp_free(). > > - Export its object structure declaration. > > - Export a subtyping-aware type-checking macro. > """ > > This doesn't leave a choice about defining tp_alloc() or tp_free() -- it > says both are required. For a subclassable type, I believe both must > actually be implemented too. > > For a non-subclassable type, I expect they're optional. But if you don't > define tp_free in that case, then I believe you must also not do the > > obj->ob_type->tp_free(obj) > > business in the tp_dealloc slot (else it will segfault). PyType_Ready() inherits tp_free from the base class, so it's optional. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Sun May 18 21:33:44 2003 From: guido@python.org (Guido van Rossum) Date: Sun, 18 May 2003 16:33:44 -0400 Subject: [Python-Dev] Re: C new-style classes and GC In-Reply-To: "Your message of Fri, 16 May 2003 16:30:47 EDT." <3EC54A77.7090106@zope.com> References: <3EC507CB.6080502@zope.com> <1053103323.456.71.camel@slothrop.zope.com> <3EC51B12.8070407@zope.com> <1053106533.453.78.camel@slothrop.zope.com> <3EC53D09.3050505@zope.com> <1053114159.457.117.camel@slothrop.zope.com> <3EC54A77.7090106@zope.com> Message-ID: <200305182033.h4IKXi317732@pcp02138704pcs.reston01.va.comcast.net> > I don't know why PyDict_New doesn't just call the dict type. > Maybe doing things in-line like this is just an optimization. Yes; and because PyDict_New is much older than callable type objects.
--Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Sun May 18 21:39:30 2003 From: guido@python.org (Guido van Rossum) Date: Sun, 18 May 2003 16:39:30 -0400 Subject: [Python-Dev] Re: C new-style classes and GC In-Reply-To: "Your message of Fri, 16 May 2003 17:37:07 EDT." References: Message-ID: <200305182039.h4IKdU717764@pcp02138704pcs.reston01.va.comcast.net> > Guido, would you be agreeable to making this magic even more magical? It > seems to me that we can know whether the current type intends to participate > in cyclic gc, and give it a correct default tp_free value instead if so. > The hairier type_new() function already has this extra level of > Py_TPFLAGS_HAVE_GC-dependent magic for dynamically created types, setting > tp_free to PyObject_Del in one case and to PyObject_GC_Del in the other. > PyType_Ready() can supply a wrong deallocation function by default > ("explicit is better than implicit" has no force when talking about > PyType_Ready() ). Yes, I think this is the right thing to do -- either only inherit tp_free when the GC bit of the base and derived class are the same, or -- in addition -- special case inheriting PyObject_Del and turn it into PyObject_GC_Del when the base class adds the GC bit. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Sun May 18 21:42:34 2003 From: guido@python.org (Guido van Rossum) Date: Sun, 18 May 2003 16:42:34 -0400 Subject: [Python-Dev] a strange case In-Reply-To: "Your message of Fri, 16 May 2003 13:45:25 -0800." <200305161345.25415.troy@gci.net> References: <20030516202402.30333.72761.Mailman@mail.python.org> <200305161345.25415.troy@gci.net> Message-ID: <200305182042.h4IKgYA17778@pcp02138704pcs.reston01.va.comcast.net> > "Why in the world would you want callable modules you ask?" I > don't have a real need, but I often see the line blurred between package, > module, and class. Please don't try to blur the line between module and class. 
This has been proposed many times, and the net result IMO is always more confusion and no more power. This is also why in 2.3, modules are no longer subclassable. If you really need to have a module that has behavior beyond what a module can offer, the officially sanctioned way is to stick an instance of a class in sys.modules[__name__] from inside the module's code. (I would explain more about *why* I think it's a really bad idea, but I'm officially on vacation.) --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Sun May 18 22:04:40 2003 From: guido@python.org (Guido van Rossum) Date: Sun, 18 May 2003 17:04:40 -0400 Subject: [Python-Dev] Need advice, maybe support In-Reply-To: "Your message of Sat, 17 May 2003 01:52:20 +0200." <3EC579B4.9000303@tismer.com> References: <3EC579B4.9000303@tismer.com> Message-ID: <200305182104.h4IL4eY17830@pcp02138704pcs.reston01.va.comcast.net> > In the last months, I made very much progress with Stackless 3.0 . > Finally, I was able to make much more of Python stackless > (which means, does not use recursive interpreter calls) than > I could achieve with 1.0 . > > There is one drawback with this, and I need advice: > Compared to older Python versions, Py 2.2.2 and up uses > more indirection through C function pointers than ever. > This blocked my implementation of stackless versions, in > the first place. > > Then the idea hit me like a blizzard: > Most problems simply vanish if I add another slot to the > PyMethodDef structure, which is NULL by default: > ml_meth_nr is a function pointer with the same semantics > as ml_meth, but it tries to perform its action without > doing a recursive call. It tries instead to push a frame > and to return Py_UnwindToken. > Doing this change made Stackless crystal clear and simple: > A C extension not aware of Stackless does what it does > all the time: call ml_meth.
> Stackless aware C code (like my modified ceval.c code) > calls the ml_meth_nr slots, instead, which either defaults > to the ml_meth code, or has a special version which avoids > recursive interpreter calls. > I also added a tp_call_nr slot to typeobject, for similar > reasons. > > While this is just great for me, yielding complete > source code compatibility, it is a slight drawback, since > almost all extension modules make use of the PyMethodDef > structure. Therefore, binary compatibility of Stackless > has degraded dramatically. > > I'm now in some kind of dilemma: > On the one side, I'm happy with this solution (while I have > to admit that it is not too inexpensive, but well, all the > new descriptor objects are also not cheap, but just great), > on the other hand, simply replacing python22.dll is no longer > sufficient. You need to re-compile everything, which might > be a hard thing on Windows (win32 extensions, wxPython). > Sure, I would stand this, if there is no alternative, I would > have to supply a complete replacement package of everything. > > Do you (does anybody) have an alternative suggestion how > to efficiently maintain a "normal" and a "non-recursive" > version of a method without changing the PyMethodDef struc? > > Alternatively, would it be reasonable to ask the Python core > developers, if they would accept to augment PyMethodDef and > PyTypeObject with an extra field (default NULL, no maintenance), > just for me and Stackless? > > Many thanks for any reply - sincerely -- chris I don't think we can just add an extra field to PyMethodDef, because it would break binary compatibility. Currently, in most cases, a 3rd-party extension module compiled for an earlier Python version can still be used with a later version. Because PyMethodDef is used as an array, adding a field to it would break this. I have less of a problem with extending PyTypeObject, it grows all the time and the tp_flags bits tell you how large the one you've got is.
(I still have some problems with this, because things that are of no use to the regular Python core developers tend to either confuse them, or be broken on a regular basis.) Maybe you could get away with defining an alternative structure for PyMethodDef and having a flag in tp_flags say which it is; there are plenty of unused bits and I don't mind reserving one for you. Then you'd have to change all the code that *uses* tp_methods, but there isn't much of that; in fact, the only place I see is in typeobject.c. If this doesn't work for you, maybe you could somehow fold the two implementation functions into one, and put something special in the argument list to signal that the non-recursive version is wanted? (Thinking aloud here -- I don't know exactly what the usage pattern of the nr versions will be.) --Guido van Rossum (home page: http://www.python.org/~guido/) From tim@zope.com Sun May 18 22:42:28 2003 From: tim@zope.com (Tim Peters) Date: Sun, 18 May 2003 17:42:28 -0400 Subject: [Python-Dev] Re: C new-style classes and GC In-Reply-To: <200305182039.h4IKdU717764@pcp02138704pcs.reston01.va.comcast.net> Message-ID: [Guido] > Yes, I think this is the right thing to do -- either only inherit > tp_free when the GC bit of the base and derived class are the same, Jim is keen to have gc'able classes defined in C get the right deallocation function by magic. In these cases, he leaves tp_free NULL, but indicates gc-ability in tp_flags. tp_base becomes "object" by magic then, and the GC bits are not the same, and neither inheriting object.tp_free nor leaving derived_class.tp_free NULL can work. It seems like a reasonable thing to me to want it to work, so on to the next: > or -- in addition -- special case inheriting PyObject_Del and turn it > into PyObject_GC_Del when the base class adds the GC bit. 
That's what I had in mind, s/base/derived/, plus raising an exception if a gc'able class explicitly sets tp_free to PyObject_Del (probably a cut-'n-paste error when that happens, or that gc-ability was tacked on to a previously untracked type). If that's all OK, enjoy your vacation, and I'll take care of this (for 2.3 and 2.2.3). From troy@gci.net Sun May 18 23:07:37 2003 From: troy@gci.net (Troy Melhase) Date: Sun, 18 May 2003 14:07:37 -0800 Subject: [Python-Dev] a strange case In-Reply-To: <200305182042.h4IKgYA17778@pcp02138704pcs.reston01.va.comcast.net> References: <20030516202402.30333.72761.Mailman@mail.python.org> <200305161345.25415.troy@gci.net> <200305182042.h4IKgYA17778@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <200305181407.37880.troy@gci.net> > Please don't try to blur the line between module and class. This has > been proposed many times, and the net result IMO is always more > confusion and no more power. This is also why in 2.3, modules are no > longer subclassable. Loud and clear! > (I would explain more about *why* I think it's a really bad idea, but > I'm officially on vacation.) "There should be one-- and preferably only one --obvious way to do it" if I had to guess. Happy holidays. -troy From walter@livinglogic.de Sun May 18 23:19:18 2003 From: walter@livinglogic.de (Walter Dörwald) Date: Mon, 19 May 2003 00:19:18 +0200 Subject: [Python-Dev] a strange case In-Reply-To: <200305182042.h4IKgYA17778@pcp02138704pcs.reston01.va.comcast.net> References: <20030516202402.30333.72761.Mailman@mail.python.org> <200305161345.25415.troy@gci.net> <200305182042.h4IKgYA17778@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <3EC806E6.3040204@livinglogic.de> Guido van Rossum wrote: >>"Why in the world would you want callable modules you ask?" I >>don't have a real need, but I often see the line blurred between package, >>module, and class. > > Please don't try to blur the line between module and class.
This has > been proposed many times, It sounds familiar! ;) > and the net result IMO is always more > confusion and no more power. This is also why in 2.3, modules are no > longer subclassable. > > If you really need to have a module that has behavior beyond what a > module can offer, the officially sanctioned way is to stick an > instance of a class in sys.modules[__name__] from inside the module's > code. But reload() won't work for these pseudo modules (See http://www.python.org/sf/701743). What about the imp module? > (I would explain more about *why* I think it's a really bad idea, but > I'm officially on vacation.) Sure, this can wait. Bye, Walter Dörwald From jepler@unpythonic.net Mon May 19 02:22:14 2003 From: jepler@unpythonic.net (Jeff Epler) Date: Sun, 18 May 2003 20:22:14 -0500 Subject: Using emacs' unexec to speed Python startup (was Re: [Python-Dev] Startup time) In-Reply-To: <1053112171.2342.7.camel@barry> References: <1052927757.7258.38.camel@slothrop.zope.com> <3EC51B5C.2080307@lemburg.com> <16069.13640.892428.185711@montanaro.dyndns.org> <1053112171.2342.7.camel@barry> Message-ID: <20030519012212.GA10317@unpythonic.net> On Fri, May 16, 2003 at 03:09:31PM -0400, Barry Warsaw wrote: > Skip, you're going about this all wrong. We already have the technology > to start Python up blazingly fast. All you have to do is port > XEmacs's unexec code. Then you load up Python with all the modules you > think you're going to need, unexec it, then the next time it starts up > like lightning. Disk space is cheap! I gave it a try, starting with 2.3b1 and using FSF Emacs 21.3's unexelf.c. An unexec'd binary loads faster than 'python -S -c pass', and seems to work properly with two exceptions and a few limitations. The only change to Python is in main(): I use mallopt() to force all allocations to go through brk() instead of through mmap(), because unexec doesn't support mmap'd memory.
I also used Modules/Setup.local to make some normally-shared modules not shared (for the same reason). dump.py loads the requested modules (- forces the module to *not* be found) and then calls unexec(), producing a new binary with the given name.

$ time ./python -S -c pass              # best 'real' of 5 runs
real    0m0.054s
user    0m0.040s
sys     0m0.010s
$ time ./python -c 'import cgi'         # best 'real' of 5 runs
real    0m0.127s
user    0m0.110s
sys     0m0.010s
$ strace -e open ./python -c 'import cgi' 2>&1 | grep -v ENOENT | wc -l
88
$ ./python dump.py cgipython -_ssl cgi
$ time ./cgipython -c 'import cgi'      # best 'real' of 5 runs
real    0m0.039s
user    0m0.020s
sys     0m0.020s
$ strace -e open ./cgipython -c 'import cgi' 2>&1 | grep -v ENOENT | wc -l
9
$ ./python dump.py dython
-rwxrwxr-x    1 jepler   jepler    4983713 May 18 19:42 cgipython
-rwxrwxr-x    1 jepler   jepler    3603737 May 18 19:39 python
-rwxrwxr-x    1 jepler   jepler    4541345 May 18 19:55 dython

(a minimal unexec'd python is about 90k bigger than the regular Python binary) I'm running the test suite now .. it hangs in test_signal for some reason. test_thread seems to hang too, which may be related. (but test_threading completes?)

$ ./dython Lib/test/regrtest.py -x test_signal -x test_thread
[...]
225 tests OK.
26 tests skipped: test_aepack test_al test_bsddb3 test_bz2 test_cd test_cl
    test_curses test_email_codecs test_gl test_imgfile test_linuxaudiodev
    test_macfs test_macostools test_nis test_normalization test_ossaudiodev
    test_pep277 test_plistlib test_scriptpackages test_socket_ssl
    test_socketserver test_sunaudiodev test_timeout test_urllibnet
    test_winreg test_winsound
1 skip unexpected on linux2: test_bz2

Well, if it worked right it'd sure be interesting. OTOH, unexelf.c is GPL'd and there's also the nightmare of different unex* for different platforms.
Jeff

########################################################################
# dump.py
import unexec, sys

for m in sys.argv[2:]:
    if m[0] == "-":
        sys.modules[m[1:]] = None
        continue
    __import__(m)

for m in sys.modules.keys():
    mod = sys.modules[m]
    if mod is None:
        continue                        # negatively cached entry
    if not hasattr(mod, "__file__"):
        continue                        # builtin module
    if mod.__file__.endswith(".so"):
        raise RuntimeError, "Cannot dump with shared module %s" % m

unexec.dump(sys.argv[1], sys.executable)

/**********************************************************************/
/* unexecmodule.c (needs unexec() eg from unexelf.c) */
#include "Python.h"

extern void unexec (char *new_name, char *old_name, unsigned data_start,
                    unsigned bss_start, unsigned entry_address);

static PyObject *
dump_python(PyObject *self, PyObject *args)
{
    char *filename, *symfile;
    if (!PyArg_ParseTuple(args, "ss", &filename, &symfile))
        return NULL;
    unexec(filename, symfile, 0, 0, (unsigned)Py_Main);
    _exit(99);
}

static PyMethodDef dump_methods[] = {
    {"dump", dump_python, METH_VARARGS,
     PyDoc_STR("dump(filename, symfile) -> None")},
    {NULL, NULL}
};

PyDoc_STRVAR(module_doc,
"Support for undumping the Python executable, a la Emacs");

PyMODINIT_FUNC
initunexec(void)
{
    Py_InitModule3("unexec", dump_methods, module_doc);
}

########################################################################
# Setup.local
# Edit this file for local setup changes
unexec unexecmodule.c unexelf.c
time timemodule.c
_socket socketmodule.c
_random _randommodule.c
math mathmodule.c
fcntl fcntlmodule.c

From drifty@alum.berkeley.edu Mon May 19 02:38:24 2003 From: drifty@alum.berkeley.edu (Brett C.) Date: Sun, 18 May 2003 18:38:24 -0700 Subject: [Python-Dev] python-dev Summary for 2003-05-01 through 2003-05-15 Message-ID: <3EC83590.1000306@ocf.berkeley.edu> It's that time of the month again. The only thing I would like help with this summary is if someone knows the attribute lookup order (instance, class, class descriptor, ...)
off the top of their heads, can you let me know? If not I can find it out by going through the docs but I figure someone out there has to know it by heart and any possible quirks (like whether descriptors take precedence over non-descriptor attributes). I won't send this off until Wednesday. ---------------------- +++++++++++++++++++++++++++++++++++++++++++++++++++++ python-dev Summary for 2003-05-01 through 2003-05-15 +++++++++++++++++++++++++++++++++++++++++++++++++++++ This is a summary of traffic on the `python-dev mailing list`_ from May 1, 2003 through May 15, 2003. It is intended to inform the wider Python community of on-going developments on the list and to have an archived summary of each thread started on the list. To comment on anything mentioned here, just post to python-list@python.org or `comp.lang.python`_ with a subject line mentioning what you are discussing. All python-dev members are interested in seeing ideas discussed by the community, so don't hesitate to take a stance on something. And if all of this really interests you then get involved and join `python-dev`_! This is the seventeenth summary written by Brett Cannon (going to grad school, baby!). All summaries are archived at http://www.python.org/dev/summary/ . Please note that this summary is written using reStructuredText_ which can be found at http://docutils.sf.net/rst.html . Any unfamiliar punctuation is probably markup for reST_ (otherwise it is probably regular expression syntax or a typo =); you can safely ignore it, although I suggest learning reST; it's simple and is accepted for `PEP markup`__. Also, because of the wonders of programs that like to reformat text, I cannot guarantee you will be able to run the text version of this summary through Docutils_ as-is unless it is from the original text file. __ http://www.python.org/peps/pep-0012.html The in-development version of the documentation for Python can be found at http://www.python.org/dev/doc/devel/ .
To view files in the Python CVS online, go to http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/python/ . .. _python-dev: http://www.python.org/dev/ .. _python-dev mailing list: http://mail.python.org/mailman/listinfo/python-dev .. _comp.lang.python: http://groups.google.com/groups?q=comp.lang.python .. _Docutils: http://docutils.sf.net/ .. _reST: .. _reStructuredText: http://docutils.sf.net/rst.html .. contents:: .. _last summary: http://www.python.org/dev/summary/2003-04-16_2003-04-30.html ====================== Summary Announcements ====================== So, to help keep my sanity longer than my predecessors I am no longer going to link to individual modules in the stdlib nor to files in CVS. It sucks down a ton of time and at least Raymond Hettinger thinks it clutters the summaries. Along the lines of the look of the summaries, I am trying out a new layout for listing splinter threads. If you have a preference in comparison to the old style or new style speak up and let me know. ========================== `Dictionary sparseness`__ ========================== __ http://mail.python.org/pipermail/python-dev/2003-May/035295.html Splinter threads: `Where'd my memory go?`__ __ http://mail.python.org/pipermail/python-dev/2003-May/035340.html After all the work Raymond Hettinger did on dictionaries he suggested two possible methods on dictionaries that would allow the programmer to control how sparse (at what point a dictionary doubles its size in order to lower collisions) a dictionary should be. Both got shot down on the grounds that most people would not know how to properly use the methods and are more likely to shoot themselves in the foot than get any gain out of them. There was also a bunch of talk about the "old days" when computers were small and didn't measure the amount of RAM they had in megabytes unless they were supercomputers. But then the discussion changed to memory footprints. 
There was some mention of the extra output one can get from a special build (all listed in Misc/SpecialBuilds.txt) such as Py_DEBUG. But the issue at hand is that there are int, float, and frameobject free lists which keep alive any and all created constant values (although the frameobject one is bounded in size). This is why if you do ``range(2000000)`` you won't get the memory allocated for all of those integers back until you shut down the interpreter. This led to the suggestion of just doing away with the free lists. There would be a performance hit since numerical constants would have to be reallocated if they are constantly created, deleted, and then created again. It was also suggested to limit the size of the free lists and basically chop off the tail if they grew too large. But it turns out that the memory is allocated in large blocks that are chopped up by intobject.c. Thus there is no way to just get rid of a few entries without taking out a whole block of objects. ================================= `__slots__ and default values`__ ================================= __ http://mail.python.org/pipermail/python-dev/2003-May/035575.html Ever initialized a variable in a class that uses __slots__? If you have you may have discovered that the variable becomes read-only::

    class Parrot(object):
        __slots__ = ["dead"]
        dead = True

    bought_bird = Parrot()
    bought_bird.dead = False

That raises an AttributeError saying that 'dead' is read-only. This occurs because the class attribute "overrides the descriptor created by __slots__" and "now appears read-only because there is no instance dict" thanks to __slots__ suppressing the creation of one. But don't go using this trick! If you want read-only attributes use a property with its set function set to raise an exception. If you want to avoid this problem just do your initialization of attributes in the __init__ call.
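The fix the summary recommends, initializing the attribute per instance in __init__ instead of on the class, can be sketched like this (the Parrot class is the summary's own example; the __init__ body is an added illustration):

```python
class Parrot(object):
    __slots__ = ["dead"]        # instances get a "dead" slot and no __dict__

    def __init__(self):
        self.dead = True        # stored in the slot, so it stays writable

bought_bird = Parrot()
bought_bird.dead = False        # no AttributeError this time
```

Because the value lives in the slot rather than being shadowed by a class attribute, assignment works normally, and the instances still avoid the memory cost of a per-instance __dict__.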
You can also include __dict__ in __slots__ and then your instances will have a fully functioning instance __dict__ (new in 2.3). The key thing to come away with from this is twofold. One is the resolution order of attribute lookup which is XXX. The other is that __slots__ is meant purely to cut down on memory usage, nothing more. Do not start abusing it with little tricks like the one mentioned above or Guido will pull it from the language. ========= Quickies ========= `Draft of dictnotes.txt`__ After all the work Raymond Hettinger did to try to speed up dictionaries, he wrote a text file documenting everything he tried and learned. .. __: http://mail.python.org/pipermail/python-dev/2003-May/035246.html `_socket efficiencies ideas`__ This thread was first covered in the `last summary`_. Guido discovered that the socket module used to special-case receiving a numeric address in order to skip the overhead of bothering to resolve the IP address. It has been put back into the code. .. __: http://mail.python.org/pipermail/python-dev/2003-May/035248.html `Demos and Tools in binary distributions`__ Jack Jansen asked where other platform-specific binary distributions of Python put the Demo and Tools directories. The thread ended with the winning solution being to put them in /Applications/Python2.3/Extras/ so they are a level below the root directory to prevent newbies from getting overwhelmed by the code there since it is not all simple. .. __: http://mail.python.org/pipermail/python-dev/2003-May/035252.html `updated notes about building bsddb185 module`__ Splinter threads: - `bsddb185 module changes checked in`__ Someone wanted the bsddb185 module back. Initially it was suggested to build that module as bsddb if the new bsddb3 module could not be built (since that module currently gets named bsddb). The final outcome was that bsddb185 will get built under certain conditions and be named bsddb185. ..
__: http://mail.python.org/pipermail/python-dev/2003-May/035257.html
__ http://mail.python.org/pipermail/python-dev/2003-May/035409.html

`broke email date parsing`__
... but it got fixed.

.. __: http://mail.python.org/pipermail/python-dev/2003-May/035259.html

`New thread death in test_bsddb3`__
This is a continuation from the `last summary`_. You can create as many thread states as you like as long as you only use one at any given point.

.. __: http://mail.python.org/pipermail/python-dev/2003-May/035230.html

`removing csv directory from nondist/sandbox - how?`__
Joys of CVS. You can never remove a directory unless you have direct access to the physical directory on the CVS root server. The best you can do is to empty the directory (make sure to get files named ".*") and assume people will do a ``cvs update -dP``. You can also remove the empty directories locally by hand if you like.

.. __: http://mail.python.org/pipermail/python-dev/2003-May/035270.html

`posixmodule.c patch to support forkpty`__
A patch that tries to get os.forkpty to work on more platforms was incorrectly sent to python-dev. It is now up on SourceForge_ and it is `patch #732401 `__.

.. __: http://mail.python.org/pipermail/python-dev/2003-May/035281.html
.. _SourceForge: http://www.sf.net/projects/python

`Timbot?`__
There is a real Timbot robot out there: http://www.cse.ogi.edu/~mpj/timbot/#Programming .

.. __: http://mail.python.org/pipermail/python-dev/2003-May/035287.html

`optparse docs need proofreading`__
What the 'subject' says.

.. __: http://mail.python.org/pipermail/python-dev/2003-May/035288.html

`heaps`__
This is a continuation of a thread from the `last summary`_. Lots of talk about heaps, priority queues, and other theoretical algorithm talk.

.. __: http://mail.python.org/pipermail/python-dev/2003-May/035343.html

Weekly Python Bug/Patch Summary

The first one ended on `2003-05-04 `__. The second one ended on `2003-05-11 `__.
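For readers following the heaps thread above: Python 2.3's new heapq module implements the basic priority-queue operations being discussed. A minimal sketch (the task data here is invented for illustration)::

```python
import heapq

# heapq maintains the list invariant that the smallest item sits at
# index 0; heappush and heappop are O(log n).
tasks = []
heapq.heappush(tasks, (2, "write summary"))
heapq.heappush(tasks, (1, "read python-dev"))
heapq.heappush(tasks, (3, "sleep"))

# Popping yields items in ascending (priority) order.
ordered = [heapq.heappop(tasks) for _ in range(3)]
# ordered == [(1, 'read python-dev'), (2, 'write summary'), (3, 'sleep')]
```

Tuples compare element-by-element, so using (priority, task) pairs gets priority ordering for free.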
`Distutils using apply`__
Since Distutils must be kept backwards-compatible (as stated in `PEP 291`_), it still uses 'apply'. This raises a PendingDeprecationWarning which is normally silent unless you want all warnings raised.

.. __: http://mail.python.org/pipermail/python-dev/2003-May/035293.html
.. _PEP 291: http://www.python.org/peps/pep-0291.html

`How to test this?`__
Dummy files can be checked into Lib/test.

.. __: http://mail.python.org/pipermail/python-dev/2003-May/035318.html

`Windows installer request...`__
Someone wanted the default drive on the Windows installer to be your boot drive and not C. It has been fixed.

.. __: http://mail.python.org/pipermail/python-dev/2003-May/035319.html

`Election of Todd Miller as head of numpy team`__
What the 'subject' says.

.. __: http://mail.python.org/pipermail/python-dev/2003-May/035326.html

`Startup time`__
Guido noticed that although Python 2.3 is already faster than 2.2, its startup time is slower. It looks like it is from failing stat calls. Speeding this all up is still being worked on.

.. __: http://mail.python.org/pipermail/python-dev/2003-May/035359.html

`testing with and without pyc files present`__
Why does ``make test`` delete all .pyc and .pyo files before running the regression tests? To act as a large scale test of the marshaling code.

.. __: http://mail.python.org/pipermail/python-dev/2003-May/035362.html

`pyconfig.h not regenerated by "config.status --recheck"`__
``./config.status --recheck`` doesn't work too well.

.. __: http://mail.python.org/pipermail/python-dev/2003-May/035366.html

`Python Technical Lead, New York, NY - 80-85k`__
Wrong place for a job announcement.

.. __: http://mail.python.org/pipermail/python-dev/2003-May/035369.html

`RedHat 9 _random failure under -pg`__
gcc ain't perfect.

.. __: http://mail.python.org/pipermail/python-dev/2003-May/035386.html

`SF CVS offline`__
... but it came back up.

..
__: http://mail.python.org/pipermail/python-dev/2003-May/035398.html

`Microsoft speedup`__
It was noticed that turning on more aggressive inlining for VC6 sped up pystone by 2.5% while upping the executable size by 13%. Tim Peters noted that "A couple employers ago, we disabled all magical inlining options, because sometimes they made critical loops faster, and sometimes slower, and you couldn't guess which as the code changed".

.. __: http://mail.python.org/pipermail/python-dev/2003-May/035454.html

`Relying on ReST in the core?`__
Although docutils_ is not in the core yet, it is being used more and more. But is this safe? As long as it's kept conservative and not required anywhere, yes.

.. __: http://mail.python.org/pipermail/python-dev/2003-May/035465.html

`Make _strptime only time.strptime implementation?`__
As long as no one complains too loudly by 2.3b2, _strptime.strptime will become the exclusive implementation of time.strptime. _strptime.strptime also learned how to recognize UTC and GMT as timezones.

.. __: http://mail.python.org/pipermail/python-dev/2003-May/035481.html

`Building Python with .NET 2003 SDK`__
Logistix was nice enough to try to build Python on .NET 2003 and post notes on how he did it at http://www.cathoderaymission.net/~logistix/python/buildingPythonWithDotNet.html .

.. __: http://mail.python.org/pipermail/python-dev/2003-May/035485.html

`local import cost`__
Trying to find out how much doing imports in the local namespace costs compared to doing it at the global level.

.. __: http://mail.python.org/pipermail/python-dev/2003-May/035486.html

`Subclassing int?`__
This thread started `two summaries ago `__. Subclassing int to make it mutable just doesn't work.

.. __: http://mail.python.org/pipermail/python-dev/2003-May/035500.html

`patch 718286`__
The patch was applied.

..
__: http://mail.python.org/pipermail/python-dev/2003-May/035538.html

`Need some patches checked`__
Some patches needed to be cleared by more senior members of python-dev since they were being handled by the young newbie of the group. Jeremy Hylton also mentioned that a full-scale refactoring of urllib2 is needed and would allow the closure of some patches.

.. __: http://mail.python.org/pipermail/python-dev/2003-May/035540.html

`os.path.walk() lacks 'depth first' option`__
Splinter threads:

- `os.walk() silently ignores errors`__

This thread started in the `last summary`_. LookupError exists and has both IndexError and KeyError as subclasses. Rather handy when you don't care whether you are dealing with a list or dictionary but do care if what you are looking for doesn't exist. os.walk also gained an argument called onerror that takes a function that will be passed any exception raised by os.walk as it does its thing; previously os.walk ignored all errors.

.. __: http://mail.python.org/pipermail/python-dev/2003-May/035546.html
__ http://mail.python.org/pipermail/python-dev/2003-May/035574.html

`Random SF tracker ettiquete questions`__
Does python-dev care about dealing with RFEs? Sort of; it isn't a priority like patches and bugs, but cleaning them out once in a while doesn't hurt. Is it okay to assign a tracker item to yourself even if it is already assigned to another person? If the original person it was assigned to is not actively working on it, then yes. When should someone be put into the Misc/ACKS file? When they have done anything that required some amount of brain power (yes, this includes one-line patches).

.. __: http://mail.python.org/pipermail/python-dev/2003-May/035549.html

`codeop: small details (Q); commit priv request`__
Some issues with codeop were worked out and Samuele Pedroni got commit privileges.

..
__: http://mail.python.org/pipermail/python-dev/2003-May/035556.html

`Python 2.3b1 _XOPEN_SOURCE value from configure.in`__
Python.h should always be included in extension modules first, before any other header files.

.. __: http://mail.python.org/pipermail/python-dev/2003-May/035560.html

`Inplace multiply`__
Someone thought they had found a bug. Michael Hudson thought it was an old bug that was fixed.

.. __: http://mail.python.org/pipermail/python-dev/2003-May/035591.html

`sf.net/708007: expectlib.py telnetlib.py split`__
A request for people to look at http://www.python.org/sf/708007 .

.. __: http://mail.python.org/pipermail/python-dev/2003-May/035605.html

`Simple dicts`__
Tim Peters suggested that if someone wanted something to do they could try re-implementing dicts to use chaining instead of open addressing. It turns out Damien Morton (who did a ton of work trying to optimize Python's bytecode) is working on an implementation.

.. __: http://mail.python.org/pipermail/python-dev/2003-May/035625.html

`python/dist/src/Lib warnings.py,1.19,1.20`__
As part of the attempts to speed up startup time, the attempted elimination of the required import of the re module came up. This thread brought up the question as to whether it was desired to be able to pass a regexp as an argument for the -W command-line option for Python.

.. __: http://mail.python.org/pipermail/python-dev/2003-May/035616.html

`[PEP] += on return of function call result`__
You can't assign to the return value of a method call.

.. __: http://mail.python.org/pipermail/python-dev/2003-May/035640.html

`Vacation; Python 2.2.3 release.`__
Guido is going on vacation and won't be back until May 26. He would like Python 2.2.3 to be out shortly after he gets back, although if it comes out while he is gone he definitely won't complain.
=) You can get an anonymous CVS checkout of the 2.2 maintenance branch by executing ``cvs -d :pserver:anonymous@cvs.python.sourceforge.net:/cvsroot/python checkout -d -r release22-maint python`` and changing the <> note to be the directory you want to put your CVS copy into. .. __: http://mail.python.org/pipermail/python-dev/2003-May/035642.html `MS VC 7 offer`__ At `Python UK`_ Guido was offered free copies of `Visual C++ 2003`_ by the project lead of VC, Nick Hodapp, for key developers (a free copy of the compiler is available at http://www.msdn.microsoft.com/netframework/downloads/howtoget.aspx ). This instantly led to the discussion of whether Python's binary distribution for Windows should be moved off of VC 6 to 7. The biggest issue is that apparently passing FILE * values across library boundaries breaks code. The final decision seemed to be that Tim, Guido, and developers of major extensions should get free copies. Then an end date of when Python will be moved off of VC 6 and over to 7 will be decided. None of this will affect Python 2.3 . This thread was 102 emails long. I don't use Windows. This was painful. .. __: http://mail.python.org/pipermail/python-dev/2003-May/035375.html .. _Python UK: http://www.python-uk.org/ .. _Visual C++ 2003: http://msdn.microsoft.com/visualc/ From dberlin@dberlin.org Mon May 19 02:56:58 2003 From: dberlin@dberlin.org (Daniel Berlin) Date: Sun, 18 May 2003 21:56:58 -0400 Subject: Using emacs' unexec to speed Python startup (was Re: [Python-Dev] Startup time) In-Reply-To: <20030519012212.GA10317@unpythonic.net> Message-ID: <2D129034-899D-11D7-BB2B-000A95A34564@dberlin.org> On Sunday, May 18, 2003, at 09:22 PM, Jeff Epler wrote: > On Fri, May 16, 2003 at 03:09:31PM -0400, Barry Warsaw wrote: >> Skip, you're going about this all wrong. We already have the >> technology >> to start Python up blazingly fast. All you have to do is port >> XEmacs's unexec code. 
Then you load up Python with all the modules >> you >> think you're going to need, unexec it, then the next time it starts up >> like lightning. Disk space is cheap! > > I gave it a try, starting with 2.3b1 and using FSF Emacs 21.3's > unexelf.c. XEmacs has a portable undumper, IIRC. > An unexec'd binary loads faster than 'python -S -c pass', and seems to > work properly with two exceptions and a few limitations. > > The only change to Python is in main(): I use mallopt() to force all > allocations to go through brk() instead of through mmap(), because > unexec > doesn't support mmap'd memory. I also used Modules/Setup.local to make > some normally-shared modules not shared (for the same reason). > > dump.py loads the requested modules (- forces the module to > *not* > be found) and then calls unexec(), producing a new binary with the > given > name. > > $ time ./python -S -c pass # best 'real' of 5 runs > real 0m0.054s > user 0m0.040s > sys 0m0.010s > $ time ./python -c 'import cgi' # best 'real' of 5 runs > real 0m0.127s > user 0m0.110s > sys 0m0.010s > $ strace -e open ./python -c 'import cgi' 2>&1 | grep -v ENOENT | wc -l > 88 > $ ./python dump.py cgipython -_ssl cgi > $ time ./cgipython -c 'import cgi' # best 'real' of 5 runs > real 0m0.039s > user 0m0.020s > sys 0m0.020s > $ strace -e open ./cgipython -c 'import cgi' 2>&1 | grep -v ENOENT | > wc -l > 9 > $ ./python dump.py dython > -rwxrwxr-x 1 jepler jepler 4983713 May 18 19:42 cgipython > -rwxrwxr-x 1 jepler jepler 3603737 May 18 19:39 python > -rwxrwxr-x 1 jepler jepler 4541345 May 18 19:55 dython > > (a minimal unexec'd python is about 90k bigger than the regular Python > binary) > > I'm running the test suite now .. it hangs in test_signal for some > reason. > test_thread seems to hang too, which may be related. (but > test_threading > completes?) > > $ ./dython Lib/test/regrtest.py -x test_signal -x test_thread > [...] > 225 tests OK. 
> 26 tests skipped: > test_aepack test_al test_bsddb3 test_bz2 test_cd test_cl > test_curses test_email_codecs test_gl test_imgfile > test_linuxaudiodev test_macfs test_macostools test_nis > test_normalization test_ossaudiodev test_pep277 test_plistlib > test_scriptpackages test_socket_ssl test_socketserver > test_sunaudiodev test_timeout test_urllibnet test_winreg > test_winsound > 1 skip unexpected on linux2: > test_bz2 > > Well, if it worked right it'd sure be interesting. OTOH, unexelf.c is > GPL'd and there's also the nightmare of different unex* for different > platforms. > > Like I said, xemacs has a "portable" undumper. --Dan From aahz@pythoncraft.com Mon May 19 02:58:22 2003 From: aahz@pythoncraft.com (Aahz) Date: Sun, 18 May 2003 21:58:22 -0400 Subject: [Python-Dev] Attribute lookup (was Re: python-dev Summary for 2003-05-01 through 2003-05-15) In-Reply-To: <3EC83590.1000306@ocf.berkeley.edu> References: <3EC83590.1000306@ocf.berkeley.edu> Message-ID: <20030519015822.GA10320@panix.com> [Normally I send my corrections to Brett privately, but since I'm taking a whack at attribute lookup, I figured this ought to be public.] On Sun, May 18, 2003, Brett C. wrote: > > The only thing I would like help with this summary is if someone knows > the attribute lookup order (instance, class, class descriptor, ...) off > the top of their heads, can you let me know? If not I can find it out > by going through the docs but I figure someone out there has to know it > by heart and any possible quirks (like whether descriptors take > precedence over non-descriptor attributes). This gets real tricky. For simple attributes of an instance, the order is instance, class/type, and base classes of the class/type (but *not* the metaclass). However, method resolution of the special methods goes straight to the class. 
Finally, if an attribute is found on the instance, a search goes through the hierarchy to see whether a set descriptor overrides (note specifically that it's a set descriptor; methods are implemented using get descriptors). I *think* I have this right, but I'm sure someone will correct me if I'm wrong. > LookupError exists and subclasses both IndexError and KeyError. > Rather handy when you don't care whether you are dealing with a list or > dictionary but do care if what you are looking for doesn't exist. > os.walk also gained a parameter argument called onerror that takes > a function that will be passed any exception raised by os.walk as it > does its thing; previously os.walk ignored all errors. "and has as subclasses" -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "In many ways, it's a dull language, borrowing solid old concepts from many other languages & styles: boring syntax, unsurprising semantics, few automatic coercions, etc etc. But that's one of the things I like about it." --Tim Peters on Python, 16 Sep 93 From greg@cosc.canterbury.ac.nz Mon May 19 03:05:16 2003 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Mon, 19 May 2003 14:05:16 +1200 (NZST) Subject: Unifying modules and classes? (Re: [Python-Dev] a strange case) In-Reply-To: <200305161345.25415.troy@gci.net> Message-ID: <200305190205.h4J25GJ27449@oma.cosc.canterbury.ac.nz> Troy Melhase : > Just last night, it occurred to me that modules could be made callable via > subclassing. "Why in the world would you want callable modules you ask?" This has given me a thought concerning the naming problem that arises when you have a module (e.g. socket) that exists mainly to hold a single class. What if there were some easy way to make the class and the module the same thing? 
I'm thinking about having an alternative filename suffix, such as ".cls", whose contents is treated as though it were inside a class statement, and then the resulting class is put into sys.modules as though it were a module. Not sure how you'd specify base classes -- maybe a special __bases__ class attribute or something. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From pje@telecommunity.com Mon May 19 03:18:43 2003 From: pje@telecommunity.com (Phillip J. Eby) Date: Sun, 18 May 2003 22:18:43 -0400 Subject: [Python-Dev] Attribute lookup (was Re: python-dev Summary for 2003-05-01 through 2003-05-15) In-Reply-To: <20030519015822.GA10320@panix.com> References: <3EC83590.1000306@ocf.berkeley.edu> <3EC83590.1000306@ocf.berkeley.edu> Message-ID: <5.1.0.14.0.20030518220635.01f09ce0@mail.telecommunity.com> At 09:58 PM 5/18/03 -0400, Aahz wrote: >[Normally I send my corrections to Brett privately, but since I'm taking >a whack at attribute lookup, I figured this ought to be public.] > >On Sun, May 18, 2003, Brett C. wrote: > > > > The only thing I would like help with this summary is if someone knows > > the attribute lookup order (instance, class, class descriptor, ...) off > > the top of their heads, can you let me know? If not I can find it out > > by going through the docs but I figure someone out there has to know it > > by heart and any possible quirks (like whether descriptors take > > precedence over non-descriptor attributes). > >This gets real tricky. For simple attributes of an instance, the order >is instance, class/type, and base classes of the class/type (but *not* >the metaclass). However, method resolution of the special methods goes >straight to the class. 
Finally, if an attribute is found on the >instance, a search goes through the hierarchy to see whether a set >descriptor overrides (note specifically that it's a set descriptor; >methods are implemented using get descriptors). > >I *think* I have this right, but I'm sure someone will correct me if I'm >wrong. Here's the algorithm in a bit more detail: 1. First, the class/type and its bases are searched, checking dictionaries only. 2. If the object found is a "data descriptor" (i.e. has a type with a non-null tp_descr_set pointer, which is closely akin to whether the descriptor has a '__set__' attribute), then the data descriptor's __get__ method is invoked. 3. If the object is not found, or not a data descriptor, the instance dictionary is checked. If the attribute isn't in the instance dictionary, then the descriptor's __get__ method is invoked (assuming a descriptor was found). 4. Invoke __getattr__ if present. (Note that replacing __getattribute__ *replaces* this entire algorithm.) Also note that special methods are *not* handled specially here. The behavior Aahz is referring to is that slots (e.g. tp_call) on new-style types do not retrieve an instance attribute; they are based purely on class-level data. So, although you *can* override the values in an instance, they have no effect on the class behavior. E.g.:

>>> class Foo(object):
...     def __call__(self,*args):
...         print "foo",args
...
>>> f=Foo()
>>> f.__call__ = 'spam'
>>> f.__call__
'spam'
>>> f()
foo ()
>>>

Notice that the behavior of the instance '__call__' attribute does not affect the class-level definition of '__call__'. To recast the algorithm as a precedence order: 1. Data descriptors (ones with tp_descr_set/__set__) found in the type __mro__ (note that this includes __slots__, property(), and custom descriptors) 2. Instance attributes found in ob.__dict__ 3. Non-data descriptors, such as methods, or any other object found in the type __mro__ under that name 4. 
__getattr__ From jepler@unpythonic.net Mon May 19 03:46:18 2003 From: jepler@unpythonic.net (Jeff Epler) Date: Sun, 18 May 2003 21:46:18 -0500 Subject: Using emacs' unexec to speed Python startup (was Re: [Python-Dev] Startup time) In-Reply-To: <20030519012212.GA10317@unpythonic.net> References: <1052927757.7258.38.camel@slothrop.zope.com> <3EC51B5C.2080307@lemburg.com> <16069.13640.892428.185711@montanaro.dyndns.org> <1053112171.2342.7.camel@barry> <20030519012212.GA10317@unpythonic.net> Message-ID: <20030519024618.GB10317@unpythonic.net> On Sun, May 18, 2003 at 08:22:14PM -0500, Jeff Epler wrote: > I'm running the test suite now .. it hangs in test_signal for some reason. > test_thread seems to hang too, which may be related. (but test_threading > completes?) If I make another change, to call PyOS_InitInterrupts just after Py_Initialize in Modules/main.c, these two tests pass. Py_Initialize believes it's already initialized so returns without doing anything. But unexec doesn't preserve signal handlers, so this must be re-done explicitly. Jeff From barry@wooz.org Mon May 19 04:28:08 2003 From: barry@wooz.org (Barry Warsaw) Date: Sun, 18 May 2003 23:28:08 -0400 Subject: Using emacs' unexec to speed Python startup (was Re: [Python-Dev] Startup time) In-Reply-To: <20030519024618.GB10317@unpythonic.net> Message-ID: On Sunday, May 18, 2003, at 10:46 PM, Jeff Epler wrote: > On Sun, May 18, 2003 at 08:22:14PM -0500, Jeff Epler wrote: >> I'm running the test suite now .. it hangs in test_signal for some >> reason. >> test_thread seems to hang too, which may be related. (but >> test_threading >> completes?) > > If I make another change, to call PyOS_InitInterrupts just after > Py_Initialize in Modules/main.c, these two tests pass. > > Py_Initialize believes it's already initialized so returns without > doing > anything. But unexec doesn't preserve signal handlers, so this must be > re-done explicitly. 
Y'know, I wrote that as a joke, and it's quite FAST that you've taken it and made it real. Very cool too, congrats! Since it looks like you implemented the meat of it as a module, I wonder if it couldn't be cleaned up (with the interrupt reset either pulled into the extension or exposed to Python) and added to Python 2.3? -Barry From barry@python.org Mon May 19 04:57:00 2003 From: barry@python.org (Barry Warsaw) Date: Sun, 18 May 2003 23:57:00 -0400 Subject: [Python-Dev] test_bsddb185 failing under OS X In-Reply-To: <16070.45480.640314.144944@montanaro.dyndns.org> Message-ID: Yee haw! All expected tests pass for me w/ Python 2.3cvs on OSX 10.2.6. Gonna try Python 2.2.3 next. -Barry From barry@python.org Mon May 19 08:04:02 2003 From: barry@python.org (Barry Warsaw) Date: Mon, 19 May 2003 03:04:02 -0400 Subject: [Python-Dev] test_bsddb185 failing under OS X In-Reply-To: Message-ID: <12146C5B-89C8-11D7-B165-003065EEFAC8@python.org> On Sunday, May 18, 2003, at 11:57 PM, Barry Warsaw wrote: > Yee haw! All expected tests pass for me w/ Python 2.3cvs on OSX > 10.2.6. > > Gonna try Python 2.2.3 next. Looks good. -Barry From dmorton@bitfurnace.com Mon May 19 09:32:32 2003 From: dmorton@bitfurnace.com (damien morton) Date: Mon, 19 May 2003 04:32:32 -0400 Subject: [Python-Dev] Simple dicts In-Reply-To: Message-ID: <000001c31de1$322a0e90$6401a8c0@damien> Well, I implemented the chained dict, and against stock 2.3b1 I am seeing about a 4-5% speedup on pystone, and about a 10-15% speedup on a simplistic largedict benchmark which inserts 200K strings, retrieves them once, and then removes them one at a time. Suggestions for more appropriate benchmarks are, as usual, always welcome. (raymond - if you have a suite of benchmarks specifically for dicts, I would love to have access to them). Move-to-front chains are implemented, and a benchmark that exercised skewed access patterns would be great. Memory usage is more than the current implementation, but is highly tunable. 
You can adjust the ratio of dictentry structs to first pointers. My simple largedict benchmark performed best with an 8:16 ratio, while the pystone benchmark performed best with a 6:16 ratio. Moving up to a 100:16 ratio on the largedict benchmark adversely affected performance by about 10%. It may pay off to schedule the sparsity ratio according to the size of the dict. Also, because performance and memory usage varies roughly linearly with sparsity, it may be a less dangerous candidate for being user settable. Where fail-fast characteristics are required, sparsity may be highly desirable. I need to address Tim's concerns about the poor hash function used for python integers, but I think this can be addressed easily enough. I would welcome some guidance about what hash functions need to be addressed though. Is it just integers? (there's a great article on integer hash functions at www.cris.com/~Ttwang/tech/inthash.htm) If anyone wants to try out the code, please download www.bitfurnace.com/python/dict.zip I'm still trying to get the above code to pass the regression tests. Most things go smoothly, but some tests throw out this kind of error: "unknown scope for self in test_len(103) in C:\Documents and Settings\Administrator\Desktop\python\Python-2.3b1\lib\test\test_builtin.py symbols: {} locals: {} globals: {} " Still trying to track down the source of this error. No idea why symbols, locals and globals would all be empty at this point though. Comments, suggestions, etc welcome. 
- Damien Morton From lkcl@samba-tng.org Mon May 19 10:08:11 2003 From: lkcl@samba-tng.org (Luke Kenneth Casson Leighton) Date: Mon, 19 May 2003 09:08:11 +0000 Subject: [Python-Dev] [PEP] += on return of function call result In-Reply-To: <20030517152137.GA25579@unpythonic.net> References: <20030402090726.GN1048@localhost> <20030515214417.GF3900@localhost> <20030516142451.GI6196@localhost> <20030517152137.GA25579@unpythonic.net> Message-ID: <20030519090811.GB737@localhost> hiya jeff, on radio 4 today there was a discussion about art - what makes people go "wow" instead of being shocked. seeing the byte code in front of my eyes isn't so much of a shock, more of a "wow" because i have at some point in my past actually _looked_ at the python sources stack machine, for investigating parallelising it (!!!!!) okay. how do i run the examples you list? dis.dis(f) gives an "unrecognised variablename dis". okay. let's give this a shot. Script started on Mon May 19 08:44:19 2003 lkcl@highfield:~$ python O Python 2.2.2 (#1, Jan 18 2003, 10:18:59) [GCC 3.2.2 20030109 (Debian prerelease)] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> import dis >>> def g(): return f(x) ... >>> dis.dis(g) 0 LOAD_GLOBAL 0 (f) 3 LOAD_GLOBAL 1 (x) 6 CALL_FUNCTION 1 9 RETURN_VALUE 10 LOAD_CONST 0 (None) 13 RETURN_VALUE >>> def g(): f(x) ... >>> dis.dis(g) 0 LOAD_GLOBAL 0 (f) 3 LOAD_GLOBAL 1 (x) 6 CALL_FUNCTION 1 9 POP_TOP 10 LOAD_CONST 0 (None) 13 RETURN_VALUE >>> lkcl@highfield:~$ exit Script done on Mon May 19 08:44:56 2003 right. the difference between these two is the POP_TOP. so, the return result is placed on the stack, from the call to f(x). so... if there's instead an f(x) += 1 instead of f(x), then the result is going to be pushed onto the top of the stack, followed by the += 1, followed at the end by a POP_TOP. if the result is used (e.g. assigned to a variable), x = f(x) += 1 then you don't do the POP_TOP. ... am i missing something? what am i missing? 
that it's not known what type of variable is returned, therefore you're not certain as to what type of STORE to use? Script started on Mon May 19 08:51:22 2003 lkcl@highfield:~$ python -O Python 2.2.2 (#1, Jan 18 2003, 10:18:59) [GCC 3.2.2 20030109 (Debian prerelease)] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> def f(): return 5 ... >>> def g(): ... x = f() + 1 ... return x ... >>> import dis >>> dis.dis(g) 0 LOAD_GLOBAL 0 (f) 3 CALL_FUNCTION 0 6 LOAD_CONST 1 (1) 9 BINARY_ADD 10 STORE_FAST 0 (x) 13 LOAD_FAST 0 (x) 16 RETURN_VALUE 17 LOAD_CONST 0 (None) 20 RETURN_VALUE >>> lkcl@highfield:~$ Script done on Mon May 19 08:52:40 2003 okay... soo.... you get an assignment into a variable... ... okay, i think i see what the problem is. because the return result _may_ not be used, you don't know what type of STORE to use? or, because there are optimisations added, it's not always possible to "pass down" the right kind of STORE_xxx to the previous stack level? i believe you may be thinking that this is more complex than it is. that's very patronising of me. scratch that. i believe this should not be complex :) "+=" itself is a function call with two arguments and a return result, where the return result is the first argument. it just _happens_ that that function call has been drastically optimised - with its RETURN_VALUE removed; STORE_xxx removed. more thought needed. i'll go look at some code. l. p.s. 10 and 13 in the 8:52:40am typescript above look like they could be optimised / removed. p.p.s. yes i _have_ written a stack-machine optimiser before. On Sat, May 17, 2003 at 10:21:39AM -0500, Jeff Epler wrote: > I think that looking at the generated bytecode is useful. 
> > # Running with 'python -O' > >>> def f(x): x += 1 > >>> dis.dis(f) > 0 LOAD_FAST 0 (x) > 3 LOAD_CONST 1 (1) > 6 INPLACE_ADD > 7 STORE_FAST 0 (x) *** > 10 LOAD_CONST 0 (None) > 13 RETURN_VALUE > >>> def g(x): x[0] += 1 > >>> dis.dis(g) > 0 LOAD_GLOBAL 0 (x) > 3 LOAD_CONST 1 (0) > 6 DUP_TOPX 2 > 9 BINARY_SUBSCR > 10 LOAD_CONST 2 (1) > 13 INPLACE_ADD > 14 ROT_THREE > 15 STORE_SUBSCR *** > 16 LOAD_CONST 0 (None) > 19 RETURN_VALUE > >>> def h(x): x.a += 1 > >>> dis.dis(h) > 0 LOAD_GLOBAL 0 (x) > 3 DUP_TOP > 4 LOAD_ATTR 1 (a) > 7 LOAD_CONST 1 (1) > 10 INPLACE_ADD > 11 ROT_TWO > 12 STORE_ATTR 1 (a) *** > 15 LOAD_CONST 0 (None) > 18 RETURN_VALUE > > In each case, there's a STORE step to the inplace statement. In the case of the proposed > def j(x): x() += 1 > what STORE instruction would you use? > > >>> [opname for opname in dis.opname if opname.startswith("STORE")] > ['STORE_SLICE+0', 'STORE_SLICE+1', 'STORE_SLICE+2', 'STORE_SLICE+3', > 'STORE_SUBSCR', 'STORE_NAME', 'STORE_ATTR', 'STORE_GLOBAL', 'STORE_FAST', > 'STORE_DEREF'] > > If you don't want one from the list, then you're looking at substantial > changes to Python.. (and STORE_DEREF probably doesn't do anything that's > relevant to this situation, though the name sure sounds promising, > doesn't it) > > Jeff -- -- expecting email to be received and understood is a bit like picking up the telephone and immediately dialing without checking for a dial-tone; speaking immediately without listening for either an answer or ring-tone; hanging up immediately and then expecting someone to call you (and to be able to call you). -- every day, people send out email expecting it to be received without being tampered with, read by other people, delayed or simply - without prejudice but lots of incompetence - destroyed. 
-- please therefore treat email more like you would a CB radio to communicate across the world (via relaying stations): ask and expect people to confirm receipt; send nothing that you don't mind everyone in the world knowing about... From Paul.Moore@atosorigin.com Mon May 19 10:36:27 2003 From: Paul.Moore@atosorigin.com (Moore, Paul) Date: Mon, 19 May 2003 10:36:27 +0100 Subject: [Python-Dev] Re: C new-style classes and GC Message-ID: <16E1010E4581B049ABC51D4975CEDB880113DB01@UKDCX001.uk.int.atosorigin.com> From: Jim Fulton [mailto:jim@zope.com] > You can read the documentation for it here: > http://www.python.org/dev/doc/devel/ext/defining-new-types.html Just looking at this, I note the "Note" at the top. The way this reads, it implies that details of how things used to work have been removed. I don't know if this is true, but I'd prefer if it wasn't. People upgrading their extensions would find the older information useful (actually, an "Upgrading from the older API" section would be even nicer, but that involves more work...) Having to refer to an older copy of the documentation (which they may not even have installed) could tip the balance between "let's keep up to date" and "if it works, don't fix it". Heck, I still have some code I wrote for the 1.4 API which still works. I've never got round to upgrading it, on the basis that someone might be using it with 1.5 still. 
From jim@zope.com Mon May 19 11:30:04 2003 From: jim@zope.com (Jim Fulton) Date: Mon, 19 May 2003 06:30:04 -0400 Subject: [Python-Dev] Re: C new-style classes and GC In-Reply-To: <16E1010E4581B049ABC51D4975CEDB880113DB01@UKDCX001.uk.int.atosorigin.com> References: <16E1010E4581B049ABC51D4975CEDB880113DB01@UKDCX001.uk.int.atosorigin.com> Message-ID: <3EC8B22C.2070109@zope.com> Moore, Paul wrote: > From: Jim Fulton [mailto:jim@zope.com] > >>You can read the documentation for it here: > > >>http://www.python.org/dev/doc/devel/ext/defining-new-types.html > > > Just looking at this, I note the "Note" at the top. The way > this reads, it implies that details of how things used to work > has been removed. I don't know if this is true, but I'd prefer > if it wasn't. The section has been rewritten. The examples are quite different than they used to be. There's no way to document the old and new ways together without: - Making this a lot more confusing, and - violating the "one way to do it" in Python rule. > People upgrading their extensions would find the older > information useful (actually, an "Upgrading from the older API" > section would be even nicer, but that involves more work...) > Having to refer to an older copy of the documentation (which they > may not even have installed) could tip the balance between "lets > keep up to date" and "if it works, don't fix it". In general, I'd say that if the old extensions aren't broke, don't fix them. If someone *is* going to go through the trouble to update them, then I think they can manage to get the old docs. Further, if you have written an old extension, you probably already know the old way to define types, so you don't need the old docs. > Heck, I still have some code I wrote for the 1.4 API which still > works. I've never got round to upgrading it, on the basis that > someone might be using it with 1.5 still. 
But when I do, I'd > dump pre-2.2 support, so *I* have no use for "older" documentation > except to find out what all that old code meant... :-) > > If the old information is still there, maybe it's just the tone > of the note that should be changed. The old information is not still there. I'm not gonna add it back, because it would make the document far more confusing. Jim -- Jim Fulton mailto:jim@zope.com Python Powered! CTO (703) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From dmorton@bitfurnace.com Mon May 19 12:29:56 2003 From: dmorton@bitfurnace.com (damien morton) Date: Mon, 19 May 2003 07:29:56 -0400 Subject: [Python-Dev] Simple dicts In-Reply-To: Message-ID: <000201c31df9$f9e39650$6401a8c0@damien> Not so simple after all, or maybe too simple. My 'simple' largedict test was simplistic and flawed, and a more thorough test shows a slowdown on large dicts. I was inserting, accessing, and deleting the keys without randomising the order, and once randomised, cache effects kicked in. The slowdown isn't too huge though. Further testing against small dicts shows a much larger slowdown. The 5% improvement in pystone results still stands, but I think the main reason for the improvement is that I had inlined some fail-fast tests into ceval.c. Oh well, back to the drawing board. From jepler@unpythonic.net Mon May 19 13:06:36 2003 From: jepler@unpythonic.net (Jeff Epler) Date: Mon, 19 May 2003 07:06:36 -0500 Subject: Using emacs' unexec to speed Python startup (was Re: [Python-Dev] Startup time) In-Reply-To: References: <20030519024618.GB10317@unpythonic.net> Message-ID: <20030519120633.GA12073@unpythonic.net> On Sun, May 18, 2003 at 11:28:08PM -0400, Barry Warsaw wrote: > Since it looks like you implemented the meat of it as a module, I > wonder if it couldn't be cleaned up (with the interrupt reset either > pulled in the extension or exposed to Python) and added to Python 2.3? 
First off, I sure doubt that this feature could be truly made "non-experimental" before 2.3 is released. There was one "strange bug" so far (the signal thing), though that was quickly solved (with another change to the core Python source code). Secondly, forcing all allocations to come from the heap instead of mmap'd space may hurt performance. Thirdly, the files implementing unexec itself, which come from fsf emacs, are covered by the GNU GPL, which I think makes them unsuitable for compiling into Python. (There's something called "dynodump" in Emacs that appears to apply to ELF binaries which bears this license: * This source code is a product of Sun Microsystems, Inc. and is provided * for unrestricted use provided that this legend is included on all tape * media and as a part of the software program in whole or part. Users * may copy or modify this source code without charge, but are not authorized * to license or distribute it to anyone else except as part of a product or * program developed by the user. I wish I understood what "except as part of a product or program developed by the user" meant--does that mean that Alice can't download Python then give it to Bob if it includes dynodump? After all, Alice didn't develop it, she simply downloaded it. 
The other dumpers in xemacs seem to be GPL, and I think that the "portable undump" mentioned by another poster is a placeholder for a project that isn't written yet: http://www.xemacs.org/Architecting-XEmacs/unexec.html) Fourthly, we'd have to duplicate whatever machinery chooses the correct unexec implementation for the platform you're running on---there are lots to choose from: unexaix.c unexconvex.c unexenix.c unexnext.c unexw32.c unexalpha.c unexec.c unexhp9k800.c unexsni.c unexapollo.c unexelf.c unexmips.c unexsunos4.c (Of course, it's well known that only elf and win32 matter in these modern times) I'd be excited to see "my work" in Python, though the fact of the matter is that I just tried this out because I was bored on a Sunday afternoon. Jeff From lkcl@samba-tng.org Mon May 19 13:53:17 2003 From: lkcl@samba-tng.org (Luke Kenneth Casson Leighton) Date: Mon, 19 May 2003 12:53:17 +0000 Subject: [Python-Dev] [PEP] += on return of function call result In-Reply-To: <20030517152137.GA25579@unpythonic.net> References: <20030402090726.GN1048@localhost> <20030515214417.GF3900@localhost> <20030516142451.GI6196@localhost> <20030517152137.GA25579@unpythonic.net> Message-ID: <20030519125317.GC737@localhost> jeff, Beat Bolli's code example: count[word] = count.get(word, 0) + 1 i think best illustrates what issue you are trying to raise. okay, we know there are two issues so let's give an example that removes one of those issues: count = {} count[word] = count.get(word, []) + ['hello'] the issue is that the difference between the above 'hello' example and this: count.get(word, []) += ['hello'] is that you don't know what STORE to use after the use of get() in the second example, but you do in the first example because it's explicitly set out. so, does this help illustrate what might be done? 
if it's possible to return a result and know what should be done with it, then surely it should be possible to return a result from a += "function" and know what should be done with it? l. On Sat, May 17, 2003 at 10:21:39AM -0500, Jeff Epler wrote: > I think that looking at the generated bytecode is useful. > > # Running with 'python -O' > >>> def f(x): x += 1 > >>> dis.dis(f) > 0 LOAD_FAST 0 (x) > 3 LOAD_CONST 1 (1) > 6 INPLACE_ADD > 7 STORE_FAST 0 (x) *** > 10 LOAD_CONST 0 (None) > 13 RETURN_VALUE > >>> def g(x): x[0] += 1 > >>> dis.dis(g) > 0 LOAD_GLOBAL 0 (x) > 3 LOAD_CONST 1 (0) > 6 DUP_TOPX 2 > 9 BINARY_SUBSCR > 10 LOAD_CONST 2 (1) > 13 INPLACE_ADD > 14 ROT_THREE > 15 STORE_SUBSCR *** > 16 LOAD_CONST 0 (None) > 19 RETURN_VALUE > >>> def h(x): x.a += 1 > >>> dis.dis(h) > 0 LOAD_GLOBAL 0 (x) > 3 DUP_TOP > 4 LOAD_ATTR 1 (a) > 7 LOAD_CONST 1 (1) > 10 INPLACE_ADD > 11 ROT_TWO > 12 STORE_ATTR 1 (a) *** > 15 LOAD_CONST 0 (None) > 18 RETURN_VALUE > > In each case, there's a STORE step to the inplace statement. In the case of the proposed > def j(x): x() += 1 > what STORE instruction would you use? > > >>> [opname for opname in dis.opname if opname.startswith("STORE")] > ['STORE_SLICE+0', 'STORE_SLICE+1', 'STORE_SLICE+2', 'STORE_SLICE+3', > 'STORE_SUBSCR', 'STORE_NAME', 'STORE_ATTR', 'STORE_GLOBAL', 'STORE_FAST', > 'STORE_DEREF'] > > If you don't want one from the list, then you're looking at substantial > changes to Python.. (and STORE_DEREF probably doesn't do anything that's > relevant to this situation, though the name sure sounds promising, > doesn't it) > > Jeff -- -- expecting email to be received and understood is a bit like picking up the telephone and immediately dialing without checking for a dial-tone; speaking immediately without listening for either an answer or ring-tone; hanging up immediately and then expecting someone to call you (and to be able to call you). 
-- every day, people send out email expecting it to be received without being tampered with, read by other people, delayed or simply - without prejudice but lots of incompetence - destroyed. -- please therefore treat email more like you would a CB radio to communicate across the world (via relaying stations): ask and expect people to confirm receipt; send nothing that you don't mind everyone in the world knowing about... From barry@python.org Mon May 19 14:09:59 2003 From: barry@python.org (Barry Warsaw) Date: Mon, 19 May 2003 09:09:59 -0400 Subject: Using emacs' unexec to speed Python startup (was Re: [Python-Dev] Startup time) In-Reply-To: <20030519120633.GA12073@unpythonic.net> Message-ID: <31A13D92-89FB-11D7-B165-003065EEFAC8@python.org> On Monday, May 19, 2003, at 08:06 AM, Jeff Epler wrote: > First off, I sure doubt that this feature could be truly made > "non-experimental" before 2.3 is released. There was one "strange > bug" so > far (the signal thing), though that was quickly solved (with another > change to the core Python source code). Yeah, I was just tired and rambling after a long weekend. :) Still, cool stuff! -Barry From skip@pobox.com Mon May 19 15:24:54 2003 From: skip@pobox.com (Skip Montanaro) Date: Mon, 19 May 2003 09:24:54 -0500 Subject: [Python-Dev] Simple dicts In-Reply-To: <000001c31de1$322a0e90$6401a8c0@damien> References: <000001c31de1$322a0e90$6401a8c0@damien> Message-ID: <16072.59702.946136.830167@montanaro.dyndns.org> damien> Suggestions for more appropriate benchmarks are, as usual, damien> always welcome. There's always Marc André Lemburg's pybench package. 
Skip From skip@pobox.com Mon May 19 15:40:21 2003 From: skip@pobox.com (Skip Montanaro) Date: Mon, 19 May 2003 09:40:21 -0500 Subject: [Python-Dev] Re: Using emacs' unexec to speed Python startup (was Re: [Python-Dev] Startup time) In-Reply-To: <31A13D92-89FB-11D7-B165-003065EEFAC8@python.org> References: <20030519120633.GA12073@unpythonic.net> <31A13D92-89FB-11D7-B165-003065EEFAC8@python.org> Message-ID: <16072.60629.593925.7052@montanaro.dyndns.org> >> First off, I sure doubt that this feature could be truly made >> "non-experimental" before 2.3 is released. There was one "strange >> bug" so far (the signal thing), though that was quickly solved (with >> another change to the core Python source code). Barry> Yeah, I was just tired and rambling after a long weekend. :) On the other hand, I think it would be nice to check it into the sandbox if it's not already there. If licensing is an issue, just include a README file which says, "Get thus-and-such from a recent Emacs (or XEmacs?) distribution." Skip From lkcl@samba-tng.org Mon May 19 15:57:55 2003 From: lkcl@samba-tng.org (Luke Kenneth Casson Leighton) Date: Mon, 19 May 2003 14:57:55 +0000 Subject: [Python-Dev] [debian build error] Message-ID: <20030519145755.GB25000@localhost> there is at present a problem with python2.2 on debian, unstable dist. there are dependency issues. gcc 3.3 is now the latest for unstable. gcc 3.3 contains a package libstdc++-5. python2.2 is compiled with gcc 3.2. installing the latest libstdc++-5, which is compiled with gcc 3.3, causes python2.2 to complain: /usr/lib/libgcc1_s.so.1 cannot find GCC_3.3 in libstdc++-5. i thought you should know. l. p.s. it's not the only program affected by the broken libstdc++-5. 
-- -- expecting email to be received and understood is a bit like picking up the telephone and immediately dialing without checking for a dial-tone; speaking immediately without listening for either an answer or ring-tone; hanging up immediately and then expecting someone to call you (and to be able to call you). -- every day, people send out email expecting it to be received without being tampered with, read by other people, delayed or simply - without prejudice but lots of incompetence - destroyed. -- please therefore treat email more like you would a CB radio to communicate across the world (via relaying stations): ask and expect people to confirm receipt; send nothing that you don't mind everyone in the world knowing about... From skip@pobox.com Mon May 19 16:16:50 2003 From: skip@pobox.com (Skip Montanaro) Date: Mon, 19 May 2003 10:16:50 -0500 Subject: [Python-Dev] [debian build error] In-Reply-To: <20030519145755.GB25000@localhost> References: <20030519145755.GB25000@localhost> Message-ID: <16072.62818.314237.459419@montanaro.dyndns.org> Luke> gcc 3.3 is now the latest for unstable. Luke> gcc 3.3 contains a package libstdc++-5. Luke> python2.2 is compiled with gcc 3.2. Luke> installing the latest libstdc++-5, which is compiled with gcc 3.3, Luke> causes python2.2 to complain: Luke> /usr/lib/libgcc1_s.so.1 cannot find GCC_3.3 in libstdc++-5. Is python2.2 compiled by you from source or is it a Debian-provided package? If it was provided by Debian I think they'll have to be the ones to solve the problem. Skip From Jack.Jansen@cwi.nl Mon May 19 16:20:09 2003 From: Jack.Jansen@cwi.nl (Jack Jansen) Date: Mon, 19 May 2003 17:20:09 +0200 Subject: [Python-Dev] Vacation; Python 2.2.3 release. In-Reply-To: <200305160032.h4G0WJx17890@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <6097E7A8-8A0D-11D7-9DD7-0030655234CE@cwi.nl> I seem to remember that there's one or two bugfixes assigned to me that I thought fairly important for 2.2.3 at the time. 
Unfortunately sf.net is down at the moment, so I can't check this, and I don't remember whether they were OSX-related (so they have to go into the main release) or OS9 only (so they needn't hold up the main release). I'll try to get around to these tomorrow. -- Jack Jansen, , http://www.cwi.nl/~jack If I can't dance I don't want to be part of your revolution -- Emma Goldman From jepler@unpythonic.net Mon May 19 16:24:01 2003 From: jepler@unpythonic.net (Jeff Epler) Date: Mon, 19 May 2003 10:24:01 -0500 Subject: Using emacs' unexec to speed Python startup (was Re: [Python-Dev] Startup time) In-Reply-To: <16072.60629.593925.7052@montanaro.dyndns.org> References: <20030519120633.GA12073@unpythonic.net> <31A13D92-89FB-11D7-B165-003065EEFAC8@python.org> <16072.60629.593925.7052@montanaro.dyndns.org> Message-ID: <20030519152359.GA13673@unpythonic.net> On Mon, May 19, 2003 at 09:40:21AM -0500, Skip Montanaro wrote: > On the other hand, I think it would be nice to check it into the sandbox if > it's not already there. If licensing is an issue, just include a README > file which says, "Get thus-and-such from a recent Emacs (or XEmacs?) > distribution." Sure, I think that could be a good idea. How should the changes to core python be included? Making 'import site' happen when loading a dumped binary required another change. I could easily produce a diff for them. Jeff From skip@pobox.com Mon May 19 16:39:33 2003 From: skip@pobox.com (Skip Montanaro) Date: Mon, 19 May 2003 10:39:33 -0500 Subject: [Python-Dev] Re: Using emacs' unexec to speed Python startup (was Re: [Python-Dev] Startup time) In-Reply-To: <20030519152359.GA13673@unpythonic.net> References: <20030519120633.GA12073@unpythonic.net> <31A13D92-89FB-11D7-B165-003065EEFAC8@python.org> <16072.60629.593925.7052@montanaro.dyndns.org> <20030519152359.GA13673@unpythonic.net> Message-ID: <16072.64181.575078.727768@montanaro.dyndns.org> Jeff> How should the changes to core python be included? 
Making 'import Jeff> site' happen when loading a dumped binary required another change. Jeff> I could easily produce a diff for them. For now a context diff will probably work. Slightly longer term, if the changes look promising but are somehow incompatible with other stuff (like your mallopt call) I think they should be conditionally compiled and an --enable-unexec flag added to configure. (I assume none of this stuff will work on Windows.) Skip From jepler@unpythonic.net Mon May 19 16:48:48 2003 From: jepler@unpythonic.net (Jeff Epler) Date: Mon, 19 May 2003 10:48:48 -0500 Subject: Using emacs' unexec to speed Python startup (was Re: [Python-Dev] Startup time) In-Reply-To: <16072.64181.575078.727768@montanaro.dyndns.org> References: <20030519120633.GA12073@unpythonic.net> <31A13D92-89FB-11D7-B165-003065EEFAC8@python.org> <16072.60629.593925.7052@montanaro.dyndns.org> <20030519152359.GA13673@unpythonic.net> <16072.64181.575078.727768@montanaro.dyndns.org> Message-ID: <20030519154848.GD13673@unpythonic.net> On Mon, May 19, 2003 at 10:39:33AM -0500, Skip Montanaro wrote: > I assume none of this stuff will work on Windows. there *is* a "unexnt.c" in xemacs, and "unexw32.c" in emacs. I don't have the ability to try them, but in theory they would work in the same way. jeff From pedronis@bluewin.ch Mon May 19 17:07:08 2003 From: pedronis@bluewin.ch (Samuele Pedroni) Date: Mon, 19 May 2003 18:07:08 +0200 Subject: [Python-Dev] [PEP] += on return of function call result In-Reply-To: <20030519125317.GC737@localhost> References: <20030517152137.GA25579@unpythonic.net> <20030402090726.GN1048@localhost> <20030515214417.GF3900@localhost> <20030516142451.GI6196@localhost> <20030517152137.GA25579@unpythonic.net> Message-ID: <5.2.1.1.0.20030519180035.0242bcd0@localhost> At 12:53 19.05.2003 +0000, Luke Kenneth Casson Leighton wrote: >jeff, > >beat bolli's code example: > > count[word] = count.get(word, 0) + 1 > >i think best illustrates what issue you are trying to raise. 
> >okay, we know there are two issues so let's give an example >that removes one of those issues: > > count = {} > > count[word] = count.get(word, []) + ['hello'] > >the issue is that the difference between the above 'hello' >example and this: > > count.get(word, []) += ['hello'] > >is that you don't know what STORE to use after the use of get() >in the second example, but you do in the first example because >it's explicity set out. > >so, does this help illustrate what might be done? > >if it's possible to return a result and know what should be done >with it, then surely it should be possible to return a result from >a += "function" and know what should be done with it? > >l. >>> def refiadd(r,v): # r+=v, r is a reference, not an lvalue ... if hasattr(r.__class__,'__iadd__'): ... r.__class__.__iadd__(r,v) ... else: ... raise ValueError,"non-sense" ... >>> greetings={} >>> refiadd(greetings.setdefault('susy',[]),['hello']) # greetings.setdefault('susy',[]) += ['hello'] >>> refiadd(greetings.setdefault('susy',[]),['!']) # greetings.setdefault('susy',[]) += ['!'] >>> greetings {'susy': ['hello', '!']} >>> refiadd(greetings.setdefault('betty',1),1) # greetings.setdefault('betty',1) += 1 Traceback (most recent call last): File "<stdin>", line 1, in ? File "<stdin>", line 5, in refiadd ValueError: non-sense regards. 
From lkcl@samba-tng.org Mon May 19 17:31:20 2003 From: lkcl@samba-tng.org (Luke Kenneth Casson Leighton) Date: Mon, 19 May 2003 16:31:20 +0000 Subject: [Python-Dev] [PEP] += on return of function call result In-Reply-To: <5.2.1.1.0.20030519180035.0242bcd0@localhost> References: <20030517152137.GA25579@unpythonic.net> <20030402090726.GN1048@localhost> <20030515214417.GF3900@localhost> <20030516142451.GI6196@localhost> <20030517152137.GA25579@unpythonic.net> <5.2.1.1.0.20030519180035.0242bcd0@localhost> Message-ID: <20030519163120.GB26355@localhost> On Mon, May 19, 2003 at 06:07:08PM +0200, Samuele Pedroni wrote: > >>> def refiadd(r,v): # r+=v, r is a reference, not a an lvalue > ... if hasattr(r.__class__,'__iadd__'): > ... r.__class__.__iadd__(r,v) > ... else: > ... raise ValueError,"non-sense" > ... you're a star - thank you! From lkcl@samba-tng.org Mon May 19 17:32:48 2003 From: lkcl@samba-tng.org (Luke Kenneth Casson Leighton) Date: Mon, 19 May 2003 16:32:48 +0000 Subject: [Python-Dev] [debian build error] In-Reply-To: <16072.62818.314237.459419@montanaro.dyndns.org> References: <20030519145755.GB25000@localhost> <16072.62818.314237.459419@montanaro.dyndns.org> Message-ID: <20030519163247.GD26355@localhost> On Mon, May 19, 2003 at 10:16:50AM -0500, Skip Montanaro wrote: > > Luke> gcc 3.3 is now the latest for unstable. > > Luke> gcc 3.3 contains a package libstdc++-5. > > Luke> python2.2 is compiled with gcc 3.2. > > Luke> installing the latest libstdc++-5, which is compiled with gcc 3.3, > Luke> causes python2.2 to complain: > > Luke> /usr/lib/libgcc1_s.so.1 cannot find GCC_3.3 in libstdc++-5. > > Is python2.2 compiled by you from source or is it a Debian-provided package? debian-provided. i've actually had to remove gcc altogether in order to solve the problem (!!!) l. 
From tismer@tismer.com Mon May 19 19:12:48 2003 From: tismer@tismer.com (Christian Tismer) Date: Mon, 19 May 2003 20:12:48 +0200 Subject: [Python-Dev] Need advice, maybe support In-Reply-To: <200305182104.h4IL4eY17830@pcp02138704pcs.reston01.va.comcast.net> References: <3EC579B4.9000303@tismer.com> <200305182104.h4IL4eY17830@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <3EC91EA0.5090105@tismer.com> Guido van Rossum wrote: [me, about how to add _nr cfunction versions in a compatible way] > I don't think we can just add an extra field to PyMethodDef, because > it would break binary incompatibility. Currently, in most cases, a > 3r party extension module compiled for an earlier Python version can > still be used with a later version. Because PyMethodDef is used as an > array, adding a field to it would break this. Bad news. I hoped you would break binary compatibility between major versions (like from 2.2 to 2.3), but well, now I also understand why there are so many flags in typeobjects :-) > I have less of a problem with extending PyTypeObject, it grows all the > time and the tp_flags bits tell you how large the one you've got is. > (I still have some problems with this, because things that are of no > use to the regular Python core developers tend to either confuse them, > or be broken on a regular basis.) For the typeobjects, I'm simply asking for reservation of a bit number. What I used is #ifdef STACKLESS #define Py_TPFLAGS_HAVE_CALL_NR (1L<<15) #else #define Py_TPFLAGS_HAVE_CALL_NR 0 #endif but I think nobody needs to know about this, and maybe it is better (requiring no change of Python) if I used a bit from the higher end (31) or such? > Maybe you could get away with defining an alternative structure for > PyMethodDef and having a flag in tp_flags say which it is; there are > plenty of unused bits and I don't mind reserving one for you. 
Then > you'd have to change all the code that *uses* tp_methods, but there > isn't much of that; in fact, the only place I see is in typeobject.c. The problem is that I need to give extra semantics to existing objects, which are PyCFunction objects. I think putting an extra bit into the type object doesn't help, unless I use a new type. But then I don't need the flag. An old extension module which is loaded into my Python will always use my PyCFunction, since this is always borrowed. > If this doesn't work for you, maybe you could somehow fold the two > implementation functions into one, and put something special in the > argument list to signal that the non-recursive version is wanted? > (Thinking aloud here -- I don't know exactly what the usage pattern of > the nr versions will be.) This is hard to do. I'm adding _nr versions to existing functions, and I don't want to break their parameter lists. Ok, what I did is rather efficient, quite a bit ugly of course, but binary compatible as much as possible. It required to steal some bits of ml_flags as a small integer, which are interpreted as "distance to my sibling". I'm extending the MethodDef arrays in a special way by just adding some extra records without name fields at the end of the array, which hold the _nr pointers. An initialization function initializes the small integer in ml_flags with the distance to this "sibling", and the nice thing about this is that it will never fail if not initialized: A distance of zero gives just the same record. So what I'm asking for in this case is a small number of bits of the ml_flags word which will not be used, otherwise. Do you think the number of bits in ml_flags might ever grow beyond 16, or should I just assume that I can safely abuse them? thanks a lot -- chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! 
Take a ride on Python's Johannes-Niemeyer-Weg 9a : *Starship* http://starship.python.net/ 14109 Berlin : PGP key -> http://wwwkeys.pgp.net/ work +49 30 89 09 53 34 home +49 30 802 86 56 pager +49 173 24 18 776 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? http://www.stackless.com/ From martin@v.loewis.de Mon May 19 21:10:45 2003 From: martin@v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=) Date: 19 May 2003 22:10:45 +0200 Subject: [Python-Dev] Need advice, maybe support In-Reply-To: <3EC91EA0.5090105@tismer.com> References: <3EC579B4.9000303@tismer.com> <200305182104.h4IL4eY17830@pcp02138704pcs.reston01.va.comcast.net> <3EC91EA0.5090105@tismer.com> Message-ID: Christian Tismer writes: > The problem is that I need to give extra semantics to existing > objects, which are PyCFunction objects. I think putting an extra > bit into the type object doesn't help, unless I use a new type. But > then I don't need the flag. An old extension module which is loaded > into my Python will always use my PyCFunction, since this is always > borrowed. I understand the concern is not about changing PyCFunction, but about changing PyMethodDef, which would get another field. I think you can avoid adding a field to PyMethodDef, by providing a PyMethodDefEx structure, which has the extra field, and is referred-to from (a new slot in) the type object. The slots in the type object that refer to PyMethodDefs would either get set to NULL, or initialized with a copy of the PyMethodDefEx with the extra field removed. Regards, Martin From guido@python.org Mon May 19 21:33:32 2003 From: guido@python.org (Guido van Rossum) Date: Mon, 19 May 2003 16:33:32 -0400 Subject: [Python-Dev] a strange case In-Reply-To: "Your message of Mon, 19 May 2003 00:19:18 +0200." 
<3EC806E6.3040204@livinglogic.de> References: <20030516202402.30333.72761.Mailman@mail.python.org> <200305161345.25415.troy@gci.net> <200305182042.h4IKgYA17778@pcp02138704pcs.reston01.va.comcast.net> <3EC806E6.3040204@livinglogic.de> Message-ID: <200305192033.h4JKXWe19538@pcp02138704pcs.reston01.va.comcast.net> > But reload() won't work for these pseudo modules (See > http://www.python.org/sf/701743). Reload() is a hack that doesn't really work except in the most simple cases. This isn't one of those. > What about the imp module? Yes, what about it? (I don't understand the remark.) --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Mon May 19 21:47:39 2003 From: guido@python.org (Guido van Rossum) Date: Mon, 19 May 2003 16:47:39 -0400 Subject: [Python-Dev] Need advice, maybe support In-Reply-To: "Your message of Mon, 19 May 2003 20:12:48 +0200." <3EC91EA0.5090105@tismer.com> References: <3EC579B4.9000303@tismer.com> <200305182104.h4IL4eY17830@pcp02138704pcs.reston01.va.comcast.net> <3EC91EA0.5090105@tismer.com> Message-ID: <200305192047.h4JKldW19641@pcp02138704pcs.reston01.va.comcast.net> > Guido van Rossum wrote: > > [me, about how to add _nr cfunction versions in a compatible way] > > > I don't think we can just add an extra field to PyMethodDef, because > > it would break binary incompatibility. Currently, in most cases, a > > 3r party extension module compiled for an earlier Python version can > > still be used with a later version. Because PyMethodDef is used as an > > array, adding a field to it would break this. > > Bad news. I hoped you would break binary compatibility between > major versions (like from 2.2 to 2.3), but well, now I also > understand why there are so many flags in typeobjects :-) > > > I have less of a problem with extending PyTypeObject, it grows all the > > time and the tp_flags bits tell you how large the one you've got is. 
> > (I still have some problems with this, because things that are of no > > use to the regular Python core developers tend to either confuse them, > > or be broken on a regular basis.) > > For the typeobjects, I'm simply asking for reservation > of a bit number. What I used is > > #ifdef STACKLESS > #define Py_TPFLAGS_HAVE_CALL_NR (1L<<15) > #else > #define Py_TPFLAGS_HAVE_CALL_NR 0 > #endif > > but I think nobody needs to know about this, and maybe > it is better (requiring no change of Python) if I used > a bit from the higer end (31) or such? > > > Maybe you could get away with defining an alternative structure for > > PyMethodDef and having a flag in tp_flags say which it is; there are > > plenty of unused bits and I don't mind reserving one for you. Then > > you'd have to change all the code that *uses* tp_methods, but there > > isn't much of that; in fact, the only place I see is in typeobject.c. > > The problem is that I need to give extra semantics to > existing objects, which are PyCFunction objects. > I think putting an extra bit into the type object > doesn't help, unless I use a new type. But then I don't > need the flag. > An old extension module which is loaded into my Python > will always use my PyCFunction, since this is always > borrowed. > > > If this doesn't work for you, maybe you could somehow fold the two > > implementation functions into one, and put something special in the > > argument list to signal that the non-recursive version is wanted? > > (Thinking aloud here -- I don't know exactly what the usage pattern of > > the nr versions will be.) > > This is hard to do. I'm adding _nr versions to existing > functions, and I don't want to break their parameter lists. > > > Ok, what I did is rather efficient, quite a bit ugly of > course, but binary compatible as much as possible. > It required to steal some bits of ml_flags as a small > integer, which are interpreted as "distance to my sibling". 
> I'm extending the MethodDef arrays in a special way > by just adding some extra records without name fields > at the end of the array, which hold the _nr pointers. > > An initialization functions initializes the small integer > in ml_flags with the distance to this "sibling", and > the nice thing about this is that it will never fail > if not initialized: > A distance of zero gives just the same record. > > So what I'm asking for in this case is a small number > of bits of the ml_flags word which will not be used, > otherwise. > > Do you think the number of bits in ml_flags might ever > grow beyond 16, or should I just assume that I can > safely abuse them? > > thanks a lot -- chris It's better to reserve bits explicitly. Can you submit a patch to SF that makes reservations of the bits you need? All they need is a definition of a symbol and a comment explaining what it is for; "reserved for Stackless" is fine. --Guido van Rossum (home page: http://www.python.org/~guido/) From z23byn95@earthlink.com Tue May 20 07:05:22 2003 From: z23byn95@earthlink.com (Gino Wilcox) Date: Tue, 20 May 03 06:05:22 GMT Subject: [Python-Dev] Rates? xb lxomwmk cyo Message-ID: This is a multi-part message in MIME format. --A1ADFE371. Content-Type: text/plain Content-Transfer-Encoding: quoted-printable Interest Rates are at their lowest point in 40 years! We help you find the best rate for your situation by matching your needs with hundreds of lenders! Home Improvement, Refinance, Second Mortgage, Home Equity Loans, and More! Even with less than perfect credit! This service is 100% FREE to home owners and new home buyers without any obligation. Just fill out a quick, simple form and jump-start your future plans today! 
From doko@cs.tu-berlin.de Mon May 19 22:13:05 2003 From: doko@cs.tu-berlin.de (Matthias Klose) Date: Mon, 19 May 2003 23:13:05 +0200 Subject: [Python-Dev] Vacation; Python 2.2.3 release. In-Reply-To: <1053050696.26479.35.camel@geddy> References: <200305160032.h4G0WJx17890@pcp02138704pcs.reston01.va.comcast.net> <1053050696.26479.35.camel@geddy> Message-ID: <16073.18657.581177.570701@gargle.gargle.HOWL> Barry Warsaw writes: > FWIW, I'm going to be around, and am fairly free during the US Memorial > Day weekend 24th - 26th. Can we shoot for getting a release out that > weekend? If we can code freeze by the 22nd, I can throw together a > release candidate on Friday (with Tim's help for Windows) and a final by > Monday. I'd like to see the following patches included, they are in HEAD and currently applied in the python2.2 Debian packages, so they got some testing. - Send anonymous password when using anonftp Lib/ftplib.py 1.62 1.63 See http://python.org/sf/497420 - robotparser.py fails on some URLs (including change of copyright from "Python 2.0 open source license"). See http://python.org/sf/499513 - make tkinter compatible with tk-8.4.2. See http://python.org/sf/707701 Matthias From tismer@tismer.com Mon May 19 22:20:18 2003 From: tismer@tismer.com (Christian Tismer) Date: Mon, 19 May 2003 23:20:18 +0200 Subject: [Python-Dev] Need advice, maybe support In-Reply-To: <200305192047.h4JKldW19641@pcp02138704pcs.reston01.va.comcast.net> References: <3EC579B4.9000303@tismer.com> <200305182104.h4IL4eY17830@pcp02138704pcs.reston01.va.comcast.net> <3EC91EA0.5090105@tismer.com> <200305192047.h4JKldW19641@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <3EC94A92.2040604@tismer.com> Guido van Rossum wrote: ...
>>Do you think the number of bits in ml_flags might ever >>grow beyond 16, or should I just assume that I can >>safely abuse them? >> >>thanks a lot -- chris > > > It's better to reserve bits explicitly. Can you submit a patch to SF > that makes reservations of the bits you need? All they need is a > definition of a symbol and a comment explaining what it is for; > "reserved for Stackless" is fine. Ok, what I'm asking for is: "please reserve one bit for me in tp->flags" (31 preferred) and "please reserve 8 bits for me in ml->flags" (24-31 preferred). The latter will also not degrade performance, since these bits shalt simply not be used, but if STACKLESS isn't defined, there is no need to mask these bits off. I also will name these fields in a way that makes it obvious for everybody that they better should not touch these. Iff you agree, I'm going to submit my patch now, and my thanks will follow you for the rest of the subset of our lives. :) sincerely -- chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Johannes-Niemeyer-Weg 9a : *Starship* http://starship.python.net/ 14109 Berlin : PGP key -> http://wwwkeys.pgp.net/ work +49 30 89 09 53 34 home +49 30 802 86 56 pager +49 173 24 18 776 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? http://www.stackless.com/ From cnetzer@mail.arc.nasa.gov Mon May 19 22:25:57 2003 From: cnetzer@mail.arc.nasa.gov (Chad Netzer) Date: 19 May 2003 14:25:57 -0700 Subject: [Python-Dev] Vacation; Python 2.2.3 release. 
In-Reply-To: <16073.18657.581177.570701@gargle.gargle.HOWL> References: <200305160032.h4G0WJx17890@pcp02138704pcs.reston01.va.comcast.net> <1053050696.26479.35.camel@geddy> <16073.18657.581177.570701@gargle.gargle.HOWL> Message-ID: <1053379556.533.74.camel@sayge.arc.nasa.gov> On Mon, 2003-05-19 at 14:13, Matthias Klose wrote: > I'd like to see the following patches included, they are in HEAD and > currently applied in the python2.2 Debian packages, so they got some > testing. > - make tkinter compatible with tk-8.4.2. > See http://python.org/sf/707701 Don't know about the others, but this one at least seems to have been applied. Chad From niemeyer@conectiva.com Mon May 19 22:28:08 2003 From: niemeyer@conectiva.com (Gustavo Niemeyer) Date: Mon, 19 May 2003 18:28:08 -0300 Subject: [Python-Dev] urllib2 proxy support broken? Message-ID: <20030519212807.GA29002@ibook.distro.conectiva> I've just tried to use the proxy support in urllib2, and was surprised by the fact that it seems to be broken, at least in 2.2 and 2.3. Can somebody please confirm that it's really broken, so that I can prepare a patch? 
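The shadowing problem described below can be modelled in miniature. The handler classes here are invented stand-ins, not urllib2's real ones; the point is only that when every handler in a chain claims http_open() and the first non-None result wins, whichever handler is added first shadows the rest:

```python
# Toy model of a handler chain: the first handler whose http_open()
# returns a non-None result handles the request.

class DirectHandler:        # stand-in for a default HTTP handler
    def http_open(self, req):
        return "direct:" + req

class ProxyHandler:         # stand-in for a custom proxy handler
    def http_open(self, req):
        return "proxied:" + req

def open_with(handlers, req):
    for h in handlers:
        result = h.http_open(req)
        if result is not None:
            return result

url = "http://www.python.org/"

# Defaults installed before customs: the proxy handler never runs.
assert open_with([DirectHandler(), ProxyHandler()], url) == "direct:" + url

# Customs first (what the unobvious workaround achieves): proxy wins.
assert open_with([ProxyHandler(), DirectHandler()], url) == "proxied:" + url
```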
If I understood it correctly, that's how the proxy support is supposed to work: import urllib2 proxy_support = urllib2.ProxyHandler({"http":"http://ahad-haam:3128"}) opener = urllib2.build_opener(proxy_support) urllib2.install_opener(opener) f = urllib2.urlopen('http://www.python.org/') OTOH, code in build_opener() does this: # Remove default handler if a custom handler was provided for klass in default_classes: for check in handlers: if inspect.isclass(check): if issubclass(check, klass): skip.append(klass) elif isinstance(check, klass): skip.append(klass) for klass in skip: default_classes.remove(klass) # Instantiate default handler and append them for klass in default_classes: opener.add_handler(klass()) # Instantiate custom handler and append them for h in handlers: if inspect.isclass(h): h = h() opener.add_handler(h) Notice that default handlers are added *before* custom handlers, so HTTPHandler.http_open() ends up being called before ProxyHandler.http_open(), and the latter doesn't work. To make the first snippet work, one would have to use the unobvious version: import urllib2 proxy_support = urllib2.ProxyHandler({"http":"http://ahad-haam:3128"}) http_support = urllib2.HTTPHandler() opener = urllib2.build_opener(proxy_support, http_support) urllib2.install_opener(opener) f = urllib2.urlopen('http://www.python.org/') Is this really broken, or perhaps it's a known "feature" which should be left as is to avoid side effects (and I should patch the documentation instead)? -- Gustavo Niemeyer [ 2AAC 7928 0FBF 0299 5EB5 60E2 2253 B29A 6664 3A0C ] From guido@python.org Mon May 19 22:36:24 2003 From: guido@python.org (Guido van Rossum) Date: Mon, 19 May 2003 17:36:24 -0400 Subject: [Python-Dev] Need advice, maybe support In-Reply-To: "Your message of Mon, 19 May 2003 23:20:18 +0200."
<3EC94A92.2040604@tismer.com> References: <3EC579B4.9000303@tismer.com> <200305182104.h4IL4eY17830@pcp02138704pcs.reston01.va.comcast.net> <3EC91EA0.5090105@tismer.com> <200305192047.h4JKldW19641@pcp02138704pcs.reston01.va.comcast.net> <3EC94A92.2040604@tismer.com> Message-ID: <200305192136.h4JLaOX20032@pcp02138704pcs.reston01.va.comcast.net> > > It's better to reserve bits explicitly. Can you submit a patch to SF > > that makes reservations of the bits you need? All they need is a > > definition of a symbol and a comment explaining what it is for; > > "reserved for Stackless" is fine. > > Ok, what I'm asking for is: > "please reserve one bit for me in tp->flags" (31 preferred) and > "please reserve 8 bits for me in ml->flags" (24-31 preferred). > The latter will also not degrade performance, since > these bits shalt simply not be used, but if STACKLESS isn't > defined, there is no need to mask these bits off. > I also will name these fields in a way that makes it obvious > for everybody that they better should not touch these. > > Iff you agree, I'm going to submit my patch now, and my thanks > will follow you for the rest of the subset of our lives. :) +1 --Guido van Rossum (home page: http://www.python.org/~guido/) From doko@cs.tu-berlin.de Mon May 19 22:31:47 2003 From: doko@cs.tu-berlin.de (Matthias Klose) Date: Mon, 19 May 2003 23:31:47 +0200 Subject: [Python-Dev] Vacation; Python 2.2.3 release. In-Reply-To: <16073.18657.581177.570701@gargle.gargle.HOWL> References: <200305160032.h4G0WJx17890@pcp02138704pcs.reston01.va.comcast.net> <1053050696.26479.35.camel@geddy> <16073.18657.581177.570701@gargle.gargle.HOWL> Message-ID: <16073.19779.115820.624940@gargle.gargle.HOWL> Matthias Klose writes: > Barry Warsaw writes: > > FWIW, I'm going to be around, and am fairly free during the US Memorial > > Day weekend 24th - 26th. Can we shoot for getting a release out that > > weekend? 
If we can code freeze by the 22nd, I can throw together a > > release candidate on Friday (with Tim's help for Windows) and a final by > > Monday. > > I'd like to see the following patches included, they are in HEAD and > currently applied in the python2.2 Debian packages, so they got some > testing. > - make tkinter compatible with tk-8.4.2. > See http://python.org/sf/707701 oops, sorry this one is already applied. From tismer@tismer.com Mon May 19 23:09:14 2003 From: tismer@tismer.com (Christian Tismer) Date: Tue, 20 May 2003 00:09:14 +0200 Subject: [Python-Dev] Need advice, maybe support In-Reply-To: References: <3EC579B4.9000303@tismer.com> <200305182104.h4IL4eY17830@pcp02138704pcs.reston01.va.comcast.net> <3EC91EA0.5090105@tismer.com> Message-ID: <3EC9560A.9070602@tismer.com> Dear Martin, > Christian Tismer writes: > > >>The problem is that I need to give extra semantics to existing >>objects, which are PyCFunction objects. I think putting an extra >>bit into the type object doesn't help, unless I use a new type. But >>then I don't need the flag. An old extension module which is loaded >>into my Python will always use my PyCFunction, since this is always >>borrowed. > > > I understand the concern is not about changing PyCFunction, but about > changing PyMethodDef, which would get another field. Exactly. This is the static structure which is lingering around in many old extension modules, and to change it would require massive recompilation. > I think you can avoid adding a field to PyMethodDef, by providing a > PyMethodDefEx structure, which has the extra field, and is referred-to > from (a new slot in) the type object. The slots in the type object > that refer to PyMethodDefs would either get set to NULL, or > initialized with a copy of the PyMethodDefEx with the extra field > removed. Hey, that's really not bad!
Today, I've banged my head on my desk many times, trying to find out how to turn a clean, new approach into the least hackish surrogate, which is binary compatible. Well, I found some, not really pretty but working. It uses not an extra field, but extra records, which are used as sibling fields, past the end of the method table. I have to think about what implementation is more efficient, and uses less of my resources. Since Guido donated 8+1 bits to me, I have a big degree of freedom about how I will implement things in the future. Maybe I'd go ahead and see these bits checked in ASAP, and then re-think the design. Perhaps I will give back 8 bits, when I really don't need them, but I really don't know, yet. thanks anyway -- good idea - chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Johannes-Niemeyer-Weg 9a : *Starship* http://starship.python.net/ 14109 Berlin : PGP key -> http://wwwkeys.pgp.net/ work +49 30 89 09 53 34 home +49 30 802 86 56 pager +49 173 24 18 776 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? http://www.stackless.com/ From martin@v.loewis.de Mon May 19 23:33:25 2003 From: martin@v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 20 May 2003 00:33:25 +0200 Subject: [Python-Dev] Vacation; Python 2.2.3 release. In-Reply-To: <16073.18657.581177.570701@gargle.gargle.HOWL> References: <200305160032.h4G0WJx17890@pcp02138704pcs.reston01.va.comcast.net> <1053050696.26479.35.camel@geddy> <16073.18657.581177.570701@gargle.gargle.HOWL> Message-ID: <3EC95BB5.7000706@v.loewis.de> Matthias Klose wrote: > - make tkinter compatible with tk-8.4.2. > See http://python.org/sf/707701 As the comment indicates, the patch was already applied as 1.160.10.3. Is anything needed beyond that? 
> - Send anonymous password when using anonftp > Lib/ftplib.py 1.62 1.63 > See http://python.org/sf/497420 > > - robotparser.py fails on some URLs (including change of copyright > from "Python 2.0 open source license"). > See http://python.org/sf/499513 I will look into those two. Regards, Martin From cgw@alum.mit.edu Mon May 19 23:55:09 2003 From: cgw@alum.mit.edu (Charles G Waldman) Date: Mon, 19 May 2003 17:55:09 -0500 Subject: [Python-Dev] portable undumper in xemacs In-Reply-To: <20030519160007.6607.29714.Mailman@mail.python.org> References: <20030519160007.6607.29714.Mailman@mail.python.org> Message-ID: <16073.24781.473766.414482@nyx.dyndns.org> > develop it, she simply downloaded it. The other dumpers in xemacs > seem to be GPL, and I think that the "portable undump" mentioned by > another poster is a placeholder for a project that isn't written yet: > http://www.xemacs.org/Architecting-XEmacs/unexec.html) I'm pretty sure that the "Architecting-XEmacs" page is out of date, and the "portable undump" is a reality. Grab current xemacs sources and try doing "./configure --with-pdump" From tim@zope.com Tue May 20 00:26:24 2003 From: tim@zope.com (Tim Peters) Date: Mon, 19 May 2003 19:26:24 -0400 Subject: [Python-Dev] Re: C new-style classes and GC In-Reply-To: <16E1010E4581B049ABC51D4975CEDB880113DB01@UKDCX001.uk.int.atosorigin.com> Message-ID: [Moore, Paul, on http://www.python.org/dev/doc/devel/ext/defining-new-types.html ] > Just looking at this, I note the "Note" at the top. The way > this reads, it implies that details of how things used to work > has been removed. I don't know if this is true, but I'd prefer > if it wasn't. > > People upgrading their extensions would find the older > information useful I'm not sure how. If their extensions work now, there's a very high degree of compatibility, and they should continue to work. 
If they want to make life simpler by exploiting new API features, then they need the new docs, and the old docs say nothing useful about that (since they were written before the newer API gimmicks were even ideas). > (actually, an "Upgrading from the older API" section would be even > nicer, but that involves more work...) Except that the old API still functions. Even major abusers like ExtensionClass still work under 2.3. There's one sometimes-expressed need that isn't being met: people who need their extensions to run under many versions of Python. The canonical examples of extensions are in the Python core, and of course those only need to run with the current Python release, so staring at them doesn't yield any clues. I'm not sure we (the developers) give it much thought, either (e.g., I know I don't -- the # of things I can worry about at once decreases as I grow older <0.3 wink>). Micheal Hudson made a nice start in that direction, with 2.3's Misc/pymemcompat.h If you write your code to 2.3's simpler memory API, and #include that file, it will translate 2.3's spellings (via macros) into older spellings back through 1.5.2, keying off PY_VERSION_HEX to choose the right renamings. Jim is doing something related by hand in these docs, via the unnecessary #ifndef PyMODINIT_FUNC /* declarations for DLL import/export */ #define PyMODINIT_FUNC void #endif blocks. That is, PyMODINIT_FUNC is defined (via Python.h) in 2.3, so the development docs shouldn't encourage pretending it may not be. It would be a good idea to add suitable redefinitions of PyMODINIT_FUNC to pymemcompat.h too, but whether someone will volunteer to do so is an open question. > Having to refer to an older copy of the documentation (which they > may not even have installed) could tip the balance between "lets > keep up to date" and "if it works, don't fix it". > > Heck, I still have some code I wrote for the 1.4 API which still > works. It probably still does. 
> I've never got round to upgrading it, on the basis that someone might > be using it with 1.5 still. But when I do, I'd dump pre-2.2 support, so > *I* have no use for "older" documentation except to find out what all > that old code meant... :-) What remains unclear is what good the older documentation would do anyone. You're going to migrate or you're not. If you don't, you don't need the new docs; if you do, you don't need the old docs; it's those who want to support multiple Pythons simultaneously who need to know everything, and they really need more help than throwing all releases' docs into one giant pile. From dberlin@dberlin.org Tue May 20 04:04:35 2003 From: dberlin@dberlin.org (Daniel Berlin) Date: Mon, 19 May 2003 23:04:35 -0400 Subject: Using emacs' unexec to speed Python startup (was Re: [Python-Dev] Startup time) In-Reply-To: <20030519120633.GA12073@unpythonic.net> Message-ID: On Monday, May 19, 2003, at 08:06 AM, Jeff Epler wrote: > On Sun, May 18, 2003 at 11:28:08PM -0400, Barry Warsaw wrote: >> Since it looks like you implemented the meat of it as a module, I >> wonder if it couldn't be cleaned up (with the interrupt reset either >> pulled in the extension or exposed to Python) and added to Python 2.3? > > First off, I sure doubt that this feature could be truly made > "non-experimental" before 2.3 is released. There was one "strange > bug" so > far (the signal thing), though that was quickly solved (with another > change to the core Python source code). > > Secondly, forcing all allocations to come from the heap instead of > mmap'd > space may hurt performance. > > Thirdly, the files implementing unexec itself, which come from fsf > emacs, > are covered by the GNU GPL, which I think makes them unsuitable for > compiling into Python. (There's something called "dynodump" in Emacs > that > appears to apply to ELF binaries which bears this license: > * This source code is a product of Sun Microsystems, Inc.
and is > provided > * for unrestricted use provided that this legend is included on all > tape > * media and as a part of the software program in whole or part. Users > * may copy or modify this source code without charge, but are not > authorized > * to license or distribute it to anyone else except as part of a > product or > * program developed by the user. > I wish I understood what "except as part of a product or program > developed > by the user" meant--does that mean that Alice can't download Python > then give it to Bob if it includes dynodump? After all, Alice didn't > develop it, she simply downloaded it. The other dumpers in xemacs > seem to be GPL, and I think that the "portable undump" mentioned by > another poster is a placeholder for a project that isn't written yet: > http://www.xemacs.org/Architecting-XEmacs/unexec.html) It was written and is on by default since 21.2 came out, the website is out of date. See http://www.xemacs.org/Releases/Public-21.2/projects/pdump.html It's probably too xemacs specific, however. The file you want is dumper.c. > > Fourthly, we'd have to duplicate whatever machinery chooses the correct > unexec implementation for the platform you're running on---there are > lots to > choose from: Only if you do undumping the same way. The portable dumper way was to not make an executable, instead putting it in a separate file, and storing it in a neutral format that was architected to make loading fast. It's still faster than loading byte-compiled files, since nothing needs to be executed as we are just recreating the in-memory representation. > unexaix.c unexconvex.c unexenix.c unexnext.c unexw32.c > unexalpha.c unexec.c unexhp9k800.c unexsni.c > unexapollo.c unexelf.c unexmips.c unexsunos4.c > (Of course, it's well known that only elf and win32 matter in these > modern > times) > > I'd be excited to see "my work" in Python, though the fact of the > matter > is that I just tried this out because I was bored on a Sunday > afternoon.
> > Jeff > > _______________________________________________ > Python-Dev mailing list > Python-Dev@python.org > http://mail.python.org/mailman/listinfo/python-dev From dberlin@dberlin.org Tue May 20 04:06:01 2003 From: dberlin@dberlin.org (Daniel Berlin) Date: Mon, 19 May 2003 23:06:01 -0400 Subject: Using emacs' unexec to speed Python startup (was Re: [Python-Dev] Startup time) In-Reply-To: <20030519154848.GD13673@unpythonic.net> Message-ID: On Monday, May 19, 2003, at 11:48 AM, Jeff Epler wrote: > On Mon, May 19, 2003 at 10:39:33AM -0500, Skip Montanaro wrote: >> I assume none of this stuff will work on Windows. > > there *is* a "unexnt.c" in xemacs, Unused. Xemacs uses the dumper.c portable dumper by default on NT nowadays. If you were to choose an unexec, you'd thus want the one from emacs, since it's presumably still maintained. > and "unexw32.c" in emacs. From dsilva@ccs.neu.edu Tue May 20 05:20:35 2003 From: dsilva@ccs.neu.edu (Daniel Silva) Date: Tue, 20 May 2003 00:20:35 -0400 (EDT) Subject: [Python-Dev] Python Run-Time System and Extensions Message-ID: [Note: I first sent this to Jeremy Hylton through his MIT e-mail address, but in case he no longer uses that one, I'm resending to python-dev and his Zope account.] Hello, My name is Daniel Silva and I'm working on a Python compiler that generates PLT Scheme code. The work is nearly done, except for large parts of the run-time system and support for python C extensions. Since PLT's platform is MzScheme, I need to connect the MzScheme foreign-function interface to the C Python foreign-function interface and vice-versa. MzScheme's FFI works with SchemeObject C data structures and Python's FFI works with PyObject, among others. We aim for source compatibility, not binary. To achieve this, we see two possibilities: provide our own Python.h and typedef PyObject as another name for SchemeObject, or marshall SchemeObject structures into PyObject structures.
If we were to pretend that SchemeObjects are PyObjects, we could have Scheme do most of the work, but we run into problems with C structure field access. Through this method, we can use the existing code of the Python runtime system that uses selectors -- which I heard you are responsible for (thank you!) -- and replace the implementation of selectors like PyString_Get_Size with calls to Scheme equivalents, such as scheme_string_get_size, which would not break code that uses selectors. This approach is problematic when we encounter C code that looks like my_py_obj->some_field. This obviously would be incompatible, as SchemeObjects do not have the same fields. Such a style is used in various parts of the Python runtime system, and we would have to re-implement all of those. That is a bit of a burden, but more worrisome is the possibility of third-party Python C extensions using this style -- those would not work with our system. The alternative is to marshall every SchemeObject into the PyObject data structure described in CPython's own headers. This method would make it possible for us to use both the Python runtime system (and automatically keep up with changes) and third-party extensions. However, once our objects are marshalled into PyObjects, any change made to the new target is not seen by the original SchemeObject, so we lose mutation. Without mutation, our interpreter is useless for virtually every Python program. We are ready to pick an option and run with it. Do you think one of those two holds better hope than the other, or do you see a third alternative? I am willing to provide the remaining selectors for the CPython project, or if they already exist, to write the necessary documentation to advocate their use to those writing extensions.
Regards, Daniel Silva From BPettersen@NAREX.com Tue May 20 07:25:26 2003 From: BPettersen@NAREX.com (Bjorn Pettersen) Date: Tue, 20 May 2003 00:25:26 -0600 Subject: [Python-Dev] Attribute lookup (was Re: python-dev Summary for 2003-05-01 through 2003-05-15) Message-ID: <60FB8BB7F0EFC7409B75EEEC13E20192022DE51A@admin56.narex.com> > From: Phillip J. Eby [mailto:pje@telecommunity.com] > > At 09:58 PM 5/18/03 -0400, Aahz wrote: > > [Normally I send my corrections to Brett privately, but > > since I'm taking a whack at attribute lookup, I figured > > this ought to be public.] > > > >On Sun, May 18, 2003, Brett C. wrote: > > > > > > The only thing I would like help with this summary is if > > > someone knows the attribute lookup order (instance, > > > class, class descriptor, ...) [...] > >This gets real tricky. For simple attributes of an > >instance, the order is instance, class/type, and base > >classes of the class/type (but *not* the metaclass). > >However, method resolution of the special methods goes > >straight to the class. Finally, if an attribute is found on the > >instance, a search goes through the hierarchy to see whether a set > >descriptor overrides (note specifically that it's a set descriptor; > >methods are implemented using get descriptors). > > > >I *think* I have this right, but I'm sure someone will > >correct me if I'm wrong. > > Here's the algorithm in a bit more detail: > > 1. First, the class/type and its bases are searched, checking > dictionaries only. > > 2. If the object found is a "data descriptor" (i.e. has a > type with a non-null tp_descr_set pointer, which is closely > akin to whether the descriptor has a '__set__' attribute), > then the data descriptor's __get__ method is invoked. > > 3. If the object is not found, or not a data descriptor, the > instance dictionary is checked.
If the attribute isn't in the > instance dictionary, then the descriptor's __get__ method is > invoked (assuming a descriptor was found). > > 4. Invoke __getattr__ if present. > > (Note that replacing __getattribute__ *replaces* this entire > algorithm.) > > Also note that special methods are *not* handled specially here. > The behavior Aahz is referring to is that slots (e.g. tp_call) on > new-style types do not retrieve an instance attribute; they are > based purely on class-level data. [...] Wouldn't that be explicitly specified class-level data, i.e. it circumvents the __getattr__ hook completely: >>> class C(object): ... def __getattr__(self, attr): ... if attr == '__len__': ... return lambda:42 ... >>> c = C() >>> len(c) Traceback (most recent call last): File "", line 1, in ? TypeError: len() of unsized object this makes it impossible to implement a __getattr__ anywhere that intercepts len(obj): >>> class meta(type): ... def __getattr__(self, attr): ... if attr == '__len__': ... return lambda:42 ... >>> class C(object): ... __metaclass__ = meta ... >>> C.__len__() 42 >>> c = C() >>> len(c) Traceback (most recent call last): File "", line 1, in ? TypeError: len() of unsized object >>> len(C) Traceback (most recent call last): File "", line 1, in ? TypeError: len() of unsized object The meta example would have to work to be able to create "true" proxy objects(?) Is this intended behaviour? -- bjorn From Paul.Moore@atosorigin.com Tue May 20 10:19:09 2003 From: Paul.Moore@atosorigin.com (Moore, Paul) Date: Tue, 20 May 2003 10:19:09 +0100 Subject: [Python-Dev] Re: C new-style classes and GC Message-ID: <16E1010E4581B049ABC51D4975CEDB880113DB17@UKDCX001.uk.int.atosorigin.com> From: Tim Peters [mailto:tim@zope.com] > What remains unclear is what good the older documentation > would do anyone. You're going to migrate or you're not.
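Stepping back to the attribute-lookup thread above, both halves of the algorithm Phillip quoted are easy to check directly: data descriptors beat the instance dictionary, non-data descriptors lose to it, and an instance attribute named like a special method is invisible to the corresponding slot, which is exactly what Bjorn ran into with len(). (A sketch on new-style classes; the class names are invented.)

```python
class DataDesc:                      # has __set__: a data descriptor
    def __get__(self, obj, objtype=None):
        return "from data descriptor"
    def __set__(self, obj, value):
        raise AttributeError("read-only")

class NonDataDesc:                   # no __set__: a non-data descriptor
    def __get__(self, obj, objtype=None):
        return "from non-data descriptor"

class C(object):
    d = DataDesc()
    n = NonDataDesc()

c = C()
c.__dict__["d"] = "from instance dict"
c.__dict__["n"] = "from instance dict"

assert c.d == "from data descriptor"   # step 2: data descriptor wins
assert c.n == "from instance dict"     # step 3: instance dict wins

# Special methods bypass the instance entirely: len() consults only
# the class, so an instance __len__ is ignored.
class L(object):
    pass

obj = L()
obj.__len__ = lambda: 42
try:
    len(obj)
except TypeError:
    pass                               # exactly Bjorn's observation
else:
    raise AssertionError("len() should not find the instance __len__")
```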
> If you don't, you don't need the new docs; if you do, you > don't need the old docs My thought was that if & when I ever go back to this code, the chance of me remembering what the old APIs do is pretty small. Upgrading therefore includes an element of reading the old docs to reverse engineer my original intent :-) But I take the point - this scenario is unlikely enough to be not worth worrying about. Thanks for your explanation, Paul. From flight@debian.org Tue May 20 10:59:25 2003 From: flight@debian.org (Gregor Hoffleit) Date: Tue, 20 May 2003 11:59:25 +0200 Subject: [Python-Dev] [debian build error] In-Reply-To: <20030519163247.GD26355@localhost> References: <20030519145755.GB25000@localhost> <16072.62818.314237.459419@montanaro.dyndns.org> <20030519163247.GD26355@localhost> Message-ID: <20030520095925.GB20760@hal.mediasupervision.de> * Luke Kenneth Casson Leighton [030519 18:39]: > On Mon, May 19, 2003 at 10:16:50AM -0500, Skip Montanaro wrote: > > > > Luke> gcc 3.3 is now the latest for unstable. > > > > Luke> gcc 3.3 contains a package libstdc++-5. > > > > Luke> python2.2 is compiled with gcc 3.2. > > > > Luke> installing the latest libstdc++-5, which is compiled with gcc 3.3, > > Luke> causes python2.2 to complain: > > > > Luke> /usr/lib/libgcc1_s.so.1 cannot find GCC_3.3 in libstdc++-5. > > > > Is python2.2 compiled by you from source or is it a Debian-provided package? > > debian-provided. i've actually had to remove gcc altogether in order > to solve the problem (!!!) Please report such issues to the Debian Bug Tracking System (http://bugs.debian.org). I'm not able to reproduce this problem when I "apt-get install -t unstable python2.2 gcc-3.3 g++-3.3". On my system, python2.2 is linked with /usr/lib/libstdc++.so.5, which is provided by the package libstdc++5, that has been built from the gcc-3.3 source indeed. And still python2.2 just works fine. The line with /usr/lib/libgcc1_s.so.1 looks dubious. 
This ought to be /lib/libgcc_s.so.1, which is provided by the libgcc1 package, which is also derived from the gcc-3.3 source. Can you please make sure that this is really the Debian python2.2 binary, and that you're indeed using /usr/lib/libgcc1_s.so.1 ? Then, please issue a bug report including information such as the header lines from starting python2.2, the revision numbers of the affected packages (at least python2.2, g++-3.3, libstdc++5 and libgcc1). Thanks, Gregor From mwh@python.net Tue May 20 12:04:05 2003 From: mwh@python.net (Michael Hudson) Date: Tue, 20 May 2003 12:04:05 +0100 Subject: [Python-Dev] Re: C new-style classes and GC In-Reply-To: ("Tim Peters"'s message of "Mon, 19 May 2003 19:26:24 -0400") References: Message-ID: <2mbrxxanh6.fsf@starship.python.net> "Tim Peters" writes: > Micheal Hudson made a nice start in that direction, with 2.3's Hey, even Tims can't spell my name right! > Misc/pymemcompat.h > > If you write your code to 2.3's simpler memory API, and #include that file, > it will translate 2.3's spellings (via macros) into older spellings back > through 1.5.2, keying off PY_VERSION_HEX to choose the right renamings. > > Jim is doing something related by hand in these docs, via the unnecessary > > #ifndef PyMODINIT_FUNC /* declarations for DLL import/export */ > #define PyMODINIT_FUNC void > #endif > > blocks. That is, PyMODINIT_FUNC is defined (via Python.h) in 2.3, so the > development docs shouldn't encourage pretending it may not be. It would be > a good idea to add suitable redefinitions of PyMODINIT_FUNC to pymemcompat.h > too, but whether someone will volunteer to do so is an open question. Well, I could do this in a minute, but (a) the file then becomes misnamed (perhaps pyapicompat.h ...) (b) I suspect some fraction of the value of pymemcompat.h is that it is short and has just-less-than abusive guidance on which memory API functions to use. Cheers, M. -- ARTHUR: Ford, you're turning into a penguin, stop it.
-- The Hitch-Hikers Guide to the Galaxy, Episode 2 From walter@livinglogic.de Tue May 20 12:51:16 2003 From: walter@livinglogic.de (=?ISO-8859-15?Q?Walter_D=F6rwald?=) Date: Tue, 20 May 2003 13:51:16 +0200 Subject: [Python-Dev] a strange case In-Reply-To: <200305192033.h4JKXWe19538@pcp02138704pcs.reston01.va.comcast.net> References: <20030516202402.30333.72761.Mailman@mail.python.org> <200305161345.25415.troy@gci.net> <200305182042.h4IKgYA17778@pcp02138704pcs.reston01.va.comcast.net> <3EC806E6.3040204@livinglogic.de> <200305192033.h4JKXWe19538@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <3ECA16B4.5070509@livinglogic.de> Guido van Rossum wrote: >>But reload() won't work for these pseudo modules (See >>http://www.python.org/sf/701743). > > Reload() is a hack that doesn't really work except in the most simple > cases. This isn't one of those. It could be made to work, if the code in a module had a way of knowing whether this import is the first one or not, and it had access to what was in sys.modules before the import mechanism replaces it with an empty module. >>What about the imp module? > > Yes, what about it? (I don't understand the remark.) Does the imp module work with modules that replace the module entry in sys.modules? (Code in PyImport_ExecCodeModuleEx() seems to indicate that it does.) 
Bye, Walter Dörwald From lkcl@samba-tng.org Tue May 20 15:12:20 2003 From: lkcl@samba-tng.org (Luke Kenneth Casson Leighton) Date: Tue, 20 May 2003 14:12:20 +0000 Subject: [Python-Dev] [debian build error] In-Reply-To: <20030520095925.GB20760@hal.mediasupervision.de> References: <20030519145755.GB25000@localhost> <16072.62818.314237.459419@montanaro.dyndns.org> <20030519163247.GD26355@localhost> <20030520095925.GB20760@hal.mediasupervision.de> Message-ID: <20030520141220.GI26355@localhost> On Tue, May 20, 2003 at 11:59:25AM +0200, Gregor Hoffleit wrote: > * Luke Kenneth Casson Leighton [030519 18:39]: > > On Mon, May 19, 2003 at 10:16:50AM -0500, Skip Montanaro wrote: > > > > > > Luke> gcc 3.3 is now the latest for unstable. > > > > > > Luke> gcc 3.3 contains a package libstdc++-5. > > > > > > Luke> python2.2 is compiled with gcc 3.2. > > > > > > Luke> installing the latest libstdc++-5, which is compiled with gcc 3.3, > > > Luke> causes python2.2 to complain: > > > > > > Luke> /usr/lib/libgcc1_s.so.1 cannot find GCC_3.3 in libstdc++-5. > > > > > > Is python2.2 compiled by you from source or is it a Debian-provided package? > > > > debian-provided. i've actually had to remove gcc altogether in order > > to solve the problem (!!!) > > Please report such issues to the Debian Bug Tracking System > (http://bugs.debian.org). done that: i was just endeavouring to catch the attention of the relevant people. > I'm not able to reproduce this problem when I "apt-get install -t > unstable python2.2 gcc-3.3 g++-3.3". try adding unstable to your /etc/apt/source.list and then doing an apt-get upgrade. > On my system, python2.2 is linked > with /usr/lib/libstdc++.so.5, which is provided by the package > libstdc++5, that has been built from the gcc-3.3 source indeed. And > still python2.2 just works fine. yes but python2.2 (python2.2-5 or 6) is built and linked with gcc 3.2 not gcc 3.3. 
by upgrading the libstdc++.so.5 to one that was built with gcc-3.3 you get the problem that occurs on my system. > The line with /usr/lib/libgcc1_s.so.1 looks dubious. This ought to be > /lib/libgcc_s.so.1, which is provided by the libgcc1 package, which is > also derived from the gcc-3.3 source. > Can you please make sure that this is really the Debian python2.2 > binary, and that you're indeed using /usr/lib/libgcc1_s.so.1 ? yes it is the debian python2.2 binary. and /usr/lib/libgcc1_s.so.1. i appear not to have /lib in my /etc/ld.so.conf i do _not_ know why not. ... it may be because i have upgraded from debian potato on cds repeatedly over a period of at least two years? > Then, please issue an bug report including information such as the > header lines from starting python2.2, the revision numbers of the > affected packages (at least python2.2, g++-3.3, libstdc++5 and libgcc1). i have to work on this as a production system. i spent several frantic hours coming up with a procedure to recover my system back to a useable state. unfortunately i cannot risk the time it might take up on having a broken system. if all programs built with gcc-3.2 (including python2.2 and update-menus and groff and minicom and a whole boat-load of others) are replaced with programs built with gcc-3.3 then the problem i experienced goes away. l. -- -- expecting email to be received and understood is a bit like picking up the telephone and immediately dialing without checking for a dial-tone; speaking immediately without listening for either an answer or ring-tone; hanging up immediately and then expecting someone to call you (and to be able to call you). -- every day, people send out email expecting it to be received without being tampered with, read by other people, delayed or simply - without prejudice but lots of incompetence - destroyed. 
-- please therefore treat email more like you would a CB radio to communicate across the world (via relaying stations): ask and expect people to confirm receipt; send nothing that you don't mind everyone in the world knowing about... From tismer@tismer.com Tue May 20 15:38:03 2003 From: tismer@tismer.com (Christian Tismer) Date: Tue, 20 May 2003 16:38:03 +0200 Subject: [Python-Dev] Need advice, maybe support In-Reply-To: <200305192136.h4JLaOX20032@pcp02138704pcs.reston01.va.comcast.net> References: <3EC579B4.9000303@tismer.com> <200305182104.h4IL4eY17830@pcp02138704pcs.reston01.va.comcast.net> <3EC91EA0.5090105@tismer.com> <200305192047.h4JKldW19641@pcp02138704pcs.reston01.va.comcast.net> <3EC94A92.2040604@tismer.com> <200305192136.h4JLaOX20032@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <3ECA3DCB.50306@tismer.com> Guido van Rossum wrote: >>>It's better to reserve bits explicitly. Can you submit a patch to SF >>>that makes reservations of the bits you need? All they need is a >>>definition of a symbol and a comment explaining what it is for; >>>"reserved for Stackless" is fine. Tismer: >>Ok, what I'm asking for is: >>"please reserve one bit for me in tp->flags" (31 preferred) and >>"please reserve 8 bits for me in ml->flags" (24-31 preferred). There is one second thought about this, but I'm not sure whether it is allowed to do so: Assuming that I *would* simply do add a field to PyMethodDef, and take care that all types coming from foreign binaries don't have that special type bit set, could I not simply create a new method table and replace it for that external type by just changing its method table pointer? I think traversing method tables is always an action that the core dll does. Or do I have to fear that an extension does special things to method tables at runtime? If that approach is trustworthy, I also could drop the request for these 8 bits. thanks - chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! 
Take a ride on Python's Johannes-Niemeyer-Weg 9a : *Starship* http://starship.python.net/ 14109 Berlin : PGP key -> http://wwwkeys.pgp.net/ work +49 30 89 09 53 34 home +49 30 802 86 56 pager +49 173 24 18 776 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? http://www.stackless.com/ From pje@telecommunity.com Tue May 20 17:19:43 2003 From: pje@telecommunity.com (Phillip J. Eby) Date: Tue, 20 May 2003 12:19:43 -0400 Subject: [Python-Dev] Attribute lookup (was Re: python-dev Summary for 2003-05-01 through 2003-05-15) In-Reply-To: <60FB8BB7F0EFC7409B75EEEC13E20192022DE51A@admin56.narex.com > Message-ID: <5.1.1.6.0.20030520121436.0311de20@telecommunity.com> At 12:25 AM 5/20/03 -0600, Bjorn Pettersen wrote: > > From: Phillip J. Eby [mailto:pje@telecommunity.com] > > > > 1. First, the class/type and its bases are searched, checking > > dictionaries only. > > > > 2. If the object found is a "data descriptor" (i.e. has a > > type with a non-null tp_descr_set pointer, which is closely > > akin to whether the descriptor has a '__set__' attribute), > > then the data descriptor's __get__ method is invoked. > > > > 3. If the object is not found, or not a data descriptor, the > > instance dictionary is checked. If the attribute isn't in the > > instance dictionary, then the descriptor's __get__ method is > > invoked (assuming a descriptor was found). > > > > 4. Invoke __getattr__ if present. > > > > (Note that replacing __getattribute__ *replaces* this entire > > algorithm.) > > > > Also note that special methods are *not* handled specially here. > > The behavior Aahz is referring to is that slots (e.g. tp_call) on > > new-style types do not retrieve an instance attribute; they are > > based purely on class-level data. >[...] > >Wouldn't that be explicitly specified class-level data, i.e. 
it circumvents the __getattr__ hook completely:

I was focusing on documenting the attribute lookup behavior, not the "special methods" behavior. :) My point was only that "special methods" aren't implemented via attribute lookup, so the attribute lookup rules don't apply.

>this makes it impossible to implement a __getattr__ anywhere that
>intercepts len(obj):
>
> >>> class meta(type):
> ...     def __getattr__(self, attr):
> ...         if attr == '__len__':
> ...             return lambda: 42
> ...
> >>> class C(object):
> ...     __metaclass__ = meta
> ...
> >>> C.__len__()
> 42
> >>> c = C()
> >>> len(c)
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
> TypeError: len() of unsized object
> >>> len(C)
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
> TypeError: len() of unsized object
>
>The meta example would have to work to be able to create "true" proxy
>objects(?)

You can always do this:

class C(object):
    def __len__(self):
        return self.getLength()

    def __getattr__(self, attr):
        if attr == 'getLength':
            return lambda: 42

if you really need to do that.

>Is this intended behaviour?

You'd have to ask Guido that.

From tim.one@comcast.net Tue May 20 19:13:53 2003
From: tim.one@comcast.net (Tim Peters)
Date: Tue, 20 May 2003 14:13:53 -0400
Subject: [Python-Dev] Re: C new-style classes and GC
In-Reply-To: <2mbrxxanh6.fsf@starship.python.net>
Message-ID:

[Tim]
>> Micheal Hudson made a nice start in that direction, with 2.3's

[Michael Hudson]
> Hey, even Tims can't spell my name right!

Are you sure it wasn't your parents who screwed up here? I have a flu, and am lucky to spell anything write these dayz. My apologies to you and your parents.

>> It would be a good idea to add suitable redefinitions of
>> PyMODINIT_FUNC to pymemcompat.h too, but whether someone will
>> volunteer to do so is an open question.

> Well, I could do this in a minute, but

Time's up.

> (a) the file then becomes misnamed (perhaps pyapicompat.h ...)

Sounds good to me.
> (b) I suspect some fraction of the value of pymemcompat.h is that it > is short and has just-less-than abusive guidance on which memory > API functions to use. A new pyapicompat.h could just #include the current pymemcompat.h and a new pywhatevercompat.h. I'm not sure how easy the latter would be. The new PyAPI_FUNC(type) PyAPI_DATA(type) PyMODINIT_FUNC have snaky platform-dependent expansions, and were introduced because the older spellings were approximately incomprehensibly smushed together. Since I don't know what to do offhand if I wanted to support multiple Pythons using the current API here, I have to guess most users won't either (for example, Jim's sample docs change the last one to plain void, which isn't always right); so if you do, I believe it would be a real help. From mwh@python.net Tue May 20 19:31:43 2003 From: mwh@python.net (Michael Hudson) Date: Tue, 20 May 2003 19:31:43 +0100 Subject: [Python-Dev] Re: C new-style classes and GC In-Reply-To: (Tim Peters's message of "Tue, 20 May 2003 14:13:53 -0400") References: Message-ID: <2mel2tpj00.fsf@starship.python.net> Tim Peters writes: > [Tim] >>> Micheal Hudson made a nice start in that direction, with 2.3's > > [Michael Hudson] >> Hey, even Tims can't spell my name right! > > Are you sure it wasn't your parents who screwed up here ? It would certainly be easier for the large fraction of the world who aren't called Michael if it was spelled like that, but it ain't. It is remarkable just how often people do that though. > I have a flu, and am lucky to spell anything write these dayz. My > apologies to you and your parents. Heh, well I'm taking enough drugs to cope with my wisdom teeth today you're lucky if I make sense never mind spell things right. >>> It would be a good idea to add suitable redefinitions of >>> PyMODINIT_FUNC to pymemcompat.h too, but whether someone will >>> volunteer to do so is an open question. > >> Well, I could do this in a minute, but > > Time's up. 
I was clearly being optimistic here :-/ >> (a) the file then becomes misnamed (perhaps pyapicompat.h ...) > > Sounds good to me. > >> (b) I suspect some fraction of the value of pymemcompat.h is that it >> is short and has just-less-than abusive guidance on which memory >> API functions to use. > > A new pyapicompat.h could just #include the current pymemcompat.h and a new > pywhatevercompat.h. I'm not sure how easy the latter would be. The new > > PyAPI_FUNC(type) > PyAPI_DATA(type) > PyMODINIT_FUNC > > have snaky platform-dependent expansions, and were introduced because the > older spellings were approximately incomprehensibly smushed together. Since > I don't know what to do offhand if I wanted to support multiple Pythons > using the current API here, I have to guess most users won't either (for > example, Jim's sample docs change the last one to plain void, which isn't > always right); so if you do, I believe it would be a real help. I thought the problem with DL_IMPORT/DL_EXPORT was that you wanted one when statically linking and the other when dynamically linking. But I could be wrong. pyapicompat.h could presumably import more or less verbatim the whole preprocessory mess that defines PyAPI_FUNC in Python today? AFAIK it doesn't depend on anything else from Python or autoconf or so on. Maybe. Cheers, M. -- NUTRIMAT: That drink was individually tailored to meet your personal requirements for nutrition and pleasure. ARTHUR: Ah. So I'm a masochist on a diet am I? 
-- The Hitch-Hikers Guide to the Galaxy, Episode 9

From mwh@python.net Tue May 20 19:46:42 2003
From: mwh@python.net (Michael Hudson)
Date: Tue, 20 May 2003 19:46:42 +0100
Subject: [Python-Dev] Re: C new-style classes and GC
In-Reply-To: <2mel2tpj00.fsf@starship.python.net> (Michael Hudson's message of "Tue, 20 May 2003 19:31:43 +0100")
References: <2mel2tpj00.fsf@starship.python.net>
Message-ID: <2mbrxxpib1.fsf@starship.python.net>

Michael Hudson writes:

> pyapicompat.h could presumably import more or less verbatim the
> whole preprocessory mess that defines PyAPI_FUNC in Python today?
> AFAIK it doesn't depend on anything else from Python or autoconf or
> so on. Maybe.

This is *still* too simplistic, but is probably the right idea. I'll try to have a look at it, but won't be disappointed if someone beats me to it.

Cheers,
M.

-- There are two kinds of large software systems: those that evolved from small systems and those that don't work. -- Seen on slashdot.org, then quoted by amk

From tim.one@comcast.net Tue May 20 20:05:47 2003
From: tim.one@comcast.net (Tim Peters)
Date: Tue, 20 May 2003 15:05:47 -0400
Subject: [Python-Dev] Re: C new-style classes and GC
In-Reply-To: <2mbrxxpib1.fsf@starship.python.net>
Message-ID:

[Michael Hudson]
>> pyapicompat.h could presumably import more or less verbatim the
>> whole preprocessory mess that defines PyAPI_FUNC in Python today?
>> AFAIK it doesn't depend on anything else from Python or autoconf or
>> so on. Maybe.

[Michael too]
> This is *still* too simplistic, but is probably the right idea. I'll
> try to have a look at it, but won't be disappointed if someone beats
> me to it.

I agree on all counts. A difficulty with preprocessor symbols is their very low "discoverability"; for example, the current maze takes as input symbols like Py_ENABLE_SHARED and HAVE_DECLSPEC_DLL, and it's rarely clear where all those may be defined, or why.
I got as far as noting that the current version of PC/pyconfig.h defines HAVE_DECLSPEC_DLL, and may define Py_ENABLE_SHARED if Py_NO_ENABLE_SHARED isn't defined, ..., and then the flu convinced me it's time for another nap.

From BPettersen@NAREX.com Tue May 20 20:28:18 2003
From: BPettersen@NAREX.com (Bjorn Pettersen)
Date: Tue, 20 May 2003 13:28:18 -0600
Subject: [Python-Dev] Attribute lookup (was Re: python-dev Summary for 2003-05-01 through 2003-05-15)
Message-ID: <60FB8BB7F0EFC7409B75EEEC13E20192022DE53F@admin56.narex.com>

> From: Phillip J. Eby [mailto:pje@telecommunity.com]

[attribute lookup...]

> > > Also note that special methods are *not* handled specially here.
> > > The behavior Aahz is referring to is that slots (e.g. tp_call) on
> > > new-style types do not retrieve an instance attribute; they are
> > > based purely on class-level data.
> >[...]
> >
> >Wouldn't that be explicitly specified class-level data, i.e. it
> >circumvents the __getattr__ hook completely:
>
> I was focusing on documenting the attribute lookup
> behavior, not the "special methods" behavior. :)

Fair enough :-)

> My point was only that "special methods" aren't implemented
> via attribute lookup, so the attribute lookup rules don't apply.

Very true, although I don't think I could find that in the documentation anywhere... RefMan 3.3 paragraph 1, last sentence "Except where mentioned, attempts to execute an operation raise an exception when no appropriate method is defined." comes close, but seems to be contradicted by the "__getattr__" documentation in 3.3.2.

[..implementing __len__ through __getattr__..]

> >The meta example would have to work to be able to create "true" proxy
> >objects(?)
>
> You can always do this:
>
> class C(object):
>     def __len__(self):
>         return self.getLength()
>
>     def __getattr__(self, attr):
>         if attr == 'getLength':
>             return lambda: 42
>
> if you really need to do that.

Well... no. E.g.
a general RPC proxy might not know what it needs to special case:

class MyProxy(object):
    def __init__(self, server, objID, credentials):
        self.__obj = someRPClib.connect(server, objID, credentials)
    def __getattr__(self, attr):
        def send(*args, **kw):
            self.__obj.remoteExec(attr, args, kw)
        return send

Do you mean defining "stub" methods for _all_ the special methods? (there are quite a few of them...)

> >Is this intended behavior?
>
> You'd have to ask Guido that. :-)

The reason I ask is that I'm trying to convert a compiler.ast graph into a .NET CodeDom graph, and the current behavior seemed unnecessarily restrictive...

-- bjorn

From guido@python.org Tue May 20 20:30:56 2003
From: guido@python.org (Guido van Rossum)
Date: Tue, 20 May 2003 15:30:56 -0400
Subject: [Python-Dev] Need advice, maybe support
In-Reply-To: "Your message of Tue, 20 May 2003 16:38:03 +0200." <3ECA3DCB.50306@tismer.com>
References: <3EC579B4.9000303@tismer.com> <200305182104.h4IL4eY17830@pcp02138704pcs.reston01.va.comcast.net> <3EC91EA0.5090105@tismer.com> <200305192047.h4JKldW19641@pcp02138704pcs.reston01.va.comcast.net> <3EC94A92.2040604@tismer.com> <200305192136.h4JLaOX20032@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <200305201931.h4KJUuT21506@pcp02138704pcs.reston01.va.comcast.net>

> There is one second thought about this, but I'm not sure
> whether it is allowed to do so:
>
> Assuming that I *would* simply do add a field to PyMethodDef,
> and take care that all types coming from foreign binaries
> don't have that special type bit set, could I not simply create
> a new method table and replace it for that external type
> by just changing its method table pointer?

Probably.

I just realize that there are two uses of PyMethodDef.

One is the "classic", where the type's tp_getattr[o] implementation calls Py_FindMethod. The other is the new style where the PyMethodDef array is in tp_methods, and is scanned once by PyType_Ready.
3rd party modules that have been around for a while are likely to use Py_FindMethod. With Py_FindMethod you don't have a convenient way to store the pointer to the converted table, so it may be better to simply check your bit in the first array element and then cast to a PyMethodDef or a PyMethodDefEx array based on what the bit says (you can safely assume that all elements of an array are the same size :-).

> I think traversing method tables is always an action that
> the core dll does. Or do I have to fear that an extension
> does special things to method tables at runtime?

I wouldn't lose sleep over that.

> If that approach is trustworthy, I also could drop
> the request for these 8 bits.

Sure. Ah, a bit in the type would work just as well, and Py_FindMethod *does* have access to the type.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From op73418@mail.telepac.pt Wed May 21 01:41:16 2003
From: op73418@mail.telepac.pt (Gonçalo Rodrigues)
Date: Wed, 21 May 2003 01:41:16 +0100
Subject: [Python-Dev] Descriptor API
Message-ID: <000501c31f31$b0c820b0$f3100dd5@violante>

I was doing some tricks with metaclasses and descriptors in Python 2.2 and stumbled on the following:

>>> class test(object):
...     a = property(lambda: 1)
...
>>> print test.a
<property object at 0x...>
>>> print test.a.__set__
<method-wrapper object at 0x...>
>>> print test.a.fset
None

What this means in practice is that if I want to test if a descriptor is read-only I have to have two tests: One for custom descriptors, checking that getting __set__ does not barf, and another for property, checking that fset returns None.

So, why doesn't getting __set__ raise AttributeError in the above case? Is this a bug? If it's not, it sure is a (minor) feature request from my part :-)

With my best regards,
G.
Rodrigues From tismer@tismer.com Wed May 21 01:50:40 2003 From: tismer@tismer.com (Christian Tismer) Date: Wed, 21 May 2003 02:50:40 +0200 Subject: [Python-Dev] Need advice, maybe support In-Reply-To: <200305201931.h4KJUuT21506@pcp02138704pcs.reston01.va.comcast.net> References: <3EC579B4.9000303@tismer.com> <200305182104.h4IL4eY17830@pcp02138704pcs.reston01.va.comcast.net> <3EC91EA0.5090105@tismer.com> <200305192047.h4JKldW19641@pcp02138704pcs.reston01.va.comcast.net> <3EC94A92.2040604@tismer.com> <200305192136.h4JLaOX20032@pcp02138704pcs.reston01.va.comcast.net> <3ECA3DCB.50306@tismer.com> <200305201931.h4KJUuT21506@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <3ECACD60.10503@tismer.com> Guido van Rossum wrote: >>There is one second thought about this, but I'm not sure >>whether it is allowed to do so: >> >>Assuming that I *would* simply do add a field to PyMethodDef, >>and take care that all types coming from foreign binaries >>don't have that special type bit set, could I not simply create >>a new method table and replace it for that external type >>by just changing its method table pointer? > > > Probably. Promising! Let's see... > I just realize that there are two uses of PyMethodDef. > > One is the "classic", where the type's tp_getattr[o] implementation > calls Py_FindMethod. Right. This one is under my control, since I have the type and so I have or don't have the bit. > The other is the new style where the PyMethodDef > array is in tp_methods, and is scanned once by PyType_Ready. Right, again. Now, under the hopeful assumption that every sensible extension module that has some types to publish also does this through its module dictionary, I would have the opportunity to cause PyType_Ready being called early enough to modify the method table, before any of its methods is used at all. > 3rd party modules that have been around for a while are likely to use > Py_FindMethod. 
With Py_FindMethod you don't have a convenient way to
> store the pointer to the converted table, so it may be better to
> simply check your bit in the first array element and then cast to a
> PyMethodDef or a PyMethodDefEx array based on what the bit says (you
> can safely assume that all elements of an array are the same size :-).

Hee hee, yeah. Of course, if there isn't a reliable way to intercept method table access before the first Py_FindMethod call, I could of course modify Py_FindMethod. For instance, a modified, new-style method table might be required to always start with a dummy entry, where the flags word is completely -1, to signal having been converted to new-style.

...

>>If that approach is trustworthy, I also could drop
>>the request for these 8 bits.
>
> Sure. Ah, a bit in the type would work just as well, and
> Py_FindMethod *does* have access to the type.

You think of the definition in methodobject.c, as it is

"""
/* Find a method in a single method list */
PyObject *
Py_FindMethod(PyMethodDef *methods, PyObject *self, char *name)
"""

, assuming that self always is not NULL, but representing a valid object with a type, and this type is already referring to the methods table? Except for module objects, this seems to be right. I've run Python against a lot of Python modules, but none seems to call Py_FindMethod with a self parameter of NULL. If that is true, then I can patch a small couple of C functions to check for the new bit, and if it's not there, re-create the method table in place. This is music to my ears.

But... Well, there is a drawback: I *do* need two bits, and I hope you will allow me to add this second bit, as well. The one, first bit, tells me if the source has been compiled with Stackless and its extension stuff. Nullo problemo. I can then in-place modify the method table in a compatible way, or leave it as it is, by default. But then, this isn't sufficient to set this bit then, like an "everything is fine, now" relief.
This is so, since this is *still* an old module, and while its type's method tables have been patched, the type is still not augmented by new slots, like the new tp_call_nr slots (and maybe a bazillion to come, soon).

The drawback is that I cannot simply replace the whole type object, since type objects are not represented as object pointers (like they are now, most of the time, in the dynamic heaptype case), but they are constant struct addresses, which the old C module might be referring to.

So, what I think I need is no longer 9 bits, but two of them: One that says "everything great from the beginning", and another one that says "well, ok so far, but this is still an old object".

I do think this is the complete story, now. Instead of requiring nine bits, I'm asking for two. But this is just *your* options; I also can live with one bit, but then I have to add a special, invalid method table entry that just serves for this purpose. In order to keep my source code hack to the minimum, I'd really like to ask for the two bits in the typeobject flags.

Thanks so much for being so supportive -- chris

-- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Johannes-Niemeyer-Weg 9a : *Starship* http://starship.python.net/ 14109 Berlin : PGP key -> http://wwwkeys.pgp.net/ work +49 30 89 09 53 34 home +49 30 802 86 56 pager +49 173 24 18 776 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? http://www.stackless.com/

From tim_one@email.msn.com Wed May 21 04:56:11 2003
From: tim_one@email.msn.com (Tim Peters)
Date: Tue, 20 May 2003 23:56:11 -0400
Subject: [Python-Dev] Simple dicts
In-Reply-To: <000001c31de1$322a0e90$6401a8c0@damien>
Message-ID:

[damien morton]
> ...
> I need to address Tim's concerns about the poor hash function used for
> python integers, but I think this can be addressed easily enough.

Caution: in the context of the current scheme, it's an excellent hash function.
No hash function could be cheaper to compute, and in the common case of dicts indexed by a contiguous range of integers, there are no collisions at all. Christian Tismer contributed the pathological case in the dictobject.c comments, but I don't know that any such case has been seen in real life; the current scheme does OK with it. > I would welcome some guidance about what hash functions need to be > addressed though. Is it just integers? Because, e.g., 42 == 42.0 == 42L, and objects that compare equal must have equal hashcodes, what we do for ints has to be duplicated for at least some floats and longs too, and more generally for user-defined numeric types that can call themselves equal to ints (for example, rationals). For this reason it may not be possible to change the hash code for integers (although it would be possible to scramble the incoming hash code when mapping to a table slot, which is effectively what the current scheme does but only when a primary collision occurs). The string hash code is regular for "consecutive" strings, too (like "ab1", "ab2", "ab3", ...). Instances of user-defined classes that don't define their own __hash__ effectively use the memory address as the hash code, and of course that's also very regular across objects at times. > (theres a great article on integer hash functions at > www.cris.com/~Ttwang/tech/inthash.htm) Cool! I hadn't seen that before -- thanks for the link. From pje@telecommunity.com Wed May 21 12:52:53 2003 From: pje@telecommunity.com (Phillip J. Eby) Date: Wed, 21 May 2003 07:52:53 -0400 Subject: [Python-Dev] Descriptor API In-Reply-To: <000501c31f31$b0c820b0$f3100dd5@violante> Message-ID: <5.1.0.14.0.20030521074949.01feb1d0@mail.telecommunity.com> At 01:41 AM 5/21/03 +0100, Gonçalo Rodrigues wrote: >So, why doesn't getting __set__ raise AttributeError in the above case? Because property() is a type. And that type has __get__ and __set__ methods. >Is this a bug? No. 
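The two checks discussed in this thread (one for custom descriptors, a second for property's fset) can be folded into one small helper. This is only a sketch, the helper name is invented and nothing here is stdlib API:

```python
class ReadOnlyDesc(object):
    """A custom read-only descriptor: defines __get__ but no __set__."""
    def __get__(self, obj, objtype=None):
        return 1

def is_writable_descriptor(d):
    # property instances always expose __set__ via their type, even when
    # read-only, so they need the separate fset check discussed above.
    if isinstance(d, property):
        return d.fset is not None
    return hasattr(type(d), '__set__')

p = property(lambda self: 1)
print(hasattr(type(p), '__set__'))             # True, despite being read-only
print(is_writable_descriptor(p))               # False
print(is_writable_descriptor(ReadOnlyDesc()))  # False
```

In current CPython the read-only case is only detected at assignment time: property.__set__ exists regardless, and raises AttributeError when fset is None, which is why the attribute check alone cannot distinguish the two cases.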
>If it's not, it sure is a (minor) feature request from my >part :-) To do this would require there to be two types, and 'property()' be a function that selected which of the two types to instantiate. Why do you care whether the attribute is read-only? Are you writing a documentation tool? From g9robjef@cdf.toronto.edu Thu May 22 03:04:28 2003 From: g9robjef@cdf.toronto.edu (Jeffery Roberts) Date: Wed, 21 May 2003 22:04:28 -0400 (EDT) Subject: [Python-Dev] Introduction Message-ID: Hello all ! I'm new to the list and thought I would quickly introduce myself. My name is Jeff and I am a university student [4th year] living in Toronto. I would love to be able to help with Python-dev in some way. I'm especially interested in issues directly related to the interpreter itself. I have gained some compiler development experience while at the university and would love to continue working in this area. If anyone has any thoughts or suggestions on how best I could proceed in this direction, I would love to hear them. Thanks ! Jeff Roberts From tismer@tismer.com Thu May 22 03:25:48 2003 From: tismer@tismer.com (Christian Tismer) Date: Thu, 22 May 2003 04:25:48 +0200 Subject: [Python-Dev] Introduction In-Reply-To: References: Message-ID: <3ECC352C.5060307@tismer.com> Jeffery Roberts wrote: > Hello all ! > > I'm new to the list and thought I would quickly introduce myself. My name > is Jeff and I am a university student [4th year] living in Toronto. > > I would love to be able to help with Python-dev in some way. I'm > especially interested in issues directly related to the interpreter > itself. I have gained some compiler development experience while at the > university and would love to continue working in this area. > > If anyone has any thoughts or suggestions on how best I could proceed in > this direction, I would love to hear them. All I can say is: Get involved with PyPy! There is nothing harder, Python related stuff that I know of. 
It can of course do some damage to your brain. I know what I'm talking about. Google for pypy and you got it. cheers - chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Johannes-Niemeyer-Weg 9a : *Starship* http://starship.python.net/ 14109 Berlin : PGP key -> http://wwwkeys.pgp.net/ work +49 30 89 09 53 34 home +49 30 802 86 56 pager +49 173 24 18 776 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? http://www.stackless.com/ From tim.one@comcast.net Thu May 22 04:11:51 2003 From: tim.one@comcast.net (Tim Peters) Date: Wed, 21 May 2003 23:11:51 -0400 Subject: [Python-Dev] Introduction In-Reply-To: Message-ID: [Jeffery Roberts] > I'm new to the list and thought I would quickly introduce myself. My > name is Jeff and I am a university student [4th year] living in > Toronto. > > I would love to be able to help with Python-dev in some way. I'm > especially interested in issues directly related to the interpreter > itself. I have gained some compiler development experience while at > the university and would love to continue working in this area. > > If anyone has any thoughts or suggestions on how best I could proceed > in this direction, I would love to hear them. As Christian said, you should enjoy pypy (an ambitious new project). Less ambitious is a rewrite of the front end, currently in progress on the ast-branch branch of the Python CVS repository. If you'd like to get your feet wet first, there's always a backlog of Python bug and patch reports on SourceForge begging for attention. Check out http://www.python.org/dev/ for orientation, and leave your spare time at the door . From martin@v.loewis.de Thu May 22 08:10:12 2003 From: martin@v.loewis.de (Martin v. 
=?iso-8859-15?q?L=F6wis?=) Date: 22 May 2003 09:10:12 +0200 Subject: [Python-Dev] Introduction In-Reply-To: References: Message-ID: Jeffery Roberts writes: > I would love to be able to help with Python-dev in some way. I'm > especially interested in issues directly related to the interpreter > itself. I have gained some compiler development experience while at the > university and would love to continue working in this area. In addition to what Christian suggested, the most valuable short-term contribution would be to look into open bug reports, and propose fixes for them. In particular, the Parser/Compiler, and "Python Interpreter Core" bug categories might attract you (there are 4 bugs in the former, and about 40 in the latter category). Many of these issues are still open because they are really tricky, so expect some of these to be middle-sized projects on their own. Regards, Martin From fdrake@acm.org Thu May 22 15:09:25 2003 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Thu, 22 May 2003 10:09:25 -0400 Subject: [Python-Dev] Preparing docs for Python 2.2.3 Message-ID: <16076.55829.218985.714016@grendel.zope.com> I'll be preparing the Python docs for the 2.2.3 release today. If there are any fixes for 2.2.3 that absolutely *must* go in, we need to get them in over the next four hours. I don't expect to have any sort of Internet access from Friday (tomorrow) through next Tuesday, so the docs really need to be finished today. -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From barry@python.org Thu May 22 15:23:56 2003 From: barry@python.org (Barry Warsaw) Date: 22 May 2003 10:23:56 -0400 Subject: [Python-Dev] Python 2.2.3 Message-ID: <1053613436.816.3.camel@barry> We're going to put together Python 2.2.3 for release today. Plan on a check-in freeze starting at 3pm EDT. If you have stuff you need to get in, do it now, but please be conservative. 
-Barry From barry@python.org Thu May 22 16:03:32 2003 From: barry@python.org (Barry Warsaw) Date: 22 May 2003 11:03:32 -0400 Subject: [Python-Dev] Re: Python 2.2.3 In-Reply-To: <1053613436.816.3.camel@barry> References: <1053613436.816.3.camel@barry> Message-ID: <1053615812.816.26.camel@barry> On Thu, 2003-05-22 at 10:23, Barry Warsaw wrote: > We're going to put together Python 2.2.3 for release today. Plan on a > check-in freeze starting at 3pm EDT. If you have stuff you need to get > in, do it now, but please be conservative. Let me clarify. After Pylab discussions, we've decided we're going to make this 2.2.3c1 (release candidate 1). It's important that folks with commercial (and other) interest in a solid 2.2.3 release have time to test it, so we'll do the final 2.2.3 release next week. -Barry From skip@pobox.com Thu May 22 17:35:47 2003 From: skip@pobox.com (Skip Montanaro) Date: Thu, 22 May 2003 11:35:47 -0500 Subject: [Python-Dev] Of what use is commands.getstatus() Message-ID: <16076.64611.579396.832520@montanaro.dyndns.org> I was reading the docs for the commands module and noticed getstatus() seems to be completely unrelated to getstatusoutput() and getoutput(). I thought, "I'll correct the docs. They must be wrong." Then I looked at commands.py and saw the docs are correct. It's the function definition which is weird. Of what use is it to return 'ls -ld file'? Based on its name I would have guessed its function was def getstatus(cmd): """Return status of executing cmd in a shell.""" return getstatusoutput(cmd)[0] This particular function dates from 1990, so it clearly can't just be deleted, but it seems completely superfluous to me, especially given the existence of os.stat, os.listdir, etc. Should it be deprecated or modified to do (what I think is) the obvious thing?
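[Editor's note: for readers following along today, the pair Skip contrasts getstatus() with survives as subprocess.getstatusoutput()/getoutput() in Python 3; the re-creation of getstatus() below is illustrative only, not the historical commands.py source.]

```python
# Sketch of the behaviour under discussion, using the modern subprocess
# equivalents of the old commands module functions.
import subprocess

# getstatusoutput() runs a command in a shell and returns (status, output),
# with the trailing newline stripped from the output.
status, output = subprocess.getstatusoutput('echo hello')
print(status, repr(output))

# Roughly what commands.getstatus(file) actually did: run "ls -ld file"
# and return its *output*, not a status (illustrative re-creation):
def getstatus(file):
    return subprocess.getoutput('ls -ld ' + file)
```
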
Skip From barry@python.org Thu May 22 17:43:21 2003 From: barry@python.org (Barry Warsaw) Date: 22 May 2003 12:43:21 -0400 Subject: [Python-Dev] Python 2.2.3 setup.py patch for RH9 (redux) Message-ID: <1053621801.816.45.camel@barry> --=-GNmTfOCl3Eqs11XlSBK0 Content-Type: text/plain Content-Transfer-Encoding: 7bit Back here: http://mail.python.org/pipermail/python-dev/2003-April/035120.html I mentioned a failure with dbm module on RedHat 9 which does not fail for RedHat 7.3. Here's what I think is a slightly better patch that I'd like to commit. Anybody else who's doing testing on other systems, could you please try this out and let me know if it causes any problems? Thanks, -Barry --=-GNmTfOCl3Eqs11XlSBK0 Content-Disposition: attachment; filename=setup.py-patch Content-Type: text/x-patch; name=setup.py-patch; charset=ISO-8859-15 Content-Transfer-Encoding: 7bit Index: setup.py =================================================================== RCS file: /cvsroot/python/python/dist/src/setup.py,v retrieving revision 1.73.4.18 diff -u -r1.73.4.18 setup.py --- setup.py 18 May 2003 13:42:58 -0000 1.73.4.18 +++ setup.py 22 May 2003 16:39:08 -0000 @@ -406,6 +406,9 @@ elif self.compiler.find_library_file(lib_dirs, 'db1'): exts.append( Extension('dbm', ['dbmmodule.c'], libraries = ['db1'] ) ) + elif self.compiler.find_library_file(lib_dirs, 'gdbm'): + exts.append( Extension('dbm', ['dbmmodule.c'], + libraries = ['gdbm'] ) ) else: exts.append( Extension('dbm', ['dbmmodule.c']) ) --=-GNmTfOCl3Eqs11XlSBK0-- From barry@python.org Thu May 22 17:56:02 2003 From: barry@python.org (Barry Warsaw) Date: 22 May 2003 12:56:02 -0400 Subject: [Python-Dev] One other 2.2.3 failure Message-ID: <1053622561.816.51.camel@barry> The only other test suite failure I see for Python 2.2.3 is in test_linuxaudiodev.py. But since this fails for me in Python 2.3cvs too, I'm inclined to chalk that up to not having audio set up correctly on my boxes. What say ye who haveth a working audio on Linux?
-Barry From barry@python.org Thu May 22 17:57:25 2003 From: barry@python.org (Barry Warsaw) Date: 22 May 2003 12:57:25 -0400 Subject: [Python-Dev] Python 2.2.3 setup.py patch for RH9 (redux) In-Reply-To: <1053621801.816.45.camel@barry> References: <1053621801.816.45.camel@barry> Message-ID: <1053622645.816.53.camel@barry> On Thu, 2003-05-22 at 12:43, Barry Warsaw wrote: > Back here: > > http://mail.python.org/pipermail/python-dev/2003-April/035120.html > > I mentioned a failure with dbm module on RedHat 9 which does not fail > for RedHat 7.3. Here's I think a slightly better patch that I'd like to > commit. Anybody else who's doing testing on other systems, could you > please try this out and let me know if it causes any problems? I see no regressions for RedHat 7.3 so I'm feeling optimistic about this patch . -Barry From skip@pobox.com Thu May 22 18:30:18 2003 From: skip@pobox.com (Skip Montanaro) Date: Thu, 22 May 2003 12:30:18 -0500 Subject: [Python-Dev] Python 2.2.3 setup.py patch for RH9 (redux) In-Reply-To: <1053621801.816.45.camel@barry> References: <1053621801.816.45.camel@barry> Message-ID: <16077.2346.379038.567216@montanaro.dyndns.org> Barry> I mentioned a failure with dbm module on RedHat 9 which does not Barry> fail for RedHat 7.3. Here's I think a slightly better patch that Barry> I'd like to commit. Anybody else who's doing testing on other Barry> systems, could you please try this out and let me know if it Barry> causes any problems? Works for me on Mac OS X. Of course, it doesn't actually link with gdbm, so plot a very small data point on your graph. ;-) Skip From g9robjef@cdf.toronto.edu Thu May 22 21:56:51 2003 From: g9robjef@cdf.toronto.edu (Jeffery Roberts) Date: Thu, 22 May 2003 16:56:51 -0400 (EDT) Subject: [Python-Dev] Introduction In-Reply-To: References: Message-ID: Thanks for all of your replies. The front-end rewrite sounds especially interesting. I'm going to look into that. 
Is the entire front end changing (ie scan/parse/ast) or just the AST structure ? If you have any more information or directions please let me know. Jeff On Wed, 21 May 2003, Tim Peters wrote: > [Jeffery Roberts] > > I'm new to the list and thought I would quickly introduce myself. My > > name is Jeff and I am a university student [4th year] living in > > Toronto. > > > > I would love to be able to help with Python-dev in some way. I'm > > especially interested in issues directly related to the interpreter > > itself. I have gained some compiler development experience while at > > the university and would love to continue working in this area. > > > > If anyone has any thoughts or suggestions on how best I could proceed > > in this direction, I would love to hear them. > > As Christian said, you should enjoy pypy (an ambitious new project). Less > ambitious is a rewrite of the front end, currently in progress on the > ast-branch branch of the Python CVS repository. If you'd like to get your > feet wet first, there's always a backlog of Python bug and patch reports on > SourceForge begging for attention. Check out > > http://www.python.org/dev/ > > for orientation, and leave your spare time at the door . > > From drifty@alum.berkeley.edu Thu May 22 21:29:20 2003 From: drifty@alum.berkeley.edu (Brett C.) Date: Thu, 22 May 2003 13:29:20 -0700 Subject: [Python-Dev] Introduction In-Reply-To: References: Message-ID: <3ECD3320.8030605@ocf.berkeley.edu> Tim Peters wrote: >>I moved over to Mozilla Mail and I keep hitting "Reply" when I mean to >>hit "Reply All". Sorry about that. > > > Oh, it doesn't bother me a bit, Brett! I'm more concerned that your > response would have been helpful to the OP, and he didn't get to see it. > Well, lets find out! Here is my email that was meant to go to the list pasted below. Tim Peters wrote: > [Jeffery Roberts] > >> I would love to be able to help with Python-dev in some way. 
I'm >> especially interested in issues directly related to the interpreter >> itself. I have gained some compiler development experience while at >> the university and would love to continue working in this area. >> >> If anyone has any thoughts or suggestions on how best I could proceed >> in this direction, I would love to hear them. > > > If you'd like to get your > feet wet first, there's always a backlog of Python bug and patch reports on > SourceForge begging for attention. I know I learned a lot from working on patches and bugs. It especially helps if you jump in on a patch that is being actively worked on and can ask how something works. Otherwise just read the source until your eyes bleed and curse anyone who doesn't write extensive documentation for code. =) There also has been mention of the AST branch. I know I plan on working on that after I finish going through the bug and patch backlog. Only trouble is that the guys who actually fully understand it (Jeremy, Tim, and Neal) are rather busy so it is going to be a "jump in the pool and drown and hope your flailing manages to at least generate something useful but you die and come back in another life wiser and able to attempt again until you stop drowning and manage to only get sick from gulping down so much chlorinated water". =) > Check out > > http://www.python.org/dev/ > > for orientation, and leave your spare time at the door . > I will vouch for the loss of spare time. This has become a job. Best job ever, though. =) The only big piece of advice I can offer is to just make sure you are nice and cordial on the list; there is a low tolerance for jerks here. Don't take this as meaning to not take a stand on an issue! All I am saying is realize that email does not transcribe humor perfectly and until the list gets used to your personal writing style you might have to just make sure what you write does not come off as insulting. 
-Brett From drifty@alum.berkeley.edu Fri May 23 04:21:20 2003 From: drifty@alum.berkeley.edu (Brett C.) Date: Thu, 22 May 2003 20:21:20 -0700 Subject: [Python-Dev] Introduction In-Reply-To: References: Message-ID: <3ECD93B0.5060704@ocf.berkeley.edu> Jeffery Roberts wrote: > Thanks for all of your replies. The front-end rewrite sounds especially > interesting. I'm going to look into that. Is the entire front end > changing (ie scan/parse/ast) or just the AST structure ? > > If you have any more information or directions please let me know. > It is just a new AST. Redoing/replacing pgen is something else entirely. =) The branch that this is being developed under in CVS is ast-branch. There is an incomplete README in Python/compile.txt that explains the basic idea and direction. -Brett From barry@python.org Fri May 23 04:30:45 2003 From: barry@python.org (Barry Warsaw) Date: Thu, 22 May 2003 23:30:45 -0400 Subject: [Python-Dev] RELEASED Python 2.2.3c1 Message-ID: I'm happy to announce the release of Python 2.2.3c1 (release candidate 1). This is a bug fix release for the stable Python 2.2 code line. Barring any critical issues, we expect to release Python 2.2.3 final by this time next week. We encourage those with an interest in a solid 2.2.3 release to download this candidate and test it on their code. The new release is available here: http://www.python.org/2.2.3/ Python 2.2.3 has a large number of bug fixes and memory leak patches. For full details, see the release notes at http://www.python.org/2.2.3/NEWS.txt There are a small number of minor incompatibilities with Python 2.2.2; for details see: http://www.python.org/2.2.3/bugs.html Perhaps the most important is that the Bastion.py and rexec.py modules have been disabled, since we do not deem them to be safe. As usual, a Windows installer and a Unix/Linux source tarball are made available, as well as tarballs of the documentation in various forms.
At the moment, no Mac version or Linux RPMs are available, although I expect them to appear soon after 2.2.3 final is released. On behalf of Guido, I'd like to thank everyone who contributed to this release, and who continue to ensure Python's success. Enjoy, -Barry From Jack.Jansen@cwi.nl Fri May 23 12:42:20 2003 From: Jack.Jansen@cwi.nl (Jack Jansen) Date: Fri, 23 May 2003 13:42:20 +0200 Subject: [Python-Dev] RELEASED Python 2.2.3c1 In-Reply-To: Message-ID: <9CF2178A-8D13-11D7-A3D6-0030655234CE@cwi.nl> On Friday, May 23, 2003, at 05:30 Europe/Amsterdam, Barry Warsaw wrote: > I'm happy to announce the release of Python 2.2.3c1 (release candidate > 1). Oops, that suddenly went *very* fast, I thought I had until the weekend... Is there a chance I could get #723495 still in before 2.2.3 final? I was also hoping to find a fix for #571343, but I don't have a patch yet (although I'll try to get one up in the next few hours). -- Jack Jansen, , http://www.cwi.nl/~jack If I can't dance I don't want to be part of your revolution -- Emma Goldman From guido@python.org Fri May 23 14:11:35 2003 From: guido@python.org (Guido van Rossum) Date: Fri, 23 May 2003 09:11:35 -0400 Subject: [Python-Dev] Of what use is commands.getstatus() In-Reply-To: "Your message of Thu, 22 May 2003 11:35:47 CDT." <16076.64611.579396.832520@montanaro.dyndns.org> References: <16076.64611.579396.832520@montanaro.dyndns.org> Message-ID: <200305231311.h4NDBZ725779@pcp02138704pcs.reston01.va.comcast.net> > I was reading the docs for the commands module and noticed getstatus() seems > to be completely unrelated to getstatusoutput() and getoutput(). I thought, > "I'll correct the docs. They must be wrong." Then I looked at commands.py > and saw the docs are correct. It's the function definition which is weird. > Of what use is it to return 'ls -ld file'?
Based on its name I would have > guessed its function was > > def getstatus(cmd): > """Return status of executing cmd in a shell.""" > return getstatusoutput(cmd)[0] > > This particular function dates from 1990, so it clearly can't just be > deleted, but it seems completely superfluous to me, especially given the > existence of os.stat, os.listdir, etc. Should it be deprecated or modified > to do (what I think is) the obvious thing? That whole module wasn't thought out very well. I recently tried to use it and found that the stripping of the trailing \n on getoutput() is also a counterproductive feature. I suggest that someone should design a replacement, perhaps to live in shutil, and then we can deprecate it. Until then I would leave it alone. Certainly don't "fix" it by doing something incompatible. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Fri May 23 15:06:05 2003 From: guido@python.org (Guido van Rossum) Date: Fri, 23 May 2003 10:06:05 -0400 Subject: [Python-Dev] Descriptor API In-Reply-To: "Your message of Wed, 21 May 2003 01:41:16 BST." <000501c31f31$b0c820b0$f3100dd5@violante> References: <000501c31f31$b0c820b0$f3100dd5@violante> Message-ID: <200305231406.h4NE65T26180@pcp02138704pcs.reston01.va.comcast.net> > I was doing some tricks with metaclasses and descriptors in Python 2.2 and > stumbled on the following: > > >>> class test(object): > ... a = property(lambda: 1) > ... > >>> print test.a > > >>> print test.a.__set__ > > >>> print test.a.fset > None > > What this means in practice, is that if I want to test if a > descriptor is read-only I have to have two tests: One for custom > descriptors, checking that getting __set__ does not barf and another > for property, checking that fset returns None. Why are you interested in knowing whether a descriptor is read-only? > So, why doesn't getting __set__ raise AttributeError in the above case? > This is a feature.
The presence of __set__ (even if it always raises AttributeError when *called*) signals this as a "data descriptor". The difference between data descriptors and others is that a data descriptor can not be overridden by putting something in the instance dict; a non-data descriptor can be overridden by assignment to an instance attribute, which will store a value in the instance dict. For example, a method is a non-data descriptor (and the prevailing example of such). This means that the following example works: class C(object): def meth(self): return 42 x = C() x.meth() # returns 42 x.meth = lambda: 24 x.meth() # returns 24 > Is this a bug? If it's not, it sure is a (minor) feature request > from my part :-) Because of the above explanation, the request cannot be granted. You can test the property's fset attribute however to tell whether a 'set' argument was passed to the constructor. --Guido van Rossum (home page: http://www.python.org/~guido/) From op73418@mail.telepac.pt Fri May 23 15:34:14 2003 From: op73418@mail.telepac.pt (=?iso-8859-1?Q?Gon=E7alo_Rodrigues?=) Date: Fri, 23 May 2003 15:34:14 +0100 Subject: [Python-Dev] Descriptor API References: <000501c31f31$b0c820b0$f3100dd5@violante> <200305231406.h4NE65T26180@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <001a01c32138$625c09b0$0d40c151@violante> ----- Original Message ----- From: "Guido van Rossum" To: "Gonçalo Rodrigues" Cc: Sent: Friday, May 23, 2003 3:06 PM Subject: Re: [Python-Dev] Descriptor API > > I was doing some tricks with metaclasses and descriptors in Python 2.2 and > > stumbled on the following: > > > > >>> class test(object): > > ... a = property(lambda: 1) > > ...
> > >>> print test.a > > > > >>> print test.a.__set__ > > > > >>> print test.a.fset > > None > > > > What this means in practice, is that if I want to test if a > > descriptor is read-only I have to have two tests: One for custom > > descriptors, checking that getting __set__ does not barf and another > > for property, checking that fset returns None. > > Why are you interested in knowing whether a descriptor is read-only? > Introspection dealing with a metaclass that injected methods in its instances depending on a descriptor. In other words, having fun with Python's wacky tricks. > > So, why doesn't getting __set__ raise AttributeError in the above case? > > This is a feature. The presence of __set__ (even if it always raises > AttributeError when *called*) signals this as a "data descriptor". > The difference between data descriptors and others is that a data > descriptor can not be overridden by putting something in the instance > dict; a non-data descriptor can be overridden by assignment to an > instance attribute, which will store a value in the instance dict. > > For example, a method is a non-data descriptor (and the prevailing > example of such). This means that the following example works: > > class C(object): > def meth(self): return 42 > > x = C() > x.meth() # prints 42 > x.meth = lambda: 24 > x.meth() # prints 24 > > > Is this a bug? If it's not, it sure is a (minor) feature request > > from my part :-) > > Because of the above explanation, the request cannot be granted. > Thanks for the reply (and also to P. Eby, btw). I was way off track when I sent the email, because it did not occur to me that property was a type implementing __get__ and __set__. With this piece of info connecting the dots the idea is just plain foolish. > You can test the property's fset attribute however to tell whether a > 'set' argument was passed to the constructor. > > --Guido van Rossum (home page: http://www.python.org/~guido/) With my best regards, G.
Rodrigues From guido@python.org Fri May 23 15:39:10 2003 From: guido@python.org (Guido van Rossum) Date: Fri, 23 May 2003 10:39:10 -0400 Subject: [Python-Dev] Need advice, maybe support In-Reply-To: "Your message of Wed, 21 May 2003 02:50:40 +0200." <3ECACD60.10503@tismer.com> References: <3EC579B4.9000303@tismer.com> <200305182104.h4IL4eY17830@pcp02138704pcs.reston01.va.comcast.net> <3EC91EA0.5090105@tismer.com> <200305192047.h4JKldW19641@pcp02138704pcs.reston01.va.comcast.net> <3EC94A92.2040604@tismer.com> <200305192136.h4JLaOX20032@pcp02138704pcs.reston01.va.comcast.net> <3ECA3DCB.50306@tismer.com> <200305201931.h4KJUuT21506@pcp02138704pcs.reston01.va.comcast.net> <3ECACD60.10503@tismer.com> Message-ID: <200305231439.h4NEdAu26309@pcp02138704pcs.reston01.va.comcast.net> > > The other is the new style where the PyMethodDef > > array is in tp_methods, and is scanned once by PyType_Ready. > > Right, again. Now, under the hopeful assumption that every > sensible extension module that has some types to publish also > does this through its module dictionary, I would have the > opportunity to cause PyType_Ready being called early enough > to modify the method table, before any of its methods is used > at all. Dangerous assumption! It's not inconceivable that a class would instantiate some of its own classes as part of its module initialization. > > 3rd party modules that have been around for a while are likely to use > > Py_FindMethod. With Py_FindMethod you don't have a convenient way to > > store the pointer to the converted table, so it may be better to > > simply check your bit in the first array element and then cast to a > > PyMethodDef or a PyMethodDefEx array based on what the bit says (you > > can safely assume that all elements of an array are the same size :-). > > Hee hee, yeah. Of course, if there isn't a reliable way to > intercept method table access before the first Py_FindMethod > call, I could of course modify Py_FindMethod. 
For instance, > a modified, new-style method table might be required to always > start with a dummy entry, where the flags word is completely > -1, to signal having been converted to new-style. Why so drastic? You could just set a reserved bit. > ... > > >>If that approach is trustworthy, I also could drop > >>the request for these 8 bits. > > > > Sure. Ah, a bit in the type would work just as well, and > > Py_FindMethod *does* have access to the type. > > You think of the definition in methodobject.c, as it is > > """ > /* Find a method in a single method list */ > > PyObject * > Py_FindMethod(PyMethodDef *methods, PyObject *self, char *name) > """ > > , assuming that self always is not NULL, but representing a valid > object with a type, and this type is already referring to the > methods table? Right. There is already code that uses self->ob_type in Py_FindMethodInChain(), which is called by Py_FindMethod(). > Except for module objects, this seems to be right. I've run > Python against a lot of Python modules, but none seems > to call Py_FindMethod with a self parameter of NULL. I don't think it would be safe to do so. > If that is true, then I can patch a small couple of > C functions to check for the new bit, and if it's not > there, re-create the method table in place. > This is music to my ears. But... > > Well, there is a drawback: > I *do* need two bits, and I hope you will allow me to add this > second bit, as well. > > The one, first bit, tells me if the source has been compiled > with Stackless and its extension stuff. Nullo problemo. > I can then in-place modify the method table in a compatible > way, or leave it as it is, by default. > But then, this isn't sufficient to set this bit, like an > "everything is fine, now" relief.
This is so, since this is *still* > an old module, and while its type's method tables have been > patched, the type is still not augmented by new slots, like > the new tp_call_nr slots (and maybe a bazillion to come, soon). > The drawback is that I cannot simply replace the whole type > object, since type objects are not represented as object > pointers (like they are now, most of the time, in the dynamic > heaptype case), but they are constant struct addresses, where > the old C module might be referring to. > > So, what I think I need is no longer 9 bits, but two of them: > One that says "everything great from the beginning", and another > one that says "well, ok so far, but this is still an old object". > > I do think this is the complete story, now. > Instead of requiring nine bits, I'm asking for two. > But this is just *your* options; I also can live with one bit, > but then I have to add a special, invalid method table entry > that just serves for this purpose. > In order to keep my source code hack to the minimum, I'd really > like to ask for the two bits in the typeobject flags. OK, two bits you shall have. Don't spend them all at once! > Thanks so much for being so supportive -- chris Anything to keep actual stackless support out of the core. :-) --Guido van Rossum (home page: http://www.python.org/~guido/) From barry@python.org Fri May 23 15:55:51 2003 From: barry@python.org (Barry Warsaw) Date: Fri, 23 May 2003 10:55:51 -0400 Subject: [Python-Dev] RELEASED Python 2.2.3c1 In-Reply-To: <9CF2178A-8D13-11D7-A3D6-0030655234CE@cwi.nl> References: <9CF2178A-8D13-11D7-A3D6-0030655234CE@cwi.nl> Message-ID: <3ECE3677.1030500@python.org> Jack Jansen wrote: > > Oops, that suddenly went *very* fast, I thought I had until the weekend... But probably not as fast as it should have. I had fun reading checkin comments like (paraphrasing), "this change is probably important enough for a 2.2.3 release" dated from December of last year.
:) > Is there a chance I could get #723495 still in before 2.2.3 final? I was > also hoping to find a fix for #571343, but I don't have a patch yet > (although I'll try to get one up in the next few hours). I think it would be fine to get these into 2.2.3 final. -Barry From theller@python.net Fri May 23 16:31:06 2003 From: theller@python.net (Thomas Heller) Date: 23 May 2003 17:31:06 +0200 Subject: [Python-Dev] Need advice, maybe support In-Reply-To: <200305231439.h4NEdAu26309@pcp02138704pcs.reston01.va.comcast.net> References: <3EC579B4.9000303@tismer.com> <200305182104.h4IL4eY17830@pcp02138704pcs.reston01.va.comcast.net> <3EC91EA0.5090105@tismer.com> <200305192047.h4JKldW19641@pcp02138704pcs.reston01.va.comcast.net> <3EC94A92.2040604@tismer.com> <200305192136.h4JLaOX20032@pcp02138704pcs.reston01.va.comcast.net> <3ECA3DCB.50306@tismer.com> <200305201931.h4KJUuT21506@pcp02138704pcs.reston01.va.comcast.net> <3ECACD60.10503@tismer.com> <200305231439.h4NEdAu26309@pcp02138704pcs.reston01.va.comcast.net> Message-ID: > > > The other is the new style where the PyMethodDef > > > array is in tp_methods, and is scanned once by PyType_Ready. > > > > Right, again. Now, under the hopeful assumption that every > > sensible extension module that has some types to publish also > > does this through its module dictionary, I would have the > > opportunity to cause PyType_Ready being called early enough > > to modify the method table, before any of its methods is used > > at all. > > Dangerous assumption! It's not inconceivable that a class would > instantiate some of its own classes as part of its module > initialization. I do not really know what you are talking about here, but that assumption is violated by the ctypes module. It has a number of metaclasses implemented in C, neither of them is exposed in the module dictionary, and there *have been* types which were not exposed, because they are only used internally. 
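[Editor's note: Thomas's point is easy to demonstrate even in pure Python. The sketch below is a hypothetical stand-in for what ctypes does in C, not ctypes itself: a metaclass can be created and used purely internally, never published in the module dictionary, so any scheme that only walks module dicts will miss it.]

```python
# A (meta)type used only internally; nothing here is ever exported.
def _build():
    class _InternalMeta(type):      # hypothetical internal metaclass
        pass
    class Exposed(metaclass=_InternalMeta):
        pass
    return Exposed

Exposed = _build()

# The metaclass is alive and reachable through the type, but a scan of
# the module's namespace would never have seen it:
print(type(Exposed).__name__)       # prints _InternalMeta
```
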
Thomas From jeremy@ZOPE.COM Fri May 23 17:23:43 2003 From: jeremy@ZOPE.COM (Jeremy Hylton) Date: 23 May 2003 12:23:43 -0400 Subject: [Python-Dev] Introduction In-Reply-To: References: Message-ID: <1053707023.28095.3.camel@slothrop.zope.com> On Thu, 2003-05-22 at 16:56, Jeffery Roberts wrote: > Thanks for all of your replies. The front-end rewrite sounds especially > interesting. I'm going to look into that. Is the entire front end > changing (ie scan/parse/ast) or just the AST structure ? > > If you have any more information or directions please let me know. The current plan is to create an AST and replace the bytecode compiler. We're leaving a rewrite of the parser for a later project. It's a fairly big project; large parts of it are done, but there is work remaining to do in nearly every part -- the concrete-to-abstract translator, error checking, compilation to byte-code. At the moment, it's possible to start an interactive interpreter session and see what works. But it isn't possible to compile and run all of site.py and everything it imports. Jeremy From logistix@cathoderaymission.net Fri May 23 20:44:19 2003 From: logistix@cathoderaymission.net (logistix) Date: Fri, 23 May 2003 14:44:19 -0500 (CDT) Subject: [Python-Dev] Introduction In-Reply-To: <1053707023.28095.3.camel@slothrop.zope.com> Message-ID: On 23 May 2003, Jeremy Hylton wrote: > On Thu, 2003-05-22 at 16:56, Jeffery Roberts wrote: > > Thanks for all of your replies. The front-end rewrite sounds especially > > interesting. I'm going to look into that. Is the entire front end > > changing (ie scan/parse/ast) or just the AST structure ? > > > > If you have any more information or directions please let me know. > > The current plan is to create an AST and replace the bytecode compiler. > We're leaving a rewrite of the parser for a later project. 
It's a > fairly big project; large parts of it are done, but there is work > remaining to do in nearly every part -- the concrete-to-abstract > translator, error checking, compilation to byte-code. > > At the moment, it's possible to start an interactive interpreter session > and see what works. But it isn't possible to compile and run all of > site.py and everything it imports. > > Jeremy > Should patches just go to sourceforge's "parser/compiler" category, or will that create too much confusion? From jeremy@zope.com Fri May 23 20:44:14 2003 From: jeremy@zope.com (Jeremy Hylton) Date: 23 May 2003 15:44:14 -0400 Subject: [Python-Dev] Introduction In-Reply-To: References: Message-ID: <1053719054.28074.13.camel@slothrop.zope.com> On Fri, 2003-05-23 at 15:44, logistix wrote: > Should patches just go to sourceforge's "parser/compiler" category, or > will that create too much confusion? I think that would be fine. We don't have a lot of parser/compiler patches. Jeremy From tismer@tismer.com Fri May 23 23:54:24 2003 From: tismer@tismer.com (Christian Tismer) Date: Sat, 24 May 2003 00:54:24 +0200 Subject: [Python-Dev] Need advice, maybe support In-Reply-To: References: <3EC579B4.9000303@tismer.com> <200305182104.h4IL4eY17830@pcp02138704pcs.reston01.va.comcast.net> <3EC91EA0.5090105@tismer.com> <200305192047.h4JKldW19641@pcp02138704pcs.reston01.va.comcast.net> <3EC94A92.2040604@tismer.com> <200305192136.h4JLaOX20032@pcp02138704pcs.reston01.va.comcast.net> <3ECA3DCB.50306@tismer.com> <200305201931.h4KJUuT21506@pcp02138704pcs.reston01.va.comcast.net> <3ECACD60.10503@tismer.com> <200305231439.h4NEdAu26309@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <3ECEA6A0.2020206@tismer.com> Thomas Heller wrote: >>>>The other is the new style where the PyMethodDef >>>>array is in tp_methods, and is scanned once by PyType_Ready. >>> >>>Right, again. 
Now, under the hopeful assumption that every >>>sensible extension module that has some types to publish also >>>does this through its module dictionary, I would have the >>>opportunity to cause PyType_Ready being called early enough >>>to modify the method table, before any of its methods is used >>>at all. >> >>Dangerous assumption! It's not inconceivable that a class would >>instantiate some of its own classes as part of its module >>initialization. First time that I saw this. I do agree that it is possible to break every compatibility scheme. Especially in your module's case, I would not assume that anybody would consider not to use the most recent version and compile it against the most recent sources? The topic I'm talking about is old code which should continue to run. > I do not really know what you are talking about here, but that > assumption is violated by the ctypes module. > It has a number of metaclasses implemented in C, neither of them > is exposed in the module dictionary, and there *have been* types which > were not exposed, because they are only used internally. Hmm. Ok. Then I am really interested if you have an idea how to solve this efficiently. My current solution is augmenting method tables by sibling elements, which is a) not nice and b) involves extra flags in ml_flags, which is not as efficient as possible. Martin proposed to grow a second method table and to maintain it in parallel. This is possible, but also seems to involve quite some runtime overhead. What I'm seeking is a place that gives a secure solution, without involving code that is executed frequently. On the other hand, this issue is about *most* foreign, old code. I think I could stand it if the one or the other module simply requires to be re-compiled with the current stackless version, if this doesn't mean to re-compile everything. ciao - chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break!
Take a ride on Python's Johannes-Niemeyer-Weg 9a : *Starship* http://starship.python.net/ 14109 Berlin : PGP key -> http://wwwkeys.pgp.net/ work +49 30 89 09 53 34 home +49 30 802 86 56 pager +49 173 24 18 776 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? http://www.stackless.com/ From tismer@tismer.com Sat May 24 00:08:40 2003 From: tismer@tismer.com (Christian Tismer) Date: Sat, 24 May 2003 01:08:40 +0200 Subject: [Python-Dev] Need advice, maybe support In-Reply-To: <200305231439.h4NEdAu26309@pcp02138704pcs.reston01.va.comcast.net> References: <3EC579B4.9000303@tismer.com> <200305182104.h4IL4eY17830@pcp02138704pcs.reston01.va.comcast.net> <3EC91EA0.5090105@tismer.com> <200305192047.h4JKldW19641@pcp02138704pcs.reston01.va.comcast.net> <3EC94A92.2040604@tismer.com> <200305192136.h4JLaOX20032@pcp02138704pcs.reston01.va.comcast.net> <3ECA3DCB.50306@tismer.com> <200305201931.h4KJUuT21506@pcp02138704pcs.reston01.va.comcast.net> <3ECACD60.10503@tismer.com> <200305231439.h4NEdAu26309@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <3ECEA9F8.9000007@tismer.com> Guido van Rossum wrote: [me about time of type initialization] > Dangerous assumption! It's not inconceivable that a class would > instantiate some of its own classes as part of its module > initialization. But we agree that an extension would somehow call into the core to initialize its types/classes. >>For instance, >>a modified, new-style method table might be required to always >>start with a dummy entry, where the flags word is completely >>-1, to signal having been converted to new-style. > > > Why so drastic? You could just set a reserved bit. Doesn't matter. What I want is that, at initialization time, it is very clear what to initialize and how. At run-time, I don't want anything to remain that slows matters down.
Therefore, creating an invalid slot for method tables was kind of an idea to signal that there is some special attention needed during method initialization. ... >>Except for module objects, this seems to be right. I've run >>Python against a lot of Python modules, but none seems >>to call Py_FindMethod with a self parameter of NULL. > > > I don't think it would be safe to do so. Further analysis has proven that you're right. [more theoretical stuff, maybe not trustworthy without verification] > OK, two bits you shall have. Don't spend them all at once! Took them, chewing on them. >>Thanks so much for being so supportive -- chris > > Anything to keep actual stackless support out of the core. :-) Ahhh, that's the reason behind the generous intention? :-)) Ok with me, I got my two bits. But there is something else that might be interesting for very many Python users. Not yet announced, but you are invited to my EuroPy talk. cheers - chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Johannes-Niemeyer-Weg 9a : *Starship* http://starship.python.net/ 14109 Berlin : PGP key -> http://wwwkeys.pgp.net/ work +49 30 89 09 53 34 home +49 30 802 86 56 pager +49 173 24 18 776 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? http://www.stackless.com/
From niemeyer@conectiva.com Sat May 24 16:26:12 2003 From: niemeyer@conectiva.com (Gustavo Niemeyer) Date: Sat, 24 May 2003 12:26:12 -0300 Subject: [Python-Dev] RELEASED Python 2.2.3c1 In-Reply-To: <3ECE3677.1030500@python.org> References: <9CF2178A-8D13-11D7-A3D6-0030655234CE@cwi.nl> <3ECE3677.1030500@python.org> Message-ID: <20030524152612.GA22309@ibook.distro.conectiva> Hi Barry! > >Oops, that suddenly went *very* fast, I thought I had until the weekend... > > But probably not as fast as it should have. I had fun reading checkin > comments like (paraphrasing), "this change is probably important enough > for a 2.2.3 release" dated from December of last year. :) Indeed. I'd like to have worked more on Python 2.2.3. Unfortunately, Conectiva Linux was released just a few weeks ago, and my free time suddenly vanished in that period. > >Is there a chance I could get #723495 still in before 2.2.3 final? I was > >also hoping to find a fix for #571343, but I don't have a patch yet > >(although I'll try to get one up in the next few hours). > > I think it would be fine to get these into 2.2.3 final. I have some time this weekend to work on Python.
Do you think it'd be ok to backport some of the fixes we have introduced in the regular expression engine in 2.3 to 2.2.3, or is it too late? We have a sf patch open about that, but I'd like to port only the changes that don't require major changes in the engine. Also, have you seen the message about urllib2 I sent a few days ago? Would that be something important to have in 2.2.3 (or even in 2.3)? Do you plan to produce another release candidate? Thanks! -- Gustavo Niemeyer [ 2AAC 7928 0FBF 0299 5EB5 60E2 2253 B29A 6664 3A0C ] From martin@v.loewis.de Sat May 24 17:01:08 2003 From: martin@v.loewis.de (Martin v. Löwis) Date: 24 May 2003 18:01:08 +0200 Subject: [Python-Dev] RELEASED Python 2.2.3c1 In-Reply-To: <20030524152612.GA22309@ibook.distro.conectiva> References: <9CF2178A-8D13-11D7-A3D6-0030655234CE@cwi.nl> <3ECE3677.1030500@python.org> <20030524152612.GA22309@ibook.distro.conectiva> Message-ID: Gustavo Niemeyer writes: > I have some time this weekend to work on Python. Do you think it'd be ok > to backport some of the fixes we have introduced in the regular > expression engine in 2.3 to 2.2.3, or is it too late? We have a sf patch > open about that, but I'd like to port only the changes that don't > require major changes in the engine. I strongly advise to defer such changes to 2.2.4. This is really tricky code, and changes should ideally be reviewed by three different experts (including the author of the changes). > Also, have you seen the message about urllib2 I sent a few days ago? > Would that be something important to have in 2.2.3 (or even in 2.3)? Nothing that is not in 2.3 right now can go into 2.2.3. Only backports of accepted changes should be applied to the 2.2 branch. > Do you plan to produce another release candidate? My understanding is that there was only one release candidate planned.
Changes that require another release candidate should not be applied right now, since another release candidate won't give them the testing that they need. Regards, Martin From niemeyer@conectiva.com Sat May 24 17:08:33 2003 From: niemeyer@conectiva.com (Gustavo Niemeyer) Date: Sat, 24 May 2003 13:08:33 -0300 Subject: [Python-Dev] RELEASED Python 2.2.3c1 In-Reply-To: References: <9CF2178A-8D13-11D7-A3D6-0030655234CE@cwi.nl> <3ECE3677.1030500@python.org> <20030524152612.GA22309@ibook.distro.conectiva> Message-ID: <20030524160832.GB22309@ibook.distro.conectiva> > > I have some time this weekend to work on Python. Do you think it'd be ok > > to backport some of the fixes we have introduced in the regular > > expression engine in 2.3 to 2.2.3, or is it too late? We have a sf patch > > open about that, but I'd like to port only the changes that don't > > require major changes in the engine. > > I strongly advise to defer such changes to 2.2.4. This is really > tricky code, and changes should ideally be reviewed by three different > experts (including the author of the changes). Ack. I'll wait until 2.2.3 is out to touch that code. I'll look for something else to do on Python this weekend. If you need any help with 2.2.3, please contact me. > > Also, have you seen the message about urllib2 I sent a few days ago? > > Would that be something important to have in 2.2.3 (or even in 2.3)? > > Nothing that is not in 2.3 right now can go into 2.2.3. Only backports > of accepted changes should be applied to the 2.2 branch. Ok. I'll leave 2.2.3 alone, and fix that behavior in urllib2 for 2.3. > > Do you plan to produce another release candidate? > > My understanding is that there was only one release candidate > planned. Changes that require another release candidate should not be > applied right now, since another release candidate won't give them the > testing that they need. Agreed.
-- Gustavo Niemeyer [ 2AAC 7928 0FBF 0299 5EB5 60E2 2253 B29A 6664 3A0C ] From niemeyer@conectiva.com Sat May 24 20:03:25 2003 From: niemeyer@conectiva.com (Gustavo Niemeyer) Date: Sat, 24 May 2003 16:03:25 -0300 Subject: [Python-Dev] urllib2 proxy support broken? In-Reply-To: <20030519212807.GA29002@ibook.distro.conectiva> References: <20030519212807.GA29002@ibook.distro.conectiva> Message-ID: <20030524190325.GA30748@ibook.distro.conectiva> > I've just tried to use the proxy support in urllib2, and was surprised > by the fact that it seems to be broken, at least in 2.2 and 2.3. Can > somebody please confirm that it's really broken, so that I can prepare > a patch? Ok.. I have prepared a simple fix for this, and sent it to SF patch #742823. This fix should be backwards compatible, and at the same time allows any kind of further customization of pre-defined and user-defined classes. Can someone please have a look at it before I check it in? -- Gustavo Niemeyer [ 2AAC 7928 0FBF 0299 5EB5 60E2 2253 B29A 6664 3A0C ] From barry@python.org Sat May 24 21:49:48 2003 From: barry@python.org (Barry Warsaw) Date: Sat, 24 May 2003 16:49:48 -0400 Subject: [Python-Dev] RELEASED Python 2.2.3c1 In-Reply-To: Message-ID: <4215F6F7-8E29-11D7-A28B-003065EEFAC8@python.org> On Saturday, May 24, 2003, at 12:01 PM, Martin v. Löwis wrote: >> Do you plan to produce another release candidate? > > My understanding is that there was only one release candidate > planned. Changes that require another release candidate should not be > applied right now, since another release candidate won't give them the > testing that they need. Martin's right. Unless Guido specifically overrides, please be ultra-conservative.
-Barry From tim_one@email.msn.com Sun May 25 06:21:22 2003 From: tim_one@email.msn.com (Tim Peters) Date: Sun, 25 May 2003 01:21:22 -0400 Subject: [Python-Dev] RE: [Python-checkins] python/dist/src/Lib weakref.py,1.19,1.20 In-Reply-To: <000201c32277$59041c00$125ffea9@oemcomputer> Message-ID: [redirected to python-dev] [Tim] >> Someone review this, please! Final releases are getting close, Fred >> (the weakref guy) won't be around until Tuesday, and the pre-patch >> code can indeed raise spurious RuntimeErrors in the presence of >> threads or mutating comparison functions. >> >> See the bug report for my confusions: I can't see any reason why >> __delitem__ iterated over the keys. [Raymond Hettinger] > Until reading the note on threads, I didn't see the error and thought > the original code was valid because it returned after the deletion > instead of continuing to loop through iterkeys. Note that one of the new tests I checked in provoked RuntimeError without threads. >> The new one-liner implementation is much faster, can't raise >> RuntimeError, and should be >> better-behaved in all respects wrt threads. > Yes, that solves the OP's problem. >> Bugfix candidate for 2.2.3 too, if someone else agrees with this >> patch. > The original code does its contortions to avoid raising a KeyError > whenever the dictionary entry might have disappeared due to the > ref count falling to zero and then a new, equal key was formed later. Sorry, I can't picture what you're trying to say. Show some code? If in a weak-keyed dict d I do d[k1] = v del k1 # last reference, so the dict mutation went away del d[k2] # where k2 happens to compare equal to what k1 was then I claim that *should* raise KeyError, and pretty obviously so.
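Tim's claim is easy to demonstrate. This is not code from the thread, just a minimal sketch of the semantics he describes, run under modern Python (whose WeakKeyDictionary matches the patched behavior); the class name Key is invented for illustration:

```python
import gc
import weakref

class Key:
    """A weakly referenceable key (most user-defined classes are)."""

d = weakref.WeakKeyDictionary()
k1 = Key()
d[k1] = "v"
assert len(d) == 1

del k1        # drop the last strong reference; the entry goes away
gc.collect()  # immediate under CPython's refcounting, explicit elsewhere
assert len(d) == 0

# Deleting a key that has no entry raises KeyError, just like a plain dict:
try:
    del d[Key()]
except KeyError:
    print("KeyError, as expected")
```

Once the referent is collected, the mapping behaves as if the entry never existed, so a subsequent delete through an equal key has nothing to find.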
Note that the other new test I checked in showed that del d[whatever] never raised KeyError before; I can't see how that can be called a feature, and if someone thinks it was they neglected to document it, or write a test that failed when I changed the behavior . > If the data disappeared, then, I think ref(key) will return None No, ref(x) never returns None, regardless of what x may be. It may raise TypeError if x is not of a weakly referencable type, and it may raise MemoryError if we don't have enough memory left to construct a weakref, but those are the only things that can go wrong. w = ref(x) followed later by w() will return None, iff x has gone away in the meantime -- maybe that's what you're thinking of. > which is a bummer because that is then used (in your patch) > as a lookup key. If x and y are two weakly-referencable objects (not necessarily distinct) that compare equal, then ref(x) == ref(y) and hash(ref(x)) == hash(ref(y)) so long as both ref(x)() and ref(y)() don't return None (i.e., so long as x and y are both still alive). So when I map del d[k1] to del d.data[ref(k1)] it will succeed if and only if d.data has a key for a still-live object, and that key compares equal to k1; else it will raise KeyError (or maybe TypeError if k1 is a silly key to test in a weak-keyed dict, or MemoryError if we run out of memory). That's what I believe it should do. > The safest approach (until Fred re-appears) is to keep the original > approach but use keys() instead of iterkeys(). Then, wrap the > actual deletion in a try / except KeyError to handle a thread > race to delete the same weakref object. I'm not clear on what that means. By "delete the same weakref object", do you mean that both threads try to do del d[k] with the same k and the same weak-valued dict d? If so, then I think one of them *should* see KeyError, exactly the same as if they tried pulling this trick with a regular dict. > I'm sure there is a better way and will take another look tomorrow.
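The distinction Tim draws between ref(x) itself and calling its result can be shown in a few lines (again a sketch, not code from the thread; the class name Obj is invented):

```python
import gc
import weakref

class Obj:
    """A weakly referenceable object."""

x = Obj()
w = weakref.ref(x)   # ref() itself never returns None...
assert w is not None
assert w() is x      # ...and calling w returns the referent while alive

del x                # drop the last strong reference
gc.collect()
assert w() is None   # now calling w returns None

# ref() can, however, raise TypeError for non-weakly-referenceable types:
try:
    weakref.ref(1)
except TypeError:
    print("TypeError, as expected")
```

So ref(key) in __delitem__ either yields a usable lookup key or raises; it never silently produces None, which is the point Tim is making against Raymond's reading of the docs.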
Thanks for trying, but I still don't get it. It would help if you could show specific code that you believe worked correctly before but is broken now. I added two new tests showing what I believe to be code that was broken before but works now, and no changes to the existing tests were needed. From python@rcn.com Sun May 25 07:05:28 2003 From: python@rcn.com (Raymond Hettinger) Date: Sun, 25 May 2003 02:05:28 -0400 Subject: [Python-Dev] RE: [Python-checkins] python/dist/src/Lib weakref.py,1.19,1.20 References: Message-ID: <001a01c32283$a4ec0fe0$125ffea9@oemcomputer> [Raymondo] > > The original code does its contortions to avoid raising a KeyError > > whenever the dictionary entry might have disappeared due to the > > ref count falling to zero and then a new, equal key was formed later. [Timbot] > Sorry, I can't picture what you're trying to say. Show some code? Python 2.3b1 (#40, Apr 25 2003, 19:06:24) [MSC v.1200 32 bit (Intel)] on win32 Type "copyright", "credits" or "license" for more information. IDLE 0.8 -- press F1 for help >>> class C: pass >>> import weakref >>> wkd = weakref.WeakKeyDictionary() >>> del wkd[C()] >>> # No complaints Python 2.3b1+ (#40, May 23 2003, 00:08:36) [MSC v.1200 32 b Type "help", "copyright", "credits" or "license" for more i >>> class C: pass ... >>> import weakref >>> wkd = weakref.WeakKeyDictionary() >>> del wkd[C()] Traceback (most recent call last): File "", line 1, in ? File "C:\PY23\lib\weakref.py", line 167, in __delitem__ del self.data[ref(key)] KeyError: >>> # Complains now. [Raymond] > > If the data disappeared, then, I think ref(key) will return None [Timbot] > No, ref(x) never returns None, regardless of what x may be. It may raise > TypeError if x is not of a weakly referencable type, and it may raise > MemoryError if we don't have enough memory left to construct a weakref, but > those are the only things that can go wrong. [Current version of the docs] """ ref( object[, callback]) Return a weak reference to object. 
The original object can be retrieved by calling the reference object if the referent is still alive; if the referent is no longer alive, calling the reference object will cause None to be returned. """ Raymond Hettinger From tim_one@email.msn.com Sun May 25 07:29:02 2003 From: tim_one@email.msn.com (Tim Peters) Date: Sun, 25 May 2003 02:29:02 -0400 Subject: [Python-Dev] RE: [Python-checkins] python/dist/src/Lib weakref.py,1.19,1.20 In-Reply-To: <001a01c32283$a4ec0fe0$125ffea9@oemcomputer> Message-ID: [Raymondo] >>> The original code does its contortions to avoid raising a KeyError >>> whenever the dictionary entry might have disappeared due to the >>> ref count falling to zero and then a new, equal key was formed >>> later. [Timbot] >> Sorry, I can't picture what you're trying to say. Show some code? [Razor] > Python 2.3b1 (#40, Apr 25 2003, 19:06:24) [MSC v.1200 32 bit (Intel)] > on win32 Type "copyright", "credits" or "license" for more > information. IDLE 0.8 -- press F1 for help > >>> class C: pass > ... > >>> import weakref > >>> wkd = weakref.WeakKeyDictionary() > >>> del wkd[C()] > >>> # No complaints Right, and I call that a bug. One of the new tests I checked in does exactly that, BTW. As I said last time, the idea that trying to delete a key from a weak-keyed dict never raises KeyError was neither documented nor verified by a test, so there's no reason to believe it was anything other than a bug in the implementation of __delitem__. > Python 2.3b1+ (#40, May 23 2003, 00:08:36) [MSC v.1200 32 b > Type "help", "copyright", "credits" or "license" for more i > >>> class C: pass > ... > >>> import weakref > >>> wkd = weakref.WeakKeyDictionary() > >>> del wkd[C()] > Traceback (most recent call last): > File "", line 1, in ? > File "C:\PY23\lib\weakref.py", line 167, in __delitem__ > del self.data[ref(key)] > KeyError: > >>> # Complains now. Right, and that's intentional, and tested now too. 
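The consistency Tim argues for here — __delitem__ complaining the same way __getitem__ always has — can be sketched directly (modern Python, whose WeakKeyDictionary behaves like the patched version; the class name C mirrors the sessions above):

```python
import weakref

class C:
    pass

wkd = weakref.WeakKeyDictionary()

# Lookup and deletion of a never-inserted key now fail the same way,
# because both reduce to an operation on wkd.data[ref(key)]:
for op in (lambda: wkd[C()], lambda: wkd.__delitem__(C())):
    try:
        op()
    except KeyError:
        print("KeyError")
```

Both operations print KeyError, matching the "Complains now" session above rather than the silent pre-patch behavior.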
It's always been the case (and still is) that wkd[C()] raised KeyError too -- why should __delitem__, and only __delitem__, be exempt from complaining about a senseless operation? >>> If the data disappeared, then, I think ref(key) will return None >> No, ref(x) never returns None, regardless of what x may be. It may >> raise TypeError if x is not of a weakly referencable type, and it >> may raise MemoryError if we don't have enough memory left to >> construct a weakref, but those are the only things that can go wrong. > [Current version of the docs] > """ > ref( object[, callback]) > > Return a weak reference to object. The original object can be > retrieved by calling the reference object if the referent is still > alive; if the referent is no longer alive, calling the reference > object will cause None to be returned. """ That's what I said last time: w = ref(x) followed later by w() will return None, iff x has gone away in the meantime -- maybe that's what you're thinking of. Note that "calling the reference object" in the docs does not mean the call "ref(x)" itself, it means calling the object returned by ref(x) (what I named "w" in the quote just above). From python@rcn.com Sun May 25 07:35:17 2003 From: python@rcn.com (Raymond Hettinger) Date: Sun, 25 May 2003 02:35:17 -0400 Subject: [Python-Dev] RE: [Python-checkins] python/dist/src/Lib weakref.py,1.19,1.20 References: Message-ID: <002b01c32287$cef99ba0$125ffea9@oemcomputer> The old behavior for missing keys may have been a bug. Do you care about the previous behavior for deleting based on equality rather than equality *and* hash? Python 2.3b1 (#40, Apr 25 2003, 19:06:24) [MSC v.1200 32 bit (Intel)] on win32 Type "copyright", "credits" or "license" for more information. 
IDLE 0.8 -- press F1 for help >>> class One: def __eq__(self, other): return other == 1 def __hash__(self): return 1492 >>> import weakref >>> wkd = weakref.WeakKeyDictionary() >>> o = One() >>> wkd[o] = None >>> len(wkd) 1 >>> del wkd[1] >>> len(wkd) 0 Python 2.3b1+ (#40, May 23 2003, 00:08:36) [MSC v.1200 32 Type "help", "copyright", "credits" or "license" for more >>> class One: ... def __eq__(self, other): ... return other == 1 ... def __hash__(self): ... return 1492 ... >>> import weakref >>> wkd = weakref.WeakKeyDictionary() >>> o = One() >>> wkd[o] = None >>> len(wkd) 1 >>> del wkd[1] Traceback (most recent call last): File "", line 1, in ? File "C:\PY23\lib\weakref.py", line 167, in __delitem__ del self.data[ref(key)] TypeError: cannot create weak reference to 'int' object >>> len(wkd) 1 >>> Raymond Hettinger From tim_one@email.msn.com Sun May 25 07:48:25 2003 From: tim_one@email.msn.com (Tim Peters) Date: Sun, 25 May 2003 02:48:25 -0400 Subject: [Python-Dev] RE: [Python-checkins] python/dist/src/Lib weakref.py,1.19,1.20 In-Reply-To: <002b01c32287$cef99ba0$125ffea9@oemcomputer> Message-ID: [Raymond Hettinger] > The old behavior for missing keys may have been a bug. > Do you care about the previous behavior for deleting > based on equality rather than equality *and* hash? Nope, because it was neither documented nor tested, and was behavior unique to the WeakKeyDictionary flavor of dict -- no other flavor of dict works that way, and it was just an accident due to the __delitem__ implementation. Note too that it's a documented requirement of the mapping protocol that keys that compare equal must also return equal hash values. > Python 2.3b1 (#40, Apr 25 2003, 19:06:24) [MSC v.1200 32 bit (Intel)] > on win32 Type "copyright", "credits" or "license" for more > information. 
IDLE 0.8 -- press F1 for help > >>> class One: > def __eq__(self, other): > return other == 1 > def __hash__(self): > return 1492 > >>> import weakref > >>> wkd = weakref.WeakKeyDictionary() > >>> o = One() > >>> wkd[o] = None > >>> len(wkd) > 1 > >>> del wkd[1] > >>> len(wkd) > 0 Just a case of GIGO (garbage in, garbage out) to me. > Python 2.3b1+ (#40, May 23 2003, 00:08:36) [MSC v.1200 32 > Type "help", "copyright", "credits" or "license" for more > >>> class One: > .. def __eq__(self, other): > .. return other == 1 > .. def __hash__(self): > .. return 1492 > .. > >>> import weakref > >>> wkd = weakref.WeakKeyDictionary() > >>> o = One() > >>> wkd[o] = None > >>> len(wkd) > 1 > >>> del wkd[1] > Traceback (most recent call last): > File "", line 1, in ? > File "C:\PY23\lib\weakref.py", line 167, in __delitem__ > del self.data[ref(key)] > TypeError: cannot create weak reference to 'int' object > >>> len(wkd) > 1 > >>> As I said the first time , will succeed if and only if d.data has a key for a still-live object, and that key compares equal to k1; else it will raise KeyError (or maybe TypeError if k1 is a silly key to test in a weak-keyed dict, ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ or MemoryError if we run out of memory). That's what I believe it should do. I didn't add "and the hash codes are the same too" because that requirement is part of the mapping protocol. From python@rcn.com Sun May 25 07:46:41 2003 From: python@rcn.com (Raymond Hettinger) Date: Sun, 25 May 2003 02:46:41 -0400 Subject: [Python-Dev] RE: [Python-checkins] python/dist/src/Lib weakref.py,1.19,1.20 References: Message-ID: <003701c32289$66e42ec0$125ffea9@oemcomputer> Here's the rest of the last example: >>> class AltOne(One): ... def __hash__(self): ... return 1776 ... >>> del wkd[AltOne()] Traceback (most recent call last): File "", line 1, in ?
File "C:\PY23\lib\weakref.py", line 167, in __delitem__ del self.data[ref(key)] KeyError: > [Razor] Hmm, a new moniker is born ... [Tim] > > Note that "calling the reference object" in the docs does not mean the call > "ref(x)" itself, it means calling the object returned by ref(x) (what I > named "w" in the quote just above). Hmm, I read the docs just a little too quickly. Speed reading is not all it's cracked up to be. Raymond From python@rcn.com Sun May 25 07:52:23 2003 From: python@rcn.com (Raymond Hettinger) Date: Sun, 25 May 2003 02:52:23 -0400 Subject: [Python-Dev] RE: [Python-checkins] python/dist/src/Lib weakref.py,1.19,1.20 References: Message-ID: <000201c3228a$73b3cba0$125ffea9@oemcomputer> > As I said the first time , > > will succeed if and only if d.data has a key for a still-live object, > and that key compares equal to k1; else it will raise KeyError > (or maybe TypeError if k1 is a silly key to test in a weak-keyed dict, > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > or MemoryError if we run out of memory). That's what I believe it > should do. > > I didn't add "and the hash codes are the same too" because that requirement > is part of the mapping protocol. Okay, you've had a second review on the patch and backporting to 2.2.3 is reasonable. Please add a news item for the two changes in behavior. BTW, I wasn't trying to be difficult, I was starting from the presumption that Fred wasn't smoking dope when he put in that weird block of code.
Looks like the presumption was wrong ;-) Raymond From python@rcn.com Sun May 25 08:08:33 2003 From: python@rcn.com (Raymond Hettinger) Date: Sun, 25 May 2003 03:08:33 -0400 Subject: [Python-Dev] RE: [Python-checkins] python/dist/src/Lib weakref.py,1.19,1.20 References: <003701c32289$66e42ec0$125ffea9@oemcomputer> Message-ID: <000901c3228c$74f0a680$125ffea9@oemcomputer> Arghh. One more example always arises after going to bed. The original required only equality. The new version requires equality, hashability, *and* weak referencability. Python 2.3b1 (#40, Apr 25 2003, 19:06:24) [MSC v.1200 32 bit (Intel)] on win32 Type "copyright", "credits" or "license" for more information. IDLE 0.8 -- press F1 for help >>> import weakref >>> class One: def __eq__(self, other): return other == 1 def __hash__(self): return hash(1) >>> wkd = weakref.WeakKeyDictionary() >>> o = One() >>> wkd[o] = 1 >>> len(wkd) 1 >>> del wkd[1] >>> len(wkd) 0 Python 2.3b1+ (#40, May 23 2003, 00:08:36) [MSC v.1200 32 Type "help", "copyright", "credits" or "license" for more >>> class One: ... def __eq__(self,other): return other==1 ... def __hash__(self): return hash(1) ... >>> import weakref >>> wkd = weakref.WeakKeyDictionary() >>> o = One() >>> wkd[o] = 1 >>> len(wkd) 1 >>> del wkd[1] Traceback (most recent call last): File "", line 1, in ? File "C:\PY23\lib\weakref.py", line 167, in __delitem__ del self.data[ref(key)] TypeError: cannot create weak reference to 'int' object >>> len(wkd) 1 Raymond From skip@mojam.com Sun May 25 13:00:28 2003 From: skip@mojam.com (Skip Montanaro) Date: Sun, 25 May 2003 07:00:28 -0500 Subject: [Python-Dev] Weekly Python Bug/Patch Summary Message-ID: <200305251200.h4PC0Sg08651@manatee.mojam.com> Bug/Patch Summary ----------------- 372 open / 3658 total bugs (-30) 151 open / 2173 total patches (+12) New Bugs -------- IMAP4_SSL broken (2003-05-19) http://python.org/sf/739909 re.finditer() listed as new in 2.2.?
(2003-05-19) http://python.org/sf/740026 test/build-failures on FreeBSD stable/current (2003-05-19) http://python.org/sf/740234 Can't browse methods and Classes (2003-05-20) http://python.org/sf/740407 MacPython-OS9 distutils breaks on OSX (2003-05-20) http://python.org/sf/740424 HTMLParser -- possible bug in handle_comment (2003-05-21) http://python.org/sf/741029 Configure does NOT set properly *FLAGS for thread support (2003-05-21) http://python.org/sf/741307 test_long failure (2003-05-22) http://python.org/sf/741806 curses support on Python-2.3b1/Tru64Unix 5.1A (2003-05-22) http://python.org/sf/741843 Python crashes if recursively reloading modules (2003-05-23) http://python.org/sf/742342 WeakKeyDictionary __delitem__ uses iterkeys (2003-05-24) http://python.org/sf/742860 Memory fault on complex weakref/weakkeydict delete (2003-05-24) http://python.org/sf/742911 New Patches ----------- inspect.getargspec: None instead of () (2002-11-12) http://python.org/sf/637217 zlib.decompressobj under-described. 
(2002-11-18) http://python.org/sf/640236
Several objects don't decref tmp on failure in subtype_new (2003-03-14) http://python.org/sf/703666
strange warnings messages in interpreter (2003-03-14) http://python.org/sf/703779
build of html docs broken (liboptparse.tex) (2003-05-04) http://python.org/sf/732174
HP-UX support for unixccompiler.py (2003-05-20) http://python.org/sf/740301
add urldecode() method to urllib (2003-05-20) http://python.org/sf/740827
unicode "support" for shlex.py (2003-05-23) http://python.org/sf/742290
SocketServer timeout, zombies (2003-05-23) http://python.org/sf/742598
ast-branch: msvc project sync (2003-05-23) http://python.org/sf/742621
check for true in diffrent paths, -pthread support (2003-05-24) http://python.org/sf/742741
Ordering of handlers in urllib2 (2003-05-24) http://python.org/sf/742823

Closed Bugs
-----------

crash in shelve module (2001-03-13) http://python.org/sf/408271
maximum recursion limit exceeded (2.1) (2001-04-24) http://python.org/sf/418626
raw-unicode-escape codec fails roundtrip (2001-07-25) http://python.org/sf/444514
strange IRIX test_re/test_sre failure (2001-08-28) http://python.org/sf/456398
New httplib lacks documentation (2001-09-04) http://python.org/sf/458447
maximum recursion limit exceeded in match (2001-12-14) http://python.org/sf/493252
inconsistent behavior of __getslice__ (2002-05-24) http://python.org/sf/560064
Mixing framework and static Pythons (2002-06-19) http://python.org/sf/571343
inheriting from property and docstrings (2002-07-03) http://python.org/sf/576990
unittest.py, better error message (2002-07-30) http://python.org/sf/588825
installation errors (2002-08-07) http://python.org/sf/592161
test_nis test fails on TRU64 5.1 (2002-08-14) http://python.org/sf/594998
non greedy match bug (2002-08-30) http://python.org/sf/602444
cgitb tracebacks not accessible (2002-08-31) http://python.org/sf/602893
faster [None]*n or []*n (2002-09-04) http://python.org/sf/604716
Max recursion limit with "*?" pattern (2002-10-08) http://python.org/sf/620412
cStringIO().write TypeError (2002-11-08) http://python.org/sf/635814
inspect.getargspec: None instead of () (2002-11-12) http://python.org/sf/637217
zlib.decompressobj under-described. (2002-11-18) http://python.org/sf/640236
Poor error message for augmented assign (2002-11-26) http://python.org/sf/644345
gettext.py crash on bogus preamble (2002-12-24) http://python.org/sf/658233
BoundaryError: multipart message with no defined boundary (2003-01-14) http://python.org/sf/667931
bsddb doc error (2003-01-28) http://python.org/sf/676233
test_logging fails (2003-01-31) http://python.org/sf/678217
new.function() leads to segfault (2003-02-25) http://python.org/sf/692776
Python 2.3a2 Build fails on HP-UX11i (2003-02-27) http://python.org/sf/694431
Several objects don't decref tmp on failure in subtype_new (2003-03-14) http://python.org/sf/703666
strange warnings messages in interpreter (2003-03-14) http://python.org/sf/703779
Assertion failed, python aborts (2003-03-17) http://python.org/sf/705231
Error when using PyZipFile to create archive (2003-03-17) http://python.org/sf/705295
Minor nested scopes doc issues (2003-04-06) http://python.org/sf/716168
Uthread problem - Pipe left open (2003-04-08) http://python.org/sf/717614
Mac OS X painless compilation (2003-04-11) http://python.org/sf/719549
runtime_library_dirs broken under OS X (2003-04-17) http://python.org/sf/723495
email/quopriMIME.py exception on int (lstrip) (2003-04-20) http://python.org/sf/724621
Possible OSX module location bug (2003-04-21) http://python.org/sf/725026
comparing versions - one a float (2003-04-28) http://python.org/sf/729317
Lambda functions in list comprehensions (2003-05-08) http://python.org/sf/734869
FILEMODE not honoured (2003-05-09) http://python.org/sf/735274
Command line timeit.py sets sys.path badly (2003-05-09) http://python.org/sf/735293
csv.Sniffer docs need updating (2003-05-15) http://python.org/sf/738471
On Windows, os.listdir() throws incorrect exception (2003-05-15) http://python.org/sf/738617
array.insert and negative indices (2003-05-17) http://python.org/sf/739313

Closed Patches
--------------

Optional output streams for dis (2003-02-08) http://python.org/sf/683074
Add copyrange method to array. (2003-04-14) http://python.org/sf/721061

From andymac@bullseye.apana.org.au Sun May 25 01:35:18 2003
From: andymac@bullseye.apana.org.au (Andrew MacIntyre)
Date: Sun, 25 May 2003 10:35:18 +1000 (EST)
Subject: [Python-Dev] _sre changes
In-Reply-To: <20030524152612.GA22309@ibook.distro.conectiva>
References: <9CF2178A-8D13-11D7-A3D6-0030655234CE@cwi.nl> <3ECE3677.1030500@python.org> <20030524152612.GA22309@ibook.distro.conectiva>
Message-ID: <20030525102158.S40394@bullseye.apana.org.au>

On Sat, 24 May 2003, Gustavo Niemeyer wrote:

> to backport some of the fixes we have introduced in the regular
> expression engine in 2.3 to 2.2.3, or is it too late? We have a sf patch
> open about that, but I'd like to port only the changes that don't
> require major changes in the engine.

These sre changes are giving me fits on FreeBSD. The fix (recursion limit
down to 7500 for gcc 3.x) applied for 2.3b1 now needs to be extended to
gcc 2.95, and the limit for gcc 3.x lowered further - not a particularly
satisfactory outcome.

I have identified that the problem is not the compiler specifically, but
an interaction with FreeBSD's pthreads implementation (libc_r) -
./configure --without-threads produces an interpreter which survives
test_re with a recursion limit of 10000 regardless of compiler.

I'm still trying to frame a query to a FreeBSD forum about this.

--
Andrew I MacIntyre "These thoughts are mine alone..."
E-mail: andymac@bullseye.apana.org.au (pref) | Snail: PO Box 370
        andymac@pcug.org.au (alt)            |        Belconnen ACT 2616
Web:    http://www.andymac.org/              |        Australia

From mwh@python.net Sun May 25 17:27:22 2003
From: mwh@python.net (Michael Hudson)
Date: Sun, 25 May 2003 17:27:22 +0100
Subject: [Python-Dev] _sre changes
In-Reply-To: <20030525102158.S40394@bullseye.apana.org.au> (Andrew MacIntyre's message of "Sun, 25 May 2003 10:35:18 +1000 (EST)")
References: <9CF2178A-8D13-11D7-A3D6-0030655234CE@cwi.nl> <3ECE3677.1030500@python.org> <20030524152612.GA22309@ibook.distro.conectiva> <20030525102158.S40394@bullseye.apana.org.au>
Message-ID: <2md6i7gff9.fsf@starship.python.net>

Andrew MacIntyre writes:

> On Sat, 24 May 2003, Gustavo Niemeyer wrote:
>
>> to backport some of the fixes we have introduced in the regular
>> expression engine in 2.3 to 2.2.3, or is it too late? We have a sf patch
>> open about that, but I'd like to port only the changes that don't
>> require major changes in the engine.
>
> These sre changes are giving me fits on FreeBSD. The fix (recursion
> limit down to 7500 for gcc 3.x) applied for 2.3b1 now needs to be extended
> to gcc 2.95, and the limit for gcc 3.x lowered further - not a
> particularly satisfactory outcome.
>
> I have identified that the problem is not the compiler specifically, but
> an interaction with FreeBSD's pthreads implementation (libc_r) -
> ./configure --without-threads produces an interpreter which survives
> test_re with a recursion limit of 10000 regardless of compiler.

This is to be expected. If you run a threads-disabled Python with
ulimit -s you can recurse until you run out of VIRTUAL MEMORY! When
there are threads in the picture, things are significantly more complex...
(which is another way of stating that I don't understand it, but you can
understand that with multiple stacks you can't just say "here's a really
high address, work down from here"[1]).

Cheers,
M.

[1] or vice versa depending on architecture.

--
-Dr. Olin Shivers, Ph.D., Cranberry-Melon School of Cucumber Science
     -- seen in comp.lang.scheme

From tim_one@email.msn.com Sun May 25 18:10:42 2003
From: tim_one@email.msn.com (Tim Peters)
Date: Sun, 25 May 2003 13:10:42 -0400
Subject: [Python-Dev] RE: [Python-checkins] python/dist/src/Lib weakref.py,1.19,1.20
In-Reply-To: <000201c3228a$73b3cba0$125ffea9@oemcomputer>
Message-ID: 

[Raymond Hettinger]
> Okay, you've had a second review on the patch and backporting
> to 2.2.3 is reasonable. Please add a news item for the two changes
> in behavior.

I'll wait for Fred to get back. I'm not sure you've used weakrefs.

> BTW, I wasn't trying to be difficult, I was starting from the
> presumption that Fred wasn't smoking dope when he put in that weird
> block of code. Looks like the presumption was wrong ;-)

That's cool, I started from the same presumption, and am still not
entirely over it -- it was such an outrageously inefficient way to delete
a key that the suspicion still nags there was *some* reason for it (other
than really good dope).

From tim_one@email.msn.com Sun May 25 18:17:22 2003
From: tim_one@email.msn.com (Tim Peters)
Date: Sun, 25 May 2003 13:17:22 -0400
Subject: [Python-Dev] RE: [Python-checkins] python/dist/src/Lib weakref.py,1.19,1.20
In-Reply-To: <000901c3228c$74f0a680$125ffea9@oemcomputer>
Message-ID: 

[Raymond Hettinger]
> Arghh. One more example always arises after going to bed.
>
> The original required only equality. The new version requires
> equality, hashability, *and* weak referencability.

It always required (and still does) all three for __setitem__ and
__getitem__: key equality and hashability are required for all dicts, and
*of course* key weak referencability is required for a weak-keyed dict:
that's why it's called a weak-keyed dict <0.5 wink>. __delitem__ alone
used a bizarre algorithm, and indeed one that broke other normal dict
invariants, such as that del d[k] always deletes the same (key, value)
pair that d[k] = v would have replaced.
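[Editor's note: Tim's three requirements — equality, hashability, and weak referencability — and the del d[k] invariant are easy to see with `weakref.WeakKeyDictionary` in a current Python. A minimal illustration (mine, not from the thread):]

```python
import weakref

class Key:
    """A plain class: instances are hashable and weakly referenceable."""
    def __init__(self, name):
        self.name = name

d = weakref.WeakKeyDictionary()
k = Key("spam")
d[k] = 1              # __setitem__ needs equality, hashability, weak refs
assert d[k] == 1
del d[k]              # deletes exactly the pair that d[k] = v would replace
assert k not in d

try:                  # ints are hashable but NOT weakly referenceable
    d[42] = 2
except TypeError:
    pass              # rejected, as Tim's "all three" requirement predicts
```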
From Jack.Jansen@cwi.nl Sun May 25 21:55:11 2003
From: Jack.Jansen@cwi.nl (Jack Jansen)
Date: Sun, 25 May 2003 22:55:11 +0200
Subject: [Python-Dev] RELEASED Python 2.2.3c1
In-Reply-To: <3ECE3677.1030500@python.org>
Message-ID: <2D0200D5-8EF3-11D7-AA7E-000A27B19B96@cwi.nl>

On vrijdag, mei 23, 2003, at 16:55 Europe/Amsterdam, Barry Warsaw wrote:

>> Is there a chance I could get #723495 still in before 2.2.3 final? I
>> was also hoping to find a fix for #571343, but I don't have a patch
>> yet (although I'll try to get one up in the next few hours).
>
> I think it would be fine to get these into 2.2.3 final.

Okay, I'm done (as far as the unix distribution is concerned): 723495 is
checked in, and 571343 I've closed as "won't fix" because it turns out
that the trouble-case I expected is very unlikely to happen.

--
- Jack Jansen http://www.cwi.nl/~jack -
- If I can't dance I don't want to be part of your revolution -- Emma Goldman -

From gward@python.net Mon May 26 03:16:35 2003
From: gward@python.net (Greg Ward)
Date: Sun, 25 May 2003 22:16:35 -0400
Subject: [Python-Dev] Change to ossaudiodev setparameters() method
Message-ID: <20030526021635.GA15814@cthulhu.gerg.ca>

Currently, oss_audio_device objects have a setparameters() method with a
rather silly interface:

  oss.setparameters(sample_rate, sample_size, num_channels, format [, emulate])

This is silly because 1) 'sample_size' is implicit in 'format', and 2)
the implementation doesn't actually *use* sample_size for anything -- it
just checks that you have passed in the correct sample size, ie. if you
specify an 8-bit format, you must pass sample_size=8. (This is code
inherited from linuxaudiodev that I never got around to cleaning up.)

In addition to being silly, this is not the documented interface. The
docs don't mention the 'sample_size' argument at all. Presumably the doc
writer realized the silliness and was going to pester me to remove
'sample_size', but never got around to it. (Lot of that going around.)

So, even though we're in a beta cycle, am I allowed to change the code so
it's 1) sensible and 2) consistent with the documentation?

Greg
--
Greg Ward http://www.gerg.ca/
Sure, I'm paranoid... but am I paranoid ENOUGH?

From guido@python.org Mon May 26 07:39:59 2003
From: guido@python.org (Guido van Rossum)
Date: Mon, 26 May 2003 02:39:59 -0400
Subject: [Python-Dev] Change to ossaudiodev setparameters() method
In-Reply-To: "Your message of Sun, 25 May 2003 22:16:35 EDT." <20030526021635.GA15814@cthulhu.gerg.ca>
References: <20030526021635.GA15814@cthulhu.gerg.ca>
Message-ID: <200305260640.h4Q6dxm08507@pcp02138704pcs.reston01.va.comcast.net>

> Currently, oss_audio_device objects have a setparameters() method with a
> rather silly interface:
>
>   oss.setparameters(sample_rate, sample_size, num_channels, format [, emulate])
>
> This is silly because 1) 'sample_size' is implicit in 'format', and 2)
> the implementation doesn't actually *use* sample_size for anything -- it
> just checks that you have passed in the correct sample size, ie. if you
> specify an 8-bit format, you must pass sample_size=8. (This is code
> inherited from linuxaudiodev that I never got around to cleaning up.)
>
> In addition to being silly, this is not the documented interface. The
> docs don't mention the 'sample_size' argument at all. Presumably the
> doc writer realized the silliness and was going to pester me to remove
> 'sample_size', but never got around to it. (Lot of that going around.)
>
> So, even though we're in a beta cycle, am I allowed to change the code
> so it's 1) sensible and 2) consistent with the documentation?

Yes. I like silliness in a MP skit, but not in my APIs.
:-)

--Guido van Rossum (home page: http://www.python.org/~guido/)

From g9robjef@cdf.toronto.edu Mon May 26 17:51:05 2003
From: g9robjef@cdf.toronto.edu (Jeffery Roberts)
Date: Mon, 26 May 2003 12:51:05 -0400 (EDT)
Subject: [Python-Dev] Re: Introduction
In-Reply-To: <20030523143901.8660.97348.Mailman@mail.python.org>
References: <20030523143901.8660.97348.Mailman@mail.python.org>
Message-ID: 

Thanks for that reply Brett. It is really helpful. I'm currently in
Ottawa at the GCC summit trying to sponge some knowledge but I will begin
following your advice when I get back home later this week.

Thanks again !

Jeff

> I know I learned a lot from working on patches and bugs.
> It especially helps if you jump in on a patch that is being actively
> worked on and can ask how something works. Otherwise just read the
> source until your eyes bleed and curse anyone who doesn't write
> extensive documentation for code. =)
>
> There also has been mention of the AST branch. I know I plan on working
> on that after I finish going through the bug and patch backlog. Only
> trouble is that the guys who actually fully understand it (Jeremy, Tim,
> and Neal) are rather busy so it is going to be a "jump in the pool and
> drown and hope your flailing manages to at least generate something
> useful but you die and come back in another life wiser and able to
> attempt again until you stop drowning and manage to only get sick from
> gulping down so much chlorinated water". =)
>
>> Check out
>>
>> http://www.python.org/dev/
>>
>> for orientation, and leave your spare time at the door.
>
> I will vouch for the loss of spare time. This has become a job. Best
> job ever, though. =)
>
> The only big piece of advice I can offer is to just make sure you are
> nice and cordial on the list; there is a low tolerance for jerks here.
> Don't take this as meaning to not take a stand on an issue! All I am
> saying is realize that email does not transcribe humor perfectly and
> until the list gets used to your personal writing style you might have
> to just make sure what you write does not come off as insulting.
>
> -Brett
>
> --__--__--
>
> Message: 9
> Date: Thu, 22 May 2003 20:21:20 -0700
> From: "Brett C."
> Reply-To: drifty@alum.berkeley.edu
> To: Jeffery Roberts
> CC: Tim Peters , python-dev@python.org
> Subject: Re: [Python-Dev] Introduction
>
> Jeffery Roberts wrote:
>> Thanks for all of your replies. The front-end rewrite sounds especially
>> interesting. I'm going to look into that. Is the entire front end
>> changing (ie scan/parse/ast) or just the AST structure ?
>>
>> If you have any more information or directions please let me know.
>
> It is just a new AST. Redoing/replacing pgen is something else
> entirely. =)
>
> The branch that this is being developed under in CVS is ast-branch.
> There is an incomplete README in Python/compile.txt that explains the
> basic idea and direction.
>
> -Brett
>
> --__--__--
>
> Message: 10
> Date: Thu, 22 May 2003 23:30:45 -0400
> Cc: python-list@python.org, python-dev@python.org
> To: python-announce@python.org
> From: Barry Warsaw
> Subject: [Python-Dev] RELEASED Python 2.2.3c1
>
> I'm happy to announce the release of Python 2.2.3c1 (release candidate
> 1). This is a bug fix release for the stable Python 2.2 code line.
> Barring any critical issues, we expect to release Python 2.2.3 final by
> this time next week. We encourage those with an interest in a solid
> 2.2.3 release to download this candidate and test it on their code.
>
> The new release is available here:
>
>     http://www.python.org/2.2.3/
>
> Python 2.2.3 has a large number of bug fixes and memory leak patches.
> For full details, see the release notes at
>
>     http://www.python.org/2.2.3/NEWS.txt
>
> There are a small number of minor incompatibilities with Python 2.2.2;
> for details see:
>
>     http://www.python.org/2.2.3/bugs.html
>
> Perhaps the most important is that the Bastion.py and rexec.py modules
> have been disabled, since we do not deem them to be safe.
>
> As usual, a Windows installer and a Unix/Linux source tarball are made
> available, as well as tarballs of the documentation in various forms.
> At the moment, no Mac version or Linux RPMs are available, although I
> expect them to appear soon after 2.2.3 final is released.
>
> On behalf of Guido, I'd like to thank everyone who contributed to this
> release, and who continue to ensure Python's success.
>
> Enjoy,
> -Barry
>
> --__--__--
>
> Message: 11
> Date: Fri, 23 May 2003 13:42:20 +0200
> Subject: Re: [Python-Dev] RELEASED Python 2.2.3c1
> Cc: python-dev@python.org
> To: Barry Warsaw
> From: Jack Jansen
>
> On Friday, May 23, 2003, at 05:30 Europe/Amsterdam, Barry Warsaw wrote:
>
>> I'm happy to announce the release of Python 2.2.3c1 (release candidate
>> 1).
>
> Oops, that suddenly went *very* fast, I thought I had until the
> weekend...
>
> Is there a chance I could get #723495 still in before 2.2.3 final? I
> was also hoping to find a fix for #571343, but I don't have a patch yet
> (although I'll try to get one up in the next few hours).
> --
> Jack Jansen, , http://www.cwi.nl/~jack
> If I can't dance I don't want to be part of your revolution -- Emma
> Goldman
>
> --__--__--
>
> Message: 12
> Date: Fri, 23 May 2003 09:11:35 -0400
> From: Guido van Rossum
> Subject: Re: [Python-Dev] Of what use is commands.getstatus()
> To: skip@pobox.com
> Cc: python-dev@python.org
>
>> I was reading the docs for the commands module and noticed getstatus() seems
>> to be completely unrelated to getstatusoutput() and getoutput(). I thought,
>> "I'll correct the docs. They must be wrong." Then I looked at commands.py
>> and saw the docs are correct. It's the function definition which is weird.
>> Of what use is it to return 'ls -ld file'? Based on its name I would have
>> guessed its function was
>>
>>     def getstatus(cmd):
>>         """Return status of executing cmd in a shell."""
>>         return getstatusoutput(cmd)[0]
>>
>> This particular function dates from 1990, so it clearly can't just be
>> deleted, but it seems completely superfluous to me, especially given the
>> existence of os.stat, os.listdir, etc. Should it be deprecated or modified
>> to do (what I think is) the obvious thing?
>
> That whole module wasn't thought out very well. I recently tried to
> use it and found that the strip of the trailing \n on getoutput() is
> also a counterproductive feature. I suggest that someone should
> design a replacement, perhaps to live in shutil, and then we can
> deprecate it. Until then I would leave it alone. Certainly don't
> "fix" it by doing something incompatible.
>
> --Guido van Rossum (home page: http://www.python.org/~guido/)
>
> --__--__--
>
> Message: 13
> Date: Fri, 23 May 2003 10:06:05 -0400
> From: Guido van Rossum
> Subject: Re: [Python-Dev] Descriptor API
> To: Gonçalo Rodrigues
> Cc: python-dev@python.org
>
>> I was doing some tricks with metaclasses and descriptors in Python 2.2
>> and stumbled on the following:
>>
>> >>> class test(object):
>> ...     a = property(lambda: 1)
>> ...
>> >>> print test.a
>>
>> >>> print test.a.__set__
>>
>> >>> print test.a.fset
>> None
>>
>> What this means in practice, is that if I want to test if a
>> descriptor is read-only I have to have two tests: One for custom
>> descriptors, checking that getting __set__ does not barf and another
>> for property, checking that fset returns None.
>
> Why are you interested in knowing whether a descriptor is read-only?
>
>> So, why doesn't getting __set__ raise AttributeError in the above case?
>
> This is a feature. The presence of __set__ (even if it always raises
> AttributeError when *called*) signals this as a "data descriptor".
> The difference between data descriptors and others is that a data
> descriptor can not be overridden by putting something in the instance
> dict; a non-data descriptor can be overridden by assignment to an
> instance attribute, which will store a value in the instance dict.
>
> For example, a method is a non-data descriptor (and the prevailing
> example of such).
> This means that the following example works:
>
>     class C(object):
>         def meth(self): return 42
>
>     x = C()
>     x.meth() # prints 42
>     x.meth = lambda: 24
>     x.meth() # prints 24
>
>> Is this a bug? If it's not, it sure is a (minor) feature request
>> from my part :-)
>
> Because of the above explanation, the request cannot be granted.
>
> You can test the property's fset attribute however to tell whether a
> 'set' argument was passed to the constructor.
>
> --Guido van Rossum (home page: http://www.python.org/~guido/)
>
> --__--__--
>
> Message: 14
> From: Gonçalo Rodrigues
> To:
> Subject: Re: [Python-Dev] Descriptor API
> Date: Fri, 23 May 2003 15:34:14 +0100
>
> ----- Original Message -----
> From: "Guido van Rossum"
> To: "Gonçalo Rodrigues"
> Cc:
> Sent: Friday, May 23, 2003 3:06 PM
> Subject: Re: [Python-Dev] Descriptor API
>
>>> I was doing some tricks with metaclasses and descriptors in Python 2.2
>>> and stumbled on the following:
>>>
>>> >>> class test(object):
>>> ...     a = property(lambda: 1)
>>> ...
>>> >>> print test.a
>>>
>>> >>> print test.a.__set__
>>>
>>> >>> print test.a.fset
>>> None
>>>
>>> What this means in practice, is that if I want to test if a
>>> descriptor is read-only I have to have two tests: One for custom
>>> descriptors, checking that getting __set__ does not barf and another
>>> for property, checking that fset returns None.
>>
>> Why are you interested in knowing whether a descriptor is read-only?
>
> Introspection dealing with a metaclass that injected methods in its
> instances depending on a descriptor. In other words, having fun with
> Python's wacky tricks.
>
>>> So, why doesn't getting __set__ raise AttributeError in the above case?
>>
>> This is a feature. The presence of __set__ (even if it always raises
>> AttributeError when *called*) signals this as a "data descriptor".
>> The difference between data descriptors and others is that a data
>> descriptor can not be overridden by putting something in the instance
>> dict; a non-data descriptor can be overridden by assignment to an
>> instance attribute, which will store a value in the instance dict.
>>
>> For example, a method is a non-data descriptor (and the prevailing
>> example of such). This means that the following example works:
>>
>>     class C(object):
>>         def meth(self): return 42
>>
>>     x = C()
>>     x.meth() # prints 42
>>     x.meth = lambda: 24
>>     x.meth() # prints 24
>>
>>> Is this a bug? If it's not, it sure is a (minor) feature request
>>> from my part :-)
>>
>> Because of the above explanation, the request cannot be granted.
>
> Thanks for the reply (and also to P. Eby, btw). I was way off track when
> I sent the email, because it did not occur to me that property was a
> type implementing __get__ and __set__. With this piece of info
> connecting the dots the idea is just plain foolish.
>
>> You can test the property's fset attribute however to tell whether a
>> 'set' argument was passed to the constructor.
>>
>> --Guido van Rossum (home page: http://www.python.org/~guido/)
>
> With my best regards,
> G. Rodrigues
>
> --__--__--
>
> Message: 15
> Date: Fri, 23 May 2003 10:39:10 -0400
> From: Guido van Rossum
> Subject: Re: [Python-Dev] Need advice, maybe support
> To: Christian Tismer
> Cc: python-dev@python.org
>
>>> The other is the new style where the PyMethodDef
>>> array is in tp_methods, and is scanned once by PyType_Ready.
>>
>> Right, again. Now, under the hopeful assumption that every
>> sensible extension module that has some types to publish also
>> does this through its module dictionary, I would have the
>> opportunity to cause PyType_Ready being called early enough
>> to modify the method table, before any of its methods is used
>> at all.
>
> Dangerous assumption!
It's not inconceivable that a class would > instantiate some of its own classes as part of its module > initialization. > > > > 3rd party modules that have been around for a while are likely to use > > > Py_FindMethod. With Py_FindMethod you don't have a convenient way to > > > store the pointer to the converted table, so it may be better to > > > simply check your bit in the first array element and then cast to a > > > PyMethodDef or a PyMethodDefEx array based on what the bit says (you > > > can safely assume that all elements of an array are the same size :-)= =2E > > > > Hee hee, yeah. Of course, if there isn't a reliable way to > > intercept method table access before the first Py_FindMethod > > call, I could of course modify Py_FindMethod. For instance, > > a modified, new-style method table might be required to always > > start with a dummy entry, where the flags word is completely > > -1, to signal having been converted to new-style. > > Why so drastic? You could just set a reserved bit.f > > > ... > > > > >>If that approach is trustworthy, I also could drop > > >>the request for these 8 bits. > > > > > > Sure. Ah, a bit in the type would work just as well, and > > > Py_FindMethod *does* have access to the type. > > > > You think of the definition in methodobject.c, as it is > > > > """ > > /* Find a method in a single method list */ > > > > PyObject * > > Py_FindMethod(PyMethodDef *methods, PyObject *self, char *name) > > """ > > > > , assuming that self always is not NULL, but representing a valid > > object with a type, and this type is already referring to the > > methods table? > > Right. There is already code that uses self->ob_type in > Py_FindMethodInChain(), which is called by Py_FindMethod(). > > > Except for module objects, this seems to be right. I've run > > Python against a lot of Python modules, but none seems > > to call Py_FindMethod with a self parameter of NULL. > > I don't think it would be safe to do so. 
> > > If that is true, then I can patch a small couple of > > C functions to check for the new bit, and if it's not > > there, re-create the method table in place. > > This is music to me ears. But... > > > > Well, there is a drawback: > > I *do* need two bits, and I hope you will allow me to add this > > second bit, as well. > > > > The one, first bit, tells me if the source has been compiled > > with Stackless and its extension stuff. Nullo problemo. > > I can then in-place modify the method table in a compatible > > way, or leave it as it is, bny default. > > But then, this isn't sufficient to set this bit then, like an > > "everything is fine, now" relief. This is so, since this is *still* > > an old module, and while its type's method tables have been > > patched, the type is still not augmented by new slots, like > > the new tp_call_nr slots (and maybe a bazillion to come, soon). > > The drawback is, that I cannot simply replace the whole type > > object, since type objects are not represented as object > > pointers (like they are now, most of the time, in the dynamic > > heaptype case), but they are constant struct addresses, where > > the old C module might be referring to. > > > > So, what I think to need is no longer 9 bits, but two of them: > > One that says "everything great from the beginning", and another > > one that says "well, ok so far, but this is still an old object". > > > > I do think this is the complete story, now. > > Instead of requiring nine bits, I'm asking for two. > > But this is just *your options; I also can live with one bit, > > but then I have to add a special, invalid method table entry > > that just serves for this purpose. > > In order to keep my souce code hack to the minimum, I'd really > > like to ask for the two bits in the typeobject flags. > > OK, two bits you shall have. Don't spend them all at once! > > > Thanks so much for being so supportive -- chris > > Anything to keep ctual stackless support out of the core. 
> :-)
>
> --Guido van Rossum (home page: http://www.python.org/~guido/)
>
> --__--__--
>
> _______________________________________________
> Python-Dev mailing list
> Python-Dev@python.org
> http://mail.python.org/mailman/listinfo/python-dev
>
> End of Python-Dev Digest

From jacobs@penguin.theopalgroup.com Mon May 26 18:22:08 2003
From: jacobs@penguin.theopalgroup.com (Kevin Jacobs)
Date: Mon, 26 May 2003 13:22:08 -0400 (EDT)
Subject: [Python-Dev] RE: Decimal class
In-Reply-To: 
Message-ID: 

On Mon, 26 May 2003, Tim Peters wrote:

> [Kevin Jacobs]
>> Anyhow, the next big thing I want to do is to make Decimal instances
>> immutable like other Python numeric types, so they can be used as
>> hash keys, so common values can be re-used, and some of the code can
>> be simplified.
>
> Offhand I didn't see anything in the code that mutates any inputs, so I
> expect it's at worst close. But this kind of discussion should be in
> public, so others can jump in too (especially Eric!).

I agree 100%. Does anyone else have feelings for or against having
mutable Decimal instances? In the mean time, I will prepare a patch to do
this so we can evaluate the practical effects on the code.

-Kevin

--
-- Kevin Jacobs The OPAL Group - Enterprise Systems Architect
Voice: (216) 986-0710 x 19 E-mail: jacobs@theopalgroup.com
Fax: (216) 986-0714 WWW: http://www.theopalgroup.com

From tim_one@email.msn.com Mon May 26 18:36:23 2003
From: tim_one@email.msn.com (Tim Peters)
Date: Mon, 26 May 2003 13:36:23 -0400
Subject: [Python-Dev] RE: Decimal class
In-Reply-To: 
Message-ID: 

[Kevin Jacobs]
>>> Anyhow, the next big thing I want to do is to make Decimal instances
>>> immutable like other Python numeric types, so they can be used as
>>> hash keys, so common values can be re-used, and some of the code can
>>> be simplified.

[Tim]
>> Offhand I didn't see anything in the code that mutates any inputs,
>> so I expect it's at worst close. But this kind of discussion should
>> be in public, so others can jump in too (especially Eric!).

[Kevin]
> I agree 100%. Does anyone else have feelings for or against having
> mutable Decimal instances? In the mean time, I will prepare a patch
> to do this so we can evaluate the practical effects on the code.

Oh yes, they have to be immutable, meaning that no public API operation
mutates a Decimal in a user-visible way.

From aahz@pythoncraft.com Mon May 26 18:36:29 2003
From: aahz@pythoncraft.com (Aahz)
Date: Mon, 26 May 2003 13:36:29 -0400
Subject: [Python-Dev] RE: Decimal class
In-Reply-To: 
References: 
Message-ID: <20030526173629.GA27743@panix.com>

On Mon, May 26, 2003, Kevin Jacobs wrote:
>
> I agree 100%. Does anyone else have feelings for or against having
> mutable Decimal instances? In the mean time, I will prepare a patch
> to do this so we can evaluate the practical effects on the code.

I'm opposed to mutable Decimal instances because Uncle Timmy says so. ;-)

--
Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/

"In many ways, it's a dull language, borrowing solid old concepts from
many other languages & styles: boring syntax, unsurprising semantics,
few automatic coercions, etc etc. But that's one of the things I like
about it." --Tim Peters on Python, 16 Sep 93

From greg@cosc.canterbury.ac.nz Mon May 26 23:45:27 2003
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Tue, 27 May 2003 10:45:27 +1200 (NZST)
Subject: [Python-Dev] RE: Decimal class
In-Reply-To: 
Message-ID: <200305262245.h4QMjRc05319@oma.cosc.canterbury.ac.nz>

Kevin Jacobs :

> Does anyone else have feelings for or against having mutable Decimal
> instances?

Having mutable decimal instances would feel *very* strange to me, given
that all other numeric types in Python are immutable.

+1 on making them immutable.
Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From tim.one@comcast.net Tue May 27 00:13:17 2003 From: tim.one@comcast.net (Tim Peters) Date: Mon, 26 May 2003 19:13:17 -0400 Subject: [Python-Dev] RE: Decimal class In-Reply-To: <200305262245.h4QMjRc05319@oma.cosc.canterbury.ac.nz> Message-ID: [Greg Ewing] > Having mutable decimal instances would feel *very* strange > to me, given that all other numeric types in Python are > immutable. > > +1 on making them immutable. I don't believe there's any argument in favor of making them mutable. The question may arise because my old FixedPoint class had mutable instances. That was a mistake -- I wrote that class in an afternoon, and wasn't thinking when I added the .set_precision() method. If they're not immutable, they can't be used as dict keys, and that's a killer-strong argument all by itself. From r.vanputten@hexapole.com Tue May 27 09:28:52 2003 From: r.vanputten@hexapole.com (Rob van Putten) Date: Tue, 27 May 2003 10:28:52 +0200 Subject: [Python-Dev] install debian package Message-ID: <7JELK71URD0SPXU73UQ84KJC795KJNM.3ed321c4@rob_pc> Hi there, I am not sure if this is the right place for my comment but here it is; I tried to install python-dev on my new Debian woody system but it returned an error because it tried to remove the modutils package (probably because it was incompatible). The problem was solved after I upgraded the modutils package (2.4.21-2). I am no Debian package expert but it looks to me as if the modutils package should be upgraded from the python-dev package (if this is somehow possible of course :-) Hope this helps to improve the debian package.
Regards, Rob From gh@ghaering.de Tue May 27 09:54:27 2003 From: gh@ghaering.de (=?windows-1252?Q?Gerhard_H=E4ring?=) Date: Tue, 27 May 2003 10:54:27 +0200 Subject: [Python-Dev] install debian package In-Reply-To: <7JELK71URD0SPXU73UQ84KJC795KJNM.3ed321c4@rob_pc> References: <7JELK71URD0SPXU73UQ84KJC795KJNM.3ed321c4@rob_pc> Message-ID: <3ED327C3.7050608@ghaering.de> Rob van Putten wrote: > Hi there, > > I am not sure if this is the right place for my comment but here it is; It isn't the right place. This is the list for development of Python itself, not for the Debian package. > I tried to install python-dev on my new Debian woody system but it > returned an error because it tried to remove the modutils package > (probably because it was incompatible) [...] I'd suggest you contact either the Debian-Python mailing list (http://lists.debian.org/debian-python/), or the maintainer itself. Personally I didn't have this problem on Woody, btw. Or just report it to the Debian bugtracking system using for example 'reportbug'. -- Gerhard From skip@pobox.com Wed May 28 02:21:34 2003 From: skip@pobox.com (Skip Montanaro) Date: Tue, 27 May 2003 20:21:34 -0500 Subject: [Python-Dev] Python bug 544473 - bugfix candidate - was it applied? Message-ID: <16084.3870.666366.928341@montanaro.dyndns.org> Several times today I had a Queue object (Python 2.2.2) wind up deadlocked with its fsema locked but its queue full (apparently threads are waiting to put more items in the queue than it's supposed to hold). Looking back at the cvs log for the Queue module I see this message revision 1.15 date: 2002/04/19 00:11:31; author: mhammond; state: Exp; lines: +33 -14 Fix bug 544473 - "Queue module can deadlock". Use try/finally to ensure all Queue locks remain stable. Includes test case. Bugfix candidate. but no indication that was ever applied to the maint22 branch. 
I'm not suggesting that this bug fix will solve my problem (it's probably a bug in my code), but it seems that it should have been applied but wasn't. Should it be applied at this point or is 2.2.3 too close to release? Skip From guido@python.org Wed May 28 12:20:39 2003 From: guido@python.org (Guido van Rossum) Date: Wed, 28 May 2003 07:20:39 -0400 Subject: [Python-Dev] Python bug 544473 - bugfix candidate - was it applied? In-Reply-To: "Your message of Tue, 27 May 2003 20:21:34 CDT." <16084.3870.666366.928341@montanaro.dyndns.org> References: <16084.3870.666366.928341@montanaro.dyndns.org> Message-ID: <200305281120.h4SBKdQ11691@pcp02138704pcs.reston01.va.comcast.net> > Several times today I had a Queue object (Python 2.2.2) wind up deadlocked > with its fsema locked but its queue full (apparently threads are waiting to > put more items in the queue than it's supposed to hold). Looking back at > the cvs log for the Queue module I see this message > > revision 1.15 > date: 2002/04/19 00:11:31; author: mhammond; state: Exp; lines: +33 -14 > Fix bug 544473 - "Queue module can deadlock". > Use try/finally to ensure all Queue locks remain stable. > Includes test case. Bugfix candidate. > > but no indication that was ever applied to the maint22 branch. cvs log of the release22-maint branch shows it was applied. > I'm not suggesting that this bug fix will solve my problem (it's probably a > bug in my code), but it seems that it should have been applied but wasn't. > Should it be applied at this point or is 2.2.3 too close to release? Are you using the tip of the branch is the next question? --Guido van Rossum (home page: http://www.python.org/~guido/) From skip@pobox.com Wed May 28 13:52:25 2003 From: skip@pobox.com (Skip Montanaro) Date: Wed, 28 May 2003 07:52:25 -0500 Subject: [Python-Dev] Python bug 544473 - bugfix candidate - was it applied? 
In-Reply-To: <200305281120.h4SBKdQ11691@pcp02138704pcs.reston01.va.comcast.net> References: <16084.3870.666366.928341@montanaro.dyndns.org> <200305281120.h4SBKdQ11691@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <16084.45321.727469.548923@montanaro.dyndns.org> Guido> cvs log of the release22-maint branch shows it was applied. ... Guido> Are you using the tip of the branch is the next question? I guess I misunderstood how "cvs log" worked. Given "cvs log foo" I thought it would list the checkin comments for all versions and all branches of foo. I didn't think it mattered which version of the file I asked about. Skip From terry@wayforward.net Wed May 28 18:32:29 2003 From: terry@wayforward.net (Terence Way) Date: Wed, 28 May 2003 13:32:29 -0400 Subject: [Python-Dev] Introduction Message-ID: <5B3B5FCE-9132-11D7-BD8E-00039344A0EC@wayforward.net> I've been lurking for a bit, and now seems like a good time to introduce myself. * I build messaging systems for banks, earlier I was CTO of a dot-com. * I started programming on the TRS-80 and the RCA COSMAC VIP, later on the Apple ][. * I am a Java refugee (well, I might still code in Java for pay). * I'm into formal methods. Translation: I like *talking* about formal methods, but I never use them myself :-) I read somewhere that the best way to build big Python callouses was to write a PEP. Here goes: http://www.wayforward.net/pycontract/pep-0999.html Programming by Contract for Python... pre-conditions, post-conditions, invariants, with all the Eiffel goodness like weakening pre-conditions and strengthening invariants and post-conditions on inheritance, and access to old values. All from docstrings, like doctest. I'm also into handling insane numbers of incoming connections on cheap boxes: compare Jef Poskanzer's thttpd to Apache. 10000 simultaneous HTTP connections on a $400 computer just gets me giggling. Stackless Python intrigues me greatly for the same reason. I guess that's it for now... Cheers! 
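The "Eiffel goodness" mentioned above has precise semantics: an heir may only weaken a precondition, and Eiffel combines the heir's clause with the parent's using "require else", a logical OR. A small sketch of that rule, with `require_any` and the condition functions invented purely for illustration (not part of the PEP):

```python
def require_any(*conditions):
    # Eiffel's weakening rule: a call is legal if ANY applicable
    # precondition (the parent's or the heir's) holds.
    def deco(func):
        def wrapper(self, *args):
            if not any(cond(self, *args) for cond in conditions):
                raise AssertionError("no applicable precondition holds")
            return func(self, *args)
        return wrapper
    return deco

def parent_pre(self, n):   # the base class's contract
    return n > 0

def heir_pre(self, n):     # the heir's weaker addition
    return n == 0

class Heir(object):
    # Effective precondition: parent_pre OR heir_pre.
    @require_any(parent_pre, heir_pre)
    def accept(self, n):
        return n

h = Heir()
assert h.accept(5) == 5   # satisfies the parent's precondition
assert h.accept(0) == 0   # satisfies only the heir's weaker clause
```

The OR is why the heir's body can legitimately run even when the heir's own clause fails, so long as the parent's still holds; postconditions and invariants go the other way and are AND-ed.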
From pje@telecommunity.com Wed May 28 19:23:39 2003 From: pje@telecommunity.com (Phillip J. Eby) Date: Wed, 28 May 2003 14:23:39 -0400 Subject: [Python-Dev] Introduction In-Reply-To: <5B3B5FCE-9132-11D7-BD8E-00039344A0EC@wayforward.net> Message-ID: <5.1.1.6.0.20030528140810.0249d640@telecommunity.com> At 01:32 PM 5/28/03 -0400, Terence Way wrote: >I read somewhere that the best way to build big Python callouses was >to write a PEP. Guess I'll start helping you work on the callouses, then. :) > Here goes: > http://www.wayforward.net/pycontract/pep-0999.html Please don't number a pre-PEP; I believe PEP 1 recommends using 'XXX' until a PEP number has been assigned by the PEP editors. >Programming by Contract for Python... pre-conditions, post-conditions, >invariants, with all the Eiffel goodness like weakening pre-conditions >and strengthening invariants and post-conditions on inheritance, and >access to old values. All from docstrings, like doctest. A number of things aren't clear from your PEP. For example, how would syntax errors in assertions be handled? How is backward compatibility handled with existing docstrings that may use 'inv:' or 'pre:' to specify conditions informally? Are you proposing that this be part of Python's core syntax? If so, then why do it as docstrings? Are you proposing instead that your implementation be part of the standard library? If so, then where is the documentation for how a developer enables the behavior? Also, I didn't find the motivation section convincing. Your answer to "Why not have several different implementations, or let programmers implement their own assertions?" isn't actually a justification. If Alice uses some package to wrap her methods with checks, I can weaken the preconditions in a subclass by simply overriding the methods. If I can't do that, then it is a weakness of the DBC package Alice used, or of Alice's package, not a weakness of Python.
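The overriding Phillip describes needs no framework at all; a sketch with plain asserts (the queue classes are invented for illustration):

```python
class AlicesQueue(object):
    def put(self, item):
        # Alice's precondition: no None entries allowed.
        assert item is not None, "pre: item is not None"
        self.last = item

class BobsQueue(AlicesQueue):
    def put(self, item):
        # Bob weakens the precondition by overriding:
        # None is now accepted as a sentinel value.
        self.last = item

q = BobsQueue()
q.put(None)            # accepted; AlicesQueue.put would have refused it
assert q.last is None
```

Nothing stops Bob today; the open question in the thread is whether a shared mechanism should mediate how his weaker check composes with Alice's.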
From barry@python.org Wed May 28 19:46:13 2003 From: barry@python.org (Barry Warsaw) Date: 28 May 2003 14:46:13 -0400 Subject: [Python-Dev] Plans for Python 2.2.3 final Message-ID: <1054147573.10580.20.camel@barry> I've not heard about any showstoppers for Python 2.2.3. Just to let everyone know, I'd like to release it Some PM, this Friday night, EDT. I'll need to coordinate specifics with Fred and Tim, but expect a check-in freeze on the branch at some point Friday, with a release to follow shortly thereafter. -Barry From terry@wayforward.net Wed May 28 20:37:14 2003 From: terry@wayforward.net (Terence Way) Date: Wed, 28 May 2003 15:37:14 -0400 Subject: [Python-Dev] Introduction In-Reply-To: <5.1.1.6.0.20030528140810.0249d640@telecommunity.com> Message-ID: On Wednesday, May 28, 2003, at 02:23 PM, Phillip J. Eby wrote: > Please don't number a pre-PEP; I believe PEP 1 recomends using 'XXX' > until a PEP number has been assigned by the PEP editors. > Ack. Oops. I've sent it off to peps@python.org with the XXX, but posted here with the 999. > A number of things aren't clear from your PEP. For example, how would > syntax errors in assertions be handled? How is backward compatibility > with existing docstrings that may use 'inv:' or 'pre:' to specify > conditions informally? Um. No thought given to that. My first guess is: syntax errors printed to standard error, optionally silently ignored, no safety checks installed either way. Run-time errors trapped and re-raised as some kind of ContractViolation:: def read_stuff(input) """pre: input.readline""" would be valid, and the AttributeError would be wrapped inside a PreconditionViolationError if the ``input`` parameter isn't some type of input stream. > Are you proposing that this be part of Python's core syntax? If so, > then why do it as docstrings? Are you proposing instead that your > implementation be part of the standard library? If so, then where is > the documentation for how a developer enables the behavior? 
Proposing that some implementation, hopefully mine, be put in the standard library. I *really* don't think contracts should be part of the core syntax: contracts belong in the documentation, and changing all the doc tools to parse code looking for contract assertions is harder than building one or two docstring implementations. self.note(): where *is* the documentation on how to enable the behavior. > Also, I didn't find the motivation section convincing. Your answer to > "Why not have several different implementations, or let programmers > implement their own assertions?" isn't actually a justification. If > Alice uses some package to wrap her methods with checks, I can weaken > the preconditions in a subclass, by simply overriding the methods. If > I can't do that, then it is a weakness of the DBC package Alice used, > or of Alice's package, not a weakness of Python. Consider when Alice's preconditions work, but Bob's do not. Code that thinks it's calling Alice's code *must not* break when calling Bob's. Weakening pre-conditions means that Alice's pre-conditions must be tested as well: and Bob's code is run even if his pre-conditions fail. The converse is also true: code that understands Bob's pre-conditions must not fail even if Alice's pre-conditions fail. This is tough to do with asserts, or with incompatible contract packages. I haven't made that clear in the PEP or the samples, and it needs to be clear, because it is the /only/ reason why contracts need to be in the language/standard runtime. Excellent points, thanks for taking an interest. From tim@zope.com Wed May 28 21:03:59 2003 From: tim@zope.com (Tim Peters) Date: Wed, 28 May 2003 16:03:59 -0400 Subject: [Python-Dev] Plans for Python 2.2.3 final In-Reply-To: <1054147573.10580.20.camel@barry> Message-ID: [Barry] > I've not heard about any showstoppers for Python 2.2.3. 
Mike Fletcher submitted two weakref bugs, one in WeakValueDictionary which I fixed in 2.3 but am waiting to hear from Fred about before backporting, the other a segfault I think I traced to subtype_dealloc then assigned to Guido. The segfault should be a showstopper: http://www.python.org/sf/742911 From tim.one@comcast.net Wed May 28 21:08:48 2003 From: tim.one@comcast.net (Tim Peters) Date: Wed, 28 May 2003 16:08:48 -0400 Subject: [Python-Dev] Plans for Python 2.2.3 final In-Reply-To: Message-ID: [Tim] > Mike Fletcher submitted two weakref bugs, one in WeakValueDictionary Nope, WeakKeyDictionary. > which I fixed in 2.3 but am waiting to hear from Fred about before > backporting, the other a segfault I think I traced to subtype_dealloc > then assigned to Guido. The segfault should be a showstopper: > > http://www.python.org/sf/742911 No change there. From barry@python.org Wed May 28 21:22:00 2003 From: barry@python.org (Barry Warsaw) Date: 28 May 2003 16:22:00 -0400 Subject: [Python-Dev] Plans for Python 2.2.3 final In-Reply-To: References: Message-ID: <1054153320.12509.3.camel@barry> On Wed, 2003-05-28 at 16:08, Tim Peters wrote: > [Tim] > > Mike Fletcher submitted two weakref bugs, one in WeakValueDictionary > > Nope, WeakKeyDictionary. > > > which I fixed in 2.3 but am waiting to hear from Fred about before > > backporting, the other a segfault I think I traced to subtype_dealloc > > then assigned to Guido. The segfault should be a showstopper: > > > > http://www.python.org/sf/742911 > > No change there. Guido promised to look into the latter. We'll withhold lunch from Fred until he looks at the former. kung-pao-ish-ly y'rs, -Barry
From pjones@redhat.com Wed May 28 21:55:56 2003 From: pjones@redhat.com (Peter Jones) Date: Wed, 28 May 2003 16:55:56 -0400 (EDT) Subject: [Python-Dev] Introduction In-Reply-To: Message-ID: Hi, I'm Peter. Long time listener, first time caller. On Wed, 28 May 2003, Terence Way wrote: > > A number of things aren't clear from your PEP. For example, how would > > syntax errors in assertions be handled? How is backward compatibility > > with existing docstrings that may use 'inv:' or 'pre:' to specify > > conditions informally? > > Um. No thought given to that. My first guess is: syntax errors printed > to standard error, optionally silently ignored, no safety checks > installed either way. It seems like either of these methods of coping with legacy docstrings thwart your basic premises. Unless the well-formed nature of the contracts are enforced, it seems to be fairly difficult to e.g. randomly test a function. What if the docstring fails to parse?
I have to be listening to stderr to know that it didn't work. I then have to parse the message from stderr to figure out which function didn't work, and finally I have to somehow mark this function as not compliant, and ignore whatever results I get. It really seems like you want them either "on" or "off", not "on, but it might fail in some silent or hard to trap way". > > Are you proposing that this be part of Python's core syntax? If so, > > then why do it as docstrings? Are you proposing instead that your > > implementation be part of the standard library? If so, then where is > > the documentation for how a developer enables the behavior? > > Proposing that some implementation, hopefully mine, be put in the > standard library. I *really* don't think contracts should be part of > the core syntax: contracts belong in the documentation, and changing all > the doc tools to parse code looking for contract assertions is harder > than building one or two docstring implementations. The assertion that contracts don't belong in the core seems entirely separate from the discussion of their place in docstrings or in real code. That being said, you still haven't explained *why* contracts belong in docstrings (or in documentation in general). They are executable code; why not treat them as such? > self.note(): where *is* the documentation on how to enable the > behavior. I suspect we have to know this before we can know which way is easier. That being said, I really don't see how these contracts can be meaningful as part of a docstring without some better mechanism for handling old docstrings that have been ruled malformed. What's your reasoning against making them their own kind of block, like "try:"?
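For concreteness, the docstring route Terence favors can be sketched in a few lines. This is only an illustration, not the PEP's implementation: pull `pre:` lines out of `__doc__`, compile them once (so a malformed contract fails loudly at wrap time, which speaks to Peter's parsing worry), and wrap the function:

```python
import functools

def check_contracts(func):
    """Wrap func so that any 'pre: <expr>' lines in its docstring are
    evaluated against the call's arguments before the body runs.
    (Illustrative only; the real proposal has much richer syntax.)"""
    pres = []
    for line in (func.__doc__ or "").splitlines():
        line = line.strip()
        if line.startswith("pre:"):
            # compile() raises SyntaxError here, at wrapping time,
            # rather than letting a bad contract hide until runtime.
            pres.append(compile(line[4:].strip(), "<contract>", "eval"))

    @functools.wraps(func)
    def wrapper(*args, **kw):
        # Bind positional arguments to parameter names for eval().
        env = dict(zip(func.__code__.co_varnames, args))
        env.update(kw)
        for pre in pres:
            if not eval(pre, {}, env):
                raise AssertionError("precondition failed")
        return func(*args, **kw)
    return wrapper

@check_contracts
def divide(a, b):
    """Return a / b.

    pre: b != 0
    """
    return a / b

assert divide(6, 3) == 2
```

Peter's alternative, contracts as real code blocks, would get the same loud failure for free from the compiler; the tradeoff is that docstring contracts stay visible in `help()` output.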
-- Peter From patmiller@llnl.gov Wed May 28 22:01:24 2003 From: patmiller@llnl.gov (Pat Miller) Date: Wed, 28 May 2003 14:01:24 -0700 Subject: [Python-Dev] Introduction References: <5.1.1.6.0.20030528140810.0249d640@telecommunity.com> Message-ID: <3ED523A4.7030405@llnl.gov> > http://www.wayforward.net/pycontract/pep-0999.html I think another issue with using doc strings in this way is that you are overloading a feature visible to end users. If I look at the doc string then I would expect to be confused by the result:

>>> help(circbuf)
Help on class circbuf in module __main__:

class circbuf
 |  Methods defined here:
 |
 |  get(self)
 |      Pull an entry from a non-empty circular buffer.
 |
 |      pre: not self.is_empty()
 |      post[self.g, self.len]:
 |          __return__ == self.buf[__old__.self.g]
 |          self.len == __old__.self.len - 1
 |  ...

Way too cryptic even for me :-) I think you could get the same effect by overloading property so you could make methods "smart" about pre and post conditions. The following is a first quick hack at it...:

class eiffel(property):
    """eiffel(method,precondition,postcondition)

    Implement an Eiffel-style method that enforces pre- and
    post-conditions.  I guess you could turn this on and off
    if you wanted...

    class foo:
        def pre(self):
            assert self.x > 0
        def post(self):
            assert self.x > 0
        def increment(self):
            self.x += 1
            return
        increment = eiffel(increment, pre, post)
    """
    def __init__(self, method, precondition, postcondition, doc=None):
        self.method = method
        self.precondition = precondition
        self.postcondition = postcondition
        super(eiffel, self).__init__(self.__get, None, None, doc)
        return

    def __get(self, this):
        # Return a callable that checks the contract around the call.
        class funny_method:
            def __init__(self, this, method, precondition, postcondition):
                self.this = this
                self.method = method
                self.precondition = precondition
                self.postcondition = postcondition
                return
            def __call__(self, *args, **kw):
                self.precondition(self.this)
                value = self.method(self.this, *args, **kw)
                self.postcondition(self.this)
                return value
        return funny_method(this, self.method,
                            self.precondition, self.postcondition)

class circbuf:
    def __init__(self):
        self.stack = []
        return
    def _get_pre(self):
        assert not self.is_empty()
        return
    def _get_post(self):
        assert not self.is_empty()
        return
    def _get(self):
        """Pull an entry from a non-empty circular buffer."""
        val = self.stack[-1]
        del self.stack[-1]
        return val
    get = eiffel(_get, _get_pre, _get_post)
    def put(self, val):
        self.stack.append(val)
        return
    def is_empty(self):
        return len(self.stack) == 0

B = circbuf()
B.put('hello')
B.put('world')
print B.get()   # OK: the buffer is still non-empty afterwards
print B.get()   # Will bomb... the postcondition fails once the buffer empties

-- Patrick Miller | (925) 423-0309 | http://www.llnl.gov/CASC/people/pmiller If you think you can do a thing or think you can't do a thing, you're right. -- Henry Ford From pje@telecommunity.com Wed May 28 23:05:30 2003 From: pje@telecommunity.com (Phillip J. Eby) Date: Wed, 28 May 2003 18:05:30 -0400 Subject: [Python-Dev] Contracts PEP (was re: Introduction) In-Reply-To: References: <5.1.1.6.0.20030528140810.0249d640@telecommunity.com> Message-ID: <5.1.1.6.0.20030528174512.01eb5e50@telecommunity.com> At 03:37 PM 5/28/03 -0400, Terence Way wrote: >On Wednesday, May 28, 2003, at 02:23 PM, Phillip J. Eby wrote: >>Also, I didn't find the motivation section convincing.
Your answer to >>"Why not have several different implementations, or let programmers >>implement their own assertions?" isn't actually a justification. If >>Alice uses some package to wrap her methods with checks, I can weaken the >>preconditions in a subclass, by simply overriding the methods. If I >>can't do that, then it is a weakness of the DBC package Alice used, or of >>Alice's package, not a weakness of Python. >Consider when Alice's preconditions work, but Bob's do not. Code that >thinks it's calling Alice's code *must not* break when calling Bob's. Okay, you've completely lost me now, because I don't know what you mean by "work" in this context. Do you mean, "are met by the caller"? Or "are syntactically valid"? Or...? >Weakening pre-conditions means that Alice's pre-conditions must be >tested as well: and Bob's code is run even if his pre-conditions fail. Whaa? That can't be right. Weakening a precondition means that Bob's preconditions should *replace* Alice's preconditions, if Bob has supplied newer, weaker preconditions. Bob's code should *not* be run if Bob's preconditions are not met. Just to make sure we're not on completely different pages here, I'm thinking this: class AlicesClass: def something(self): """pre: foo and bar""" class BobsClass(AlicesClass): def something(self): """pre: foo""" That, to me, is weakening a precondition. Now, if what you're saying is that Bob's code must work if *Alice's* preconditions are met, then that's something different. What you're saying then, is that it's required that a precondition in a subclass be logically implied by each of the corresponding preconditions in the base classes. That is certainly a reasonable requirement, but I don't see why the language needs to enforce it, certainly not by running Bob's code even when Bob's precondition fails! If you're going to enforce it, it should be enforced by issuing an error for preconditions that aren't logically implied by their superclass preconditions. 
Then you actually get some benefit from the static checking. If you just run Bob's code, he has no way to notice that he's violating Alice's contract, until his code keeps breaking at runtime. (And then, he will almost certainly come to the conclusion that the contract checker is broken!) OTOH, if you accept Bob's precondition as he stated it, then he gets the behavior he asked for. If this is a violation of Alice's contract, Bob's users will either read the fact in his docs, or complain. >The converse is also true: code that understands Bob's pre-conditions >must not fail even if Alice's pre-conditions fail. This is tough to >do with asserts, or with incompatible contract packages. I still don't understand. If Bob has replaced Alice's method, what do her preconditions have to do with it any more? If Bob's code *calls* Alice's method, then the conditions of Alice's method presumably *do* need to apply for that upcall, or else she has written them without enough indirection. >I haven't made that clear in the PEP or the samples, and it needs to >be clear, because it is the /only/ reason why contracts need to be in >the language/standard runtime. Yep, and I'm still totally not seeing why Alice and Bob have to use the same mechanism. Alice could use method wrappers, Bob could use a metaclass, and Carol could use assert statements, as far as I can see, unless you are looking for static correctness checking. (In which case, docstrings are the wrong place for this.) From barry@python.org Wed May 28 23:09:50 2003 From: barry@python.org (Barry Warsaw) Date: 28 May 2003 18:09:50 -0400 Subject: [Python-Dev] Plans for Python 2.2.3 final In-Reply-To: References: Message-ID: <1054159790.4482.0.camel@geddy> On Wed, 2003-05-28 at 16:03, Tim Peters wrote: > [Barry] > > I've not heard about any showstoppers for Python 2.2.3. 
> > Mike Fletcher submitted two weakref bugs, one in WeakValueDictionary which I > fixed in 2.3 but am waiting to hear from Fred about before backporting, the > other a segfault I think I traced to subtype_dealloc then assigned to Guido. > The segfault should be a showstopper: > > http://www.python.org/sf/742911 Ok, I just spoke to Fred. He gives his seal of approval for the weakref backport. I'll do that, after testing the patches and backporting the tests. Guido's still going to look at the latter bug. -Barry From martin@v.loewis.de Thu May 29 02:03:53 2003 From: martin@v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=) Date: 29 May 2003 03:03:53 +0200 Subject: [Python-Dev] Python bug 544473 - bugfix candidate - was it applied? In-Reply-To: <16084.45321.727469.548923@montanaro.dyndns.org> References: <16084.3870.666366.928341@montanaro.dyndns.org> <200305281120.h4SBKdQ11691@pcp02138704pcs.reston01.va.comcast.net> <16084.45321.727469.548923@montanaro.dyndns.org> Message-ID: Skip Montanaro writes: > I guess I misunderstood how "cvs log" worked. Given "cvs log foo" I thought > it would list the checkin comments for all versions and all branches of foo. And indeed that's what it does. Look to the very end of the log. Regards, Martin From gward@python.net Thu May 29 02:32:29 2003 From: gward@python.net (Greg Ward) Date: Wed, 28 May 2003 21:32:29 -0400 Subject: [Python-Dev] Change to ossaudiodev setparameters() method In-Reply-To: <200305260640.h4Q6dxm08507@pcp02138704pcs.reston01.va.comcast.net> References: <20030526021635.GA15814@cthulhu.gerg.ca> <200305260640.h4Q6dxm08507@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <20030529013229.GA19091@cthulhu.gerg.ca> [me, on ossaudiodev.setparameters()] > In addition to being silly, this is not the documented interface. The > docs don't mention the 'sample_size' argument at all. Presumably the > doc writer realized the silliness and was going to pester me to remove > 'sample_size', but never got around to it. 
(Lot of that going around.) > > So, even though we're in a beta cycle, am I allowed to change the code > so it's 1) sensible and 2) consistent with the documentation? [Guido] > Yes. I like silliness in a MP skit, but not in my APIs. :-) OK, done. I've also beefed up the test script a bit. So, once again, if you have a Linux or FreeBSD system with working sound card, can you run ./python Lib/test/regrtest.py -uaudio test_ossaudiodev ...preferably before and after a "cvs up && make" to see if things are better, worse, or unchanged? Greg -- Greg Ward http://www.gerg.ca/ All the world's a stage and most of us are desperately unrehearsed. From guido@python.org Thu May 29 15:50:06 2003 From: guido@python.org (Guido van Rossum) Date: Thu, 29 May 2003 10:50:06 -0400 Subject: [Python-Dev] Change to ossaudiodev setparameters() method In-Reply-To: Your message of "Wed, 28 May 2003 21:32:29 EDT."
<20030529013229.GA19091@cthulhu.gerg.ca> References: <20030526021635.GA15814@cthulhu.gerg.ca> <200305260640.h4Q6dxm08507@pcp02138704pcs.reston01.va.comcast.net> <20030529013229.GA19091@cthulhu.gerg.ca> Message-ID: <200305291450.h4TEo6q15846@odiug.zope.com> > [me, on ossaudiodev.setparameters()] > > In addition to being silly, this is not the documented interface. The > > docs don't mention the 'sample_size' argument at all. Presumably the > > doc writer realized the silliness and was going to pester me to remove > > 'sample_size', but never got around to it. (Lot of that going around.) > > > > So, even though we're in a beta cycle, am I allowed to change the code > > so it's 1) sensible and 2) consistent with the documentation? > > [Guido] > > Yes. I like silliness in a MP skit, but not in my APIs. :-) > > OK, done. I've also beefed up the test script a bit. So, once again, > if you have a Linux or FreeBSD system with working sound card, can you > run > > ./python Lib/test/regrtest.py -uaudio test_ossaudiodev > > ...preferably before and after a "cvs up && make" to see if things are > better, worse, or unchanged? Did you check in the changes to ossaudiodev? A cvs update gave me new test files:

P Lib/test/test_ossaudiodev.py
P Lib/test/output/test_ossaudiodev

but no new C code, and now I get this error when I run the above test:

$ ./python ../Lib/test/regrtest.py -uaudio test_ossaudiodev
test_ossaudiodev
test test_ossaudiodev crashed -- exceptions.TypeError: setparameters() takes at least 4 arguments (3 given)
1 test failed:
    test_ossaudiodev
$

Before the cvs update, the test produced some audio and then hung; when I interrupted, here's the traceback:

Traceback (most recent call last):
  File "../Lib/test/regrtest.py", line 974, in ?
    main()
  File "../Lib/test/regrtest.py", line 264, in main
    ok = runtest(test, generate, verbose, quiet, testdir)
  File "../Lib/test/regrtest.py", line 394, in runtest
    the_package = __import__(abstest, globals(), locals(), [])
  File "/mnt/home/guido/projects/trunk/Lib/test/test_ossaudiodev.py", line 96, in ?
    test()
  File "/mnt/home/guido/projects/trunk/Lib/test/test_ossaudiodev.py", line 93, in test
    play_sound_file(data, rate, ssize, nchannels)
  File "/mnt/home/guido/projects/trunk/Lib/test/test_ossaudiodev.py", line 56, in play_sound_file
    a.write(data)

--Guido van Rossum (home page: http://www.python.org/~guido/) From terry@wayforward.net Thu May 29 15:53:16 2003 From: terry@wayforward.net (Terence Way) Date: Thu, 29 May 2003 10:53:16 -0400 Subject: [Python-Dev] Introduction In-Reply-To: Message-ID: <47C92342-91E5-11D7-BD8E-00039344A0EC@wayforward.net> On Wednesday, May 28, 2003, at 04:55 PM, Peter Jones wrote: > What if the docstring fails to parse? I have to be listening to > stderr to > know that it didn't work. I then have to parse the message from > stderr to > figure out which function didn't work, and finally I have to somehow > mark > this function as not compliant, and ignore whatever results I get. > I probably would have figured that out too, eventually... :-) More on this further down, when I talk about how to enable docstring testing. > That being said, you still haven't explained *why* contracts belong in > docstrings (or in documentation in general). They are executable code; > why not treat them as such? > Okay, the whole docstring vs syntax thing, and I'm going to quote liberally from Bertrand Meyer's Object Oriented Software Construction, 1st edition, 7.9 Using Assertions. There are four main reasons for adding contracts to code:

"""
* Help in writing correct software.
* Documentation aid.
* Debugging tool.
* Support for software fault tolerance.
[...]
The second use is essential in the production of reusable software elements and, more generally, in organizing the interfaces of modules in large software systems. Preconditions, postconditions, and class invariants provide potential clients of a module with crucial information about the services offered by the module, expressed in a concise and precise form. No amount of verbose documentation can replace a set of carefully expressed assertions. """ I really like Extreme Programming's cut-to-the-bone approach: there are only two things worth knowing about the code: *what* it does and *how* it does it. In XP, what the code does can be inferred from test cases; how it does it from the source code. And if you can't read the code, you have no business talking about how the software does what it does anyway. With contracts, I want to move the knowledge of *what* the code does from the test cases back into the programming documentation. It is merely a bonus feature that this documentation can be executed. When I was learning Python (um, not too long ago) the epiphany of what this language was all about hit me when I saw the 'doctest' module. We're *always* using examples as clear, concise ways to describe what our code does, but we're all guilty of letting those examples get out-of-date. Doctest can crawl into our software deep enough to keep us honest about our documentation. Contracts extend this so it's not just about the basic sample cases, but about the entire state space that a function supports... "Here be dragons" but over there be heap-based priority queues. >> self.note(): where *is* the documentation on how to enable the >> behavior. > > I suspect we have to know this before we can know which way is easier. 
>

Now that I've come out as a doctest fanboy, it should be no surprise
that contracts are enabled like this:

    import contracts, mymodule
    contracts.checkmod(mymodule)

The checkmod side effect is that all functions within mymodule are
replaced by auto-generated checking functions.

And now I think I'm clear in my own mind about backwards-compatibility
with informal 'pre:' docstrings... a programmer doesn't run checkmod
unless she's sure that all docstring contracts are valid.  Syntax error
exceptions will be passed through to the checkmod caller.

Cheers!

From terry@wayforward.net Thu May 29 19:26:36 2003
From: terry@wayforward.net (Terence Way)
Date: Thu, 29 May 2003 14:26:36 -0400
Subject: [Python-Dev] Contracts PEP (was re: Introduction)
In-Reply-To: <5.1.1.6.0.20030528174512.01eb5e50@telecommunity.com>
Message-ID: <151E799C-9203-11D7-BD8E-00039344A0EC@wayforward.net>

On Wednesday, May 28, 2003, at 06:05 PM, Phillip J. Eby wrote:

> ... I'm still totally not seeing why Alice and Bob have to use the
> same mechanism.  Alice could use method wrappers, Bob could use a
> metaclass, and Carol could use assert statements, as far as I can see,
> unless you are looking for static correctness checking.  (In which
> case, docstrings are the wrong place for this.)

Here is the full behavior (all quotes are straight from Bertrand
Meyer's Object Oriented Software Construction, 11.1 Inheritance and
Assertions):

"""
Parents' invariant rule: The invariants of all the parents of a class
apply to the class itself.  The parents' invariants are considered to
be added to the class's own invariant, "addition" being here a logical
*and*.
"""

Having a single contract implementation means that Bob's overriding
class can check Alice's invariants, even if none of Alice's methods are
actually called.

"""
Assertion redefinition rule: Let r be a routine in class A and s a
redefinition of r in a descendant of A, or an effective definition of r
if r was deferred.
Then pre(s) must be weaker than or equal to pre(r), and post(s) must be
stronger than or equal to post(r)
"""

Having a single contract implementation means that Bob's overriding
methods' postconditions check Alice's postconditions, even if none of
Alice's methods are actually called.

I hope I've at least convinced you that it would be nice to have a
single implementation to support 'inv:' and 'post:' with inheritance.
Now on to those irritating pre-conditions.

> That, to me, is weakening a precondition.  Now, if what you're saying
> is that Bob's code must work if *Alice's* preconditions are met, then
> that's something different.  What you're saying then, is that it's
> required that a precondition in a subclass be logically implied by
> each of the corresponding preconditions in the base classes.
>
> That is certainly a reasonable requirement, but I don't see why the
> language needs to enforce it, certainly not by running Bob's code even
> when Bob's precondition fails!  If you're going to enforce it, it
> should be enforced by issuing an error for preconditions that aren't
> logically implied by their superclass preconditions.  Then you
> actually get some benefit from the static checking.  If you just run
> Bob's code, he has no way to notice that he's violating Alice's
> contract, until his code keeps breaking at runtime.  (And then, he
> will almost certainly come to the conclusion that the contract checker
> is broken!)

This is especially irritating because what you're asking for is exactly
what my implementation was doing three weeks ago.  I *agree* with you.
There seem to be two opposing groups:

  Academics: Pre-conditions are ORed!  Liskov Substitution Principle!
  Programmers: this is a debugging tool!  Tell me when I mess up!

I admit, I'm doing something different by supporting OR pre-conditions.
Meyer again:

"""
So the require and ensure clause must always be given for a routine,
even if it is a redefinition, and even if these clauses are identical
to their antecedents in the original.
"""

Well, this is error-prone and wrong for postconditions.  It's not an
issue to just AND a method's post()s with all overridden post()s, we've
covered that earlier.  It's only those pesky preconditions.

Summary: I agree with your point... pre-conditions should only be
checked on a method call for the pre-conditions of the method itself.
Overridden methods' preconditions are ignored.  However, this still
means some communication between super-class and overridden class is
necessary.  Contract invariants and postconditions of overridden
classes/methods still need to be checked.

Cheers!

From fdrake@acm.org Thu May 29 19:46:12 2003
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Thu, 29 May 2003 14:46:12 -0400
Subject: [Python-Dev] Python 2.2.3 docs freeze
Message-ID: <16086.21876.190793.365508@grendel.zope.com>

I'm going to generate the Python 2.2.3 documentation packages now, so
please no more checkins in the Doc/ tree on the release22-maint branch.
Thanks!

  -Fred

--
Fred L. Drake, Jr.
PythonLabs at Zope Corporation

From jack@performancedrivers.com Thu May 29 20:14:21 2003
From: jack@performancedrivers.com (Jack Diederich)
Date: Thu, 29 May 2003 15:14:21 -0400
Subject: [Python-Dev] Release question
Message-ID: <20030529151421.G1276@localhost.localdomain>

The PEPs are pretty thorough about how to announce and build releases.
My question is about cvs, branching, and feature freezes.  I joined
python-dev after the 2.2 release so I haven't been around for a 'round
number' release yet.  I'm guessing 2.3 is in bugfix-only mode.  When is
2.4 tagged, and what is the timeframe on that?  (the linux kernel
generally waits a while before starting the next dev branch).  Assume I
asked intelligent questions about related things and please answer them
too *wink*.
Thanks,
-jack

From scrosby@cs.rice.edu Thu May 29 21:33:12 2003
From: scrosby@cs.rice.edu (Scott A Crosby)
Date: 29 May 2003 15:33:12 -0500
Subject: [Python-Dev] Algoritmic Complexity Attack on Python
Message-ID: 

Hello.  We have analyzed this software to determine its vulnerability
to a new class of DoS attacks related to a recent paper, ''Denial of
Service via Algorithmic Complexity Attacks.''

This paper discusses a new class of denial of service attacks that work
by exploiting the difference between average-case performance and
worst-case performance.  In an adversarial environment, the data
structures used by an application may be forced to experience their
worst-case performance.  For instance, hash tables are usually thought
of as constant-time operations, but with large numbers of collisions
they degrade to a linked list and may lead to a 100-10,000 times
performance degradation.  Because of the widespread use of hash tables,
the potential for attack is extremely widespread.

Fortunately, in many cases, other limits on the system bound the impact
of these attacks.  To be attackable, an application must have a
deterministic or predictable hash function and accept untrusted input.
In general, for the attack to be significant, the application must be
willing and able to accept hundreds to tens of thousands of 'attack
inputs'.  Because of that requirement, it is difficult to judge the
impact of these attacks without knowing the source code extremely well,
and knowing all the ways in which a program is used.

As part of this project, I have examined Python 2.3b1, and the hash
function 'string_hash' is deterministic.  Thus any script that hashes
untrusted input may be vulnerable to our attack.  Furthermore, the
structure of the hash function allows our fast collision generation
algorithm to work.  This means that any script written in Python that
hashes a large number of keys from an untrusted source is potentially
subject to a severe performance degradation.
Depending on the application or script, this could be a critical DoS.

The solution for these attacks on hash tables is to make the hash
function unpredictable via a technique known as universal hashing.
Universal hashing is a keyed hash function where, based on the key, one
of a large set of hash functions is chosen.  When benchmarking, we
observe that for short or medium length inputs, it is comparable in
performance to simple predictable hash functions such as the ones in
Python or Perl.  Our paper has graphs and charts of our benchmarked
performance.

I highly advise using a universal hashing library, either our own or
someone else's.  As has historically been seen, it is very easy to make
silly mistakes when attempting to implement your own 'secure'
algorithm.

The abstract, paper, and a library implementing universal hashing are
available at http://www.cs.rice.edu/~scrosby/hash/.

Scott

From python@rcn.com Thu May 29 21:55:35 2003
From: python@rcn.com (Raymond Hettinger)
Date: Thu, 29 May 2003 16:55:35 -0400
Subject: [Python-Dev] Algoritmic Complexity Attack on Python
References: 
Message-ID: <006f01c32624$a8c0ed80$125ffea9@oemcomputer>

> For instance, hash tables are usually thought of as being constant
> time operations, but with large numbers of collisions will degrade to
> a linked list and may lead to a 100-10,000 times performance
> degradation.

True enough.  And it's not hard to create tons of keys that will
collide (Uncle Tim even gives an example in the source for those who
care to read).  Going from O(1) to O(n) for each insertion would be a
bit painful during the process of building up a large dictionary.

So, did your research show a prevalence of or even existence of online
applications that allow someone to submit high volumes of meaningless
keys to be saved in a hash table?
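The O(1)-to-O(n) degradation Raymond describes is easy to demonstrate even without Scott's attack file, by giving every key the same hash code. This is a hypothetical sketch in modern Python (not the attack from the paper, which finds real colliding strings for the fixed string hash):

```python
# Hypothetical sketch: force worst-case dict behavior by making every
# key hash to the same value, so each insertion walks one long
# collision chain (O(n) per insert, O(n^2) to build the dict).
import time

class Colliding:
    """A key whose hash is constant, so all instances collide."""
    def __init__(self, n):
        self.n = n
    def __hash__(self):
        return 42                  # every key probes the same slots
    def __eq__(self, other):
        return isinstance(other, Colliding) and self.n == other.n

def build(keys):
    t0 = time.time()
    d = {k: 1 for k in keys}
    return d, time.time() - t0

d_bad, t_bad = build([Colliding(i) for i in range(1000)])
d_ok, t_ok = build(list(range(1000)))      # well-distributed int keys
print(len(d_bad), len(d_ok))               # both dicts hold 1000 entries
print(t_bad > t_ok)                        # the colliding build is far slower
```

The attack in the paper achieves the same chain-walking behavior with plain strings, which is what makes it dangerous: the victim never sees a suspicious key type, only slow input.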
Raymond Hettinger

From jeremy@zope.com Thu May 29 22:01:19 2003
From: jeremy@zope.com (Jeremy Hylton)
Date: 29 May 2003 17:01:19 -0400
Subject: [Python-Dev] Algoritmic Complexity Attack on Python
In-Reply-To: 
References: 
Message-ID: <1054242079.6832.26.camel@slothrop.zope.com>

Scott,

I just took a minute to look at this.  I downloaded the python-attack
file from your Web site.  I loaded all the strings and then inserted
them into a dictionary.  I also generated a list of 10,000 random
strings and inserted them into a dictionary.

The script is below.  The results show that inserting the
python-attack strings is about 4 times slower than inserting random
strings.

slothrop:~/src/python/dist/src/build> ./python ~/attack.py ~/python-attack
time 0.0898009538651
size 10000
slothrop:~/src/python/dist/src/build> ./python ~/attack.py ~/simple
time 0.0229719877243
size 10000

Jeremy

import time

def main(path):
    L = [l.strip() for l in open(path)]
    d = {}
    t0 = time.time()
    for k in L:
        d[k] = 1
    t1 = time.time()
    print "time", t1 - t0
    print "size", len(d)

if __name__ == "__main__":
    import sys
    main(sys.argv[1])

From scrosby@cs.rice.edu Thu May 29 22:10:04 2003
From: scrosby@cs.rice.edu (Scott A Crosby)
Date: 29 May 2003 16:10:04 -0500
Subject: [Python-Dev] Algoritmic Complexity Attack on Python
In-Reply-To: <1054242079.6832.26.camel@slothrop.zope.com>
References: <1054242079.6832.26.camel@slothrop.zope.com>
Message-ID: 

On 29 May 2003 17:01:19 -0400, Jeremy Hylton writes:

> Scott,
>
> I just took a minute to look at this.  I downloaded the python-attack
> file from your Web site.  I loaded all the strings and then inserted
> them into a dictionary.  I also generated a list of 10,000 random
> strings and inserted them into a dictionary.

Ok.  It should have taken almost a minute instead of .08 seconds in the
attack version.  My file is broken.  I'll be constructing a new one
later this evening.  If you test perl with the perl files, you'll see
what should have occurred in this case.
> The script is below. Thank you. Scott From scrosby@cs.rice.edu Thu May 29 22:23:24 2003 From: scrosby@cs.rice.edu (Scott A Crosby) Date: 29 May 2003 16:23:24 -0500 Subject: [Python-Dev] Algoritmic Complexity Attack on Python In-Reply-To: <006f01c32624$a8c0ed80$125ffea9@oemcomputer> References: <006f01c32624$a8c0ed80$125ffea9@oemcomputer> Message-ID: On Thu, 29 May 2003 16:55:35 -0400, "Raymond Hettinger" writes: > So, did your research show a prevalence of or even existence of > online applications that allow someone to submit high volumes of > meaningless keys to be saved in a hash table? I am not a python guru and We weren't looking for specific applications, so I wouldn't know. Scott From gward@python.net Thu May 29 22:56:52 2003 From: gward@python.net (Greg Ward) Date: Thu, 29 May 2003 17:56:52 -0400 Subject: [Python-Dev] Change to ossaudiodev setparameters() method In-Reply-To: <200305291450.h4TEo6q15846@odiug.zope.com> References: <20030526021635.GA15814@cthulhu.gerg.ca> <200305260640.h4Q6dxm08507@pcp02138704pcs.reston01.va.comcast.net> <20030529013229.GA19091@cthulhu.gerg.ca> <200305291450.h4TEo6q15846@odiug.zope.com> Message-ID: <20030529215652.GB28065@cthulhu.gerg.ca> On 29 May 2003, Guido van Rossum said: > Did you check in the changes to ossaudiodev? Oops! I did now -- thanks. Please try again. Greg -- Greg Ward http://www.gerg.ca/ I just read that 50% of the population has below median IQ! From guido@python.org Thu May 29 23:29:39 2003 From: guido@python.org (Guido van Rossum) Date: Thu, 29 May 2003 18:29:39 -0400 Subject: [Python-Dev] Change to ossaudiodev setparameters() method In-Reply-To: Your message of "Thu, 29 May 2003 17:56:52 EDT." 
<20030529215652.GB28065@cthulhu.gerg.ca>
References: <20030526021635.GA15814@cthulhu.gerg.ca>
 <200305260640.h4Q6dxm08507@pcp02138704pcs.reston01.va.comcast.net>
 <20030529013229.GA19091@cthulhu.gerg.ca>
 <200305291450.h4TEo6q15846@odiug.zope.com>
 <20030529215652.GB28065@cthulhu.gerg.ca>
Message-ID: <200305292229.h4TMTdU19567@odiug.zope.com>

> > Did you check in the changes to ossaudiodev?
>
> Oops!  I did now -- thanks.  Please try again.

Alas, no change.  Still some squeaks from the speaker followed by a
hanging process:

Traceback (most recent call last):
  File "../Lib/test/regrtest.py", line 974, in ?
    main()
  File "../Lib/test/regrtest.py", line 264, in main
    ok = runtest(test, generate, verbose, quiet, testdir)
  File "../Lib/test/regrtest.py", line 394, in runtest
    the_package = __import__(abstest, globals(), locals(), [])
  File "/mnt/home/guido/projects/trunk/Lib/test/test_ossaudiodev.py", line 119, in ?
    test()
  File "/mnt/home/guido/projects/trunk/Lib/test/test_ossaudiodev.py", line 116, in test
    play_sound_file(data, rate, ssize, nchannels)
  File "/mnt/home/guido/projects/trunk/Lib/test/test_ossaudiodev.py", line 58, in play_sound_file
    dsp.write(data)
KeyboardInterrupt

--Guido van Rossum (home page: http://www.python.org/~guido/)

From tim@zope.com Thu May 29 23:31:56 2003
From: tim@zope.com (Tim Peters)
Date: Thu, 29 May 2003 18:31:56 -0400
Subject: [Python-Dev] Algoritmic Complexity Attack on Python
In-Reply-To: 
Message-ID: 

[Jeremy Hylton]
>> I just took a minute to look at this.  I downloaded the python-attack
>> file from your Web site.  I loaded all the strings and then inserted
>> them into a dictionary.  I also generated a list of 10,000 random
>> strings and inserted them into a dictionary.

[Scott Crosby]
> Ok.  It should have taken almost a minute instead of .08 seconds in
> the attack version.  My file is broken.  I'll be constructing a new
> one later this evening.  If you test perl with the perl files, you'll
> see what should have occurred in this case.
Note that the 10,000 strings in the file map to 400 distinct 32-bit
hash codes under Python's hash.  It's not enough to provoke worst-case
behavior in Python just to collide on the low-order bits: all 32 bits
contribute to the probe sequence (just colliding on the initial bucket
slot doesn't have much effect).  As is, it's effectively creating 400
collision chains, ranging in length from 7 to 252, with a mean length
of 25 and a median of 16.

From scrosby@cs.rice.edu Fri May 30 00:29:45 2003
From: scrosby@cs.rice.edu (Scott A Crosby)
Date: 29 May 2003 18:29:45 -0500
Subject: [Python-Dev] Algoritmic Complexity Attack on Python
In-Reply-To: 
References: 
Message-ID: 

On Thu, 29 May 2003 18:31:56 -0400, "Tim Peters" writes:

> [Jeremy Hylton]
> >> I just took a minute to look at this.  I downloaded the
> >> python-attack file from your Web site.  I loaded all the strings
> >> and then inserted them into a dictionary.  I also generated a list
> >> of 10,000 random strings and inserted them into a dictionary.
>
> [Scott Crosby]
> > Ok.  It should have taken almost a minute instead of .08 seconds in
> > the attack version.  My file is broken.  I'll be constructing a new
> > one later this evening.  If you test perl with the perl files,
> > you'll see what should have occurred in this case.
>
> Note that the 10,000 strings in the file map to 400 distinct 32-bit hash
> codes under Python's hash.  It's not enough to provoke worst-case behavior

Yes.  Jeremy has made me aware of this.  I appear to have made a
mistake when inserting Python's hash code into my program that finds
generators.  The fact that I find so many collisions, but I don't have
everything colliding, indicates what the problem is.

> in Python just to collide on the low-order bits: all 32 bits contribute to
> the probe sequence (just colliding on the initial bucket slot doesn't have
> much effect).

Correct.  My program, as per the paper, generates full 32-bit hash
collisions.  It doesn't generate bucket collisions.
> As is, it's effectively creating 400 collision chains, ranging in
> length from 7 to 252, with a mean length of 25 and a median of 16.

Yes.  I don't know Python, so I had no test program to verify that my
attack file was correct.  It's not.  Now that I have one, I'll quash
the bug and release a new file later this evening.

Scott

From tim.one@comcast.net Fri May 30 02:54:20 2003
From: tim.one@comcast.net (Tim Peters)
Date: Thu, 29 May 2003 21:54:20 -0400
Subject: [Python-Dev] Algoritmic Complexity Attack on Python
In-Reply-To: 
Message-ID: 

I've got one meta-comment here:

[Scott A Crosby]
> Hello.  We have analyzed this software to determine its vulnerability
> to a new class of DoS attacks related to a recent paper, ''Denial of
> Service via Algorithmic Complexity Attacks.''

I don't think this is new.  For example, a much simpler kind of attack
is to exploit the way backtracking regexp engines work -- it's easy to
find regexp + target_string combos that take time exponential in the
sum of the lengths of the input strings.  It's not so easy to recognize
such a pair when it's handed to you.  In Python, exploiting
unbounded-int arithmetic is another way to soak up eons of CPU with few
characters, e.g.

    10**10**10

will suck up all your CPU *and* all your RAM.  Another easy way is to
study a system's C qsort() implementation, and provoke it into
quadratic-time behavior (BTW, McIlroy wrote a cool paper on this in
'98: http://www.cs.dartmouth.edu/~doug/mdmspe.pdf ).

I'm uninterested in trying to "do something" about these.  If
resource-hogging is a serious potential problem in some context, then
resource limitation is an operating system's job, and any use of Python
(or Perl, etc) in such a context should be under the watchful eyes of
OS subsystems that track actual resource usage.
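Tim's regexp example is easy to reproduce: a nested quantifier makes a backtracking engine try exponentially many ways to fail before giving up. A deliberately bounded sketch (kept small so it finishes quickly; a few dozen more a's would effectively hang the process):

```python
# A bounded demonstration of catastrophic regexp backtracking: (a+)+$
# cannot match a string of a's ending in '!', but a backtracking engine
# explores roughly 2**(n-1) partitions of the a's before giving up.
import re
import time

pattern = re.compile(r'(a+)+$')

def time_failure(n):
    s = 'a' * n + '!'              # the trailing '!' forces failure
    t0 = time.time()
    result = pattern.match(s)
    elapsed = time.time() - t0
    assert result is None
    return elapsed

# Each additional 'a' roughly doubles the time to fail; keep n small.
for n in (12, 16, 20):
    print(n, time_failure(n))
```

The same pattern matches a plain run of a's instantly; only the *failing* case explodes, which is exactly why such pairs are "not so easy to recognize" by inspection.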
From scrosby@cs.rice.edu Fri May 30 03:19:57 2003 From: scrosby@cs.rice.edu (Scott A Crosby) Date: 29 May 2003 21:19:57 -0500 Subject: [Python-Dev] Algoritmic Complexity Attack on Python In-Reply-To: References: Message-ID: On Thu, 29 May 2003 21:54:20 -0400, Tim Peters writes: > I've got one meta-comment here: > > [Scott A Crosby] > > Hello. We have analyzed this software to determine its vulnerability > > to a new class of DoS attacks that related to a recent paper. ''Denial > > of Service via Algorithmic Complexity Attacks.'' > > I don't think this is new. For example, a much simpler kind of attack is to > exploit the way backtracking regexp engines work -- it's easy to find regexp > + target_string combos that take time exponential in the sum of the lengths > of the input strings. It's not so easy to recognize such a pair when it's > handed to you. In Python, exploiting unbounded-int arithmetic is another > way to soak up eons of CPU with few characters, e.g. > > 10**10**10 > These ways require me having the ability to feed a program, an expression, or a regular expression into the victim's python interpreter. The attack I discuss only require that it hash some arbitrary input by the attacker, so these attacks apply in many more cases. > will suck up all your CPU *and* all your RAM. Another easy way is to study > a system's C qsort() implementation, and provoke it into quadratic-time > behavior (BTW, McIlroy wrote a cool paper on this in '98: > > http://www.cs.dartmouth.edu/~doug/mdmspe.pdf This is a very cool paper in exactly the same vein as ours. Thanks. > I'm uninterested in trying to "do something" about these. If > resource-hogging is a serious potential problem in some context, then > resource limitation is an operating system's job, and any use of Python (or > Perl, etc) in such a context should be under the watchful eyes of OS > subsystems that track actual resource usage. I disagree. Changing the hash function eliminates these attacks on hash tables. 
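The keyed hashing Scott advocates can be sketched as a polynomial hash selected by a random key, in the Carter-Wegman spirit. This is an illustrative toy, not the UHASH library from the paper (the real families there have stronger, proven collision bounds):

```python
# Toy illustration of universal (keyed) hashing: a random key (a, b)
# selects one function from a family, so an attacker who cannot see the
# key cannot precompute colliding inputs.  A sketch only, not UHASH.
import random

P = 2**61 - 1                      # a Mersenne prime, comfortably > 2**32

class KeyedStringHash:
    def __init__(self, seed=None):
        rng = random.Random(seed)
        self.a = rng.randrange(1, P)     # the secret key
        self.b = rng.randrange(0, P)
    def __call__(self, s):
        h = 0
        for byte in s.encode('utf-8'):
            h = (h * self.a + byte) % P  # polynomial hash keyed by a
        return (h + self.b) % P

h1 = KeyedStringHash(seed=1)
h2 = KeyedStringHash(seed=2)
print(h1("hello") == h1("hello"))   # deterministic under a fixed key
print(h1("hello") == h2("hello"))   # different keys: (almost surely) different
```

The key is chosen once per process, so all the lookups in one interpreter stay consistent; what the attacker loses is the ability to compute collisions offline against a hash function fixed in the source code.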
Scott From tim_one@email.msn.com Fri May 30 04:00:35 2003 From: tim_one@email.msn.com (Tim Peters) Date: Thu, 29 May 2003 23:00:35 -0400 Subject: [Python-Dev] Algoritmic Complexity Attack on Python In-Reply-To: Message-ID: [Scott A Crosby] > These ways require me having the ability to feed a program, an > expression, or a regular expression into the victim's python > interpreter. I think you underestimate the perversity of regular expressions in particular. Many programs use (fixed) regexps to parse input, and it's often possible to construct inputs that take horribly long times to match, or, especially, to fail to match. Blowing the hardware stack (provoking a memory fault) is also usually easy. The art of writing robust regexps for use with a backtracking engine is obscure and difficult (see Friedl's "Mastering Regular Expressions" (O'Reilly) for a practical intro to the topic), and regexps are ubiquitous now. > The attack I discuss only require that it hash some arbitrary input by > the attacker, so these attacks apply in many more cases. While a regexp attack only requires that the program parse one user-supplied string >> http://www.cs.dartmouth.edu/~doug/mdmspe.pdf > This is a very cool paper in exactly the same vein as ours. Thanks. It is a cool paper, and you're welcome. I don't think it's in the same vein, though -- McIlroy presented it as an interesting discovery, not as "a reason" for people to get agitated about programs using quicksort. The most likely reason you didn't find references to it before is because nobody in real life cares much about this attack possibility. >> I'm uninterested in trying to "do something" about these. If >> resource-hogging is a serious potential problem in some context, then >> resource limitation is an operating system's job, and any use of >> Python (or Perl, etc) in such a context should be under the watchful >> eyes of OS subsystems that track actual resource usage. > I disagree. 
Changing the hash function eliminates these attacks on > hash tables. It depends on how much access an attacker has, and, as you said before, you're not aware of any specific application in Python that *can* be attacked this way. I'm not either. In any case, the universe of resource attacks is much larger than just picking on hash functions, so plugging a hole in those alone wouldn't do anything to ease my fears -- provided I had such fears, which is admittedly a stretch . If I did have such fears, I'd want the OS to alleviate them all at once. From martin@v.loewis.de Fri May 30 07:37:00 2003 From: martin@v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=) Date: 30 May 2003 08:37:00 +0200 Subject: [Python-Dev] Release question In-Reply-To: <20030529151421.G1276@localhost.localdomain> References: <20030529151421.G1276@localhost.localdomain> Message-ID: Jack Diederich writes: > The PEPs are pretty thorough about how to announce and build > releases. My question is about cvs, branching, and feature freezes. > I joined python-dev after the 2.2 release so I haven't been around > for a 'round number' release yet. Notice that PEP 101 *does* talk about CVS branches and tags, for releases. Branches are not used much for anything else in Python. > I'm guessing 2.3 is in bugfix only mode. Correct; there will be one more beta, release candidates, and a release. > When is 2.4 tagged, and what is the timeframe on that? (the linux > kernel generally waits a while before starting the next dev branch). This is not how Python works. Immediately after 2.3 is released, 2.4 development starts. Bugs discovered in 2.3 are then fixed both on the 2.3-maint branch and HEAD (if there are any volunteers from the PBF, those patches may also get applied to the 2.2-maint branch if applicable). In Linux, a new "unstable" kernel is usually started with a major restructuring of everything, so the "unstable" code base diverges quickly from the previous release. 
This is not the case for Python - we still have a lot of code that was in 1.5. Regards, Martin From guido@python.org Fri May 30 12:39:18 2003 From: guido@python.org (Guido van Rossum) Date: Fri, 30 May 2003 07:39:18 -0400 Subject: [Python-Dev] Algoritmic Complexity Attack on Python In-Reply-To: "Your message of 29 May 2003 21:19:57 CDT." References: Message-ID: <200305301139.h4UBdIb15112@pcp02138704pcs.reston01.va.comcast.net> [Tim Peters] > > I'm uninterested in trying to "do something" about these. If > > resource-hogging is a serious potential problem in some context, then > > resource limitation is an operating system's job, and any use of Python (or > > Perl, etc) in such a context should be under the watchful eyes of OS > > subsystems that track actual resource usage. [Scott Crosby] > I disagree. Changing the hash function eliminates these attacks on > hash tables. At what cost for Python? 99.99% of all Python programs are not vulnerable to this kind of attack, because they don't take huge amounts of arbitrary input from an untrusted source. If the hash function you propose is even a *teensy* bit slower than the one we've got now (and from your description I'm sure it has to be), everybody would be paying for the solution to a problem they don't have. You keep insisting that you don't know Python. Hashing is used an awful lot in Python -- as an interpreted language, most variable lookups and all method and instance variable lookups use hashing. So this would affect every Python program. Scott, we thank you for pointing out the issue, but I think you'll be wearing out your welcome here quickly if you keep insisting that we do things your way based on the evidence you've produced so far. 
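Guido's point that hashing sits on every program's critical path is visible in the bytecode: global and attribute references compile to name-based dictionary lookups. A sketch using the modern `dis` module (opcode details differ from the 2.3-era interpreter, but the principle is the same):

```python
# Global and attribute references compile to name-based lookups, which
# is why string hashing is on the critical path of ordinary Python code
# (modern bytecode shown; 2.3-era opcodes differ in detail).
import dis
import io

def f(obj):
    return len(obj.items)       # one global lookup, one attribute lookup

buf = io.StringIO()
dis.dis(f, file=buf)
listing = buf.getvalue()
print('LOAD_GLOBAL' in listing)   # 'len' comes from the globals dict
print('LOAD_ATTR' in listing)     # 'items' comes from the instance dict
```

Interned strings cache their hash, so the per-lookup cost is dominated by dict probing rather than rehashing, but any slowdown in the string hash is still paid somewhere in nearly every program.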
--Guido van Rossum (home page: http://www.python.org/~guido/) From scrosby@cs.rice.edu Fri May 30 17:18:14 2003 From: scrosby@cs.rice.edu (Scott A Crosby) Date: 30 May 2003 11:18:14 -0500 Subject: [Python-Dev] Algoritmic Complexity Attack on Python In-Reply-To: <200305301139.h4UBdIb15112@pcp02138704pcs.reston01.va.comcast.net> References: <200305301139.h4UBdIb15112@pcp02138704pcs.reston01.va.comcast.net> Message-ID: On Fri, 30 May 2003 07:39:18 -0400, Guido van Rossum writes: > [Tim Peters] > > > I'm uninterested in trying to "do something" about these. If > > > resource-hogging is a serious potential problem in some context, then > > > resource limitation is an operating system's job, and any use of Python (or > > > Perl, etc) in such a context should be under the watchful eyes of OS > > > subsystems that track actual resource usage. > > [Scott Crosby] > > I disagree. Changing the hash function eliminates these attacks on > > hash tables. > > At what cost for Python? 99.99% of all Python programs are not > vulnerable to this kind of attack, because they don't take huge > amounts of arbitrary input from an untrusted source. If the hash > function you propose is even a *teensy* bit slower than the one we've > got now (and from your description I'm sure it has to be), everybody We included several benchmarks in our paper: On here, http://www.cs.rice.edu/~scrosby/hash/CrosbyWallach_UsenixSec2003/CacheEffects2.png when we compare hash functions as the working set changes, we notice that a single L2 cache miss far exceeds hashing time for all algorithms. On http://www.cs.rice.edu/~scrosby/hash/CrosbyWallach_UsenixSec2003/LengthEffects.png UHASH exceeds the performance of perl's hash function, which is simpler than your own. 
Even for small strings, UHASH is only about half the speed of perl's
hash function, and your function already performs a multiplication per
byte:

#define HASH(hi,ho,c)   ho = (1000003*hi) ^ c
#define HASH0(ho,c)     ho = ((c << 7)*1000003) ^ c

The difference between this and CW12 is one 32-bit modulo operation.
(Please note that CW12 is currently broken.  Fortunately it didn't
affect the benchmarking on x86.)

> would be paying for the solution to a problem they don't have.  You
> keep insisting that you don't know Python.  Hashing is used an awful
> lot in Python -- as an interpreted language, most variable lookups and
> all method and instance variable lookups use hashing.  So this would
> affect every Python program.

Have you done benchmarking to prove that string_hash is in fact an
important hotspot in the Python interpreter?  If so, and doing one
modulo operation per string is unacceptable, then you may wish to
consider Jenkins's hash.  The Linux kernel people are switching to
using a keyed variant of Jenkins's hash.  However, Jenkins's hash,
AFAIK, has no proof that it is in fact universal.  It, however,
probably is safe.  It is not unlikely that if you went that route you'd
be somewhat safer, and faster, but if you want full safety, you'd need
to go with a universal hash.

Scott

From jepler@unpythonic.net Fri May 30 21:00:21 2003
From: jepler@unpythonic.net (Jeff Epler)
Date: Fri, 30 May 2003 15:00:21 -0500
Subject: [Python-Dev] Algoritmic Complexity Attack on Python
In-Reply-To: 
References: <200305301139.h4UBdIb15112@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <20030530200021.GB30507@unpythonic.net>

On Fri, May 30, 2003 at 11:18:14AM -0500, Scott A Crosby wrote:
> On
>
> http://www.cs.rice.edu/~scrosby/hash/CrosbyWallach_UsenixSec2003/LengthEffects.png
>
> UHASH exceeds the performance of perl's hash function, which is
> simpler than your own.
I notice that you say "with strings longer than around 44-bytes, UHASH dominates all the other hash functions, due in no small part to its extensive performance tuning and *hand-coded assembly routines.*" [emphasis mine] It's all well and good for people who can run your hand-coded VAX assembly, but when Intel's 80960* comes out and people start running Unix on it, won't they be forced to code your hash function all over again? Since everybody has hopes for Python beyond the VAX (heck, in 20 years VAX might have as little as 5% of the market -- anything could happen) there has been a conscious decision not to hand-code anything in assembly in Python. Jeff * The Intel 80960, in case you haven't heard of it, is a superscalar processor that will require highly-tuned compilers and will run like a bat out of hell when the code is tuned right. I think it's capable of one floating-point and two integer instructions per cycle! From guido@python.org Fri May 30 21:35:53 2003 From: guido@python.org (Guido van Rossum) Date: Fri, 30 May 2003 16:35:53 -0400 Subject: [Python-Dev] KeyboardInterrupt on Windows Message-ID: <200305302035.h4UKZr220087@pcp02138704pcs.reston01.va.comcast.net> I received this problem report (Kurt is the IDLEFORK developer). Does anybody know what could be the matter here? What changed recently??? --Guido van Rossum (home page: http://www.python.org/~guido/) ------- Forwarded Message Date: Fri, 30 May 2003 15:50:15 -0400 From: kbk@shore.net (Kurt B. Kaiser) To: Guido van Rossum Subject: KeyboardInterrupt I find that while 1: pass doesn't respond to a KeyboardInterrupt on Python2.3b1 on either WinXP or W2K. Is this generally known? I couldn't find any mention of it. while 1: a = 0 is fine on 2.3b1, and both work on Python2.2.
- -- KBK ------- End of Forwarded Message From tim@zope.com Fri May 30 21:41:03 2003 From: tim@zope.com (Tim Peters) Date: Fri, 30 May 2003 16:41:03 -0400 Subject: [Python-Dev] KeyboardInterrupt on Windows In-Reply-To: <200305302035.h4UKZr220087@pcp02138704pcs.reston01.va.comcast.net> Message-ID: [Guido] > I received this problem report (Kurt is the IDLEFORK developer). Does > anybody know what could be the matter here? What changed recently??? Looks like eval-loop optimizations. The first version essentially compiles to a JUMP_ABSOLUTE to itself

    >>   10 JUMP_ABSOLUTE           10

and

    case JUMP_ABSOLUTE:
        JUMPTO(oparg);
        goto fast_next_opcode;

This skips the ticker checks, so never checks for interrupts. As usual, I expect we can blame Raymond Hettinger's good intentions. > ------- Forwarded Message > > Date: Fri, 30 May 2003 15:50:15 -0400 > From: kbk@shore.net (Kurt B. Kaiser) > To: Guido van Rossum > Subject: KeyboardInterrupt > > I find that > > while 1: pass > > doesn't respond to a KeyboardInterrupt on Python2.3b1 on either > WinXP or W2K. Is this generally known? I couldn't find any mention > of it. > > while 1: a = 0 > > is fine on 2.3b1, and both work on Python2.2. > > - -- > KBK > > ------- End of Forwarded Message From neal@metaslash.com Fri May 30 21:40:01 2003 From: neal@metaslash.com (Neal Norwitz) Date: Fri, 30 May 2003 16:40:01 -0400 Subject: [Python-Dev] KeyboardInterrupt on Windows In-Reply-To: <200305302035.h4UKZr220087@pcp02138704pcs.reston01.va.comcast.net> References: <200305302035.h4UKZr220087@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <20030530204000.GH27502@epoch.metaslash.com> On Fri, May 30, 2003 at 04:35:53PM -0400, Guido van Rossum wrote: > I received this problem report (Kurt is the IDLEFORK developer). Does > anybody know what could be the matter here? What changed recently??? > while 1: pass > > doesn't respond to a KeyboardInterrupt on Python2.3b1 on either > WinXP or W2K.
Could this be from the optimization Raymond did:

>>> def f():
...     while 1: pass
>>> dis.dis(f)
  2           0 SETUP_LOOP              12 (to 15)
              3 JUMP_FORWARD             4 (to 10)
              6 JUMP_IF_FALSE            4 (to 13)
              9 POP_TOP
        >>   10 JUMP_ABSOLUTE           10
        >>   13 POP_TOP
             14 POP_BLOCK
        >>   15 LOAD_CONST               0 (None)
             18 RETURN_VALUE

3 jumps to 10, 10 jumps to itself unless I'm reading this wrong. See Python/compile.c::optimize_code (starting around line 339) Neal From jeremy@zope.com Fri May 30 21:43:28 2003 From: jeremy@zope.com (Jeremy Hylton) Date: 30 May 2003 16:43:28 -0400 Subject: [Python-Dev] KeyboardInterrupt on Windows In-Reply-To: <200305302035.h4UKZr220087@pcp02138704pcs.reston01.va.comcast.net> References: <200305302035.h4UKZr220087@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <1054327408.2917.9.camel@slothrop.zope.com> It's probably an unintended consequence of the "while 1" optimization and the fast-next-opcode optimization. "while 1" doesn't do a test at runtime anymore. And opcodes like JUMP_ABSOLUTE bypass the test for pending exceptions. The net result is that while 1: pass puts the interpreter in a tight loop doing a JUMP_ABSOLUTE that goes nowhere. That is, offset X has JUMP_ABSOLUTE X. I'd be inclined to call this a bug, but I'm not sure how to fix it. Jeremy From neal@metaslash.com Fri May 30 22:04:30 2003 From: neal@metaslash.com (Neal Norwitz) Date: Fri, 30 May 2003 17:04:30 -0400 Subject: [Python-Dev] KeyboardInterrupt on Windows In-Reply-To: <20030530204000.GH27502@epoch.metaslash.com> References: <200305302035.h4UKZr220087@pcp02138704pcs.reston01.va.comcast.net> <20030530204000.GH27502@epoch.metaslash.com> Message-ID: <20030530210430.GI27502@epoch.metaslash.com> On Fri, May 30, 2003 at 04:40:00PM -0400, Neal Norwitz wrote: > On Fri, May 30, 2003 at 04:35:53PM -0400, Guido van Rossum wrote: > > I received this problem report (Kurt is the IDLEFORK developer). Does > > anybody know what could be the matter here? What changed recently???
> > > while 1: pass > > > > doesn't respond to a KeyboardInterrupt on Python2.3b1 on either > > WinXP or W2K. The patch below fixes the problem by not optimizing while 1:pass. Seems kinda hacky though. Neal --

Index: Python/compile.c
===================================================================
RCS file: /cvsroot/python/python/dist/src/Python/compile.c,v
retrieving revision 2.289
diff -w -u -r2.289 compile.c
--- Python/compile.c	22 May 2003 22:00:04 -0000	2.289
+++ Python/compile.c	30 May 2003 21:02:26 -0000
@@ -411,6 +411,8 @@
 	tgttgt -= i + 3;	/* Calc relative jump addr */
 	if (tgttgt < 0)		/* No backward relative jumps */
 		continue;
+	if (i == tgttgt && opcode == JUMP_ABSOLUTE)
+		goto exitUnchanged;
 	codestr[i] = opcode;
 	SETARG(codestr, i, tgttgt);
 	break;

From barry@python.org Fri May 30 22:09:27 2003 From: barry@python.org (Barry Warsaw) Date: 30 May 2003 17:09:27 -0400 Subject: [Python-Dev] I'm tagging the Python 2.2.3 tree Message-ID: <1054328967.13804.33.camel@geddy> No more changes please. -Barry From mwh@python.net Fri May 30 22:22:32 2003 From: mwh@python.net (Michael Hudson) Date: Fri, 30 May 2003 22:22:32 +0100 Subject: [Python-Dev] KeyboardInterrupt on Windows In-Reply-To: <1054327408.2917.9.camel@slothrop.zope.com> (Jeremy Hylton's message of "30 May 2003 16:43:28 -0400") References: <200305302035.h4UKZr220087@pcp02138704pcs.reston01.va.comcast.net> <1054327408.2917.9.camel@slothrop.zope.com> Message-ID: <2mvfvsf7tz.fsf@starship.python.net> Jeremy Hylton writes: > It's probably an unintended consequence of the "while 1" optimization > and the fast-next-opcode optimization. "while 1" doesn't do a test at > runtime anymore. And opcodes like JUMP_ABSOLUTE bypass the test for > pending exceptions. The net result is that while 1: pass puts the > interpreter in a tight loop doing a JUMP_ABSOLUTE that goes nowhere. > That is, offset X has JUMP_ABSOLUTE X. > > I'd be inclined to call this a bug, but I'm not sure how to fix it.
Take out the while 1: optimizations? I don't want to belittle Raymond's efforts, but I am conscious of[1] Tim's repeated observations of the correlation between the number of optimizations in the compiler and the number of weird bugs therein. Cheers, M. [1] I'm also warming up for an end-of-PyPy-sprint drunken hacking session so you probably shouldn't take me too seriously :-) -- ARTHUR: Why should a rock hum? FORD: Maybe it feels good about being a rock. -- The Hitch-Hikers Guide to the Galaxy, Episode 8 From python@rcn.com Fri May 30 22:23:47 2003 From: python@rcn.com (Raymond Hettinger) Date: Fri, 30 May 2003 17:23:47 -0400 Subject: [Python-Dev] KeyboardInterrupt on Windows References: <200305302035.h4UKZr220087@pcp02138704pcs.reston01.va.comcast.net> <20030530204000.GH27502@epoch.metaslash.com> <20030530210430.GI27502@epoch.metaslash.com> Message-ID: <004401c326f1$c2391300$125ffea9@oemcomputer> > On Fri, May 30, 2003 at 04:40:00PM -0400, Neal Norwitz wrote: > > On Fri, May 30, 2003 at 04:35:53PM -0400, Guido van Rossum wrote: > > > I received this problem report (Kurt is the IDLEFORK developer). Does > > > anybody know what could be the matter here? What changed recently??? > > > > > while 1: pass > > > > > > doesn't respond to a KeyboardInterrupt on Python2.3b1 on either > > > WinXP or W2K. > > The patch below fixes the problem by not optimizing while 1:pass. That looks like a good fix to me.
There are two other ways:

* disable the goto fast_next_opcode for JUMP_ABSOLUTE
* disable the byte optimization for a jump-to-a-jump

Raymond From scrosby@cs.rice.edu Fri May 30 23:02:51 2003 From: scrosby@cs.rice.edu (Scott A Crosby) Date: 30 May 2003 17:02:51 -0500 Subject: [Python-Dev] Algoritmic Complexity Attack on Python In-Reply-To: <20030530200021.GB30507@unpythonic.net> References: <200305301139.h4UBdIb15112@pcp02138704pcs.reston01.va.comcast.net> <20030530200021.GB30507@unpythonic.net> Message-ID: On Fri, 30 May 2003 15:00:21 -0500, Jeff Epler writes: > On Fri, May 30, 2003 at 11:18:14AM -0500, Scott A Crosby wrote: > > On > > > > http://www.cs.rice.edu/~scrosby/hash/CrosbyWallach_UsenixSec2003/LengthEffects.png > > > > UHASH exceeds the performance of perl's hash function, which is > > simpler than your own. > > I notice that you say "with strings longer than around 44-bytes, > UHASH dominates all the other hash functions, due in no small part to its > extensive performance tuning and *hand-coded assembly routines.*" > [emphasis mine] It's all well and good for people who can run your We benchmarked it, and without assembly optimizations, uhash still exceeds perl. Also please note that we did not create uhash. We merely used it as a high performance universal hash which we could cite and benchmark. Freshly computed raw benchmarks on a P2-450 are at the end of this email. Looking at them now. I think we may have slightly erred and used the non-assembly version of the hash in constructing the graphs, because the crossover point compared to perl looks to be about 20 bytes with assembly, and about 48 without. Roughly, they show that uhash is about half the speed on a P2-450 without assembly. I do not have benchmarks on other platforms to compare it to. However, CW is known to be about 10 times worse, relatively, than jenkin's on a SPARC.
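For readers unfamiliar with the term being debated here: a universal hash family draws the function at random from a keyed family, so no fixed input set can be made to collide badly except by luck. A minimal multiply-mod-prime sketch of the idea (an illustration only — not the CW12 or UHASH constructions benchmarked in the paper, which are engineered for speed on byte strings):

```python
import random

P = (1 << 61) - 1  # a Mersenne prime, comfortably above 32-bit keys

def make_hash(m):
    """Draw h(x) = ((a*x + b) mod P) mod m from a Carter-Wegman family.

    For randomly chosen a and b, any two distinct keys x != y collide
    with probability about 1/m, no matter how an adversary picks them.
    """
    a = random.randrange(1, P)
    b = random.randrange(0, P)
    return lambda x: ((a * x + b) % P) % m

h = make_hash(1024)  # one randomly keyed hash into 1024 buckets
```

The per-key cost is the multiply plus the mod-by-a-prime that the thread is arguing about.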
The python community will have to judge whether the performance difference of the current hash is worth the risk of the attack. Also, I'd like to thank Tim Peters for telling me about the potential of degradation that regular expressions may offer. Scott

Time benchmarking actual hash (including benchmarking overhead) with a working set size of 12kb.

Time(perl-5.8.0): 12.787 Mbytes/sec with string length 4, buf 12000
Time(uh_cw-1024): 6.010 Mbytes/sec with string length 4, buf 12000
Time(python): 14.952 Mbytes/sec with string length 4, buf 12000
Time(test32out_uhash): 4.584 Mbytes/sec with string length 4, buf 12000
Time(test32out_assembly_uhash): 6.014 Mbytes/sec with string length 4, buf 12000

Time(perl-5.8.0): 29.125 Mbytes/sec with string length 16, buf 12000
Time(uh_cw-1024): 11.898 Mbytes/sec with string length 16, buf 12000
Time(python): 36.445 Mbytes/sec with string length 16, buf 12000
Time(test32out_uhash): 19.169 Mbytes/sec with string length 16, buf 12000
Time(test32out_assembly_uhash): 25.660 Mbytes/sec with string length 16, buf 12000

Time(perl-5.8.0): 45.440 Mbytes/sec with string length 64, buf 12000
Time(uh_cw-1024): 16.168 Mbytes/sec with string length 64, buf 12000
Time(python): 62.213 Mbytes/sec with string length 64, buf 12000
Time(test32out_uhash): 71.396 Mbytes/sec with string length 64, buf 12000
Time(test32out_assembly_uhash): 106.873 Mbytes/sec with string length 64, buf 12000

Time benchmarking actual hash (including benchmarking overhead) with a working set size of 6mb.
Time(perl-5.8.0): 8.099 Mbytes/sec with string length 4, buf 6000000
Time(uh_cw-1024): 4.660 Mbytes/sec with string length 4, buf 6000000
Time(python): 8.840 Mbytes/sec with string length 4, buf 6000000
Time(test32out_uhash): 3.932 Mbytes/sec with string length 4, buf 6000000
Time(test32out_assembly_uhash): 4.859 Mbytes/sec with string length 4, buf 6000000

Time(perl-5.8.0): 20.878 Mbytes/sec with string length 16, buf 6000000
Time(uh_cw-1024): 9.964 Mbytes/sec with string length 16, buf 6000000
Time(python): 24.450 Mbytes/sec with string length 16, buf 6000000
Time(test32out_uhash): 16.168 Mbytes/sec with string length 16, buf 6000000
Time(test32out_assembly_uhash): 19.929 Mbytes/sec with string length 16, buf 6000000

Time(perl-5.8.0): 35.265 Mbytes/sec with string length 64, buf 6000000
Time(uh_cw-1024): 14.400 Mbytes/sec with string length 64, buf 6000000
Time(python): 46.650 Mbytes/sec with string length 64, buf 6000000
Time(test32out_uhash): 48.719 Mbytes/sec with string length 64, buf 6000000
Time(test32out_assembly_uhash): 63.523 Mbytes/sec with string length 64, buf 6000000

From nas@python.ca Fri May 30 23:22:04 2003 From: nas@python.ca (Neil Schemenauer) Date: Fri, 30 May 2003 15:22:04 -0700 Subject: [Python-Dev] KeyboardInterrupt on Windows In-Reply-To: <1054327408.2917.9.camel@slothrop.zope.com> References: <200305302035.h4UKZr220087@pcp02138704pcs.reston01.va.comcast.net> <1054327408.2917.9.camel@slothrop.zope.com> Message-ID: <20030530222204.GB404@glacier.arctrix.com> Jeremy Hylton wrote: > I'd be inclined to call this a bug, but I'm not sure how to fix it. I think the right fix is to make JUMP_ABSOLUTE not bypass the test for pending exceptions. We have to be really careful with using fast_next_opcode. Originally it was only used by SET_LINENO, LOAD_FAST, LOAD_CONST, STORE_FAST, POP_TOP. Using it from jump opcodes is asking for trouble, IMHO. Shall I prepare a patch?
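Neil's diagnosis can be modeled in a few lines. In this sketch (names are illustrative, not ceval.c's), a jump opcode that takes the fast path skips the periodic pending-work check, so a jump-to-itself never notices the interrupt flag:

```python
def run(code, fast_jumps, pending, max_steps=1000):
    """Toy dispatch loop: returns step count, or raises on interrupt."""
    pc = 0
    for steps in range(1, max_steps + 1):
        op, arg = code[pc]
        # the "ticker" check normally done between opcodes
        if not (op == "JUMP" and fast_jumps):
            if pending["interrupt"]:
                raise KeyboardInterrupt
        pc = arg if op == "JUMP" else pc + 1
    return steps  # fuel exhausted: the interrupt was never seen

loop = [("JUMP", 0)]  # 'while 1: pass' compiles to a jump to itself
```

With fast_jumps=True, run(loop, True, {"interrupt": True}) spins until its fuel runs out; with fast_jumps=False it raises immediately — the difference in behavior Kurt reported.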
Neil From python@rcn.com Fri May 30 23:29:50 2003 From: python@rcn.com (Raymond Hettinger) Date: Fri, 30 May 2003 18:29:50 -0400 Subject: [Python-Dev] KeyboardInterrupt on Windows References: <200305302035.h4UKZr220087@pcp02138704pcs.reston01.va.comcast.net> <20030530204000.GH27502@epoch.metaslash.com> <20030530210430.GI27502@epoch.metaslash.com> Message-ID: <009501c326fa$fc8bb680$125ffea9@oemcomputer> > The patch below fixes the problem by not optimizing while 1:pass. > Seems kinda hacky though. > > Neal My version of the patch and a testcase is on SF at: www.python.org/sf/746376 if anyone wants to take a look. While we're focused on the compiler, there is a nasty one still outstanding that relates to the fast_function() optimization: www.python.org/sf/733667 Raymond Hettinger From python@rcn.com Fri May 30 23:47:47 2003 From: python@rcn.com (Raymond Hettinger) Date: Fri, 30 May 2003 18:47:47 -0400 Subject: [Python-Dev] KeyboardInterrupt on Windows References: <200305302035.h4UKZr220087@pcp02138704pcs.reston01.va.comcast.net> <1054327408.2917.9.camel@slothrop.zope.com> <20030530222204.GB404@glacier.arctrix.com> Message-ID: <00ab01c326fd$7e584000$125ffea9@oemcomputer> [Neil S] > I think the right fix is to make JUMP_ABSOLUTE not bypass the test for > pending exceptions. Yes. That's the correct fix because it handles all cases including:

    while 1: x=1

Please go ahead and patch it up. Raymond From tim.one@comcast.net Sat May 31 00:07:38 2003 From: tim.one@comcast.net (Tim Peters) Date: Fri, 30 May 2003 19:07:38 -0400 Subject: [Python-Dev] Algoritmic Complexity Attack on Python In-Reply-To: Message-ID: [Scott Crosby] > ... > Have you done benchmarking to prove that string_hash is in fact an > important hotspot in the python interpreter? It depends on the specific application, of course. The overall speed of dicts is crucial, but in many "namespace" dict apps the strings are interned, implying (among other things) that a hash code is computed only once per string.
In those apps the speed of the string hash function isn't so important. Overall, though, I'd welcome a faster string hash, and I agree that Python's isn't particularly zippy. OTOH, it's just a couple lines of C that run fine on everything from Palm Pilots to Crays, and have created no portability or maintenance headaches. Browsing your site, you've got over 100KB of snaky C code to implement hashing functions, some with known bugs, others with cautions about open questions wrt platforms with different endianness and word sizes than the code was initially tested on. Compared to what Python is using now, that's a maintenance nightmare. Note that Python's hash API doesn't return 32 bits, it returns a hash code of the same size as the native C long. The multiplication gimmick doesn't require any pain to do that. Other points that arise in practical deployment: + Python dicts can be indexed by many kinds of immutable objects. Strings are just one kind of key, and Python has many hash functions. + If I understand what you're selling, the hash code of a given string will almost certainly change across program runs. That's a very visible change in semantics, since hash() is a builtin Python function available to user code. Some programs use hash codes to index into persistent (file- or database- based) data structures, and such code would plain break if the hash code of a string changed from one run to the next. I expect the user-visible hash() would have to continue using a predictable function. + Some Python apps run for months, and universal hashing doesn't remove the possibility of quadratic-time behavior. If I can poke at a long-running app and observe its behavior, over time I can deduce a set of keys that collide badly for any hashing scheme fixed when the program starts. In that sense I don't believe this gimmick wholly plugs the hole it's trying to plug. > If so, and doing one modulo operation per string is unacceptable, If it's mod by a prime, probably. 
Some architectures Python runs on require hundreds of cycles to do an integer mod, and we don't have the resources to construct custom mod-by-an-int shortcut code for dozens of obscure architectures. > then you may wish to consider Jenkin's hash. The linux kernel people > are switching to using a keyed veriant of Jenkin's hash. However, > Jenkin's, AFAIK, has no proofs that it is in fact universal. It, > however, probably is safe. Nobody writing a Python program *has* to use a dict. That dicts have quadratic-time worst-case behavior isn't hidden, and there's no cure for that short of switching to a data structure with better worst-case bounds. I certainly agree it's something for programmers to be aware of. I still don't see any reason for the core language to care about this, though. From scrosby@cs.rice.edu Sat May 31 00:56:51 2003 From: scrosby@cs.rice.edu (Scott A Crosby) Date: 30 May 2003 18:56:51 -0500 Subject: [Python-Dev] Algoritmic Complexity Attack on Python In-Reply-To: References: Message-ID: On Fri, 30 May 2003 19:07:38 -0400, Tim Peters writes: > [Scott Crosby] > > ... > > Have you done benchmarking to prove that string_hash is in fact an > > important hotspot in the python interpreter? > > It depends on the specific application, of course. The overall speed of > dicts is crucial, but in many "namespace" dict apps the strings are > interned, implying (among other things) that a hash code is computed only > once per string. In those apps the speed of the string hash function isn't > so important. Overall, though, I'd welcome a faster string hash, and I > agree that Python's isn't particularly zippy. Actually, at least on x86, it is faster than perl. On other platforms, it may be somewhat slower. > OTOH, it's just a couple lines of C that run fine on everything from Palm > Pilots to Crays, and have created no portability or maintenance headaches. 
> Browsing your site, you've got over 100KB of snaky C code to implement > hashing functions, some with known bugs, others with cautions about open > questions wrt platforms with different endianness and word sizes than the > code was initially tested on. Compared to what Python is using now, that's > a maintenance nightmare. Yes, I am aware of the problems with the UHASH code. Unfortunately, I am not a hash function designer, that code is not mine, and I only use it as a black box. I also consider all code, until verified otherwise, to potentially suffer from endianness, alignment, and 32/64 bit issues. Excluding alignment issues (which I'm not sure whether to say that its OK to fail on strange alignments or not) it has passed *my* self-tests on big endian and 64 bit. > + Python dicts can be indexed by many kinds of immutable objects. > Strings are just one kind of key, and Python has many hash functions. > > + If I understand what you're selling, the hash code of a given string > will almost certainly change across program runs. That's a very > visible change in semantics, since hash() is a builtin Python > function available to user code. Some programs use hash codes to > index into persistent (file- or database- based) data structures, and > such code would plain break if the hash code of a string changed > from one run to the next. I expect the user-visible hash() would have > to continue using a predictable function. The hash has to be keyed upon something. It is possible to store the key in a file and reuse the same one across all runs. However, depending on the universal hash function used, leaking pairs of (input,hash(input)) may allow an attacker to determine the secret key, and allow attack again. But yeah, preserving these semantics becomes very messy. The hash-key becomes part of the system state that must be migrated along with other data that depends on it. 
> + Some Python apps run for months, and universal hashing doesn't remove > the possibility of quadratic-time behavior. If I can poke at a > long-running app and observe its behavior, over time I can deduce a I argued on linux-kernel with someone else that this was extremely unlikely. It requires the latency of a collision/non-collision being noticeable over a noisy network stack and system. In almost all cases, for short inputs, the cost of a single L2 cache miss far exceeds that of hashing. A more serious danger is an application that leaks actual hash values. > If it's mod by a prime, probably. I'd benchmark it in practice, microbenchmarking on a sparc says that it is rather expensive. However, on an X86, the cost of an L2 cache miss exceeds the cost of hashing a small string. You'd have a better idea what impact this might have on the total runtime of the system in the worst case. > > then you may wish to consider Jenkin's hash. The linux kernel people > > are switching to using a keyed variant of Jenkin's hash. However, > > Jenkin's, AFAIK, has no proofs that it is in fact universal. It, > > however, probably is safe. > > Nobody writing a Python program *has* to use a dict. That dicts have > quadratic-time worst-case behavior isn't hidden, and there's no cure for Agreed, many have realized over the years that hash tables can have quadratic behavior in an adversarial environment. It isn't hidden. Cormen, Leiserson, and Rivest even warn about this in their seminal algorithms textbook in 1991. It *is* obvious when thought of, but the reason I was able to ship out so many vulnerability reports yesterday was because few actually *have* thought of that deterministic worst-case when writing their programs. I predict this trend will continue. I like hash tables a lot; with UH, their time bounds are randomized but pretty tight, and the constant factors handily beat those of balanced binary trees.
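The quadratic blow-up under discussion is easy to reproduce with a toy chained table (a simulation of the effect only — a real attack needs strings that actually collide under the target's hash function, as in the paper):

```python
def insert_all(keys, hash_fn, nbuckets=64):
    """Insert keys into a chained hash table; return total probe count."""
    buckets = [[] for _ in range(nbuckets)]
    probes = 0
    for k in keys:
        chain = buckets[hash_fn(k) % nbuckets]
        probes += len(chain) + 1  # walk the chain, then append
        chain.append(k)
    return probes

n = 1000
spread = insert_all(range(n), hash)          # ordinary behavior
clogged = insert_all(range(n), lambda k: 0)  # adversarial: all collide
# clogged is 1 + 2 + ... + n = n*(n+1)//2 probes: quadratic in n
```

Doubling n roughly quadruples the adversarial cost, which is the scaling Scott measures for perl and python above.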
Scott From guido@python.org Sat May 31 01:41:54 2003 From: guido@python.org (Guido van Rossum) Date: Fri, 30 May 2003 20:41:54 -0400 Subject: [Python-Dev] Algoritmic Complexity Attack on Python In-Reply-To: "Your message of Fri, 30 May 2003 19:07:38 EDT." References: Message-ID: <200305310041.h4V0fsF20796@pcp02138704pcs.reston01.va.comcast.net> > + If I understand what you're selling, the hash code of a given string > will almost certainly change across program runs. That's a very > visible change in semantics, since hash() is a builtin Python > function available to user code. Some programs use hash codes to > index into persistent (file- or database- based) data structures, and > such code would plain break if the hash code of a string changed > from one run to the next. I expect the user-visible hash() would have > to continue using a predictable function. Of course, such programs are already vulnerable to changes in the hash implementation between Python versions (which has happened before). --Guido van Rossum (home page: http://www.python.org/~guido/) From barry@python.org Sat May 31 04:11:45 2003 From: barry@python.org (Barry Warsaw) Date: 30 May 2003 23:11:45 -0400 Subject: [Python-Dev] RELEASED Python 2.2.3 (final) Message-ID: <1054350704.14645.16.camel@geddy> I'm happy to announce the release of Python 2.2.3 (final). This is a bug fix release for the stable Python 2.2 code line. It contains more than 40 bug fixes and memory leak patches since Python 2.2.2, and all Python 2.2 users are encouraged to upgrade. The new release is available here: http://www.python.org/2.2.3/ For full details, see the release notes at http://www.python.org/2.2.3/NEWS.txt There are a small number of minor incompatibilities with Python 2.2.2; for details see: http://www.python.org/2.2.3/bugs.html Perhaps the most important is that the Bastion.py and rexec.py modules have been disabled, since we do not deem them to be safe. 
As usual, a Windows installer and a Unix/Linux source tarball are made available. The documentation has been updated as well, and is available both on-line and in many different formats. At the moment, no Mac version or Linux RPMs are available, although I expect them to appear soon. On behalf of Guido, I'd like to thank everyone who contributed to this release, and who continue to ensure Python's success. Enjoy, -Barry From jepler@unpythonic.net Sat May 31 14:05:06 2003 From: jepler@unpythonic.net (Jeff Epler) Date: Sat, 31 May 2003 08:05:06 -0500 Subject: [Python-Dev] Algoritmic Complexity Attack on Python In-Reply-To: <200305310041.h4V0fsF20796@pcp02138704pcs.reston01.va.comcast.net> References: <200305310041.h4V0fsF20796@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <20030531130503.GA16185@unpythonic.net> On Fri, May 30, 2003 at 08:41:54PM -0400, Guido van Rossum wrote: > Of course, such programs are already vulnerable to changes in the hash > implementation between Python versions (which has happened before). Is there at least a guarantee that the hashing algorithm won't change in a bugfix release? For instance, can I depend that python222 -c 'print hash(1), hash("a")' python223 -c 'print hash(1), hash("a")' will both output the same thing, even if python23 -c 'print hash(1), hash("a")' and python3000 -c 'print hash(1), hash("a")' may print something different? Jeff From pje@telecommunity.com Sat May 31 14:17:16 2003 From: pje@telecommunity.com (Phillip J. 
Eby) Date: Sat, 31 May 2003 09:17:16 -0400 Subject: [Python-Dev] Algoritmic Complexity Attack on Python In-Reply-To: References: <20030530200021.GB30507@unpythonic.net> <200305301139.h4UBdIb15112@pcp02138704pcs.reston01.va.comcast.net> <20030530200021.GB30507@unpythonic.net> Message-ID: <5.1.0.14.0.20030531090628.02567ad0@mail.telecommunity.com> At 05:02 PM 5/30/03 -0500, Scott A Crosby wrote: >The python community will have to judge whether the performance >difference of the current hash is worth the risk of the attack. Note that the "community" doesn't really have to judge. An individual developer can, if they have an application they deem vulnerable, do something like this:

class SafeString(str):
    def __hash__(self):
        # code to return a hash code

safe = SafeString(string_from_untrusted_source)

and then use only these "safe" strings as keys for a given dictionary. Or, with a little more work, they can subclass 'dict' and make a dictionary that converts its keys to "safe" strings. As far as current vulnerability goes, I'd say that the most commonly available attack point for this would be CGI programs that accept POST operations. A POST can supply an arbitrarily large number of form field keys. If you can show that the Python 'cgi' module is vulnerable to such an attack, in a dramatic disproportion to the size of the data transmitted (since obviously it's as much of a DoS to flood a script with a large quantity of data), then it might be worth making changes to the 'cgi' module, or at least warning the developers of alternatives to CGI (e.g. Zope, Quixote, SkunkWeb, CherryPy, etc.) that alternate hashes might be a good idea. But based on the discussion so far, I'm not sure I see how this attack would produce an effect that was dramatically disproportionate to the amount of data transmitted.
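One hypothetical way to fill in the __hash__ body in Phillip's sketch is to key a cryptographic digest with a per-process secret (a modern-Python illustration, noticeably slower than the built-in hash; as Tim notes elsewhere in the thread, any hash codes persisted to disk would need the key persisted too):

```python
import hashlib
import os

_SECRET = os.urandom(16)  # fresh each run, so attackers can't
                          # precompute colliding inputs offline

class SafeString(str):
    """str subclass whose hash is keyed on a per-process secret."""
    def __hash__(self):
        digest = hashlib.sha1(_SECRET + self.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "little", signed=True)

d = {SafeString("spam"): 1}
```

Equal strings still hash equal, so dict semantics are preserved — provided every key going into a given dictionary is wrapped, which is exactly the discipline Phillip describes.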
From dave@boost-consulting.com Sat May 31 15:35:27 2003 From: dave@boost-consulting.com (David Abrahams) Date: Sat, 31 May 2003 10:35:27 -0400 Subject: [Python-Dev] more-precise instructions for "Python.h first"? Message-ID: Boost.Python is now trying hard to accommodate the "Python.h before system headers rule". Unfortunately, we still need a wrapper around Python.h, at least for some versions of Python, so that we can work around some issues like:

//
// Python's LongObject.h helpfully #defines ULONGLONG_MAX for us
// even when it's not defined by the system which confuses Boost's
// config
//

To cope with that correctly, we need to see (a system header) before longobject.h. Currently, we're including , then , well, and then the wrapper gets a little complicated adjusting for various compilers. Anyway, the point is that I'd like to have the rule changed to "You have to include Python.h or xxxx.h before any system header" where xxxx.h is one of the other existing headers #included in Python.h that is responsible for setting up whatever macros cause this inclusion-order requirement in the first place (preferably not LongObject.h!) That way I might be able to get those configuration issues sorted out without violating the #inclusion order rule. What I have now seems to work, but I'd rather do the right thing (TM). -- Dave Abrahams Boost Consulting www.boost-consulting.com From scrosby@cs.rice.edu Sat May 31 16:48:28 2003 From: scrosby@cs.rice.edu (Scott A Crosby) Date: 31 May 2003 10:48:28 -0500 Subject: [Python-Dev] Algoritmic Complexity Attack on Python In-Reply-To: <5.1.0.14.0.20030531090628.02567ad0@mail.telecommunity.com> References: <20030530200021.GB30507@unpythonic.net> <200305301139.h4UBdIb15112@pcp02138704pcs.reston01.va.comcast.net> <20030530200021.GB30507@unpythonic.net> <5.1.0.14.0.20030531090628.02567ad0@mail.telecommunity.com> Message-ID: On Sat, 31 May 2003 09:17:16 -0400, "Phillip J.
Eby" writes: > At 05:02 PM 5/30/03 -0500, Scott A Crosby wrote: > But based on the discussion so far, I'm not sure I see how this attack > would produce an effect that was dramatically disproportionate to the > amount of data transmitted. I apologize for not having this available earlier, but a corrected file of 10,000 inputs is now available and shows the behavior I claimed. (Someone else independently reimplemented the attack and has sent me a corrected set for python.) With 10,000 inputs, python requires 19 seconds to process instead of .2 seconds. A file of half the size requires 4 seconds, showing the quadratic behavior, as with the case of perl. (Benchmarked on a P2-450) I thus predict that twice the inputs would take about 80 seconds. I can only guess what python applications might experience an interesting impact from this, so I'll be silent. However, here are the concrete benchmarks. Scott From martin@v.loewis.de Sat May 31 17:28:25 2003 From: martin@v.loewis.de (Martin v. Löwis) Date: 31 May 2003 18:28:25 +0200 Subject: [Python-Dev] more-precise instructions for "Python.h first"? In-Reply-To: References: Message-ID: David Abrahams writes: > Anyway, the point is that I'd like to have the rule changed to "You > have to include Python.h or xxxx.h before any system header" where > xxxx.h is one of the other existing headers #included in Python.h that > is responsible for setting up whatever macros cause this > inclusion-order requirement in the first place (preferably not > LongObject.h!) If I understand correctly, you want to follow the rule "I want to change things as long as it continues to work for me". For that, you don't need any permission. If it works for you, you can ignore any rules you feel uncomfortable with. The rule is there for people who don't want to understand the specific details of system configuration. If you manage to get a consistent configuration in a different way, just go for it.
You should make sure then that your users can't run into problems, though. Regards, Martin From guido@python.org Sat May 31 17:55:21 2003 From: guido@python.org (Guido van Rossum) Date: Sat, 31 May 2003 12:55:21 -0400 Subject: [Python-Dev] Algoritmic Complexity Attack on Python In-Reply-To: "Your message of Sat, 31 May 2003 08:05:06 CDT." <20030531130503.GA16185@unpythonic.net> References: <200305310041.h4V0fsF20796@pcp02138704pcs.reston01.va.comcast.net> <20030531130503.GA16185@unpythonic.net> Message-ID: <200305311655.h4VGtLk21998@pcp02138704pcs.reston01.va.comcast.net> > On Fri, May 30, 2003 at 08:41:54PM -0400, Guido van Rossum wrote: > > Of course, such programs are already vulnerable to changes in the hash > > implementation between Python versions (which has happened before). > > Is there at least a guarantee that the hashing algorithm won't change in a > bugfix release? For instance, can I depend that > python222 -c 'print hash(1), hash("a")' > python223 -c 'print hash(1), hash("a")' > will both output the same thing, even if > python23 -c 'print hash(1), hash("a")' > and > python3000 -c 'print hash(1), hash("a")' > may print something different? That's a reasonable assumption, yes. We realize that changing the hash algorithm is a feature change, even if it is a very subtle one. --Guido van Rossum (home page: http://www.python.org/~guido/) From jepler@unpythonic.net Sat May 31 18:42:17 2003 From: jepler@unpythonic.net (Jeff Epler) Date: Sat, 31 May 2003 12:42:17 -0500 Subject: [Python-Dev] Algoritmic Complexity Attack on Python In-Reply-To: References: <20030530200021.GB30507@unpythonic.net> <200305301139.h4UBdIb15112@pcp02138704pcs.reston01.va.comcast.net> <20030530200021.GB30507@unpythonic.net> <5.1.0.14.0.20030531090628.02567ad0@mail.telecommunity.com> Message-ID: <20030531174214.GA18222@unpythonic.net> On Sat, May 31, 2003 at 10:48:28AM -0500, Scott A Crosby wrote: > On Sat, 31 May 2003 09:17:16 -0400, "Phillip J. 
Eby" writes: > > > At 05:02 PM 5/30/03 -0500, Scott A Crosby wrote: > > > But based on the discussion so far, I'm not sure I see how this attack > > would produce an effect that was dramatically disproportionate to the > > amount of data transmitted. > > I apologize for not having this available earlier, but a corrected > file of 10,000 inputs is now available and shows the behavior I > claimed. (Someone else independently reimplemented the attack and has > sent me a corrected set for python.) With 10,000 inputs, python > requires 19 seconds to process instead of .2 seconds. A file of half > the size requires 4 seconds, showing the quadratic behavior, as with > the case of perl. (Benchmarked on a P2-450) I thus predict that twice > the inputs would take about 80 seconds. > > I can only guess what python applications might experience an > interesting impact from this, so I'll be silent. However, here are the > concrete benchmarks. the CGI module was mentioned earlier as a possible "problem area" for this attack, I wrote a script that demonstrates this, using Scott's list of hash-colliding strings. I do see quadratic growth in runtime. When runnng the attack on mailman, however, I don't see such a large runtime, and the growth in runtime appears to be linear. This may be because the mailman installation is running on 2.1 (?) and requires a different set of attack strings. I used the cgi.py "self-test" script (the one you get when you run cgi.py *as* a cgi script) on the CGIHTTPServer.py server, and sent a long URL of the form test.cgi?x=1&=1&=1&... I looked at the size of the URL, the size of the response, and the time to transfer the response. My system is a mobile Pentium III running at 800MHz, RedHat 9, Python 2.2.2. The mailman testing system is a K6-2 running at 350MHz, RedHat 7.1, Python 2.1. In the results below, the very fast times and low reply sizes are due to the fact that the execve() call fails for argv+envp>128kb. 
This limitation might not exist if the CGI was POSTed, or running as fcgi, mod_python, or another system which does not pass the GET form contents in the environment. Here are the results, for various query sizes:

########################################################################
# Output 1: Running attack in listing 1 on cgi.py

# Parameters in query: 0
Length of URL: 40
Length of contents: 2905
Time for request: 0.537268042564

# Parameters in query: 1
Length of URL: 64
Length of contents: 3001
Time for request: 0.14549601078

# Parameters in query: 10
Length of URL: 307
Length of contents: 5537
Time for request: 0.151428103447

# Parameters in query: 100
Length of URL: 2737
Length of contents: 31817
Time for request: 0.222425937653

# Parameters in query: 1000
Length of URL: 27037
Length of contents: 294617
Time for request: 4.47611808777

# Parameters in query: 2000
Length of URL: 54037
Length of contents: 586617
Time for request: 18.8749380112

# Parameters in query: 4800
Length of URL: 129637
Length of contents: 1404217
Time for request: 106.951847911

# Parameters in query: 5000
Length of URL: 135037
Length of contents: 115
Time for request: 0.516644954681

# Parameters in query: 10000
Length of URL: 270037
Length of contents: 115
Time for request: 1.01809692383

When I attempted to run the attack against Apache 1.3/Mailman, any moderately-long GET requests provoked an Apache error message.
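[Editorial aside: the quadratic growth Jeff measures comes straight from dict collision handling: each colliding key must be compared against the keys already on its probe chain. A minimal sketch (a hypothetical `Collider` class in modern Python, not Scott's actual attack strings) makes the growth visible by counting equality tests:]

```python
class Collider:
    """Toy key: every instance hashes alike, forcing worst-case probing."""
    eq_calls = 0  # total __eq__ invocations across all instances

    def __init__(self, n):
        self.n = n

    def __hash__(self):
        return 0  # every key lands on the same probe chain

    def __eq__(self, other):
        Collider.eq_calls += 1
        return self.n == other.n

def insert_cost(count):
    """Insert `count` all-colliding keys; return how many comparisons it took."""
    Collider.eq_calls = 0
    d = {}
    for i in range(count):
        d[Collider(i)] = i
    return Collider.eq_calls

small, large = insert_cost(200), insert_cost(400)
# Doubling the input roughly quadruples the comparisons: O(n**2) overall.
print(small, large)
```

With random keys the comparison count would grow linearly; here it quadruples when the input doubles, which is exactly the shape of the benchmark numbers above.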
########################################################################
# Listing 1: test_cgi.py

import urllib, time

def time_url(url):
    t = time.time()
    u = urllib.urlopen(url)
    contents = u.read()
    t1 = time.time()
    print "Length of URL:", len(url)
    print "Length of contents:", len(contents)
    print contents[:200]
    print "Time for request:", t1-t
    print

#URL="http://www.example.com/mailman/subscribe/test"
URL="http://localhost:8000/cgi-bin/test.cgi"

items = [line.strip() for line in open("python-attack").readlines()]

for i in (0, 1, 10, 100, 1000, 2000, 4800, 5000, 10000):
    print "# Parameters in query:", i
    url = URL+"?x"
    url = url + "=1&".join(items[:i])
    time_url(url)
########################################################################

I re-wrote the script to use POST instead of GET, and again ran it on cgi.py and mailman. For some reason, using 0 or 1 items against CGIHTTPServer.py seemed to hang.

########################################################################
# Output 2: Running attack in listing 2 on cgi.py

# Parameters in query: 10
Length of URL: 38
Length of data: 272
Length of contents: 3543
Time for request: 0.314235925674

# Parameters in query: 100
Length of URL: 38
Length of data: 2702
Length of contents: 13894
Time for request: 0.218624949455

# Parameters in query: 1000
Length of URL: 38
Length of data: 27002
Length of contents: 117395
Time for request: 2.20617306232

# Parameters in query: 2000
Length of URL: 38
Length of data: 54002
Length of contents: 232395
Time for request: 9.92248606682

# Parameters in query: 5000
Length of URL: 38
Length of data: 135002
Length of contents: 577396
Time for request: 57.3930220604

# Parameters in query: 10000
Length of URL: 38
Length of data: 270002
Length of contents: 1152396
Time for request: 238.318212986

########################################################################
# Output 3: Running attack in listing 2 on mailman

# Parameters in query: 10
Length of URL: 44
Length of data: 272
Length of contents: 852
Time for request: 0.938691973686

# Parameters in query: 100
Length of URL: 44
Length of data: 2702
Length of contents: 852
Time for request: 0.819067001343

# Parameters in query: 1000
Length of URL: 44
Length of data: 27002
Length of contents: 852
Time for request: 1.13541901112

# Parameters in query: 2000
Length of URL: 44
Length of data: 54002
Length of contents: 852
Time for request: 1.59714698792

# Parameters in query: 5000
Length of URL: 44
Length of data: 135002
Length of contents: 852
Time for request: 3.12452697754

# Parameters in query: 10000
Length of URL: 44
Length of data: 270002
Length of contents: 852
Time for request: 5.72900700569

########################################################################
# Listing 2: attack program using POST for longer URLs

import urllib2, time

def time_url(url, data):
    t = time.time()
    u = urllib2.urlopen(url, data)
    contents = u.read()
    t1 = time.time()
    print "Length of URL:", len(url)
    print "Length of data:", len(data)
    print "Length of contents:", len(contents)
    print "Time for request:", t1-t
    print

#URL="http://www.example.com/mailman/subscribe/test"
URL="http://localhost:8000/cgi-bin/test.cgi"

items = [line.strip() for line in open("python-attack").readlines()]

for i in (10, 100, 1000, 2000, 5000, 10000):
    print "# Parameters in query:", i
    data = "x" + "=1&".join(items[:i]) + "\r\n\r\n"
    time_url(URL, data)

From tim.one@comcast.net Sat May 31 19:28:25 2003 From: tim.one@comcast.net (Tim Peters) Date: Sat, 31 May 2003 14:28:25 -0400 Subject: [Python-Dev] Algoritmic Complexity Attack on Python In-Reply-To: <5.1.0.14.0.20030531090628.02567ad0@mail.telecommunity.com> Message-ID: [Phillip J. Eby] > ... > or at least warning the developers of alternatives to CGI (e.g. Zope, > Quixote, SkunkWeb, CherryPy, etc.) that alternate hashes might be a good > idea. Don't know about SkunkWeb or CherryPy etc, but Zope and Quixote apps can use ZODB's BTrees for mappings.
Insertion and lookup in a BTree have worst-case log-time behavior, and no "bad" sets of keys exist for them. The Buckets forming the leaves of BTrees are vulnerable, though: provoking quadratic-time behavior in a Bucket only requires inserting keys in reverse-sorted order, and sometimes apps use Buckets directly when they should be using BTrees. Using a data structure appropriate for the job at hand is usually a good idea . From python@rcn.com Sat May 31 19:34:13 2003 From: python@rcn.com (Raymond Hettinger) Date: Sat, 31 May 2003 14:34:13 -0400 Subject: [Python-Dev] Algoritmic Complexity Attack on Python References: <200305310041.h4V0fsF20796@pcp02138704pcs.reston01.va.comcast.net> <20030531130503.GA16185@unpythonic.net> <200305311655.h4VGtLk21998@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <003b01c327a3$3c30f5e0$125ffea9@oemcomputer> > > On Fri, May 30, 2003 at 08:41:54PM -0400, Guido van Rossum wrote: > > > Of course, such programs are already vulnerable to changes in the hash > > > implementation between Python versions (which has happened before). > > > > Is there at least a guarantee that the hashing algorithm won't change in a > > bugfix release? For instance, can I depend that > > python222 -c 'print hash(1), hash("a")' > > python223 -c 'print hash(1), hash("a")' > > will both output the same thing, even if > > python23 -c 'print hash(1), hash("a")' > > and > > python3000 -c 'print hash(1), hash("a")' > > may print something different? > > That's a reasonable assumption, yes. We realize that changing the > hash algorithm is a feature change, even if it is a very subtle one. For Scott's proposal to work, it would have to change the hash value on every invocation of Python. If not, colliding keys can be found with a Monte Carlo method. 
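[Editorial aside: Raymond's Monte Carlo point can be sketched directly. If the hash function is fixed across runs, or its values leak, an attacker can birthday-search for colliding inputs offline; the sketch below collides only a truncated 20-bit hash so the search stays cheap (the string set and bit width are illustrative, not Scott's construction):]

```python
import random
import string

def find_truncated_collision(bits=20, length=8, seed=12345):
    """Randomly probe strings until two distinct ones agree in the low
    `bits` bits of hash() -- a birthday search needs ~2**(bits/2) tries."""
    mask = (1 << bits) - 1
    seen = {}  # truncated hash -> first string seen with it
    rng = random.Random(seed)
    while True:
        s = ''.join(rng.choice(string.ascii_letters) for _ in range(length))
        h = hash(s) & mask
        if h in seen and seen[h] != s:
            return seen[h], s
        seen[h] = s

a, b = find_truncated_collision()
print(a, b)  # two distinct strings sharing their low 20 hash bits
```

Modern CPython randomizes string hashes per process (PYTHONHASHSEED), which is the eventual outcome of this thread: the search above still succeeds within one process, but the collisions it finds are worthless against a fresh process with a different seed.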
Raymond Hettinger From tim.one@comcast.net Sat May 31 19:50:01 2003 From: tim.one@comcast.net (Tim Peters) Date: Sat, 31 May 2003 14:50:01 -0400 Subject: [Python-Dev] Algoritmic Complexity Attack on Python In-Reply-To: <20030531130503.GA16185@unpythonic.net> Message-ID: [Jeff Epler] > Is there at least a guarantee that the hashing algorithm won't change > in a bugfix release? Guido said "yes" weakly, but the issue hasn't come up in recent times. In the past we've changed the hash functions for at least strings, tuples, and floats, based on systematic weaknesses uncovered by real-life ordinary data. OTOH, there's a requirement that, for objects of types that can be used as dict keys, two objects that compare equal must deliver equal hash codes, so people creating (mostly) number-like or (not sure if anyone does this) string-like types have to duplicate the hash codes Python delivers for builtin numbers and strings that compare equal to objects of their types. For example, the author of a Rational class should arrange for hash(Rational(42, 1)) to deliver the same result as hash(42) == hash(42L) == hash(42.0) == hash(complex(42.0, 0.0)) Such code would break if we changed the int/long/float/complex hashes for inputs that compare equal to integers. Tedious exercise for the reader: find a set of bad datetime objects in 2.3 ("bad" in the sense of their hash codes colliding; on a box where hash() returns a 32-bit int, there must be collisions, since datetime objects have way more than 32 independent bits of state). From tim.one@comcast.net Sat May 31 20:27:29 2003 From: tim.one@comcast.net (Tim Peters) Date: Sat, 31 May 2003 15:27:29 -0400 Subject: [Python-Dev] Algoritmic Complexity Attack on Python In-Reply-To: Message-ID: [Scott Crosby] > ... > Also, I'd like to thank Tim Peters for telling me about the potential > of degradation that regular expressions may offer. I'm acutely aware of that one because it burns people regularly. 
These aren't cases of hostile input, they're cases of innocently "erroneous" input. After maybe a year of experience, people using a backtracking regexp engine usually figure out how to write a regexp that doesn't go resource-crazy when parsing strings that *do* match. Those are the inputs the program expects. But all inputs can suffer errors, and a regexp that works well when the input matches can still go nuts trying to match a non-matching string, consuming an exponential amount of time trying an exponential number of futile backtracking possibilities. Here's an unrealistic but tiny example, to get the flavor across:

"""
import re
pat = re.compile('(x+x+)+y')
from time import clock as now
for n in range(10, 100):
    print n,
    start = now()
    pat.search('x' * n + 'y')
    print now() - start,
    start = now()
    pat.search('x' * n)
    print now() - start
"""

The fixed regexp here is

    (x+x+)+y

and we search strings of the form

    xxx...xxxy  which do match
    xxx...xxx   which don't match

The matching cases take time linear in the length of the string, but it's so fast it's hard to see the time going up at all until the string gets very large. The failing cases take time exponential in the length of the string. Here's sample output:

10 0.000155885951826 0.00068891533549
11 1.59238337887e-005 0.0013736401884
12 1.76000268191e-005 0.00268777552423
13 2.43047989406e-005 0.00609379976198
14 2.51428954558e-005 0.0109438642954
15 3.4361957123e-005 0.0219815954005
16 3.10095710622e-005 0.0673058549423
17 3.26857640926e-005 0.108308050755
18 3.35238606078e-005 0.251965336328
19 3.68762466686e-005 0.334131480581
20 3.68762466685e-005 0.671073936875
21 3.60381501534e-005 1.33723327578
22 3.60381501534e-005 2.68076149449
23 3.6038150153e-005 5.37420757974
24 3.6038150153e-005 10.7601803584
25 3.52000536381e-005

I killed the program then, as I didn't want to wait 20+ seconds for the 25-character string to fail to match.
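[Editorial aside: the usual cure, not spelled out in the thread, is to rewrite the ambiguous pattern so no input can force exponential backtracking. Since `(x+x+)+y` just means "two or more x's followed by y", an unambiguous equivalent is `xx+y`; this rewrite is my suggestion, not Tim's:]

```python
import re
import time

slow = re.compile(r'(x+x+)+y')  # ambiguous: exponential backtracking on failure
fast = re.compile(r'xx+y')      # same language: two or more x's, then y

# Both patterns accept exactly the same strings (checked on short inputs
# only -- feeding a long non-matching string to `slow` would never return).
for probe in ('xy', 'xxy', 'xxxxxy', 'x' * 40 + 'y', 'y', ''):
    assert bool(slow.search(probe)) == bool(fast.search(probe))

# The rewritten pattern rejects a long all-x string quickly; its failure
# scan is at worst quadratic here, versus exponential for the original.
# (Python 3.11's possessive quantifier, r'xx++y', would make it linear.)
start = time.perf_counter()
assert fast.search('x' * 2000) is None
print('rejected in %.4f seconds' % (time.perf_counter() - start))
```

The general rule: when two adjacent quantified subpatterns can divide the same characters between them in many ways, the engine will try every division before giving up.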
The horrid problem here is that it takes a highly educated eye to look at that regexp and see in advance that it's going to have "fast when it matches, possibly horrid when it doesn't match" behavior -- and this is a dead easy case to analyze. In a regexp that slobbers on across multiple lines, with 5 levels of group nesting, my guess is that no more than 1 programmer in 1000 has even a vague idea how to start looking for such problems. From tim.one@comcast.net Sat May 31 21:21:44 2003 From: tim.one@comcast.net (Tim Peters) Date: Sat, 31 May 2003 16:21:44 -0400 Subject: [Python-Dev] Algoritmic Complexity Attack on Python In-Reply-To: Message-ID: [Tim] >> ... >> Overall, though, I'd welcome a faster string hash, and I agree that >> Python's isn't particularly zippy. [Scott Crosby] > Actually, at least on x86, it is faster than perl. On other platforms, > it may be somewhat slower. Perl's isn't particularly zippy either. I believe that, given heroic coding effort, a good universal hash designed for speed can get under 1 cycle per byte hashed on a modern Pentium. Python's and Perl's string hashes aren't really in the same ballpark. > ... > Yes, I am aware of the problems with the UHASH code. Unfortunately, I > am not a hash function designer, that code is not mine, and I only use > it as a black box. > > I also consider all code, until verified otherwise, to potentially > suffer from endianness, alignment, and 32/64 bit issues. Excluding > alignment issues (which I'm not sure whether to say that its OK to > fail on strange alignments or not) it has passed *my* self-tests on > big endian and 64 bit. Then who's going to vet the code on a Cray T3 (etc, etc, etc, etc)? This isn't a nag, it cuts to the heart of what a language like Python can do: the x-platform behavior of Python's current string hash is easy to understand, relying only on what the C standard demands. 
It's doing (only) one evil thing, relying on the unspecified (by C) semantics of what happens when a signed long multiply overflows. Python runs on just about every platform on Earth, and that hasn't been a problem so far. If it becomes a problem, we can change the accumulator to unsigned long, and then C would specify what happens. There ends the exhaustive discussion of all portability questions about our current code. > ... >> + Some Python apps run for months, and universal hashing doesn't >> remove the possibility of quadratic-time behavior. If I can poke >> at a long-running app and observe its behavior, over time I can >> deduce a > I argued on linux-kernel with someone else that this was extremely > unlikely. It requires the latency of a collision/non-collision being > noticeable over a noisy system, network stack, and system. So you have in mind only apps accessed across a network? And then, for example, discounting the possibility that a bitter sysadmin opens an interactive Python shell on the box, prints out a gazillion (i, hash(i)) pairs and mails them to himself for future mischief? > In almost all cases, for short inputs, the cost of a single L2 cache > miss far exceeds that of hashing. If the user is restricted to short inputs, provoking quadratic-time behavior doesn't matter. > A more serious danger is an application that leaks actual hash values. So ever-more operations become "dangerous". Programmers will never keep this straight, and Python isn't a straitjacket language. I still vote that apps for which this matters use an *appropriate* data structure instead -- Python isn't an application, it's a language for programming applications. > ... > Agreed, many have realized over the years that hash tables can have > quadratic behavior in an adversarial environment. People from real-time backgrounds are much more paranoid than that, and perhaps the security-conscious could learn a lot from them.
For example, you're not going to find a dynamic hash table in a real-time app, because they can't take a chance on bad luck either. To them, "an adversary" is just an unlucky roll of the dice, and they can't tolerate it. > It isn't hidden. Cormen, Leiserson, and Rivest even warn about this in > their seminal algorithms textbook in 1991. Knuth warned about it a lot earlier than that. > It *is* obvious when thought of, but the reason I was able to ship out > so many vulnerability reports yesterday was because few actually *have* > thought of that deterministic worst-case when writing their programs. I > predict this trend to continue. I appreciate that, and I expect it to continue too. I expect a better solution would be for more languages to offer a choice of containers with different O() behaviors. In C it's hard to make progress because the standard language comes with so little, and so many "portable" C libraries aren't. The C++ world is in better shape, constrained more by the portability of standard C++ itself. There's less excuse for Python or Perl programmers to screw up in these areas, because libraries written in Python and Perl are very portable, and there are a lot of 'em to choose from. > I like hash tables a lot, with UH, their time bounds are randomized, > but are pretty tight and the constant factors far exceed those of > balanced binary trees. Probably, but have you used a tree implementation into which the same heroic level of analysis and coding effort has been poured? The typical portable-C balanced tree implementation should be viewed as a worst-case bound on how fast balanced trees can actually work. Recently, heroic efforts have been poured into Judy tries, which may be both faster and more memory-efficient than hash tables in many kinds of apps: http://judy.sourceforge.net/ The code for Judy tries makes UHASH look trivial, though.
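[Editorial aside: for concreteness, the multiply-and-xor string hash Tim defends above looks roughly like this when transliterated from the C of that era into Python, with an explicit 32-bit mask standing in for C's overflow behavior; the constants are from my reading of the historical implementation, so treat the details as approximate:]

```python
def string_hash(s, mask=0xFFFFFFFF):
    """Sketch of CPython's classic (pre-randomization) string hash."""
    if not s:
        return 0
    x = ord(s[0]) << 7          # seed with the first character
    for ch in s:
        x = ((1000003 * x) ^ ord(ch)) & mask  # multiply, xor, wrap at 32 bits
    return x ^ len(s)           # fold in the length

print(hex(string_hash('python')))
```

It is deterministic, fast, and well-spread on ordinary data, but it keys on nothing secret, so colliding inputs can be computed offline once and reused forever -- which is exactly the attack under discussion, and what universal hashing's random key would prevent.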
OTOH, provided the Judy code actually works on your box, and there aren't bugs hiding in its thousands of lines of difficult code, relying on a Judy "array" for good worst-case behavior isn't a matter of luck. From dave@boost-consulting.com Sat May 31 23:11:05 2003 From: dave@boost-consulting.com (David Abrahams) Date: Sat, 31 May 2003 18:11:05 -0400 Subject: [Python-Dev] more-precise instructions for "Python.h first"? In-Reply-To: (Martin v. Löwis's message of "31 May 2003 18:28:25 +0200") References: Message-ID: martin@v.loewis.de (Martin v. Löwis) writes: > David Abrahams writes: > >> Anyway, the point is that I'd like to have the rule changed to "You >> have to include Python.h or xxxx.h before any system header" where >> xxxx.h is one of the other existing headers #included in Python.h that >> is responsible for setting up whatever macros cause this >> inclusion-order requirement in the first place (preferably not >> LongObject.h!) > > If I understand correctly, you want to follow the rule "I want to > change things as long it continues to work for me". Then you don't understand correctly. > For that, you don't need any permission. If it works for you, you > can ignore any rules you feel uncomfortable with. > > The rule is there for people who don't want to understand the > specific details of system configuration. If you manage to get a > consistent configuration in a different way, just go for it. You > should make sure then that your users can't run into problems, > though. I can't make sure that my users can't run into problems without understanding everything about Python and Posix which causes the rule to exist in the first place (and I don't), and continuously monitoring Python into the future to make sure that the distribution of Posix configuration information across its headers doesn't change in a way that invalidates previous assumptions.
The current rule doesn't work for me, but I'd like to be following _some_ sanctioned rule to reduce the chance of problems today and in the future. I'm making an educated guess that the rule is much more sweeping than Python development needs it to be. Isn't there some Python internal configuration header which can be #included first and which will accomplish all the same things as far as system-header inclusion order is concerned? -- Dave Abrahams Boost Consulting www.boost-consulting.com