From noreply@sourceforge.net Thu Aug 1 04:11:48 2002
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Wed, 31 Jul 2002 20:11:48 -0700
Subject: [Patches] [ python-Patches-587076 ] Adaptive stable mergesort
Message-ID:

Patches item #587076, was opened at 2002-07-26 11:51
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=587076&group_id=5470

Category: Core (C code)
Group: Python 2.3
>Status: Closed
>Resolution: Accepted
Priority: 5
Submitted By: Tim Peters (tim_one)
Assigned to: Tim Peters (tim_one)
Summary: Adaptive stable mergesort

Initial Comment:
This adds method list.msort([compare]).

Lib/test/sortperf.py is already a sort performance test. To run it on exactly the same data I used, run it via

    python -O sortperf.py 15 20 1

That will time the current samplesort (even after this patch). After getting stable numbers for that, change sortperf's doit() to say L.msort() instead of L.sort(), and you'll time the mergesort instead.

CAUTION: To save time across many runs, sortperf saves the random floats it generates, into temp files. If those temp files already exist when sortperf starts, it reads them up instead of generating new numbers. As a result, it's important in the above to pass "1" as the last argument the *first* time you run sortperf -- that forces the random # generator into the same state it was when I used it.

This patch also gives lists a new list.hsort() method, which is a weak heapsort I gave up on. Time it if you want to see how bad an excellent sort can get.

----------------------------------------------------------------------

>Comment By: Tim Peters (tim_one)
Date: 2002-07-31 23:11

Message:
Logged In: YES
user_id=31435

This is checked in now, so closing this patch.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2002-07-31 16:45

Message:
Logged In: YES
user_id=6380

1. Go for it.

2. Advertise it as an implementation feature.

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2002-07-31 16:37

Message:
Logged In: YES
user_id=31435

Replaced the doc file. The new one contains more info comparing msort to sort.

There's nothing more I want to do here, and it looks like everyone who might time this already did. Assigned to Guido for pronouncement. I recommend replacing list.sort() with this. The only real downside is the potential for requiring 2*N temp bytes; that (and everything else) is discussed in the doc file.

If this is accepted, another issue is whether to *advertise* that this sort is stable. Some people really want that, but requiring stability constrains implementations. Another possibility is to give lists two sort methods, one guaranteed stable and the other not, where in 2.3 CPython both map to this code.

In no case do I want to keep both the samplesort and timsort implementations in the core -- one brain-busting sort implementation is quite enough. This one has many wonderful properties the samplesort hybrid lacks.

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2002-07-31 02:02

Message:
Logged In: YES
user_id=31435

~sort gets more mysterious all the time: the mystery now is why it's *not* much slower everywhere! Here are the exact # of compares ~sort does:

 i        n     sort    msort    %ch    lg(n!)
--  -------  -------  -------  -----  --------
15    32768   130484   188720  44.63    444255
16    65536   260019   377634  45.23    954037
17   131072   555035   755476  36.11   2039137
18   262144  1107826  1511174  36.41   4340409
19   524288  2218562  3022584  36.24   9205096
20  1048576  4430616  6045418  36.45  19458756

The last column is the information-theoretic lower bound for sorting random arrays of this size (no comparison-based algorithm can do better than that on average), showing that sort() and msort() are both getting a lot of good out of the duplicates.
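
The lg(n!) column in the table above can be reproduced in a few lines. This is only a sketch, not code from the patch: the helper name lg_factorial is made up for illustration, it relies on ln(n!) = lgamma(n + 1), and it assumes the table values were rounded up to the next integer (which is what matches the figures shown).

```python
import math

def lg_factorial(n):
    """Return log2(n!), using the identity ln(n!) == lgamma(n + 1)."""
    return math.lgamma(n + 1) / math.log(2)

# Same sizes as the table: n = 2**i for i = 15..20.
for i in range(15, 21):
    n = 2 ** i
    # Rounding up is an assumption about how the table was produced.
    print(i, n, math.ceil(lg_factorial(n)))
```

Dividing a measured compare count by this bound gives a quick feel for how far below "random data" cost a run came in; for i = 15, sort() used well under a third of the bound.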
But sort()'s special case for equal elements is extremely effective on ~sort's specific data pattern, and msort just isn't going to get close to that (it does better than sort() on skewed distributions with lots of duplicates, though).

The only thing I can think of that could transform what "should be" highly significant slowdowns into highly significant speedups on some boxes is catastrophic cache effects in samplesort. But knowing something about how both algorithms work, that's not screaming "oh, of course".

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2002-07-30 21:12

Message:
Logged In: YES
user_id=31435

New doc file, with an intro at the start and a program at the end. Turns out that merge4.patch actually reversed the random-array #-of-comparisons advantage samplesort had enjoyed: it's now timsort that does 1-2% fewer comparisons on random arrays of random lengths. See the end of the file for why samplesort does 50% more comparisons on average for random arrays of length two.

Near the end of the new Intro section at the start, I suggest a couple experiments people might try on boxes where ~sort is much slower under timsort. That remains baffling, but the algorithm doesn't *do* much in that case, so someone on a box where it flounders could surely figure out why.

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2002-07-30 12:14

Message:
Logged In: YES
user_id=31435

In Kevin's company database (see Python-Dev), there were 8 fields we could sort on. Two fields were strongly correlated with the order of the records as given, and msort() was >6x faster on those. Two fields were weakly correlated, and msort was a major win on those (>25% speedup). One field had many duplicates, with a highly skewed distribution. msort was better than 2x faster on that.
But the rest (phone#, #employees, address) were essentially randomly ordered, and msort was systematically a few percent slower on those. That wouldn't have been remarkable, except that the percentage slowdown was a few times larger than the percentage by which msort did more comparisons than sort().

I eventually figured out the obvious: the # of records wasn't an exact power of 2, and on random data msort then systematically arranged for the final merge to be between a run with a large power-of-2 size, and a run with the little bit left over. That adds a bunch of compares over perfectly balanced merges, plus O(N) pointer copies, just to get that little bit in place.

The new merge4.patch repairs that as best as (I think) non-heroically possible, quickly picking a minimum run length in advance that should almost never lead to a "bad" final merge when the data is randomly ordered.

In each of Kevin's 3 "problem sorts", msort() does fewer compares than sort() now, and the runtime is generally within a fraction of a percent. These all-in-cache cases still seem to favor sort(), though, and it appears to be because msort() does a lot more data movement (note that quicksorts do no more than one swap per compare, and often none, while mergesorts do a copy on every compare). The other 5 major-to-killer wins msort got on this data remain intact.

The code changes needed were tiny, but the doc file changed a lot more.

Note that this change has no effect on arrays with power-of-2 sizes, so sortperf.py timings shouldn't change (and don't on my box). The code change is solely to compute a good minimum run length before the main loop begins, and it happens to return the same value as was hard-coded before when the array has a power-of-2 size.

More testing on real data would be most welcome! Kevin's data was very helpful to me.

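
One way to pick such a minimum run length is sketched below in Python. It follows the scheme described in the listsort.txt notes shipped with later CPython releases: keep the six most significant bits of n, and add one if any of the lower bits is set. Whether merge4.patch computes exactly this is an assumption, and so are the name compute_minrun and the cutoff of 64.

```python
def compute_minrun(n):
    """Pick a minimum run length for an array of n elements (sketch only).

    The goal is for n / minrun to be a power of 2 or slightly less than
    one, so the final merges stay balanced on randomly ordered data.
    """
    r = 0  # becomes 1 if any bit shifted out below was set
    while n >= 64:
        r |= n & 1
        n >>= 1
    return n + r

for n in (63, 1000, 2112, 2 ** 20):
    minrun = compute_minrun(n)
    print(n, minrun, n / minrun)
```

Under this sketch every power-of-2 size of at least 64 yields the same value (32), which is consistent with the remark above that power-of-2 arrays are unaffected. A size such as 2112 yields 33, so the array splits into exactly 64 runs instead of 66 runs of 32 with an awkward tail.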
----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2002-07-30 11:42

Message:
Logged In: YES
user_id=31435

Adding merge4.patch; explanation to follow.

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2002-07-30 11:41

Message:
Logged In: YES
user_id=31435

Deleting old doc file and merge3.patch; adding new doc file.

----------------------------------------------------------------------

Comment By: Michael Hudson (mwh)
Date: 2002-07-29 09:12

Message:
Logged In: YES
user_id=6656

On my iBook (600 MHz G3 with 384 megs of RAM, OS X 10.1.5):

L.sort():

 i   *sort   \sort   /sort   3sort   +sort   ~sort   =sort   !sort
15    0.19    0.01    0.00    0.20    0.02    0.07    0.01    0.21
16    0.45    0.05    0.04    0.43    0.04    0.15    0.05    0.47
17    1.00    0.09    0.09    1.01    0.09    0.37    0.09    1.08
18    2.16    0.16    0.16    2.26    0.22    0.75    0.18    2.35
19    4.80    0.38    0.36    5.08    0.46    1.45    0.35    5.31
20   10.65    0.79    0.79   11.83    0.89    3.33    0.78   11.88

L.msort():

 i   *sort   \sort   /sort   3sort   +sort   ~sort   =sort   !sort
15    0.18    0.02    0.03    0.02    0.03    0.08    0.02    0.04
16    0.43    0.03    0.03    0.04    0.04    0.17    0.04    0.08
17    0.95    0.08    0.09    0.09    0.08    0.34    0.08    0.18
18    2.08    0.18    0.18    0.19    0.18    0.72    0.18    0.37
19    4.59    0.37    0.38    0.39    0.38    1.47    0.36    0.76
20   10.22    0.83    0.76    0.79    0.78    3.04    0.79    1.66

I've run this often enough to believe they're typical (inc. .msort() beating .sort() on *sort and ~sort by a small margin).

Looks like an unequivocal win on this box.

----------------------------------------------------------------------

Comment By: Andrew I MacIntyre (aimacintyre)
Date: 2002-07-29 07:53

Message:
Logged In: YES
user_id=250749

The following results are from your original patch (the n column dropped for better SF display).

System 1: Athlon 1.4Ghz, 256MB PC2100 RAM, OS2 v4 FixPack 12, EMX 0.9d Fix 4

gcc 2.8.1 -O2

samplesort
 i   *sort   \sort   /sort   3sort   +sort   ~sort   =sort   !sort
15    0.07    0.01    0.01    0.07    0.01    0.03    0.02    0.08
16    0.18    0.02    0.01    0.18    0.02    0.08    0.01    0.20
17    0.41    0.04    0.04    0.43    0.05    0.18    0.04    0.46
18    0.93    0.09    0.10    1.00    0.10    0.39    0.10    1.05
19    2.08    0.18    0.20    2.34    0.23    0.81    0.20    2.36
20    4.69    0.37    0.40    5.02    0.47    1.68    0.40    5.28

timsort
 i   *sort   \sort   /sort   3sort   +sort   ~sort   =sort   !sort
15    0.06    0.01    0.01    0.01    0.01    0.03    0.01    0.02
16    0.15    0.03    0.01    0.02    0.02    0.06    0.02    0.04
17    0.37    0.04    0.05    0.04    0.05    0.13    0.05    0.10
18    0.88    0.10    0.09    0.10    0.10    0.28    0.10    0.19
19    1.97    0.20    0.18    0.21    0.21    0.58    0.20    0.39
20    4.40    0.41    0.40    0.42    0.40    1.21    0.40    0.81

gcc 2.95.2 -O3

samplesort
 i   *sort   \sort   /sort   3sort   +sort   ~sort   =sort   !sort
15    0.07    0.01    0.00    0.07    0.01    0.03    0.00    0.08
16    0.17    0.01    0.03    0.17    0.02    0.09    0.02    0.19
17    0.42    0.05    0.04    0.46    0.06    0.18    0.05    0.45
18    0.99    0.09    0.09    1.05    0.12    0.40    0.09    1.05
19    2.09    0.18    0.21    2.18    0.23    0.84    0.20    2.45
20    4.73    0.39    0.41    5.13    0.47    1.70    0.40    5.38

timsort
 i   *sort   \sort   /sort   3sort   +sort   ~sort   =sort   !sort
15    0.10    0.01    0.01    0.01    0.01    0.04    0.01    0.01
16    0.18    0.02    0.01    0.03    0.02    0.07    0.03    0.03
17    0.37    0.06    0.05    0.04    0.05    0.14    0.04    0.09
18    0.91    0.10    0.10    0.10    0.10    0.27    0.09    0.20
19    1.97    0.21    0.21    0.20    0.20    0.59    0.19    0.40
20    4.31    0.44    0.40    0.44    0.40    1.21    0.40    0.82

System 2: P5-166 SMP (2 CPU), 64MB 60ns FPM RAM, FreeBSD 4.4-RELEASE with a patch to re-enable CPU L1 caches (SMP BIOS issue)

gcc 2.95.3 -O3

samplesort
 i   *sort   \sort   /sort   3sort   +sort   ~sort   =sort   !sort
15    0.73    0.06    0.05    0.74    0.07    0.23    0.05    0.77
16    1.60    0.12    0.12    1.66    0.13    0.48    0.12    1.71
17    3.54    0.26    0.24    3.55    0.27    1.05    0.25    3.74
18    7.63    0.52    0.51    7.73    0.58    2.12    0.50    8.05
19   16.38    1.04    1.01   17.03    1.15    4.28    1.01   17.17
20   34.94    2.09    2.02   35.04    2.37    8.62    2.02   36.58

timsort
 i   *sort   \sort   /sort   3sort   +sort   ~sort   =sort   !sort
15    0.74    0.05    0.06    0.06    0.06    0.32    0.06    0.12
16    1.64    0.12    0.12    0.12    0.12    0.65    0.12    0.26
17    3.62    0.25    0.25    0.27    0.26    1.32    0.25    0.52
18    7.78    0.51    0.50    0.53    0.52    2.69    0.50    1.06
19   16.76    1.03    1.01    1.09    1.04    5.46    1.01    2.12
20   35.93    2.09    2.02    2.14    2.09   11.05    2.04    4.38

System 3: 486DX4-100, 32MB 60ns FPM RAM, FreeBSD 4.4-RELEASE

gcc 2.95.3 -O3

samplesort
 i   *sort   \sort   /sort   3sort   +sort   ~sort   =sort   !sort
15    2.62    0.21    0.21    2.61    0.24    0.83    0.21    2.71
16    5.73    0.45    0.44    5.75    0.48    1.71    0.44    5.94
17   12.46    0.90    0.88   12.34    1.00    3.70    0.89   13.00
18   27.15    1.82    1.80   27.12    2.17    7.59    1.80   28.10
19   57.22    3.77    3.68   59.52    4.41   15.40    3.66   59.62
20  126.80    7.96    7.80  127.63    9.58   32.72    7.46  134.45

timsort
 i   *sort   \sort   /sort   3sort   +sort   ~sort   =sort   !sort
15    2.52    0.21    0.20    0.20    0.20    1.05    0.20    0.42
16    5.49    0.45    0.41    0.43    0.44    2.13    0.43    0.90
17   12.15    0.88    0.84    0.85    0.88    4.34    0.88    1.83
18   26.11    1.82    1.74    1.84    1.81    8.70    1.74    3.67
19   56.34    3.67    3.55    3.80    3.67   17.84    3.53    7.48
20  121.95    7.89    7.37    8.24    7.98   39.38    7.44   16.83

NOTES:
System 2 is just starting to swap in the i=20 case.
System 3 starts to swap at i=18; at i=19, process:resident size is 2:1; at i=20, process:resident size is a bit over 4:1.

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2002-07-28 15:28

Message:
Logged In: YES
user_id=31435

Dang! That little optimization introduced a subtle assumption that the comparison function is consistent. We can't assume that in Python (user-supplied functions can be arbitrarily goofy). Deleted merge2.patch and added merge3.patch to repair that.

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2002-07-28 15:00

Message:
Logged In: YES
user_id=31435

Just van Rossum
400Mhz G4 PowerPC running MacOSX 10.1.5.
original patch

>From an email report; I chopped the "n" column and removed some whitespace so it's easier to read on SF.

L.sort()
 i   *sort   \sort   /sort   3sort   +sort   ~sort   =sort   !sort
15    0.28    0.03    0.02    0.29    0.03    0.10    0.02    0.31
16    0.65    0.05    0.04    0.65    0.06    0.20    0.05    0.71
17    1.47    0.11    0.12    1.53    0.13    0.50    0.10    1.54
18    3.19    0.24    0.25    3.19    0.29    0.98    0.23    3.39
19    6.96    0.52    0.48    7.11    0.55    2.00    0.45    7.48
20   15.15    0.99    0.94   15.96    1.12    4.20    1.02   16.32

L.msort()
 i   *sort   \sort   /sort   3sort   +sort   ~sort   =sort   !sort
15    0.31    0.03    0.02    0.02    0.03    0.11    0.02    0.04
16    0.64    0.04    0.04    0.05    0.05    0.25    0.06    0.11
17    1.42    0.14    0.13    0.10    0.12    0.51    0.12    0.20
18    3.01    0.26    0.21    0.23    0.22    1.07    0.19    0.46
19    6.54    0.51    0.44    0.47    0.45    2.17    0.45    0.90
20   14.27    0.98    0.96    0.96    0.96    4.34    0.95    2.04

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2002-07-28 14:14

Message:
Logged In: YES
user_id=31435

Adding new patch, merge2.patch. Most of this is semantically neutral compared to the last version -- more asserts, better comments, minor code fiddling for clarity, got rid of the weak heapsort.

There is one useful change, extracting more info out of the pre-merge "find the endpoints" searches. This helps "in theory" most of the time, but probably not enough to measure. In some odd cases it can help a lot, though. See Python-Dev for discussion.

There's no strong reason to time this stuff again, if you already did it once (and thanks to those who did!).

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2002-07-28 14:09

Message:
Logged In: YES
user_id=31435

Adding new doc file.

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2002-07-28 14:08

Message:
Logged In: YES
user_id=31435

Deleting old doc file.

----------------------------------------------------------------------

Comment By: Anthony Baxter (anthonybaxter)
Date: 2002-07-27 07:23

Message:
Logged In: YES
user_id=29957

PIII Mobile 1.2GHz, 512k cache, 256M, Redhat 7.2, gcc 2.96

(samplesort)
 i     2**i   *sort   \sort   /sort   3sort   +sort   ~sort   =sort   !sort
15    32768    0.08    0.01    0.01    0.07    0.01    0.03    0.01    0.08
16    65536    0.18    0.02    0.02    0.17    0.02    0.06    0.01    0.19
17   131072    0.41    0.04    0.04    0.41    0.04    0.16    0.04    0.44
18   262144    0.93    0.09    0.08    0.90    0.10    0.33    0.08    0.97
19   524288    2.04    0.18    0.16    1.98    0.23    0.69    0.17    2.13
20  1048576    4.49    0.36    0.34    4.52    0.43    1.44    0.33    4.65

(timsort)
 i     2**i   *sort   \sort   /sort   3sort   +sort   ~sort   =sort   !sort
15    32768    0.08    0.01    0.01    0.00    0.01    0.04    0.00    0.01
16    65536    0.18    0.02    0.02    0.02    0.01    0.07    0.02    0.04
17   131072    0.42    0.03    0.04    0.04    0.04    0.14    0.03    0.08
18   262144    0.95    0.08    0.08    0.09    0.08    0.30    0.07    0.17
19   524288    2.08    0.17    0.16    0.17    0.17    0.63    0.17    0.34
20  1048576    4.56    0.33    0.33    0.33    0.35    1.29    0.33    0.71

PIII Mobile 1.2GHz, 512k cache, 256M, Redhat 7.2, gcc 3.0.4

(samplesort)
 i     2**i   *sort   \sort   /sort   3sort   +sort   ~sort   =sort   !sort
15    32768    0.08    0.01    0.01    0.08    0.00    0.02    0.01    0.08
16    65536    0.18    0.01    0.02    0.18    0.01    0.06    0.02    0.19
17   131072    0.41    0.04    0.04    0.39    0.04    0.16    0.04    0.44
18   262144    0.94    0.08    0.08    0.91    0.10    0.33    0.07    0.95
19   524288    2.05    0.17    0.16    2.07    0.20    0.70    0.16    2.11
20  1048576    4.50    0.34    0.32    4.30    0.42    1.41    0.32    4.61

(timsort)
 i     2**i   *sort   \sort   /sort   3sort   +sort   ~sort   =sort   !sort
15    32768    0.09    0.01    0.00    0.01    0.01    0.04    0.01    0.01
16    65536    0.18    0.02    0.02    0.02    0.01    0.07    0.02    0.04
17   131072    0.41    0.04    0.04    0.04    0.03    0.14    0.03    0.08
18   262144    0.93    0.08    0.07    0.08    0.08    0.31    0.08    0.16
19   524288    2.07    0.15    0.15    0.16    0.16    0.63    0.16    0.34
20  1048576    4.54    0.33    0.31    0.32    0.33    1.28    0.32    0.67

----------------------------------------------------------------------

Comment By: Anthony Baxter (anthonybaxter)
Date: 2002-07-27 04:20

Message:
Logged In: YES
user_id=29957

Sun Ultra 5, gcc 2.95.2, 512M ram, sunos 5.7.

(sort)
imperial% ./python -O Lib/test/sortperf.py 15 20 1
 i     2**i   *sort   \sort   /sort   3sort   +sort   ~sort   =sort   !sort
15    32768    0.29    0.03    0.02    0.29    0.03    0.09    0.02    0.31
16    65536    0.66    0.05    0.05    0.68    0.05    0.20    0.05    0.71
17   131072    1.50    0.11    0.11    1.51    0.12    0.47    0.11    1.60
18   262144    3.25    0.23    0.22    3.37    0.25    1.18    0.22    3.52
19   524288    6.88    0.45    0.43    7.30    0.51    1.91    0.43    7.43
20  1048576   14.90    0.92    0.88   15.49    1.05    3.89    0.90   16.04

(timsort)
imperial% ./python -O Lib/test/sortperf.py 15 20 1
 i     2**i   *sort   \sort   /sort   3sort   +sort   ~sort   =sort   !sort
15    32768    0.28    0.02    0.02    0.03    0.02    0.13    0.02    0.05
16    65536    0.59    0.05    0.05    0.06    0.05    0.26    0.05    0.11
17   131072    1.33    0.10    0.09    0.11    0.11    0.54    0.10    0.21
18   262144    2.92    0.22    0.20    0.22    0.21    1.10    0.20    0.44
19   524288    6.33    0.44    0.42    0.43    0.43    2.21    0.41    0.90
20  1048576   13.56    0.89    0.85    0.84    0.87    4.51    0.87    1.82

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2002-07-26 21:24

Message:
Logged In: YES
user_id=31435

I attached timsort.txt, a plain-text detailed description of the algorithm. After I die, it's the only clue that will remain.

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2002-07-26 16:38

Message:
Logged In: YES
user_id=31435

Intrigued by a comment of McIlroy, I tried catenating all the .c files in Objects and Modules, into one giant file, and sorted that. msort got a 22% speedup there, suggesting there's *some* kind of significant pre-existing lexicographic order (and/or reverse order) in C source files that msort is able to exploit.

Trying it again on about 1.33 million lines of Python-Dev archive (including assorted uuencoded attachments), msort got a 32% speedup.

I'm not sure what to make of that, but we needed some real life data here.

----------------------------------------------------------------------

Comment By: Skip Montanaro (montanaro)
Date: 2002-07-26 15:50

Message:
Logged In: YES
user_id=44345

Pentium III, 450MHz, 256KB L2 cache, Mandrake Linux 8.1, gcc 2.96

L.sort():
 i     2**i   *sort   \sort   /sort   3sort   +sort   ~sort   =sort   !sort
15    32768    0.32    0.02    0.03    0.30    0.03    0.09    0.03    0.32
16    65536    0.73    0.06    0.05    0.66    0.06    0.20    0.05    0.71
17   131072    1.53    0.11    0.12    1.42    0.13    0.44    0.11    1.51
18   262144    3.28    0.21    0.21    3.09    0.28    0.89    0.21    3.26
19   524288    7.05    0.44    0.42    6.60    0.59    1.81    0.42    7.03
20  1048576   15.30    0.90    0.86   14.10    1.13    3.62    0.86   14.96

L.msort():
 i     2**i   *sort   \sort   /sort   3sort   +sort   ~sort   =sort   !sort
15    32768    0.32    0.02    0.03    0.03    0.02    0.13    0.02    0.05
16    65536    0.70    0.05    0.06    0.05    0.06    0.27    0.07    0.10
17   131072    1.53    0.09    0.11    0.10    0.11    0.59    0.10    0.21
18   262144    3.27    0.22    0.21    0.23    0.21    1.13    0.21    0.43
19   524288    7.10    0.43    0.45    0.44    0.45    2.27    0.43    0.88
20  1048576   15.03    0.86    0.87    0.87    0.89    4.70    0.89    1.74

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2002-07-26 14:54

Message:
Logged In: YES
user_id=31435

Pentium III, 866 MHz, 16KB L1 D-cache, 16KB L1 I-cache, 256KB L2 cache, Win98SE, MSVC 6

samplesort
 i     2**i   *sort   \sort   /sort   3sort   +sort   ~sort   =sort   !sort
15    32768    0.17    0.01    0.01    0.17    0.01    0.05    0.01    0.11
16    65536    0.24    0.02    0.02    0.25    0.02    0.08    0.02    0.24
17   131072    0.53    0.05    0.04    0.49    0.05    0.18    0.04    0.52
18   262144    1.16    0.09    0.09    1.06    0.12    0.37    0.09    1.14
19   524288    2.53    0.18    0.17    2.30    0.24    0.75    0.17    2.47
20  1048576    5.48    0.37    0.35    5.17    0.45    1.51    0.35    5.34

timsort
 i     2**i   *sort   \sort   /sort   3sort   +sort   ~sort   =sort   !sort
15    32768    0.15    0.03    0.02    0.02    0.01    0.04    0.01    0.02
16    65536    0.23    0.02    0.02    0.02    0.02    0.09    0.02    0.04
17   131072    0.53    0.04    0.04    0.05    0.04    0.19    0.04    0.09
18   262144    1.16    0.09    0.09    0.10    0.09    0.38    0.09    0.19
19   524288    2.54    0.18    0.17    0.18    0.18    0.78    0.17    0.36
20  1048576    5.50    0.36    0.35    0.36    0.37    1.60    0.35    0.73

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2002-07-26 13:52

Message:
Logged In: YES
user_id=31435

Numbers from Marc-Andre Lemburg, "AMD Athlon 1.2GHz/Linux/gcc".

samplesort
 i     2**i   *sort   \sort   /sort   3sort   +sort   ~sort   =sort   !sort
15    32768    0.07    0.00    0.01    0.09    0.01    0.03    0.01    0.08
16    65536    0.18    0.02    0.02    0.19    0.03    0.07    0.02    0.20
17   131072    0.43    0.05    0.04    0.46    0.05    0.18    0.05    0.48
18   262144    0.99    0.09    0.10    1.04    0.13    0.40    0.09    1.11
19   524288    2.23    0.19    0.21    2.32    0.24    0.83    0.20    2.46
20  1048576    4.96    0.40    0.40    5.41    0.47    1.72    0.40    5.46

samplesort again (run twice by mistake)
 i     2**i   *sort   \sort   /sort   3sort   +sort   ~sort   =sort   !sort
15    32768    0.08    0.01    0.01    0.09    0.01    0.03    0.00    0.09
16    65536    0.20    0.02    0.01    0.20    0.03    0.07    0.02    0.20
17   131072    0.46    0.06    0.02    0.45    0.05    0.20    0.04    0.49
18   262144    0.99    0.09    0.10    1.09    0.11    0.40    0.12    1.12
19   524288    2.33    0.20    0.20    2.30    0.24    0.83    0.19    2.47
20  1048576    4.89    0.40    0.41    5.37    0.48    1.71    0.38    6.22

timsort
 i     2**i   *sort   \sort   /sort   3sort   +sort   ~sort   =sort   !sort
15    32768    0.08    0.01    0.01    0.01    0.01    0.03    0.00    0.02
16    65536    0.17    0.02    0.02    0.02    0.02    0.07    0.02    0.06
17   131072    0.41    0.05    0.04    0.05    0.04    0.16    0.04    0.09
18   262144    0.95    0.10    0.10    0.10    0.10    0.33    0.10    0.20
19   524288    2.17    0.20    0.21    0.20    0.21    0.66    0.20    0.44
20  1048576    4.85    0.42    0.40    0.41    0.41    1.37    0.41    0.84

----------------------------------------------------------------------

Comment By: Kevin Jacobs (jacobs99)
Date: 2002-07-26 12:54

Message:
Logged In: YES
user_id=459565

Intel 1266 MHz Penguin III x2 (Dual processor)
512KB cache
Linux 2.4.19-pre1-ac2
gcc 3.1 20020205

samplesort:
 i     2**i   *sort   \sort   /sort   3sort   +sort   ~sort   =sort   !sort
15    32768    0.07    0.00    0.01    0.06    0.01    0.02    0.00    0.07
16    65536    0.16    0.02    0.01    0.15    0.01    0.06    0.02    0.17
17   131072    0.37    0.04    0.04    0.35    0.04    0.15    0.03    0.38
18   262144    0.84    0.07    0.08    0.80    0.09    0.31    0.07    0.86
19   524288    1.89    0.16    0.15    1.78    0.19    0.66    0.15    1.92
20  1048576    4.12    0.33    0.31    4.07    0.37    1.34    0.31    4.22

timsort:
 i     2**i   *sort   \sort   /sort   3sort   +sort   ~sort   =sort   !sort
15    32768    0.07    0.01    0.00    0.01    0.01    0.03    0.01    0.01
16    65536    0.17    0.01    0.02    0.01    0.02    0.06    0.02    0.04
17   131072    0.37    0.04    0.03    0.04    0.04    0.13    0.04    0.08
18   262144    0.84    0.07    0.07    0.08    0.08    0.27    0.07    0.16
19   524288    1.89    0.16    0.15    0.15    0.17    0.55    0.15    0.33
20  1048576    4.16    0.32    0.31    0.31    0.32    1.14    0.31    0.66

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2002-07-26 12:30

Message:
Logged In: YES
user_id=31435

Wow! Thanks, Neil! That's impressive, even if I say so myself.

----------------------------------------------------------------------

Comment By: Neil Schemenauer (nascheme)
Date: 2002-07-26 12:23

Message:
Logged In: YES
user_id=35752

AMD 1.4 Ghz Athlon CPU
L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
L2 Cache: 256K (64 bytes/line)
Linux 2.4.19-pre10-ac1
gcc 2.95.4

samplesort:
 i     2**i   *sort   \sort   /sort   3sort   +sort   ~sort   =sort   !sort
15    32768    0.06    0.01    0.01    0.07    0.01    0.03    0.01    0.07
16    65536    0.16    0.02    0.02    0.15    0.02    0.07    0.02    0.17
17   131072    0.37    0.03    0.03    0.39    0.04    0.16    0.04    0.41
18   262144    0.84    0.07    0.08    0.87    0.10    0.34    0.07    0.93
19   524288    1.89    0.16    0.16    1.97    0.21    0.70    0.16    2.08
20  1048576    4.20    0.33    0.34    4.55    0.41    1.45    0.34    4.61

timsort:
 i     2**i   *sort   \sort   /sort   3sort   +sort   ~sort   =sort   !sort
15    32768    0.06    0.00    0.01    0.01    0.01    0.03    0.00    0.01
16    65536    0.14    0.02    0.02    0.02    0.02    0.06    0.02    0.04
17   131072    0.35    0.04    0.04    0.04    0.04    0.12    0.04    0.08
18   262144    0.79    0.08    0.08    0.09    0.09    0.27    0.09    0.16
19   524288    1.79    0.17    0.17    0.18    0.17    0.54    0.17    0.33
20  1048576    3.96    0.35    0.34    0.34    0.36    1.12    0.34    0.70

----------------------------------------------------------------------
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=587076&group_id=5470

From noreply@sourceforge.net Thu Aug 1 15:40:03 2002
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Thu, 01 Aug 2002 07:40:03 -0700
Subject: [Patches] [ python-Patches-588728 ]
 __delete__ method-wraper
Message-ID:

Patches item #588728, was opened at 2002-07-30 15:33
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=588728&group_id=5470

Category: Core (C code)
Group: None
>Status: Closed
>Resolution: Accepted
Priority: 5
Submitted By: Nathan Srebro (nati)
Assigned to: Guido van Rossum (gvanrossum)
Summary: __delete__ method-wraper

Initial Comment:
In Python 2.2.1 (as well as in the current CVS), one cannot access the __delete__ method of built-in descriptors. This is particularly a problem when trying to cooperatively subclass a built-in descriptor.

Also, defining a __delete__ method for a class (which is a subclass of 'object') does not have any effect unless __set__ is also defined (it does only for old-style classes).

This patch adds a method-wrapper for delete. This solves the above two issues: property().__delete__ is now properly defined, and defining a __delete__ method now works even if __set__ is not defined:

>>> class C(object):
...     def delx(self):
...         print "deled"
...     x = property(None,None,delx)
...
>>> a=C()
>>> C.__dict__['x'].__delete__(a)
deled
>>>
>>>
>>> class prop(object):
...     def __delete__(self,obj):
...         print "deled"
...
>>> class D(object):
...     x = prop()
...
>>> a = D()
>>> del a.x
deled
>>>

----------------------------------------------------------------------

>Comment By: Guido van Rossum (gvanrossum)
Date: 2002-08-01 10:40

Message:
Logged In: YES
user_id=6380

Thanks! Good catch. Applied to CVS, will backport to 2.2.2.

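
For readers on a current interpreter, the two behaviours described in this item can be checked with a short script. This is a sketch in Python 3 syntax rather than the 2.2-era session above; the names Prop, C, D and the log list are illustrative and do not come from the patch.

```python
log = []

class C:
    def delx(self):
        log.append("property deleter")
    # A property with only a deleter, as in the original report.
    x = property(None, None, delx)

class Prop:
    """A descriptor that defines only __delete__ (no __set__)."""
    def __delete__(self, obj):
        log.append("descriptor __delete__")

class D:
    x = Prop()

# 1. The built-in property exposes __delete__, and it calls the deleter.
C.__dict__["x"].__delete__(C())

# 2. `del obj.attr` dispatches to __delete__ even though __set__ is absent.
d = D()
del d.x

print(log)
```

Fetching the property through C.__dict__ rather than C.x mirrors the original session: it gets hold of the descriptor object itself instead of invoking it.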
----------------------------------------------------------------------
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=588728&group_id=5470

From noreply@sourceforge.net Thu Aug 1 16:38:37 2002
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Thu, 01 Aug 2002 08:38:37 -0700
Subject: [Patches] [ python-Patches-587993 ] alternative SET_LINENO killer
Message-ID:

Patches item #587993, was opened at 2002-07-29 11:27
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=587993&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Michael Hudson (mwh)
Assigned to: Guido van Rossum (gvanrossum)
Summary: alternative SET_LINENO killer

Initial Comment:
This patch is a proof-of-concept of another way to remove the SET_LINENO patch (as opposed to Vladimir's ancient one).

Instead of rewriting bytecode (ick!) we poke into the c_lnotab to see if we've moved onto a different line. The c_lnotab is not the most transparent of data structures, it has to be said.

I'm not sure this patch is 100% correct -- but I think the idea can definitely fly. There will be some more overhead to tracing than before, but I hope not too much. I haven't tested these aspects.

Comments welcome!

----------------------------------------------------------------------

>Comment By: Michael Hudson (mwh)
Date: 2002-08-01 15:38

Message:
Logged In: YES
user_id=6656

Yeah, the obvious response to that was "delete your .pycs"...

Right, I'm going to upload my latest, and hopefully final patch in three bits. First, I'm attaching the "interesting" patch. This touches:

Objects/frameobject.c -- adds f_lineno descr
Python/ceval.c -- the obvious
Python/compile.c -- don't emit SET_LINENO
Lib/dis.py -- use c_lnotab for line numbers
Tools/scripts/trace.py -- ditto

This is the most subtle patch, and the one that I'd most like review on.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2002-07-30 22:09

Message:
Logged In: YES
user_id=6380

Never mind the errors, I hadn't done a cvs update in weeks on the machine where I tested it. :-(

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2002-07-30 21:54

Message:
Logged In: YES
user_id=6380

Cool idea, but I get "unknown opcode" errors... Keep me posted though! (I will now see any changes to this patch item.)

----------------------------------------------------------------------

Comment By: Michael Hudson (mwh)
Date: 2002-07-30 09:59

Message:
Logged In: YES
user_id=6656

Guido should see this, assuming he still isn't subscribed to patches@python.org.

----------------------------------------------------------------------

Comment By: Michael Hudson (mwh)
Date: 2002-07-30 09:58

Message:
Logged In: YES
user_id=6656

I worked out why some of the code in ceval.c wasn't making sense to me -- it didn't make sense, period. I've also fixed a number of silly and not so silly bugs in my patch.

I'm now 99% certain this idea can fly. The patch isn't *finished* but the hard bit is done, IMHO.

There are some other points to make, but I think I'll raise them on python-dev.

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2002-07-29 20:18

Message:
Logged In: YES
user_id=31435

Dropping out of "vacation mode" long enough to say "mondo cool!" and encourage this. Guido may not agree, but I also encourage you to redefine c_lnotab if it can make life easier and quicker here. That subtle compression scheme has been the source of several nasty bugs, both in the core C code and in Jeremy's compiler pkg (cut 'n paste bugs abound here, because few people understand what's really needed, so flawed code gets copied with little thought).

----------------------------------------------------------------------

Comment By: Michael Hudson (mwh)
Date: 2002-07-29 15:34

Message:
Logged In: YES
user_id=6656

Uhh, the instr_[lu]b variables need to keep their values around the loop; otherwise we might just as well call PyCode_Addr2Line each time around.

I have another version of the patch that does that, but I assumed the overhead of doing so was deemed too high, or someone else would have done this by now. It's certainly easier.

Glad to hear it doesn't affect python -O too much. I was doing this away from the internet and forgot to keep a clean copy of the source around for doing comparisons with...

----------------------------------------------------------------------

Comment By: Neil Schemenauer (nascheme)
Date: 2002-07-29 15:18

Message:
Logged In: YES
user_id=35752

Moving the "int io, instr_ub = -1, instr_lb = 0;" declaration and the "io = INSTR_OFFSET();" statement below the "if (tstate->c_tracefunc ...)" test gives a small speedup on my machine and is a little neater, IMHO.

I was worried that this would slow down "python -O". That doesn't seem to be the case (at least I can't measure it). Well done.

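
The idea discussed in this thread (find the current line from the line-number table, and remember the range of instruction offsets that belong to it so the table need not be rescanned on every instruction) can be sketched in Python. Everything here is illustrative rather than taken from the patch: it assumes the classic table layout of unsigned (offset increment, line increment) byte pairs, and line_and_bounds, LineTracker and the sample table are hypothetical. Only the initial lb = 0, ub = -1 values echo the declaration quoted above.

```python
def line_and_bounds(first_lineno, lnotab, addr):
    """Return (line, lb, ub): every offset in lb <= a < ub maps to line."""
    line, lb, cur = first_lineno, 0, 0
    for i in range(0, len(lnotab), 2):
        cur += lnotab[i]          # offset at which the next entry applies
        if cur > addr:
            return line, lb, cur
        line += lnotab[i + 1]
        lb = cur
    return line, lb, float("inf")  # last line runs to the end of the code


class LineTracker:
    """Caches the bounds so the table is scanned only on a line change."""

    def __init__(self, first_lineno, lnotab):
        self.first_lineno, self.lnotab = first_lineno, lnotab
        self.line, self.lb, self.ub = first_lineno, 0, -1  # empty range
        self.scans = 0

    def line_for(self, addr):
        if not (self.lb <= addr < self.ub):
            self.line, self.lb, self.ub = line_and_bounds(
                self.first_lineno, self.lnotab, addr)
            self.scans += 1
        return self.line


# Hypothetical table: offsets 0-5 on line 10, 6-13 on line 11, 14+ on line 13.
table = bytes([6, 1, 8, 2])
tracker = LineTracker(10, table)
lines = [tracker.line_for(a) for a in (0, 3, 6, 9, 14, 20)]
print(lines, tracker.scans)
```

Because the range check also fails when the offset drops below lb, a backward jump simply triggers a fresh scan; that is the reason the bounds have to survive from one iteration of the interpreter loop to the next.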
----------------------------------------------------------------------
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=587993&group_id=5470

From noreply@sourceforge.net Thu Aug 1 16:41:16 2002
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Thu, 01 Aug 2002 08:41:16 -0700
Subject: [Patches] [ python-Patches-587993 ] alternative SET_LINENO killer
Message-ID:

Patches item #587993, was opened at 2002-07-29 11:27
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=587993&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Michael Hudson (mwh)
Assigned to: Guido van Rossum (gvanrossum)
Summary: alternative SET_LINENO killer

Initial Comment:
This patch is a proof-of-concept of another way to remove the SET_LINENO patch (as opposed to Vladimir's ancient one).

Instead of rewriting bytecode (ick!) we poke into the c_lnotab to see if we've moved onto a different line. The c_lnotab is not the most transparent of data structures, it has to be said.

I'm not sure this patch is 100% correct -- but I think the idea can definitely fly. There will be some more overhead to tracing than before, but I hope not too much. I haven't tested these aspects.

Comments welcome!

----------------------------------------------------------------------

>Comment By: Michael Hudson (mwh)
Date: 2002-08-01 15:41

Message:
Logged In: YES
user_id=6656

And finally, docs. This touches:

Doc/lib/libdis.tex
Doc/tut/tut.tex

----------------------------------------------------------------------

Comment By: Michael Hudson (mwh)
Date: 2002-08-01 15:40

Message:
Logged In: YES
user_id=6656

Here's the "boring" patch, more mechanical stuff. It touches:

Include/opcode.h
Lib/traceback.py
Modules/_hotshot.c
Python/frozen.c
Python/traceback.c

This should still be reviewed, of course, but I really shouldn't have messed any of this up.

---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-01 15:38 Message: Logged In: YES user_id=6656 Yeah, the obvious response to that was "delete your .pycs"... Right, I'm going to upload my latest, and hopefully final patch in three bits. First, I'm attaching the "interesting" patch. This touches: Objects/frameobject.c -- adds f_lineno descr Python/ceval.c -- the obvious Python/compile.c -- don't emit SET_LINENO Lib/dis.py -- use c_lnotab for line numbers Tools/scripts/trace.py -- ditto This is the most subtle patch, and the one that I'd most like review on. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-07-30 22:09 Message: Logged In: YES user_id=6380 Never mind the errors, I hadn't done a cvs update in weeks on the machine where I tested it. :-( ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-07-30 21:54 Message: Logged In: YES user_id=6380 Cool idea, but I get "unknown opcode" errors... Keep me posted though! (I will now see any changes to this patch item.) ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-07-30 09:59 Message: Logged In: YES user_id=6656 Guido should see this, assuming he still isn't subscribed to patches@python.org. ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-07-30 09:58 Message: Logged In: YES user_id=6656 I worked out why some of the code in ceval.c wasn't making sense to me -- it didn't make sense, period. I've also fixed a number of silly and not so silly bugs in my patch. I'm now 99% certain this idea can fly. The patch isn't *finished* but the hard bit is done, IMHO. There are some other points to make, but I think I'll raise them on python-dev. 
---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-07-29 20:18 Message: Logged In: YES user_id=31435 Dropping out of "vacation mode" long enough to say "mondo cool!" and encourage this. Guido may not agree, but I also encourage you to redefine c_lnotab if it can make life easier and quicker here. That subtle compression scheme has been the source of several nasty bugs, both in the core C code and in Jeremy's compiler pkg (cut 'n paste bugs abound here, because few people understand what's really needed, so flawed code gets copied with little thought). ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-07-29 15:34 Message: Logged In: YES user_id=6656 Uhh, the instr_[lu]b variables need to keep their values around the loop; otherwise we might just as well call PyCode_Addr2Line each time around. I have another version of the patch that does that, but I assumed the overhead of doing so was deemed too high, or someone else would have done this by now. It's certainly easier. Glad to hear it doesn't affect python -O too much. I was doing this away from the internet and forgot to keep a clean copy of the source around for doing comparisons with... ---------------------------------------------------------------------- Comment By: Neil Schemenauer (nascheme) Date: 2002-07-29 15:18 Message: Logged In: YES user_id=35752 Moving the "int io, instr_ub = -1, instr_lb = 0;" declaration and the "io = INSTR_OFFSET();" statement below the "if (tstate->c_tracefunc ...)" test gives a small speedup on my machine and is a little neater, IMHO. I was worried that this would slow down "python -O". That doesn't seem to be the case (at least I can't measure it). Well done. 
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=587993&group_id=5470 From noreply@sourceforge.net Thu Aug 1 16:40:26 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 01 Aug 2002 08:40:26 -0700 Subject: [Patches] [ python-Patches-587993 ] alternative SET_LINENO killer Message-ID: Patches item #587993, was opened at 2002-07-29 11:27 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=587993&group_id=5470 Category: Core (C code) Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Michael Hudson (mwh) Assigned to: Guido van Rossum (gvanrossum) Summary: alternative SET_LINENO killer Initial Comment: This patch is a proof-of-concept of another way to remove the SET_LINENO patch (as opposed to Vladimir's ancient one). Instead of rewriting bytecode (ick!) we poke into the c_lnotab to see if we've moved onto a different line. The c_lnotab is not the most transparent of data structures, it has to be said. I'm not sure this patch is 100% correct -- but I think the idea can definitely fly. There will be some more overhead to tracing than before, but I hope not too much. I haven't tested these aspects. Comments welcome! ---------------------------------------------------------------------- >Comment By: Michael Hudson (mwh) Date: 2002-08-01 15:40 Message: Logged In: YES user_id=6656 Here's the "boring" patch, more mechanical stuff. It touches: Include/opcode.h Lib/traceback.py Modules/_hotshot.c Python/frozen.c Python/traceback.c This should still be reviewed, of course, but I really shouldn't have messed any of this up. ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-01 15:38 Message: Logged In: YES user_id=6656 Yeah, the obvious response to that was "delete your .pycs"... 
Right, I'm going to upload my latest, and hopefully final patch in three bits. First, I'm attaching the "interesting" patch. This touches: Objects/frameobject.c -- adds f_lineno descr Python/ceval.c -- the obvious Python/compile.c -- don't emit SET_LINENO Lib/dis.py -- use c_lnotab for line numbers Tools/scripts/trace.py -- ditto This is the most subtle patch, and the one that I'd most like review on. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-07-30 22:09 Message: Logged In: YES user_id=6380 Never mind the errors, I hadn't done a cvs update in weeks on the machine where I tested it. :-( ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-07-30 21:54 Message: Logged In: YES user_id=6380 Cool idea, but I get "unknown opcode" errors... Keep me posted though! (I will now see any changes to this patch item.) ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-07-30 09:59 Message: Logged In: YES user_id=6656 Guido should see this, assuming he still isn't subscribed to patches@python.org. ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-07-30 09:58 Message: Logged In: YES user_id=6656 I worked out why some of the code in ceval.c wasn't making sense to me -- it didn't make sense, period. I've also fixed a number of silly and not so silly bugs in my patch. I'm now 99% certain this idea can fly. The patch isn't *finished* but the hard bit is done, IMHO. There are some other points to make, but I think I'll raise them on python-dev. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-07-29 20:18 Message: Logged In: YES user_id=31435 Dropping out of "vacation mode" long enough to say "mondo cool!" and encourage this. 
Guido may not agree, but I also encourage you to redefine c_lnotab if it can make life easier and quicker here. That subtle compression scheme has been the source of several nasty bugs, both in the core C code and in Jeremy's compiler pkg (cut 'n paste bugs abound here, because few people understand what's really needed, so flawed code gets copied with little thought). ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-07-29 15:34 Message: Logged In: YES user_id=6656 Uhh, the instr_[lu]b variables need to keep their values around the loop; otherwise we might just as well call PyCode_Addr2Line each time around. I have another version of the patch that does that, but I assumed the overhead of doing so was deemed too high, or someone else would have done this by now. It's certainly easier. Glad to hear it doesn't affect python -O too much. I was doing this away from the internet and forgot to keep a clean copy of the source around for doing comparisons with... ---------------------------------------------------------------------- Comment By: Neil Schemenauer (nascheme) Date: 2002-07-29 15:18 Message: Logged In: YES user_id=35752 Moving the "int io, instr_ub = -1, instr_lb = 0;" declaration and the "io = INSTR_OFFSET();" statement below the "if (tstate->c_tracefunc ...)" test gives a small speedup on my machine and is a little neater, IMHO. I was worried that this would slow down "python -O". That doesn't seem to be the case (at least I can't measure it). Well done. 
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=587993&group_id=5470 From noreply@sourceforge.net Thu Aug 1 16:43:43 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 01 Aug 2002 08:43:43 -0700 Subject: [Patches] [ python-Patches-587993 ] alternative SET_LINENO killer Message-ID: Patches item #587993, was opened at 2002-07-29 11:27 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=587993&group_id=5470 Category: Core (C code) Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Michael Hudson (mwh) Assigned to: Guido van Rossum (gvanrossum) Summary: alternative SET_LINENO killer Initial Comment: This patch is a proof-of-concept of another way to remove the SET_LINENO patch (as opposed to Vladimir's ancient one). Instead of rewriting bytecode (ick!) we poke into the c_lnotab to see if we've moved onto a different line. The c_lnotab is not the most transparent of data structures, it has to be said. I'm not sure this patch is 100% correct -- but I think the idea can definitely fly. There will be some more overhead to tracing than before, but I hope not too much. I haven't tested these aspects. Comments welcome! ---------------------------------------------------------------------- >Comment By: Michael Hudson (mwh) Date: 2002-08-01 15:43 Message: Logged In: YES user_id=6656 Hang on, let's get all the doc changes: Doc/lib/libdis.tex Doc/tut/tut.tex Doc/whatsnew/whatsnew23.tex Misc/NEWS ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-01 15:41 Message: Logged In: YES user_id=6656 And finally, docs. 
This touches: Doc/lib/libdis.tex Doc/tut/tut.tex ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-01 15:40 Message: Logged In: YES user_id=6656 Here's the "boring" patch, more mechanical stuff. It touches: Include/opcode.h Lib/traceback.py Modules/_hotshot.c Python/frozen.c Python/traceback.c This should still be reviewed, of course, but I really shouldn't have messed any of this up. ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-01 15:38 Message: Logged In: YES user_id=6656 Yeah, the obvious response to that was "delete your .pycs"... Right, I'm going to upload my latest, and hopefully final patch in three bits. First, I'm attaching the "interesting" patch. This touches: Objects/frameobject.c -- adds f_lineno descr Python/ceval.c -- the obvious Python/compile.c -- don't emit SET_LINENO Lib/dis.py -- use c_lnotab for line numbers Tools/scripts/trace.py -- ditto This is the most subtle patch, and the one that I'd most like review on. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-07-30 22:09 Message: Logged In: YES user_id=6380 Never mind the errors, I hadn't done a cvs update in weeks on the machine where I tested it. :-( ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-07-30 21:54 Message: Logged In: YES user_id=6380 Cool idea, but I get "unknown opcode" errors... Keep me posted though! (I will now see any changes to this patch item.) ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-07-30 09:59 Message: Logged In: YES user_id=6656 Guido should see this, assuming he still isn't subscribed to patches@python.org. 
---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-07-30 09:58 Message: Logged In: YES user_id=6656 I worked out why some of the code in ceval.c wasn't making sense to me -- it didn't make sense, period. I've also fixed a number of silly and not so silly bugs in my patch. I'm now 99% certain this idea can fly. The patch isn't *finished* but the hard bit is done, IMHO. There are some other points to make, but I think I'll raise them on python-dev. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-07-29 20:18 Message: Logged In: YES user_id=31435 Dropping out of "vacation mode" long enough to say "mondo cool!" and encourage this. Guido may not agree, but I also encourage you to redefine c_lnotab if it can make life easier and quicker here. That subtle compression scheme has been the source of several nasty bugs, both in the core C code and in Jeremy's compiler pkg (cut 'n paste bugs abound here, because few people understand what's really needed, so flawed code gets copied with little thought). ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-07-29 15:34 Message: Logged In: YES user_id=6656 Uhh, the instr_[lu]b variables need to keep their values around the loop; otherwise we might just as well call PyCode_Addr2Line each time around. I have another version of the patch that does that, but I assumed the overhead of doing so was deemed too high, or someone else would have done this by now. It's certainly easier. Glad to hear it doesn't affect python -O too much. I was doing this away from the internet and forgot to keep a clean copy of the source around for doing comparisons with... 
---------------------------------------------------------------------- Comment By: Neil Schemenauer (nascheme) Date: 2002-07-29 15:18 Message: Logged In: YES user_id=35752 Moving the "int io, instr_ub = -1, instr_lb = 0;" declaration and the "io = INSTR_OFFSET();" statement below the "if (tstate->c_tracefunc ...)" test gives a small speedup on my machine and is a little neater, IMHO. I was worried that this would slow down "python -O". That doesn't seem to be the case (at least I can't measure it). Well done. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=587993&group_id=5470 From noreply@sourceforge.net Thu Aug 1 17:59:52 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 01 Aug 2002 09:59:52 -0700 Subject: [Patches] [ python-Patches-578297 ] fix for problems with test_longexp Message-ID: Patches item #578297, was opened at 2002-07-07 06:21 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=578297&group_id=5470 Category: Parser/Compiler Group: Python 2.3 Status: Open >Resolution: Accepted Priority: 5 Submitted By: Andrew I MacIntyre (aimacintyre) >Assigned to: Andrew I MacIntyre (aimacintyre) Summary: fix for problems with test_longexp Initial Comment: The OS/2 EMX port has long had problems with test_longexp, which triggers gross memory consumption on this platform as a result of platform malloc behaviour. More recently, this appears to have been identified in MacPython under certain circumstances, although the problem is apparently more a speed issue than a memory consumption issue. The core of the problem is the blizzard of small mallocs as the parser builds the parse tree and creates tokens. The attached patch takes advantage of PyMalloc (built in by default for 2.3) to insulate the parser from adverse behaviour in the platform malloc. 
The patch has been tested on OS/2 and FreeBSD: - on OS/2, the patch allows even a system with modest resources to complete test_longexp successfully and without swapping to death; on better resourced machines, the whole regression test is negligibly slower (0-1%) to complete. [gcc-2.8.1 -O2] - on FreeBSD (4.4 tested), test_longexp gains nearly 10%, and completes the whole regression test with a gain of about 2% (test_longexp is good for about 25% of the improvement). [gcc-2.95.3 -O3] Both platforms are neutral, performance wise, running MAL's PyBench 1.0. The patch in its current form is for experimental evaluation, and not intended for integration into the core. If there is interest in seeing this integrated, I'd like feedback on a more elegant way to implement the functional change. I've assigned this to Jack for review in the context of its performance on the Mac. ---------------------------------------------------------------------- >Comment By: Jeremy Hylton (jhylton) Date: 2002-08-01 16:59 Message: Logged In: YES user_id=31392 Looks good to me for 2.3 and 2.2. I see a 40% speedup on compilation of long source strings. ---------------------------------------------------------------------- Comment By: Andrew I MacIntyre (aimacintyre) Date: 2002-07-30 02:21 Message: Logged In: YES user_id=250749 Yes, test_longexp_fix.diff is no longer part of the patch set. Should I delete it? I must have missed your 2.2 backport commit message. I might also look at whether it can be backported to 2.1 without significant side effects. Thanks for your feedback too. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-07-29 19:52 Message: Logged In: YES user_id=31435 Reassigned to Jeremy because I'm "on vacation" this week, and Jeremy is most familiar w/ the parser code. Offhand the patches looked fine to me, provided that you no longer consider test_longexp_fix.diff to be part of the patch set. 
I backported the XXXROUNDUP changes to the 2.2 maintenance branch at the same time I changed it in the HEAD, so nothing left to do there on that count. Thanks for the great work! ---------------------------------------------------------------------- Comment By: Andrew I MacIntyre (aimacintyre) Date: 2002-07-29 11:47 Message: Logged In: YES user_id=250749 Tim, 1. any objections to the "final" patches? 2. do you see any reason not to backport your XXXROUNDUP change - it qualifies as a performance/behaviour bugfix IMO. ---------------------------------------------------------------------- Comment By: Andrew I MacIntyre (aimacintyre) Date: 2002-07-21 13:16 Message: Logged In: YES user_id=250749 Ok, I've prepared patches to convert the following files to use PyMalloc for memory allocation: Parser/[acceler.c|node.c|parsetok.c] (pymalloc-parser.diff) Python/compile.c (pymalloc-compile.diff) I didn't bother with the other files in Parser/ as my malloc logging shows that they only ever appear to make requests > 256 bytes. I have attached/will attach a summary from my malloc logging experiments for information. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-07-15 18:14 Message: Logged In: YES user_id=31435 Thanks for the detailed followup, Andrew! I incorporated some of this info into XXXROUNDUP's comments. Without either patch, the system malloc has to do two miserable things: (1) find bigger and bigger memory areas very frequently; and, (2) interleaved with that, allocate gazillions of tiny blocks too. #2 makes it difficult for the platform malloc to find free space contiguous to the blocks allocated for #1, unless it arranges to move them to "the end" of memory, or into their own memory segments. As a result it's likely to do a copy on nearly every large-block realloc, and the code used to do a realloc on every 3rd new child. 
The XXXROUNDUP patch addressed #1 by asking to grow blocks much less frequently; PyMalloc addresses #2 by getting the tiny blocks out of the platform malloc's hair. If the platform malloc is saved from either one, its job becomes much easier. It would still be nice to switch the parser to using pymalloc. There are still disasters lurking, because some platform malloc packages appear to take quadratic time when *free*ing gazillions of tiny blocks (they thrash trying to coalesce them into larger contiguous free blocks). pymalloc doesn't try to coalesce free blocks, so is reliably immune to this disease. ---------------------------------------------------------------------- Comment By: Andrew I MacIntyre (aimacintyre) Date: 2002-07-15 11:47 Message: Logged In: YES user_id=250749 To my surprise, Tim's checkin also works for the EMX port. I can only conclude that EMX's realloc() has a corner case tickled by test_longexp, that isn't hit with either the aggressive overallocation change or the PyMalloc change applied. It is also interesting to note the performance impact of Tim's checkin, particularly on FreeBSD. Typical runtimes for "python -E -tt Lib/test/regrtest.py -l test_longexp" on my P5-166SMP test box (FreeBSD 4.4, gcc 2.95.3 -O3):

                          total    user     sys
baseline:                 39.1s   32.7s    6.3s
my patch:                 37.1s   30.3s    6.7s
Tim's checkin:             8.4s    7.8s    0.6s
my patch+Tim's checkin:    5.5s    4.9s    0.5s

These runs with Library modules already compiled. While Tim's comments about timing the regression test are noted, there are nonetheless consistent reductions in execution time of the regression test as well. Typical results on the same test box:

                          total    user     sys
baseline:                 1386s   1097s     89s
my patch:                 1350s   1065s     93s
Tim's checkin:            1265s   1003s     67s
my patch+Tim's checkin:   1230s    971s     65s

With the EMX port, the difference in timing between Tim's checkin and my patch is small, both for test_longexp and the regression test. 
There are noticeable gains for both test_longexp and the whole regression test with both changes in place, although not as significant as the FreeBSD results. MAL's PyBench 1.0 exhibits negligible performance differences between the code states on both platforms, which is as I'd expect as it doesn't appear to test compile() or eval(). From the above, I conclude that Tim's patch gets the most bang for the buck, and that my patch (or its intent) be rejected unless someone thinks pursuing the PyMalloc changes to the parser worthwhile. As an aside, I did a little research on the "XXX are those actually common?" question Tim posed in the comment associated with his change: In running Lib/compileall.py against the Lib directory, 89% of PyMem_RESIZE() calls in AddChild() are the n=1 case, and 9% are rounded up to n=4. ---------------------------------------------------------------------- Comment By: Jack Jansen (jackjansen) Date: 2002-07-08 10:09 Message: Logged In: YES user_id=45365 With Tim's mods test_import and test_longexp now work fine in MacPython. This is both with and without Andrew's patch. Andrew, I'm assigning back to you, there's little more I can do with this patch. And you'll have to check if you still need it, or whether Tim's change to node.c is good enough for OS/2 as well. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-07-08 06:38 Message: Logged In: YES user_id=31435 Jack, please do a cvs update and try this again. I checked in changes to PyNode_AddChild() that I expect will cure your particular woes here. Andrew, PyMalloc was designed for oodles of small allocations. Feel encouraged to write a patch to change the compiler to use PyObject_{Malloc, Realloc, Free} instead. Then it will automatically exploit PyMalloc when the latter is enabled. Note that the regression test suite incorporates random numbers in several tests, and in ways that can affect runtime. 
Small differences in aggregate test suite runtime are meaningless because of this. ---------------------------------------------------------------------- Comment By: Jack Jansen (jackjansen) Date: 2002-07-07 21:24 Message: Logged In: YES user_id=45365 Unfortunately on the Mac it doesn't help anything for the test_longexp problem, nor for the similar test_import problem. The problem with MacPython's malloc seems to be that large reallocs cause the slowdown. And the addchild() calls will continually realloc a block of memory to a slightly larger size (I gave up when it was about 800KB, after a minute or two, and growing at tens of KB per second). As soon as the block is larger than SMALL_REQUEST_THRESHOLD pymalloc will simply call the underlying system malloc/realloc. ---------------------------------------------------------------------- Comment By: Andrew I MacIntyre (aimacintyre) Date: 2002-07-07 06:41 Message: Logged In: YES user_id=250749 Oops. On FreeBSD, test_longexp contributes 15% of the performance gain (not 25%) observed for the regression test with the patch applied. Also, I would expect to make this a platform specific change if it's integrated, rather than a general change (unless that is seen as more appropriate). 
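The difference Tim describes in this thread, between reallocating a node's child array on every third new child and asking to grow it much less often, can be made concrete by counting how often the requested capacity changes. This is a minimal sketch in Python, not the C code from Parser/node.c. `round_to_multiple_of_3` models the old behaviour as described above; `round_geometric` is an assumed policy (multiples of 4 up to 128, then powers of two) picked only because it agrees with the n=1 and n=4 figures Andrew reports, and the real XXXROUNDUP may differ in detail. All function names here are hypothetical.

```python
def round_to_multiple_of_3(n):
    """Old behaviour as described in the thread: capacity grows every 3rd child."""
    return (n + 2) // 3 * 3


def round_geometric(n):
    """ASSUMED policy for illustration: 1 stays 1, small sizes round up to a
    multiple of 4, sizes above 128 round up to a power of two."""
    if n <= 1:
        return n
    if n <= 128:
        return (n + 3) & ~3
    size = 256
    while size < n:
        size <<= 1
    return size


def count_resizes(n_children, roundup):
    """Count how many times the requested capacity changes (i.e. how many
    realloc calls would be issued) while adding children one at a time."""
    capacity = 0
    resizes = 0
    for n in range(1, n_children + 1):
        needed = roundup(n)
        if needed != capacity:
            capacity = needed
            resizes += 1
    return resizes


if __name__ == "__main__":
    for policy in (round_to_multiple_of_3, round_geometric):
        print(policy.__name__, count_resizes(1000, policy))
```

For a node that ends up with 1000 children, the first policy asks for a new size hundreds of times while the second asks a few dozen times, which is the sense in which the platform realloc is "saved" from problem #1.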
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=578297&group_id=5470 From noreply@sourceforge.net Fri Aug 2 07:38:20 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 01 Aug 2002 23:38:20 -0700 Subject: [Patches] [ python-Patches-589982 ] tempfile.py rewrite Message-ID: Patches item #589982, was opened at 2002-08-01 23:38 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=589982&group_id=5470 Category: Modules Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Zack Weinberg (zackw) Assigned to: Nobody/Anonymous (nobody) Summary: tempfile.py rewrite Initial Comment: This rewrite closes a number of security-relevant races in tempfile.py; makes temporary filenames much harder to guess; provides secure interfaces that can be used to close similar races elsewhere; and makes it possible to control the prefix and directory of each temporary created, individually. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=589982&group_id=5470 From noreply@sourceforge.net Fri Aug 2 14:26:45 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Fri, 02 Aug 2002 06:26:45 -0700 Subject: [Patches] [ python-Patches-590119 ] types.BoolType Message-ID: Patches item #590119, was opened at 2002-08-02 15:26 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=590119&group_id=5470 Category: Library (Lib) Group: None Status: Open Resolution: None Priority: 5 Submitted By: Markus F.X.J. Oberhumer (mfx) Assigned to: Nobody/Anonymous (nobody) Summary: types.BoolType Initial Comment: I know that types is getting deprecated, but for orthogonality we really should have a BoolType. 
Also, IMHO we should _not_ have a BooleanType (or DictionaryType), but that might break code.

Index: types.py
===================================================================
RCS file: /cvsroot/python/python/dist/src/Lib/types.py,v
retrieving revision 1.29
diff -u -r1.29 types.py
--- types.py 14 Jun 2002 20:41:13 -0000 1.29
+++ types.py 2 Aug 2002 13:22:22 -0000
@@ -16,7 +16,7 @@
 IntType = int
 LongType = long
 FloatType = float
-BooleanType = bool
+BoolType = BooleanType = bool
 try:
     ComplexType = complex
 except NameError:

---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=590119&group_id=5470 From noreply@sourceforge.net Fri Aug 2 14:34:41 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Fri, 02 Aug 2002 06:34:41 -0700 Subject: [Patches] [ python-Patches-587993 ] alternative SET_LINENO killer Message-ID: Patches item #587993, was opened at 2002-07-29 11:27 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=587993&group_id=5470 Category: Core (C code) Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Michael Hudson (mwh) Assigned to: Guido van Rossum (gvanrossum) Summary: alternative SET_LINENO killer Initial Comment: This patch is a proof-of-concept of another way to remove the SET_LINENO patch (as opposed to Vladimir's ancient one). Instead of rewriting bytecode (ick!) we poke into the c_lnotab to see if we've moved onto a different line. The c_lnotab is not the most transparent of data structures, it has to be said. I'm not sure this patch is 100% correct -- but I think the idea can definitely fly. There will be some more overhead to tracing than before, but I hope not too much. I haven't tested these aspects. Comments welcome! 
---------------------------------------------------------------------- >Comment By: Michael Hudson (mwh) Date: 2002-08-02 13:34 Message: Logged In: YES user_id=6656 Here's an update (just for ceval.c): - moves tracing code out of line - more expository comments - don't cause line trace events in the function epilogue ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-01 15:43 Message: Logged In: YES user_id=6656 Hang on, let's get all the doc changes: Doc/lib/libdis.tex Doc/tut/tut.tex Doc/whatsnew/whatsnew23.tex Misc/NEWS ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-01 15:41 Message: Logged In: YES user_id=6656 And finally, docs. This touches: Doc/lib/libdis.tex Doc/tut/tut.tex ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-01 15:40 Message: Logged In: YES user_id=6656 Here's the "boring" patch, more mechanical stuff. It touches: Include/opcode.h Lib/traceback.py Modules/_hotshot.c Python/frozen.c Python/traceback.c This should still be reviewed, of course, but I really shouldn't have messed any of this up. ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-01 15:38 Message: Logged In: YES user_id=6656 Yeah, the obvious response to that was "delete your .pycs"... Right, I'm going to upload my latest, and hopefully final patch in three bits. First, I'm attaching the "interesting" patch. This touches: Objects/frameobject.c -- adds f_lineno descr Python/ceval.c -- the obvious Python/compile.c -- don't emit SET_LINENO Lib/dis.py -- use c_lnotab for line numbers Tools/scripts/trace.py -- ditto This is the most subtle patch, and the one that I'd most like review on. 
---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-07-30 22:09 Message: Logged In: YES user_id=6380 Never mind the errors, I hadn't done a cvs update in weeks on the machine where I tested it. :-( ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-07-30 21:54 Message: Logged In: YES user_id=6380 Cool idea, but I get "unknown opcode" errors... Keep me posted though! (I will now see any changes to this patch item.) ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-07-30 09:59 Message: Logged In: YES user_id=6656 Guido should see this, assuming he still isn't subscribed to patches@python.org. ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-07-30 09:58 Message: Logged In: YES user_id=6656 I worked out why some of the code in ceval.c wasn't making sense to me -- it didn't make sense, period. I've also fixed a number of silly and not so silly bugs in my patch. I'm now 99% certain this idea can fly. The patch isn't *finished* but the hard bit is done, IMHO. There are some other points to make, but I think I'll raise them on python-dev. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-07-29 20:18 Message: Logged In: YES user_id=31435 Dropping out of "vacation mode" long enough to say "mondo cool!" and encourage this. Guido may not agree, but I also encourage you to redefine c_lnotab if it can make life easier and quicker here. That subtle compression scheme has been the source of several nasty bugs, both in the core C code and in Jeremy's compiler pkg (cut 'n paste bugs abound here, because few people understand what's really needed, so flawed code gets copied with little thought). 
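For readers who have not met the table Tim is talking about, here is a minimal sketch of a lookup, written in Python rather than C. It assumes the layout used by CPython of this period as I understand it: c_lnotab is a string of byte pairs (bytecode-offset increment, line-number increment), both unsigned, applied starting from the code object's first line number. The function name `addr_to_line` and the sample table are made up for illustration and are not taken from the patch.

```python
def addr_to_line(lnotab, first_lineno, addr):
    """Return the source line containing bytecode offset `addr`.

    `lnotab` is assumed to be a bytes object of (addr_incr, line_incr) pairs.
    """
    lineno = first_lineno
    cur_addr = 0
    # Walk the table two bytes at a time, accumulating both increments.
    for i in range(0, len(lnotab), 2):
        cur_addr += lnotab[i]
        if cur_addr > addr:
            # This entry starts beyond `addr`, so the line found so far wins.
            break
        lineno += lnotab[i + 1]
    return lineno


if __name__ == "__main__":
    # Hand-made table: offsets 0..5 are on line 10, 6..49 on line 11,
    # and 50 onwards on line 16.
    table = bytes([6, 1, 44, 5])
    for offset in (0, 5, 6, 49, 50):
        print(offset, addr_to_line(table, 10, offset))
```

The subtlety comes from the increments being single bytes: a jump of more than 255 in either column has to be split over several pairs, which is the kind of detail that is easy to get wrong when the code is copied around.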
---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-07-29 15:34 Message: Logged In: YES user_id=6656 Uhh, the instr_[lu]b variables need to keep their values around the loop; otherwise we might just as well call PyCode_Addr2Line each time around. I have another version of the patch that does that, but I assumed the overhead of doing so was deemed too high, or someone else would have done this by now. It's certainly easier. Glad to hear it doesn't affect python -O too much. I was doing this away from the internet and forgot to keep a clean copy of the source around for doing comparisons with... ---------------------------------------------------------------------- Comment By: Neil Schemenauer (nascheme) Date: 2002-07-29 15:18 Message: Logged In: YES user_id=35752 Moving the "int io, instr_ub = -1, instr_lb = 0;" declaration and the "io = INSTR_OFFSET();" statement below the "if (tstate->c_tracefunc ...)" test gives a small speedup on my machine and is a little neater, IMHO. I was worried that this would slow down "python -O". That doesn't seem to be the case (at least I can't measure it). Well done.
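The instr_lb/instr_ub point above can be sketched in Python. This only illustrates the caching idea and is not the code in the patch: LineTracker, its attributes and the sample offsets are hypothetical, and the table is given as already-decoded lists of start offsets and line numbers rather than as c_lnotab bytes. The starting values (lower bound 0, upper bound -1) mirror the declaration quoted in the comment above.

```python
import bisect


class LineTracker:
    """Cache the offset range of the current line (hypothetical helper)."""

    def __init__(self, starts, lines):
        self.starts = starts      # sorted offsets at which a new line begins
        self.lines = lines        # line number for each entry in starts
        self.lb, self.ub = 0, -1  # empty range, so the first call must look up
        self.lookups = 0          # how many table searches were needed

    def update(self, offset):
        """Return the line number if offset left the cached range, else None."""
        if self.lb <= offset < self.ub:
            return None           # still on the same line: no search needed
        self.lookups += 1
        i = bisect.bisect_right(self.starts, offset) - 1
        self.lb = self.starts[i]
        self.ub = self.starts[i + 1] if i + 1 < len(self.starts) else float("inf")
        return self.lines[i]


tracker = LineTracker(starts=[0, 6, 50], lines=[1, 2, 7])
events = [tracker.update(off) for off in (0, 3, 6, 9, 12, 50, 53)]
events = [line for line in events if line is not None]
print(events, tracker.lookups)
```

Because the bounds survive from one instruction to the next, the table is searched once per line rather than once per instruction, which is the overhead being discussed.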
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=587993&group_id=5470 From noreply@sourceforge.net Fri Aug 2 15:45:22 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Fri, 02 Aug 2002 07:45:22 -0700 Subject: [Patches] [ python-Patches-589982 ] tempfile.py rewrite Message-ID: Patches item #589982, was opened at 2002-08-02 02:38 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=589982&group_id=5470 Category: Modules Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Zack Weinberg (zackw) Assigned to: Nobody/Anonymous (nobody) Summary: tempfile.py rewrite Initial Comment: This rewrite closes a number of security-relevant races in tempfile.py; makes temporary filenames much harder to guess; provides secure interfaces that can be used to close similar races elsewhere; and makes it possible to control the prefix and directory of each temporary created, individually. ---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-02 10:45 Message: Logged In: YES user_id=6380 This needs some serious review! Volunteers??? 
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=589982&group_id=5470 From noreply@sourceforge.net Fri Aug 2 17:58:50 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Fri, 02 Aug 2002 09:58:50 -0700 Subject: [Patches] [ python-Patches-506436 ] GETCONST/GETNAME/GETNAMEV speedup Message-ID: Patches item #506436, was opened at 2002-01-21 07:39 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=506436&group_id=5470 Category: Core (C code) Group: Python 2.3 Status: Open Resolution: Out of Date Priority: 5 Submitted By: Skip Montanaro (montanaro) >Assigned to: Tim Peters (tim_one) Summary: GETCONST/GETNAME/GETNAMEV speedup Initial Comment: The attached patch redefines the GETCONST, GETNAME & GETNAMEV macros to do the following: * access the code object's consts and names through local variables instead of the long chain from f * use access macros to index the tuples and get the C string names The code appears correct, and I've had no trouble with it. It only provides the most trivial of improvement on pystone (around 1% when I see anything), but it's all those little things that add up, right? Skip ---------------------------------------------------------------------- >Comment By: Skip Montanaro (montanaro) Date: 2002-08-02 11:58 Message: Logged In: YES user_id=44345 Here's an updated patch. Back to Tim... ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-07-23 17:04 Message: Logged In: YES user_id=31435 Marked Out-of-Date and back to Skip. Sorry for the delay! The idea is fine. I'd rather you use the current GETITEM macro, which does bounds-checking in a debug build. I note too that GETCONST is only used once, and that use may as well be a direct GETITEM(consts, i) invocation, and skip the macro. 
Note that the GETNAME() macro no longer exists. ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-07-22 22:34 Message: Logged In: YES user_id=33168 Skip, I modified this code some, but your technique is still valid. I got rid of one of the indirections already. The patch can easily be updated. Seems like the patch shouldn't hurt. Tim? ---------------------------------------------------------------------- Comment By: Skip Montanaro (montanaro) Date: 2002-07-09 18:45 Message: Logged In: YES user_id=44345 Looking for a vote up or down on this one... ---------------------------------------------------------------------- Comment By: Skip Montanaro (montanaro) Date: 2002-01-21 07:47 Message: Logged In: YES user_id=44345 Whoops... Make the "observed" speedup 0.1%... ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=506436&group_id=5470 From noreply@sourceforge.net Fri Aug 2 18:03:28 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Fri, 02 Aug 2002 10:03:28 -0700 Subject: [Patches] [ python-Patches-559288 ] Use builtin boolean if present Message-ID: Patches item #559288, was opened at 2002-05-22 12:34 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=559288&group_id=5470 Category: Library (Lib) Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Skip Montanaro (montanaro) Assigned to: Fredrik Lundh (effbot) Summary: Use builtin boolean if present Initial Comment: Now that Python has a boolean type, perhaps xmlrpclib should use it if available. Here's a patch that (I think) does what's necessary. The existing test case (which does manipulate a boolean) passes. Haven't tested it with a pre-bool version of Python though. 
Skip ---------------------------------------------------------------------- >Comment By: Skip Montanaro (montanaro) Date: 2002-08-02 12:03 Message: Logged In: YES user_id=44345 updated patch which avoids using types.BooleanType since there was so much pushback on the idea. ---------------------------------------------------------------------- Comment By: Skip Montanaro (montanaro) Date: 2002-05-22 12:46 Message: Logged In: YES user_id=44345 new patch - don't know why running the test suite didn't catch the NameError... ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=559288&group_id=5470 From noreply@sourceforge.net Fri Aug 2 18:45:33 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Fri, 02 Aug 2002 10:45:33 -0700 Subject: [Patches] [ python-Patches-569574 ] plain text enhancement for cgitb Message-ID: Patches item #569574, was opened at 2002-06-15 23:46 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=569574&group_id=5470 Category: Library (Lib) Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Skip Montanaro (montanaro) Assigned to: Ka-Ping Yee (ping) Summary: plain text enhancement for cgitb Initial Comment: Here's a patch to cgitb that allows you to enable plain text output. It adds an extra variable to the cgitb.enable function and corresponding underlying functions. To get plain text invoke it as import cgitb cgitb.enable(format="text") (actually, any value for format other than "html" will enable plain text output). The default value is "html", so existing usage of cgitb should be unaffected. I realize this isn't quite what you suggested, but it seemed to me worthwhile to keep such similar code together. I'm not entirely certain I haven't fouled up the html formatting. It needs to be checked still. Also still to come is a doc change. 
Skip ---------------------------------------------------------------------- >Comment By: Skip Montanaro (montanaro) Date: 2002-08-02 12:45 Message: Logged In: YES user_id=44345 I started messing around with this, but quickly figured out I didn't know what "main routine" belongs in traceback. Moving lookup and scanvars there is no problem, but the rest of the functionality is almost entirely in the html() and text() routines. Do you mean they should go in traceback? ---------------------------------------------------------------------- Comment By: Ka-Ping Yee (ping) Date: 2002-07-09 22:33 Message: Logged In: YES user_id=45338 I think enhanced text tracebacks would be great. (I even have my own hacked-up one lying around here somewhere -- it colourized the output. I think a part of me was waiting for an opportunity to make enhanced tracebacks standard. The most important enhancement IMHO is to show argument values.) I don't think the functionality belongs in cgitb, though. The main routine probably should go in traceback; the common routines (scanvars and lookup) can go there too. ---------------------------------------------------------------------- Comment By: Skip Montanaro (montanaro) Date: 2002-07-09 18:48 Message: Logged In: YES user_id=44345 Ping How about you? As the author I think you're in the best position to decide on the merits of the patch... Skip ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-06-19 22:36 Message: Logged In: YES user_id=6380 Unassigning -- I won't get to this before my vacation. ---------------------------------------------------------------------- Comment By: Skip Montanaro (montanaro) Date: 2002-06-16 00:09 Message: Logged In: YES user_id=44345 Okay, here's a correction to the first patch. It fixes the logic bug that corrupted the HTML output. It also adds a little bit of extra documentation. 
Writing the documentation made me think that perhaps this should be added to the traceback module as Guido suggested with just a stub cgitb module that provides an enable function that calls the enable function in the traceback module with format="html". The cgitb module could then be deprecated. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=569574&group_id=5470 From noreply@sourceforge.net Fri Aug 2 18:54:27 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Fri, 02 Aug 2002 10:54:27 -0700 Subject: [Patches] [ python-Patches-586561 ] Better token-related error messages Message-ID: Patches item #586561, was opened at 2002-07-25 16:21 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=586561&group_id=5470 Category: Parser/Compiler Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Skip Montanaro (montanaro) Assigned to: Jeremy Hylton (jhylton) Summary: Better token-related error messages Initial Comment: There were some complaints recently on c.l.py about the rather non-informative error messages emitted as a result of the tokenizer detecting a problem. In many situations it simply returns E_TOKEN which generates a fairly benign, but often unhelpful "invalid token" message. This patch adds several new E_* macros to Includes/errorcode.h, returns them from the appropriate places in Parser/tokenizer.c and generates more specific messages in Python/pythonrun.c. I think the error messages are always better, though in some situations they may still not be strictly correct. Assigning to Jeremy since he's the compiler wiz. Skip ---------------------------------------------------------------------- >Comment By: Jeremy Hylton (jhylton) Date: 2002-08-02 17:54 Message: Logged In: YES user_id=31392 Is the warning about i vs. j for complex numbers really necessary?
It seems like it adds extra, well, complexity for a tiny corner case. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=586561&group_id=5470 From noreply@sourceforge.net Fri Aug 2 19:21:37 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Fri, 02 Aug 2002 11:21:37 -0700 Subject: [Patches] [ python-Patches-590294 ] os._execvpe security fix Message-ID: Patches item #590294, was opened at 2002-08-02 11:21 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=590294&group_id=5470 Category: Modules Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Zack Weinberg (zackw) Assigned to: Nobody/Anonymous (nobody) Summary: os._execvpe security fix Initial Comment: 1) Do not attempt to exec a file which does not exist just to find out what error the operating system returns. This is an exploitable race on all platforms that support symbolic links. 2) Immediately re-raise the exception if we get an error other than errno.ENOENT or errno.ENOTDIR. This may need to be adapted for other platforms. (As a security issue, this should be considered for 2.1 and 2.2 as well as 2.3.) 
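A minimal sketch of point 2 above, for illustration only. It is not the patch itself: exec_on_path and its exec_func parameter are hypothetical names, introduced so the retry logic can be exercised with a stub instead of really replacing the process. The errno constants are used directly, so nothing has to be exec'd just to learn what error the system reports.

```python
import errno
import os


def exec_on_path(file, args, env, exec_func=os.execve):
    """Try each PATH directory in turn (hypothetical helper, not os._execvpe itself)."""
    if os.path.dirname(file):
        # An explicit directory was given, so no PATH search is needed.
        exec_func(file, args, env)
        return
    last_error = None
    for directory in env.get("PATH", os.defpath).split(os.pathsep):
        fullname = os.path.join(directory, file)
        try:
            exec_func(fullname, args, env)
            return  # only reached with a stub; a real exec does not return
        except OSError as exc:
            # Anything other than "not there" is re-raised immediately.
            if exc.errno not in (errno.ENOENT, errno.ENOTDIR):
                raise
            last_error = exc
    if last_error is not None:
        raise last_error
    raise OSError(errno.ENOENT, os.strerror(errno.ENOENT), file)
```

Passing a stub as exec_func makes it easy to check that the search stops at the first error that is not ENOENT or ENOTDIR.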
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=590294&group_id=5470 From noreply@sourceforge.net Fri Aug 2 19:22:36 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Fri, 02 Aug 2002 11:22:36 -0700 Subject: [Patches] [ python-Patches-590294 ] os._execvpe security fix Message-ID: Patches item #590294, was opened at 2002-08-02 11:21 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=590294&group_id=5470 Category: Modules Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Zack Weinberg (zackw) >Assigned to: Guido van Rossum (gvanrossum) Summary: os._execvpe security fix Initial Comment: 1) Do not attempt to exec a file which does not exist just to find out what error the operating system returns. This is an exploitable race on all platforms that support symbolic links. 2) Immediately re-raise the exception if we get an error other than errno.ENOENT or errno.ENOTDIR. This may need to be adapted for other platforms. (As a security issue, this should be considered for 2.1 and 2.2 as well as 2.3.) 
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=590294&group_id=5470 From noreply@sourceforge.net Fri Aug 2 19:05:51 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Fri, 02 Aug 2002 11:05:51 -0700 Subject: [Patches] [ python-Patches-586561 ] Better token-related error messages Message-ID: Patches item #586561, was opened at 2002-07-25 11:21 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=586561&group_id=5470 Category: Parser/Compiler Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Skip Montanaro (montanaro) Assigned to: Jeremy Hylton (jhylton) Summary: Better token-related error messages Initial Comment: There were some complaints recently on c.l.py about the rather non-informative error messages emitted as a result of the tokenizer detecting a problem. In many situations it simply returns E_TOKEN which generates a fairly benign, but often unhelpful "invalid token" message. This patch adds several new E_* macros to Includes/errorcode.h, returns them from the appropriate places in Parser/tokenizer.c and generates more specific messages in Python/pythonrun.c. I think the error messages are always better, though in some situations they may still not be strictly correct. Assigning to Jeremy since he's the compiler wiz. Skip ---------------------------------------------------------------------- >Comment By: Skip Montanaro (montanaro) Date: 2002-08-02 13:05 Message: Logged In: YES user_id=44345 re: i vs. j Perhaps it's not needed. The patch was originally designed to address the case of runaway triple-quoted strings. Someone on c.l.py ranted about that. While I was in there, I recalled someone else (perhaps more than one person) had berated Python in the past because imaginary numbers use 'j' instead of 'i' and decided to stick it in. It's no big deal to take it out.
(When you think about it, they are all corner cases, since most of the time the code is syntactically correct. ;-) S ---------------------------------------------------------------------- Comment By: Jeremy Hylton (jhylton) Date: 2002-08-02 12:54 Message: Logged In: YES user_id=31392 Is the warning about i vs. j for complex numbers really necessary? It seems like it adds extra, well, complexity for a tiny corner case. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=586561&group_id=5470 From noreply@sourceforge.net Fri Aug 2 19:17:38 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Fri, 02 Aug 2002 11:17:38 -0700 Subject: [Patches] [ python-Patches-586561 ] Better token-related error messages Message-ID: Patches item #586561, was opened at 2002-07-25 16:21 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=586561&group_id=5470 Category: Parser/Compiler Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Skip Montanaro (montanaro) Assigned to: Jeremy Hylton (jhylton) Summary: Better token-related error messages Initial Comment: There were some complaints recently on c.l.py about the rather non-informative error messages emitted as a result of the tokenizer detecting a problem. In many situations it simply returns E_TOKEN which generates a fairly benign, but often unhelpful "invalid token" message. This patch adds several new E_* macros to Includes/errorcode.h, returns them from the appropriate places in Parser/tokenizer.c and generates more specific messages in Python/pythonrun.c. I think the error messages are always better, though in some situations they may still not be strictly correct. Assigning to Jeremy since he's the compiler wiz.
Skip ---------------------------------------------------------------------- >Comment By: Jeremy Hylton (jhylton) Date: 2002-08-02 18:17 Message: Logged In: YES user_id=31392 The current error message for the complex case seems clear enough since it identifies exactly the offending character. >>> 3i+2 File "<stdin>", line 1 3i+2 ^ SyntaxError: invalid syntax The error message for runaway triple quoted strings is much more puzzling, since the line of context doesn't have anything useful on it. I guess we should think about the others, too: E_EOLS is marginal, since you do get the line with the error in the exception. E_EOFC is a win for the same reason that E_EOFS is, although I expect it's a less common case. E_EXP and E_SLASH are borderline -- again because the current syntax error identifies exactly the line and character that are causing the problem. We should get a third opinion, but I'd probably settle for just E_EOFC and E_EOFS. ---------------------------------------------------------------------- Comment By: Skip Montanaro (montanaro) Date: 2002-08-02 18:05 Message: Logged In: YES user_id=44345 re: i vs. j Perhaps it's not needed. The patch was originally designed to address the case of runaway triple-quoted strings. Someone on c.l.py ranted about that. While I was in there, I recalled someone else (perhaps more than one person) had berated Python in the past because imaginary numbers use 'j' instead of 'i' and decided to stick it in. It's no big deal to take it out. (When you think about it, they are all corner cases, since most of the time the code is syntactically correct. ;-) S ---------------------------------------------------------------------- Comment By: Jeremy Hylton (jhylton) Date: 2002-08-02 17:54 Message: Logged In: YES user_id=31392 Is the warning about i vs. j for complex numbers really necessary? It seems like it adds extra, well, complexity for a tiny corner case.
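To compare the two cases discussed in this thread on whatever interpreter is at hand, a small sketch along these lines may help. The helper name describe_syntax_error and the "<example>" filename are made up for illustration, and the message text and reported position differ between Python versions, so no output is shown.

```python
def describe_syntax_error(source):
    """Return (lineno, offset, message) for a SyntaxError, or None if source compiles."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError as exc:
        return exc.lineno, exc.offset, exc.msg
    return None


# The 'i' suffix case and a runaway triple-quoted string.
for snippet in ("3i+2\n", 'x = """never closed\nmore text\n'):
    print(repr(snippet), "->", describe_syntax_error(snippet))
```

The first case points at the offending line either way; the runaway string is the one where a more specific message makes the bigger difference.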
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=586561&group_id=5470 From noreply@sourceforge.net Fri Aug 2 21:27:10 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Fri, 02 Aug 2002 13:27:10 -0700 Subject: [Patches] [ python-Patches-590352 ] py2texi.el update Message-ID: Patches item #590352, was opened at 2002-08-02 20:27 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=590352&group_id=5470 Category: Documentation Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Matthias Klose (doko) Assigned to: Fred L. Drake, Jr. (fdrake) Summary: py2texi.el update Initial Comment: [python2.3 (and python2.2)] Attached is a patch from Milan Zamazal to update py2texi.el: - allow to set the info file name - correctly generate code for nodes like: \subsubsection{File Objects\obindex{file} \label{bltin-file-objects}} ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=590352&group_id=5470 From noreply@sourceforge.net Fri Aug 2 22:25:17 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Fri, 02 Aug 2002 14:25:17 -0700 Subject: [Patches] [ python-Patches-578494 ] PEP 282 Implementation Message-ID: Patches item #578494, was opened at 2002-07-07 20:50 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=578494&group_id=5470 Category: Library (Lib) Group: Python 2.3 >Status: Open Resolution: None Priority: 5 Submitted By: Vinay Sajip (vsajip) >Assigned to: Guido van Rossum (gvanrossum) Summary: PEP 282 Implementation Initial Comment: The attached file implements PEP282. The file logging-0.4.6.tar.gz is the entire distribution including setup/install, test/example scripts, and TeX documentation.
The file logging.py (within the .tar.gz) is all that is needed to implement the PEP. ---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-02 17:25 Message: Logged In: YES user_id=6380 Um, Mark, looks like you accidentally closed this! I reopened it and assigned it to me for review. I'm gonna read the PEP and see if I like the design decisions enough to pronounce acceptance. ---------------------------------------------------------------------- Comment By: Mark Hammond (mhammond) Date: 2002-07-09 22:22 Message: Logged In: YES user_id=14198 The code seems high quality and well documented. I have no concerns with logging.py as such. I have two main issues: * Design decisions: looking over python-dev, I can not see a consensus on the design decisions. I believe that *some* type of official acceptance of the design should be decreed by someone. * Source structure: while this seems quite suitable for an extension module, the format of the patch is probably not quite correct for a core module. For example, the test code should probably be integrated with the standard Python test suite (even if in a sub-directory), the Tex docs integrated with Python's docs etc So while I think the patch is high quality I believe these issues need to be addressed before I can do much more. Setting to "pending" - but good stuff tho! Please drive this through! ---------------------------------------------------------------------- Comment By: Vinay Sajip (vsajip) Date: 2002-07-07 20:56 Message: Logged In: YES user_id=308438 Added just the logging.py file to make it easier to review. 
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=578494&group_id=5470 From noreply@sourceforge.net Fri Aug 2 22:50:49 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Fri, 02 Aug 2002 14:50:49 -0700 Subject: [Patches] [ python-Patches-590377 ] db4 include not found Message-ID: Patches item #590377, was opened at 2002-08-02 21:50 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=590377&group_id=5470 Category: Build Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Matthias Klose (doko) Assigned to: Nobody/Anonymous (nobody) Summary: db4 include not found Initial Comment: setup.py looks for the db4 library in /usr/lib, but doesn't look for the header in /usr/include (as you find it on Debian unstable) ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=590377&group_id=5470 From noreply@sourceforge.net Sat Aug 3 02:20:56 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Fri, 02 Aug 2002 18:20:56 -0700 Subject: [Patches] [ python-Patches-506436 ] GETCONST/GETNAME/GETNAMEV speedup Message-ID: Patches item #506436, was opened at 2002-01-21 08:39 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=506436&group_id=5470 Category: Core (C code) Group: Python 2.3 Status: Open >Resolution: Accepted Priority: 5 Submitted By: Skip Montanaro (montanaro) >Assigned to: Skip Montanaro (montanaro) Summary: GETCONST/GETNAME/GETNAMEV speedup Initial Comment: The attached patch redefines the GETCONST, GETNAME & GETNAMEV macros to do the following: * access the code object's consts and names through local variables instead of the long chain from f * use access macros to index the tuples and get the C string names The code appears 
correct, and I've had no trouble with it. It only provides the most trivial of improvement on pystone (around 1% when I see anything), but it's all those little things that add up, right? Skip ---------------------------------------------------------------------- >Comment By: Tim Peters (tim_one) Date: 2002-08-02 21:20 Message: Logged In: YES user_id=31435 Cool -- check it in! ---------------------------------------------------------------------- Comment By: Skip Montanaro (montanaro) Date: 2002-08-02 12:58 Message: Logged In: YES user_id=44345 Here's an updated patch. Back to Tim... ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-07-23 18:04 Message: Logged In: YES user_id=31435 Marked Out-of-Date and back to Skip. Sorry for the delay! The idea is fine. I'd rather you use the current GETITEM macro, which does bounds-checking in a debug build. I note too that GETCONST is only used once, and that use may as well be a direct GETITEM(consts, i) invocation, and skip the macro. Note that the GETNAME() macro no longer exists. ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-07-22 23:34 Message: Logged In: YES user_id=33168 Skip, I modified this code some, but your technique is still valid. I got rid of one of the indirections already. The patch can easily be updated. Seems like the patch shouldn't hurt. Tim? ---------------------------------------------------------------------- Comment By: Skip Montanaro (montanaro) Date: 2002-07-09 19:45 Message: Logged In: YES user_id=44345 Looking for a vote up or down on this one... ---------------------------------------------------------------------- Comment By: Skip Montanaro (montanaro) Date: 2002-01-21 08:47 Message: Logged In: YES user_id=44345 Whoops... Make the "observed" speedup 0.1%... 
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=506436&group_id=5470 From noreply@sourceforge.net Sat Aug 3 07:53:59 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Fri, 02 Aug 2002 23:53:59 -0700 Subject: [Patches] [ python-Patches-589982 ] tempfile.py rewrite Message-ID: Patches item #589982, was opened at 2002-08-01 23:38 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=589982&group_id=5470 Category: Modules Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Zack Weinberg (zackw) Assigned to: Nobody/Anonymous (nobody) Summary: tempfile.py rewrite Initial Comment: This rewrite closes a number of security-relevant races in tempfile.py; makes temporary filenames much harder to guess; provides secure interfaces that can be used to close similar races elsewhere; and makes it possible to control the prefix and directory of each temporary created, individually. ---------------------------------------------------------------------- >Comment By: Zack Weinberg (zackw) Date: 2002-08-02 23:53 Message: Logged In: YES user_id=580015 I've revised the patch; ignore the old one. This version includes a vastly expanded test_tempfile.py which hits every line that I know how to test. The omissions are marked - it's mostly non-Unix issues. Also, I went through the entire CVS repository and replaced all uses of tempfile.mktemp with mkstemp/mkdtemp/NamedTemporaryFile, as appropriate. The sole exception is Lib/os.py, which is addressed by patch #590294. The sole functional change to tempfile.py itself, from the previous, is to throw os.O_NOFOLLOW into the open flags. This closes yet another hole - on some systems, without this flag, open(file, O_CREAT|O_EXCL) will follow a symbolic link that points to a nonexistent file, and create the link target. 
(This has no effect on a symlink in the directory components of the pathname - if the sysadmin has symlinked /tmp to /hugedisk/scratch, that still works.) ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-02 07:45 Message: Logged In: YES user_id=6380 This needs some serious review! Volunteers??? ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=589982&group_id=5470 From noreply@sourceforge.net Sat Aug 3 15:16:26 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sat, 03 Aug 2002 07:16:26 -0700 Subject: [Patches] [ python-Patches-590513 ] Add popen2 like functionality to pty.py. Message-ID: Patches item #590513, was opened at 2002-08-04 00:16 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=590513&group_id=5470 Category: Library (Lib) Group: Python 2.2.x Status: Open Resolution: None Priority: 5 Submitted By: Rasjid Wilcox (rasjidw) Assigned to: Nobody/Anonymous (nobody) Summary: Add popen2 like functionality to pty.py. Initial Comment: This patch adds a popen2 like function to pty.py. Due to use of os.execlp in pty.spawn, it is not quite the same, as all the arguments (including the command to be run) must be passed as a tuple. It is only a first draft, and may need some more work, which I am willing to do if some direction is indicated. Tested on Python2.2, under RedHat Linux 7.3. Rasjid.
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=590513&group_id=5470 From noreply@sourceforge.net Sat Aug 3 16:52:13 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sat, 03 Aug 2002 08:52:13 -0700 Subject: [Patches] [ python-Patches-587993 ] alternative SET_LINENO killer Message-ID: Patches item #587993, was opened at 2002-07-29 11:27 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=587993&group_id=5470 Category: Core (C code) Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Michael Hudson (mwh) Assigned to: Guido van Rossum (gvanrossum) Summary: alternative SET_LINENO killer Initial Comment: This patch is a proof-of-concept of another way to remove the SET_LINENO patch (as opposed to Vladimir's ancient one). Instead of rewriting bytecode (ick!) we poke into the c_lnotab to see if we've moved onto a different line. The c_lnotab is not the most transparent of data structures, it has to be said. I'm not sure this patch is 100% correct -- but I think the idea can definitely fly. There will be some more overhead to tracing than before, but I hope not too much. I haven't tested these aspects. Comments welcome! ---------------------------------------------------------------------- >Comment By: Michael Hudson (mwh) Date: 2002-08-03 15:52 Message: Logged In: YES user_id=6656 Update. New this time: - adds RETURN_NONE opcode for the epilogue. - don't call line trace events on it. - bumps MAGIC this is the output of "cvs diff" at the top level, so it's all in there.
---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-02 13:34 Message: Logged In: YES user_id=6656 Here's an update (just for ceval.c): - moves tracing code out of line - more expository comments - don't cause line trace events in the function epilogue ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-01 15:43 Message: Logged In: YES user_id=6656 Hang on, lets get all the doc changes: Doc/lib/libdis.tex Doc/tut/tut.tex Doc/whatsnew/whatsnew23.tex Misc/NEWS ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-01 15:41 Message: Logged In: YES user_id=6656 And finally, docs. This touches: Doc/lib/libdis.tex Doc/tut/tut.tex ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-01 15:40 Message: Logged In: YES user_id=6656 Here's the "boring" patch, more mechanical stuff. It touches: Include/opcode.h Lib/traceback.py Modules/_hotshot.c Python/frozen.c Python/traceback.c This should still be reviewed, of course, but I really shouldn't have messed any of this up. ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-01 15:38 Message: Logged In: YES user_id=6656 Yeah, the obvious response to that was "delete your .pycs"... Right, I'm going to upload my latest, and hopefully final patch in three bits. First, I'm attaching the "interesting" patch. This touches: Objects/frameobject.c -- adds f_lineno descr Python/ceval.c -- the obvious Python/compile.c -- don't emit SET_LINENO Lib/dis.py -- use c_lnotab for line numbers Tools/scripts/trace.py -- ditto This is the most subtle patch, and the one that I'd most like review on. 
---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-07-30 22:09 Message: Logged In: YES user_id=6380 Never mind the errors, I hadn't done a cvs update in weeks on the machine where I tested it. :-( ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-07-30 21:54 Message: Logged In: YES user_id=6380 Cool idea, but I get "unknown opcode" errors... Keep me posted though! (I will now see any changes to this patch item.) ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-07-30 09:59 Message: Logged In: YES user_id=6656 Guido should see this, assuming he still isn't subscribed to patches@python.org. ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-07-30 09:58 Message: Logged In: YES user_id=6656 I worked out why some of the code in ceval.c wasn't making sense to me -- it didn't make sense, period. I've also fixed a number of silly and not so silly bugs in my patch. I'm now 99% certain this idea can fly. The patch isn't *finished* but the hard bit is done, IMHO. There are some other points to make, but I think I'll raise them on python-dev. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-07-29 20:18 Message: Logged In: YES user_id=31435 Dropping out of "vacation mode" long enough to say "mondo cool!" and encourage this. Guido may not agree, but I also encourage you to redefine c_lnotab if it can make life easier and quicker here. That subtle compression scheme has been the source of several nasty bugs, both in the core C code and in Jeremy's compiler pkg (cut 'n paste bugs abound here, because few people understand what's really needed, so flawed code gets copied with little thought). 
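
For readers unfamiliar with the compression scheme Tim mentions, here is a minimal Python sketch of how an offset-to-line lookup over a classic line-number table can work. It assumes the Python 2.x-era layout: a byte string of (address increment, line increment) pairs, each an unsigned byte. The function name addr_to_line and the example table are hypothetical, chosen for illustration; this is not code from the patch.

```python
def addr_to_line(lnotab, first_lineno, addr_query):
    """Map a bytecode offset to a source line number.

    Assumes lnotab is a bytes object holding (addr_incr, line_incr)
    pairs of unsigned bytes, and that its length is even.
    """
    line = first_lineno
    addr = 0
    for i in range(0, len(lnotab), 2):
        addr += lnotab[i]
        if addr > addr_query:
            break  # this entry begins after the queried offset
        line += lnotab[i + 1]
    return line


# Hypothetical table: line 10 starts at offset 0, line 11 at offset 6,
# line 13 at offset 14, and line 14 at offset 18.
EXAMPLE_LNOTAB = bytes([6, 1, 8, 2, 4, 1])

for offset in (0, 5, 6, 14, 18):
    print(offset, addr_to_line(EXAMPLE_LNOTAB, 10, offset))
```

The subtlety is that the address increment is applied before the comparison and the line increment only after it, which is easy to get backwards when copying the loop.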
---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-07-29 15:34 Message: Logged In: YES user_id=6656 Uhh, the instr_[lu]b variables need to keep their values around the loop; otherwise we might just as well call PyCode_Addr2Line each time around. I have another version of the patch that does that, but I assumed the overhead of doing so was deemed too high, or someone else would have done this by now. It's certainly easier. Glad to hear it doesn't affect python -O too much. I was doing this away from the internet and forgot to keep a clean copy of the source around for doing comparisons with... ---------------------------------------------------------------------- Comment By: Neil Schemenauer (nascheme) Date: 2002-07-29 15:18 Message: Logged In: YES user_id=35752 Moving the "int io, instr_ub = -1, instr_lb = 0;" declaration and the "io = INSTR_OFFSET();" statement below the "if (tstate->c_tracefunc ...)" test gives a small speedup on my machine and is a little neater, IMHO. I was worried that this would slow down "python -O". That doesn't seem to be the case (at least I can't measure it). Well done. 
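
The point Michael makes about the instr_lb and instr_ub variables keeping their values around the loop can be illustrated with a small Python sketch. It assumes the same classic (address increment, line increment) byte-pair table as the Python 2.x-era c_lnotab, and caches the offset range of the current line so the table is only rescanned when execution leaves that range. The names line_bounds and LineTracker are hypothetical, and the sketch is my reading of the idea rather than the patch's C code.

```python
def line_bounds(lnotab, first_lineno, addr_query):
    """Return (line, lower, upper) such that lower <= addr_query < upper.

    upper is None when addr_query lies in the last line's range.
    """
    line = first_lineno
    addr = 0
    lower = 0
    for i in range(0, len(lnotab), 2):
        next_addr = addr + lnotab[i]
        if next_addr > addr_query:
            return line, lower, next_addr
        addr = next_addr
        if lnotab[i + 1]:
            line += lnotab[i + 1]
            lower = addr  # a new line starts at this offset
    return line, lower, None


class LineTracker:
    """Caches the current line's offset range between lookups."""

    def __init__(self, lnotab, first_lineno):
        self.lnotab = lnotab
        self.first_lineno = first_lineno
        self.line = None   # no range cached yet
        self.lower = 0
        self.upper = 0
        self.lookups = 0   # number of full table scans performed

    def line_for(self, offset):
        in_range = self.lower <= offset and (
            self.upper is None or offset < self.upper)
        if self.line is None or not in_range:
            self.lookups += 1
            self.line, self.lower, self.upper = line_bounds(
                self.lnotab, self.first_lineno, offset)
        return self.line


# Hypothetical table: lines 10, 11, 13, 14 start at offsets 0, 6, 14, 18.
tracker = LineTracker(bytes([6, 1, 8, 2, 4, 1]), 10)
lines = [tracker.line_for(offset) for offset in range(0, 20, 2)]
print(lines, tracker.lookups)
```

If the cached bounds were reset on every pass through the loop, each instruction would trigger a full scan, which is the situation Michael describes as no better than calling PyCode_Addr2Line each time.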
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=587993&group_id=5470 From noreply@sourceforge.net Sat Aug 3 21:06:27 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sat, 03 Aug 2002 13:06:27 -0700 Subject: [Patches] [ python-Patches-587993 ] alternative SET_LINENO killer Message-ID: Patches item #587993, was opened at 2002-07-29 07:27 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=587993&group_id=5470 Category: Core (C code) Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Michael Hudson (mwh) >Assigned to: Nobody/Anonymous (nobody) Summary: alternative SET_LINENO killer Initial Comment: This patch is a proof-of-concept of another way to remove the SET_LINENO patch (as opposed to Vladimir's ancient one). Instead of rewriting bytecode (ick!) we poke into the c_lnotab to see if we've moved onto a different line. The c_lnotab is not the most transparent of data structures, it has to be said. I'm not sure this patch is 100% correct -- but I think the idea can definitely fly. There will be some more overhead to tracing than before, but I hope not too much. I haven't tested these aspects. Comments welcome! ---------------------------------------------------------------------- >Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-03 16:06 Message: Logged In: YES user_id=33168 Overall, the patch looks very good. 
It would be nice to check in some of the changes separate from this patch:
* com_addoparg(SET_LINENO) -> com_set_lineno()
* the updated comments and whitespace
* pystone
* dis.py, disassemble(), perhaps
* RETURN_NONE, perhaps
Here are some other minor comments:
* Need to add RETURN_NONE opcode into dis.py
* Need to add doc for RETURN_NONE in libdis.tex
* It would be nice if the code from inspect.getlineno could be reused in disassemble() (not sure this can be done, though--i saw at least 4 variations of this code, 2 in python and 2 in C)
* Should frame.f_lineno be initialized to 0, -1 or something invalid?
* scripts/trace.py reimplements getting the lineno, why not reuse code?
* docstring in scripts/trace._find_LINENO_from_code needs to remove ref to SET_LINENO
---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-03 11:52 Message: Logged In: YES user_id=6656 Update. New this time:
- adds RETURN_NONE opcode for the epilogue.
- don't call line trace events on it.
- bumps MAGIC
This is the output of "cvs diff" at the top level, so it's all in there. ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-02 09:34 Message: Logged In: YES user_id=6656 Here's an update (just for ceval.c):
- moves tracing code out of line
- more expository comments
- don't cause line trace events in the function epilogue
---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-01 11:43 Message: Logged In: YES user_id=6656 Hang on, lets get all the doc changes: Doc/lib/libdis.tex Doc/tut/tut.tex Doc/whatsnew/whatsnew23.tex Misc/NEWS ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-01 11:41 Message: Logged In: YES user_id=6656 And finally, docs. 
This touches: Doc/lib/libdis.tex Doc/tut/tut.tex ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-01 11:40 Message: Logged In: YES user_id=6656 Here's the "boring" patch, more mechanical stuff. It touches: Include/opcode.h Lib/traceback.py Modules/_hotshot.c Python/frozen.c Python/traceback.c This should still be reviewed, of course, but I really shouldn't have messed any of this up. ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-01 11:38 Message: Logged In: YES user_id=6656 Yeah, the obvious response to that was "delete your .pycs"... Right, I'm going to upload my latest, and hopefully final patch in three bits. First, I'm attaching the "interesting" patch. This touches: Objects/frameobject.c -- adds f_lineno descr Python/ceval.c -- the obvious Python/compile.c -- don't emit SET_LINENO Lib/dis.py -- use c_lnotab for line numbers Tools/scripts/trace.py -- ditto This is the most subtle patch, and the one that I'd most like review on. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-07-30 18:09 Message: Logged In: YES user_id=6380 Never mind the errors, I hadn't done a cvs update in weeks on the machine where I tested it. :-( ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-07-30 17:54 Message: Logged In: YES user_id=6380 Cool idea, but I get "unknown opcode" errors... Keep me posted though! (I will now see any changes to this patch item.) ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-07-30 05:59 Message: Logged In: YES user_id=6656 Guido should see this, assuming he still isn't subscribed to patches@python.org. 
---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-07-30 05:58 Message: Logged In: YES user_id=6656 I worked out why some of the code in ceval.c wasn't making sense to me -- it didn't make sense, period. I've also fixed a number of silly and not so silly bugs in my patch. I'm now 99% certain this idea can fly. The patch isn't *finished* but the hard bit is done, IMHO. There are some other points to make, but I think I'll raise them on python-dev. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-07-29 16:18 Message: Logged In: YES user_id=31435 Dropping out of "vacation mode" long enough to say "mondo cool!" and encourage this. Guido may not agree, but I also encourage you to redefine c_lnotab if it can make life easier and quicker here. That subtle compression scheme has been the source of several nasty bugs, both in the core C code and in Jeremy's compiler pkg (cut 'n paste bugs abound here, because few people understand what's really needed, so flawed code gets copied with little thought). ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-07-29 11:34 Message: Logged In: YES user_id=6656 Uhh, the instr_[lu]b variables need to keep their values around the loop; otherwise we might just as well call PyCode_Addr2Line each time around. I have another version of the patch that does that, but I assumed the overhead of doing so was deemed too high, or someone else would have done this by now. It's certainly easier. Glad to hear it doesn't affect python -O too much. I was doing this away from the internet and forgot to keep a clean copy of the source around for doing comparisons with... 
---------------------------------------------------------------------- Comment By: Neil Schemenauer (nascheme) Date: 2002-07-29 11:18 Message: Logged In: YES user_id=35752 Moving the "int io, instr_ub = -1, instr_lb = 0;" declaration and the "io = INSTR_OFFSET();" statement below the "if (tstate->c_tracefunc ...)" test gives a small speedup on my machine and is a little neater, IMHO. I was worried that this would slow down "python -O". That doesn't seem to be the case (at least I can't measure it). Well done. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=587993&group_id=5470 From noreply@sourceforge.net Sun Aug 4 05:58:58 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sat, 03 Aug 2002 21:58:58 -0700 Subject: [Patches] [ python-Patches-590682 ] New codecs: html, asciihtml Message-ID: Patches item #590682, was opened at 2002-08-04 04:58 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=590682&group_id=5470 Category: None Group: None Status: Open Resolution: None Priority: 5 Submitted By: Oren Tirosh (orenti) Assigned to: Nobody/Anonymous (nobody) Summary: New codecs: html, asciihtml Initial Comment: These codecs translate HTML character &entity; references. The html codec may be applied after other codecs such as utf-8 or iso8859_X and preserves their encoding. The asciihtml encoder produces 7-bit ascii and its output is therefore safe for insertion into almost any document regardless of its encoding. 
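
To make the asciihtml idea concrete, here is a rough sketch in present-day Python 3 of an encoder that turns every non-ASCII character into an HTML character reference, so the result is pure 7-bit ASCII. It is not the code from this patch, whose implementation is not shown in this message; the function name to_ascii_html is hypothetical, and the sketch relies only on the standard html and html.entities modules.

```python
import html
from html.entities import codepoint2name


def to_ascii_html(text):
    """Replace each non-ASCII character with an HTML reference.

    Named entities are used where the standard table has one;
    everything else becomes a decimal numeric reference.
    ASCII characters, including '&', pass through unchanged.
    """
    parts = []
    for ch in text:
        cp = ord(ch)
        if cp < 128:
            parts.append(ch)
        elif cp in codepoint2name:
            parts.append("&%s;" % codepoint2name[cp])
        else:
            parts.append("&#%d;" % cp)
    return "".join(parts)


original = "caf\u00e9 \u20ac5 \u4e2d"
encoded = to_ascii_html(original)
print(encoded)                 # contains only ASCII characters
print(html.unescape(encoded))  # decodes the references again
```

Because a literal "&" is passed through untouched, this sketch only round-trips cleanly for input that does not already contain entity-like text; a real codec would have to decide how to treat that case.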
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=590682&group_id=5470 From noreply@sourceforge.net Sun Aug 4 07:47:39 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sat, 03 Aug 2002 23:47:39 -0700 Subject: [Patches] [ python-Patches-578297 ] fix for problems with test_longexp Message-ID: Patches item #578297, was opened at 2002-07-07 16:21 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=578297&group_id=5470 Category: Parser/Compiler Group: Python 2.3 Status: Open Resolution: Accepted Priority: 5 Submitted By: Andrew I MacIntyre (aimacintyre) Assigned to: Andrew I MacIntyre (aimacintyre) Summary: fix for problems with test_longexp Initial Comment: The OS/2 EMX port has long had problems with test_longexp, which triggers gross memory consumption on this platform as a result of platform malloc behaviour. More recently, this appears to have been identified in MacPython under certain circumstances, although the problem is apparently more a speed issue than a memory consumption issue. The core of the problem is the blizzard of small mallocs as the parser builds the parse tree and creates tokens. The attached patch takes advantage of PyMalloc (built in by default for 2.3) to insulate the parser from adverse behaviour in the platform malloc. The patch has been tested on OS/2 and FreeBSD: - on OS/2, the patch allows even a system with modest resources to complete test_longexp successfully and without swapping to death; on better resourced machines, the whole regression test is negligibly slower (0-1%) to complete. [gcc-2.8.1 -O2] - on FreeBSD (4.4 tested), test_longexp gains nearly 10%, and completes the whole regression test with a gain of about 2% (test_longexp is good for about 25% of the improvement). [gcc-2.95.3 -O3] Both platforms are neutral, performance wise, running MAL's PyBench 1.0. 
The patch in its current form is for experimental evaluation, and not intended for integration into the core. If there is interest in seeing this integrated, I'd like feedback on a more elegant way to implement the functional change. I've assigned this to Jack for review in the context of its performance on the Mac. ---------------------------------------------------------------------- >Comment By: Andrew I MacIntyre (aimacintyre) Date: 2002-08-04 16:47 Message: Logged In: YES user_id=250749 I've committed this to the head (2.3) branch. I looked at a 2.2 backport, but ran into problems with the Parser/acceler.c and Parser/parsetok.c changes - due to changes in memory APIs and changes in Include/pgenheaders.h, they don't compile under 2.2 with or without --with-pymalloc :-( I had a look at what it would take to resolve, and concluded that it involved opening Pandora's box. Parser/node.c and Python/compile.c would be OK. Unfortunately the Parser/parsetok.c changes are the most valuable after Parser/node.c, based on allocation sizes and frequency. While I believe these changes would enhance the performance and stability of the 2.2 branch, most of the gains have already been made by Tim's PyNode_AddChild() fix. Given the likely extent of the code disturbance, and that I'm not going to have any time to go further with this for at least a month, I don't plan to do the 2.2 backport and plan to close this patch in a couple of days. Thanks all! ---------------------------------------------------------------------- Comment By: Jeremy Hylton (jhylton) Date: 2002-08-02 02:59 Message: Logged In: YES user_id=31392 Looks good to me for 2.3 and 2.2. I see a 40% speedup on compilation of long source strings. ---------------------------------------------------------------------- Comment By: Andrew I MacIntyre (aimacintyre) Date: 2002-07-30 12:21 Message: Logged In: YES user_id=250749 Yes, test_longexp_fix.diff is no longer part of the patch set. Should I delete it? 
I must have missed your 2.2 backport commit message. I might also look at whether it can be backported to 2.1 without significant side effects. Thanks for your feedback too. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-07-30 05:52 Message: Logged In: YES user_id=31435 Reassigned to Jeremy because I'm "on vacation" this week, and Jeremy is most familiar w/ the parser code. Offhand the patches looked fine to me, provided that you no longer consider test_longexp_fix.diff to be part of the patch set. I backported the XXXROUNDUP changes to the 2.2 maintenance branch at the same time I changed it in the HEAD, so nothing left to do there on that count. Thanks for the great work! ---------------------------------------------------------------------- Comment By: Andrew I MacIntyre (aimacintyre) Date: 2002-07-29 21:47 Message: Logged In: YES user_id=250749 Tim, 1. any objections to the "final" patches? 2. do you see any reason not to backport your XXXROUNDUP change - it qualifies as a performance/behaviour bugfix IMO. ---------------------------------------------------------------------- Comment By: Andrew I MacIntyre (aimacintyre) Date: 2002-07-21 23:16 Message: Logged In: YES user_id=250749 Ok, I've prepared patches to convert the following files to use PyMalloc for memory allocation: Parser/[acceler.c|node.c|parsetok.c] (pymalloc-parser.diff) Python/compile.c (pymalloc-compile.diff) I didn't bother with the other files in Parser/ as my malloc logging shows that they only ever appear to make requests > 256 bytes. I have attached/will attach a summary from my malloc logging experiments for information. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-07-16 04:14 Message: Logged In: YES user_id=31435 Thanks for the detailed followup, Andrew! I incorporated some of this info into XXXROUNDUP's comments. 
Without either patch, the system malloc has to do two miserable things: (1) find bigger and bigger memory areas very frequently; and, (2) interleaved with that, allocate gazillions of tiny blocks too. #2 makes it difficult for the platform malloc to find free space contiguous to the blocks allocated for #1, unless it arranges to move them to "the end" of memory, or into their own memory segments. As a result it's likely to do a copy on nearly every large-block realloc, and the code used to do a realloc on every 3rd new child. The XXXROUNDUP patch addressed #1 by asking to grow blocks much less frequently; PyMalloc addresses #2 by getting the tiny blocks out of the platform malloc's hair. If the platform malloc is saved from either one, its job becomes much easier. It would still be nice to switch the parser to using pymalloc. There are still disasters lurking, because some platform malloc packages appear to take quadratic time when *free*ing gazillions of tiny blocks (they thrash trying to coalesce them into larger contiguous free blocks). pymalloc doesn't try to coalesce free blocks, so is reliably immune to this disease. ---------------------------------------------------------------------- Comment By: Andrew I MacIntyre (aimacintyre) Date: 2002-07-15 21:47 Message: Logged In: YES user_id=250749 To my surprise, Tim's checkin also works for the EMX port. I can only conclude that EMX's realloc() has a corner case tickled by test_longexp, that isn't hit with either the aggressive overallocation change or the PyMalloc change applied. It is also interesting to note the performance impact of Tim's checkin, particularly on FreeBSD. Typical runtimes for "python -E -tt Lib/test/regrtest.py -l test_longexp" on my P5-166SMP test box (FreeBSD 4.4, gcc 2.95.3 -O3):

                          total    user     sys
baseline:                 39.1s    32.7s    6.3s
my patch:                 37.1s    30.3s    6.7s
Tim's checkin:             8.4s     7.8s    0.6s
my patch+Tim's checkin:    5.5s     4.9s    0.5s

These runs were made with Library modules already compiled. 
While Tim's comments about timing the regression test are noted, there are nonetheless consistent reductions in execution time of the regression test as well. Typical results on the same test box:

                          total    user     sys
baseline:                 1386s    1097s    89s
my patch:                 1350s    1065s    93s
Tim's checkin:            1265s    1003s    67s
my patch+Tim's checkin:   1230s     971s    65s

With the EMX port, the difference in timing between Tim's checkin and my patch is small, both for test_longexp and the regression test. There are noticeable gains for both test_longexp and the whole regression test with both changes in place, although not as significant as the FreeBSD results. MAL's PyBench 1.0 exhibits negligible performance differences between the code states on both platforms, which is as I'd expect as it doesn't appear to test compile() or eval(). From the above, I conclude that Tim's patch gets the most bang for the buck, and that my patch (or its intent) be rejected unless someone thinks pursuing the PyMalloc changes to the parser worthwhile. As an aside, I did a little research on the "XXX are those actually common?" question Tim posed in the comment associated with his change: In running Lib/compileall.py against the Lib directory, 89% of PyMem_RESIZE() calls in AddChild() are the n=1 case, and 9% are rounded up to n=4. ---------------------------------------------------------------------- Comment By: Jack Jansen (jackjansen) Date: 2002-07-08 20:09 Message: Logged In: YES user_id=45365 With Tim's mods test_import and test_longexp now work fine in MacPython. This is both with and without Andrew's patch. Andrew, I'm assigning back to you, there's little more I can do with this patch. And you'll have to check if you still need it, or whether Tim's change to node.c is good enough for OS/2 as well. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-07-08 16:38 Message: Logged In: YES user_id=31435 Jack, please do a cvs update and try this again. 
I checked in changes to PyNode_AddChild() that I expect will cure your particular woes here. Andrew, PyMalloc was designed for oodles of small allocations. Feel encouraged to write a patch to change the compiler to use PyObject_{Malloc, Realloc, Free} instead. Then it will automatically exploit PyMalloc when the latter is enabled. Note that the regression test suite incorporates random numbers in several tests, and in ways that can affect runtime. Small differences in aggregate test suite runtime are meaningless because of this. ---------------------------------------------------------------------- Comment By: Jack Jansen (jackjansen) Date: 2002-07-08 07:24 Message: Logged In: YES user_id=45365 Unfortunately on the Mac it doesn't help anything for the test_longexp problem, nor for the similar test_import problem. The problem with MacPython's malloc seems to be that large reallocs cause the slowdown. And the addchild() calls will continually realloc a block of memory to a slightly larger size (I gave up when it was about 800KB, after a minute or two, and growing at tens of KB per second). As soon as the block is larger than SMALL_REQUEST_THRESHOLD pymalloc will simply call the underlying system malloc/realloc. ---------------------------------------------------------------------- Comment By: Andrew I MacIntyre (aimacintyre) Date: 2002-07-07 16:41 Message: Logged In: YES user_id=250749 Oops. On FreeBSD, test_longexp contributes 15% of the performance gain (not 25%) observed for the regression test with the patch applied. Also, I would expect to make this a platform specific change if it's integrated, rather than a general change (unless that is seen as more appropriate). 
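
The effect of growing the child array less often, which Tim and Jack discuss in this thread, can be seen with a small counting experiment in Python. The old behaviour (a realloc on every 3rd new child) is taken from Tim's description; the replacement policy below (multiples of 4 up to 128 children, then powers of two) is my assumption about the rounding rule and is not spelled out in these messages. All function names are hypothetical.

```python
def every_third(n):
    """Old-style capacity: next multiple of 3 that holds n children."""
    return ((n + 2) // 3) * 3


def roundup(n):
    """Assumed new-style capacity for n children (see note above)."""
    if n <= 1:
        return n
    if n <= 128:
        return (n + 3) & ~3             # next multiple of 4
    return 1 << (n - 1).bit_length()    # next power of two


def count_resizes(total_children, capacity_for):
    """Count how often the array must grow while adding children one by one."""
    capacity = 0
    resizes = 0
    for n in range(1, total_children + 1):
        needed = capacity_for(n)
        if needed > capacity:
            capacity = needed
            resizes += 1
    return resizes


print(count_resizes(100000, every_third))  # grows linearly with the child count
print(count_resizes(100000, roundup))      # grows logarithmically past 128
```

With a fixed step the number of reallocs is proportional to the number of children, so a node with a huge child list, as in test_longexp, keeps asking the platform malloc to move an ever larger block. Geometric growth keeps the count small regardless of how large the node gets.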
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=578297&group_id=5470 From noreply@sourceforge.net Sun Aug 4 07:58:53 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sat, 03 Aug 2002 23:58:53 -0700 Subject: [Patches] [ python-Patches-578297 ] fix for problems with test_longexp Message-ID: Patches item #578297, was opened at 2002-07-07 02:21 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=578297&group_id=5470 Category: Parser/Compiler Group: Python 2.3 >Status: Closed Resolution: Accepted Priority: 5 Submitted By: Andrew I MacIntyre (aimacintyre) Assigned to: Andrew I MacIntyre (aimacintyre) Summary: fix for problems with test_longexp Initial Comment: The OS/2 EMX port has long had problems with test_longexp, which triggers gross memory consumption on this platform as a result of platform malloc behaviour. More recently, this appears to have been identified in MacPython under certain circumstances, although the problem is apparently more a speed issue than a memory consumption issue. The core of the problem is the blizzard of small mallocs as the parser builds the parse tree and creates tokens. The attached patch takes advantage of PyMalloc (built in by default for 2.3) to insulate the parser from adverse behaviour in the platform malloc. The patch has been tested on OS/2 and FreeBSD: - on OS/2, the patch allows even a system with modest resources to complete test_longexp successfully and without swapping to death; on better resourced machines, the whole regression test is negligibly slower (0-1%) to complete. [gcc-2.8.1 -O2] - on FreeBSD (4.4 tested), test_longexp gains nearly 10%, and completes the whole regression test with a gain of about 2% (test_longexp is good for about 25% of the improvement). [gcc-2.95.3 -O3] Both platforms are neutral, performance wise, running MAL's PyBench 1.0. 
The patch in its current form is for experimental evaluation, and not intended for integration into the core. If there is interest in seeing this integrated, I'd like feedback on a more elegant way to implement the functional change. I've assigned this to Jack for review in the context of its performance on the Mac. ---------------------------------------------------------------------- >Comment By: Tim Peters (tim_one) Date: 2002-08-04 02:58 Message: Logged In: YES user_id=31435 Thank you, Andrew! You did plenty here. We can't recommend enabling pymalloc in 2.2 anyway (at least not without a ton of other backports first to make it work right in all cases), so don't even think twice about skipping the backport of this. Marked Closed. ---------------------------------------------------------------------- Comment By: Andrew I MacIntyre (aimacintyre) Date: 2002-08-04 02:47 Message: Logged In: YES user_id=250749 I've committed this to the head (2.3) branch. I looked at a 2.2 backport, but ran into problems with the Parser/acceler.c and Parser/parsetok.c changes - due to changes in memory APIs and changes in Include/pgenheaders.h, they don't compile under 2.2 with or without --with-pymalloc :-( I had a look at what it would take to resolve, and concluded that it involved opening Pandora's box. Parser/node.c and Python/compile.c would be OK. Unfortunately the Parser/parsetok.c changes are the most valuable after Parser/node.c, based on allocation sizes and frequency. While I believe these changes would enhance the performance and stability of the 2.2 branch, most of the gains have already been made by Tim's PyNode_AddChild() fix. Given the likely extent of the code disturbance, and that I'm not going to have any time to go further with this for at least a month, I don't plan to do the 2.2 backport and plan to close this patch in a couple of days. Thanks all! 
---------------------------------------------------------------------- Comment By: Jeremy Hylton (jhylton) Date: 2002-08-01 12:59 Message: Logged In: YES user_id=31392 Looks good to me for 2.3 and 2.2. I see a 40% speedup on compilation of long source strings. ---------------------------------------------------------------------- Comment By: Andrew I MacIntyre (aimacintyre) Date: 2002-07-29 22:21 Message: Logged In: YES user_id=250749 Yes, test_longexp_fix.diff is no longer part of the patch set. Should I delete it? I must have missed your 2.2 backport commit message. I might also look at whether it can be backported to 2.1 without significant side effects. Thanks for your feedback too. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-07-29 15:52 Message: Logged In: YES user_id=31435 Reassigned to Jeremy because I'm "on vacation" this week, and Jeremy is most familiar w/ the parser code. Offhand the patches looked fine to me, provided that you no longer consider test_longexp_fix.diff to be part of the patch set. I backported the XXXROUNDUP changes to the 2.2 maintenance branch at the same time I changed it in the HEAD, so nothing left to do there on that count. Thanks for the great work! ---------------------------------------------------------------------- Comment By: Andrew I MacIntyre (aimacintyre) Date: 2002-07-29 07:47 Message: Logged In: YES user_id=250749 Tim, 1. any objections to the "final" patches? 2. do you see any reason not to backport your XXXROUNDUP change - it qualifies as a performance/behaviour bugfix IMO. 
---------------------------------------------------------------------- Comment By: Andrew I MacIntyre (aimacintyre) Date: 2002-07-21 09:16 Message: Logged In: YES user_id=250749 Ok, I've prepared patches to convert the following files to use PyMalloc for memory allocation: Parser/[acceler.c|node.c|parsetok.c] (pymalloc-parser.diff) Python/compile.c (pymalloc-compile.diff) I didn't bother with the other files in Parser/ as my malloc logging shows that they only ever appear to make requests > 256 bytes. I have attached/will attach a summary from my malloc logging experiments for information. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-07-15 14:14 Message: Logged In: YES user_id=31435 Thanks for the detailed followup, Andrew! I incorporated some of this info into XXXROUNDUP's comments. Without either patch, the system malloc has to do two miserable things: (1) find bigger and bigger memory areas very frequently; and, (2) interleaved with that, allocate gazillions of tiny blocks too. #2 makes it difficult for the platform malloc to find free space contiguous to the blocks allocated for #1, unless it arranges to move them to "the end" of memory, or into their own memory segments. As a result it's likely to do a copy on nearly every large-block realloc, and the code used to do a realloc on every 3rd new child. The XXXROUNDUP patch addressed #1 by asking to grow blocks much less frequently; PyMalloc addresses #2 by getting the tiny blocks out of the platform malloc's hair. If the platform malloc is saved from either one, its job becomes much easier. It would still be nice to switch the parser to using pymalloc. There are still disasters lurking, because some platform malloc packages appear to take quadratic time when *free*ing gazillions of tiny blocks (they thrash trying to coalesce them into larger contiguous free blocks). 
pymalloc doesn't try to coalesce free blocks, so is reliably immune to this disease.

----------------------------------------------------------------------

Comment By: Andrew I MacIntyre (aimacintyre)
Date: 2002-07-15 07:47

Message:
Logged In: YES
user_id=250749

To my surprise, Tim's checkin also works for the EMX port. I can only conclude that EMX's realloc() has a corner case tickled by test_longexp, that isn't hit with either the aggressive overallocation change or the PyMalloc change applied.

It is also interesting to note the performance impact of Tim's checkin, particularly on FreeBSD. Typical runtimes for "python -E -tt Lib/test/regrtest.py -l test_longexp" on my P5-166SMP test box (FreeBSD 4.4, gcc 2.95.3 -O3):

                          total   user    sys
baseline:                 39.1s   32.7s   6.3s
my patch:                 37.1s   30.3s   6.7s
Tim's checkin:             8.4s    7.8s   0.6s
my patch+Tim's checkin:    5.5s    4.9s   0.5s

These runs were with Library modules already compiled.

While Tim's comments about timing the regression test are noted, there are nonetheless consistent reductions in execution time of the regression test as well. Typical results on the same test box:

                          total   user    sys
baseline:                 1386s   1097s   89s
my patch:                 1350s   1065s   93s
Tim's checkin:            1265s   1003s   67s
my patch+Tim's checkin:   1230s    971s   65s

With the EMX port, the difference in timing between Tim's checkin and my patch is small, both for test_longexp and the regression test. There are noticeable gains for both test_longexp and the whole regression test with both changes in place, although not as significant as the FreeBSD results.

MAL's PyBench 1.0 exhibits negligible performance differences between the code states on both platforms, which is as I'd expect as it doesn't appear to test compile() or eval().

From the above, I conclude that Tim's patch gets the most bang for the buck, and that my patch (or its intent) be rejected unless someone thinks pursuing the PyMalloc changes to the parser worthwhile.

As an aside, I did a little research on the "XXX are those actually common?"
question Tim posed in the comment associated with his change: In running Lib/compileall.py against the Lib directory, 89% of PyMem_RESIZE() calls in AddChild() are the n=1 case, and 9% are rounded up to n=4.

----------------------------------------------------------------------

Comment By: Jack Jansen (jackjansen)
Date: 2002-07-08 06:09

Message:
Logged In: YES
user_id=45365

With Tim's mods test_import and test_longexp now work fine in MacPython. This is both with and without Andrew's patch.

Andrew, I'm assigning back to you, there's little more I can do with this patch. And you'll have to check if you still need it, or whether Tim's change to node.c is good enough for OS/2 as well.

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2002-07-08 02:38

Message:
Logged In: YES
user_id=31435

Jack, please do a cvs update and try this again. I checked in changes to PyNode_AddChild() that I expect will cure your particular woes here.

Andrew, PyMalloc was designed for oodles of small allocations. Feel encouraged to write a patch to change the compiler to use PyObject_{Malloc, Realloc, Free} instead. Then it will automatically exploit PyMalloc when the latter is enabled.

Note that the regression test suite incorporates random numbers in several tests, and in ways that can affect runtime. Small differences in aggregate test suite runtime are meaningless because of this.

----------------------------------------------------------------------

Comment By: Jack Jansen (jackjansen)
Date: 2002-07-07 17:24

Message:
Logged In: YES
user_id=45365

Unfortunately on the Mac it doesn't help anything for the test_longexp problem, nor for the similar test_import problem. The problem with MacPython's malloc seems to be that large reallocs cause the slowdown.
And the addchild() calls will continually realloc a block of memory to a slightly larger size (I gave up when it was about 800KB, after a minute or two, and growing at tens of KB per second). As soon as the block is larger than SMALL_REQUEST_THRESHOLD pymalloc will simply call the underlying system malloc/realloc.

----------------------------------------------------------------------

Comment By: Andrew I MacIntyre (aimacintyre)
Date: 2002-07-07 02:41

Message:
Logged In: YES
user_id=250749

Oops. On FreeBSD, test_longexp contributes 15% of the performance gain (not 25%) observed for the regression test with the patch applied.

Also, I would expect to make this a platform specific change if it's integrated, rather than a general change (unless that is seen as more appropriate).

----------------------------------------------------------------------

You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=578297&group_id=5470

From noreply@sourceforge.net Sun Aug 4 08:09:42 2002
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Sun, 04 Aug 2002 00:09:42 -0700
Subject: [Patches] [ python-Patches-555085 ] timeout socket implementation
Message-ID:

Patches item #555085, was opened at 2002-05-12 22:11
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=555085&group_id=5470

Category: Library (Lib)
Group: Python 2.3
Status: Open
Resolution: Accepted
Priority: 4
Submitted By: Michael Gilfix (mgilfix)
Assigned to: Guido van Rossum (gvanrossum)
Summary: timeout socket implementation

Initial Comment:
This implements bug #457114 and implements timed socket operations. If a timeout is set and the timeout period elapses before the socket operation has finished, a socket.error exception is thrown.

This patch integrates the functionality at two levels: the timeout capability is integrated at the C level in socketmodule.c.
socket.py was also modified to update fileobject creation on a Windows platform to handle the case of the underlying socket throwing an exception.

The TeX documentation was also updated and a new regression unit was provided as test_timeout.py.

----------------------------------------------------------------------

>Comment By: Andrew I MacIntyre (aimacintyre)
Date: 2002-08-04 17:09

Message:
Logged In: YES
user_id=250749

After discussing the OS/2 issues privately with Michael, the outstanding issues are resolved with the socketmodule.c and test_socket.py patches I've uploaded here.

socketmodule.c.nb-connect.diff:
In the non-blocking connect, OS/2 is returning EINPROGRESS from the initial connection attempt, and after the internal_select(), the subsequent connection attempt returns EISCONN. This appears to be perfectly legitimate, although FreeBSD and Linux haven't been seen to return the EINPROGRESS. The patch adds specific handling for the EISCONN after EINPROGRESS case, matching the semantics already in place for the Windows version of the code.

test_socket.py.sendall.diff:
The existing sendall() test is flawed as the recv() call makes no guarantees about waiting for all the data requested. OS/2 required a 100ms sleep in the recv loop to get all the data. Rewriting the receive test to allow for recv() not waiting for data still in transit is more correct.

Note that these interpretations of "correctness" have been based on FreeBSD manpages, which is the only sockets documentation I currently have.

If these are acceptable to Guido, and Michael gets to test them on Linux, I can relieve Guido of committing them and closing this patch.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2002-07-31 04:41

Message:
Logged In: YES
user_id=6380

Michael and Andrew, if you can deal with this without my involvement I would greatly appreciate it.
;-)

----------------------------------------------------------------------

Comment By: Michael Gilfix (mgilfix)
Date: 2002-07-31 00:25

Message:
Logged In: YES
user_id=116038

If Guido is busy (and I'm sure he is), I'd be willing to take a hack at the problem if you could email me privately and provide a testing environment (no OS/2 EMX in my apt ;) ).

----------------------------------------------------------------------

Comment By: Andrew I MacIntyre (aimacintyre)
Date: 2002-07-30 12:28

Message:
Logged In: YES
user_id=250749

In private mail to/from Guido, it appears that the FreeBSD issues were in test_socket.py, and have been addressed.

I still have outstanding issues on OS/2 EMX, which I sent to Guido privately but will add here as soon as I can.

----------------------------------------------------------------------

Comment By: Michael Gilfix (mgilfix)
Date: 2002-07-24 06:43

Message:
Logged In: YES
user_id=116038

Now that I'm back :) I checked the archive and this seems to have been handled by you. Please let me know if it isn't resolved and I can give it a closer look. Also, perhaps I should contact Bernie and ask him if there's anything he hasn't gotten around to in the test_timeout that I can off-load from him.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2002-07-19 03:11

Message:
Logged In: YES
user_id=6380

The default timeout is now implemented in CVS.

There's a bug report from Andrew MacIntyre (unfortunately on python-dev) about test_socket.py failures on FreeBSD. I'll try to keep an eye on that, so this patch *still* stays open.

Also, Bernie has promised some changes that I haven't received yet and the details of which I don't recall (sorry :-( ).

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2002-06-08 11:47

Message:
Logged In: YES
user_id=6380

Keeping this open as a reminder of things still to finish.
Most is in the python-dev discussion; Michael Gilfix and Bernard Yue have offered to produce more patches. One feature we definitely want is a way to specify a timeout to be applied to all new sockets.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2002-06-07 07:11

Message:
Logged In: YES
user_id=6380

Thanks for the new version! I've checked this in. I made considerable changes; the following is feedback but you don't need to respond because I've addressed all these in the checked-in code!

- Thanks for the cleanup of some non-standard formatting. However, it's better not to do this so the diffs don't show changes that are unrelated to the timeout patch.

- You are still importing the select module instead of calling select() directly. I really think you should do the latter -- the select module has an enormous overhead (it allocates several large lists on the heap).

- Instead of explicitly testing the argument to settimeout for being a float, int or long, you should simply call PyFloat_AsDouble and handle the error; if someone passes another object that implements __float__ that should be acceptable.

- gettimeout() returns sock_timeout without checking if it is NULL. It can be NULL when a socket object is never initialized. E.g. I can do this:

    >>> from socket import *
    >>> s = socket.__new__(socket)
    >>> s.gettimeout()

  which gives me a segfault. There are probably other places where this is assumed.

- I addressed the latter two issues by making sock_timeout a double, whose value is < 0.0 when no timeout is set.

----------------------------------------------------------------------

Comment By: Michael Gilfix (mgilfix)
Date: 2002-06-06 08:23

Message:
Logged In: YES
user_id=116038

I've addressed all the issues brought up by Guido. The 2nd version of the patch is attached here.
In this version, I've modified test_socket.py to include tests for the _fileobject class in socket.py that was modified by this patch. _fileobject needed to be modified so that data would not be lost when the underlying socket threw an exception (data was no longer accumulated in local variables). The tests for the _fileobject class succeed on older versions of python (tested 2.1.3) and pass on the newer version of python.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2002-05-24 06:18

Message:
Logged In: YES
user_id=6380

For a detailed review, see
http://mail.python.org/pipermail/python-dev/2002-May/024340.html

----------------------------------------------------------------------

You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=555085&group_id=5470

From noreply@sourceforge.net Sun Aug 4 09:37:00 2002
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Sun, 04 Aug 2002 01:37:00 -0700
Subject: [Patches] [ python-Patches-588564 ] _locale library patch
Message-ID:

Patches item #588564, was opened at 2002-07-30 15:58
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=588564&group_id=5470

Category: Distutils and setup.py
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Jason Tishler (jlt63)
Assigned to: Martin v. Löwis (loewis)
Summary: _locale library patch

Initial Comment:
This patch enables setup.py to find gettext routines when they are located in libintl instead of libc. Although I developed this patch for Cygwin, I hope that it can be easily updated to support other platforms (if necessary).

I tested this patch under Cygwin and Red Hat Linux 7.1.

----------------------------------------------------------------------

>Comment By: Martin v.
Löwis (loewis)
Date: 2002-08-04 10:37

Message:
Logged In: YES
user_id=21627

I would really prefer if such problems were solved in an autoconf-style approach:

- is libintl.h present (you may ask pyconfig.h for that)
- if so, is gettext provided by the C library
- if not, is it provided by -lintl
- if yes, add -lintl
- if no, print an error message and continue

----------------------------------------------------------------------

You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=588564&group_id=5470

From noreply@sourceforge.net Sun Aug 4 09:37:29 2002
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Sun, 04 Aug 2002 01:37:29 -0700
Subject: [Patches] [ python-Patches-588564 ] _locale library patch
Message-ID:

Patches item #588564, was opened at 2002-07-30 15:58
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=588564&group_id=5470

Category: Distutils and setup.py
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Jason Tishler (jlt63)
>Assigned to: Jason Tishler (jlt63)
Summary: _locale library patch

Initial Comment:
This patch enables setup.py to find gettext routines when they are located in libintl instead of libc. Although I developed this patch for Cygwin, I hope that it can be easily updated to support other platforms (if necessary).

I tested this patch under Cygwin and Red Hat Linux 7.1.

----------------------------------------------------------------------

Comment By: Martin v.
Löwis (loewis)
Date: 2002-08-04 10:37

Message:
Logged In: YES
user_id=21627

I would really prefer if such problems were solved in an autoconf-style approach:

- is libintl.h present (you may ask pyconfig.h for that)
- if so, is gettext provided by the C library
- if not, is it provided by -lintl
- if yes, add -lintl
- if no, print an error message and continue

----------------------------------------------------------------------

You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=588564&group_id=5470

From noreply@sourceforge.net Sun Aug 4 09:44:39 2002
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Sun, 04 Aug 2002 01:44:39 -0700
Subject: [Patches] [ python-Patches-588561 ] Cygwin _hotshot patch
Message-ID:

Patches item #588561, was opened at 2002-07-30 15:52
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=588561&group_id=5470

Category: Modules
Group: None
Status: Open
>Resolution: Accepted
Priority: 5
Submitted By: Jason Tishler (jlt63)
>Assigned to: Jason Tishler (jlt63)
Summary: Cygwin _hotshot patch

Initial Comment:
YA Cygwin module patch very similar to other patches that I have submitted.

I tested under Cygwin and Red Hat Linux 7.1.

----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2002-08-04 10:44

Message:
Logged In: YES
user_id=21627

This is ok, please apply it.
----------------------------------------------------------------------

You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=588561&group_id=5470

From noreply@sourceforge.net Sun Aug 4 09:54:19 2002
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Sun, 04 Aug 2002 01:54:19 -0700
Subject: [Patches] [ python-Patches-590682 ] New codecs: html, asciihtml
Message-ID:

Patches item #590682, was opened at 2002-08-04 06:58
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=590682&group_id=5470

Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Oren Tirosh (orenti)
Assigned to: Nobody/Anonymous (nobody)
Summary: New codecs: html, asciihtml

Initial Comment:
These codecs translate HTML character &entity; references.

The html codec may be applied after other codecs such as utf-8 or iso8859_X and preserves their encoding. The asciihtml encoder produces 7-bit ascii and its output is therefore safe for insertion into almost any document regardless of its encoding.

----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2002-08-04 10:54

Message:
Logged In: YES
user_id=21627

This patch is superseded by PEP 293 and patch #432401, which allow you to write

unitext.encode("ascii", errors = "xmlcharrefreplace")

This probably should be left open until PEP 293 is pronounced upon, and then either rejected or reviewed in detail.

I'd encourage a patch that uses Unicode in htmlentitydefs directly, and computes entitydefs from that, instead of vice-versa (or at least exposes a unicode_entitydefs, perhaps even lazily) - perhaps also with a reverse mapping.
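
For readers following along today: the error-handler approach named in the comment above exists in current Python 3 as the built-in "xmlcharrefreplace" handler. A minimal sketch, assuming Python 3 (which this thread predates); the sample string is made up for illustration:

```python
# Sample text with two non-ASCII characters (e-acute and the euro sign).
text = "caf\u00e9 costs \u20ac5"

# Characters that ASCII cannot represent become decimal character references.
encoded = text.encode("ascii", errors="xmlcharrefreplace")
print(encoded)  # b'caf&#233; costs &#8364;5'
```

Note that this only covers encoding; nothing here turns the references back into characters.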
----------------------------------------------------------------------

You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=590682&group_id=5470

From noreply@sourceforge.net Sun Aug 4 09:56:14 2002
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Sun, 04 Aug 2002 01:56:14 -0700
Subject: [Patches] [ python-Patches-590513 ] Add popen2 like functionality to pty.py.
Message-ID:

Patches item #590513, was opened at 2002-08-03 16:16
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=590513&group_id=5470

Category: Library (Lib)
Group: Python 2.2.x
Status: Open
Resolution: None
Priority: 5
Submitted By: Rasjid Wilcox (rasjidw)
Assigned to: Nobody/Anonymous (nobody)
Summary: Add popen2 like functionality to pty.py.

Initial Comment:
This patch adds a popen2 like function to pty.py. Due to use of os.execlp in pty.spawn, it is not quite the same, as all the arguments (including the command to be run) must be passed as a tuple.

It is only a first draft, and may need some more work, which I am willing to do if some direction is indicated.

Tested on Python2.2, under RedHat Linux 7.3.

Rasjid.

----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2002-08-04 10:56

Message:
Logged In: YES
user_id=21627

Can you please write documentation and a test case?
----------------------------------------------------------------------

You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=590513&group_id=5470

From noreply@sourceforge.net Sun Aug 4 09:59:00 2002
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Sun, 04 Aug 2002 01:59:00 -0700
Subject: [Patches] [ python-Patches-590119 ] types.BoolType
Message-ID:

Patches item #590119, was opened at 2002-08-02 15:26
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=590119&group_id=5470

Category: Library (Lib)
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Markus F.X.J. Oberhumer (mfx)
Assigned to: Nobody/Anonymous (nobody)
Summary: types.BoolType

Initial Comment:
I know that types is getting deprecated, but for orthogonality we really should have a BoolType.

Also, IMHO we should _not_ have a BooleanType (or DictionaryType), but that might break code.

Index: types.py
===================================================================
RCS file: /cvsroot/python/python/dist/src/Lib/types.py,v
retrieving revision 1.29
diff -u -r1.29 types.py
--- types.py 14 Jun 2002 20:41:13 -0000 1.29
+++ types.py 2 Aug 2002 13:22:22 -0000
@@ -16,7 +16,7 @@
 IntType = int
 LongType = long
 FloatType = float
-BooleanType = bool
+BoolType = BooleanType = bool
 try:
     ComplexType = complex
 except NameError:

----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2002-08-04 10:59

Message:
Logged In: YES
user_id=21627

What bug does this fix?
----------------------------------------------------------------------

You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=590119&group_id=5470

From noreply@sourceforge.net Sun Aug 4 10:05:15 2002
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Sun, 04 Aug 2002 02:05:15 -0700
Subject: [Patches] [ python-Patches-588809 ] LDFLAGS support for build_ext.py
Message-ID:

Patches item #588809, was opened at 2002-07-30 23:36
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=588809&group_id=5470

Category: Distutils and setup.py
Group: Python 2.2.x
Status: Open
Resolution: None
Priority: 5
Submitted By: Robert Weber (chipsforbrains)
Assigned to: Nobody/Anonymous (nobody)
Summary: LDFLAGS support for build_ext.py

Initial Comment:
a hack at best

----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2002-08-04 11:05

Message:
Logged In: YES
user_id=21627

As a hack, I think it is unacceptable for Python. I'd encourage you to integrate this (and CFLAGS) into sysconfig.customize_compiler. It would be ok if only the Unix compiler honors those settings for now.
----------------------------------------------------------------------

You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=588809&group_id=5470

From noreply@sourceforge.net Sun Aug 4 12:00:06 2002
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Sun, 04 Aug 2002 04:00:06 -0700
Subject: [Patches] [ python-Patches-590682 ] New codecs: html, asciihtml
Message-ID:

Patches item #590682, was opened at 2002-08-04 04:58
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=590682&group_id=5470

Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Oren Tirosh (orenti)
Assigned to: Nobody/Anonymous (nobody)
Summary: New codecs: html, asciihtml

Initial Comment:
These codecs translate HTML character &entity; references.

The html codec may be applied after other codecs such as utf-8 or iso8859_X and preserves their encoding. The asciihtml encoder produces 7-bit ascii and its output is therefore safe for insertion into almost any document regardless of its encoding.

----------------------------------------------------------------------

>Comment By: Oren Tirosh (orenti)
Date: 2002-08-04 11:00

Message:
Logged In: YES
user_id=562624

Yes, the error callback approach handles strange mixes better than my method of chaining codecs. But it only does encoding - this patch also provides full decoding of named, decimal and hexadecimal character entity references.

Assuming PEP 293 is accepted, I'd like to see the asciihtml codec stay for its decoding ability and renamed to xmlcharref. The encoding part of this codec can just call .encode("ascii", errors="xmlcharrefreplace") to make it a full two-way codec.

I'd prefer htmlentitydefs.py to use unicode, too. It's not so useful the way it is. Another problem is that it uses mixed case names as keys. The dictionary lookup is likely to miss incoming entities with arbitrary case so it's more-or-less broken.
Does anyone actually use it the way it is? Can it be changed to use unicode without breaking anyone's code?

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2002-08-04 08:54

Message:
Logged In: YES
user_id=21627

This patch is superseded by PEP 293 and patch #432401, which allow you to write

unitext.encode("ascii", errors = "xmlcharrefreplace")

This probably should be left open until PEP 293 is pronounced upon, and then either rejected or reviewed in detail.

I'd encourage a patch that uses Unicode in htmlentitydefs directly, and computes entitydefs from that, instead of vice-versa (or at least exposes a unicode_entitydefs, perhaps even lazily) - perhaps also with a reverse mapping.

----------------------------------------------------------------------

You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=590682&group_id=5470

From noreply@sourceforge.net Sun Aug 4 12:10:22 2002
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Sun, 04 Aug 2002 04:10:22 -0700
Subject: [Patches] [ python-Patches-590682 ] New codecs: html, asciihtml
Message-ID:

Patches item #590682, was opened at 2002-08-04 04:58
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=590682&group_id=5470

Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Oren Tirosh (orenti)
Assigned to: Nobody/Anonymous (nobody)
Summary: New codecs: html, asciihtml

Initial Comment:
These codecs translate HTML character &entity; references.

The html codec may be applied after other codecs such as utf-8 or iso8859_X and preserves their encoding. The asciihtml encoder produces 7-bit ascii and its output is therefore safe for insertion into almost any document regardless of its encoding.
----------------------------------------------------------------------

>Comment By: Oren Tirosh (orenti)
Date: 2002-08-04 11:10

Message:
Logged In: YES
user_id=562624

PEP 293 and patch #432401 are not a replacement for these codecs - this patch does decoding as well as encoding and also translates <, >, and & which are valid in all encodings and therefore won't get translated by error callbacks.

----------------------------------------------------------------------

Comment By: Oren Tirosh (orenti)
Date: 2002-08-04 11:00

Message:
Logged In: YES
user_id=562624

Yes, the error callback approach handles strange mixes better than my method of chaining codecs. But it only does encoding - this patch also provides full decoding of named, decimal and hexadecimal character entity references.

Assuming PEP 293 is accepted, I'd like to see the asciihtml codec stay for its decoding ability and renamed to xmlcharref. The encoding part of this codec can just call .encode("ascii", errors="xmlcharrefreplace") to make it a full two-way codec.

I'd prefer htmlentitydefs.py to use unicode, too. It's not so useful the way it is. Another problem is that it uses mixed case names as keys. The dictionary lookup is likely to miss incoming entities with arbitrary case so it's more-or-less broken. Does anyone actually use it the way it is? Can it be changed to use unicode without breaking anyone's code?

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2002-08-04 08:54

Message:
Logged In: YES
user_id=21627

This patch is superseded by PEP 293 and patch #432401, which allow you to write

unitext.encode("ascii", errors = "xmlcharrefreplace")

This probably should be left open until PEP 293 is pronounced upon, and then either rejected or reviewed in detail.
I'd encourage a patch that uses Unicode in htmlentitydefs directly, and computes entitydefs from that, instead of vice-versa (or at least exposes a unicode_entitydefs, perhaps even lazily) - perhaps also with a reverse mapping.

----------------------------------------------------------------------

You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=590682&group_id=5470

From noreply@sourceforge.net Sun Aug 4 12:50:02 2002
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Sun, 04 Aug 2002 04:50:02 -0700
Subject: [Patches] [ python-Patches-590682 ] New codecs: html, asciihtml
Message-ID:

Patches item #590682, was opened at 2002-08-04 06:58
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=590682&group_id=5470

Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Oren Tirosh (orenti)
Assigned to: Nobody/Anonymous (nobody)
Summary: New codecs: html, asciihtml

Initial Comment:
These codecs translate HTML character &entity; references.

The html codec may be applied after other codecs such as utf-8 or iso8859_X and preserves their encoding. The asciihtml encoder produces 7-bit ascii and its output is therefore safe for insertion into almost any document regardless of its encoding.

----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2002-08-04 13:50

Message:
Logged In: YES
user_id=21627

You can easily enough arrange to get errors on <, >, and &, by using codecs.charmap_encode with an appropriate encoding map. In fact, with that, you can easily get all entity references into the encoded data, without any need for an explicit iteration.

However, I am concerned that you offer decoding as well. People may be tricked into believing that they can decode arbitrary HTML with your codec - when your codec would incorrectly deal with CDATA sections.
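
The CDATA concern raised above can be illustrated with a small sketch. It uses html.unescape from the Python 3 standard library as a stand-in for "decode the whole document in one pass"; it is not the codec from this patch, and the sample document is invented.

```python
import html

# In XML/XHTML, text inside a CDATA section is literal: the five characters
# "&amp;" there are not a character reference.
document = "<p>AT&amp;T</p><![CDATA[AT&amp;T]]>"

# Decoding the whole document in one pass ignores that structure.
naive = html.unescape(document)
print(naive)  # <p>AT&T</p><![CDATA[AT&T]]>
```

The reference inside the paragraph is decoded as intended, but the literal text inside the CDATA section is altered as well, which is the kind of mistake a structure-blind decoder makes.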
----------------------------------------------------------------------

Comment By: Oren Tirosh (orenti)
Date: 2002-08-04 13:10

Message:
Logged In: YES
user_id=562624

PEP 293 and patch #432401 are not a replacement for these codecs - this patch does decoding as well as encoding and also translates <, >, and & which are valid in all encodings and therefore won't get translated by error callbacks.

----------------------------------------------------------------------

Comment By: Oren Tirosh (orenti)
Date: 2002-08-04 13:00

Message:
Logged In: YES
user_id=562624

Yes, the error callback approach handles strange mixes better than my method of chaining codecs. But it only does encoding - this patch also provides full decoding of named, decimal and hexadecimal character entity references.

Assuming PEP 293 is accepted, I'd like to see the asciihtml codec stay for its decoding ability and renamed to xmlcharref. The encoding part of this codec can just call .encode("ascii", errors="xmlcharrefreplace") to make it a full two-way codec.

I'd prefer htmlentitydefs.py to use unicode, too. It's not so useful the way it is. Another problem is that it uses mixed case names as keys. The dictionary lookup is likely to miss incoming entities with arbitrary case so it's more-or-less broken. Does anyone actually use it the way it is? Can it be changed to use unicode without breaking anyone's code?

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2002-08-04 10:54

Message:
Logged In: YES
user_id=21627

This patch is superseded by PEP 293 and patch #432401, which allow you to write

unitext.encode("ascii", errors = "xmlcharrefreplace")

This probably should be left open until PEP 293 is pronounced upon, and then either rejected or reviewed in detail.
I'd encourage a patch that uses Unicode in htmlentitydefs directly, and computes entitydefs from that, instead of vice-versa (or at least exposes a unicode_entitydefs, perhaps even lazily) - perhaps also with a reverse mapping.

----------------------------------------------------------------------

You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=590682&group_id=5470

From noreply@sourceforge.net Sun Aug 4 16:07:36 2002
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Sun, 04 Aug 2002 08:07:36 -0700
Subject: [Patches] [ python-Patches-590682 ] New codecs: html, asciihtml
Message-ID:

Patches item #590682, was opened at 2002-08-04 04:58
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=590682&group_id=5470

Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Oren Tirosh (orenti)
Assigned to: Nobody/Anonymous (nobody)
Summary: New codecs: html, asciihtml

Initial Comment:
These codecs translate HTML character &entity; references.

The html codec may be applied after other codecs such as utf-8 or iso8859_X and preserves their encoding. The asciihtml encoder produces 7-bit ascii and its output is therefore safe for insertion into almost any document regardless of its encoding.

----------------------------------------------------------------------

>Comment By: Oren Tirosh (orenti)
Date: 2002-08-04 15:07

Message:
Logged In: YES
user_id=562624

>People may be tricked into believing that they can
>decode arbitrary HTML with your codec - when your
>codec would incorrectly deal with CDATA sections.

You don't even need to go as far as CDATA to see that tags must be parsed first and only then tag bodies and attribute values can be individually decoded. If you do it in the reverse order the tag parser will try to parse a decoded &lt; as a tag. It should be documented, though.
For encoding it's also obvious that encoding must be done first and then the encoded strings can be inserted into tags - < in strings is encoded into &lt;, preventing it from being interpreted as a tag. This is a good thing! It prevents insertion attacks. > You can easily enough arrange to get errors on <, >, and &, by using codecs.charmap_encode with an appropriate encoding map. If you mean to use this as some internal implementation detail it's ok. Are you actually proposing that this is the way end users should use it? How about this: Install an encoder registry function that responds to any codec name matching "xmlcharref.SPAM" and does all the internal magic you describe to create a codec instance that combines xmlcharref translation including <,>,& and the SPAM encoding. This dynamically-generated codec will do both encoding and decoding and be cached, of course. "Namespaces are one honking great idea -- let's do more of those!" ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-08-04 11:50 Message: Logged In: YES user_id=21627 You can easily enough arrange to get errors on <, >, and &, by using codecs.charmap_encode with an appropriate encoding map. In fact, with that, you can easily get all entity references into the encoded data, without any need for an explicit iteration. However, I am concerned that you offer decoding as well. People may be tricked into believing that they can decode arbitrary HTML with your codec - when your codec would incorrectly deal with CDATA sections. ---------------------------------------------------------------------- Comment By: Oren Tirosh (orenti) Date: 2002-08-04 11:10 Message: Logged In: YES user_id=562624 PEP 293 and patch #432401 are not a replacement for these codecs: this patch does decoding as well as encoding, and it also translates <, >, and &, which are valid in all encodings and therefore won't get translated by error callbacks.
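[Editor's illustration, not part of the original thread] The point about error callbacks can be checked with a minimal sketch. It assumes a present-day Python 3 interpreter, where the PEP 293 "xmlcharrefreplace" handler is built in; html.escape and html.unescape are modern standard-library helpers used here as stand-ins for the escaping and decoding the patch proposes, not the patch's own code.

```python
import html

text = "caf\u00e9 <b> & co"

# The error handler only fires for characters the target encoding cannot
# represent, so '<', '>' and '&' pass through untouched.
encoded = text.encode("ascii", errors="xmlcharrefreplace")

# Escaping markup characters is a separate step, done before encoding.
escaped = html.escape(text, quote=False).encode("ascii", errors="xmlcharrefreplace")

# Decoding handles named, decimal and hexadecimal references alike.
decoded = html.unescape("&eacute; &#233; &#xe9; &lt;")

print(encoded)
print(escaped)
print(decoded)
```

As cautioned in the thread, the decoding step is only meaningful on text that has already been separated from markup by a tag parser.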
---------------------------------------------------------------------- Comment By: Oren Tirosh (orenti) Date: 2002-08-04 11:00 Message: Logged In: YES user_id=562624 Yes, the error callback approach handles strange mixes better than my method of chaining codecs. But it only does encoding - this patch also provides full decoding of named, decimal and hexadecimal character entity references. Assuming PEP 293 is accepted, I'd like to see the asciihtml codec stay for its decoding ability and renamed to xmlcharref. The encoding part of this codec can just call .encode("ascii", errors="xmlcharrefreplace") to make it a full two-way codec. I'd prefer htmlentitydefs.py to use unicode, too. It's not so useful the way it is. Another problem is that it uses mixed case names as keys. The dictionary lookup is likely to miss incoming entities with arbitrary case so it's more-or-less broken. Does anyone actually use it the way it is? Can it be changed to use unicode without breaking anyone's code? ---------------------------------------------------------------------- Comment By: Martin v. L�wis (loewis) Date: 2002-08-04 08:54 Message: Logged In: YES user_id=21627 This patch is superceded by PEP 293 and patch #432401, which allows you to write unitext.encode("ascii", errors = "xmlcharrefreplace") This probably should be left open until PEP 293 is pronounced upon, and then either rejected or reviewed in detail. I'd encourage a patch that uses Unicode in htmlentitydefs directly, and computes entitydefs from that, instead of vice-versa (or atleast exposes a unicode_entitydefs, perhaps even lazily) - perhaps also with a reverse mapping. 
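[Editor's illustration, not part of the original thread] A Unicode view of the entity table, with a reverse mapping, might look like the sketch below. It assumes modern Python 3, where htmlentitydefs lives on as html.entities and already exposes name2codepoint and codepoint2name; the names unicode_entitydefs and reverse_entitydefs are hypothetical.

```python
from html.entities import codepoint2name, name2codepoint

# Entity name -> Unicode character (hypothetical name taken from the thread).
unicode_entitydefs = {name: chr(cp) for name, cp in name2codepoint.items()}

# Unicode character -> entity name, the reverse mapping suggested above.
reverse_entitydefs = {chr(cp): name for cp, name in codepoint2name.items()}

print(unicode_entitydefs["eacute"], unicode_entitydefs["Eacute"])
print(reverse_entitydefs["\u00e9"])
```

Note that "eacute" and "Eacute" name different characters, so folding keys to one case would lose information.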
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=590682&group_id=5470 From noreply@sourceforge.net Sun Aug 4 16:45:36 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sun, 04 Aug 2002 08:45:36 -0700 Subject: [Patches] [ python-Patches-568348 ] Add param to email.Utils.decode() Message-ID: Patches item #568348, was opened at 2002-06-13 12:47 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=568348&group_id=5470 Category: Library (Lib) Group: None Status: Open Resolution: None Priority: 5 Submitted By: atsuo ishimoto (ishimoto) Assigned to: Barry A. Warsaw (bwarsaw) Summary: Add param to email.Utils.decode() Initial Comment: While email.Utils.decode() is quite a useful function, I have run into a real-world problem. Here in Japan, I receive a lot of RFC-hostile messages every day. Since they contain illegal characters that cannot be converted to Unicode by JapaneseCodecs, email.Utils.decode() chokes with UnicodeError. My solution is to add an optional 'errors' parameter, which is passed to the unicode() function. This allows me to replace illegal characters instead of abandoning the entire text. ---------------------------------------------------------------------- >Comment By: atsuo ishimoto (ishimoto) Date: 2002-08-05 00:45 Message: Logged In: YES user_id=463672 New email package looks nice. I don't need this patch anymore. ---------------------------------------------------------------------- Comment By: Barry A. Warsaw (bwarsaw) Date: 2002-07-23 04:55 Message: Logged In: YES user_id=12800 email.Utils.decode() is deprecated in favor of email.Header.decode_header(). Is this patch still worth it? I think email.Utils.decode() ought to go away.
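[Editor's illustration, not part of the original thread] The effect the submitter wanted, replacing undecodable bytes rather than failing, can be sketched on top of decode_header. This assumes modern Python 3, where the module is spelled email.header; decode_header_lenient and its fallback parameter are hypothetical names, not part of the email package.

```python
from email.header import decode_header


def decode_header_lenient(value, fallback="ascii"):
    """Decode an RFC 2047 header, replacing bytes that cannot be decoded."""
    parts = []
    for chunk, charset in decode_header(value):
        if isinstance(chunk, bytes):
            # errors="replace" swaps illegal bytes for U+FFFD instead of raising.
            chunk = chunk.decode(charset or fallback, errors="replace")
        parts.append(chunk)
    return "".join(parts)


print(decode_header_lenient("=?iso-8859-1?q?caf=E9?="))
print(decode_header_lenient("=?utf-8?q?bad=FF?="))
```

An unknown charset name would still raise LookupError in this sketch; handling that is left out for brevity.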
---------------------------------------------------------------------- Comment By: Gerhard Häring (ghaering) Date: 2002-06-21 19:45 Message: Logged In: YES user_id=163326 I'd recommend assigning this patch to Barry Warsaw (bwarsaw), who is the maintainer of the email module. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=568348&group_id=5470 From noreply@sourceforge.net Sun Aug 4 16:48:13 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sun, 04 Aug 2002 08:48:13 -0700 Subject: [Patches] [ python-Patches-568348 ] Add param to email.Utils.decode() Message-ID: Patches item #568348, was opened at 2002-06-12 23:47 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=568348&group_id=5470 Category: Library (Lib) Group: None >Status: Closed >Resolution: Out of Date Priority: 5 Submitted By: atsuo ishimoto (ishimoto) Assigned to: Barry A. Warsaw (bwarsaw) Summary: Add param to email.Utils.decode() Initial Comment: While email.Utils.decode() is quite a useful function, I have run into a real-world problem. Here in Japan, I receive a lot of RFC-hostile messages every day. Since they contain illegal characters that cannot be converted to Unicode by JapaneseCodecs, email.Utils.decode() chokes with UnicodeError. My solution is to add an optional 'errors' parameter, which is passed to the unicode() function. This allows me to replace illegal characters instead of abandoning the entire text. ---------------------------------------------------------------------- >Comment By: Barry A. Warsaw (bwarsaw) Date: 2002-08-04 11:48 Message: Logged In: YES user_id=12800 Cool, thanks. I'm closing this patch. ---------------------------------------------------------------------- Comment By: atsuo ishimoto (ishimoto) Date: 2002-08-04 11:45 Message: Logged In: YES user_id=463672 New email package looks nice. I don't need this patch anymore.
---------------------------------------------------------------------- Comment By: Barry A. Warsaw (bwarsaw) Date: 2002-07-22 15:55 Message: Logged In: YES user_id=12800 email.Utils.decode() is deprecated in favor of email.Header.decode_header(). Is this patch still worth it? I think email.Utils.decode() ought to go away. ---------------------------------------------------------------------- Comment By: Gerhard H�ring (ghaering) Date: 2002-06-21 06:45 Message: Logged In: YES user_id=163326 I'd recommend to assign this patch to Barry Warsaw (bwarsaw), who is the maintainer of the email module. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=568348&group_id=5470 From noreply@sourceforge.net Sun Aug 4 16:54:05 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sun, 04 Aug 2002 08:54:05 -0700 Subject: [Patches] [ python-Patches-590682 ] New codecs: html, asciihtml Message-ID: Patches item #590682, was opened at 2002-08-04 06:58 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=590682&group_id=5470 Category: None Group: None Status: Open Resolution: None Priority: 5 Submitted By: Oren Tirosh (orenti) Assigned to: Nobody/Anonymous (nobody) Summary: New codecs: html, asciihtml Initial Comment: These codecs translate HTML character &entity; references. The html codec may be applied after other codecs such as utf-8 or iso8859_X and preserves their encoding. The asciihtml encoder produces 7-bit ascii and its output is therefore safe for insertion into almost any document regardless of its encoding. ---------------------------------------------------------------------- >Comment By: Martin v. 
Löwis (loewis) Date: 2002-08-04 17:54 Message: Logged In: YES user_id=21627 I'm in favour of exposing this via a search function, for generated codec names, on top of PEP 293 (I would not like your codec to compete with the alternative mechanism). My dislike for the current patch also comes from the fact that it singles out ASCII, which the search function would not. You could implement two forms: html.codecname and xml.codecname. The html form would do HTML entity references in both directions, and fall back to character references only if necessary; the XML form would use character references all the time, and entity references only for the builtin entities. And yes, I do recommend that users use codecs.charmap_encode directly, as this is probably the most efficient, yet most compact way to convert Unicode to a less-than-7-bit form. In any case, I'd encourage you to contribute to the progress of PEP 293 first - this has been an issue for several years now, and I would be sorry if it failed. While you are waiting for PEP 293 to complete, please do consider cleaning up htmlentitydefs to provide mappings from and to Unicode characters. ---------------------------------------------------------------------- Comment By: Oren Tirosh (orenti) Date: 2002-08-04 17:07 Message: Logged In: YES user_id=562624 > People may be tricked into believing that they can decode arbitrary HTML with your codec - when your codec would incorrectly deal with CDATA sections. You don't even need to go as far as CDATA to see that tags must be parsed first and only then tag bodies and attribute values can be individually decoded. If you do it in the reverse order the tag parser will try to parse &lt; as a tag. It should be documented, though. For encoding it's also obvious that encoding must be done first and then the encoded strings can be inserted into tags - < in strings is encoded into &lt;, preventing it from being interpreted as a tag. This is a good thing! It prevents insertion attacks.
> You can easily enough arrange to get errors on <, >, > and &, by using codecs.charmap_encode with an > appropriate encoding map. If you mean to use this as some internal implementation detail it's ok. Are actually proposing that this is the way end users should use it? How about this: Install an encoder registry function that responds to any codec name matching "xmlcharref.SPAM" and does all the internal magic you describe to create a codec instance that combines xmlcharref translation including <,>,& and the SPAM encoding. This dynamically-generated codec will do both encoding and decoding and be cached, of course. "Namespaces are one honking great idea -- let's do more of those!" ---------------------------------------------------------------------- Comment By: Martin v. L�wis (loewis) Date: 2002-08-04 13:50 Message: Logged In: YES user_id=21627 You can easily enough arrange to get errors on <, >, amd &, by using codecs.charmap_encode with an appropriate encoding map. Infact, with that, you can easily get all entity refereces into the encoded data, without any need for an explicit iteration. However, I am concerned that you offer decoding as well. People may be tricked into believing that they can decode arbitrrary HTML with your codec - when your codec would incorrectly deal with CDATA sections. ---------------------------------------------------------------------- Comment By: Oren Tirosh (orenti) Date: 2002-08-04 13:10 Message: Logged In: YES user_id=562624 PEP 293 and patch #432401 are not a replacement for these codecs - it does decoding as well as encoding and also translates <, >, and & which are valid in all encodings and therefore won't get translated by error callbacks. ---------------------------------------------------------------------- Comment By: Oren Tirosh (orenti) Date: 2002-08-04 13:00 Message: Logged In: YES user_id=562624 Yes, the error callback approach handles strange mixes better than my method of chaining codecs. 
But it only does encoding - this patch also provides full decoding of named, decimal and hexadecimal character entity references. Assuming PEP 293 is accepted, I'd like to see the asciihtml codec stay for its decoding ability and renamed to xmlcharref. The encoding part of this codec can just call .encode("ascii", errors="xmlcharrefreplace") to make it a full two-way codec. I'd prefer htmlentitydefs.py to use unicode, too. It's not so useful the way it is. Another problem is that it uses mixed case names as keys. The dictionary lookup is likely to miss incoming entities with arbitrary case so it's more-or-less broken. Does anyone actually use it the way it is? Can it be changed to use unicode without breaking anyone's code? ---------------------------------------------------------------------- Comment By: Martin v. L�wis (loewis) Date: 2002-08-04 10:54 Message: Logged In: YES user_id=21627 This patch is superceded by PEP 293 and patch #432401, which allows you to write unitext.encode("ascii", errors = "xmlcharrefreplace") This probably should be left open until PEP 293 is pronounced upon, and then either rejected or reviewed in detail. I'd encourage a patch that uses Unicode in htmlentitydefs directly, and computes entitydefs from that, instead of vice-versa (or atleast exposes a unicode_entitydefs, perhaps even lazily) - perhaps also with a reverse mapping. 
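[Editor's illustration, not part of the original thread] The "xmlcharref.SPAM" search-function idea floated above could be sketched roughly as follows. This is an assumption-laden sketch for modern Python 3, not the submitted patch: xmlcharref_search is a hypothetical name, html.escape and html.unescape stand in for the patch's entity handling, and codecs.register, codecs.lookup and codecs.CodecInfo are the only registry calls relied upon.

```python
import codecs
import html
import re

# lookup() normalizes codec names before calling search functions, so accept
# either '.' or '_' as the separator after the prefix.
_NAME = re.compile(r"xmlcharref[._](?P<base>.+)")


def xmlcharref_search(name):
    """Hypothetical search function for names such as 'xmlcharref.ascii'."""
    match = _NAME.fullmatch(name)
    if match is None:
        return None
    base = match.group("base")
    try:
        codecs.lookup(base)  # make sure the underlying encoding exists
    except LookupError:
        return None

    def encode(text, errors="strict"):
        # Escape <, > and & first, then let the PEP 293 handler do the rest.
        escaped = html.escape(text, quote=False)
        return escaped.encode(base, "xmlcharrefreplace"), len(text)

    def decode(data, errors="strict"):
        raw = bytes(data)  # the input may arrive as a memoryview
        return html.unescape(raw.decode(base, errors)), len(raw)

    return codecs.CodecInfo(encode=encode, decode=decode, name=name)


codecs.register(xmlcharref_search)

print("a<\u00e9".encode("xmlcharref.ascii"))
print(b"a&lt;&#233;".decode("xmlcharref.ascii"))
```

The codec registry caches successful lookups, which matches the "cached, of course" remark. The decode half carries the caveat raised in the thread: it must only be applied to text already extracted by a tag parser.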
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=590682&group_id=5470 From noreply@sourceforge.net Sun Aug 4 17:59:01 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sun, 04 Aug 2002 09:59:01 -0700 Subject: [Patches] [ python-Patches-526840 ] PEP 263 Implementation Message-ID: Patches item #526840, was opened at 2002-03-07 09:55 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=526840&group_id=5470 Category: Parser/Compiler Group: Python 2.3 >Status: Closed >Resolution: Out of Date Priority: 7 Submitted By: Martin v. L�wis (loewis) Assigned to: M.-A. Lemburg (lemburg) Summary: PEP 263 Implementation Initial Comment: The attached patch implements PEP 263. The following differences to the PEP (rev. 1.8) are known: - The implementation interprets "ASCII compatible" as meaning "bytes below 128 always denote ASCII characters", although this property is only used for ",', and \. There have been other readings of "ASCII compatible", so this should probably be elaborated in the PEP. - The check whether all bytes follow the declared or system encoding (including comments and string literals) is only performed if the encoding is "ascii". ---------------------------------------------------------------------- >Comment By: Martin v. L�wis (loewis) Date: 2002-08-04 18:59 Message: Logged In: YES user_id=21627 This patch has been superceded by 534304. ---------------------------------------------------------------------- Comment By: M.-A. Lemburg (lemburg) Date: 2002-04-11 18:23 Message: Logged In: YES user_id=38388 Apart from the codec changes, the patch looks ok. I would still like two APIs for the two different codec tasks, though. I don't expect anything much to change in the codecs, so maintenance is not an issue. ---------------------------------------------------------------------- Comment By: Martin v. 
L�wis (loewis) Date: 2002-03-21 11:25 Message: Logged In: YES user_id=21627 Version 2 of this patch implements revision 1.11 of the PEP (phase 1). The check of the complete source file for compliance with the declared encoding is implemented by decoding the input line-by-line; I believe that for all supported encodings, this is not different compared to decoding the entire source file at once. ---------------------------------------------------------------------- Comment By: Martin v. L�wis (loewis) Date: 2002-03-07 19:24 Message: Logged In: YES user_id=21627 Changing the decoding functions will not result in one additional function, but in two of them: you'll also get PyUnicode_DecodeRawUnicodeEscapeFromUnicode. That seems quite unmaintainable to me: any change now needs to propagate into four functions. OTOH, I don't think that the code that allows parsing a variable-sized strings is overly complicated. ---------------------------------------------------------------------- Comment By: M.-A. Lemburg (lemburg) Date: 2002-03-07 19:01 Message: Logged In: YES user_id=38388 Ok, I've had a look at the patch. It looks good except for the overly complicated implementation of the unicode-escape codec. Even though there's a bit of code duplication, I'd prefer to have two separate functions here: one for the standard char* pointer type and another one for Py_UNICODE*, ie. PyUnicode_DecodeUnicodeEscape(char*...) and PyUnicode_DecodeUnicodeEscapeFromUnicode(Py_UNICODE*...) This is easier to support and gives better performance since the compiler can optimize the two functions making different assumptions. You'll also need to include a name mangling at the top of the header for the new API. 
---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-03-07 15:06 Message: Logged In: YES user_id=6380 I've set the group to Python 2.3 so the priority has some context (I'd rather you move the priority down to 5 but I understand this is your personal priority). I haven't accepted the PEP yet (although I expect I will), so please don't check this in yet (if you feel it needs to be saved in CVS, use a branch). ---------------------------------------------------------------------- Comment By: M.-A. Lemburg (lemburg) Date: 2002-03-07 12:06 Message: Logged In: YES user_id=38388 Thank you ! I'll add a note to the PEP about the way the first two lines are processed (removing the ASCII mention...). ---------------------------------------------------------------------- Comment By: Martin v. L�wis (loewis) Date: 2002-03-07 10:11 Message: Logged In: YES user_id=21627 A note on the implementation strategy: it turned out that communicating the encoding into the abstract syntax was the biggest challenge. To solve this, I introduced encoding_decl pseudo node: it is an unused non-terminal whose STR() is the encoding, and whose only child is the true root of the syntax tree. As such, it is the only non-terminal which has a STR value. 
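[Editor's illustration, not part of the original thread] For readers unfamiliar with PEP 263, the declaration that the parser has to discover can be observed from Python itself. The sketch assumes modern Python 3 and uses tokenize.detect_encoding, which is a later standard-library helper and not the encoding_decl mechanism described in this patch.

```python
import io
import tokenize

# A source file whose first line carries a PEP 263 coding declaration.
source = b"# -*- coding: iso-8859-1 -*-\nname = 'L\xf6wis'\n"

# detect_encoding() inspects at most the first two lines of the file.
encoding, _ = tokenize.detect_encoding(io.BytesIO(source).readline)
text = source.decode(encoding)

# Without a declaration (and without a BOM) the default is reported.
default, _ = tokenize.detect_encoding(io.BytesIO(b"x = 1\n").readline)

print(encoding, default)
print(text)
```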
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=526840&group_id=5470 From noreply@sourceforge.net Sun Aug 4 18:42:29 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sun, 04 Aug 2002 10:42:29 -0700 Subject: [Patches] [ python-Patches-534304 ] PEP 263 Implementation Message-ID: Patches item #534304, was opened at 2002-03-24 14:52 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=534304&group_id=5470 Category: Parser/Compiler Group: Python 2.3 >Status: Closed >Resolution: Accepted Priority: 5 Submitted By: SUZUKI Hisao (suzuki_hisao) Assigned to: Nobody/Anonymous (nobody) Summary: PEP 263 Implementation Initial Comment: This is a sample implementation of PEP 263 phase 2. This implementation behaves just as normal Python does if no other coding hints are given. Thus it does not hurt anyone who uses Python now. Note that it is strictly compatible with the PEP in that every program valid in the PEP is also valid in this implementation. This implementation also accepts files in UTF-16 with BOM. They are read as UTF-8 internally. Please try "utf16sample.py" included. ---------------------------------------------------------------------- >Comment By: Martin v. L�wis (loewis) Date: 2002-08-04 19:42 Message: Logged In: YES user_id=21627 I have now implemented Neal's suggestions: I documented the processing of encodings, and changed a number of formatting problems. I have disabled the detection of UTF-16 BOMs, since they are not backed by the PEP. I have committed the changes as Makefile.pre.in 1.93 ref2.tex 1.38 Grammar 1.48 errcode.h 2.15 graminit.h 2.20 NEWS 1.451 parsetok.c 2.33 tokenizer.c 2.55 tokenizer.h 2.17 tokenizer_pgen.c 2.1 compile.c 2.250 graminit.c 2.34 pythonrun.c 2.165 The change to bltinmodule.c was there by mistake, so I have removed that change. 
SUZUKI Hisao, how would you like to be listed in Misc/ACKS? ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-07-16 20:50 Message: Logged In: YES user_id=33168 I reviewed the patch. I don't like the usage of enc (and str to a lesser extent). In particular, there is an encoding field which is generally used. enc is used as a temporary from the callback. I don't have a solution, so perhaps it would be best to document the purpose, usage and interaction of enc & str. There are some differences between the standard formatting and that used in the patch (return on the same line as if, among others). But these aren't too bad. Although I don't love the line do t++; while (...);. I didn't see any problems with the patch. ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-05-09 15:42 Message: Logged In: YES user_id=21627 I have now updated this patch to the current CVS, and to be a complete PEP 263 implementation; it will issue warnings when it finds non-ASCII characters but no encoding declaration. ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-04-26 21:41 Message: Logged In: YES user_id=21627 I've updated the PEP to describe how this approach should be used: Python 2.3 still should generate warnings only for using non-ASCII without declared encoding. I, too, hope that Mr Suzuki will update the patch to match the PEP, and for the CVS tree. As for supporting UTF-16: The stream reader currently has the .readline method disabled, since it won't work reliably for little-endian. So I think this should be an undocumented feature at the moment; I see no other technical problems with the approach taken in the patch.
---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-23 23:26 Message: Logged In: YES user_id=6380 I haven't looked at this very carefully, but it looks like it's well thought-out. Suzuki, can you prepare a patch relative to current CVS? I get several patch failures now. (Fortunately I have a checkout of 2.2 so I can still review and test the patch.) I don't know what the patch failures are about (haven't investigated) but imagine it might have to do with the PEP 279 (universal newlines) changes checked in by Jack Jansen, which replaces the tokenizer's fgets() calls with calls to Py_UniversalNewlineFgets(). Also, I can't read the README file (it's in Japanese :-). What is the expected output from the samples? For me, sjis_sample.py gives SyntaxError: 'unknown encoding' Martin, I'm unclear of how you intend to use this code. Do you intend to go straight to phase 2 of the PEP using this patch? Or do you intend to implement phase 1 of the PEP by modifying this code? Also, does the PEP describe the UTF-16 support as implemented by Suziki's patch? ---------------------------------------------------------------------- Comment By: SUZUKI Hisao (suzuki_hisao) Date: 2002-03-31 18:16 Message: Logged In: YES user_id=495142 Thank you for your review. Now 1. and 3. are fixed, and 2. is improved. (4. is not true.) ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-03-30 12:27 Message: Logged In: YES user_id=6656 Not going into 2.2.x. ---------------------------------------------------------------------- Comment By: Martin v. L�wis (loewis) Date: 2002-03-25 14:23 Message: Logged In: YES user_id=21627 The patch looks good, but needs a number of improvements. 1. I have problems building this code. 
When trying to build pgen, I get an error message of Parser/parsetok.c: In function `parsetok': Parser/parsetok.c:175: `encoding_decl' undeclared The problem here is that graminit.h hasn't been built yet, but parsetok refers to the symbol. 2. For some reason, error printing for incorrect encodings does not work - it appears that it prints the wrong line in the traceback. 3. The escape processing in Unicode literals is incorrect. For example, u"\" should denote only the non-ascii character. However, your implementation replaces the non-ASCII character with \u, resulting in \u, so the first backslash unescapes the second one. 4. I believe the escape processing in byte strings is also incorrect for encodings that allow \ in the second byte. Before processing escape characters, you convert back into the source encoding. If this produces a backslash character, escape processing will misinterpret that byte as an escape character. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=534304&group_id=5470 From noreply@sourceforge.net Sun Aug 4 20:22:19 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sun, 04 Aug 2002 12:22:19 -0700 Subject: [Patches] [ python-Patches-590843 ] list sort perf boost Message-ID: Patches item #590843, was opened at 2002-08-04 15:22 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=590843&group_id=5470 Category: Core (C code) Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Neal Norwitz (nnorwitz) Assigned to: Tim Peters (tim_one) Summary: list sort perf boost Initial Comment: I don't really love this patch--the names suck. There are also 2 warnings. The warnings can be removed by casting, but I wanted to see if anybody had better ideas. However, I can't let Tim have all the fun optimizing the hell out of list sort. 
This patch gets rid of COMPARE == NULL in the ISLT macro. A structure is created which holds both the cmp function and PyObject/Py_LT. I get a speedup of roughly 1-3%. Using the largest #s, 7.67 -> 7.59 for *sort. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=590843&group_id=5470 From noreply@sourceforge.net Sun Aug 4 20:50:29 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sun, 04 Aug 2002 12:50:29 -0700 Subject: [Patches] [ python-Patches-587993 ] alternative SET_LINENO killer Message-ID: Patches item #587993, was opened at 2002-07-29 07:27 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=587993&group_id=5470 Category: Core (C code) Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Michael Hudson (mwh) Assigned to: Nobody/Anonymous (nobody) Summary: alternative SET_LINENO killer Initial Comment: This patch is a proof-of-concept of another way to remove the SET_LINENO patch (as opposed to Vladimir's ancient one). Instead of rewriting bytecode (ick!) we poke into the c_lnotab to see if we've moved onto a different line. The c_lnotab is not the most transparent of data structures, it has to be said. I'm not sure this patch is 100% correct -- but I think the idea can definitely fly. There will be some more overhead to tracing than before, but I hope not too much. I haven't tested these aspects. Comments welcome! ---------------------------------------------------------------------- >Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-04 15:50 Message: Logged In: YES user_id=33168 Attached is a patch to test_hotshot which is necessary because the duplicate line #s at the beginning of the function no longer exist. With this patch test_hotshot passes.
---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-03 16:06 Message: Logged In: YES user_id=33168 Overall, the patch looks very good. It would be nice to check in some of the changes separate from this patch: * com_addoparg(SET_LINENO) -> com_set_lineno() * the updated comments and whitespace * pystone * dis.py, disassemble(), perhaps * RETURN_NONE, perhaps Here are other minor comments: * Need to add RETURN_NONE opcode into dis.py * Need to add doc for RETURN_NONE in libdis.tex * It would be nice if the code from inspect.getlineno could be reused in disassemble() (not sure this can be done, though--I saw at least 4 variations of this code, 2 in Python and 2 in C) * Should frame.f_lineno be initialized to 0, -1 or something invalid? * scripts/trace.py reimplements getting the lineno, why not reuse code? * docstring in scripts/trace._find_LINENO_from_code needs to remove ref to SET_LINENO ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-03 11:52 Message: Logged In: YES user_id=6656 Update. New this time: - adds RETURN_NONE opcode for the epilogue. - don't call line trace events on it. - bumps MAGIC this is the output of "cvs diff" at the top level, so it's all in there.
---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-02 09:34 Message: Logged In: YES user_id=6656 Here's an update (just for ceval.c): - moves tracing code out of line - more expository comments - don't cause line trace events in the function epilogue ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-01 11:43 Message: Logged In: YES user_id=6656 Hang on, lets get all the doc changes: Doc/lib/libdis.tex Doc/tut/tut.tex Doc/whatsnew/whatsnew23.tex Misc/NEWS ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-01 11:41 Message: Logged In: YES user_id=6656 And finally, docs. This touches: Doc/lib/libdis.tex Doc/tut/tut.tex ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-01 11:40 Message: Logged In: YES user_id=6656 Here's the "boring" patch, more mechanical stuff. It touches: Include/opcode.h Lib/traceback.py Modules/_hotshot.c Python/frozen.c Python/traceback.c This should still be reviewed, of course, but I really shouldn't have messed any of this up. ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-01 11:38 Message: Logged In: YES user_id=6656 Yeah, the obvious response to that was "delete your .pycs"... Right, I'm going to upload my latest, and hopefully final patch in three bits. First, I'm attaching the "interesting" patch. This touches: Objects/frameobject.c -- adds f_lineno descr Python/ceval.c -- the obvious Python/compile.c -- don't emit SET_LINENO Lib/dis.py -- use c_lnotab for line numbers Tools/scripts/trace.py -- ditto This is the most subtle patch, and the one that I'd most like review on. 
---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-07-30 18:09 Message: Logged In: YES user_id=6380 Never mind the errors, I hadn't done a cvs update in weeks on the machine where I tested it. :-( ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-07-30 17:54 Message: Logged In: YES user_id=6380 Cool idea, but I get "unknown opcode" errors... Keep me posted though! (I will now see any changes to this patch item.) ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-07-30 05:59 Message: Logged In: YES user_id=6656 Guido should see this, assuming he still isn't subscribed to patches@python.org. ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-07-30 05:58 Message: Logged In: YES user_id=6656 I worked out why some of the code in ceval.c wasn't making sense to me -- it didn't make sense, period. I've also fixed a number of silly and not so silly bugs in my patch. I'm now 99% certain this idea can fly. The patch isn't *finished* but the hard bit is done, IMHO. There are some other points to make, but I think I'll raise them on python-dev. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-07-29 16:18 Message: Logged In: YES user_id=31435 Dropping out of "vacation mode" long enough to say "mondo cool!" and encourage this. Guido may not agree, but I also encourage you to redefine c_lnotab if it can make life easier and quicker here. That subtle compression scheme has been the source of several nasty bugs, both in the core C code and in Jeremy's compiler pkg (cut 'n paste bugs abound here, because few people understand what's really needed, so flawed code gets copied with little thought). 
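For readers unfamiliar with the line-number table mentioned above, here is a minimal sketch of how an offset-to-line lookup over such a table can work. It assumes the classic layout of unsigned (address increment, line increment) byte pairs used by CPython of this era; the function name addr_to_line and the sample table are hypothetical, made up for illustration, and this is not the code from the patch. Recent CPython versions store line information in a different format.

```python
def addr_to_line(lnotab, first_line, addr):
    """Map a bytecode offset to a source line.

    Assumes `lnotab` is a bytes object of (addr_incr, line_incr) pairs,
    both unsigned, as in the classic table layout. Illustration only.
    """
    line = first_line
    cur = 0
    for i in range(0, len(lnotab), 2):
        cur += lnotab[i]          # offset at which the next line starts
        if cur > addr:            # target offset lies before that start
            break
        line += lnotab[i + 1]     # advance to the next recorded line
    return line


# Hypothetical table: line 10 starts at offset 0, line 11 at offset 6,
# line 13 at offset 50.
table = bytes([6, 1, 44, 2])

for offset in (0, 5, 6, 49, 50):
    print(offset, addr_to_line(table, 10, offset))
```

The lookup is a linear scan, which is why the comments in this thread worry about calling it on every trip around the interpreter loop and prefer caching the lower and upper offset bounds of the current line.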
---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-07-29 11:34 Message: Logged In: YES user_id=6656 Uhh, the instr_[lu]b variables need to keep their values around the loop; otherwise we might just as well call PyCode_Addr2Line each time around. I have another version of the patch that does that, but I assumed the overhead of doing so was deemed too high, or someone else would have done this by now. It's certainly easier. Glad to hear it doesn't affect python -O too much. I was doing this away from the internet and forgot to keep a clean copy of the source around for doing comparisons with... ---------------------------------------------------------------------- Comment By: Neil Schemenauer (nascheme) Date: 2002-07-29 11:18 Message: Logged In: YES user_id=35752 Moving the "int io, instr_ub = -1, instr_lb = 0;" declaration and the "io = INSTR_OFFSET();" statement below the "if (tstate->c_tracefunc ...)" test gives a small speedup on my machine and is a little neater, IMHO. I was worried that this would slow down "python -O". That doesn't seem to be the case (at least I can't measure it). Well done. 
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=587993&group_id=5470 From noreply@sourceforge.net Sun Aug 4 22:04:04 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sun, 04 Aug 2002 14:04:04 -0700 Subject: [Patches] [ python-Patches-506436 ] GETCONST/GETNAME/GETNAMEV speedup Message-ID: Patches item #506436, was opened at 2002-01-21 07:39 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=506436&group_id=5470 Category: Core (C code) Group: Python 2.3 >Status: Closed Resolution: Accepted Priority: 5 Submitted By: Skip Montanaro (montanaro) Assigned to: Skip Montanaro (montanaro) Summary: GETCONST/GETNAME/GETNAMEV speedup Initial Comment: The attached patch redefines the GETCONST, GETNAME & GETNAMEV macros to do the following: * access the code object's consts and names through local variables instead of the long chain from f * use access macros to index the tuples and get the C string names The code appears correct, and I've had no trouble with it. It only provides the most trivial of improvement on pystone (around 1% when I see anything), but it's all those little things that add up, right? Skip ---------------------------------------------------------------------- >Comment By: Skip Montanaro (montanaro) Date: 2002-08-04 16:04 Message: Logged In: YES user_id=44345 checked in as ceval 2.231 ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-08-02 20:20 Message: Logged In: YES user_id=31435 Cool -- check it in! ---------------------------------------------------------------------- Comment By: Skip Montanaro (montanaro) Date: 2002-08-02 11:58 Message: Logged In: YES user_id=44345 Here's an updated patch. Back to Tim... 
---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-07-23 17:04 Message: Logged In: YES user_id=31435 Marked Out-of-Date and back to Skip. Sorry for the delay! The idea is fine. I'd rather you use the current GETITEM macro, which does bounds-checking in a debug build. I note too that GETCONST is only used once, and that use may as well be a direct GETITEM(consts, i) invocation, and skip the macro. Note that the GETNAME() macro no longer exists. ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-07-22 22:34 Message: Logged In: YES user_id=33168 Skip, I modified this code some, but your technique is still valid. I got rid of one of the indirections already. The patch can easily be updated. Seems like the patch shouldn't hurt. Tim? ---------------------------------------------------------------------- Comment By: Skip Montanaro (montanaro) Date: 2002-07-09 18:45 Message: Logged In: YES user_id=44345 Looking for a vote up or down on this one... ---------------------------------------------------------------------- Comment By: Skip Montanaro (montanaro) Date: 2002-01-21 07:47 Message: Logged In: YES user_id=44345 Whoops... Make the "observed" speedup 0.1%... 
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=506436&group_id=5470 From noreply@sourceforge.net Sun Aug 4 22:39:02 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sun, 04 Aug 2002 14:39:02 -0700 Subject: [Patches] [ python-Patches-586561 ] Better token-related error messages Message-ID: Patches item #586561, was opened at 2002-07-25 11:21 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=586561&group_id=5470 Category: Parser/Compiler Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Skip Montanaro (montanaro) Assigned to: Jeremy Hylton (jhylton) Summary: Better token-related error messages Initial Comment: There were some complaints recently on c.l.py about the rather non-informative error messages emitted as a result of the tokenizer detecting a problem. In many situations it simply returns E_TOKEN which generates a fairly benign, but often unhelpful "invalid token" message. This patch adds several new E_* macros to Include/errorcode.h, returns them from the appropriate places in Parser/tokenizer.c and generates more specific messages in Python/pythonrun.c. I think the error messages are always better, though in some situations they may still not be strictly correct. Assigning to Jeremy since he's the compiler wiz. Skip ---------------------------------------------------------------------- >Comment By: Skip Montanaro (montanaro) Date: 2002-08-04 16:39 Message: Logged In: YES user_id=44345 here's a new patch - deletes all but the EOFC & EOFS macros and adds a test_eof.py test module ---------------------------------------------------------------------- Comment By: Jeremy Hylton (jhylton) Date: 2002-08-02 13:17 Message: Logged In: YES user_id=31392 The current error message for the complex case seems clear enough since it identifies exactly the offending character. 
>>> 3i+2 File "", line 1 3i+2 ^ SyntaxError: invalid syntax The error message for runaway triple quoted strings is much more puzzling, since the line of context doesn't have anything useful on it. I guess we should think about the others, too: E_EOLS is marginal, since you do get the line with the error in the exception. E_EOFC is a win for the same reason that E_EOFS is, although I expect it's a less common case. E_EXP and E_SLASH are borderline -- again because the current syntax error identifies exactly the line and character that are causing the problem. We should get a third opinion, but I'd probably settle for just E_EOFC and E_EOFS. ---------------------------------------------------------------------- Comment By: Skip Montanaro (montanaro) Date: 2002-08-02 13:05 Message: Logged In: YES user_id=44345 re: i vs. j Perhaps it's not needed. The patch was originally designed to address the case of runaway triple-quoted strings. Someone on c.l.py ranted about that. While I was in there, I recalled someone else (perhaps more than one person) had berated Python in the past because imaginary numbers use 'j' instead of 'i' and decided to stick it in. It's no big deal to take it out. (When you think about it, they are all corner cases, since most of the time the code is syntactically correct. ;-) S ---------------------------------------------------------------------- Comment By: Jeremy Hylton (jhylton) Date: 2002-08-02 12:54 Message: Logged In: YES user_id=31392 Is the warning about i vs. j for complex numbers really necessary? It seems like it adds extra, well, complexity for a tiny corner case. 
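The two cases discussed in this thread can be reproduced with a short sketch like the one below. It assumes a current Python 3 interpreter, whose message wording differs from the 2002 tokenizer and varies by version, so the script only prints whatever message it gets. The helper name syntax_error_for is hypothetical.

```python
def syntax_error_for(source):
    """Compile `source` and return the SyntaxError raised, or None if it compiles."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError as exc:
        return exc
    return None


# A runaway triple-quoted string: the error is only detected at end of
# input, far from where the string was opened.
runaway = 'x = """never closed\ny = 1\n'

# The "i instead of j" imaginary literal from the discussion.
imaginary = "3i+2\n"

for label, src in (("runaway", runaway), ("imaginary", imaginary)):
    err = syntax_error_for(src)
    print(label, "->", err.msg, "(line %s)" % err.lineno)

# The 'j' spelling is valid, so no error comes back.
print(syntax_error_for("3j+2\n"))
```

Comparing the reported line for the runaway string with the line where the string actually starts shows why that case was singled out as the most puzzling one.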
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=586561&group_id=5470 From noreply@sourceforge.net Sun Aug 4 22:55:08 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sun, 04 Aug 2002 14:55:08 -0700 Subject: [Patches] [ python-Patches-401022 ] Removal of SET_LINENO (experimental) Message-ID: Patches item #401022, was opened at 2000-07-30 18:08 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=401022&group_id=5470 Category: Core (C code) Group: None Status: Closed Resolution: Out of Date Priority: 5 Submitted By: Vladimir Marangozov (marangoz) Assigned to: Nobody/Anonymous (nobody) Summary: Removal of SET_LINENO (experimental) Initial Comment: ---------------------------------------------------------------------- >Comment By: Skip Montanaro (montanaro) Date: 2002-08-04 16:55 Message: Logged In: YES user_id=44345 Michael Hudson's no SET_LINENO patch seems to be nearing completion & this one's been marked "out of date" with no assignee, so I'm closing it... ---------------------------------------------------------------------- Comment By: Skip Montanaro (montanaro) Date: 2002-08-04 16:42 Message: Logged In: YES user_id=44345 Michael Hudson's no SET_LINENO patch seems to be nearing completion & this one's been marked "out of date" with no assignee, so I'm closing it... 
---------------------------------------------------------------------- Comment By: Jeremy Hylton (jhylton) Date: 2002-06-20 16:04 Message: Logged In: YES user_id=31392 The URL of the day for Vladimir's explanation of how the patch works is here: http://mail.python.org/pipermail/python-dev/2000-July/007652.html ---------------------------------------------------------------------- Comment By: Jeremy Hylton (jhylton) Date: 2002-03-08 15:50 Message: Logged In: YES user_id=31392 Just in case Guido doesn't get to that VM redesign next week, do you want to upload whatever progress you made? ---------------------------------------------------------------------- Comment By: Neil Schemenauer (nascheme) Date: 2002-03-07 12:41 Message: Logged In: YES user_id=35752 I worked a bit on porting to this patch to 2.2+ CVS. I ran into a snag with generators. Generators save the instruction pointer (i.e. the bytecode offset) on yield. That makes the on-the-fly bytecode translation approach more complicated. Since Guido is going to redesign the whole VM it's probably not work spending any more effort on this. :-) ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2001-11-27 15:54 Message: Logged In: YES user_id=31435 Unassigned again -- I'm not gonna get to this in this lifetime. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2001-09-10 13:51 Message: Logged In: YES user_id=6380 Tim wants to revisit this. It could be the quickest way to a 7% speedup in pystone that we can think of... ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2000-11-13 13:42 Message: Rejected. It's in the archives for reference, but for now, I don't think it's worth spending cycles worrying about this kind of stuff. I'll eventually redesign the entire VM. 
---------------------------------------------------------------------- Comment By: Vladimir Marangozov (marangoz) Date: 2000-10-27 06:08 Message: Oops, the last patch update does not contain the f.f_lineno computation in frame_getattr. This is necessary, cf. the following messages: http://www.python.org/pipermail/python-dev/2000-July/014395.html http://www.python.org/pipermail/python-dev/2000-July/014401.html Patch assigned to Guido, for review or further assignment. ---------------------------------------------------------------------- Comment By: Vladimir Marangozov (marangoz) Date: 2000-10-25 19:42 Message: noreply@sourceforge.net wrote: > > Date: 2000-Oct-25 13:56 > By: gvanrossum > > Comment: > Vladimir, are you there? So-so :) I'm a moving target, checking my mail occasionally these days. Luckily, today is one of these days. > > The patch doesn't apply cleanly to the current CVS tree any more... Ah, this one's easy. Here's an update relative to 2.0 final, not CVS. I got some r/w access error trying to update my CVS copy from SF that I have no time to investigate right now... The Web interface still works though :) ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2000-10-25 15:56 Message: Vladimir, are you there? The patch doesn't apply cleanly to the current CVS tree any more... ---------------------------------------------------------------------- Comment By: Vladimir Marangozov (marangoz) Date: 2000-08-03 14:22 Message: Fix missing DECREF on error condition in start_tracing() + some renaming. ---------------------------------------------------------------------- Comment By: Vladimir Marangozov (marangoz) Date: 2000-07-31 12:50 Message: A last tiny fix of the SET_LINENO opcode for better b/w compatibility. Stopping here and entering standby mode for reactions & feedback. PS: the last idea about not duplicating co_code and tweaking the original with CALL_TRACE is a bad one. 
I remember Guido being against it because co_code could be used elsewhere (copied, written to disk, whatever) and he's right! Better operate on an internal copy created in ceval. ---------------------------------------------------------------------- Comment By: Vladimir Marangozov (marangoz) Date: 2000-07-31 09:57 Message: Another rewrite, making this whole strategy b/w compatible according to the 1st incompatibility point a) described in: http://www.python.org/pipermail/python-dev/2000-July/014364.html Changes: 1. f.f_lineno is computed and updated on f_lineno attribute requests for f, given f.f_lasti. Correctness is ensured because f.f_lasti is updated on *all* attribute accesses (in LOAD_ATTR in the main loop). 2. The standard setup does not generate SET_LINENO, but uses co_lnotab for computing the source line number (e.g. tracebacks) This is equivalent to the actual "python -O". 3. With "python -d", we fall back to the current version of the interpreter (with SET_LINENO) thus making it easy to test whether this patch fully replaces SET_LINENO's behavior. (modulo f->f_lineno accesses from legacy C code, but this is insane). IMO, this version already worths the pain to be truly tested and improved. One improvement is to define a nicer public C API for breakpoints: - PyCode_SetBreakPointAtLine(line) - PyCode_SetBreakPointAtAddr(addr) or similar, which would install a CALL_TRACE opcode in the appropriate location of the copy of co_code. Another idea is to avoid duplicating the entire co_code just for storing the CALL_TRACE opcodes. We can store them in the original and keep a table of breakpoints. Setting the breakpoints would occur whenever the sys.settrace hook is set. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2000-07-31 08:40 Message: Status set to postponed to indicate that this is still experimental. 
---------------------------------------------------------------------- Comment By: Vladimir Marangozov (marangoz) Date: 2000-07-30 20:16 Message: A nit: inline the argfetch in CALL_TRACE and goto the switch, instead of jumping to get_oparg which splits the sequence [fetch opcode, fetch oparg] -- this can slow things down. ---------------------------------------------------------------------- Comment By: Vladimir Marangozov (marangoz) Date: 2000-07-30 18:12 Message: For testing, as discussed on python-dev. For a gentle summary, see: http://www.python.org/pipermail/python-dev/2000-July/014364.html ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=401022&group_id=5470 From noreply@sourceforge.net Sun Aug 4 22:56:52 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sun, 04 Aug 2002 14:56:52 -0700 Subject: [Patches] [ python-Patches-590843 ] list sort perf boost Message-ID: Patches item #590843, was opened at 2002-08-04 15:22 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=590843&group_id=5470 Category: Core (C code) Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Neal Norwitz (nnorwitz) Assigned to: Neal Norwitz (nnorwitz) Summary: list sort perf boost Initial Comment: I don't really love this patch--the names suck. There are also 2 warnings. The warnings can be removed by casting, but I wanted to see if anybody had better ideas. However, I can't let Tim have all the fun optimizing the hell out of list sort. This patch gets rid of COMPARE == NULL in the ISLT macro. A structure is created which holds both the cmp function and PyObject/Py_LT. I get a speed up of between ~1-3%. Using the largest #s, 7.67 -> 7.59 for *sort. 
---------------------------------------------------------------------- >Comment By: Tim Peters (tim_one) Date: 2002-08-04 17:56 Message: Logged In: YES user_id=31435 Ack, swap "little" with "big" in that comment everywhere. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-08-04 17:43 Message: Logged In: YES user_id=31435 Heh -- cute! It doesn't make a measurable difference on my MSVC6 box, but I can believe it helps a bit elsewhere. However, the warnings are trying to tell us something: casting PyObject_RichCompareBool to a type that takes a void* 3rd argument instead of an int is likely to work on boxes where sizeof(void*) == sizeof(int) (like my box, and presumably yours too), but it's lying in a potentially dangerous way on other boxes. If sizeof(void*) > sizeof (int), it's still likely to work on big-endian machines, but it gets very iffy on little-endian boxes (PyObject_RichCompareBool can pick up just the least- significant bits then). On the third hand, Py_LT is currently a #define for 0, so it may well work (albeit by accident) even then. I don't think there's enough gain here to justify that level of obscure x-platform pain, so sorry to say I'm inclined to reject this. A different route you may want to pursue: checking for failure after every function call is likely a significant expense (test, branch, test, branch, test, branch, ...), which judicious use of setjmp/longjmp could alleviate. The tricky bit there is ensuring that failures in merge_lo and merge_hi arrange to get the temp array copied back in to the merge area (I expect they'd have to leave clues in the MergeState struct, which the longjmp target could act on). 
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=590843&group_id=5470 From noreply@sourceforge.net Sun Aug 4 22:42:02 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sun, 04 Aug 2002 14:42:02 -0700 Subject: [Patches] [ python-Patches-401022 ] Removal of SET_LINENO (experimental) Message-ID: Patches item #401022, was opened at 2000-07-30 18:08 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=401022&group_id=5470 Category: Core (C code) Group: None >Status: Closed Resolution: Out of Date Priority: 5 Submitted By: Vladimir Marangozov (marangoz) Assigned to: Nobody/Anonymous (nobody) Summary: Removal of SET_LINENO (experimental) Initial Comment: ---------------------------------------------------------------------- >Comment By: Skip Montanaro (montanaro) Date: 2002-08-04 16:42 Message: Logged In: YES user_id=44345 Michael Hudson's no SET_LINENO patch seems to be nearing completion & this one's been marked "out of date" with no assignee, so I'm closing it... ---------------------------------------------------------------------- Comment By: Jeremy Hylton (jhylton) Date: 2002-06-20 16:04 Message: Logged In: YES user_id=31392 The URL of the day for Vladimir's explanation of how the patch works is here: http://mail.python.org/pipermail/python-dev/2000-July/007652.html ---------------------------------------------------------------------- Comment By: Jeremy Hylton (jhylton) Date: 2002-03-08 15:50 Message: Logged In: YES user_id=31392 Just in case Guido doesn't get to that VM redesign next week, do you want to upload whatever progress you made? ---------------------------------------------------------------------- Comment By: Neil Schemenauer (nascheme) Date: 2002-03-07 12:41 Message: Logged In: YES user_id=35752 I worked a bit on porting to this patch to 2.2+ CVS. I ran into a snag with generators. 
Generators save the instruction pointer (i.e. the bytecode offset) on yield. That makes the on-the-fly bytecode translation approach more complicated. Since Guido is going to redesign the whole VM it's probably not work spending any more effort on this. :-) ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2001-11-27 15:54 Message: Logged In: YES user_id=31435 Unassigned again -- I'm not gonna get to this in this lifetime. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2001-09-10 13:51 Message: Logged In: YES user_id=6380 Tim wants to revisit this. It could be the quickest way to a 7% speedup in pystone that we can think of... ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2000-11-13 13:42 Message: Rejected. It's in the archives for reference, but for now, I don't think it's worth spending cycles worrying about this kind of stuff. I'll eventually redesign the entire VM. ---------------------------------------------------------------------- Comment By: Vladimir Marangozov (marangoz) Date: 2000-10-27 06:08 Message: Oops, the last patch update does not contain the f.f_lineno computation in frame_getattr. This is necessary, cf. the following messages: http://www.python.org/pipermail/python-dev/2000-July/014395.html http://www.python.org/pipermail/python-dev/2000-July/014401.html Patch assigned to Guido, for review or further assignment. ---------------------------------------------------------------------- Comment By: Vladimir Marangozov (marangoz) Date: 2000-10-25 19:42 Message: noreply@sourceforge.net wrote: > > Date: 2000-Oct-25 13:56 > By: gvanrossum > > Comment: > Vladimir, are you there? So-so :) I'm a moving target, checking my mail occasionally these days. Luckily, today is one of these days. 
> > The patch doesn't apply cleanly to the current CVS tree any more... Ah, this one's easy. Here's an update relative to 2.0 final, not CVS. I got some r/w access error trying to update my CVS copy from SF that I have no time to investigate right now... The Web interface still works though :) ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2000-10-25 15:56 Message: Vladimir, are you there? The patch doesn't apply cleanly to the current CVS tree any more... ---------------------------------------------------------------------- Comment By: Vladimir Marangozov (marangoz) Date: 2000-08-03 14:22 Message: Fix missing DECREF on error condition in start_tracing() + some renaming. ---------------------------------------------------------------------- Comment By: Vladimir Marangozov (marangoz) Date: 2000-07-31 12:50 Message: A last tiny fix of the SET_LINENO opcode for better b/w compatibility. Stopping here and entering standby mode for reactions & feedback. PS: the last idea about not duplicating co_code and tweaking the original with CALL_TRACE is a bad one. I remember Guido being against it because co_code could be used elsewhere (copied, written to disk, whatever) and he's right! Better operate on an internal copy created in ceval. ---------------------------------------------------------------------- Comment By: Vladimir Marangozov (marangoz) Date: 2000-07-31 09:57 Message: Another rewrite, making this whole strategy b/w compatible according to the 1st incompatibility point a) described in: http://www.python.org/pipermail/python-dev/2000-July/014364.html Changes: 1. f.f_lineno is computed and updated on f_lineno attribute requests for f, given f.f_lasti. Correctness is ensured because f.f_lasti is updated on *all* attribute accesses (in LOAD_ATTR in the main loop). 2. The standard setup does not generate SET_LINENO, but uses co_lnotab for computing the source line number (e.g. 
tracebacks) This is equivalent to the actual "python -O". 3. With "python -d", we fall back to the current version of the interpreter (with SET_LINENO) thus making it easy to test whether this patch fully replaces SET_LINENO's behavior. (modulo f->f_lineno accesses from legacy C code, but this is insane). IMO, this version already worths the pain to be truly tested and improved. One improvement is to define a nicer public C API for breakpoints: - PyCode_SetBreakPointAtLine(line) - PyCode_SetBreakPointAtAddr(addr) or similar, which would install a CALL_TRACE opcode in the appropriate location of the copy of co_code. Another idea is to avoid duplicating the entire co_code just for storing the CALL_TRACE opcodes. We can store them in the original and keep a table of breakpoints. Setting the breakpoints would occur whenever the sys.settrace hook is set. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2000-07-31 08:40 Message: Status set to postponed to indicate that this is still experimental. ---------------------------------------------------------------------- Comment By: Vladimir Marangozov (marangoz) Date: 2000-07-30 20:16 Message: A nit: inline the argfetch in CALL_TRACE and goto the switch, instead of jumping to get_oparg which splits the sequence [fetch opcode, fetch oparg] -- this can slow things down. ---------------------------------------------------------------------- Comment By: Vladimir Marangozov (marangoz) Date: 2000-07-30 18:12 Message: For testing, as discussed on python-dev. 
For a gentle summary, see: http://www.python.org/pipermail/python-dev/2000-July/014364.html ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=401022&group_id=5470 From noreply@sourceforge.net Sun Aug 4 22:43:07 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sun, 04 Aug 2002 14:43:07 -0700 Subject: [Patches] [ python-Patches-590843 ] list sort perf boost Message-ID: Patches item #590843, was opened at 2002-08-04 15:22 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=590843&group_id=5470 Category: Core (C code) Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Neal Norwitz (nnorwitz) >Assigned to: Neal Norwitz (nnorwitz) Summary: list sort perf boost Initial Comment: I don't really love this patch--the names suck. There are also 2 warnings. The warnings can be removed by casting, but I wanted to see if anybody had better ideas. However, I can't let Tim have all the fun optimizing the hell out of list sort. This patch gets rid of COMPARE == NULL in the ISLT macro. A structure is created which holds both the cmp function and PyObject/Py_LT. I get a speed up of between ~1-3%. Using the largest #s, 7.67 -> 7.59 for *sort. ---------------------------------------------------------------------- >Comment By: Tim Peters (tim_one) Date: 2002-08-04 17:43 Message: Logged In: YES user_id=31435 Heh -- cute! It doesn't make a measurable difference on my MSVC6 box, but I can believe it helps a bit elsewhere. However, the warnings are trying to tell us something: casting PyObject_RichCompareBool to a type that takes a void* 3rd argument instead of an int is likely to work on boxes where sizeof(void*) == sizeof(int) (like my box, and presumably yours too), but it's lying in a potentially dangerous way on other boxes. 
If sizeof(void*) > sizeof (int), it's still likely to work on big-endian machines, but it gets very iffy on little-endian boxes (PyObject_RichCompareBool can pick up just the least-significant bits then). On the third hand, Py_LT is currently a #define for 0, so it may well work (albeit by accident) even then. I don't think there's enough gain here to justify that level of obscure x-platform pain, so sorry to say I'm inclined to reject this. A different route you may want to pursue: checking for failure after every function call is likely a significant expense (test, branch, test, branch, test, branch, ...), which judicious use of setjmp/longjmp could alleviate. The tricky bit there is ensuring that failures in merge_lo and merge_hi arrange to get the temp array copied back in to the merge area (I expect they'd have to leave clues in the MergeState struct, which the longjmp target could act on). ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=590843&group_id=5470 From noreply@sourceforge.net Mon Aug 5 00:56:33 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sun, 04 Aug 2002 16:56:33 -0700 Subject: [Patches] [ python-Patches-590913 ] PEP 263 support in IDLE Message-ID: Patches item #590913, was opened at 2002-08-05 01:56 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=590913&group_id=5470 Category: IDLE Group: None Status: Open Resolution: None Priority: 5 Submitted By: Martin v. Löwis (loewis) Assigned to: Guido van Rossum (gvanrossum) Summary: PEP 263 support in IDLE Initial Comment: This patch adds the notion of encodings to IDLE. 
In particular: - it tries to determine the locale's encoding (falling back to ASCII if that fails, or no codec is found) - looks for PEP 263 encoding specs when reading and writing files (producing errors when the encoding spec is wrong) - produces error dialogs when new files have non-ASCII, but no declared encoding - assumes the locale's encoding when a non-ASCII file is opened, uses the same encoding when the file is later saved again, - falls back to letting Tcl deal with decoding when decoding fails, - falls back to saving as UTF-8 when encoding fails (so perhaps the errors should all be infos instead) - applies the locale's encoding in the interactive window. This is not a violation of PEP 263, instead, it just changes the encoding of the interactive shell from "unicode" to the locale's encoding - probably similar to what all other terminals do. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=590913&group_id=5470 From noreply@sourceforge.net Mon Aug 5 01:37:15 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sun, 04 Aug 2002 17:37:15 -0700 Subject: [Patches] [ python-Patches-590843 ] list sort perf boost Message-ID: Patches item #590843, was opened at 2002-08-04 15:22 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=590843&group_id=5470 Category: Core (C code) Group: Python 2.3 >Status: Closed >Resolution: Rejected Priority: 5 Submitted By: Neal Norwitz (nnorwitz) Assigned to: Neal Norwitz (nnorwitz) Summary: list sort perf boost Initial Comment: I don't really love this patch--the names suck. There are also 2 warnings. The warnings can be removed by casting, but I wanted to see if anybody had better ideas. However, I can't let Tim have all the fun optimizing the hell out of list sort. This patch gets rid of COMPARE == NULL in the ISLT macro. 
A structure is created which holds both the cmp function and PyObject/Py_LT. I get a speed up of between ~1-3%. Using the largest #s, 7.67 -> 7.59 for *sort. ---------------------------------------------------------------------- >Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-04 20:37 Message: Logged In: YES user_id=33168 Tim, you have an evil mind to think in that way. :-) You are, of course, correct. The problem, both the warnings and the cast/endian problems, could be overcome by creating a local function with the proper signature that calls PyObject_RichCompareBool. However, I don't think it's worth it. So I'm closing this patch. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-08-04 17:56 Message: Logged In: YES user_id=31435 Ack, swap "little" with "big" in that comment everywhere. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-08-04 17:43 Message: Logged In: YES user_id=31435 Heh -- cute! It doesn't make a measurable difference on my MSVC6 box, but I can believe it helps a bit elsewhere. However, the warnings are trying to tell us something: casting PyObject_RichCompareBool to a type that takes a void* 3rd argument instead of an int is likely to work on boxes where sizeof(void*) == sizeof(int) (like my box, and presumably yours too), but it's lying in a potentially dangerous way on other boxes. If sizeof(void*) > sizeof (int), it's still likely to work on big-endian machines, but it gets very iffy on little-endian boxes (PyObject_RichCompareBool can pick up just the least- significant bits then). On the third hand, Py_LT is currently a #define for 0, so it may well work (albeit by accident) even then. I don't think there's enough gain here to justify that level of obscure x-platform pain, so sorry to say I'm inclined to reject this. 
A different route you may want to pursue: checking for failure after every function call is likely a significant expense (test, branch, test, branch, test, branch, ...), which judicious use of setjmp/longjmp could alleviate. The tricky bit there is ensuring that failures in merge_lo and merge_hi arrange to get the temp array copied back in to the merge area (I expect they'd have to leave clues in the MergeState struct, which the longjmp target could act on). ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=590843&group_id=5470 From noreply@sourceforge.net Mon Aug 5 01:50:00 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sun, 04 Aug 2002 17:50:00 -0700 Subject: [Patches] [ python-Patches-590843 ] list sort perf boost Message-ID: Patches item #590843, was opened at 2002-08-04 15:22 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=590843&group_id=5470 Category: Core (C code) Group: Python 2.3 Status: Closed Resolution: Rejected Priority: 5 Submitted By: Neal Norwitz (nnorwitz) Assigned to: Neal Norwitz (nnorwitz) Summary: list sort perf boost Initial Comment: I don't really love this patch--the names suck. There are also 2 warnings. The warnings can be removed by casting, but I wanted to see if anybody had better ideas. However, I can't let Tim have all the fun optimizing the hell out of list sort. This patch gets rid of COMPARE == NULL in the ISLT macro. A structure is created which holds both the cmp function and PyObject/Py_LT. I get a speed up of between ~1-3%. Using the largest #s, 7.67 -> 7.59 for *sort. ---------------------------------------------------------------------- >Comment By: Tim Peters (tim_one) Date: 2002-08-04 20:50 Message: Logged In: YES user_id=31435 I wasn't born evil, Neal -- two previous lives on 64-bit boxes made me evil the hard way . 
Creating a local adapter function doesn't sound promising, as that would introduce another layer of function call, and I just got a 5+% speedup before this by getting rid of a layer of function call (note that islt() used to call PyObject_RichCompareBool() immediately upon entry -- it didn't do anything other than that when compare==NULL). The inline test+branch that remains is much cheaper than calling a do-little indirection function, at least on this box. ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-04 20:37 Message: Logged In: YES user_id=33168 Tim, you have an evil mind to think in that way. :-) You are, of course, correct. The problem, both the warnings and the cast/endian problems, could be overcome by creating a local function with the proper signature that calls PyObject_RichCompareBool. However, I don't think it's worth it. So I'm closing this patch. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-08-04 17:56 Message: Logged In: YES user_id=31435 Ack, swap "little" with "big" in that comment everywhere. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-08-04 17:43 Message: Logged In: YES user_id=31435 Heh -- cute! It doesn't make a measurable difference on my MSVC6 box, but I can believe it helps a bit elsewhere. However, the warnings are trying to tell us something: casting PyObject_RichCompareBool to a type that takes a void* 3rd argument instead of an int is likely to work on boxes where sizeof(void*) == sizeof(int) (like my box, and presumably yours too), but it's lying in a potentially dangerous way on other boxes. If sizeof(void*) > sizeof (int), it's still likely to work on big-endian machines, but it gets very iffy on little-endian boxes (PyObject_RichCompareBool can pick up just the least- significant bits then). 
On the third hand, Py_LT is currently a #define for 0, so it may well work (albeit by accident) even then. I don't think there's enough gain here to justify that level of obscure x-platform pain, so sorry to say I'm inclined to reject this. A different route you may want to pursue: checking for failure after every function call is likely a significant expense (test, branch, test, branch, test, branch, ...), which judicious use of setjmp/longjmp could alleviate. The tricky bit there is ensuring that failures in merge_lo and merge_hi arrange to get the temp array copied back in to the merge area (I expect they'd have to leave clues in the MergeState struct, which the longjmp target could act on). ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=590843&group_id=5470 From noreply@sourceforge.net Mon Aug 5 07:27:50 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sun, 04 Aug 2002 23:27:50 -0700 Subject: [Patches] [ python-Patches-580331 ] xreadlines caching, file iterator Message-ID: Patches item #580331, was opened at 2002-07-11 21:45 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=580331&group_id=5470 Category: Core (C code) Group: Python 2.3 Status: Open Resolution: None Priority: 3 Submitted By: Oren Tirosh (orenti) Assigned to: Guido van Rossum (gvanrossum) Summary: xreadlines caching, file iterator Initial Comment: Calling f.xreadlines() multiple times returns the same xreadlines object. A file is an iterator - __iter__() returns self and next() calls the cached xreadlines object's next method. 
----------------------------------------------------------------------

>Comment By: Oren Tirosh (orenti)
Date: 2002-08-05 06:27

Message:
Logged In: YES
user_id=562624

This version of the patch still makes a file an iterator but it no longer depends on xreadlines - it implements the readahead buffering inside the file object. It is about 19% faster than xreadlines for normal text files and about 40% faster for files with 100k lines.

The methods readline and read do not use this readahead mechanism because it skews the current file position (just like xreadlines does).

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2002-07-17 17:50

Message:
Logged In: YES
user_id=6380

Alas, there's a fatal flaw. The file object and the xreadlines object now both have pointers to each other, creating an unbreakable cycle (since neither participates in GC). Weak refs can't be used to resolve this dilemma. I personally think that's enough to just stick with the status quo (I was never more than +0 on the idea of making the file an iterator anyway). But I'll leave it to Oren to come up with another hack (please use this same SF patch). Oren, if you'd like to give up, please say so and I'll close the item in a jiffy. In fact, I positively encourage you to give up. But I don't expect you to take this offer. :-)

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2002-07-17 01:33

Message:
Logged In: YES
user_id=6380

I'm reviewing this and will check it in, or something like it (probably).

----------------------------------------------------------------------

Comment By: Oren Tirosh (orenti)
Date: 2002-07-16 05:26

Message:
Logged In: YES
user_id=562624

Now invalidates cache on a seek.
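
[Editorial illustration, not part of the thread.] The readahead idea discussed above can be sketched in modern Python roughly as follows. This is only a minimal sketch of the technique, not the patch's C code: the class name `ReadaheadLines` and the `chunk_lines` parameter are hypothetical, and the wrapper stands in for the file object itself. It shows the three points raised in the comments: the object is its own iterator, reading ahead moves the underlying position past lines not yet returned, and a seek has to discard the buffer.

```python
import io
from collections import deque


class ReadaheadLines:
    """Hypothetical wrapper illustrating readahead line iteration."""

    def __init__(self, raw, chunk_lines=2):
        self._raw = raw                  # any object with readline() and seek()
        self._chunk_lines = chunk_lines  # how many lines to read ahead at once
        self._buffer = deque()

    def __iter__(self):
        # The object is its own iterator, as the patch makes the file.
        return self

    def __next__(self):
        if not self._buffer:
            # Refill: this advances the underlying position beyond the
            # line about to be returned (the "skew" mentioned above).
            for _ in range(self._chunk_lines):
                line = self._raw.readline()
                if not line:
                    break
                self._buffer.append(line)
        if not self._buffer:
            raise StopIteration
        return self._buffer.popleft()

    def seek(self, pos):
        # Buffered lines are stale after a seek, so drop them first.
        self._buffer.clear()
        self._raw.seek(pos)


f = ReadaheadLines(io.StringIO("a\nb\nc\n"))
first = next(f)   # reads "a\n" and "b\n" from the underlying stream
f.seek(0)         # discards the buffered "b\n" and rewinds
rest = list(f)
```

Without the `clear()` call in `seek`, the iteration after the seek would start with the stale buffered line instead of the first line of the file.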
---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-07-15 14:38 Message: Logged In: YES user_id=6380 I posted some comments to python-dev. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=580331&group_id=5470 From noreply@sourceforge.net Mon Aug 5 08:59:05 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 05 Aug 2002 00:59:05 -0700 Subject: [Patches] [ python-Patches-590682 ] New codecs: html, asciihtml Message-ID: Patches item #590682, was opened at 2002-08-04 04:58 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=590682&group_id=5470 Category: None Group: None Status: Open Resolution: None >Priority: 3 Submitted By: Oren Tirosh (orenti) >Assigned to: M.-A. Lemburg (lemburg) Summary: New codecs: html, asciihtml Initial Comment: These codecs translate HTML character &entity; references. The html codec may be applied after other codecs such as utf-8 or iso8859_X and preserves their encoding. The asciihtml encoder produces 7-bit ascii and its output is therefore safe for insertion into almost any document regardless of its encoding. ---------------------------------------------------------------------- >Comment By: M.-A. Lemburg (lemburg) Date: 2002-08-05 07:59 Message: Logged In: YES user_id=38388 On the htmlentitydefs: yes, these are in use as they are defined now. If you want a mapping from and to Unicode, I'd suggest to provide this as a new table. About the cased key in the entitydefs dict: AFAIK, these have to be cased since entities are case-sensitive. Could be wrong though. On PEP 293: this is going in the final round now. Your patch doesn't compete with it though, since PEP 293 is a much more general approach. On the general idea: I think the codecs are misnamed. 
They should be called htmlescape and asciihtmlescape since they don't provide "real" HTML encoding/decoding as Martin already mentioned.

There's something wrong with your approach, BTW: the codec should only operate on Unicode (taking only Unicode input and generating Unicode). If you apply it to 8-bit UTF-8 encoded strings you'll get garbage!

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2002-08-04 15:54

Message:
Logged In: YES
user_id=21627

I'm in favour of exposing this via a search function, for generated codec names, on top of PEP 293 (I would not like your codec to compete with the alternative mechanism).

My dislike for the current patch also comes from the fact that it singles out ASCII, which the search function would not.

You could implement two forms: html.codecname and xml.codecname. The html form would do HTML entity references in both directions, and fall back to character references only if necessary; the XML form would use character references all the time, and entity references only for the builtin entities.

And yes, I do recommend users to use codecs.charmap_encode directly, as this is probably the most efficient, yet most compact way to convert Unicode to a less-than-7-bit form.

In any case, I'd encourage you to contribute to the progress of PEP 293 first - this has been an issue for several years now, and I would be sorry if it would fail.

While you are waiting for PEP 293 to complete, please do consider cleaning up htmlentitydefs to provide mappings from and to Unicode characters.

----------------------------------------------------------------------

Comment By: Oren Tirosh (orenti)
Date: 2002-08-04 15:07

Message:
Logged In: YES
user_id=562624

>People may be tricked into believing that they can
>decode arbitrary HTML with your codec - when your
>codec would incorrectly deal with CDATA sections.
You don't even need to go as far as CDATA to see that tags must be parsed first and only then tag bodies and attribute values can be individually decoded. If you do it in the reverse order the tag parser will try to parse < as a tag. It should be documented, though.

For encoding it's also obvious that encoding must be done first and then the encoded strings can be inserted into tags - < in strings is encoded into &lt; preventing it from being interpreted as a tag. This is a good thing! It prevents insertion attacks.

> You can easily enough arrange to get errors on <, >,
> and &, by using codecs.charmap_encode with an
> appropriate encoding map.

If you mean to use this as some internal implementation detail it's ok. Are you actually proposing that this is the way end users should use it?

How about this: Install an encoder registry function that responds to any codec name matching "xmlcharref.SPAM" and does all the internal magic you describe to create a codec instance that combines xmlcharref translation including <,>,& and the SPAM encoding. This dynamically-generated codec will do both encoding and decoding and be cached, of course.

"Namespaces are one honking great idea -- let's do more of those!"

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2002-08-04 11:50

Message:
Logged In: YES
user_id=21627

You can easily enough arrange to get errors on <, >, and &, by using codecs.charmap_encode with an appropriate encoding map. In fact, with that, you can easily get all entity references into the encoded data, without any need for an explicit iteration.

However, I am concerned that you offer decoding as well. People may be tricked into believing that they can decode arbitrary HTML with your codec - when your codec would incorrectly deal with CDATA sections.
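
[Editorial illustration, not part of the thread.] The two uses of codecs.charmap_encode described in the comment above can be sketched as follows. This is a minimal sketch under the assumption of a present-day Python 3 interpreter (the thread concerns Python 2.3), where a mapping value may be an integer byte value or a bytes object; the names `escape_map` and `strict_map` are made up for the example.

```python
import codecs

# Identity map for ASCII, with the three markup characters replaced by
# entity references. Values are ints (one byte) or bytes objects.
escape_map = {code: code for code in range(128)}
escape_map[ord("<")] = b"&lt;"
escape_map[ord(">")] = b"&gt;"
escape_map[ord("&")] = b"&amp;"

# One call does the whole translation, with no explicit Python-level loop.
encoded, consumed = codecs.charmap_encode("a < b & c", "strict", escape_map)

# A map that simply omits the markup characters turns them into errors.
strict_map = {code: code for code in range(128) if chr(code) not in "<>&"}
try:
    codecs.charmap_encode("a < b", "strict", strict_map)
    rejected = False
except UnicodeEncodeError:
    rejected = True
```

Here `encoded` holds the escaped bytes and `consumed` the number of input characters processed. The sketch deliberately sticks to the "strict" error handler, since replacement text produced by other handlers is itself passed through the map.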
----------------------------------------------------------------------

Comment By: Oren Tirosh (orenti)
Date: 2002-08-04 11:10

Message:
Logged In: YES
user_id=562624

PEP 293 and patch #432401 are not a replacement for these codecs - this patch does decoding as well as encoding and also translates <, >, and & which are valid in all encodings and therefore won't get translated by error callbacks.

----------------------------------------------------------------------

Comment By: Oren Tirosh (orenti)
Date: 2002-08-04 11:00

Message:
Logged In: YES
user_id=562624

Yes, the error callback approach handles strange mixes better than my method of chaining codecs. But it only does encoding - this patch also provides full decoding of named, decimal and hexadecimal character entity references.

Assuming PEP 293 is accepted, I'd like to see the asciihtml codec stay for its decoding ability and renamed to xmlcharref. The encoding part of this codec can just call .encode("ascii", errors="xmlcharrefreplace") to make it a full two-way codec.

I'd prefer htmlentitydefs.py to use unicode, too. It's not so useful the way it is. Another problem is that it uses mixed case names as keys. The dictionary lookup is likely to miss incoming entities with arbitrary case so it's more-or-less broken. Does anyone actually use it the way it is? Can it be changed to use unicode without breaking anyone's code?

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2002-08-04 08:54

Message:
Logged In: YES
user_id=21627

This patch is superseded by PEP 293 and patch #432401, which allow you to write

unitext.encode("ascii", errors = "xmlcharrefreplace")

This probably should be left open until PEP 293 is pronounced upon, and then either rejected or reviewed in detail.
I'd encourage a patch that uses Unicode in htmlentitydefs directly, and computes entitydefs from that, instead of vice-versa (or atleast exposes a unicode_entitydefs, perhaps even lazily) - perhaps also with a reverse mapping. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=590682&group_id=5470 From noreply@sourceforge.net Mon Aug 5 10:09:32 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 05 Aug 2002 02:09:32 -0700 Subject: [Patches] [ python-Patches-587993 ] alternative SET_LINENO killer Message-ID: Patches item #587993, was opened at 2002-07-29 11:27 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=587993&group_id=5470 Category: Core (C code) Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Michael Hudson (mwh) Assigned to: Nobody/Anonymous (nobody) Summary: alternative SET_LINENO killer Initial Comment: This patch is a proof-of-concept of another way to remove the SET_LINENO patch (as opposed to Vladimir's ancient one). Instead of rewriting bytecode (ick!) we poke into the c_lnotab to see if we've moved onto a different line. The c_lnotab is not the most transparent of data structures, it has to be said. I'm not sure this patch is 100% correct -- but I think the idea can definitely fly. There will be some more overhead to tracing than before, but I hope not too much. I haven't tested these aspects. Comments welcome! ---------------------------------------------------------------------- >Comment By: Michael Hudson (mwh) Date: 2002-08-05 09:09 Message: Logged In: YES user_id=6656 Thanks for the comments, Neal! I'm not sure it's possible to separate the changes in the way you describe. For instance, com_addoparg -> com_set_lineno stops SET_LINENO being generated, so breaks tracing without the VM changes, but the VM changes make SET_LINENO into an unknown opcode... 
I didn't intend to upload the pystone change. About the comments: - the last little RETURN_NONE changes are easy. Thanks for the pointer. - agree there are too many bits of code grovelling co_lnotab. however, it's not clear that they can easily be refactored. maybe a generator would be useful... hmm. let me think about this one. - did I take out the initialisation of frame.f_lineno entirely? oops. probably set it to co_firstlineno. - fixed trace.py docstring. reusing code not all that easy, because different uses have subtly different requirements. hmm, inspect.getlineno is now pointless... was there any particular reason you unassigned the patch? Just curious as the main reason it was assigned to Guido was to make sure he saw it. ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-04 19:50 Message: Logged In: YES user_id=33168 Attached is a patch to test_hotshot which is necessary because the duplicate line #s at the beginning of the function no longer exist. WIth this patch test_hotshot passes. ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-03 20:06 Message: Logged In: YES user_id=33168 Overall, the patch looks very good. It would be nice to check in some of the changes separate from this patch: * com_addoparg(SET_LINENO) -> com_set_lineno() * the updated comments and whitespace * pystone * dis.py, disassemble(), perhaps * RETURN_NONE, perhaps Here's other minor comments: * Need to add RETURN_NONE opcode into dis.py * Need to add doc for RETURN_NONE in libdis.tex * It would be nice if the code from inspect.getlineno could be reused in disassemble() (not sure this can be done, though--i saw at least 4 variations of this code, 2 in python and 2 in C) * Should frame.f_lineno be initialized to 0, -1 or something invalid? * scripts/trace.py reimplements getting the lineno, why not reuse code? 
* docstring in scripts/trace._find_LINENO_from_code needs to remove ref to SET_LINENO ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-03 15:52 Message: Logged In: YES user_id=6656 Update. New this time: - adds RETURN_NONE opcode for the epilgoue. - don't call line trace events on it. - bumps MAGIC this is the output of "cvs diff" at the top level, so it's all in there. ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-02 13:34 Message: Logged In: YES user_id=6656 Here's an update (just for ceval.c): - moves tracing code out of line - more expository comments - don't cause line trace events in the function epilogue ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-01 15:43 Message: Logged In: YES user_id=6656 Hang on, lets get all the doc changes: Doc/lib/libdis.tex Doc/tut/tut.tex Doc/whatsnew/whatsnew23.tex Misc/NEWS ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-01 15:41 Message: Logged In: YES user_id=6656 And finally, docs. This touches: Doc/lib/libdis.tex Doc/tut/tut.tex ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-01 15:40 Message: Logged In: YES user_id=6656 Here's the "boring" patch, more mechanical stuff. It touches: Include/opcode.h Lib/traceback.py Modules/_hotshot.c Python/frozen.c Python/traceback.c This should still be reviewed, of course, but I really shouldn't have messed any of this up. ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-01 15:38 Message: Logged In: YES user_id=6656 Yeah, the obvious response to that was "delete your .pycs"... Right, I'm going to upload my latest, and hopefully final patch in three bits. 
First, I'm attaching the "interesting" patch. This touches:

Objects/frameobject.c -- adds f_lineno descr
Python/ceval.c -- the obvious
Python/compile.c -- don't emit SET_LINENO
Lib/dis.py -- use c_lnotab for line numbers
Tools/scripts/trace.py -- ditto

This is the most subtle patch, and the one that I'd most like review on.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2002-07-30 22:09

Message:
Logged In: YES
user_id=6380

Never mind the errors, I hadn't done a cvs update in weeks on the machine where I tested it. :-(

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2002-07-30 21:54

Message:
Logged In: YES
user_id=6380

Cool idea, but I get "unknown opcode" errors...

Keep me posted though! (I will now see any changes to this patch item.)

----------------------------------------------------------------------

Comment By: Michael Hudson (mwh)
Date: 2002-07-30 09:59

Message:
Logged In: YES
user_id=6656

Guido should see this, assuming he still isn't subscribed to patches@python.org.

----------------------------------------------------------------------

Comment By: Michael Hudson (mwh)
Date: 2002-07-30 09:58

Message:
Logged In: YES
user_id=6656

I worked out why some of the code in ceval.c wasn't making sense to me -- it didn't make sense, period. I've also fixed a number of silly and not so silly bugs in my patch.

I'm now 99% certain this idea can fly. The patch isn't *finished* but the hard bit is done, IMHO.

There are some other points to make, but I think I'll raise them on python-dev.

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2002-07-29 20:18

Message:
Logged In: YES
user_id=31435

Dropping out of "vacation mode" long enough to say "mondo cool!" and encourage this.
Guido may not agree, but I also encourage you to redefine c_lnotab if it can make life easier and quicker here. That subtle compression scheme has been the source of several nasty bugs, both in the core C code and in Jeremy's compiler pkg (cut 'n paste bugs abound here, because few people understand what's really needed, so flawed code gets copied with little thought).

----------------------------------------------------------------------

Comment By: Michael Hudson (mwh)
Date: 2002-07-29 15:34

Message:
Logged In: YES
user_id=6656

Uhh, the instr_[lu]b variables need to keep their values around the loop; otherwise we might just as well call PyCode_Addr2Line each time around.

I have another version of the patch that does that, but I assumed the overhead of doing so was deemed too high, or someone else would have done this by now. It's certainly easier.

Glad to hear it doesn't affect python -O too much. I was doing this away from the internet and forgot to keep a clean copy of the source around for doing comparisons with...

----------------------------------------------------------------------

Comment By: Neil Schemenauer (nascheme)
Date: 2002-07-29 15:18

Message:
Logged In: YES
user_id=35752

Moving the "int io, instr_ub = -1, instr_lb = 0;" declaration and the "io = INSTR_OFFSET();" statement below the "if (tstate->c_tracefunc ...)" test gives a small speedup on my machine and is a little neater, IMHO.

I was worried that this would slow down "python -O". That doesn't seem to be the case (at least I can't measure it). Well done.

----------------------------------------------------------------------

You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=587993&group_id=5470

From noreply@sourceforge.net Mon Aug 5 12:34:09 2002
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Mon, 05 Aug 2002 04:34:09 -0700
Subject: [Patches] [ python-Patches-590513 ] Add popen2 like functionality to pty.py.
Message-ID:

Patches item #590513, was opened at 2002-08-04 00:16
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=590513&group_id=5470

Category: Library (Lib)
Group: Python 2.2.x
Status: Open
Resolution: None
Priority: 5
Submitted By: Rasjid Wilcox (rasjidw)
Assigned to: Nobody/Anonymous (nobody)
Summary: Add popen2 like functionality to pty.py.

Initial Comment:
This patch adds a popen2 like function to pty.py. Due to use of os.execlp in pty.spawn, it is not quite the same, as all the arguments (including the command to be run) must be passed as a tuple.

It is only a first draft, and may need some more work, which I am willing to do if some direction is indicated. Tested on Python 2.2, under RedHat Linux 7.3.

Rasjid.

----------------------------------------------------------------------

>Comment By: Rasjid Wilcox (rasjidw)
Date: 2002-08-05 21:34

Message:
Logged In: YES
user_id=39640

Before I do docs etc, I have a few questions:

1. I could make it more popen2 like by changing the args to def popen2(cmd, ....) and adding argv=('/bin/sh','-c',cmd). Is this a better idea? Does it reduce portability? Is it safe to assume that all posix systems have /bin/sh? (My guess is yes, no and yes.)

2. Should the threading done in the pty.popen2 function be moved to a separate function, to allow more direct access to spawn? (The current spawn function does not return until the child exits or the parent closes the pipe).

3. Should I worry about how keyboard interrupts are handled? In some cases an uncontrolled process may be left hanging around. Or is it the job of the calling process to deal with that?

Lastly, I am away for a week from Wednesday, so I won't be able to do much until I get back, but I will try and finish this off then.

Cheers,
Rasjid.

----------------------------------------------------------------------

Comment By: Martin v.
Löwis (loewis)
Date: 2002-08-04 18:56

Message:
Logged In: YES
user_id=21627

Can you please write documentation and a test case?

----------------------------------------------------------------------

You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=590513&group_id=5470

From noreply@sourceforge.net Mon Aug 5 12:44:06 2002
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Mon, 05 Aug 2002 04:44:06 -0700
Subject: [Patches] [ python-Patches-590513 ] Add popen2 like functionality to pty.py.
Message-ID:

Patches item #590513, was opened at 2002-08-03 16:16
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=590513&group_id=5470

Category: Library (Lib)
Group: Python 2.2.x
Status: Open
Resolution: None
Priority: 5
Submitted By: Rasjid Wilcox (rasjidw)
Assigned to: Nobody/Anonymous (nobody)
Summary: Add popen2 like functionality to pty.py.

Initial Comment:
This patch adds a popen2 like function to pty.py. Due to use of os.execlp in pty.spawn, it is not quite the same, as all the arguments (including the command to be run) must be passed as a tuple.

It is only a first draft, and may need some more work, which I am willing to do if some direction is indicated. Tested on Python 2.2, under RedHat Linux 7.3.

Rasjid.

----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2002-08-05 13:44

Message:
Logged In: YES
user_id=21627

I'm not a regular pty user. Please ask those questions in comp.lang.python, and python-dev. You can also ask previous authors to pty for comments.

Uncertainty in such areas might be a hint that a library PEP is needed, to justify the rationale for all the details.

There is no need to hurry - Python 2.3 is still months away.

That said, I do think that this functionality is desirable, so I'd encourage you to complete this task.
----------------------------------------------------------------------

Comment By: Rasjid Wilcox (rasjidw)
Date: 2002-08-05 13:34

Message:
Logged In: YES
user_id=39640

Before I do docs etc, I have a few questions:

1. I could make it more popen2 like by changing the args to def popen2(cmd, ....) and adding argv=('/bin/sh','-c',cmd). Is this a better idea? Does it reduce portability? Is it safe to assume that all posix systems have /bin/sh? (My guess is yes, no and yes.)

2. Should the threading done in the pty.popen2 function be moved to a separate function, to allow more direct access to spawn? (The current spawn function does not return until the child exits or the parent closes the pipe).

3. Should I worry about how keyboard interrupts are handled? In some cases an uncontrolled process may be left hanging around. Or is it the job of the calling process to deal with that?

Lastly, I am away for a week from Wednesday, so I won't be able to do much until I get back, but I will try and finish this off then.

Cheers,
Rasjid.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2002-08-04 10:56

Message:
Logged In: YES
user_id=21627

Can you please write documentation and a test case?
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=590513&group_id=5470 From noreply@sourceforge.net Mon Aug 5 12:56:26 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 05 Aug 2002 04:56:26 -0700 Subject: [Patches] [ python-Patches-587993 ] alternative SET_LINENO killer Message-ID: Patches item #587993, was opened at 2002-07-29 07:27 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=587993&group_id=5470 Category: Core (C code) Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Michael Hudson (mwh) >Assigned to: Guido van Rossum (gvanrossum) Summary: alternative SET_LINENO killer Initial Comment: This patch is a proof-of-concept of another way to remove the SET_LINENO patch (as opposed to Vladimir's ancient one). Instead of rewriting bytecode (ick!) we poke into the c_lnotab to see if we've moved onto a different line. The c_lnotab is not the most transparent of data structures, it has to be said. I'm not sure this patch is 100% correct -- but I think the idea can definitely fly. There will be some more overhead to tracing than before, but I hope not too much. I haven't tested these aspects. Comments welcome! ---------------------------------------------------------------------- >Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-05 07:56 Message: Logged In: YES user_id=33168 Re-assigning to Guido. I think what happened was I was reviewing the patch before it was assigned to Guido. Sorry about that. You're right about the set_lineno, I thought the opcode was also generated in there. I agree about the code reuse--it's probably too hard. If I had a better idea, I would have shared it. 
:-( ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-05 05:09 Message: Logged In: YES user_id=6656 Thanks for the comments, Neal! I'm not sure it's possible to separate the changes in the way you describe. For instance, com_addoparg -> com_set_lineno stops SET_LINENO being generated, so breaks tracing without the VM changes, but the VM changes make SET_LINENO into an unknown opcode... I didn't intend to upload the pystone change. About the comments: - the last little RETURN_NONE changes are easy. Thanks for the pointer. - agree there are too many bits of code grovelling co_lnotab. however, it's not clear that they can easily be refactored. maybe a generator would be useful... hmm. let me think about this one. - did I take out the initialisation of frame.f_lineno entirely? oops. probably set it to co_firstlineno. - fixed trace.py docstring. reusing code not all that easy, because different uses have subtly different requirements. hmm, inspect.getlineno is now pointless... was there any particular reason you unassigned the patch? Just curious as the main reason it was assigned to Guido was to make sure he saw it. ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-04 15:50 Message: Logged In: YES user_id=33168 Attached is a patch to test_hotshot which is necessary because the duplicate line #s at the beginning of the function no longer exist. With this patch test_hotshot passes. ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-03 16:06 Message: Logged In: YES user_id=33168 Overall, the patch looks very good. 
It would be nice to check in some of the changes separate from this patch: * com_addoparg(SET_LINENO) -> com_set_lineno() * the updated comments and whitespace * pystone * dis.py, disassemble(), perhaps * RETURN_NONE, perhaps Here's other minor comments: * Need to add RETURN_NONE opcode into dis.py * Need to add doc for RETURN_NONE in libdis.tex * It would be nice if the code from inspect.getlineno could be reused in disassemble() (not sure this can be done, though--I saw at least 4 variations of this code, 2 in python and 2 in C) * Should frame.f_lineno be initialized to 0, -1 or something invalid? * scripts/trace.py reimplements getting the lineno, why not reuse code? * docstring in scripts/trace._find_LINENO_from_code needs to remove ref to SET_LINENO ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-03 11:52 Message: Logged In: YES user_id=6656 Update. New this time: - adds RETURN_NONE opcode for the epilogue. - don't call line trace events on it. - bumps MAGIC this is the output of "cvs diff" at the top level, so it's all in there. ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-02 09:34 Message: Logged In: YES user_id=6656 Here's an update (just for ceval.c): - moves tracing code out of line - more expository comments - don't cause line trace events in the function epilogue ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-01 11:43 Message: Logged In: YES user_id=6656 Hang on, let's get all the doc changes: Doc/lib/libdis.tex Doc/tut/tut.tex Doc/whatsnew/whatsnew23.tex Misc/NEWS ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-01 11:41 Message: Logged In: YES user_id=6656 And finally, docs. 
This touches: Doc/lib/libdis.tex Doc/tut/tut.tex ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-01 11:40 Message: Logged In: YES user_id=6656 Here's the "boring" patch, more mechanical stuff. It touches: Include/opcode.h Lib/traceback.py Modules/_hotshot.c Python/frozen.c Python/traceback.c This should still be reviewed, of course, but I really shouldn't have messed any of this up. ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-01 11:38 Message: Logged In: YES user_id=6656 Yeah, the obvious response to that was "delete your .pycs"... Right, I'm going to upload my latest, and hopefully final patch in three bits. First, I'm attaching the "interesting" patch. This touches: Objects/frameobject.c -- adds f_lineno descr Python/ceval.c -- the obvious Python/compile.c -- don't emit SET_LINENO Lib/dis.py -- use c_lnotab for line numbers Tools/scripts/trace.py -- ditto This is the most subtle patch, and the one that I'd most like review on. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-07-30 18:09 Message: Logged In: YES user_id=6380 Never mind the errors, I hadn't done a cvs update in weeks on the machine where I tested it. :-( ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-07-30 17:54 Message: Logged In: YES user_id=6380 Cool idea, but I get "unknown opcode" errors... Keep me posted though! (I will now see any changes to this patch item.) ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-07-30 05:59 Message: Logged In: YES user_id=6656 Guido should see this, assuming he still isn't subscribed to patches@python.org. 
---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-07-30 05:58 Message: Logged In: YES user_id=6656 I worked out why some of the code in ceval.c wasn't making sense to me -- it didn't make sense, period. I've also fixed a number of silly and not so silly bugs in my patch. I'm now 99% certain this idea can fly. The patch isn't *finished* but the hard bit is done, IMHO. There are some other points to make, but I think I'll raise them on python-dev. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-07-29 16:18 Message: Logged In: YES user_id=31435 Dropping out of "vacation mode" long enough to say "mondo cool!" and encourage this. Guido may not agree, but I also encourage you to redefine c_lnotab if it can make life easier and quicker here. That subtle compression scheme has been the source of several nasty bugs, both in the core C code and in Jeremy's compiler pkg (cut 'n paste bugs abound here, because few people understand what's really needed, so flawed code gets copied with little thought). ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-07-29 11:34 Message: Logged In: YES user_id=6656 Uhh, the instr_[lu]b variables need to keep their values around the loop; otherwise we might just as well call PyCode_Addr2Line each time around. I have another version of the patch that does that, but I assumed the overhead of doing so was deemed too high, or someone else would have done this by now. It's certainly easier. Glad to hear it doesn't affect python -O too much. I was doing this away from the internet and forgot to keep a clean copy of the source around for doing comparisons with... 
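The caching idea discussed in the comments above (keeping instr_lb and instr_ub alive across iterations of the interpreter loop instead of calling PyCode_Addr2Line for every instruction) can be sketched in Python. This is only an illustration under stated assumptions, not code from the patch: `line_bounds` and `trace_lines` are hypothetical names, the table is assumed to be a sequence of unsigned (bytecode increment, line increment) byte pairs, and what the thread calls c_lnotab is assumed to be the code object's co_lnotab field.

```python
import sys


def line_bounds(lnotab, first_lineno, offset):
    """Return (line, lb, ub) so that every offset in [lb, ub) maps to `line`.

    Hypothetical helper: `lnotab` is assumed to hold unsigned byte pairs
    (bytecode offset increment, line number increment).
    """
    addr = 0
    line = first_lineno
    lb = 0
    for addr_incr, line_incr in zip(lnotab[0::2], lnotab[1::2]):
        if addr + addr_incr > offset:
            # The next table entry starts past `offset`; it is a safe
            # (possibly conservative) upper bound for the current line.
            return line, lb, addr + addr_incr
        addr += addr_incr
        if line_incr:
            lb = addr  # a new source line starts at this offset
        line += line_incr
    return line, lb, sys.maxsize


def trace_lines(lnotab, first_lineno, executed_offsets):
    """Simulate line events for a sequence of executed bytecode offsets."""
    events = []
    lb, ub = 0, -1  # empty range, so the first instruction recomputes
    for offset in executed_offsets:
        if not (lb <= offset < ub):
            # Slow path: only taken when execution leaves the cached range.
            line, lb, ub = line_bounds(lnotab, first_lineno, offset)
            if offset == lb:
                events.append(line)  # landed on the first opcode of a line
    return events
```

The point of the design is that the common case costs one range comparison per instruction; the table is walked again only when execution leaves the cached range, including backward jumps such as loops.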
---------------------------------------------------------------------- Comment By: Neil Schemenauer (nascheme) Date: 2002-07-29 11:18 Message: Logged In: YES user_id=35752 Moving the "int io, instr_ub = -1, instr_lb = 0;" declaration and the "io = INSTR_OFFSET();" statement below the "if (tstate->c_tracefunc ...)" test gives a small speedup on my machine and is a little neater, IMHO. I was worried that this would slow down "python -O". That doesn't seem to be the case (at least I can't measure it). Well done. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=587993&group_id=5470 From noreply@sourceforge.net Mon Aug 5 13:11:35 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 05 Aug 2002 05:11:35 -0700 Subject: [Patches] [ python-Patches-590682 ] New codecs: html, asciihtml Message-ID: Patches item #590682, was opened at 2002-08-04 04:58 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=590682&group_id=5470 Category: None Group: None Status: Open Resolution: None Priority: 3 Submitted By: Oren Tirosh (orenti) Assigned to: M.-A. Lemburg (lemburg) Summary: New codecs: html, asciihtml Initial Comment: These codecs translate HTML character &entity; references. The html codec may be applied after other codecs such as utf-8 or iso8859_X and preserves their encoding. The asciihtml encoder produces 7-bit ascii and its output is therefore safe for insertion into almost any document regardless of its encoding. ---------------------------------------------------------------------- >Comment By: Oren Tirosh (orenti) Date: 2002-08-05 12:11 Message: Logged In: YES user_id=562624 Yes, entities are supposed to be case sensitive but I'm working with manually-generated html in which &GT; is not so uncommon... I guess life is different in XML world. Case-smashing loses the distinction between some entities. 
I guess I need a more intelligent solution. > If you apply it to an 8-bit UTF-8 encoded strings you'll get garbage! Actually, it works great. The html codec passes characters 128-255 unmodified and therefore can be chained with other codecs. But I now have a more elegant and high-performance approach than codec chaining. See my python-dev posting. ---------------------------------------------------------------------- Comment By: M.-A. Lemburg (lemburg) Date: 2002-08-05 07:59 Message: Logged In: YES user_id=38388 On the htmlentitydefs: yes, these are in use as they are defined now. If you want a mapping from and to Unicode, I'd suggest to provide this as a new table. About the cased key in the entitydefs dict: AFAIK, these have to be cased since entities are case-sensitive. Could be wrong though. On PEP 293: this is going in the final round now. Your patch doesn't compete with it though, since PEP 293 is a much more general approach. On the general idea: I think the codecs are misnamed. They should be called htmlescape and asciihtmlescape since they don't provide "real" HTML encoding/decoding as Martin already mentioned. 
There's something wrong with your approach, BTW: the codec should only operate on Unicode (taking only Unicode input and generating Unicode). If you apply it to an 8-bit UTF-8 encoded strings you'll get garbage! ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-08-04 15:54 Message: Logged In: YES user_id=21627 I'm in favour of exposing this via a search function, for generated codec names, on top of PEP 293 (I would not like your codec to compete with the alternative mechanism). My dislike for the current patch also comes from the fact that it singles-out ASCII, which the search function would not. You could implement two forms: html.codecname and xml.codecname. The html form would do HTML entity references in both directions, and fall back to character references only if necessary; the XML form would use character references all the time, and entity references only for the builtin entities. And yes, I do recommend users to use codecs.charmap_encode directly, as this is probably the most efficient, yet most compact way to convert Unicode to a less-than-7-bit form. In any case, I'd encourage you to contribute to the progress of PEP 293 first - this has been an issue for several years now, and I would be sorry if it would fail. While you are waiting for PEP 293 to complete, please do consider cleaning up htmlentitydefs to provide mappings from and to Unicode characters. ---------------------------------------------------------------------- Comment By: Oren Tirosh (orenti) Date: 2002-08-04 15:07 Message: Logged In: YES user_id=562624 > People may be tricked into believing that they can decode arbitrary HTML with your codec - when your codec would incorrectly deal with CDATA sections. You don't even need to go as far as CDATA to see that tags must be parsed first and only then tag bodies and attribute values can be individually decoded. 
If you do it in the reverse order the tag parser will try to parse < as a tag. It should be documented, though. For encoding it's also obvious that encoding must be done first and then the encoded strings can be inserted into tags - < in strings is encoded into &lt; preventing it from being interpreted as a tag. This is a good thing! It prevents insertion attacks. > You can easily enough arrange to get errors on <, >, and &, by using codecs.charmap_encode with an appropriate encoding map. If you mean to use this as some internal implementation detail it's ok. Are you actually proposing that this is the way end users should use it? How about this: Install an encoder registry function that responds to any codec name matching "xmlcharref.SPAM" and does all the internal magic you describe to create a codec instance that combines xmlcharref translation including <,>,& and the SPAM encoding. This dynamically-generated codec will do both encoding and decoding and be cached, of course. "Namespaces are one honking great idea -- let's do more of those!" ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-08-04 11:50 Message: Logged In: YES user_id=21627 You can easily enough arrange to get errors on <, >, and &, by using codecs.charmap_encode with an appropriate encoding map. In fact, with that, you can easily get all entity references into the encoded data, without any need for an explicit iteration. However, I am concerned that you offer decoding as well. People may be tricked into believing that they can decode arbitrary HTML with your codec - when your codec would incorrectly deal with CDATA sections. 
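The disagreement above, whether an error callback alone can replace a two-way escaping codec, can be illustrated with a small sketch. It assumes a modern Python 3 standard library, where the "xmlcharrefreplace" error handler proposed in PEP 293 and the `html` module both exist; the `html` module postdates this thread, and `encode_ascii_html` and `decode_ascii_html` are hypothetical names, not part of the patch under discussion.

```python
import html


def encode_ascii_html(text):
    """Escape markup characters, then push non-ASCII into character refs."""
    # <, > and & are valid ASCII, so the error handler alone never sees
    # them; they have to be escaped explicitly before encoding.
    escaped = html.escape(text, quote=False)
    return escaped.encode("ascii", errors="xmlcharrefreplace")


def decode_ascii_html(data):
    """Reverse direction: resolve named, decimal and hex references."""
    return html.unescape(data.decode("ascii"))


# The error handler on its own leaves the markup characters untouched.
handler_only = "caf\u00e9 <b>".encode("ascii", errors="xmlcharrefreplace")
```

As noted in the comments, the decoding direction is only meaningful on text that has already been separated from the surrounding tags; it is not a general HTML parser.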
---------------------------------------------------------------------- Comment By: Oren Tirosh (orenti) Date: 2002-08-04 11:10 Message: Logged In: YES user_id=562624 PEP 293 and patch #432401 are not a replacement for these codecs - this patch does decoding as well as encoding and also translates <, >, and & which are valid in all encodings and therefore won't get translated by error callbacks. ---------------------------------------------------------------------- Comment By: Oren Tirosh (orenti) Date: 2002-08-04 11:00 Message: Logged In: YES user_id=562624 Yes, the error callback approach handles strange mixes better than my method of chaining codecs. But it only does encoding - this patch also provides full decoding of named, decimal and hexadecimal character entity references. Assuming PEP 293 is accepted, I'd like to see the asciihtml codec stay for its decoding ability and renamed to xmlcharref. The encoding part of this codec can just call .encode("ascii", errors="xmlcharrefreplace") to make it a full two-way codec. I'd prefer htmlentitydefs.py to use unicode, too. It's not so useful the way it is. Another problem is that it uses mixed case names as keys. The dictionary lookup is likely to miss incoming entities with arbitrary case so it's more-or-less broken. Does anyone actually use it the way it is? Can it be changed to use unicode without breaking anyone's code? ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-08-04 08:54 Message: Logged In: YES user_id=21627 This patch is superseded by PEP 293 and patch #432401, which allows you to write unitext.encode("ascii", errors = "xmlcharrefreplace") This probably should be left open until PEP 293 is pronounced upon, and then either rejected or reviewed in detail. 
I'd encourage a patch that uses Unicode in htmlentitydefs directly, and computes entitydefs from that, instead of vice-versa (or at least exposes a unicode_entitydefs, perhaps even lazily) - perhaps also with a reverse mapping. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=590682&group_id=5470 From noreply@sourceforge.net Mon Aug 5 13:14:07 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 05 Aug 2002 05:14:07 -0700 Subject: [Patches] [ python-Patches-584626 ] yield allowed in try/finally Message-ID: Patches item #584626, was opened at 2002-07-21 20:29 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=584626&group_id=5470 Category: None Group: None Status: Open Resolution: None Priority: 5 Submitted By: Oren Tirosh (orenti) Assigned to: Nobody/Anonymous (nobody) Summary: yield allowed in try/finally Initial Comment: A generator's dealloc function now resumes a generator one last time by jumping directly to the return statement at the end of the code. As a result, the finally section of any try/finally blocks is executed. Any exceptions raised are treated just like exceptions in a __del__ finalizer. ---------------------------------------------------------------------- >Comment By: Oren Tirosh (orenti) Date: 2002-08-05 12:14 Message: Logged In: YES user_id=562624 Patch abandoned. ---------------------------------------------------------------------- Comment By: Neil Schemenauer (nascheme) Date: 2002-07-29 21:23 Message: Logged In: YES user_id=35752 The GC will need to be taught about these finalizers. Look for the method 'has_finalizer' in gcmodule.c. I don't think we want that method to return true for all generator objects since that would cause any reference cycle containing a generator to become uncollectable. 
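For readers who want to see the behaviour this abandoned patch was after, the sketch below shows a finally block running when a suspended generator is finalized. It relies on the mechanism later Python versions adopted (an explicit close() that raises GeneratorExit inside the generator, available since Python 2.5), which is not the jump-to-return approach described in the initial comment. The names `numbers` and `log` are illustrative only.

```python
def numbers(log):
    """Yield inside try/finally; the finally block records its execution."""
    try:
        yield 1
        yield 2
    finally:
        log.append("finally ran")


log = []
gen = numbers(log)
first = next(gen)   # generator is now suspended inside the try block
gen.close()         # finalizing it runs the finally block
```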
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=584626&group_id=5470 From noreply@sourceforge.net Mon Aug 5 13:11:06 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 05 Aug 2002 05:11:06 -0700 Subject: [Patches] [ python-Patches-590682 ] New codecs: html, asciihtml Message-ID: Patches item #590682, was opened at 2002-08-04 04:58 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=590682&group_id=5470 Category: None Group: None Status: Open Resolution: None Priority: 3 Submitted By: Oren Tirosh (orenti) Assigned to: M.-A. Lemburg (lemburg) Summary: New codecs: html, asciihtml Initial Comment: These codecs translate HTML character &entity; references. The html codec may be applied after other codecs such as utf-8 or iso8859_X and preserves their encoding. The asciihtml encoder produces 7-bit ascii and its output is therefore safe for insertion into almost any document regardless of its encoding. ---------------------------------------------------------------------- >Comment By: Oren Tirosh (orenti) Date: 2002-08-05 12:11 Message: Logged In: YES user_id=562624 Yes, entities are supposed to be case sensitive but I'm working with manually-generated html in which &GT; is not so uncommon... I guess life is different in XML world. Case-smashing loses the distinction between some entities. I guess I need a more intelligent solution. > If you apply it to an 8-bit UTF-8 encoded strings you'll get garbage! Actually, it works great. The html codec passes characters 128-255 unmodified and therefore can be chained with other codecs. But I now have a more elegant and high-performance approach than codec chaining. See my python-dev posting. ---------------------------------------------------------------------- Comment By: M.-A. 
Lemburg (lemburg) Date: 2002-08-05 07:59 Message: Logged In: YES user_id=38388 On the htmlentitydefs: yes, these are in use as they are defined now. If you want a mapping from and to Unicode, I'd suggest to provide this as a new table. About the cased key in the entitydefs dict: AFAIK, these have to be cased since entities are case-sensitive. Could be wrong though. On PEP 293: this is going in the final round now. Your patch doesn't compete with it though, since PEP 293 is a much more general approach. On the general idea: I think the codecs are misnamed. They should be called htmlescape and asciihtmlescape since they don't provide "real" HTML encoding/decoding as Martin already mentioned. There's something wrong with your approach, BTW: the codec should only operate on Unicode (taking only Unicode input and generating Unicode). If you apply it to an 8-bit UTF-8 encoded strings you'll get garbage! ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-08-04 15:54 Message: Logged In: YES user_id=21627 I'm in favour of exposing this via a search function, for generated codec names, on top of PEP 293 (I would not like your codec to compete with the alternative mechanism). My dislike for the current patch also comes from the fact that it singles-out ASCII, which the search function would not. You could implement two forms: html.codecname and xml.codecname. The html form would do HTML entity references in both directions, and fall back to character references only if necessary; the XML form would use character references all the time, and entity references only for the builtin entities. And yes, I do recommend users to use codecs.charmap_encode directly, as this is probably the most efficient, yet most compact way to convert Unicode to a less-than-7-bit form. 
In any case, I'd encourage you to contribute to the progress of PEP 293 first - this has been an issue for several years now, and I would be sorry if it would fail. While you are waiting for PEP 293 to complete, please do consider cleaning up htmlentitydefs to provide mappings from and to Unicode characters. ---------------------------------------------------------------------- Comment By: Oren Tirosh (orenti) Date: 2002-08-04 15:07 Message: Logged In: YES user_id=562624 > People may be tricked into believing that they can decode arbitrary HTML with your codec - when your codec would incorrectly deal with CDATA sections. You don't even need to go as far as CDATA to see that tags must be parsed first and only then tag bodies and attribute values can be individually decoded. If you do it in the reverse order the tag parser will try to parse < as a tag. It should be documented, though. For encoding it's also obvious that encoding must be done first and then the encoded strings can be inserted into tags - < in strings is encoded into &lt; preventing it from being interpreted as a tag. This is a good thing! It prevents insertion attacks. > You can easily enough arrange to get errors on <, >, and &, by using codecs.charmap_encode with an appropriate encoding map. If you mean to use this as some internal implementation detail it's ok. Are you actually proposing that this is the way end users should use it? How about this: Install an encoder registry function that responds to any codec name matching "xmlcharref.SPAM" and does all the internal magic you describe to create a codec instance that combines xmlcharref translation including <,>,& and the SPAM encoding. This dynamically-generated codec will do both encoding and decoding and be cached, of course. "Namespaces are one honking great idea -- let's do more of those!" ---------------------------------------------------------------------- Comment By: Martin v. 
Löwis (loewis) Date: 2002-08-04 11:50 Message: Logged In: YES user_id=21627 You can easily enough arrange to get errors on <, >, and &, by using codecs.charmap_encode with an appropriate encoding map. In fact, with that, you can easily get all entity references into the encoded data, without any need for an explicit iteration. However, I am concerned that you offer decoding as well. People may be tricked into believing that they can decode arbitrary HTML with your codec - when your codec would incorrectly deal with CDATA sections. ---------------------------------------------------------------------- Comment By: Oren Tirosh (orenti) Date: 2002-08-04 11:10 Message: Logged In: YES user_id=562624 PEP 293 and patch #432401 are not a replacement for these codecs - this patch does decoding as well as encoding and also translates <, >, and & which are valid in all encodings and therefore won't get translated by error callbacks. ---------------------------------------------------------------------- Comment By: Oren Tirosh (orenti) Date: 2002-08-04 11:00 Message: Logged In: YES user_id=562624 Yes, the error callback approach handles strange mixes better than my method of chaining codecs. But it only does encoding - this patch also provides full decoding of named, decimal and hexadecimal character entity references. Assuming PEP 293 is accepted, I'd like to see the asciihtml codec stay for its decoding ability and renamed to xmlcharref. The encoding part of this codec can just call .encode("ascii", errors="xmlcharrefreplace") to make it a full two-way codec. I'd prefer htmlentitydefs.py to use unicode, too. It's not so useful the way it is. Another problem is that it uses mixed case names as keys. The dictionary lookup is likely to miss incoming entities with arbitrary case so it's more-or-less broken. Does anyone actually use it the way it is? Can it be changed to use unicode without breaking anyone's code? 
---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-08-04 08:54 Message: Logged In: YES user_id=21627 This patch is superseded by PEP 293 and patch #432401, which allows you to write unitext.encode("ascii", errors = "xmlcharrefreplace") This probably should be left open until PEP 293 is pronounced upon, and then either rejected or reviewed in detail. I'd encourage a patch that uses Unicode in htmlentitydefs directly, and computes entitydefs from that, instead of vice-versa (or at least exposes a unicode_entitydefs, perhaps even lazily) - perhaps also with a reverse mapping. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=590682&group_id=5470 From noreply@sourceforge.net Mon Aug 5 13:20:57 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 05 Aug 2002 05:20:57 -0700 Subject: [Patches] [ python-Patches-587993 ] alternative SET_LINENO killer Message-ID: Patches item #587993, was opened at 2002-07-29 11:27 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=587993&group_id=5470 Category: Core (C code) Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Michael Hudson (mwh) Assigned to: Guido van Rossum (gvanrossum) Summary: alternative SET_LINENO killer Initial Comment: This patch is a proof-of-concept of another way to remove the SET_LINENO patch (as opposed to Vladimir's ancient one). Instead of rewriting bytecode (ick!) we poke into the c_lnotab to see if we've moved onto a different line. The c_lnotab is not the most transparent of data structures, it has to be said. I'm not sure this patch is 100% correct -- but I think the idea can definitely fly. There will be some more overhead to tracing than before, but I hope not too much. I haven't tested these aspects. Comments welcome! 
---------------------------------------------------------------------- >Comment By: Michael Hudson (mwh) Date: 2002-08-05 12:20 Message: Logged In: YES user_id=6656 Here's another update. I've also deleted the old patches. Changes this time: + doesn't include pystone LOOPS boost + finish RETURN_NONE off: - add to dis.py - add to libdis.tex + removed now-unnecessary co_lnotab grovelling in inspect and traceback. left the functions for compatibility. + removed the WE_WANT_THE_HACK hack + includes Neal's test_hotshot patch + more trace.py tweaks. Lib/compiler is still broken. ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-05 11:56 Message: Logged In: YES user_id=33168 Re-assigning to Guido. I think what happened was I was reviewing the patch before it was assigned to Guido. Sorry about that. You're right about the set_lineno, I thought the opcode was also generated in there. I agree about the code reuse--it's probably too hard. If I had a better idea, I would have shared it. :-( ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-05 09:09 Message: Logged In: YES user_id=6656 Thanks for the comments, Neal! I'm not sure it's possible to separate the changes in the way you describe. For instance, com_addoparg -> com_set_lineno stops SET_LINENO being generated, so breaks tracing without the VM changes, but the VM changes make SET_LINENO into an unknown opcode... I didn't intend to upload the pystone change. About the comments: - the last little RETURN_NONE changes are easy. Thanks for the pointer. - agree there are too many bits of code grovelling co_lnotab. however, it's not clear that they can easily be refactored. maybe a generator would be useful... hmm. let me think about this one. - did I take out the initialisation of frame.f_lineno entirely? oops. probably set it to co_firstlineno. - fixed trace.py docstring. 
reusing code not all that easy, because different uses have subtly different requirements. hmm, inspect.getlineno is now pointless... was there any particular reason you unassigned the patch? Just curious as the main reason it was assigned to Guido was to make sure he saw it. ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-04 19:50 Message: Logged In: YES user_id=33168 Attached is a patch to test_hotshot which is necessary because the duplicate line #s at the beginning of the function no longer exist. With this patch test_hotshot passes. ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-03 20:06 Message: Logged In: YES user_id=33168 Overall, the patch looks very good. It would be nice to check in some of the changes separate from this patch: * com_addoparg(SET_LINENO) -> com_set_lineno() * the updated comments and whitespace * pystone * dis.py, disassemble(), perhaps * RETURN_NONE, perhaps Here's other minor comments: * Need to add RETURN_NONE opcode into dis.py * Need to add doc for RETURN_NONE in libdis.tex * It would be nice if the code from inspect.getlineno could be reused in disassemble() (not sure this can be done, though--I saw at least 4 variations of this code, 2 in python and 2 in C) * Should frame.f_lineno be initialized to 0, -1 or something invalid? * scripts/trace.py reimplements getting the lineno, why not reuse code? * docstring in scripts/trace._find_LINENO_from_code needs to remove ref to SET_LINENO ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-03 15:52 Message: Logged In: YES user_id=6656 Update. New this time: - adds RETURN_NONE opcode for the epilogue. - don't call line trace events on it. - bumps MAGIC this is the output of "cvs diff" at the top level, so it's all in there. 
----------------------------------------------------------------------

Comment By: Michael Hudson (mwh)
Date: 2002-08-02 13:34

Message:
Logged In: YES
user_id=6656

Here's an update (just for ceval.c):

- moves tracing code out of line
- more expository comments
- don't cause line trace events in the function epilogue

----------------------------------------------------------------------

Comment By: Michael Hudson (mwh)
Date: 2002-08-01 15:43

Message:
Logged In: YES
user_id=6656

Hang on, let's get all the doc changes:

Doc/lib/libdis.tex
Doc/tut/tut.tex
Doc/whatsnew/whatsnew23.tex
Misc/NEWS

----------------------------------------------------------------------

Comment By: Michael Hudson (mwh)
Date: 2002-08-01 15:41

Message:
Logged In: YES
user_id=6656

And finally, docs. This touches:

Doc/lib/libdis.tex
Doc/tut/tut.tex

----------------------------------------------------------------------

Comment By: Michael Hudson (mwh)
Date: 2002-08-01 15:40

Message:
Logged In: YES
user_id=6656

Here's the "boring" patch, more mechanical stuff. It touches:

Include/opcode.h
Lib/traceback.py
Modules/_hotshot.c
Python/frozen.c
Python/traceback.c

This should still be reviewed, of course, but I really
shouldn't have messed any of this up.

----------------------------------------------------------------------

Comment By: Michael Hudson (mwh)
Date: 2002-08-01 15:38

Message:
Logged In: YES
user_id=6656

Yeah, the obvious response to that was "delete your .pycs"...

Right, I'm going to upload my latest, and hopefully final
patch in three bits.

First, I'm attaching the "interesting" patch. This touches:

Objects/frameobject.c -- adds f_lineno descr
Python/ceval.c -- the obvious
Python/compile.c -- don't emit SET_LINENO
Lib/dis.py -- use c_lnotab for line numbers
Tools/scripts/trace.py -- ditto

This is the most subtle patch, and the one that I'd most
like review on.
----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2002-07-30 22:09

Message:
Logged In: YES
user_id=6380

Never mind the errors, I hadn't done a cvs update in weeks
on the machine where I tested it. :-(

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2002-07-30 21:54

Message:
Logged In: YES
user_id=6380

Cool idea, but I get "unknown opcode" errors...

Keep me posted though! (I will now see any changes to this
patch item.)

----------------------------------------------------------------------

Comment By: Michael Hudson (mwh)
Date: 2002-07-30 09:59

Message:
Logged In: YES
user_id=6656

Guido should see this, assuming he still isn't subscribed to
patches@python.org.

----------------------------------------------------------------------

Comment By: Michael Hudson (mwh)
Date: 2002-07-30 09:58

Message:
Logged In: YES
user_id=6656

I worked out why some of the code in ceval.c wasn't making
sense to me -- it didn't make sense, period.

I've also fixed a number of silly and not so silly bugs in
my patch. I'm now 99% certain this idea can fly. The patch
isn't *finished* but the hard bit is done, IMHO.

There are some other points to make, but I think I'll raise
them on python-dev.

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2002-07-29 20:18

Message:
Logged In: YES
user_id=31435

Dropping out of "vacation mode" long enough to say "mondo
cool!" and encourage this. Guido may not agree, but I also
encourage you to redefine c_lnotab if it can make life
easier and quicker here. That subtle compression scheme has
been the source of several nasty bugs, both in the core C
code and in Jeremy's compiler pkg (cut 'n paste bugs abound
here, because few people understand what's really needed, so
flawed code gets copied with little thought).
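
For readers unfamiliar with the table under discussion (the thread writes c_lnotab; the code object attribute is co_lnotab), here is a minimal Python sketch of the kind of lookup PyCode_Addr2Line performs. It assumes the format of that era: a byte string of (bytecode offset increment, line increment) pairs, each an unsigned byte. The name addr_to_line and the sample table are illustrative only, and later CPython releases changed the encoding, so this should not be read as a description of current interpreters.

```python
def addr_to_line(lnotab, first_lineno, addr):
    """Map a bytecode offset to a source line using an old-style lnotab.

    Assumption: lnotab is a bytes object of unsigned
    (offset increment, line increment) pairs.
    """
    line = first_lineno
    offset = 0
    # Walk the pairs in order, accumulating both running totals.
    for i in range(0, len(lnotab) - 1, 2):
        offset += lnotab[i]
        if offset > addr:
            # This entry starts beyond addr, so the line is already known.
            break
        line += lnotab[i + 1]
    return line


# Hand-built table: offsets 0, 6, 50, 350, 361 start lines 1, 2, 7, 307, 308.
# Increments larger than 255 are split across several pairs.
SAMPLE = bytes([6, 1, 44, 5, 255, 0, 45, 255, 0, 45, 11, 1])

if __name__ == "__main__":
    for addr in (0, 6, 49, 50, 349, 350, 361):
        print(addr, addr_to_line(SAMPLE, 1, addr))
```

The splitting of large increments across pairs is the sort of detail that makes the scheme easy to get wrong when the decoding loop is copied by hand.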
----------------------------------------------------------------------

Comment By: Michael Hudson (mwh)
Date: 2002-07-29 15:34

Message:
Logged In: YES
user_id=6656

Uhh, the instr_[lu]b variables need to keep their values
around the loop; otherwise we might just as well call
PyCode_Addr2Line each time around.

I have another version of the patch that does that, but I
assumed the overhead of doing so was deemed too high, or
someone else would have done this by now. It's certainly
easier.

Glad to hear it doesn't affect python -O too much. I was
doing this away from the internet and forgot to keep a clean
copy of the source around for doing comparisons with...

----------------------------------------------------------------------

Comment By: Neil Schemenauer (nascheme)
Date: 2002-07-29 15:18

Message:
Logged In: YES
user_id=35752

Moving the "int io, instr_ub = -1, instr_lb = 0;"
declaration and the "io = INSTR_OFFSET();" statement below
the "if (tstate->c_tracefunc ...)" test gives a small speedup
on my machine and is a little neater, IMHO.

I was worried that this would slow down "python -O". That
doesn't seem to be the case (at least I can't measure it).
Well done.
----------------------------------------------------------------------

You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=587993&group_id=5470

From noreply@sourceforge.net Mon Aug 5 15:23:19 2002
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Mon, 05 Aug 2002 07:23:19 -0700
Subject: [Patches] [ python-Patches-587993 ] alternative SET_LINENO killer
Message-ID:

Patches item #587993, was opened at 2002-07-29 11:27
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=587993&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Michael Hudson (mwh)
Assigned to: Guido van Rossum (gvanrossum)
Summary: alternative SET_LINENO killer

Initial Comment:
This patch is a proof-of-concept of another way to remove
the SET_LINENO patch (as opposed to Vladimir's ancient one).

Instead of rewriting bytecode (ick!) we poke into the
c_lnotab to see if we've moved onto a different line.

The c_lnotab is not the most transparent of data structures,
it has to be said. I'm not sure this patch is 100% correct
-- but I think the idea can definitely fly.

There will be some more overhead to tracing than before, but
I hope not too much. I haven't tested these aspects.

Comments welcome!

----------------------------------------------------------------------

>Comment By: Neil Schemenauer (nascheme)
Date: 2002-08-05 14:23

Message:
Logged In: YES
user_id=35752

Why was the RETURN_NONE opcode added?
----------------------------------------------------------------------

You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=587993&group_id=5470

From noreply@sourceforge.net Mon Aug 5 15:29:25 2002
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Mon, 05 Aug 2002 07:29:25 -0700
Subject: [Patches] [ python-Patches-587993 ] alternative SET_LINENO killer
Message-ID:

Patches item #587993, was opened at 2002-07-29 11:27
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=587993&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Michael Hudson (mwh)
Assigned to: Guido van Rossum (gvanrossum)
Summary: alternative SET_LINENO killer

Initial Comment:
This patch is a proof-of-concept of another way to remove
the SET_LINENO patch (as opposed to Vladimir's ancient one).

Instead of rewriting bytecode (ick!) we poke into the
c_lnotab to see if we've moved onto a different line.

The c_lnotab is not the most transparent of data structures,
it has to be said. I'm not sure this patch is 100% correct
-- but I think the idea can definitely fly.

There will be some more overhead to tracing than before, but
I hope not too much. I haven't tested these aspects.

Comments welcome!

----------------------------------------------------------------------

>Comment By: Michael Hudson (mwh)
Date: 2002-08-05 14:29

Message:
Logged In: YES
user_id=6656

It was discussed on python-dev:

http://mail.python.org/pipermail/python-dev/2002-August/027261.html

and followups. Adding a new opcode was Guido's idea, but
I'm not totally sure this is what he meant.

Also see the comments around line 2933 of ceval.c.

----------------------------------------------------------------------

You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=587993&group_id=5470

From noreply@sourceforge.net Mon Aug 5 15:36:08 2002
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Mon, 05 Aug 2002 07:36:08 -0700
Subject: [Patches] [ python-Patches-590913 ] PEP 263 support in IDLE
Message-ID:

Patches item #590913, was opened at 2002-08-04 19:56
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=590913&group_id=5470

Category: IDLE
Group: None
Status: Open
>Resolution: Accepted
Priority: 5
Submitted By: Martin v. Löwis (loewis)
>Assigned to: Martin v. Löwis (loewis)
Summary: PEP 263 support in IDLE

Initial Comment:
This patch adds the notion of encodings to IDLE. In particular:
- it tries to determine the locale's encoding (falling back
  to ASCII if that fails, or no codec is found)
- looks for PEP 263 encoding specs when reading and writing
  files (producing errors when the encoding spec is wrong)
- produces error dialogs when new files have non-ASCII, but
  no declared encoding
- assumes the locale's encoding when a non-ASCII file is
  opened, uses the same encoding when the file is later
  saved again,
- falls back to letting Tcl deal with decoding when decoding
  fails,
- falls back to saving as UTF-8 when encoding fails (so
  perhaps the errors should all be infos instead)
- applies the locale's encoding in the interactive window.
  This is not a violation of PEP 263, instead, it just
  changes the encoding of the interactive shell from
  "unicode" to the locale's encoding - probably similar to
  what all other terminals do.

----------------------------------------------------------------------

>Comment By: Guido van Rossum (gvanrossum)
Date: 2002-08-05 10:36

Message:
Logged In: YES
user_id=6380

This looks good. My only concerns are minor style issues:
importing several modules with one import statement, and two
unqualified except clauses that don't explain what can go
wrong (there's one that has extensive comments, that's good).
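
The PEP 263 lookup mentioned in the initial comment can be sketched roughly as follows. This is a minimal illustration and not the code from the patch: find_coding_spec is a hypothetical name, the regular expression is modelled on the pattern given in PEP 263, and the sketch simplifies by checking the first two lines unconditionally.

```python
import codecs
import re

# Modelled on the PEP 263 pattern: a comment line containing "coding:" or
# "coding=" followed by the encoding name.
CODING_RE = re.compile(rb"^[ \t\f]*#.*?coding[:=][ \t]*([-_.a-zA-Z0-9]+)")


def find_coding_spec(source):
    """Return the encoding declared in the first two lines, or None.

    Hypothetical helper; raises LookupError when the declared
    encoding has no codec, which a caller could turn into an error dialog.
    """
    for line in source.splitlines()[:2]:
        match = CODING_RE.match(line)
        if match:
            name = match.group(1).decode("ascii")
            codecs.lookup(name)  # unknown codec names raise LookupError
            return name
    return None


if __name__ == "__main__":
    print(find_coding_spec(b"# -*- coding: latin-1 -*-\nprint(1)\n"))
```

A caller that gets None back for a file containing non-ASCII bytes would then fall back to the locale's encoding, as the list above describes.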
----------------------------------------------------------------------

You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=590913&group_id=5470

From noreply@sourceforge.net Mon Aug 5 15:39:09 2002
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Mon, 05 Aug 2002 07:39:09 -0700
Subject: [Patches] [ python-Patches-587993 ] alternative SET_LINENO killer
Message-ID:

Patches item #587993, was opened at 2002-07-29 07:27
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=587993&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Michael Hudson (mwh)
Assigned to: Guido van Rossum (gvanrossum)
Summary: alternative SET_LINENO killer

Initial Comment:
This patch is a proof-of-concept of another way to remove
the SET_LINENO patch (as opposed to Vladimir's ancient one).

Instead of rewriting bytecode (ick!) we poke into the
c_lnotab to see if we've moved onto a different line.

The c_lnotab is not the most transparent of data structures,
it has to be said. I'm not sure this patch is 100% correct
-- but I think the idea can definitely fly.

There will be some more overhead to tracing than before, but
I hope not too much. I haven't tested these aspects.

Comments welcome!

----------------------------------------------------------------------

>Comment By: Guido van Rossum (gvanrossum)
Date: 2002-08-05 10:39

Message:
Logged In: YES
user_id=6380

That's pretty much what I suggested, yes.

I'll have to take more time to review all this code... :-(

----------------------------------------------------------------------

Comment By: Michael Hudson (mwh)
Date: 2002-08-05 10:29

Message:
Logged In: YES
user_id=6656

It was discussed on python-dev:

http://mail.python.org/pipermail/python-dev/2002-August/027261.html

and followups. Adding a new opcode was Guido's idea, but
I'm not totally sure this is what he meant.

Also see the comments around line 2933 of ceval.c.

----------------------------------------------------------------------

Comment By: Neil Schemenauer (nascheme)
Date: 2002-08-05 10:23

Message:
Logged In: YES
user_id=35752

Why was the RETURN_NONE opcode added?

----------------------------------------------------------------------

Comment By: Michael Hudson (mwh)
Date: 2002-08-05 08:20

Message:
Logged In: YES
user_id=6656

Here's another update. I've also deleted the old patches.

Changes this time:

 + doesn't include pystone LOOPS boost
 + finish RETURN_NONE off:
   - add to dis.py
   - add to libdis.tex
 + removed now-unnecessary co_lnotab grovelling in inspect and
   traceback. left the functions for compatibility.
 + removed the WE_WANT_THE_HACK hack
 + includes Neal's test_hotshot patch
 + more trace.py tweaks.

Lib/compiler is still broken.

----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2002-08-05 07:56

Message:
Logged In: YES
user_id=33168

Re-assigning to Guido. I think what happened was I was
reviewing the patch before it was assigned to Guido. Sorry
about that.

You're right about the set_lineno, I thought the opcode was
also generated in there. I agree about the code reuse--it's
probably too hard. If I had a better idea, I would have
shared it. :-(

----------------------------------------------------------------------

Comment By: Michael Hudson (mwh)
Date: 2002-08-05 05:09

Message:
Logged In: YES
user_id=6656

Thanks for the comments, Neal!

I'm not sure it's possible to separate the changes in the
way you describe. For instance, com_addoparg ->
com_set_lineno stops SET_LINENO being generated, so breaks
tracing without the VM changes, but the VM changes make
SET_LINENO into an unknown opcode...

I didn't intend to upload the pystone change.

About the comments:

- the last little RETURN_NONE changes are easy. Thanks for
  the pointer.
- agree there are too many bits of code grovelling
  co_lnotab.
however, it's not clear that they can easily be refactored. maybe a generator would be useful... hmm. let me think about this one. - did I take out the initialisation of frame.f_lineno entirely? oops. probably set it to co_firstlineno. - fixed trace.py docstring. reusing code not all that easy, because different uses have subtly different requirements. hmm, inspect.getlineno is now pointless... was there any particular reason you unassigned the patch? Just curious as the main reason it was assigned to Guido was to make sure he saw it. ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-04 15:50 Message: Logged In: YES user_id=33168 Attached is a patch to test_hotshot which is necessary because the duplicate line #s at the beginning of the function no longer exist. With this patch test_hotshot passes. ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-03 16:06 Message: Logged In: YES user_id=33168 Overall, the patch looks very good. It would be nice to check in some of the changes separate from this patch: * com_addoparg(SET_LINENO) -> com_set_lineno() * the updated comments and whitespace * pystone * dis.py, disassemble(), perhaps * RETURN_NONE, perhaps Here's other minor comments: * Need to add RETURN_NONE opcode into dis.py * Need to add doc for RETURN_NONE in libdis.tex * It would be nice if the code from inspect.getlineno could be reused in disassemble() (not sure this can be done, though--i saw at least 4 variations of this code, 2 in python and 2 in C) * Should frame.f_lineno be initialized to 0, -1 or something invalid? * scripts/trace.py reimplements getting the lineno, why not reuse code? 
* docstring in scripts/trace._find_LINENO_from_code needs to remove ref to SET_LINENO ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-03 11:52 Message: Logged In: YES user_id=6656 Update. New this time: - adds RETURN_NONE opcode for the epilogue. - don't call line trace events on it. - bumps MAGIC this is the output of "cvs diff" at the top level, so it's all in there. ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-02 09:34 Message: Logged In: YES user_id=6656 Here's an update (just for ceval.c): - moves tracing code out of line - more expository comments - don't cause line trace events in the function epilogue ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-01 11:43 Message: Logged In: YES user_id=6656 Hang on, let's get all the doc changes: Doc/lib/libdis.tex Doc/tut/tut.tex Doc/whatsnew/whatsnew23.tex Misc/NEWS ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-01 11:41 Message: Logged In: YES user_id=6656 And finally, docs. This touches: Doc/lib/libdis.tex Doc/tut/tut.tex ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-01 11:40 Message: Logged In: YES user_id=6656 Here's the "boring" patch, more mechanical stuff. It touches: Include/opcode.h Lib/traceback.py Modules/_hotshot.c Python/frozen.c Python/traceback.c This should still be reviewed, of course, but I really shouldn't have messed any of this up. ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-01 11:38 Message: Logged In: YES user_id=6656 Yeah, the obvious response to that was "delete your .pycs"... Right, I'm going to upload my latest, and hopefully final patch in three bits. 
First, I'm attaching the "interesting" patch. This touches: Objects/frameobject.c -- adds f_lineno descr Python/ceval.c -- the obvious Python/compile.c -- don't emit SET_LINENO Lib/dis.py -- use c_lnotab for line numbers Tools/scripts/trace.py -- ditto This is the most subtle patch, and the one that I'd most like review on. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-07-30 18:09 Message: Logged In: YES user_id=6380 Never mind the errors, I hadn't done a cvs update in weeks on the machine where I tested it. :-( ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-07-30 17:54 Message: Logged In: YES user_id=6380 Cool idea, but I get "unknown opcode" errors... Keep me posted though! (I will now see any changes to this patch item.) ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-07-30 05:59 Message: Logged In: YES user_id=6656 Guido should see this, assuming he still isn't subscribed to patches@python.org. ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-07-30 05:58 Message: Logged In: YES user_id=6656 I worked out why some of the code in ceval.c wasn't making sense to me -- it didn't make sense, period. I've also fixed a number of silly and not so silly bugs in my patch. I'm now 99% certain this idea can fly. The patch isn't *finished* but the hard bit is done, IMHO. There are some other points to make, but I think I'll raise them on python-dev. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-07-29 16:18 Message: Logged In: YES user_id=31435 Dropping out of "vacation mode" long enough to say "mondo cool!" and encourage this. 
Guido may not agree, but I also encourage you to redefine c_lnotab if it can make life easier and quicker here. That subtle compression scheme has been the source of several nasty bugs, both in the core C code and in Jeremy's compiler pkg (cut 'n paste bugs abound here, because few people understand what's really needed, so flawed code gets copied with little thought). ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-07-29 11:34 Message: Logged In: YES user_id=6656 Uhh, the instr_[lu]b variables need to keep their values around the loop; otherwise we might just as well call PyCode_Addr2Line each time around. I have another version of the patch that does that, but I assumed the overhead of doing so was deemed too high, or someone else would have done this by now. It's certainly easier. Glad to hear it doesn't affect python -O too much. I was doing this away from the internet and forgot to keep a clean copy of the source around for doing comparisons with... ---------------------------------------------------------------------- Comment By: Neil Schemenauer (nascheme) Date: 2002-07-29 11:18 Message: Logged In: YES user_id=35752 Moving the "int io, instr_ub = -1, instr_lb = 0;" declaration and the "io = INSTR_OFFSET();" statement below the "if (tstate->c_tracefunc ...)" test gives a small speedup on my machine and is a little neater, IMHO. I was worried that this would slow down "python -O". That doesn't seem to be the case (at least I can't measure it). Well done. 
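For readers who have not met the table this thread keeps poking at, here is a minimal Python sketch of the byte-pair compression scheme Tim refers to. It assumes the classic layout: a string of unsigned (offset increment, line increment) byte pairs, with any delta above 255 split across several pairs. The names encode_lnotab and addr_to_line are hypothetical, and this is an illustration only, not code from the patch.

```python
def encode_lnotab(entries, firstlineno):
    """Pack (offset, line) pairs into unsigned byte pairs.

    Assumes offsets and line numbers never decrease, which is the
    classic form of the table; each byte holds a delta of at most 255.
    """
    out = bytearray()
    addr, line = 0, firstlineno
    for new_addr, new_line in entries:
        d_addr = new_addr - addr
        d_line = new_line - line
        if d_addr < 0 or d_line < 0:
            raise ValueError("offsets and lines must not decrease")
        # An offset delta too big for one byte is spread over (255, 0) pairs.
        while d_addr > 255:
            out += bytes((255, 0))
            d_addr -= 255
        # A line delta too big for one byte is spread over (x, 255) pairs;
        # only the first of them carries the remaining offset delta.
        while d_line > 255:
            out += bytes((d_addr, 255))
            d_addr = 0
            d_line -= 255
        if d_addr or d_line:
            out += bytes((d_addr, d_line))
        addr, line = new_addr, new_line
    return bytes(out)


def addr_to_line(lnotab, firstlineno, addr_query):
    """Walk the pairs until the running offset passes addr_query."""
    line = firstlineno
    addr = 0
    for i in range(0, len(lnotab), 2):
        addr += lnotab[i]
        if addr > addr_query:
            break
        line += lnotab[i + 1]
    return line


table = encode_lnotab([(0, 1), (6, 2), (300, 3), (310, 600)], firstlineno=1)
print(list(table))
print([addr_to_line(table, 1, a) for a in (0, 5, 6, 299, 300, 310)])
```

The splitting loops are the part that is easy to get wrong when the logic is copied around, which is presumably the kind of cut 'n paste bug mentioned above.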
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=587993&group_id=5470 From noreply@sourceforge.net Mon Aug 5 15:42:05 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 05 Aug 2002 07:42:05 -0700 Subject: [Patches] [ python-Patches-587993 ] alternative SET_LINENO killer Message-ID: Patches item #587993, was opened at 2002-07-29 11:27 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=587993&group_id=5470 Category: Core (C code) Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Michael Hudson (mwh) Assigned to: Guido van Rossum (gvanrossum) Summary: alternative SET_LINENO killer Initial Comment: This patch is a proof-of-concept of another way to remove the SET_LINENO patch (as opposed to Vladimir's ancient one). Instead of rewriting bytecode (ick!) we poke into the c_lnotab to see if we've moved onto a different line. The c_lnotab is not the most transparent of data structures, it has to be said. I'm not sure this patch is 100% correct -- but I think the idea can definitely fly. There will be some more overhead to tracing than before, but I hope not too much. I haven't tested these aspects. Comments welcome! ---------------------------------------------------------------------- >Comment By: Michael Hudson (mwh) Date: 2002-08-05 14:42 Message: Logged In: YES user_id=6656 That's good to know :) No great hurry with the review. I wanted to finish the patch while everything was still fresh in my mind, and I think I've done this now. Of course, I've said that before... ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-05 14:39 Message: Logged In: YES user_id=6380 That's pretty much what I suggested, yes. I'll have to take more time to review all this code... 
:-( ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-05 14:29 Message: Logged In: YES user_id=6656 It was discussed on python-dev: http://mail.python.org/pipermail/python-dev/2002-August/027261.html and followups. Adding a new opcode was Guido's idea, but I'm not totally sure this is what he meant. Also see the comments around line 2933 of ceval.c. ---------------------------------------------------------------------- Comment By: Neil Schemenauer (nascheme) Date: 2002-08-05 14:23 Message: Logged In: YES user_id=35752 Why was the RETURN_NONE opcode added? ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-05 12:20 Message: Logged In: YES user_id=6656 Here's another update. I've also deleted the old patches. Changes this time: + doesn't include pystone LOOPS boost + finish RETURN_NONE off: - add to dis.py - add to libdis.tex + removed now-unnecessary co_lnotab grovelling in inspect and traceback. left the functions for compatibility. + removed the WE_WANT_THE_HACK hack + includes Neal's test_hotshot patch + more trace.py tweaks. Lib/compiler is still broken. ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-05 11:56 Message: Logged In: YES user_id=33168 Re-assigning to Guido. I think what happened was I was reviewing the patch before it was assigned to Guido. Sorry about that. You're right about the set_lineno, I thought the opcode was also generated in there. I agree about the code reuse--it's probably too hard. If I had a better idea, I would have shared it. :-( ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-05 09:09 Message: Logged In: YES user_id=6656 Thanks for the comments, Neal! I'm not sure it's possible to separate the changes in the way you describe. 
For instance, com_addoparg -> com_set_lineno stops SET_LINENO being generated, so breaks tracing without the VM changes, but the VM changes make SET_LINENO into an unknown opcode... I didn't intend to upload the pystone change. About the comments: - the last little RETURN_NONE changes are easy. Thanks for the pointer. - agree there are too many bits of code grovelling co_lnotab. however, it's not clear that they can easily be refactored. maybe a generator would be useful... hmm. let me think about this one. - did I take out the initialisation of frame.f_lineno entirely? oops. probably set it to co_firstlineno. - fixed trace.py docstring. reusing code not all that easy, because different uses have subtly different requirements. hmm, inspect.getlineno is now pointless... was there any particular reason you unassigned the patch? Just curious as the main reason it was assigned to Guido was to make sure he saw it. ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-04 19:50 Message: Logged In: YES user_id=33168 Attached is a patch to test_hotshot which is necessary because the duplicate line #s at the beginning of the function no longer exist. With this patch test_hotshot passes. ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-03 20:06 Message: Logged In: YES user_id=33168 Overall, the patch looks very good. 
It would be nice to check in some of the changes separate from this patch: * com_addoparg(SET_LINENO) -> com_set_lineno() * the updated comments and whitespace * pystone * dis.py, disassemble(), perhaps * RETURN_NONE, perhaps Here's other minor comments: * Need to add RETURN_NONE opcode into dis.py * Need to add doc for RETURN_NONE in libdis.tex * It would be nice if the code from inspect.getlineno could be reused in disassemble() (not sure this can be done, though--i saw at least 4 variations of this code, 2 in python and 2 in C) * Should frame.f_lineno be initialized to 0, -1 or something invalid? * scripts/trace.py reimplements getting the lineno, why not reuse code? * docstring in scripts/trace._find_LINENO_from_code needs to remove ref to SET_LINENO ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-03 15:52 Message: Logged In: YES user_id=6656 Update. New this time: - adds RETURN_NONE opcode for the epilogue. - don't call line trace events on it. - bumps MAGIC this is the output of "cvs diff" at the top level, so it's all in there. ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-02 13:34 Message: Logged In: YES user_id=6656 Here's an update (just for ceval.c): - moves tracing code out of line - more expository comments - don't cause line trace events in the function epilogue ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-01 15:43 Message: Logged In: YES user_id=6656 Hang on, let's get all the doc changes: Doc/lib/libdis.tex Doc/tut/tut.tex Doc/whatsnew/whatsnew23.tex Misc/NEWS ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-01 15:41 Message: Logged In: YES user_id=6656 And finally, docs. 
This touches: Doc/lib/libdis.tex Doc/tut/tut.tex ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-01 15:40 Message: Logged In: YES user_id=6656 Here's the "boring" patch, more mechanical stuff. It touches: Include/opcode.h Lib/traceback.py Modules/_hotshot.c Python/frozen.c Python/traceback.c This should still be reviewed, of course, but I really shouldn't have messed any of this up. ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-01 15:38 Message: Logged In: YES user_id=6656 Yeah, the obvious response to that was "delete your .pycs"... Right, I'm going to upload my latest, and hopefully final patch in three bits. First, I'm attaching the "interesting" patch. This touches: Objects/frameobject.c -- adds f_lineno descr Python/ceval.c -- the obvious Python/compile.c -- don't emit SET_LINENO Lib/dis.py -- use c_lnotab for line numbers Tools/scripts/trace.py -- ditto This is the most subtle patch, and the one that I'd most like review on. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-07-30 22:09 Message: Logged In: YES user_id=6380 Never mind the errors, I hadn't done a cvs update in weeks on the machine where I tested it. :-( ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-07-30 21:54 Message: Logged In: YES user_id=6380 Cool idea, but I get "unknown opcode" errors... Keep me posted though! (I will now see any changes to this patch item.) ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-07-30 09:59 Message: Logged In: YES user_id=6656 Guido should see this, assuming he still isn't subscribed to patches@python.org. 
---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-07-30 09:58 Message: Logged In: YES user_id=6656 I worked out why some of the code in ceval.c wasn't making sense to me -- it didn't make sense, period. I've also fixed a number of silly and not so silly bugs in my patch. I'm now 99% certain this idea can fly. The patch isn't *finished* but the hard bit is done, IMHO. There are some other points to make, but I think I'll raise them on python-dev. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-07-29 20:18 Message: Logged In: YES user_id=31435 Dropping out of "vacation mode" long enough to say "mondo cool!" and encourage this. Guido may not agree, but I also encourage you to redefine c_lnotab if it can make life easier and quicker here. That subtle compression scheme has been the source of several nasty bugs, both in the core C code and in Jeremy's compiler pkg (cut 'n paste bugs abound here, because few people understand what's really needed, so flawed code gets copied with little thought). ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-07-29 15:34 Message: Logged In: YES user_id=6656 Uhh, the instr_[lu]b variables need to keep their values around the loop; otherwise we might just as well call PyCode_Addr2Line each time around. I have another version of the patch that does that, but I assumed the overhead of doing so was deemed too high, or someone else would have done this by now. It's certainly easier. Glad to hear it doesn't affect python -O too much. I was doing this away from the internet and forgot to keep a clean copy of the source around for doing comparisons with... 
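The remark about the instr_[lu]b variables keeping their values around the loop can be illustrated with a small Python sketch: cache the range of instruction offsets that belong to the current line, and rescan the table only when the offset leaves that range. It assumes the table is a string of unsigned (offset increment, line increment) byte pairs. LineTracker and line_starts are hypothetical names; the patch itself does this in C in ceval.c, and this sketch makes no claim to match its details.

```python
import sys


def line_starts(lnotab, firstlineno):
    """Expand the byte pairs into [(start_offset, line), ...]."""
    starts = [(0, firstlineno)]
    addr, line = 0, firstlineno
    for i in range(0, len(lnotab), 2):
        addr += lnotab[i]
        line += lnotab[i + 1]
        if starts[-1][0] == addr:
            # Same offset: a continuation pair, so overwrite the line.
            starts[-1] = (addr, line)
        elif starts[-1][1] != line:
            starts.append((addr, line))
    return starts


class LineTracker:
    """Remember the [lb, ub) offset range of the current line."""

    def __init__(self, lnotab, firstlineno):
        self._starts = line_starts(lnotab, firstlineno)
        self._lb = 0
        self._ub = -1  # empty range, so the first call always scans
        self.line = None
        self.lookups = 0  # number of full table scans performed

    def line_for(self, offset):
        if not (self._lb <= offset < self._ub):
            self.lookups += 1
            self._lb, self._ub, self.line = self._scan(offset)
        return self.line

    def _scan(self, offset):
        starts = self._starts
        for idx, (start, line) in enumerate(starts):
            if idx + 1 < len(starts):
                end = starts[idx + 1][0]
            else:
                end = sys.maxsize
            if start <= offset < end:
                return start, end, line
        raise ValueError("offset out of range")


tracker = LineTracker(bytes([6, 1, 8, 1]), firstlineno=10)
print([tracker.line_for(o) for o in (0, 3, 6, 9, 12, 14, 20)])
print(tracker.lookups)
```

If the bounds were reset on every pass through the loop, each instruction would trigger a scan, which is the cost the cached bounds are meant to avoid.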
---------------------------------------------------------------------- Comment By: Neil Schemenauer (nascheme) Date: 2002-07-29 15:18 Message: Logged In: YES user_id=35752 Moving the "int io, instr_ub = -1, instr_lb = 0;" declaration and the "io = INSTR_OFFSET();" statement below the "if (tstate->c_tracefunc ...)" test gives a small speedup on my machine and is a little neater, IMHO. I was worried that this would slow down "python -O". That doesn't seem to be the case (at least I can't measure it). Well done. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=587993&group_id=5470 From noreply@sourceforge.net Mon Aug 5 15:52:00 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 05 Aug 2002 07:52:00 -0700 Subject: [Patches] [ python-Patches-580331 ] xreadlines caching, file iterator Message-ID: Patches item #580331, was opened at 2002-07-11 17:45 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=580331&group_id=5470 Category: Core (C code) Group: Python 2.3 Status: Open Resolution: None Priority: 3 Submitted By: Oren Tirosh (orenti) Assigned to: Guido van Rossum (gvanrossum) Summary: xreadlines caching, file iterator Initial Comment: Calling f.xreadlines() multiple times returns the same xreadlines object. A file is an iterator - __iter__() returns self and next() calls the cached xreadlines object's next method. ---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-05 10:52 Message: Logged In: YES user_id=6380 This begins to look good. What's a normal text file? One with a million bytes? :-) Have you made sure this works as expected in Universal newline mode? I'd like a patch that doesn't use #define WITH_READAHEAD_BUFFER. 
You might also experiment with larger buffer sizes (I predict that a larger buffer doesn't make much difference, since it didn't for xreadlines, but it would be nice to verify that and then add a comment; at least once a year someone asks whether the buffer shouldn't be much larger). ---------------------------------------------------------------------- Comment By: Oren Tirosh (orenti) Date: 2002-08-05 02:27 Message: Logged In: YES user_id=562624 The version of the patch still makes a file an iterator but it no longer depends on xreadlines - it implements the readahead buffering inside the file object. It is about 19% faster than xreadlines for normal text files and about 40% faster for files with 100k lines. The methods readline and read do not use this readahead mechanism because it skews the current file position (just like xreadlines does). ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-07-17 13:50 Message: Logged In: YES user_id=6380 Alas, there's a fatal flaw. The file object and the xreadlines object now both have pointers to each other, creating an unbreakable cycle (since neither participates in GC). Weak refs can't be used to resolve this dilemma. I personally think that's enough to just stick with the status quo (I was never more than +0 on the idea of making the file an iterator anyway). But I'll leave it to Oren to come up with another hack (please use this same SF patch). Oren, if you'd like to give up, please say so and I'll close the item in a jiffy. In fact, I positively encourage you to give up. But I don't expect you to take this offer. :-) ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-07-16 21:33 Message: Logged In: YES user_id=6380 I'm reviewing this and will check it in, or something like it (probably). 
---------------------------------------------------------------------- Comment By: Oren Tirosh (orenti) Date: 2002-07-16 01:26 Message: Logged In: YES user_id=562624 Now invalidates cache on a seek. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-07-15 10:38 Message: Logged In: YES user_id=6380 I posted some comments to python-dev. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=580331&group_id=5470 From noreply@sourceforge.net Mon Aug 5 16:08:12 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 05 Aug 2002 08:08:12 -0700 Subject: [Patches] [ python-Patches-590913 ] PEP 263 support in IDLE Message-ID: Patches item #590913, was opened at 2002-08-05 01:56 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=590913&group_id=5470 Category: IDLE Group: None >Status: Closed Resolution: Accepted Priority: 5 Submitted By: Martin v. Löwis (loewis) Assigned to: Martin v. Löwis (loewis) Summary: PEP 263 support in IDLE Initial Comment: This patch adds the notion of encodings to IDLE. In particular: - it tries to determine the locale's encoding (falling back to ASCII if that fails, or no codec is found) - looks for PEP 263 encoding specs when reading and writing files (producing errors when the encoding spec is wrong) - produces error dialogs when new files have non-ASCII, but no declared encoding - assumes the locale's encoding when a non-ASCII file is opened, uses the same encoding when the file is later saved again, - falls back to letting Tcl deal with decoding when decoding fails, - falls back to saving as UTF-8 when encoding fails (so perhaps the errors should all be infos instead) - applies the locale's encoding in the interactive window. 
This is not a violation of PEP 263, instead, it just changes the encoding of the interactive shell from "unicode" to the locale's encoding - probably similar to what all other terminals do. ---------------------------------------------------------------------- >Comment By: Martin v. Löwis (loewis) Date: 2002-08-05 17:08 Message: Logged In: YES user_id=21627 I have corrected these problems, and committed the patch as CallTips.py 1.10 IOBinding.py 1.8 PyShell.py 1.38 ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-05 16:36 Message: Logged In: YES user_id=6380 This looks good. My only concerns are minor style issues: importing several modules with one import statement, and two unqualified except clauses that don't explain what can go wrong (there's one that has extensive comments, that's good). ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=590913&group_id=5470 From noreply@sourceforge.net Mon Aug 5 16:20:15 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 05 Aug 2002 08:20:15 -0700 Subject: [Patches] [ python-Patches-590513 ] Add popen2 like functionality to pty.py. Message-ID: Patches item #590513, was opened at 2002-08-03 10:16 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=590513&group_id=5470 Category: Library (Lib) Group: Python 2.2.x Status: Open Resolution: None Priority: 5 Submitted By: Rasjid Wilcox (rasjidw) Assigned to: Nobody/Anonymous (nobody) Summary: Add popen2 like functionality to pty.py. Initial Comment: This patch adds a popen2 like function to pty.py. Due to use of os.execlp in pty.spawn, it is not quite the same, as all the arguments (including the command to be run) must be passed as a tuple. 
It is only a first draft, and may need some more work, which I am willing to do if some direction is indicated. Tested on Python2.2, under RedHat Linux 7.3. Rasjid. ---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-05 11:20 Message: Logged In: YES user_id=6380 I mostly concur with Martin von Loewis's comments, though I'm not sure this is big enough for a PEP. I think that you're right in answering (yes, no, yes) but I have no information (portability of this module is already limited to IRIX and Linux, according to the docs). The docs use the word "baffle" -- I wonder if you could substitute something else or generally clarify that sentence; it's not very clear from the docs what spawn() does (nor what fork() does, to tell the truth -- all these could really use some examples). ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-08-05 07:44 Message: Logged In: YES user_id=21627 I'm not a regular pty user. Please ask those questions in comp.lang.python, and python-dev. You can also ask previous authors to pty for comments. Uncertainty in such areas might be a hint that a library PEP is needed, to justify the rationale for all the details. There is no need to hurry - Python 2.3 is still months away. That said, I do think that this functionality is desirable, so I'd encourage you to complete this task. ---------------------------------------------------------------------- Comment By: Rasjid Wilcox (rasjidw) Date: 2002-08-05 07:34 Message: Logged In: YES user_id=39640 Before I do docs etc, I have a few questions: 1. I could make it more popen2 like by changing the args to def popen2(cmd, ....) and adding argv=('/bin/sh','-c',cmd) Is this a better idea? Does it reduce portability? Is it safe to assume that all posix systems have /bin/sh? (My guess is yes, no and yes.) 2. 
Should the threading done in the pty.popen2 function be moved to a separate function, to allow more direct access to spawn. (The current spawn function does not return until the child exits or the parent closes the pipe). 3. Should I worry about how keyboard interrupts are handled? In some cases an uncontrolled process may be left hanging around. Or is it the job of the calling process to deal with that? Lastly, I am away for a week from Wednesday, so I won't be able to do much until I get back, but I will try and finish this off then. Cheers, Rasjid. ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-08-04 04:56 Message: Logged In: YES user_id=21627 Can you please write documentation and a test case? ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=590513&group_id=5470 From noreply@sourceforge.net Mon Aug 5 16:22:18 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 05 Aug 2002 08:22:18 -0700 Subject: [Patches] [ python-Patches-580331 ] xreadlines caching, file iterator Message-ID: Patches item #580331, was opened at 2002-07-11 21:45 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=580331&group_id=5470 Category: Core (C code) Group: Python 2.3 Status: Open Resolution: None Priority: 3 Submitted By: Oren Tirosh (orenti) Assigned to: Guido van Rossum (gvanrossum) Summary: xreadlines caching, file iterator Initial Comment: Calling f.xreadlines() multiple times returns the same xreadlines object. A file is an iterator - __iter__() returns self and next() calls the cached xreadlines object's next method. ---------------------------------------------------------------------- >Comment By: Oren Tirosh (orenti) Date: 2002-08-05 15:22 Message: Logged In: YES user_id=562624 > What's a normal text file? One with a million bytes? 
:-) I meant 100kBYTE lines... Some apps actually use such long lines. Yes, it works just fine with universal newlines. Ok, the #ifdefs will go. Strange, a bigger buffer seems to actually slow it down... I'll have to investigate this further. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-05 14:52 Message: Logged In: YES user_id=6380 This begins to look good. What's a normal text file? One with a million bytes? :-) Have you made sure this works as expected in Universal newline mode? I'd like a patch that doesn't use #define WITH_READAHEAD_BUFFER. You might also experiment with larger buffer sizes (I predict that a larger buffer doesn't make much difference, since it didn't for xreadlines, but it would be nice to verify that and then add a comment; at least once a year someone asks whether the buffer shouldn't be much larger). ---------------------------------------------------------------------- Comment By: Oren Tirosh (orenti) Date: 2002-08-05 06:27 Message: Logged In: YES user_id=562624 The version of the patch still makes a file an iterator but it no longer depends on xreadlines - it implements the readahead buffering inside the file object. It is about 19% faster than xreadlines for normal text files and about 40% faster for files with 100k lines. The methods readline and read do not use this readahead mechanism because it skews the current file position (just like xreadlines does). ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-07-17 17:50 Message: Logged In: YES user_id=6380 Alas, there's a fatal flaw. The file object and the xreadlines object now both have pointers to each other, creating an unbreakable cycle (since neither participates in GC). Weak refs can't be used to resolve this dilemma. 
I personally think that's enough to just stick with the status quo (I was never more than +0 on the idea of making the file an iterator anyway). But I'll leave it to Oren to come up with another hack (please use this same SF patch). Oren, if you'd like to give up, please say so and I'll close the item in a jiffy. In fact, I positively encourage you to give up. But I don't expect you to take this offer. :-) ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-07-17 01:33 Message: Logged In: YES user_id=6380 I'm reviewing this and will check it in, or something like it (probably). ---------------------------------------------------------------------- Comment By: Oren Tirosh (orenti) Date: 2002-07-16 05:26 Message: Logged In: YES user_id=562624 Now invalidates cache on a seek. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-07-15 14:38 Message: Logged In: YES user_id=6380 I posted some comments to python-dev. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=580331&group_id=5470 From noreply@sourceforge.net Mon Aug 5 16:29:03 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 05 Aug 2002 08:29:03 -0700 Subject: [Patches] [ python-Patches-580331 ] xreadlines caching, file iterator Message-ID: Patches item #580331, was opened at 2002-07-11 17:45 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=580331&group_id=5470 Category: Core (C code) Group: Python 2.3 Status: Open >Resolution: Accepted >Priority: 5 Submitted By: Oren Tirosh (orenti) Assigned to: Guido van Rossum (gvanrossum) Summary: xreadlines caching, file iterator Initial Comment: Calling f.xreadlines() multiple times returns the same xreadlines object. 
A file is an iterator - __iter__() returns self and next() calls the cached xreadlines object's next method. ---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-05 11:29 Message: Logged In: YES user_id=6380 OK, I'll await a new patch. ---------------------------------------------------------------------- Comment By: Oren Tirosh (orenti) Date: 2002-08-05 11:22 Message: Logged In: YES user_id=562624 > What's a normal text file? One with a million bytes? :-) I meant 100kBYTE lines... Some apps actually use such long lines. Yes, it works just fine with universal newlines. Ok, the #ifdefs will go. Strange, a bigger buffer seems to actually slow it down... I'll have to investigate this further. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-05 10:52 Message: Logged In: YES user_id=6380 This begins to look good. What's a normal text file? One with a million bytes? :-) Have you made sure this works as expected in Universal newline mode? I'd like a patch that doesn't use #define WITH_READAHEAD_BUFFER. You might also experiment with larger buffer sizes (I predict that a larger buffer doesn't make much difference, since it didn't for xreadlines, but it would be nice to verify that and then add a comment; at least once a year someone asks whether the buffer shouldn't be much larger). ---------------------------------------------------------------------- Comment By: Oren Tirosh (orenti) Date: 2002-08-05 02:27 Message: Logged In: YES user_id=562624 The version of the patch still makes a file an iterator but it no longer depends on xreadlines - it implements the readahead buffering inside the file object. It is about 19% faster than xreadlines for normal text files and about 40% faster for files with 100k lines. 
The methods readline and read do not use this readahead mechanism because it skews the current file position (just like xreadlines does). ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-07-17 13:50 Message: Logged In: YES user_id=6380 Alas, there's a fatal flaw. The file object and the xreadlines object now both have pointers to each other, creating an unbreakable cycle (since neither participates in GC). Weak refs can't be used to resolve this dilemma. I personally think that's enough to just stick with the status quo (I was never more than +0 on the idea of making the file an iterator anyway). But I'll leave it to Oren to come up with another hack (please use this same SF patch). Oren, if you'd like to give up, please say so and I'll close the item in a jiffy. In fact, I positively encourage you to give up. But I don't expect you to take this offer. :-) ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-07-16 21:33 Message: Logged In: YES user_id=6380 I'm reviewing this and will check it in, or something like it (probably). ---------------------------------------------------------------------- Comment By: Oren Tirosh (orenti) Date: 2002-07-16 01:26 Message: Logged In: YES user_id=562624 Now invalidates cache on a seek. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-07-15 10:38 Message: Logged In: YES user_id=6380 I posted some comments to python-dev.
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=580331&group_id=5470 From noreply@sourceforge.net Mon Aug 5 17:14:05 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 05 Aug 2002 09:14:05 -0700 Subject: [Patches] [ python-Patches-590294 ] os._execvpe security fix Message-ID: Patches item #590294, was opened at 2002-08-02 14:21 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=590294&group_id=5470 Category: Modules Group: Python 2.3 Status: Open >Resolution: Accepted Priority: 5 Submitted By: Zack Weinberg (zackw) Assigned to: Guido van Rossum (gvanrossum) Summary: os._execvpe security fix Initial Comment: 1) Do not attempt to exec a file which does not exist just to find out what error the operating system returns. This is an exploitable race on all platforms that support symbolic links. 2) Immediately re-raise the exception if we get an error other than errno.ENOENT or errno.ENOTDIR. This may need to be adapted for other platforms. (As a security issue, this should be considered for 2.1 and 2.2 as well as 2.3.) ---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-05 12:14 Message: Logged In: YES user_id=6380 OK, checked in for 2.3. Keeping this open until I find the time to backport it to 2.2 and 2.1 (or someone else does that). 
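[Editorial sketch, not part of the original message or the checked-in patch.] The two points in the initial comment (never exec a nonexistent file just to learn the error, and re-raise at once on any error other than ENOENT or ENOTDIR) can be illustrated roughly as below. The function name `execvpe_sketch` and the injectable `exec_func` parameter are hypothetical, added only so the search logic can be exercised without replacing the running process; the real `os._execvpe` differs in detail.

```python
import errno
import os


def execvpe_sketch(file, args, env, exec_func=os.execve):
    """Search PATH for `file` and exec it (illustrative only).

    `exec_func` is a hypothetical hook for testing; it defaults to
    os.execve, which does not return on success.
    """
    # A name containing a directory part is used as-is, with no PATH search.
    if os.path.dirname(file):
        exec_func(file, args, env)
        return

    last_error = None
    for directory in env.get("PATH", os.defpath).split(os.pathsep):
        candidate = os.path.join(directory, file)
        try:
            exec_func(candidate, args, env)
            return
        except OSError as exc:
            # Only "not found" errors mean "try the next directory".
            # Anything else (e.g. EACCES) is re-raised immediately.
            if exc.errno not in (errno.ENOENT, errno.ENOTDIR):
                raise
            last_error = exc

    if last_error is not None:
        raise last_error
    raise FileNotFoundError(errno.ENOENT, os.strerror(errno.ENOENT), file)
```

The point of the second rule is that an unexpected error from an early PATH entry is reported as-is instead of being masked by a later "not found".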
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=590294&group_id=5470 From noreply@sourceforge.net Mon Aug 5 17:14:25 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 05 Aug 2002 09:14:25 -0700 Subject: [Patches] [ python-Patches-580331 ] xreadlines caching, file iterator Message-ID: Patches item #580331, was opened at 2002-07-11 17:45 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=580331&group_id=5470 Category: Core (C code) Group: Python 2.3 Status: Open Resolution: Accepted Priority: 5 Submitted By: Oren Tirosh (orenti) Assigned to: Guido van Rossum (gvanrossum) Summary: xreadlines caching, file iterator Initial Comment: Calling f.xreadlines() multiple times returns the same xreadlines object. A file is an iterator - __iter__() returns self and next() calls the cached xreadlines object's next method. ---------------------------------------------------------------------- >Comment By: Tim Peters (tim_one) Date: 2002-08-05 12:14 Message: Logged In: YES user_id=31435 Just FYI, in apps that do "read + process" in a loop, a small buffer is often faster because the data has a decent shot at staying in L1 cache. Make the buffer very large (100s of Kb), and it won't even stay in L2 cache. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-05 11:29 Message: Logged In: YES user_id=6380 OK, I'll await a new patch. ---------------------------------------------------------------------- Comment By: Oren Tirosh (orenti) Date: 2002-08-05 11:22 Message: Logged In: YES user_id=562624 > What's a normal text file? One with a million bytes? :-) I meant 100kBYTE lines... Some apps actually use such long lines. Yes, it works just fine with universal newlines. Ok, the #ifdefs will go. 
Strange, a bigger buffer seems to actually slow it down... I'll have to investigate this further. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-05 10:52 Message: Logged In: YES user_id=6380 This begins to look good. What's a normal text file? One with a million bytes? :-) Have you made sure this works as expected in Universal newline mode? I'd like a patch that doesn't use #define WITH_READAHEAD_BUFFER. You might also experiment with larger buffer sizes (I predict that a larger buffer doesn't make much difference, since it didn't for xreadlines, but it would be nice to verify that and then add a comment; at least once a year someone asks whether the buffer shouldn't be much larger). ---------------------------------------------------------------------- Comment By: Oren Tirosh (orenti) Date: 2002-08-05 02:27 Message: Logged In: YES user_id=562624 The version of the patch still makes a file an iterator but it no longer depends on xreadlines - it implements the readahead buffering inside the file object. It is about 19% faster than xreadlines for normal text files and about 40% faster for files with 100k lines. The methods readline and read do not use this readahead mechanism because it skews the current file position (just like xreadlines does). ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-07-17 13:50 Message: Logged In: YES user_id=6380 Alas, there's a fatal flaw. The file object and the xreadlines object now both have pointers to each other, creating an unbreakable cycle (since neither participates in GC). Weak refs can't be used to resolve this dilemma. I personally think that's enough to just stick with the status quo (I was never more than +0 on the idea of making the file an iterator anyway). But I'll leave it to Oren to come up with another hack (please use this same SF patch).
Oren, if you'd like to give up, please say so and I'll close the item in a jiffy. In fact, I positively encourage you to give up. But I don't expect you to take this offer. :-) ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-07-16 21:33 Message: Logged In: YES user_id=6380 I'm reviewing this and will check it in, or something like it (probably). ---------------------------------------------------------------------- Comment By: Oren Tirosh (orenti) Date: 2002-07-16 01:26 Message: Logged In: YES user_id=562624 Now invalidates cache on a seek. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-07-15 10:38 Message: Logged In: YES user_id=6380 I posted some comments to python-dev. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=580331&group_id=5470 From noreply@sourceforge.net Mon Aug 5 17:48:13 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 05 Aug 2002 09:48:13 -0700 Subject: [Patches] [ python-Patches-589982 ] tempfile.py rewrite Message-ID: Patches item #589982, was opened at 2002-08-02 02:38 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=589982&group_id=5470 >Category: Library (Lib) Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Zack Weinberg (zackw) Assigned to: Nobody/Anonymous (nobody) Summary: tempfile.py rewrite Initial Comment: This rewrite closes a number of security-relevant races in tempfile.py; makes temporary filenames much harder to guess; provides secure interfaces that can be used to close similar races elsewhere; and makes it possible to control the prefix and directory of each temporary created, individually. 
---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-05 12:48 Message: Logged In: YES user_id=6380 I like the idea of fixing security holes. This patch is *humungous*. Even just the doc changes and the changes to tempfile.py itself are massive and require very careful reading to review all the consequences. Zack, can you try to interest someone with more time than me in reviewing this patch? What's the point of renaming all imports with a leading underscore? I thought __all__ took care of that. ---------------------------------------------------------------------- Comment By: Zack Weinberg (zackw) Date: 2002-08-03 02:53 Message: Logged In: YES user_id=580015 I've revised the patch; ignore the old one. This version includes a vastly expanded test_tempfile.py which hits every line that I know how to test. The omissions are marked - it's mostly non-Unix issues. Also, I went through the entire CVS repository and replaced all uses of tempfile.mktemp with mkstemp/mkdtemp/NamedTemporaryFile, as appropriate. The sole exception is Lib/os.py, which is addressed by patch #590294. The sole functional change to tempfile.py itself, from the previous, is to throw os.O_NOFOLLOW into the open flags. This closes yet another hole - on some systems, without this flag, open(file, O_CREAT|O_EXCL) will follow a symbolic link that points to a nonexistent file, and create the link target. (This has no effect on a symlink in the directory components of the pathname - if the sysadmin has symlinked /tmp to /hugedisk/scratch, that still works.) ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-02 10:45 Message: Logged In: YES user_id=6380 This needs some serious review! Volunteers??? 
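[Editorial sketch, not part of the original messages.] The interfaces named in this thread (`tempfile.mkstemp`, `tempfile.mkdtemp`) and the exclusive-create flags Zack describes can be illustrated as follows. The helper names `create_exclusive` and `demo` are hypothetical; `O_NOFOLLOW` is looked up with `getattr` because it is not defined on every platform, and the rewritten tempfile.py may combine these flags differently.

```python
import os
import shutil
import tempfile


def create_exclusive(path):
    """Create `path` only if it does not already exist (illustrative).

    O_EXCL makes the open fail when the name is already taken; O_NOFOLLOW,
    where available, refuses to follow a symlink at the final component.
    """
    flags = os.O_RDWR | os.O_CREAT | os.O_EXCL
    flags |= getattr(os, "O_NOFOLLOW", 0)  # absent on some platforms
    return os.open(path, flags, 0o600)


def demo():
    """Create a private scratch directory and a temp file inside it."""
    scratch = tempfile.mkdtemp(prefix="demo-dir-")
    try:
        # mkstemp returns an already-open descriptor plus the file name,
        # so there is no window between choosing a name and creating it.
        fd, name = tempfile.mkstemp(prefix="demo-", suffix=".tmp", dir=scratch)
        os.close(fd)

        # A second exclusive create on the same name must fail.
        try:
            os.close(create_exclusive(name))
            collided = False
        except FileExistsError:
            collided = True
        return os.path.basename(name), os.path.dirname(name) == scratch, collided
    finally:
        shutil.rmtree(scratch)
```

Unlike `tempfile.mktemp`, which only returns a name, this pattern never hands back a path that another process could claim first.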
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=589982&group_id=5470 From noreply@sourceforge.net Mon Aug 5 20:09:36 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 05 Aug 2002 12:09:36 -0700 Subject: [Patches] [ python-Patches-591250 ] str1 in str2 for len(str1) >= 1 Message-ID: Patches item #591250, was opened at 2002-08-05 15:09 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=591250&group_id=5470 Category: Core (C code) Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Neal Norwitz (nnorwitz) Assigned to: Nobody/Anonymous (nobody) Summary: str1 in str2 for len(str1) >= 1 Initial Comment: Here's a patch to implement and test str1 in str2, when str1 is more than a single character. The doc and unicode still need to be updated. There is a test. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=591250&group_id=5470 From noreply@sourceforge.net Mon Aug 5 20:17:41 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 05 Aug 2002 12:17:41 -0700 Subject: [Patches] [ python-Patches-591250 ] str1 in str2 for len(str1) >= 1 Message-ID: Patches item #591250, was opened at 2002-08-05 15:09 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=591250&group_id=5470 Category: Core (C code) Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Neal Norwitz (nnorwitz) >Assigned to: Barry A. Warsaw (bwarsaw) Summary: str1 in str2 for len(str1) >= 1 Initial Comment: Here's a patch to implement and test str1 in str2, when str1 is more than a single character. The doc and unicode still need to be updated. There is a test. 
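[Editorial sketch, not part of the original messages.] The behaviour being proposed, `str1 in str2` for a multi-character `str1`, amounts to a substring search. A naive pure-Python version is shown below for illustration; the function name `contains` is hypothetical, and the actual patch is C code in the string object. The handling of an empty left operand follows present-day Python (an empty string is found in every string), which is an assumption relative to this thread, where that case is still under discussion.

```python
def contains(haystack, needle):
    """Return True if `needle` occurs anywhere in `haystack` (naive scan)."""
    n, m = len(haystack), len(needle)
    # Try every start position at which `needle` could still fit.
    for start in range(n - m + 1):
        if haystack[start:start + m] == needle:
            return True
    return False


# Usage: these agree with the built-in operator in current Python.
print(contains("mergesort", "sort"), "sort" in "mergesort")
print(contains("mergesort", "heap"), "heap" in "mergesort")
```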
---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-05 15:17 Message: Logged In: YES user_id=6380 Let Barry collect the patches. (He sent one in private email too.) Does this do Unicode? ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=591250&group_id=5470 From noreply@sourceforge.net Mon Aug 5 20:39:12 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 05 Aug 2002 12:39:12 -0700 Subject: [Patches] [ python-Patches-591250 ] str1 in str2 for len(str1) >= 1 Message-ID: Patches item #591250, was opened at 2002-08-05 15:09 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=591250&group_id=5470 Category: Core (C code) Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Neal Norwitz (nnorwitz) Assigned to: Barry A. Warsaw (bwarsaw) Summary: str1 in str2 for len(str1) >= 1 Initial Comment: Here's a patch to implement and test str1 in str2, when str1 is more than a single character. The doc and unicode still need to be updated. There is a test. ---------------------------------------------------------------------- >Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-05 15:39 Message: Logged In: YES user_id=33168 Went a little further. Add unicode support. Where should the string/unicode tests go? string tests are in string_tests.py now, but not sure that is appropriate. Still needs: * unicode test * doc - where? ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-05 15:17 Message: Logged In: YES user_id=6380 Let Barry collect the patches. (He sent one in private email too.) Does this do Unicode? 
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=591250&group_id=5470 From noreply@sourceforge.net Mon Aug 5 21:01:19 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 05 Aug 2002 13:01:19 -0700 Subject: [Patches] [ python-Patches-580331 ] xreadlines caching, file iterator Message-ID: Patches item #580331, was opened at 2002-07-11 21:45 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=580331&group_id=5470 Category: Core (C code) Group: Python 2.3 Status: Open Resolution: Accepted Priority: 5 Submitted By: Oren Tirosh (orenti) Assigned to: Guido van Rossum (gvanrossum) Summary: xreadlines caching, file iterator Initial Comment: Calling f.xreadlines() multiple times returns the same xreadlines object. A file is an iterator - __iter__() returns self and next() calls the cached xreadlines object's next method. ---------------------------------------------------------------------- >Comment By: Oren Tirosh (orenti) Date: 2002-08-05 20:01 Message: Logged In: YES user_id=562624 Updated patch. What to do about the xreadlines method? The patch doesn't touch it but it could be made an alias to __iter__ and the dependency of file objects on the xreadlines module will be eliminated. On my linux machine the highest performance is achieved for buffer sizes somewhere around 4096-8192. Higher or lower values are significantly slower. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-08-05 16:14 Message: Logged In: YES user_id=31435 Just FYI, in apps that do "read + process" in a loop, a small buffer is often faster because the data has a decent shot at staying in L1 cache. Make the buffer very large (100s of Kb), and it won't even stay in L2 cache.
---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-05 15:29 Message: Logged In: YES user_id=6380 OK, I'll await a new patch. ---------------------------------------------------------------------- Comment By: Oren Tirosh (orenti) Date: 2002-08-05 15:22 Message: Logged In: YES user_id=562624 > What's a normal text file? One with a million bytes? :-) I meant 100kBYTE lines... Some apps actually use such long lines. Yes, it works just fine with universal newlines. Ok, the #ifdefs will go. Strange, a bigger buffer seems to actually slow it down... I'll have to investigate this further. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-05 14:52 Message: Logged In: YES user_id=6380 This begins to look good. What's a normal text file? One with a million bytes? :-) Have you made sure this works as expected in Universal newline mode? I'd like a patch that doesn't use #define WITH_READAHEAD_BUFFER. You might also experiment with larger buffer sizes (I predict that a larger buffer doesn't make much difference, since it didn't for xreadlines, but it would be nice to verify that and then add a comment; at least once a year someone asks whether the buffer shouldn't be much larger). ---------------------------------------------------------------------- Comment By: Oren Tirosh (orenti) Date: 2002-08-05 06:27 Message: Logged In: YES user_id=562624 The version of the patch still makes a file an iterator but it no longer depends on xreadlines - it implements the readahead buffering inside the file object. It is about 19% faster than xreadlines for normal text files and about 40% faster for files with 100k lines. The methods readline and read do not use this readahead mechanism because it skews the current file position (just like xreadlines does). 
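[Editorial sketch, not part of the original messages.] The design Oren describes, an object that is its own iterator and serves lines from a private readahead buffer, can be modelled in pure Python as below. The class name `ReadaheadLines` is hypothetical and the real patch is C code inside the file object; the sketch only shows why iteration skews the underlying file position, which is the reason `readline` and `read` are kept off the readahead path.

```python
import io


class ReadaheadLines:
    """Iterate over lines of a binary stream using a readahead buffer."""

    def __init__(self, raw, buffer_size=8192):
        self._raw = raw
        self._buffer_size = buffer_size
        self._pending = b""   # bytes read ahead but not yet handed out
        self._eof = False

    def __iter__(self):
        # The object is its own iterator, as in the patch description.
        return self

    def __next__(self):
        while True:
            newline = self._pending.find(b"\n")
            if newline >= 0:
                line = self._pending[:newline + 1]
                self._pending = self._pending[newline + 1:]
                return line
            if self._eof:
                if self._pending:
                    # Final line without a trailing newline.
                    line, self._pending = self._pending, b""
                    return line
                raise StopIteration
            chunk = self._raw.read(self._buffer_size)
            if chunk:
                self._pending += chunk
            else:
                self._eof = True


# Usage: after one line is returned, the raw stream is already further on.
raw = io.BytesIO(b"one\ntwo\nthree")
lines = ReadaheadLines(raw, buffer_size=8)
first = next(lines)
print(first, raw.tell())
```

Here the first line is 4 bytes long, but the raw stream has already advanced by a full 8-byte chunk, so mixing iteration with direct reads on `raw` would skip data.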
---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-07-17 17:50 Message: Logged In: YES user_id=6380 Alas, there's a fatal flaw. The file object and the xreadlines object now both have pointers to each other, creating an unbreakable cycle (since neither participates in GC). Weak refs can't be used to resolve this dilemma. I personally think that's enough to just stick with the status quo (I was never more than +0 on the idea of making the file an iterator anyway). But I'll leave it to Oren to come up with another hack (please use this same SF patch). Oren, if you'd like to give up, please say so and I'll close the item in a jiffy. In fact, I positively encourage you to give up. But I don't expect you to take this offer. :-) ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-07-17 01:33 Message: Logged In: YES user_id=6380 I'm reviewing this and will check it in, or something like it (probably). ---------------------------------------------------------------------- Comment By: Oren Tirosh (orenti) Date: 2002-07-16 05:26 Message: Logged In: YES user_id=562624 Now invalidates cache on a seek. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-07-15 14:38 Message: Logged In: YES user_id=6380 I posted some comments to python-dev.
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=580331&group_id=5470 From noreply@sourceforge.net Mon Aug 5 21:07:00 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 05 Aug 2002 13:07:00 -0700 Subject: [Patches] [ python-Patches-580331 ] xreadlines caching, file iterator Message-ID: Patches item #580331, was opened at 2002-07-11 17:45 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=580331&group_id=5470 Category: Core (C code) Group: Python 2.3 Status: Open Resolution: Accepted Priority: 5 Submitted By: Oren Tirosh (orenti) Assigned to: Guido van Rossum (gvanrossum) Summary: xreadlines caching, file iterator Initial Comment: Calling f.xreadlines() multiple times returns the same xreadlines object. A file is an iterator - __iter__() returns self and next() calls the cached xreadlines object's next method. ---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-05 16:07 Message: Logged In: YES user_id=6380 Thanks! Making xreadlines an alias for __iter__ sounds about right, for backwards compatibility. Then we should probably deprecate xreadlines, despite the fact that it could be useful for other file-like objects; it's just not a pretty enough interface. ---------------------------------------------------------------------- Comment By: Oren Tirosh (orenti) Date: 2002-08-05 16:01 Message: Logged In: YES user_id=562624 Updated patch. What to do about the xreadlines method? The patch doesn't touch it but it could be made an alias to __iter__ and the dependency of file objects on the xreadlines module will be eliminated. On my linux machine the highest performance is achieved for buffer sizes somewhere around 4096-8192. Higher or lower values are significantly slower.
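[Editorial sketch, not part of the original messages.] Anyone wanting to repeat the buffer-size experiment on their own machine could start from something like the following. It uses the `buffering` argument of the modern built-in `open` as a stand-in for the patch's internal readahead buffer, which is an assumption; the helper names and the sample sizes are hypothetical, and no particular timings are claimed.

```python
import os
import tempfile
import time


def make_sample(lines=20000):
    """Write a throwaway text file and return its path."""
    fd, path = tempfile.mkstemp(suffix=".txt")
    with os.fdopen(fd, "wb") as f:
        for i in range(lines):
            f.write(b"line %d\n" % i)
    return path


def count_lines(path, buffer_size):
    """Iterate over the file line by line with the given buffer size."""
    with open(path, "rb", buffering=buffer_size) as f:
        return sum(1 for _ in f)


def time_buffer_sizes(path, sizes=(512, 4096, 8192, 262144)):
    """Return {buffer_size: (line_count, seconds)} for each size."""
    results = {}
    for size in sizes:
        start = time.perf_counter()
        count = count_lines(path, size)
        results[size] = (count, time.perf_counter() - start)
    return results


if __name__ == "__main__":
    sample = make_sample()
    try:
        for size, (count, seconds) in time_buffer_sizes(sample).items():
            print(size, count, round(seconds, 4))
    finally:
        os.remove(sample)
```

A single pass like this is noisy; repeating each size several times and taking the minimum gives a steadier comparison.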
---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-08-05 12:14 Message: Logged In: YES user_id=31435 Just FYI, in apps that do "read + process" in a loop, a small buffer is often faster because the data has a decent shot at staying in L1 cache. Make the buffer very large (100s of Kb), and it won't even stay in L2 cache. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-05 11:29 Message: Logged In: YES user_id=6380 OK, I'll await a new patch. ---------------------------------------------------------------------- Comment By: Oren Tirosh (orenti) Date: 2002-08-05 11:22 Message: Logged In: YES user_id=562624 > What's a normal text file? One with a million bytes? :-) I meant 100kBYTE lines... Some apps actually use such long lines. Yes, it works just fine with universal newlines. Ok, the #ifdefs will go. Strange, a bigger buffer seems to actually slow it down... I'll have to investigate this further. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-05 10:52 Message: Logged In: YES user_id=6380 This begins to look good. What's a normal text file? One with a million bytes? :-) Have you made sure this works as expected in Universal newline mode? I'd like a patch that doesn't use #define WITH_READAHEAD_BUFFER. You might also experiment with larger buffer sizes (I predict that a larger buffer doesn't make much difference, since it didn't for xreadlines, but it would be nice to verify that and then add a comment; at least once a year someone asks whether the buffer shouldn't be much larger). 
---------------------------------------------------------------------- Comment By: Oren Tirosh (orenti) Date: 2002-08-05 02:27 Message: Logged In: YES user_id=562624 The version of the patch still makes a file an iterator but it no longer depends on xreadlines - it implements the readahead buffering inside the file object. It is about 19% faster than xreadlines for normal text files and about 40% faster for files with 100k lines. The methods readline and read do not use this readahead mechanism because it skews the current file position (just like xreadlines does). ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-07-17 13:50 Message: Logged In: YES user_id=6380 Alas, there's a fatal flaw. The file object and the xreadlines object now both have pointers to each other, creating an unbreakable cycle (since neither participates in GC). Weak refs can't be used to resolve this dilemma. I personally think that's enough to just stick with the status quo (I was never more than +0 on the idea of making the file an iterator anyway). But I'll leave it to Oren to come up with another hack (please use this same SF patch). Oren, if you'd like to give up, please say so and I'll close the item in a jiffy. In fact, I positively encourage you to give up. But I don't expect you to take this offer. :-) ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-07-16 21:33 Message: Logged In: YES user_id=6380 I'm reviewing this and will check it in, or something like it (probably). ---------------------------------------------------------------------- Comment By: Oren Tirosh (orenti) Date: 2002-07-16 01:26 Message: Logged In: YES user_id=562624 Now invalidates cache on a seek.
---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-07-15 10:38 Message: Logged In: YES user_id=6380 I posted some comments to python-dev. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=580331&group_id=5470 From noreply@sourceforge.net Mon Aug 5 21:50:35 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 05 Aug 2002 13:50:35 -0700 Subject: [Patches] [ python-Patches-591250 ] str1 in str2 for len(str1) >= 1 Message-ID: Patches item #591250, was opened at 2002-08-05 15:09 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=591250&group_id=5470 Category: Core (C code) Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Neal Norwitz (nnorwitz) Assigned to: Barry A. Warsaw (bwarsaw) Summary: str1 in str2 for len(str1) >= 1 Initial Comment: Here's a patch to implement and test str1 in str2, when str1 is more than a single character. The doc and unicode still need to be updated. There is a test. ---------------------------------------------------------------------- >Comment By: Barry A. Warsaw (bwarsaw) Date: 2002-08-05 16:50 Message: Logged In: YES user_id=12800 There were some problems with this patch. test_unicode.py already contains some contains tests, which fail with your patch. This fixes them at the expense of not special-casing contains with a single character (in unicode only -- normal strings still have the special case). This version of the patch also adds some more tests. I think the only thing left to do is update the docs. ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-05 15:39 Message: Logged In: YES user_id=33168 Went a little further. Add unicode support. Where should the string/unicode tests go? 
string tests are in string_tests.py now, but not sure that is appropriate. Still needs: * unicode test * doc - where? ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-05 15:17 Message: Logged In: YES user_id=6380 Let Barry collect the patches. (He sent one in private email too.) Does this do Unicode? ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=591250&group_id=5470 From noreply@sourceforge.net Mon Aug 5 21:55:58 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 05 Aug 2002 13:55:58 -0700 Subject: [Patches] [ python-Patches-591250 ] str1 in str2 for len(str1) >= 1 Message-ID: Patches item #591250, was opened at 2002-08-05 15:09 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=591250&group_id=5470 Category: Core (C code) Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Neal Norwitz (nnorwitz) >Assigned to: Fred L. Drake, Jr. (fdrake) Summary: str1 in str2 for len(str1) >= 1 Initial Comment: Here's a patch to implement and test str1 in str2, when str1 is more than a single character. The doc and unicode still need to be updated. There is a test. ---------------------------------------------------------------------- >Comment By: Barry A. Warsaw (bwarsaw) Date: 2002-08-05 16:55 Message: Logged In: YES user_id=12800 I actually think Fred should get in on the act because it's not clear to me where in the docs this should be documented. Assigning to Fred for an answer, then he can re-assign back to me. ---------------------------------------------------------------------- Comment By: Barry A. Warsaw (bwarsaw) Date: 2002-08-05 16:50 Message: Logged In: YES user_id=12800 There were some problems with this patch. test_unicode.py already contains some contains tests, which fail with your patch. 
This fixes them at the expense of not special-casing contains with a single character (in unicode only -- normal strings still have the special case). This version of the patch also adds some more tests. I think the only thing left to do is update the docs. ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-05 15:39 Message: Logged In: YES user_id=33168 Went a little further. Add unicode support. Where should the string/unicode tests go? string tests are in string_tests.py now, but not sure that is appropriate. Still needs: * unicode test * doc - where? ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-05 15:17 Message: Logged In: YES user_id=6380 Let Barry collect the patches. (He sent one in private email too.) Does this do Unicode? ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=591250&group_id=5470 From noreply@sourceforge.net Mon Aug 5 22:01:16 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 05 Aug 2002 14:01:16 -0700 Subject: [Patches] [ python-Patches-591250 ] str1 in str2 for len(str1) >= 1 Message-ID: Patches item #591250, was opened at 2002-08-05 15:09 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=591250&group_id=5470 Category: Core (C code) Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Neal Norwitz (nnorwitz) Assigned to: Fred L. Drake, Jr. (fdrake) Summary: str1 in str2 for len(str1) >= 1 Initial Comment: Here's a patch to implement and test str1 in str2, when str1 is more than a single character. The doc and unicode still need to be updated. There is a test. 
---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-05 17:01 Message: Logged In: YES user_id=6380 I'm still at best -0 on the exception for empty left argument. Fortunately we can easily change that (I'm guessing it's just a matter of removing some code :-) if I don't change my mind. And the special case for 1-char Unicode should probably be restored. Can someone please mail the other person who posted a pointer to a patch to python-dev, to avoid him doing more work on cleaning up his patch? ---------------------------------------------------------------------- Comment By: Barry A. Warsaw (bwarsaw) Date: 2002-08-05 16:55 Message: Logged In: YES user_id=12800 I actually think Fred should get in on the act because it's not clear to me where in the docs this should be documented. Assigning to Fred for an answer, then he can re-assign back to me. ---------------------------------------------------------------------- Comment By: Barry A. Warsaw (bwarsaw) Date: 2002-08-05 16:50 Message: Logged In: YES user_id=12800 There were some problems with this patch. test_unicode.py already contains some contains tests, which fail with your patch. This fixes them at the expense of not special-casing contains with a single character (in unicode only -- normal strings still have the special case). This version of the patch also adds some more tests. I think the only thing left to do is update the docs. ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-05 15:39 Message: Logged In: YES user_id=33168 Went a little further. Add unicode support. Where should the string/unicode tests go? string tests are in string_tests.py now, but not sure that is appropriate. Still needs: * unicode test * doc - where? 
---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-05 15:17 Message: Logged In: YES user_id=6380 Let Barry collect the patches. (He sent one in private email too.) Does this do Unicode? ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=591250&group_id=5470 From noreply@sourceforge.net Mon Aug 5 22:17:47 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 05 Aug 2002 14:17:47 -0700 Subject: [Patches] [ python-Patches-591305 ] Documentation err in bytecode defs. Message-ID: Patches item #591305, was opened at 2002-08-05 16:17 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=591305&group_id=5470 Category: Documentation Group: None Status: Open Resolution: None Priority: 5 Submitted By: Michael Chermside (mcherm) Assigned to: Fred L. Drake, Jr. (fdrake) Summary: Documentation err in bytecode defs. Initial Comment: The error in the definition of the opcode is obvious and easy to understand -- the definition used TOS1 where TOS was intended. However, this is my first use of the patch submission process, and my use of patch manager may be way off. If I've done something silly like submit my diff backward (from new code back to old) or in the wrong format, please drop me an email explaining how to generate a correct patch or where to read about how to do it. 
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=591305&group_id=5470 From noreply@sourceforge.net Mon Aug 5 22:39:03 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 05 Aug 2002 14:39:03 -0700 Subject: [Patches] [ python-Patches-583235 ] make file object an iterator Message-ID: Patches item #583235, was opened at 2002-07-18 02:50 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=583235&group_id=5470 Category: Core (C code) Group: Python 2.3 >Status: Closed >Resolution: Duplicate Priority: 5 Submitted By: Alex Martelli (aleax) Assigned to: Nobody/Anonymous (nobody) Summary: make file object an iterator Initial Comment: As per python-dev discussion july 17 2002 & earlier, I reworked Oren's patch to remove a reference loop between file object and xreadlines object (making the reference xreadl.->fileob non-addref'd when and only when the xreadlines object is being internally held by the fileob), make f.readline interop with f.next (the former delegating to the latter iff f is holding an xreadl. obj), make f.seek remove the xreadl.obj that f is holding (if any), and removing the optimization of caching xreadlines function pointers as static variables in functions of fileobject.c. Also added tests for this functionality to test_file.py. Alex ---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-05 17:39 Message: Logged In: YES user_id=6380 I'll close this as "duplicate". Oren has since submitted a better patch that re-implements the needed buffering in the file object without the need to reference xreadlines at all. That sounds like a better solution. 
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=583235&group_id=5470 From noreply@sourceforge.net Tue Aug 6 00:35:20 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 05 Aug 2002 16:35:20 -0700 Subject: [Patches] [ python-Patches-591305 ] Documentation err in bytecode defs. Message-ID: Patches item #591305, was opened at 2002-08-05 17:17 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=591305&group_id=5470 Category: Documentation >Group: Python 2.3 >Status: Closed >Resolution: Accepted Priority: 5 Submitted By: Michael Chermside (mcherm) >Assigned to: Neal Norwitz (nnorwitz) Summary: Documentation err in bytecode defs. Initial Comment: The error in the definition of the opcode is obvious and easy to understand -- the definition used TOS1 where TOS was intended. However, this is my first use of the patch submission process, and my use of patch manager may be way off. If I've done something silly like submit my diff backward (from new code back to old) or in the wrong format, please drop me an email explaining how to generate a correct patch or where to read about how to do it. ---------------------------------------------------------------------- >Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-05 19:35 Message: Logged In: YES user_id=33168 The patch is not reversed and is a context diff. This is good (although I prefer unified diff). With unified diffs, you can see + and - for the lines which make more sense to me (good to check that you have the order correct). It's easier for me if you do the diff from the src directory, but I'm not sure what others prefer and it's not a big deal. I got a warning from patch: (Stripping trailing CRs from patch.) Checked in as: libdis.tex 1.38 and 1.33.10.3. Thanks! 
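To make the difference between the two diff formats concrete, here is a small sketch using Python's difflib module. The two-line before/after text and the file name libdis.tex.orig are made up for illustration; only libdis.tex comes from the check-in note above.

```python
import difflib

# Hypothetical before/after content standing in for the real doc fix.
old = ["uses TOS1\n", "unchanged line\n"]
new = ["uses TOS\n", "unchanged line\n"]

# Unified diff: removed lines start with "-", added lines with "+".
unified = list(difflib.unified_diff(
    old, new, fromfile="libdis.tex.orig", tofile="libdis.tex"))

# Context diff: the old and new hunks are shown separately,
# and changed lines are marked with "! " in each hunk.
context = list(difflib.context_diff(
    old, new, fromfile="libdis.tex.orig", tofile="libdis.tex"))

print("".join(unified))
print("".join(context))
```

The unified output puts the old and new line next to each other, which makes the direction of the change easy to check; the context output shows the complete "after" hunk on its own.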
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=591305&group_id=5470 From noreply@sourceforge.net Tue Aug 6 01:06:21 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 05 Aug 2002 17:06:21 -0700 Subject: [Patches] [ python-Patches-591250 ] str1 in str2 for len(str1) >= 1 Message-ID: Patches item #591250, was opened at 2002-08-05 15:09 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=591250&group_id=5470 Category: Core (C code) Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Neal Norwitz (nnorwitz) Assigned to: Fred L. Drake, Jr. (fdrake) Summary: str1 in str2 for len(str1) >= 1 Initial Comment: Here's a patch to implement and test str1 in str2, when str1 is more than a single character. The doc and unicode still need to be updated. There is a test. ---------------------------------------------------------------------- >Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-05 20:06 Message: Logged In: YES user_id=33168 I'm not sure what PyUnicode_GET_SIZE() returns if using UCS-2 or UCS-4. Does size need to be size *= sizeof(*lhs), ie size *= sizeof(Py_UNICODE)? ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-05 17:01 Message: Logged In: YES user_id=6380 I'm still at best -0 on the exception for empty left argument. Fortunately we can easily change that (I'm guessing it's just a matter of removing some code :-) if I don't change my mind. And the special case for 1-char Unicode should probably be restored. Can someone please mail the other person who posted a pointer to a patch to python-dev, to avoid him doing more work on cleaning up his patch? ---------------------------------------------------------------------- Comment By: Barry A. 
Warsaw (bwarsaw) Date: 2002-08-05 16:55 Message: Logged In: YES user_id=12800 I actually think Fred should get in on the act because it's not clear to me where in the docs this should be documented. Assigning to Fred for an answer, then he can re-assign back to me. ---------------------------------------------------------------------- Comment By: Barry A. Warsaw (bwarsaw) Date: 2002-08-05 16:50 Message: Logged In: YES user_id=12800 There were some problems with this patch. test_unicode.py already contains some contains tests, which fail with your patch. This fixes them at the expense of not special-casing contains with a single character (in unicode only -- normal strings still have the special case). This version of the patch also adds some more tests. I think the only thing left to do is update the docs. ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-05 15:39 Message: Logged In: YES user_id=33168 Went a little further. Add unicode support. Where should the string/unicode tests go? string tests are in string_tests.py now, but not sure that is appropriate. Still needs: * unicode test * doc - where? ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-05 15:17 Message: Logged In: YES user_id=6380 Let Barry collect the patches. (He sent one in private email too.) Does this do Unicode? 
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=591250&group_id=5470 From noreply@sourceforge.net Tue Aug 6 01:43:18 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 05 Aug 2002 17:43:18 -0700 Subject: [Patches] [ python-Patches-591250 ] str1 in str2 for len(str1) >= 1 Message-ID: Patches item #591250, was opened at 2002-08-05 15:09 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=591250&group_id=5470 Category: Core (C code) Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Neal Norwitz (nnorwitz) Assigned to: Fred L. Drake, Jr. (fdrake) Summary: str1 in str2 for len(str1) >= 1 Initial Comment: Here's a patch to implement and test str1 in str2, when str1 is more than a single character. The doc and unicode still need to be updated. There is a test. ---------------------------------------------------------------------- >Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-05 20:43 Message: Logged In: YES user_id=33168 test_contains.py checks for multi-char strings using in. I attached a patch that removes these checks. ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-05 20:06 Message: Logged In: YES user_id=33168 I'm not sure what PyUnicode_GET_SIZE() returns if using UCS-2 or UCS-4. Does size need to be size *= sizeof(*lhs), ie size *= sizeof(Py_UNICODE)? ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-05 17:01 Message: Logged In: YES user_id=6380 I'm still at best -0 on the exception for empty left argument. Fortunately we can easily change that (I'm guessing it's just a matter of removing some code :-) if I don't change my mind. And the special case for 1-char Unicode should probably be restored. 
Can someone please mail the other person who posted a pointer to a patch to python-dev, to avoid him doing more work on cleaning up his patch? ---------------------------------------------------------------------- Comment By: Barry A. Warsaw (bwarsaw) Date: 2002-08-05 16:55 Message: Logged In: YES user_id=12800 I actually think Fred should get in on the act because it's not clear to me where in the docs this should be documented. Assigning to Fred for an answer, then he can re-assign back to me. ---------------------------------------------------------------------- Comment By: Barry A. Warsaw (bwarsaw) Date: 2002-08-05 16:50 Message: Logged In: YES user_id=12800 There were some problems with this patch. test_unicode.py already contains some contains tests, which fail with your patch. This fixes them at the expense of not special-casing contains with a single character (in unicode only -- normal strings still have the special case). This version of the patch also adds some more tests. I think the only thing left to do is update the docs. ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-05 15:39 Message: Logged In: YES user_id=33168 Went a little further. Add unicode support. Where should the string/unicode tests go? string tests are in string_tests.py now, but not sure that is appropriate. Still needs: * unicode test * doc - where? ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-05 15:17 Message: Logged In: YES user_id=6380 Let Barry collect the patches. (He sent one in private email too.) Does this do Unicode? 
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=591250&group_id=5470 From noreply@sourceforge.net Tue Aug 6 06:03:45 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 05 Aug 2002 22:03:45 -0700 Subject: [Patches] [ python-Patches-580331 ] xreadlines caching, file iterator Message-ID: Patches item #580331, was opened at 2002-07-11 21:45 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=580331&group_id=5470 Category: Core (C code) Group: Python 2.3 Status: Open Resolution: Accepted Priority: 5 Submitted By: Oren Tirosh (orenti) Assigned to: Guido van Rossum (gvanrossum) Summary: xreadlines caching, file iterator Initial Comment: Calling f.xreadlines() multiple times returns the same xreadlines object. A file is an iterator - __iter__() returns self and next() calls the cached xreadlines object's next method. ---------------------------------------------------------------------- >Comment By: Oren Tirosh (orenti) Date: 2002-08-06 05:03 Message: Logged In: YES user_id=562624 ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-05 20:07 Message: Logged In: YES user_id=6380 Thanks! Making xreadlines an alias for __iter__ sounds about right, for backwards compatibility. Then we should probably deprecate xreadlines, despite the fact that it could be useful for other file-like objects; it's just not a pretty enough interface. ---------------------------------------------------------------------- Comment By: Oren Tirosh (orenti) Date: 2002-08-05 20:01 Message: Logged In: YES user_id=562624 Updated patch. What to do about the xreadlines method? The patch doesn't touch it but it could be made an alias to __iter__ and the dependency of file objects on the xreadlines module will be eliminated.
On my linux machine the highest performance is achieved for buffer sizes somewhere around 4096-8192. Higher or lower values are significantly slower. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-08-05 16:14 Message: Logged In: YES user_id=31435 Just FYI, in apps that do "read + process" in a loop, a small buffer is often faster because the data has a decent shot at staying in L1 cache. Make the buffer very large (100s of Kb), and it won't even stay in L2 cache. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-05 15:29 Message: Logged In: YES user_id=6380 OK, I'll await a new patch. ---------------------------------------------------------------------- Comment By: Oren Tirosh (orenti) Date: 2002-08-05 15:22 Message: Logged In: YES user_id=562624 > What's a normal text file? One with a million bytes? :-) I meant 100kBYTE lines... Some apps actually use such long lines. Yes, it works just fine with universal newlines. Ok, the #ifdefs will go. Strange, a bigger buffer seems to actually slow it down... I'll have to investigate this further. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-05 14:52 Message: Logged In: YES user_id=6380 This begins to look good. What's a normal text file? One with a million bytes? :-) Have you made sure this works as expected in Universal newline mode? I'd like a patch that doesn't use #define WITH_READAHEAD_BUFFER. You might also experiment with larger buffer sizes (I predict that a larger buffer doesn't make much difference, since it didn't for xreadlines, but it would be nice to verify that and then add a comment; at least once a year someone asks whether the buffer shouldn't be much larger). 
---------------------------------------------------------------------- Comment By: Oren Tirosh (orenti) Date: 2002-08-05 06:27 Message: Logged In: YES user_id=562624 The version of the patch still makes a file an iterator but it no longer depends on xreadlines - it implements the readahead buffering inside the file object. It is about 19% faster than xreadlines for normal text files and about 40% faster for files with 100k lines. The methods readline and read do not use this readahead mechanism because it skews the current file position (just like xreadlines does). ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-07-17 17:50 Message: Logged In: YES user_id=6380 Alas, there's a fatal flaw. The file object and the xreadlines object now both have pointers to each other, creating an unbreakable cycle (since neither participates in GC). Weak refs can't be used to resolve this dilemma. I personally think that's enough to just stick with the status quo (I was never more than +0 on the idea of making the file an iterator anyway). But I'll leave it to Oren to come up with another hack (please use this same SF patch). Oren, if you'd like to give up, please say so and I'll close the item in a jiffy. In fact, I positively encourage you to give up. But I don't expect you to take this offer. :-) ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-07-17 01:33 Message: Logged In: YES user_id=6380 I'm reviewing this and will check it in, or something like it (probably). ---------------------------------------------------------------------- Comment By: Oren Tirosh (orenti) Date: 2002-07-16 05:26 Message: Logged In: YES user_id=562624 Now invalidates cache on a seek.
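The readahead idea discussed in this item (buffer ahead inside the file object, and throw the buffer away on a seek) can be sketched in modern Python roughly as follows. This is not the C code from the patch: the class name ReadaheadLines, the 8192-byte default, and the use of bytes and __next__ are all assumptions chosen for the sketch.

```python
import io


class ReadaheadLines:
    """Line iterator that reads the underlying file in fixed-size chunks."""

    def __init__(self, raw, bufsize=8192):
        self._raw = raw          # any binary file-like object
        self._bufsize = bufsize  # hypothetical default, see the timing comments
        self._buf = b""          # bytes read ahead but not yet returned

    def __iter__(self):
        return self

    def __next__(self):
        while True:
            newline = self._buf.find(b"\n")
            if newline >= 0:
                # A complete line is buffered: hand it out, keep the rest.
                line = self._buf[:newline + 1]
                self._buf = self._buf[newline + 1:]
                return line
            chunk = self._raw.read(self._bufsize)
            if not chunk:
                if self._buf:
                    # Final line without a trailing newline.
                    line, self._buf = self._buf, b""
                    return line
                raise StopIteration
            self._buf += chunk

    def seek(self, pos):
        # Buffered data no longer matches the file position: discard it.
        self._buf = b""
        self._raw.seek(pos)


raw = io.BytesIO(b"a\nbb\nccc")
reader = ReadaheadLines(raw, bufsize=4)
first = next(reader)
position_after_first = raw.tell()
```

After the first line is returned, the underlying position is already past it by the amount read ahead, which is the skew that keeps readline and read from sharing this buffer.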
---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-07-15 14:38 Message: Logged In: YES user_id=6380 I posted some comments to python-dev. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=580331&group_id=5470 From noreply@sourceforge.net Tue Aug 6 07:55:42 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 05 Aug 2002 23:55:42 -0700 Subject: [Patches] [ python-Patches-572031 ] AUTH method LOGIN for smtplib Message-ID: Patches item #572031, was opened at 2002-06-21 12:27 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=572031&group_id=5470 Category: Library (Lib) Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Gerhard Häring (ghaering) Assigned to: Barry A. Warsaw (bwarsaw) Summary: AUTH method LOGIN for smtplib Initial Comment: Unfortunately, my original SMTP auth patch doesn't work so well in real life. There are two methods to advertise the available auth methods for SMTP servers: old-style: AUTH=method1 method2 ... RFC style: AUTH method1 method2 Microsoft's MUAs are b0rken in that they only understand the old-style method. That's why most SMTP servers are configured to advertise their authentication methods in old-style _and_ new style. There are also some especially broken SMTP servers like old M$ Exchange servers that only show their auth methods via the old style. Also the (sad but true) very widely used M$ Exchange server only supports the LOGIN auth method (I have to use that thing at work, that's why I came up with this patch). Exchange also supports some other proprietary auth methods (NTLM, ...), but we needn't care about these.
My argument is that the Python SMTP AUTH support will get a lot more useful to people if we also support 1) the old-style AUTH= advertisement 2) the LOGIN auth method, which, although not standardized via RFCs and originally invented by Netscape, is still in wide use, and for some servers the only method to use them, so we should support it Please note that in the current implementation, if a server uses the old-style AUTH= method, our SMTP auth support simply breaks because of the esmtp_features parsing. I'm randomly assigning this patch to Barry, because AFAIK he knows a lot about email handling. Assign around as you please :-) ---------------------------------------------------------------------- >Comment By: Gerhard Häring (ghaering) Date: 2002-08-06 08:55 Message: Logged In: YES user_id=163326 Uh-oh. I made a stupid error in the code, sending the username twice. One more lesson I learnt: never use username == password for testing :-/ ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-07-24 15:05 Message: Logged In: YES user_id=21627 In http://sourceforge.net/tracker/?func=detail&atid=105470&aid=581165&group_id=5470 pierslauder reports success with this patch; see his detailed report for remaining problems. ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-07-17 15:39 Message: Logged In: YES user_id=21627 That existing SMTP servers announce LOGIN only in the old-style header is a good reason to support those as well; I hence recommend that this patch is applied.
Microsoft is, strictly speaking, conforming to the RFC by *not* reporting LOGIN in the AUTH header: only registered SASL mechanisms can be announced there, and LOGIN is not registered; see http://www.iana.org/assignments/sasl-mechanisms ---------------------------------------------------------------------- Comment By: Gerhard Häring (ghaering) Date: 2002-07-01 00:34 Message: Logged In: YES user_id=163326 Updated patch. Changes to the previous patch: - Use email.base64MIME.encode to get rid of the added newlines. - Merge old and RFC-style auth methods in self.smtp_features instead of parsing old-style auth lines separately. - Removed example line for changing auth method priorities (we won't list all permutations of auth methods ;-) - Removed superfluous logging call of chosen auth method. - Moved comment about SMTP features syntax into the right place again. ---------------------------------------------------------------------- Comment By: Gerhard Häring (ghaering) Date: 2002-06-30 23:14 Message: Logged In: YES user_id=163326 Martin, the reason why we need to take into account both old and RFC-style auth advertisement is that there are some smtp servers, which advertise different auth mechanisms in the old vs. RFC-style line. In particular, the MS Exchange server that I have to use at work and I think that this is even the default configuration of Exchange 2000. In my case, it advertises its LOGIN method only in the AUTH= line. I'll shortly upload a patch that takes this into account. ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-06-30 18:20 Message: Logged In: YES user_id=21627 I still cannot see why support for the old-style AUTH lines is necessary. If all SMTPds announce their supported mechanisms with both syntaxes, why is it then necessary to even look at the old syntax? I'm all for adding support for the LOGIN method.
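For reference, the client side of the LOGIN exchange amounts to sending the user name and the password as two separate base64 strings, with no embedded newlines. The sketch below shows only that encoding step; the function name login_responses and the UTF-8 encoding are assumptions of this sketch, and it is not the smtplib code from the patch.

```python
import base64


def login_responses(user, password):
    """Return the client lines for an AUTH LOGIN exchange, in order.

    The server is expected to answer each line with a 334 challenge
    before the next one is sent; that dialogue is not modelled here.
    """
    def b64(text):
        # b64encode never inserts line breaks, unlike the legacy
        # line-wrapping encoders that motivated the newline fix.
        # UTF-8 is an assumption made for this sketch.
        return base64.b64encode(text.encode("utf-8")).decode("ascii")

    return ["AUTH LOGIN", b64(user), b64(password)]


# Deliberately different user name and password, so that sending
# the user name twice would be noticed.
lines = login_responses("user", "pass")
print(lines)
```

Testing with a user name that differs from the password is what exposes the "username sent twice" mistake mentioned earlier in this item.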
---------------------------------------------------------------------- Comment By: Barry A. Warsaw (bwarsaw) Date: 2002-06-30 17:59 Message: Logged In: YES user_id=12800 Martin, (some? most?) MUAs post messages by talking directly to their outgoing SMTPd, so that's probably why Gerhard mentions it. On the base64 issue, see the comment in bug #552605, which I just took assignment of. I'll deal with both these bug reports soon. ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-06-30 17:41 Message: Logged In: YES user_id=21627 I cannot understand why the behaviour of MS MUAs is relevant here at all; smtplib only talks to MTAs (or MSAs). If MTAs advertise the AUTH extension in the new syntax in addition to the old syntax, why is it not good to just ignore the old advertisement? Can you point to a specific software package (ideally even a specific host) which fails to interact with the current smtplib correctly? ---------------------------------------------------------------------- Comment By: Jason R. Mastaler (jasonrm) Date: 2002-06-22 05:53 Message: Logged In: YES user_id=85984 A comment on the old-style advertisement. You say that Microsoft's MUAs only understand the old-style method. I haven't found this to be the case. tmda-ofmipd is an outgoing SMTP proxy that supports SMTP authentication, and I only use the RFC style advertisement. This works perfectly well with MS clients like Outlook 2000, and Outlook Express 5. Below is an example of what the advertisement looks like. BTW, no disagreement about supporting the old-style advertisement in smtplib, as I think it's prudent, just making a point. # telnet aguirre 8025 Trying 172.18.3.5... Connected to aguirre.la.mastaler.com. Escape character is '^]'. 220 aguirre.la.mastaler.com ESMTP tmda-ofmipd EHLO aguirre.la.mastaler.com 250-aguirre.la.mastaler.com 250 AUTH LOGIN CRAM-MD5 PLAIN QUIT 221 Bye Connection closed by foreign host.
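A sketch of how a client might collect the advertised mechanisms from an EHLO reply, accepting both the RFC-style "AUTH ..." line and the old-style "AUTH=..." line and merging them, is given below. The function name parse_auth_mechanisms and the mail.example.com reply are hypothetical; this illustrates the merging idea discussed in this item and is not smtplib's actual parser.

```python
def parse_auth_mechanisms(ehlo_reply):
    """Return the AUTH mechanisms found in a raw multi-line EHLO reply.

    Both "AUTH m1 m2" and "AUTH=m1 m2" lines are accepted; mechanisms
    are upper-cased and listed once, in order of first appearance.
    """
    mechanisms = []
    for line in ehlo_reply.splitlines():
        # Skip the three-digit code and its "-" or " " separator.
        feature = line[4:].strip()
        if feature[:5].upper() in ("AUTH ", "AUTH="):
            for name in feature[5:].split():
                name = name.upper()
                if name not in mechanisms:
                    mechanisms.append(name)
    return mechanisms


# Reply taken from the telnet session quoted above.
rfc_only = "250-aguirre.la.mastaler.com\n250 AUTH LOGIN CRAM-MD5 PLAIN"
print(parse_auth_mechanisms(rfc_only))

# Hypothetical server that lists LOGIN only in the old-style line.
mixed = "250-mail.example.com\n250-AUTH CRAM-MD5\n250 AUTH=LOGIN"
print(parse_auth_mechanisms(mixed))
```

Merging the two lines rather than preferring one is what lets a client find LOGIN on servers that advertise it only in the old style.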
---------------------------------------------------------------------- Comment By: Gerhard Häring (ghaering) Date: 2002-06-21 12:43 Message: Logged In: YES user_id=163326 This also includes a slightly modified version of patch #552605. Even better would IMO be to add an additional parameter to base64.encode* and the corresponding binascii functions that avoids the insertion of newline characters. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=572031&group_id=5470 From noreply@sourceforge.net Tue Aug 6 09:32:45 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Tue, 06 Aug 2002 01:32:45 -0700 Subject: [Patches] [ python-Patches-591305 ] Documentation err in bytecode defs. Message-ID: Patches item #591305, was opened at 2002-08-05 21:17 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=591305&group_id=5470 Category: Documentation Group: Python 2.3 Status: Closed Resolution: Accepted Priority: 5 Submitted By: Michael Chermside (mcherm) Assigned to: Neal Norwitz (nnorwitz) Summary: Documentation err in bytecode defs. Initial Comment: The error in the definition of the opcode is obvious and easy to understand -- the definition used TOS1 where TOS was intended. However, this is my first use of the patch submission process, and my use of patch manager may be way off. If I've done something silly like submit my diff backward (from new code back to old) or in the wrong format, please drop me an email explaining how to generate a correct patch or where to read about how to do it. ---------------------------------------------------------------------- >Comment By: Michael Hudson (mwh) Date: 2002-08-06 08:32 Message: Logged In: YES user_id=6656 Not really relevant, but: unified diffs are good when you want to see the changes, context diffs are good when you want to see what you're getting.
During the 221 release process I found I was glad for the insistence on context diffs (to my surprise). ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-05 23:35 Message: Logged In: YES user_id=33168 The patch is not reversed and is a context diff. This is good (although I prefer unified diff). With unified diffs, you can see + and - for the lines which make more sense to me (good to check that you have the order correct). It's easier for me if you do the diff from the src directory, but I'm not sure what others prefer and it's not a big deal. I got a warning from patch: (Stripping trailing CRs from patch.) Checked in as: libdis.tex 1.38 and 1.33.10.3. Thanks! ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=591305&group_id=5470 From noreply@sourceforge.net Tue Aug 6 13:44:28 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Tue, 06 Aug 2002 05:44:28 -0700 Subject: [Patches] [ python-Patches-579841 ] Build MachoPython with 2level namespace Message-ID: Patches item #579841, was opened at 2002-07-10 23:09 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=579841&group_id=5470 Category: Macintosh Group: None >Status: Closed >Resolution: Accepted Priority: 5 Submitted By: Jack Jansen (jackjansen) Assigned to: Jack Jansen (jackjansen) Summary: Build MachoPython with 2level namespace Initial Comment: This patch builds a framework-based Python on OSX without --flat_namespace. In addition the Makefile.pre.in logic for building the temporary framework is slightly reordered to make it more error-proof. The main reason for putting this patch up here is that it was supposed to disallow importing extension modules for a framework-python to be imported into a non-framework-python. 
But unfortunately it does this with a coredump instead of with the expected "Python not initialized (wrong version?)" error message. I would like feedback as to why this is (as other people do get the error message in similar situations). ---------------------------------------------------------------------- >Comment By: Jack Jansen (jackjansen) Date: 2002-08-06 14:44 Message: Logged In: YES user_id=45365 As no feedback has occurred for over 4 weeks and the patch works for me I've checked it in. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=579841&group_id=5470 From noreply@sourceforge.net Tue Aug 6 14:00:05 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Tue, 06 Aug 2002 06:00:05 -0700 Subject: [Patches] [ python-Patches-567296 ] GetFInfo update Message-ID: Patches item #567296, was opened at 2002-06-11 09:49 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=567296&group_id=5470 Category: Macintosh Group: Python 2.2.x >Status: Closed >Resolution: Accepted Priority: 5 Submitted By: Pim Buurman (pimbuur) Assigned to: Jack Jansen (jackjansen) Summary: GetFInfo update Initial Comment: The macfs function GetFInfo fails for directories. This patch uses another C function to grab the structure ---------------------------------------------------------------------- >Comment By: Jack Jansen (jackjansen) Date: 2002-08-06 15:00 Message: Logged In: YES user_id=45365 Checked in as macfsmodule.c rev 1.57. ---------------------------------------------------------------------- Comment By: Pim Buurman (pimbuur) Date: 2002-06-12 08:54 Message: Logged In: YES user_id=157121 Jack, I also thought this would happen, but when I set the Creator and Type of a directory, I get the correct values.
Output: #/Developer/Tools/GetFileInfo CVS directory: "CVS" type: "PIMB" creator: "ABCD" attributes: aVbstclinmed created: 04/26/2002 08:50:33 modified: 06/10/2002 08:29:21 #python tstmac.py CVS 'import macfsn' failed; use -v for traceback FSSpec((-100, 671716, 'CVS')) creator: 'ABCD' flags: '16384' fldr: '0' location: '(0, 0)' type: 'PIMB' dates (3102663033.0, 3106549761.0, 0.0) So I think this function works, at least on Mac OS X, 10.1.5 ---------------------------------------------------------------------- Comment By: Jack Jansen (jackjansen) Date: 2002-06-11 23:21 Message: Logged In: YES user_id=45365 Pim, your fix has the problem that it will put garbage into creator and type fields for folders (as the DInfo structure, which is really what is returned for directories, has different stuff at those places). I don't like this, but I'm also unsure as to how to fix it. One solution would be to check whether the FInfo structure returned was for a directory, and don't expose the Creator and Type fields in that case, but I don't see how to do this easily. Another option would be to leave GetFInfo as it is and add a new method GetFinderFlags() that returns only the finder flag word (as this seems the most useful bit of the data shared between FInfo and DInfo structures). What do you think? ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=567296&group_id=5470 From noreply@sourceforge.net Tue Aug 6 14:32:42 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Tue, 06 Aug 2002 06:32:42 -0700 Subject: [Patches] [ python-Patches-591305 ] Documentation err in bytecode defs. 
Message-ID: Patches item #591305, was opened at 2002-08-05 17:17 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=591305&group_id=5470 Category: Documentation Group: Python 2.3 Status: Closed Resolution: Accepted Priority: 5 Submitted By: Michael Chermside (mcherm) Assigned to: Neal Norwitz (nnorwitz) Summary: Documentation err in bytecode defs. Initial Comment: The error in the definition of the opcode is obvious and easy to understand -- the definition used TOS1 where TOS was intended. However, this is my first use of the patch submission process, and my use of patch manager may be way off. If I've done something silly like submit my diff backward (from new code back to old) or in the wrong format, please drop me an email explaining how to generate a correct patch or where to read about how to do it. ---------------------------------------------------------------------- >Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-06 09:32 Message: Logged In: YES user_id=33168 I never thought about it like that. It makes sense. Thanks! ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-06 04:32 Message: Logged In: YES user_id=6656 Not really relevant, but: unified diffs are good when you want to see the changes, context diffs are good when you want to see what you're getting. During the 221 release process I found I was glad for the insistence on context diffs (to my surprise). ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-05 19:35 Message: Logged In: YES user_id=33168 The patch is not reversed and is a context diff. This is good (although I prefer unified diff). With unified diffs, you can see + and - for the lines which make more sense to me (good to check that you have the order correct).
It's easier for me if you do the diff from the src directory, but I'm not sure what others prefer and it's not a big deal. I got a warning from patch: (Stripping trailing CRs from patch.) Checked in as: libdis.tex 1.38 and 1.33.10.3. Thanks! ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=591305&group_id=5470 From noreply@sourceforge.net Tue Aug 6 14:48:01 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Tue, 06 Aug 2002 06:48:01 -0700 Subject: [Patches] [ python-Patches-591551 ] Remove symlink python during install Message-ID: Patches item #591551, was opened at 2002-08-06 15:48 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=591551&group_id=5470 Category: Build Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Jack Jansen (jackjansen) Assigned to: Martin v. Löwis (loewis) Summary: Remove symlink python during install Initial Comment: During "make bininstall" an existing "python" in the bindir is removed. This fails, however, if the file in question is a symlink instead of a regular file/hardlink. As an OSX framework install can deposit a symlink into /usr/local/bin/python it would be nice if a subsequent normal install would do the right thing. I'm posting this as a patch because I'm not sure how common "test -L file" is. Otherwise "test -e file" may be a better idea. Assigned to Martin as he seems to be one of the major build gurus.
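The difference between the two checks mentioned above can be sketched in Python, for illustration only. The patch itself changes shell logic in the Makefile, which is not shown in this message; the helper name remove_stale_entry and the temporary paths below are hypothetical. os.path.islink behaves roughly like "test -L" and os.path.exists roughly like "test -e"; the latter follows the link, so it reports nothing for a symlink whose target is missing.

```python
import os
import tempfile


def remove_stale_entry(path):
    """Remove path if anything is there, including a dangling symlink.

    Hypothetical helper for illustration. Returns True if something
    was removed.
    """
    # os.path.lexists does not follow symlinks, so it is also true for a
    # symlink whose target is missing; os.path.exists would be False there.
    if os.path.lexists(path):
        os.unlink(path)
        return True
    return False


with tempfile.TemporaryDirectory() as bindir:
    link = os.path.join(bindir, "python")
    # Create a dangling symlink: the target does not exist.
    os.symlink(os.path.join(bindir, "no-such-target"), link)

    link_seen_by_islink = os.path.islink(link)   # roughly `test -L`
    link_seen_by_exists = os.path.exists(link)   # roughly `test -e`
    removed = remove_stale_entry(link)
    still_there = os.path.lexists(link)
```

In this sketch a check that follows the link would skip the removal of a dangling symlink, while a check that looks at the directory entry itself would not.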
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=591551&group_id=5470 From noreply@sourceforge.net Tue Aug 6 16:30:28 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Tue, 06 Aug 2002 08:30:28 -0700 Subject: [Patches] [ python-Patches-580331 ] xreadlines caching, file iterator Message-ID: Patches item #580331, was opened at 2002-07-11 17:45 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=580331&group_id=5470 Category: Core (C code) Group: Python 2.3 Status: Open Resolution: Accepted Priority: 5 Submitted By: Oren Tirosh (orenti) Assigned to: Guido van Rossum (gvanrossum) Summary: xreadlines caching, file iterator Initial Comment: Calling f.xreadlines() multiple times returns the same xreadlines object. A file is an iterator - __iter__() returns self and next() calls the cached xreadlines object's next method. ---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-06 11:30 Message: Logged In: YES user_id=6380 Hm, test_file fails on a technicality. I'll take it from here. Thanks! ---------------------------------------------------------------------- Comment By: Oren Tirosh (orenti) Date: 2002-08-06 01:03 Message: Logged In: YES user_id=562624 ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-05 16:07 Message: Logged In: YES user_id=6380 Thanks! Making xreadlines an alias for __iter__ sounds about right, for backwards compatibility. Then we should probably deprecate xreadlines, despite the fact that it could be useful for other file-like objects; it's just not a pretty enough interface. 
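The iterator behaviour discussed in this thread can be sketched as follows. This assumes a current Python 3 interpreter rather than the 2.3 code under review: xreadlines no longer exists there, and the built-in next() replaces the next() method. io.StringIO stands in for a real file object.

```python
import io

# StringIO stands in for a file object returned by open().
f = io.StringIO("first\nsecond\nthird\n")

# A file object is its own iterator: __iter__() returns self.
is_own_iterator = iter(f) is f

first = next(f)    # one line per call
rest = list(f)     # the remaining lines

# Once the lines are used up, the iterator signals exhaustion.
try:
    next(f)
    exhausted = False
except StopIteration:
    exhausted = True
```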
---------------------------------------------------------------------- Comment By: Oren Tirosh (orenti) Date: 2002-08-05 16:01 Message: Logged In: YES user_id=562624 Updated patch. What to do about the xreadlines method? The patch doesn't touch it but It could be made an alias to __iter__ and the dependency of file objects on the xreadlines module will be eliminated. On my linux machine the highest performance is achieved for buffer sizes somewhere around 4096-8192. Higher or lower values are significantly slower. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-08-05 12:14 Message: Logged In: YES user_id=31435 Just FYI, in apps that do "read + process" in a loop, a small buffer is often faster because the data has a decent shot at staying in L1 cache. Make the buffer very large (100s of Kb), and it won't even stay in L2 cache. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-05 11:29 Message: Logged In: YES user_id=6380 OK, I'll await a new patch. ---------------------------------------------------------------------- Comment By: Oren Tirosh (orenti) Date: 2002-08-05 11:22 Message: Logged In: YES user_id=562624 > What's a normal text file? One with a million bytes? :-) I meant 100kBYTE lines... Some apps actually use such long lines. Yes, it works just fine with universal newlines. Ok, the #ifdefs will go. Strange, a bigger buffer seems to actually slow it down... I'll have to investigate this further. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-05 10:52 Message: Logged In: YES user_id=6380 This begins to look good. What's a normal text file? One with a million bytes? :-) Have you made sure this works as expected in Universal newline mode? I'd like a patch that doesn't use #define WITH_READAHEAD_BUFFER. 
You might also experiment with larger buffer sizes (I predict that a larger buffer doesn't make much difference, since it didn't for xreadlines, but it would be nice to verify that and then add a comment; at least once a year someone asks whether the buffer shouldn't be much larger). ---------------------------------------------------------------------- Comment By: Oren Tirosh (orenti) Date: 2002-08-05 02:27 Message: Logged In: YES user_id=562624 The version of the patch still makes a file an iterator but it no longer depends on xreadlines - it implements the readahead buffering inside the file object. It is about 19% faster than xreadlines for normal text files and about 40% faster for files with 100k lines. The methods readline and read do not use this readahead mechanism because it skews the current file position (just like xreadlines does). ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-07-17 13:50 Message: Logged In: YES user_id=6380 Alas, there's a fatal flaw. The file object and the xreadlines object now both have pointers to each other, creating an unbreakable cycle (since neither participates in GC). Weak refs can't be used to resolve this dilemma. I personally think that's enough to just stick with the status quo (I was never more than +0 on the idea of making the file an iterator anyway). But I'll leave it to Oren to come up with another hack (please use this same SF patch). Oren, if you'd like to give up, please say so and I'll close the item in a jiffy. In fact, I positively encourage you to give up. But I don't expect you to take this offer. :-) ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-07-16 21:33 Message: Logged In: YES user_id=6380 I'm reviewing this and will check it in, or something like it (probably).
---------------------------------------------------------------------- Comment By: Oren Tirosh (orenti) Date: 2002-07-16 01:26 Message: Logged In: YES user_id=562624 Now invalidates cache on a seek. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-07-15 10:38 Message: Logged In: YES user_id=6380 I posted some comments to python-dev. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=580331&group_id=5470 From noreply@sourceforge.net Tue Aug 6 16:56:40 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Tue, 06 Aug 2002 08:56:40 -0700 Subject: [Patches] [ python-Patches-580331 ] xreadlines caching, file iterator Message-ID: Patches item #580331, was opened at 2002-07-11 17:45 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=580331&group_id=5470 Category: Core (C code) Group: Python 2.3 >Status: Closed Resolution: Accepted Priority: 5 Submitted By: Oren Tirosh (orenti) Assigned to: Guido van Rossum (gvanrossum) Summary: xreadlines caching, file iterator Initial Comment: Calling f.xreadlines() multiple times returns the same xreadlines object. A file is an iterator - __iter__() returns self and next() calls the cached xreadlines object's next method. ---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-06 11:56 Message: Logged In: YES user_id=6380 Thanks! Checked in. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-06 11:30 Message: Logged In: YES user_id=6380 Hm, test_file fails on a technicality. I'll take it from here. Thanks! 
---------------------------------------------------------------------- Comment By: Oren Tirosh (orenti) Date: 2002-08-06 01:03 Message: Logged In: YES user_id=562624 ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-05 16:07 Message: Logged In: YES user_id=6380 Thanks! Making xreadlines an alias for __iter__ sounds about right, for backwards compatibility. Then we should probably deprecate xreadlines, despite the fact that it could be useful for other file-like objects; it's just not a pretty enough interface. ---------------------------------------------------------------------- Comment By: Oren Tirosh (orenti) Date: 2002-08-05 16:01 Message: Logged In: YES user_id=562624 Updated patch. What to do about the xreadlines method? The patch doesn't touch it but It could be made an alias to __iter__ and the dependency of file objects on the xreadlines module will be eliminated. On my linux machine the highest performance is achieved for buffer sizes somewhere around 4096-8192. Higher or lower values are significantly slower. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-08-05 12:14 Message: Logged In: YES user_id=31435 Just FYI, in apps that do "read + process" in a loop, a small buffer is often faster because the data has a decent shot at staying in L1 cache. Make the buffer very large (100s of Kb), and it won't even stay in L2 cache. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-05 11:29 Message: Logged In: YES user_id=6380 OK, I'll await a new patch. ---------------------------------------------------------------------- Comment By: Oren Tirosh (orenti) Date: 2002-08-05 11:22 Message: Logged In: YES user_id=562624 > What's a normal text file? One with a million bytes? :-) I meant 100kBYTE lines... Some apps actually use such long lines. 
Yes, it works just fine with universal newlines. Ok, the #ifdefs will go. Strange, a bigger buffer seems to actually slow it down... I'll have to investigate this further. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-05 10:52 Message: Logged In: YES user_id=6380 This begins to look good. What's a normal text file? One with a million bytes? :-) Have you made sure this works as expected in Universal newline mode? I'd like a patch that doesn't use #define WITH_READAHEAD_BUFFER. You might also experiment with larger buffer sizes (I predict that a larger buffer doesn't make much difference, since it didn't for xreadlines, but it would be nice to verify that and then add a comment; at least once a year someone asks whether the buffer shouldn't be much larger). ---------------------------------------------------------------------- Comment By: Oren Tirosh (orenti) Date: 2002-08-05 02:27 Message: Logged In: YES user_id=562624 The version of the patch still makes a file an iterator but it no longer depends on xreadlines - it implements the readahead buffering inside the file object. It is about 19% faster than xreadlines for normal text files and about 40% faster for files with 100k lines. The methods readline and read do not use this readahead mechanism because it skews the current file position (just like xreadlines does). ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-07-17 13:50 Message: Logged In: YES user_id=6380 Alas, there's a fatal flaw. The file object and the xreadlines object now both have pointers to each other, creating an unbreakable cycle (since neither participates in GC). Weak refs can't be used to resolve this dilemma. I personally think that's enough to just stick with the status quo (I was never more than +0 on the idea of making the file an iterator anyway).
But I'll leave it to Oren to come up with another hack (please use this same SF patch). Oren, if you'd like to give up, please say so and I'll close the item in a jiffy. In fact, I positively encourage you to give up. But I don't expect you to take this offer. :-) ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-07-16 21:33 Message: Logged In: YES user_id=6380 I'm reviewing this and will check it in, or something like it (probably). ---------------------------------------------------------------------- Comment By: Oren Tirosh (orenti) Date: 2002-07-16 01:26 Message: Logged In: YES user_id=562624 Now invalidates cache on a seek. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-07-15 10:38 Message: Logged In: YES user_id=6380 I posted some comments to python-dev. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=580331&group_id=5470 From noreply@sourceforge.net Tue Aug 6 17:08:24 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Tue, 06 Aug 2002 09:08:24 -0700 Subject: [Patches] [ python-Patches-591250 ] str1 in str2 for len(str1) >= 1 Message-ID: Patches item #591250, was opened at 2002-08-05 15:09 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=591250&group_id=5470 Category: Core (C code) Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Neal Norwitz (nnorwitz) >Assigned to: Barry A. Warsaw (bwarsaw) Summary: str1 in str2 for len(str1) >= 1 Initial Comment: Here's a patch to implement and test str1 in str2, when str1 is more than a single character. The doc and unicode still need to be updated. There is a test. ---------------------------------------------------------------------- >Comment By: Barry A. 
Warsaw (bwarsaw) Date: 2002-08-06 12:08 Message: Logged In: YES user_id=12800 Attached is the entire patch ready for pronouncement. This one merges Neal's test_contains.py patch and contains2.diff. It also restores the 1-char unicode check, includes a doc patch, and implements Guido's current leanings towards '' in 'str' returning True. Assign to Guido for pronouncement. ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-05 20:43 Message: Logged In: YES user_id=33168 test_contains.py checks for multi-char strings using in. I attached a patch that removes these checks. ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-05 20:06 Message: Logged In: YES user_id=33168 I'm not sure what PyUnicode_GET_SIZE() returns if using UCS-2 or UCS-4. Does size need to be size *= sizeof(*lhs), ie size *= sizeof(Py_UNICODE)? ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-05 17:01 Message: Logged In: YES user_id=6380 I'm still at best -0 on the exception for empty left argument. Fortunately we can easily change that (I'm guessing it's just a matter of removing some code :-) if I don't change my mind. And the special case for 1-char Unicode should probably be restored. Can someone please mail the other person who posted a pointer to a patch to python-dev, to avoid him doing more work on cleaning up his patch? ---------------------------------------------------------------------- Comment By: Barry A. Warsaw (bwarsaw) Date: 2002-08-05 16:55 Message: Logged In: YES user_id=12800 I actually think Fred should get in on the act because it's not clear to me where in the docs this should be documented. Assigning to Fred for an answer, then he can re-assign back to me. ---------------------------------------------------------------------- Comment By: Barry A. 
Warsaw (bwarsaw) Date: 2002-08-05 16:50 Message: Logged In: YES user_id=12800 There were some problems with this patch. test_unicode.py already contains some contains tests, which fail with your patch. This fixes them at the expense of not special-casing contains with a single character (in unicode only -- normal strings still have the special case). This version of the patch also adds some more tests. I think the only thing left to do is update the docs. ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-05 15:39 Message: Logged In: YES user_id=33168 Went a little further. Add unicode support. Where should the string/unicode tests go? string tests are in string_tests.py now, but not sure that is appropriate. Still needs: * unicode test * doc - where? ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-05 15:17 Message: Logged In: YES user_id=6380 Let Barry collect the patches. (He sent one in private email too.) Does this do Unicode? ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=591250&group_id=5470 From noreply@sourceforge.net Tue Aug 6 17:08:47 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Tue, 06 Aug 2002 09:08:47 -0700 Subject: [Patches] [ python-Patches-591250 ] str1 in str2 for len(str1) >= 1 Message-ID: Patches item #591250, was opened at 2002-08-05 15:09 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=591250&group_id=5470 Category: Core (C code) Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Neal Norwitz (nnorwitz) >Assigned to: Guido van Rossum (gvanrossum) Summary: str1 in str2 for len(str1) >= 1 Initial Comment: Here's a patch to implement and test str1 in str2, when str1 is more than a single character. 
The doc and unicode still need to be updated. There is a test. ---------------------------------------------------------------------- >Comment By: Barry A. Warsaw (bwarsaw) Date: 2002-08-06 12:08 Message: Logged In: YES user_id=12800 Oops, assigning to Guido. ---------------------------------------------------------------------- Comment By: Barry A. Warsaw (bwarsaw) Date: 2002-08-06 12:08 Message: Logged In: YES user_id=12800 Attached is the entire patch ready for pronouncement. This one merges Neal's test_contains.py patch and contains2.diff. It also restores the 1-char unicode check, includes a doc patch, and implements Guido's current leanings towards '' in 'str' returning True. Assign to Guido for pronouncement. ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-05 20:43 Message: Logged In: YES user_id=33168 test_contains.py checks for multi-char strings using in. I attached a patch that removes these checks. ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-05 20:06 Message: Logged In: YES user_id=33168 I'm not sure what PyUnicode_GET_SIZE() returns if using UCS-2 or UCS-4. Does size need to be size *= sizeof(*lhs), ie size *= sizeof(Py_UNICODE)? ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-05 17:01 Message: Logged In: YES user_id=6380 I'm still at best -0 on the exception for empty left argument. Fortunately we can easily change that (I'm guessing it's just a matter of removing some code :-) if I don't change my mind. And the special case for 1-char Unicode should probably be restored. Can someone please mail the other person who posted a pointer to a patch to python-dev, to avoid him doing more work on cleaning up his patch? ---------------------------------------------------------------------- Comment By: Barry A. 
Warsaw (bwarsaw) Date: 2002-08-05 16:55 Message: Logged In: YES user_id=12800 I actually think Fred should get in on the act because it's not clear to me where in the docs this should be documented. Assigning to Fred for an answer, then he can re-assign back to me. ---------------------------------------------------------------------- Comment By: Barry A. Warsaw (bwarsaw) Date: 2002-08-05 16:50 Message: Logged In: YES user_id=12800 There were some problems with this patch. test_unicode.py already contains some contains tests, which fail with your patch. This fixes them at the expense of not special-casing contains with a single character (in unicode only -- normal strings still have the special case). This version of the patch also adds some more tests. I think the only thing left to do is update the docs. ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-05 15:39 Message: Logged In: YES user_id=33168 Went a little further. Add unicode support. Where should the string/unicode tests go? string tests are in string_tests.py now, but not sure that is appropriate. Still needs: * unicode test * doc - where? ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-05 15:17 Message: Logged In: YES user_id=6380 Let Barry collect the patches. (He sent one in private email too.) Does this do Unicode? 
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=591250&group_id=5470 From noreply@sourceforge.net Tue Aug 6 17:47:24 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Tue, 06 Aug 2002 09:47:24 -0700 Subject: [Patches] [ python-Patches-591250 ] str1 in str2 Message-ID: Patches item #591250, was opened at 2002-08-05 15:09 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=591250&group_id=5470 Category: Core (C code) Group: Python 2.3 Status: Open >Resolution: Accepted Priority: 5 Submitted By: Neal Norwitz (nnorwitz) >Assigned to: Barry A. Warsaw (bwarsaw) >Summary: str1 in str2 Initial Comment: Here's a patch to implement and test str1 in str2, when str1 is more than a single character. The doc and unicode still need to be updated. There is a test. ---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-06 12:47 Message: Logged In: YES user_id=6380 OK, accepted. Please get rid of the #if 0 business and a few ##commented-out items and idle use of my name; then check it in. ---------------------------------------------------------------------- Comment By: Barry A. Warsaw (bwarsaw) Date: 2002-08-06 12:08 Message: Logged In: YES user_id=12800 Oops, assigning to Guido. ---------------------------------------------------------------------- Comment By: Barry A. Warsaw (bwarsaw) Date: 2002-08-06 12:08 Message: Logged In: YES user_id=12800 Attached is the entire patch ready for pronouncement. This one merges Neal's test_contains.py patch and contains2.diff. It also restores the 1-char unicode check, includes a doc patch, and implements Guido's current leanings towards '' in 'str' returning True. Assign to Guido for pronouncement. 
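The semantics described in this comment (multi-character left operands, and '' in 'str' being true) can be sketched in Python. This is a naive illustration, not the C code in the patch, and the function name contains is hypothetical.

```python
def contains(haystack, needle):
    """Naive substring test mirroring the semantics discussed above.

    Hypothetical illustration, not the patch's C implementation.
    """
    n = len(needle)
    # Try every start position at which needle could still fit. For an
    # empty needle the range is never empty, so the result is True.
    for start in range(len(haystack) - n + 1):
        if haystack[start:start + n] == needle:
            return True
    return False


cases = [("abcd", "bc"), ("abcd", "a"), ("str", ""), ("", ""),
         ("abc", "abd"), ("ab", "abc")]
results = [contains(h, n) for h, n in cases]
# The built-in operator gives the same answers for these cases.
builtin_results = [n in h for h, n in cases]
```

The empty string is found at position 0 of any string, which is why the empty left operand yields true rather than raising an exception.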
---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-05 20:43 Message: Logged In: YES user_id=33168 test_contains.py checks for multi-char strings using in. I attached a patch that removes these checks. ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-05 20:06 Message: Logged In: YES user_id=33168 I'm not sure what PyUnicode_GET_SIZE() returns if using UCS-2 or UCS-4. Does size need to be size *= sizeof(*lhs), ie size *= sizeof(Py_UNICODE)? ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-05 17:01 Message: Logged In: YES user_id=6380 I'm still at best -0 on the exception for empty left argument. Fortunately we can easily change that (I'm guessing it's just a matter of removing some code :-) if I don't change my mind. And the special case for 1-char Unicode should probably be restored. Can someone please mail the other person who posted a pointer to a patch to python-dev, to avoid him doing more work on cleaning up his patch? ---------------------------------------------------------------------- Comment By: Barry A. Warsaw (bwarsaw) Date: 2002-08-05 16:55 Message: Logged In: YES user_id=12800 I actually think Fred should get in on the act because it's not clear to me where in the docs this should be documented. Assigning to Fred for an answer, then he can re-assign back to me. ---------------------------------------------------------------------- Comment By: Barry A. Warsaw (bwarsaw) Date: 2002-08-05 16:50 Message: Logged In: YES user_id=12800 There were some problems with this patch. test_unicode.py already contains some contains tests, which fail with your patch. This fixes them at the expense of not special-casing contains with a single character (in unicode only -- normal strings still have the special case). 
This version of the patch also adds some more tests. I think the only thing left to do is update the docs. ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-05 15:39 Message: Logged In: YES user_id=33168 Went a little further. Add unicode support. Where should the string/unicode tests go? string tests are in string_tests.py now, but not sure that is appropriate. Still needs: * unicode test * doc - where? ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-05 15:17 Message: Logged In: YES user_id=6380 Let Barry collect the patches. (He sent one in private email too.) Does this do Unicode? ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=591250&group_id=5470 From noreply@sourceforge.net Tue Aug 6 17:58:29 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Tue, 06 Aug 2002 09:58:29 -0700 Subject: [Patches] [ python-Patches-591250 ] str1 in str2 Message-ID: Patches item #591250, was opened at 2002-08-05 15:09 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=591250&group_id=5470 Category: Core (C code) Group: Python 2.3 >Status: Closed Resolution: Accepted Priority: 5 Submitted By: Neal Norwitz (nnorwitz) Assigned to: Barry A. Warsaw (bwarsaw) Summary: str1 in str2 Initial Comment: Here's a patch to implement and test str1 in str2, when str1 is more than a single character. The doc and unicode still need to be updated. There is a test. ---------------------------------------------------------------------- >Comment By: Barry A. Warsaw (bwarsaw) Date: 2002-08-06 12:58 Message: Logged In: YES user_id=12800 Committed to cvs. 
---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-06 12:47 Message: Logged In: YES user_id=6380 OK, accepted. Please get rid of the #if 0 business and a few ##commented-out items and idle use of my name; then check it in. ---------------------------------------------------------------------- Comment By: Barry A. Warsaw (bwarsaw) Date: 2002-08-06 12:08 Message: Logged In: YES user_id=12800 Oops, assigning to Guido. ---------------------------------------------------------------------- Comment By: Barry A. Warsaw (bwarsaw) Date: 2002-08-06 12:08 Message: Logged In: YES user_id=12800 Attached is the entire patch ready for pronouncement. This one merges Neal's test_contains.py patch and contains2.diff. It also restores the 1-char unicode check, includes a doc patch, and implements Guido's current leanings towards '' in 'str' returning True. Assign to Guido for pronouncement. ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-05 20:43 Message: Logged In: YES user_id=33168 test_contains.py checks for multi-char strings using in. I attached a patch that removes these checks. ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-05 20:06 Message: Logged In: YES user_id=33168 I'm not sure what PyUnicode_GET_SIZE() returns if using UCS-2 or UCS-4. Does size need to be size *= sizeof(*lhs), ie size *= sizeof(Py_UNICODE)? ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-05 17:01 Message: Logged In: YES user_id=6380 I'm still at best -0 on the exception for empty left argument. Fortunately we can easily change that (I'm guessing it's just a matter of removing some code :-) if I don't change my mind. And the special case for 1-char Unicode should probably be restored. 
Can someone please mail the other person who posted a pointer to a patch to python-dev, to avoid him doing more work on cleaning up his patch? ---------------------------------------------------------------------- Comment By: Barry A. Warsaw (bwarsaw) Date: 2002-08-05 16:55 Message: Logged In: YES user_id=12800 I actually think Fred should get in on the act because it's not clear to me where in the docs this should be documented. Assigning to Fred for an answer, then he can re-assign back to me. ---------------------------------------------------------------------- Comment By: Barry A. Warsaw (bwarsaw) Date: 2002-08-05 16:50 Message: Logged In: YES user_id=12800 There were some problems with this patch. test_unicode.py already contains some contains tests, which fail with your patch. This fixes them at the expense of not special-casing contains with a single character (in unicode only -- normal strings still have the special case). This version of the patch also adds some more tests. I think the only thing left to do is update the docs. ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-05 15:39 Message: Logged In: YES user_id=33168 Went a little further. Add unicode support. Where should the string/unicode tests go? string tests are in string_tests.py now, but not sure that is appropriate. Still needs: * unicode test * doc - where? ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-05 15:17 Message: Logged In: YES user_id=6380 Let Barry collect the patches. (He sent one in private email too.) Does this do Unicode? 
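For readers following the thread, the semantics being settled here can be sketched on a current Python 3 interpreter. This is only an illustration of the behaviour discussed (multi-character substrings, and the empty string counting as contained in any string), not code from the patch; the helper name contains is hypothetical.

```python
def contains(needle, haystack):
    """Return True if needle occurs anywhere in haystack (substring test)."""
    return needle in haystack


# Single-character membership: the form that already worked before the patch.
single = contains("b", "abc")

# Multi-character substrings: the case this patch adds.
multi_hit = contains("bc", "abc")
multi_miss = contains("cb", "abc")

# The empty string is treated as a substring of every string,
# matching the "'' in 'str' returns True" choice mentioned in the thread.
empty = contains("", "str")

print(single, multi_hit, multi_miss, empty)
```

The same operator covers both str and unicode-style text in Python 3, so no separate 1-char special case is visible at the Python level.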
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=591250&group_id=5470 From noreply@sourceforge.net Tue Aug 6 18:43:30 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Tue, 06 Aug 2002 10:43:30 -0700 Subject: [Patches] [ python-Patches-553108 ] Deprecate bsddb Message-ID: Patches item #553108, was opened at 2002-05-06 22:46 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=553108&group_id=5470 Category: Modules Group: Python 2.3 >Status: Closed Resolution: Accepted Priority: 5 Submitted By: Garth T Kidd (gtk) Assigned to: Skip Montanaro (montanaro) Summary: Deprecate bsddb Initial Comment: Large numbers of inserts break bsddb, as first discovered in Python 1.5 (bug 408271). According to Barry Warsaw, "trying to get the bsddb module that comes with Python to work is a hopeless cause." If it's broken, let's discourage people from using it. In particular, let's ensure that people importing shelve or anydbm don't end up using it by default. The submitted patch adds a DeprecationWarning to the bsddb module and removes bsddb from the list of db module candidates in anydbm. ---------------------------------------------------------------------- >Comment By: Skip Montanaro (montanaro) Date: 2002-08-06 12:43 Message: Logged In: YES user_id=44345 Closing this again. I think Jack's running okay on MacOSX once again. ---------------------------------------------------------------------- Comment By: Skip Montanaro (montanaro) Date: 2002-07-02 17:17 Message: Logged In: YES user_id=44345 Jack, Sorry to here you're having trouble. Alas, my MacOS X system is with my wife at the moment, so I can't dig into the problem much. Can you provide me with some background info? If you can send me your copy of ndbm.h (I doubt it's using Berkeley DB) and figure out which library dbm_open resides in, that would be great. 
Also, can you provide me with the output of the build process so I can see just what errors are being generated? Skip ---------------------------------------------------------------------- Comment By: Jack Jansen (jackjansen) Date: 2002-07-02 16:52 Message: Logged In: YES user_id=45365 Skip, I'm reopening this bug report: the fix breaks builds on Mac OS X, and I haven't a clue as to how to fix this so I hope you can help. MacOSX has /usr/include/ndbm.h (implemented with Berkeley DB, I think) but it doesn't have any of the libraries (I assume everything needed is in libc). Everything worked fine until last week, when configure still took care of defining HAVE_NDBM_H. ---------------------------------------------------------------------- Comment By: Skip Montanaro (montanaro) Date: 2002-06-14 15:32 Message: Logged In: YES user_id=44345 Implemented in setup.py 1.93 README 1.147 configure 1.315 configure.in 1.325 pyconfig.h.in 1.42 Modules/dbmmodule 2.30 ---------------------------------------------------------------------- Comment By: Martin v. L�wis (loewis) Date: 2002-06-14 02:16 Message: Logged In: YES user_id=21627 The patch looks good, please apply it. ---------------------------------------------------------------------- Comment By: Skip Montanaro (montanaro) Date: 2002-06-13 22:33 Message: Logged In: YES user_id=44345 a couple more tweaks... I forgot to include dbmmodule.c in previous patches. This version of the patch also includes a modified README file that adds a section about building the bsddb and dbm modules. ---------------------------------------------------------------------- Comment By: Skip Montanaro (montanaro) Date: 2002-06-13 02:35 Message: Logged In: YES user_id=44345 Here's an updated patch. It's different in a couple ways: * support for Berkeley DB 4.x was added. You will need to configure iBerkdb with the 1.85 compatibility stuff. * I cleaned up the dbm build code a bit. 
* I added a diff for the configure file for people who don't have autoconf handy. Skip ---------------------------------------------------------------------- Comment By: Skip Montanaro (montanaro) Date: 2002-06-11 11:09 Message: Logged In: YES user_id=44345 I think deprecating bsddb is too drastic. In the first place, the problems you refer to are in the underlying Berkeley DB library, not in the bsddb code itself. In the second place, later versions of the library fix the problem. The attached patch attempts to modify setup.py and configure.in to solve the problem. It does a couple things differently than the current CVS version: 1. It only searches for versions 2 and 3 of the Berkeley DB library by default. People who know what they are doing can uncomment the information relevant to version 1. 2. It moves all the checking code into setup.py. The header file checks in configure.in were deleted. 3. The ndbm lookalike stuff for the dbm module is done differently. This has not really been tested yet. I anticipate further changes will be necessary with this code. I'm sure it's not perfect. Please give it a try and let me know how it works for you. All that said, I think a better migration path is to replace the current module with the bsddb3/pybsddb stuff. I think that would effectively restrict you to versions 3 or 4 of the underlying Berkeley DB library, so it probably couldn't be done with impunity. Skip ---------------------------------------------------------------------- Comment By: Martin D Katz, Ph.D. (drbits) Date: 2002-05-20 13:14 Message: Logged In: YES user_id=276840 #!/bin/python # Test for Python bug report 553108 # This program shows that bsddb seems to work reliably with # the btopen database format. # This is based on the test program # in the discussion of bug report 445862 # This has been enhanced to perform read, modify, # write operations in random order. # This is only one of several tests I performed. 
# This included 4,000,000 read, modify, write operations to 90,909 records # (an average of 44,000 writes for each record). # Note: This program took approximately 50 hours to run # on my 930MHz Pentium 3 under Windows 2000 with # ActiveState Python version 2.1.1 build 212 import unittest, sys, os, math, time LIMIT=4000000 DISPLAY_AT_END=1 USE_RANDOM=100 # If set, number of keys is approximately LIMIT/USE_RANDOM AUTO_RANDOM=1 if USE_RANDOM and AUTO_RANDOM: USE_RANDOM=int(math.sqrt(math.sqrt(LIMIT))) if USE_RANDOM < 2: USE_RANDOM = 2 ## The format of the value string is ## count|hash|hash...|b ## Where ## count is an 8 byte hexadecimal count of the number of times ## this record has been written. ## hash is the md5 hash of the random value that created this record. ## It is the key for this record. It is appended once for each ## time the record is written (that is, it occurs count times). ## b is 129 '!' ## if USE_RANDOM is set, its value should be >= 2 class BreakDB(unittest.TestCase): def runTest(self): import md5, bsddb, os if USE_RANDOM: import random random.seed() max_key=int(LIMIT / USE_RANDOM) m = md5.new() b = "!" * 129 # small string to write db = bsddb.btopen(self.dbname, 'c') try: self.db = db for count in xrange(1, LIMIT+1): if count % 100==0: print >> sys.stderr, " %10d\r" % (count), if USE_RANDOM: r = random.randrange(0, max_key) m = md5.new(str(r)) key = m.hexdigest() if db.has_key(key): rec = db[key] old_count = int(rec[0:8], 16) should_be = '%08X|%s%s'% (old_count, ((key+'|') *old_count), b) if rec != should_be: self.fail("Mismatched data: db ["+repr(key)+"]="+ repr(db[key])+". 
Should be "+repr(should_be)) return 1 else: # New record rec = '00000000|'+b old_count = 0 new_count = old_count+1 new_rec = '%08X|%s%s'% (new_count, key, rec[8:], ) db[key] = new_rec else: m.update(str(count)) db[m.digest()] = b try: db.sync() except: pass if DISPLAY_AT_END: rec = db.first() count = 0 while 1: print >> sys.stderr, " count = %6i db[% s]=%s" % ( count, rec[0], rec[1], ) count += 1 try: rec = db.next() except KeyError: break finally: db.close() def unlinkDB(self): import os if os.path.exists(self.dbname): os.unlink(self.dbname) def setUp(self): self.dbname = 'test.db' self.unlinkDB() def tearDown(self): self.db.close() self.unlinkDB() if __name__ == '__main__': runner = unittest.TextTestRunner() runner.run(unittest.TestSuite([BreakDB()])) ---------------------------------------------------------------------- Comment By: Martin D Katz, Ph.D. (drbits) Date: 2002-05-16 18:10 Message: Logged In: YES user_id=276840 I am not sure there is a reason to deprecate bsddb. The btopen format appears to be stable enough for normal work. Maybe 2.3 should change dbhash to use btopen? ---------------------------------------------------------------------- Comment By: Garth T Kidd (gtk) Date: 2002-05-08 22:12 Message: Logged In: YES user_id=59803 Let's not turn a simple patch into something requiring a PEP, compulsory thrashing on comp.lang.python, SleepyCat being willing to change their distribution model, lawyers (to make sure the licences are compatible), and so on. I'd hate it if other people spent the kind of time I did trying to get shelve to work only to find that a known- broken bsddb was causing all the problems, and that a patch was there to gently guide them to gdbm, but it got jammed because of scope-creep. Let's get this one, very simple and necessary (bsddb IS broken) change out of the way, and THEN start negotiating, thrashing, and integrating. :) I firmly believe bsddb3 should be one of the included batteries. 
Let's do it, but let's guide people away from broken code first. ---------------------------------------------------------------------- Comment By: Martin v. L�wis (loewis) Date: 2002-05-08 04:01 Message: Logged In: YES user_id=21627 I'm in favour of this change, but I'd like simultaneously incorporate bsddb3. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=553108&group_id=5470 From noreply@sourceforge.net Tue Aug 6 18:49:53 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Tue, 06 Aug 2002 10:49:53 -0700 Subject: [Patches] [ python-Patches-584626 ] yield allowed in try/finally Message-ID: Patches item #584626, was opened at 2002-07-21 16:29 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=584626&group_id=5470 Category: None Group: None >Status: Closed >Resolution: Rejected Priority: 5 Submitted By: Oren Tirosh (orenti) Assigned to: Nobody/Anonymous (nobody) Summary: yield allowed in try/finally Initial Comment: A generator's dealloc function now resumes a generator one last time by jumping directly to the return statement at the end of the code. As a result, the finally section of any try/finally blocks is executed. Any exceptions raised are treated just like exceptions in a __del__ finalizer. ---------------------------------------------------------------------- >Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-06 13:49 Message: Logged In: YES user_id=33168 I'm closing since Oren abandoned this patch. ---------------------------------------------------------------------- Comment By: Oren Tirosh (orenti) Date: 2002-08-05 08:14 Message: Logged In: YES user_id=562624 Patch abandoned. 
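The effect the initial comment of patch #584626 is after (pending finally blocks run when a generator is discarded) can be illustrated on a current Python 3 interpreter, where generators have a close() method. This is a sketch of the observable behaviour only, under the assumption of a modern interpreter; it is not the abandoned patch, and the names reader and log are made up for the example.

```python
log = []


def reader():
    """Yield two values, recording when the cleanup block runs."""
    try:
        yield 1
        yield 2
    finally:
        log.append("cleanup")


gen = reader()
first = next(gen)  # the generator is now suspended inside the try block
gen.close()        # GeneratorExit is raised at the yield, so finally runs

print(first, log)
```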
---------------------------------------------------------------------- Comment By: Neil Schemenauer (nascheme) Date: 2002-07-29 17:23 Message: Logged In: YES user_id=35752 The GC will need to be taught about these finalizers. Look for the method 'has_finalizer' in gcmodule.c. I don't think we want that method to return true for all generator objects since that would cause any reference cycle containing a generator to become uncollectable. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=584626&group_id=5470 From noreply@sourceforge.net Tue Aug 6 20:26:03 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Tue, 06 Aug 2002 12:26:03 -0700 Subject: [Patches] [ python-Patches-591713 ] Fix "file:" URL to have right no. of /'s Message-ID: Patches item #591713, was opened at 2002-08-06 12:26 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=591713&group_id=5470 Category: Library (Lib) Group: None Status: Open Resolution: None Priority: 5 Submitted By: Bruce Atherton (callenish) Assigned to: Nobody/Anonymous (nobody) Summary: Fix "file:" URL to have right no. of /'s Initial Comment: If you run urlparse.urljoin() on a file: URL, the resulting URL has the wrong number of '/'s in it. Properly formed, the URL (assuming no netloc) should have three slashes, so that it looks like "file:///...". The current code drops that down to one. The error appears to be in a condition in urlunsplit(). It doesn't show up except in this one instance because the test is only run iff the scheme is in the list of those that can take a netloc and there is no netloc present in the URL. Apparently, this is pretty rare. Patch attached that corrects the condition.
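The expected result can be checked with a short sketch. It assumes a current Python 3 interpreter, where the urlparse module lives at urllib.parse; the example path is made up.

```python
from urllib.parse import urljoin, urlsplit, urlunsplit

base = "file:///tmp/docs/index.html"  # hypothetical example URL

# Splitting shows the case described above: a scheme that can take a
# netloc ("file"), but with an empty netloc and an absolute path.
parts = urlsplit(base)

# Joining a relative reference should keep all three slashes.
joined = urljoin(base, "other.html")

# A split/unsplit round trip should reproduce the original URL.
roundtrip = urlunsplit(parts)

print(parts.scheme, repr(parts.netloc), parts.path)
print(joined)
print(roundtrip)
```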
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=591713&group_id=5470 From noreply@sourceforge.net Tue Aug 6 20:35:10 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Tue, 06 Aug 2002 12:35:10 -0700 Subject: [Patches] [ python-Patches-588809 ] LDFLAGS support for build_ext.py Message-ID: Patches item #588809, was opened at 2002-07-30 21:36 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=588809&group_id=5470 Category: Distutils and setup.py Group: Python 2.2.x Status: Open Resolution: None Priority: 5 Submitted By: Robert Weber (chipsforbrains) >Assigned to: Martin v. Löwis (loewis) Summary: LDFLAGS support for build_ext.py Initial Comment: a hack at best ---------------------------------------------------------------------- >Comment By: Robert Weber (chipsforbrains) Date: 2002-08-06 19:35 Message: Logged In: YES user_id=245624 > As a hack, I think it is unacceptable for Python. > >I'd encourage you to integrate this (and CFLAGS) into >sysconfig.customize_compiler. > >It would be ok if only the Unix compiler honors those >settings for now. > Martin v. Löwis (loewis) I have written a better patch to sysconfig.py that does CFLAGS and all others so that everything works like autoconf. I will post the patch in a sec. ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-08-04 09:05 Message: Logged In: YES user_id=21627 As a hack, I think it is unacceptable for Python. I'd encourage you to integrate this (and CFLAGS) into sysconfig.customize_compiler. It would be ok if only the Unix compiler honors those settings for now.
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=588809&group_id=5470 From noreply@sourceforge.net Tue Aug 6 20:46:52 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Tue, 06 Aug 2002 12:46:52 -0700 Subject: [Patches] [ python-Patches-575073 ] PyTRASHCAN slots deallocation Message-ID: Patches item #575073, was opened at 2002-06-28 11:23 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=575073&group_id=5470 Category: Core (C code) Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Jonathan Hogg (jhogg) >Assigned to: Guido van Rossum (gvanrossum) Summary: PyTRASHCAN slots deallocation Initial Comment: This is an addition to the PyTRASHCAN macros to support delayed deallocation of arbitrary objects (i.e., not just builtin containers), and a modification to the 'clear_slots' routine to use these macros. This patch fixes bug ID 574207, "Chained __slots__ dealloc segfault". The solution is not ideal, but it appears to have minimal impact. ---------------------------------------------------------------------- >Comment By: Tim Peters (tim_one) Date: 2002-08-06 15:46 Message: Logged In: YES user_id=31435 Assigned to Guido. ---------------------------------------------------------------------- Comment By: Jonathan Hogg (jhogg) Date: 2002-07-15 11:23 Message: Logged In: YES user_id=10036 Attaching a new version of this patch against the 2.3 HEAD code (as of today). 
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=575073&group_id=5470 From noreply@sourceforge.net Tue Aug 6 20:58:59 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Tue, 06 Aug 2002 12:58:59 -0700 Subject: [Patches] [ python-Patches-581742 ] Alternative PyTRASHCAN subtype_dealloc Message-ID: Patches item #581742, was opened at 2002-07-15 11:47 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=581742&group_id=5470 Category: Core (C code) Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Jonathan Hogg (jhogg) >Assigned to: Guido van Rossum (gvanrossum) Summary: Alternative PyTRASHCAN subtype_dealloc Initial Comment: This is an alternative to patch #575073 (PyTRASHCAN slots deallocation) that wraps 'subtype_dealloc' in the (very slightly altered) normal PyTRASHCAN macros. This patch isn't meant to be pretty, it's just to demonstrate another possible solution. I would expect it to be worked on before being accepted. I'm sure there must be a way to safely untrack the object at the beginning. 
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=581742&group_id=5470 From noreply@sourceforge.net Tue Aug 6 22:42:55 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Tue, 06 Aug 2002 14:42:55 -0700 Subject: [Patches] [ python-Patches-581742 ] Alternative PyTRASHCAN subtype_dealloc Message-ID: Patches item #581742, was opened at 2002-07-15 11:47 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=581742&group_id=5470 Category: Core (C code) Group: Python 2.3 >Status: Closed >Resolution: Accepted Priority: 5 Submitted By: Jonathan Hogg (jhogg) Assigned to: Guido van Rossum (gvanrossum) Summary: Alternative PyTRASHCAN subtype_dealloc Initial Comment: This is an alternative to patch #575073 (PyTRASHCAN slots deallocation) that wraps 'subtype_dealloc' in the (very slightly altered) normal PyTRASHCAN macros. This patch isn't meant to be pretty, it's just to demonstrate another possible solution. I would expect it to be worked on before being accepted. I'm sure there must be a way to safely untrack the object at the beginning. ---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-06 17:42 Message: Logged In: YES user_id=6380 Fixed along these lines, but without modifying the trashcan macro to test for GC; instead, the non-GC case is separated out by subtype_dealloc(). Thanks!!! 
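The failure mode behind bug 574207 ("Chained __slots__ dealloc segfault") can be sketched as follows. This assumes a current CPython in which deallocation of deeply nested instances is protected along the lines described above; the class name Node, the helper build_chain, and the chain length are illustrative choices, not taken from the patch.

```python
class Node(object):
    """A minimal slotted object that holds a reference to the next node."""
    __slots__ = ("next",)

    def __init__(self, next_node=None):
        self.next = next_node


def build_chain(length):
    """Build a singly linked chain of `length` slotted objects."""
    head = None
    for _ in range(length):
        head = Node(head)
    return head


chain = build_chain(100000)

# Dropping the only reference to the head frees every node in turn;
# each node's deallocation releases the next one.
del chain
survived = True
print("chain released")
```

Without some form of delayed deallocation, each release would nest one C call deeper, which is the recursion the trashcan mechanism is meant to bound.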
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=581742&group_id=5470 From noreply@sourceforge.net Tue Aug 6 22:44:37 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Tue, 06 Aug 2002 14:44:37 -0700 Subject: [Patches] [ python-Patches-575073 ] PyTRASHCAN slots deallocation Message-ID: Patches item #575073, was opened at 2002-06-28 11:23 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=575073&group_id=5470 Category: Core (C code) Group: Python 2.3 >Status: Closed >Resolution: Duplicate Priority: 5 Submitted By: Jonathan Hogg (jhogg) Assigned to: Guido van Rossum (gvanrossum) Summary: PyTRASHCAN slots deallocation Initial Comment: This is an addition to the PyTRASHCAN macros to support delayed deallocation of arbitrary objects (i.e., not just builtin containers), and a modification to the 'clear_slots' routine to use these macros. This patch fixes bug ID 574207, "Chained __slots__ dealloc segfault". The solution is not ideal, but it appears to have minimal impact. ---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-06 17:44 Message: Logged In: YES user_id=6380 Fixed in CVS for 2.3 according to your alternative patch. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-08-06 15:46 Message: Logged In: YES user_id=31435 Assigned to Guido. ---------------------------------------------------------------------- Comment By: Jonathan Hogg (jhogg) Date: 2002-07-15 11:23 Message: Logged In: YES user_id=10036 Attaching a new version of this patch against the 2.3 HEAD code (as of today). 
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=575073&group_id=5470 From noreply@sourceforge.net Tue Aug 6 23:22:50 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Tue, 06 Aug 2002 15:22:50 -0700 Subject: [Patches] [ python-Patches-590352 ] py2texi.el update Message-ID: Patches item #590352, was opened at 2002-08-02 20:27 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=590352&group_id=5470 Category: Documentation Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Matthias Klose (doko) Assigned to: Fred L. Drake, Jr. (fdrake) Summary: py2texi.el update Initial Comment: [python2.3 (and python2.2)] Attached is a patch from Milan Zamazal to update py2texi.el: - allow to set the info file name - correctly generate code for nodes like: \subsubsection{File Objects\obindex{file} \label{bltin-file-objects}} ---------------------------------------------------------------------- >Comment By: Matthias Klose (doko) Date: 2002-08-06 22:22 Message: Logged In: YES user_id=60903 An updated patch, which now matches Milan's version. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=590352&group_id=5470 From noreply@sourceforge.net Wed Aug 7 09:27:28 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Wed, 07 Aug 2002 01:27:28 -0700 Subject: [Patches] [ python-Patches-588809 ] LDFLAGS support for build_ext.py Message-ID: Patches item #588809, was opened at 2002-07-30 23:36 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=588809&group_id=5470 Category: Distutils and setup.py Group: Python 2.2.x Status: Open Resolution: None Priority: 5 Submitted By: Robert Weber (chipsforbrains) Assigned to: Martin v. 
Löwis (loewis) Summary: LDFLAGS support for build_ext.py Initial Comment: a hack at best ---------------------------------------------------------------------- >Comment By: Martin v. Löwis (loewis) Date: 2002-08-07 10:27 Message: Logged In: YES user_id=21627 The patch looks fine to me, but I'd like to hear the opinion of a distutils guru. ---------------------------------------------------------------------- Comment By: Robert Weber (chipsforbrains) Date: 2002-08-06 21:35 Message: Logged In: YES user_id=245624 > As a hack, I think it is unacceptable for Python. > >I'd encourage you to integrate this (and CFLAGS) into >sysconfig.customize_compiler. > >It would be ok if only the Unix compiler honors those >settings for now. > Martin v. Löwis (loewis) I have written a better patch to sysconfig.py that does CFLAGS and all others so that everything works like autoconf. I will post the patch in a sec. ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-08-04 11:05 Message: Logged In: YES user_id=21627 As a hack, I think it is unacceptable for Python. I'd encourage you to integrate this (and CFLAGS) into sysconfig.customize_compiler. It would be ok if only the Unix compiler honors those settings for now.
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=588809&group_id=5470 From noreply@sourceforge.net Wed Aug 7 14:43:25 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Wed, 07 Aug 2002 06:43:25 -0700 Subject: [Patches] [ python-Patches-592065 ] Cleanup, speedup iterobject Message-ID: Patches item #592065, was opened at 2002-08-07 08:43 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=592065&group_id=5470 Category: Core (C code) Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Raymond Hettinger (rhettinger) Assigned to: Tim Peters (tim_one) Summary: Cleanup, speedup iterobject Initial Comment: Moved special case for tuples from iterobject.c to tupleobject.c. Makes the code in iterobject.c cleaner and speeds-up the general case by not checking for tuples everytime. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=592065&group_id=5470 From noreply@sourceforge.net Thu Aug 8 11:12:33 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 08 Aug 2002 03:12:33 -0700 Subject: [Patches] [ python-Patches-592529 ] Split-out ntmodule.c Message-ID: Patches item #592529, was opened at 2002-08-08 12:12 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=592529&group_id=5470 Category: Windows Group: None Status: Open Resolution: None Priority: 5 Submitted By: Martin v. L�wis (loewis) Assigned to: Tim Peters (tim_one) Summary: Split-out ntmodule.c Initial Comment: This patch moves the MS_WINDOWS code from posixmodule.c into ntmodule.c. The OS/2 code is left in posixmodule.c. I believe this patch significantly improves readability of both modules (posix and nt), even though it adds a slight code duplication. 
It also gives Windows developers the chance to adjust the implementation better to the Win32 API without fear of breaking the POSIX versions. Attached are three files: the ntmodule.c source code, the posixmodule.c diff, and the pcbuild diff. Since the patches will outdate quickly, I'd appreciate if that patch could be accepted or rejected quickly. Randomly assigning to Tim. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=592529&group_id=5470 From noreply@sourceforge.net Thu Aug 8 11:25:05 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 08 Aug 2002 03:25:05 -0700 Subject: [Patches] [ python-Patches-591551 ] Remove symlink python during install Message-ID: Patches item #591551, was opened at 2002-08-06 15:48 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=591551&group_id=5470 Category: Build Group: Python 2.3 Status: Open >Resolution: Accepted Priority: 5 Submitted By: Jack Jansen (jackjansen) >Assigned to: Jack Jansen (jackjansen) Summary: Remove symlink python during install Initial Comment: During "make bininstall" an existing "python" in the bindir is removed. This fails, however, if the file in question is a symlink in stead of a regular file/hardlink. As an OSX framework install can deposit a symlink into /usr/local/bin/python it would be nice if a subsequent normal install would do the right thing. I'm posting this as a patch because I'm not sure how common "test -L file" is. Otherwise "test -e file" may be a better idea. Assigned to Martin as he seems to be one of the major build gurus. ---------------------------------------------------------------------- >Comment By: Martin v. L�wis (loewis) Date: 2002-08-08 12:25 Message: Logged In: YES user_id=21627 The patch sounds good to me. 
It might be that the system does not support test -L; in this case, we will need autoconf magic to find whether test supports -L. I think we can defer this until the problem comes up; I'd appreciate it if you could volunteer to add the autoconf test if the problem comes up. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=591551&group_id=5470 From noreply@sourceforge.net Thu Aug 8 16:35:51 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 08 Aug 2002 08:35:51 -0700 Subject: [Patches] [ python-Patches-592646 ] Prefer nb_multipy oversq_repeat Message-ID: Patches item #592646, was opened at 2002-08-08 15:35 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=592646&group_id=5470 Category: Core (C code) Group: None Status: Open Resolution: None Priority: 5 Submitted By: Neil Schemenauer (nascheme) Assigned to: Guido van Rossum (gvanrossum) Summary: Prefer nb_multipy oversq_repeat Initial Comment: From David Beazley: ============================================== class Foo(object): def __mul__(self,other): print "__mul__" def __rmul__(self,other): print "__rmul__" With Python-2.2.1, if you try this, you get the following behavior: >>> f = Foo() >>> f*1.0 __mul__ >>> 1.0*f __rmul__ >>> f*1 __mul__ >>> 1*f __mul__ ============================================== This is because the int object prefers calling sq_repeat over nb_multiply if a type defines both. The attached patch changes int_mul to prefer nb_multiply over sq_repeat. This is a change in behavior and requires some careful thought. The tests run okay with this change but that doesn't mean other code will not break.
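The dispatch the patch aims for can be sketched with a variant of the example above that returns the method name instead of printing it. The sketch assumes a current Python 3 interpreter, where an int on the left-hand side defers to the right operand's __rmul__; it shows the intended outcome, not the C change to int_mul itself.

```python
class Foo(object):
    """Report which multiplication hook was invoked."""

    def __mul__(self, other):
        return "__mul__"

    def __rmul__(self, other):
        return "__rmul__"


f = Foo()

# Foo on the left always uses __mul__; a number on the left falls back
# to Foo.__rmul__, for ints as well as floats.
results = [f * 1.0, 1.0 * f, f * 1, 1 * f]
print(results)
```

The last entry is the one that differs from the Python 2.2.1 session quoted in the report, where 1*f ended up in __mul__.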
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=592646&group_id=5470 From noreply@sourceforge.net Thu Aug 8 16:36:46 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 08 Aug 2002 08:36:46 -0700 Subject: [Patches] [ python-Patches-592646 ] Prefer nb_multipy over sq_repeat Message-ID: Patches item #592646, was opened at 2002-08-08 15:35 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=592646&group_id=5470 Category: Core (C code) Group: None Status: Open Resolution: None Priority: 5 Submitted By: Neil Schemenauer (nascheme) Assigned to: Guido van Rossum (gvanrossum) >Summary: Prefer nb_multipy over sq_repeat Initial Comment: From David Beazley: ============================================== class Foo(object): def __mul__(self,other): print "__mul__" def __rmul__(self,other): print "__rmul__" With Python-2.2.1, if you try this, you get the following behavior: >>> f = Foo() >>> f*1.0 __mul__ >>> 1.0*f __rmul__ >>> f*1 __mul__ >>> 1*f __mul__ ============================================== This is because the int object prefers calling sq_repeat over nb_multiply if a type defines both. The attached patch changes int_mul to prefer nb_multiply over sq_repeat. This is a change in behavior and requires some careful thought. The tests run okay with this change but that doesn't mean other code will not break.
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=592646&group_id=5470 From noreply@sourceforge.net Thu Aug 8 17:32:34 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 08 Aug 2002 09:32:34 -0700 Subject: [Patches] [ python-Patches-592529 ] Split-out ntmodule.c Message-ID: Patches item #592529, was opened at 2002-08-08 06:12 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=592529&group_id=5470 Category: Windows Group: None Status: Open Resolution: None Priority: 5 Submitted By: Martin v. Löwis (loewis) >Assigned to: Guido van Rossum (gvanrossum) Summary: Split-out ntmodule.c Initial Comment: This patch moves the MS_WINDOWS code from posixmodule.c into ntmodule.c. The OS/2 code is left in posixmodule.c. I believe this patch significantly improves readability of both modules (posix and nt), even though it adds a slight code duplication. It also gives Windows developers the chance to adjust the implementation better to the Win32 API without fear of breaking the POSIX versions. Attached are three files: the ntmodule.c source code, the posixmodule.c diff, and the pcbuild diff. Since the patches will outdate quickly, I'd appreciate if that patch could be accepted or rejected quickly. Randomly assigning to Tim. ---------------------------------------------------------------------- >Comment By: Tim Peters (tim_one) Date: 2002-08-08 12:32 Message: Logged In: YES user_id=31435 I'm -0, so assigning to Guido for another opinion. I expect this will actually make it harder to keep the os interface consistent and working across platforms; e.g., somebody adds an os function in one module but forgets to add it in the other (likely because they don't even know it exists); a docstring repair shows up in one but not both; a largefile fix in one doesn't get reflected in the other; etc.
Apart from the massive Windows popen pain, there are actually more embedded PYOS_OS2 #ifdefs in posixmodule. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=592529&group_id=5470 From noreply@sourceforge.net Thu Aug 8 19:21:15 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 08 Aug 2002 11:21:15 -0700 Subject: [Patches] [ python-Patches-592646 ] Prefer nb_multipy over sq_repeat Message-ID: Patches item #592646, was opened at 2002-08-08 11:35 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=592646&group_id=5470 Category: Core (C code) Group: None Status: Open >Resolution: Accepted Priority: 5 Submitted By: Neil Schemenauer (nascheme) >Assigned to: Neil Schemenauer (nascheme) Summary: Prefer nb_multipy over sq_repeat Initial Comment: >From David Beazlely: ============================================== class Foo(object): def __mul__(self,other): print "__mul__" def __rmul__(self,other): print "__rmul__" Python-2.2.1, if you try this, you get the following behavior: >>> f = Foo() >>> f*1.0 __mul__ >>> 1.0*f __rmul__ >>> f*1 __mul__ >>> 1*f __mul__ ============================================== This is because the int object prefers calling sq_repeat over nb_multiply if a type defines both. The attached patch changes int_mul to prefer nb_multiply over sq_repeat. This is a change is behavior and requires some careful thought. The tests run okay with this change but that doesn't mean other code will not break. ---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-08 14:21 Message: Logged In: YES user_id=6380 Looks good to me, but s/of/if/ in the comment. :-) Despite the semantic change I think this should be backported to 2.2; it's more a bug than a feature... 
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=592646&group_id=5470 From noreply@sourceforge.net Thu Aug 8 20:01:33 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 08 Aug 2002 12:01:33 -0700 Subject: [Patches] [ python-Patches-589982 ] tempfile.py rewrite Message-ID: Patches item #589982, was opened at 2002-08-01 23:38 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=589982&group_id=5470 Category: Library (Lib) Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Zack Weinberg (zackw) Assigned to: Nobody/Anonymous (nobody) Summary: tempfile.py rewrite Initial Comment: This rewrite closes a number of security-relevant races in tempfile.py; makes temporary filenames much harder to guess; provides secure interfaces that can be used to close similar races elsewhere; and makes it possible to control the prefix and directory of each temporary created, individually. ---------------------------------------------------------------------- >Comment By: Zack Weinberg (zackw) Date: 2002-08-08 12:01 Message: Logged In: YES user_id=580015 I'm afraid my idea of patch size comes from GCC land, where this would be considered not that big at all. Only 1500 lines affected, and more than half of that is documentation and test suite! I tried, and failed, to break up the changes to tempfile.py itself. But there's some larger divisions that could be made. We could check in the new test_tempfile.py now, disabling the tests that refer to nonexistent functions (just comment out the lines that add those tests to the test_classes array). The changes to the rest of the test suite are also largely independent of the tempfile.py rewrite (since they replace tempfile.mktemp() with TESTFN, mostly). And the search-and-replace changes in the library can wait until after tempfile.py itself gets reviewed. 
Unfortunately, I am about to go on vacation for five days, so I don't have time now to do this split-up. I will try to drum up interest on python-dev in reviewing the patch as is. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-05 09:48 Message: Logged In: YES user_id=6380 I like the idea of fixing security holes. This patch is *humungous*. Even just the doc changes and the changes to tempfile.py itself are massive and require very careful reading to review all the consequences. Zack, can you try to interest someone with more time than me in reviewing this patch? What's the point of renaming all imports with a leading underscore? I thought __all__ took care of that. ---------------------------------------------------------------------- Comment By: Zack Weinberg (zackw) Date: 2002-08-02 23:53 Message: Logged In: YES user_id=580015 I've revised the patch; ignore the old one. This version includes a vastly expanded test_tempfile.py which hits every line that I know how to test. The omissions are marked - it's mostly non-Unix issues. Also, I went through the entire CVS repository and replaced all uses of tempfile.mktemp with mkstemp/mkdtemp/NamedTemporaryFile, as appropriate. The sole exception is Lib/os.py, which is addressed by patch #590294. The sole functional change to tempfile.py itself, from the previous, is to throw os.O_NOFOLLOW into the open flags. This closes yet another hole - on some systems, without this flag, open(file, O_CREAT|O_EXCL) will follow a symbolic link that points to a nonexistent file, and create the link target. (This has no effect on a symlink in the directory components of the pathname - if the sysadmin has symlinked /tmp to /hugedisk/scratch, that still works.) 
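A minimal sketch of the open flags discussed in the comment above, assuming a POSIX system where symbolic links can be created. The helper name `create_exclusive`, the 0o600 mode, and the fallback to 0 when `os.O_NOFOLLOW` is missing are assumptions of this sketch, not taken from the patch. It points a dangling symlink at a nonexistent target and checks that the exclusive create refuses to go through it:

```python
import os
import shutil
import tempfile

# O_NOFOLLOW is not defined on every platform; falling back to 0 is an
# assumption of this sketch, not something the patch is known to do.
_NOFOLLOW = getattr(os, "O_NOFOLLOW", 0)
FLAGS = os.O_RDWR | os.O_CREAT | os.O_EXCL | _NOFOLLOW


def create_exclusive(path):
    """Create path for reading and writing; fail if anything is already there."""
    return os.open(path, FLAGS, 0o600)


scratch = tempfile.mkdtemp()
target = os.path.join(scratch, "target")   # deliberately never created
link = os.path.join(scratch, "link")
os.symlink(target, link)                   # dangling symlink to target

try:
    os.close(create_exclusive(link))
    symlink_refused = False
except OSError:
    symlink_refused = True

# If the open had followed the link, the target would now exist.
target_created = os.path.exists(target)

# A name with nothing in the way is created normally.
fresh_fd = create_exclusive(os.path.join(scratch, "fresh"))
os.close(fresh_fd)

shutil.rmtree(scratch)
print("symlink refused:", symlink_refused, "target created:", target_created)
```

On systems that already reject a symlink under O_CREAT|O_EXCL the extra flag changes nothing; it matters only on the systems the comment describes.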
---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-02 07:45 Message: Logged In: YES user_id=6380 This needs some serious review! Volunteers??? ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=589982&group_id=5470 From noreply@sourceforge.net Thu Aug 8 20:47:51 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 08 Aug 2002 12:47:51 -0700 Subject: [Patches] [ python-Patches-588561 ] Cygwin _hotshot patch Message-ID: Patches item #588561, was opened at 2002-07-30 05:52 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=588561&group_id=5470 Category: Modules Group: None >Status: Closed Resolution: Accepted Priority: 5 Submitted By: Jason Tishler (jlt63) Assigned to: Jason Tishler (jlt63) Summary: Cygwin _hotshot patch Initial Comment: YA Cygwin module patch very similar to other patches that I have submitted. I tested under Cygwin and Red Hat Linux 7.1. ---------------------------------------------------------------------- >Comment By: Jason Tishler (jlt63) Date: 2002-08-08 11:47 Message: Logged In: YES user_id=86216 Committed as Modules/_hotshot.c 1.25. ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-08-04 00:44 Message: Logged In: YES user_id=21627 This is ok, please apply it.
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=588561&group_id=5470 From noreply@sourceforge.net Thu Aug 8 20:52:00 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 08 Aug 2002 12:52:00 -0700 Subject: [Patches] [ python-Patches-592646 ] Prefer nb_multipy over sq_repeat Message-ID: Patches item #592646, was opened at 2002-08-08 15:35 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=592646&group_id=5470 Category: Core (C code) Group: None Status: Open Resolution: Accepted Priority: 5 Submitted By: Neil Schemenauer (nascheme) >Assigned to: Guido van Rossum (gvanrossum) Summary: Prefer nb_multipy over sq_repeat Initial Comment: >From David Beazlely: ============================================== class Foo(object): def __mul__(self,other): print "__mul__" def __rmul__(self,other): print "__rmul__" Python-2.2.1, if you try this, you get the following behavior: >>> f = Foo() >>> f*1.0 __mul__ >>> 1.0*f __rmul__ >>> f*1 __mul__ >>> 1*f __mul__ ============================================== This is because the int object prefers calling sq_repeat over nb_multiply if a type defines both. The attached patch changes int_mul to prefer nb_multiply over sq_repeat. This is a change is behavior and requires some careful thought. The tests run okay with this change but that doesn't mean other code will not break. 
---------------------------------------------------------------------- >Comment By: Neil Schemenauer (nascheme) Date: 2002-08-08 19:52 Message: Logged In: YES user_id=35752 This affects extension classes: import ExtensionClass class A(ExtensionClass.Base): def __mul__(self, other): print '__mul__' return 1 a = A() # without patch a*1 # prints __mul__ a*1.0 # prints __mul__ 1.0*a # prints __mul__ 1.0*a # raises TypeError # with patch a*1 # prints __mul__ a*1.0 # prints __mul__ 1*a # raises TypeError 1.0*a # raises TypeError The new behavior is more consistent. Do you still want it in 2.2? ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-08 18:21 Message: Logged In: YES user_id=6380 Looks good to me, but s/of/if/ in the comment. :-) Despite the semantic change I think this should be backported to 2.2; it's more a bug than a feature... ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=592646&group_id=5470 From noreply@sourceforge.net Thu Aug 8 20:54:56 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 08 Aug 2002 12:54:56 -0700 Subject: [Patches] [ python-Patches-592646 ] Prefer nb_multipy over sq_repeat Message-ID: Patches item #592646, was opened at 2002-08-08 15:35 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=592646&group_id=5470 Category: Core (C code) Group: None Status: Open Resolution: Accepted Priority: 5 Submitted By: Neil Schemenauer (nascheme) Assigned to: Guido van Rossum (gvanrossum) Summary: Prefer nb_multipy over sq_repeat Initial Comment: >From David Beazlely: ============================================== class Foo(object): def __mul__(self,other): print "__mul__" def __rmul__(self,other): print "__rmul__" Python-2.2.1, if you try this, you get the following behavior: >>> f = Foo() >>> f*1.0 __mul__ 
>>> 1.0*f __rmul__ >>> f*1 __mul__ >>> 1*f __mul__ ============================================== This is because the int object prefers calling sq_repeat over nb_multiply if a type defines both. The attached patch changes int_mul to prefer nb_multiply over sq_repeat. This is a change is behavior and requires some careful thought. The tests run okay with this change but that doesn't mean other code will not break. ---------------------------------------------------------------------- >Comment By: Neil Schemenauer (nascheme) Date: 2002-08-08 19:54 Message: Logged In: YES user_id=35752 Argh! Stupid typos. # without patch 1*a # prints __mul__ 1.0*a # raises TypeError Extension classes do not define __r* methods. ---------------------------------------------------------------------- Comment By: Neil Schemenauer (nascheme) Date: 2002-08-08 19:52 Message: Logged In: YES user_id=35752 This affects extension classes: import ExtensionClass class A(ExtensionClass.Base): def __mul__(self, other): print '__mul__' return 1 a = A() # without patch a*1 # prints __mul__ a*1.0 # prints __mul__ 1.0*a # prints __mul__ 1.0*a # raises TypeError # with patch a*1 # prints __mul__ a*1.0 # prints __mul__ 1*a # raises TypeError 1.0*a # raises TypeError The new behavior is more consistent. Do you still want it in 2.2? ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-08 18:21 Message: Logged In: YES user_id=6380 Looks good to me, but s/of/if/ in the comment. :-) Despite the semantic change I think this should be backported to 2.2; it's more a bug than a feature... 
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=592646&group_id=5470 From noreply@sourceforge.net Thu Aug 8 20:58:20 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 08 Aug 2002 12:58:20 -0700 Subject: [Patches] [ python-Patches-592646 ] Prefer nb_multipy over sq_repeat Message-ID: Patches item #592646, was opened at 2002-08-08 11:35 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=592646&group_id=5470 Category: Core (C code) Group: None Status: Open Resolution: Accepted Priority: 5 Submitted By: Neil Schemenauer (nascheme) Assigned to: Guido van Rossum (gvanrossum) Summary: Prefer nb_multipy over sq_repeat Initial Comment: >From David Beazlely: ============================================== class Foo(object): def __mul__(self,other): print "__mul__" def __rmul__(self,other): print "__rmul__" Python-2.2.1, if you try this, you get the following behavior: >>> f = Foo() >>> f*1.0 __mul__ >>> 1.0*f __rmul__ >>> f*1 __mul__ >>> 1*f __mul__ ============================================== This is because the int object prefers calling sq_repeat over nb_multiply if a type defines both. The attached patch changes int_mul to prefer nb_multiply over sq_repeat. This is a change is behavior and requires some careful thought. The tests run okay with this change but that doesn't mean other code will not break. ---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-08 15:58 Message: Logged In: YES user_id=6380 Yes, I still want it in 2.2. I presume Extensionclasses behave the same way with 2.3? I'll see if Zope 2 has a problem with this patch, and if it does, I'll fix Zope 2. 
---------------------------------------------------------------------- Comment By: Neil Schemenauer (nascheme) Date: 2002-08-08 15:54 Message: Logged In: YES user_id=35752 Argh! Stupid typos. # without patch 1*a # prints __mul__ 1.0*a # raises TypeError Extension classes do not define __r* methods. ---------------------------------------------------------------------- Comment By: Neil Schemenauer (nascheme) Date: 2002-08-08 15:52 Message: Logged In: YES user_id=35752 This affects extension classes: import ExtensionClass class A(ExtensionClass.Base): def __mul__(self, other): print '__mul__' return 1 a = A() # without patch a*1 # prints __mul__ a*1.0 # prints __mul__ 1.0*a # prints __mul__ 1.0*a # raises TypeError # with patch a*1 # prints __mul__ a*1.0 # prints __mul__ 1*a # raises TypeError 1.0*a # raises TypeError The new behavior is more consistent. Do you still want it in 2.2? ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-08 14:21 Message: Logged In: YES user_id=6380 Looks good to me, but s/of/if/ in the comment. :-) Despite the semantic change I think this should be backported to 2.2; it's more a bug than a feature... 
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=592646&group_id=5470 From noreply@sourceforge.net Thu Aug 8 21:15:00 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 08 Aug 2002 13:15:00 -0700 Subject: [Patches] [ python-Patches-592646 ] Prefer nb_multipy over sq_repeat Message-ID: Patches item #592646, was opened at 2002-08-08 11:35 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=592646&group_id=5470 Category: Core (C code) Group: None Status: Open Resolution: Accepted Priority: 5 Submitted By: Neil Schemenauer (nascheme) >Assigned to: Neil Schemenauer (nascheme) Summary: Prefer nb_multipy over sq_repeat Initial Comment: >From David Beazlely: ============================================== class Foo(object): def __mul__(self,other): print "__mul__" def __rmul__(self,other): print "__rmul__" Python-2.2.1, if you try this, you get the following behavior: >>> f = Foo() >>> f*1.0 __mul__ >>> 1.0*f __rmul__ >>> f*1 __mul__ >>> 1*f __mul__ ============================================== This is because the int object prefers calling sq_repeat over nb_multiply if a type defines both. The attached patch changes int_mul to prefer nb_multiply over sq_repeat. This is a change is behavior and requires some careful thought. The tests run okay with this change but that doesn't mean other code will not break. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-08 15:58 Message: Logged In: YES user_id=6380 Yes, I still want it in 2.2. I presume Extensionclasses behave the same way with 2.3? I'll see if Zope 2 has a problem with this patch, and if it does, I'll fix Zope 2. 
---------------------------------------------------------------------- Comment By: Neil Schemenauer (nascheme) Date: 2002-08-08 15:54 Message: Logged In: YES user_id=35752 Argh! Stupid typos. # without patch 1*a # prints __mul__ 1.0*a # raises TypeError Extension classes do not define __r* methods. ---------------------------------------------------------------------- Comment By: Neil Schemenauer (nascheme) Date: 2002-08-08 15:52 Message: Logged In: YES user_id=35752 This affects extension classes: import ExtensionClass class A(ExtensionClass.Base): def __mul__(self, other): print '__mul__' return 1 a = A() # without patch a*1 # prints __mul__ a*1.0 # prints __mul__ 1.0*a # prints __mul__ 1.0*a # raises TypeError # with patch a*1 # prints __mul__ a*1.0 # prints __mul__ 1*a # raises TypeError 1.0*a # raises TypeError The new behavior is more consistent. Do you still want it in 2.2? ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-08 14:21 Message: Logged In: YES user_id=6380 Looks good to me, but s/of/if/ in the comment. :-) Despite the semantic change I think this should be backported to 2.2; it's more a bug than a feature... 
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=592646&group_id=5470 From noreply@sourceforge.net Thu Aug 8 21:19:58 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 08 Aug 2002 13:19:58 -0700 Subject: [Patches] [ python-Patches-590294 ] os._execvpe security fix Message-ID: Patches item #590294, was opened at 2002-08-02 14:21 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=590294&group_id=5470 Category: Modules Group: Python 2.3 >Status: Closed Resolution: Accepted Priority: 5 Submitted By: Zack Weinberg (zackw) Assigned to: Guido van Rossum (gvanrossum) Summary: os._execvpe security fix Initial Comment: 1) Do not attempt to exec a file which does not exist just to find out what error the operating system returns. This is an exploitable race on all platforms that support symbolic links. 2) Immediately re-raise the exception if we get an error other than errno.ENOENT or errno.ENOTDIR. This may need to be adapted for other platforms. (As a security issue, this should be considered for 2.1 and 2.2 as well as 2.3.) ---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-08 16:19 Message: Logged In: YES user_id=6380 All backported. (Note that as a side effect of this patch, changes to Modules/Setup[.dist] had to be made and backported to compile the errno module statically, because the patch introduces a dependency on it to distutils and hence to the setup.py script.) ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-05 12:14 Message: Logged In: YES user_id=6380 OK, checked in for 2.3. Keeping this open until I find the time to backport it to 2.2 and 2.1 (or someone else does that). 
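A rough sketch of point 2 from the initial comment, assuming Python 3 and using errno constants directly rather than probing the operating system with a nonexistent file. The function name `find_and_exec`, the injectable `exec_func` parameter, and the fake used in the demonstration are hypothetical; this is not the code that was checked in. The search continues past "not found here" errors and stops at the first error of any other kind:

```python
import errno
import os


def find_and_exec(file, args, env, exec_func=os.execve):
    """Try each PATH directory in turn; re-raise unexpected errors at once."""
    last_error = None
    # str.split always yields at least one element, so the loop body runs
    # at least once and last_error is set before the final raise.
    for directory in env.get("PATH", os.defpath).split(os.pathsep):
        candidate = os.path.join(directory, file)
        try:
            return exec_func(candidate, args, env)  # a real exec never returns
        except OSError as exc:
            if exc.errno not in (errno.ENOENT, errno.ENOTDIR):
                raise              # e.g. EACCES: do not keep searching
            last_error = exc       # not in this directory: try the next one
    raise last_error


# Demonstration with a stand-in for exec, so no process is replaced.
calls = []


def fake_exec(path, args, env):
    calls.append(path)
    if path.startswith("/missing"):
        raise OSError(errno.ENOENT, "No such file or directory", path)
    raise OSError(errno.EACCES, "Permission denied", path)


demo_env = {"PATH": os.pathsep.join(["/missing", "/denied", "/never"])}
raised_errno = None
try:
    find_and_exec("tool", ["tool"], demo_env, exec_func=fake_exec)
except OSError as exc:
    raised_errno = exc.errno
print(calls, raised_errno == errno.EACCES)
```

The third directory is never tried, because the permission error from the second one is raised immediately.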
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=590294&group_id=5470 From noreply@sourceforge.net Thu Aug 8 21:41:50 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 08 Aug 2002 13:41:50 -0700 Subject: [Patches] [ python-Patches-555085 ] timeout socket implementation Message-ID: Patches item #555085, was opened at 2002-05-12 08:11 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=555085&group_id=5470 Category: Library (Lib) Group: Python 2.3 >Status: Closed Resolution: Accepted Priority: 4 Submitted By: Michael Gilfix (mgilfix) Assigned to: Guido van Rossum (gvanrossum) Summary: timeout socket implementation Initial Comment: This implements bug #457114 and implements timed socket operations. If a timeout is set and the timeout period elapses before the socket operation has finished, a socket.error exception is thrown. This patch integrates the functionality at two levels: the timeout capability is integrated at the C level in socketmodule.c. Socket.py was also modified to update fileobject creation on a win platform to handle the case of the underlying socket throwing an exception. The tex documentation was also updated and a new regression unit was provided as test_timeout.py. ---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-08 16:41 Message: Logged In: YES user_id=6380 Andrew, I've checked in your patches. I had to review them anyway and I decided to rewrite the testSendAll() check to be more informative. Thanks! If there are other unsolved issues, please open a new bug report and assign it to me.
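A short sketch of the timeout behaviour described in the initial comment, using the socket API as it exists in current Python 3 (`settimeout`, `gettimeout`, `socket.timeout`). The local socket pair and the 0.2 second value are choices made for this illustration only. Nothing is ever written to the peer, so the `recv()` call can only finish by timing out:

```python
import socket

# A connected pair of local sockets. No data is sent on `peer`, so the
# recv() below has nothing to read and must wait for the timeout.
reader, peer = socket.socketpair()
reader.settimeout(0.2)          # seconds; None would mean block indefinitely

try:
    reader.recv(1024)
    timed_out = False
except socket.timeout:          # a subclass of socket.error (OSError)
    timed_out = True
finally:
    configured = reader.gettimeout()
    reader.close()
    peer.close()

print("timed out:", timed_out, "configured timeout:", configured)
```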
---------------------------------------------------------------------- Comment By: Andrew I MacIntyre (aimacintyre) Date: 2002-08-04 03:09 Message: Logged In: YES user_id=250749 After discussing the OS/2 issues privately with Michael, the outstanding issues are resolved with the socketmodule.c and test_socket.py patches I've uploaded here. socketmodule.c.nb-connect.diff: in the non-blocking connect, OS/2 is returning EINPROGRESS from the initial connection attempt, and after the internal_select(), the subsequent connection attempt returns EISCONN. This appears to be perfectly legitimate, although FreeBSD and Linux haven't been seen to return the EINPROGRESS. The patch adds specific handling for the EISCONN after EINPROGRESS case, matching the semantics already in place for the Windows version of the code. test_socket.py.sendall.diff: the existing sendall() test is flawed as the recv() call makes no guarantees about waiting for all the data requested. OS/2 required a 100ms sleep in the recv loop to get all the data. Rewriting the receive test to allow for recv() not waiting for data still in transit is more correct. Note that these interpretations of "correctness" have been based on FreeBSD manpages, which is the only sockets documentation I currently have. If these are acceptable to Guido, and Michael gets to test them on Linux, I can relieve Guido of committing them and closing this patch. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-07-30 14:41 Message: Logged In: YES user_id=6380 Michael and Andrew, if you can deal with this without my involvement I would greatly appreciate it.
;-) ---------------------------------------------------------------------- Comment By: Michael Gilfix (mgilfix) Date: 2002-07-30 10:25 Message: Logged In: YES user_id=116038 If Guido is busy (And I'm sure he is), I'd be willing to take a hack at the problem if you could email me privately and provide a testing environment (No OS/2 EMX in my apt ;) ). ---------------------------------------------------------------------- Comment By: Andrew I MacIntyre (aimacintyre) Date: 2002-07-29 22:28 Message: Logged In: YES user_id=250749 In private mail to/from Guido, it appears that the FreeBSD issues were in test_socket.py, and have been addressed. I still have outstanding issues on OS/2 EMX, which I sent to Guido privately but will add here as soon as I can. ---------------------------------------------------------------------- Comment By: Michael Gilfix (mgilfix) Date: 2002-07-23 16:43 Message: Logged In: YES user_id=116038 Now that I'm back :) I checked the archive and this seems to have been handled by you. Please let me know if it isn't resolved and I can give it a closer look. Also, perhaps I should contact Bernie and ask him if there's anything he hasn't gotten around to in the test_timeout that I can off-load from him. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-07-18 13:11 Message: Logged In: YES user_id=6380 The default timeout is now implemented in CVS. There's a bug report from Andrew Macintyre (unfortunately on python-dev) about test_socket.py failures on FreeBSD. I'll try to keep an eye on that, so this patch *still* stays open. Also, Bernie has promised some changes that I haven't received yet and the details of which I don't recall (sorry :-( ). ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-06-07 21:47 Message: Logged In: YES user_id=6380 Keeping this open as a reminder of things still to finish. 
Most is in the python-dev discussion; Michael Gilfix and Bernard Yue have offered to produce more patches. One feature we definitely want is a way to specify a timeout to be applied to all new sockets. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-06-06 17:11 Message: Logged In: YES user_id=6380 Thanks for the new version! I've checked this in. I made considerable changes; the following is feedback but you don't need to respond because I've addressed all these in the checked-in code! - Thanks for the cleanup of some non-standard formatting. However, it's better not to do this so the diffs don't show changes that are unrelated to the timeout patch. - You are still importing the select module instead of calling select() directly. I really think you should do the latter -- the select module has an enormous overhead (it allocates several large lists on the heap). - Instead of explicitly testing the argument to settimeout for being a float, int or long, you should simply call PyFloat_AsDouble and handle the error; if someone passes another object that implements __float__ that should be acceptable. - gettimeout() returns sock_timeout without checking if it is NULL. It can be NULL when a socket object is never initialized. E.g. I can do this: >>> from socket import * >>> s = socket.__new__(socket) >>> s.gettimeout() which gives me a segfault. There are probably other places where this is assumed. - I addressed the latter two issues by making sock_timeout a double, whose value is < 0.0 when no timeout is set. ---------------------------------------------------------------------- Comment By: Michael Gilfix (mgilfix) Date: 2002-06-05 18:23 Message: Logged In: YES user_id=116038 I've addressed all the issues brought up by Guido. The 2nd version of the patch is attached here. 
In this version, I've modified test_socket.py to include tests for the _fileobject class in socket.py that was modified by this patch. _fileobject needed to be modified so that data would not be lost when the underlying socket threw an exception (data was no longer accumulated in local variables). The tests for the _fileobject class succeed on older versions of python (tested 2.1.3) and pass on the newer version of python. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-05-23 16:18 Message: Logged In: YES user_id=6380 For a detailed review, see http://mail.python.org/pipermail/python-dev/2002-May/024340.html ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=555085&group_id=5470 From noreply@sourceforge.net Thu Aug 8 22:05:08 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 08 Aug 2002 14:05:08 -0700 Subject: [Patches] [ python-Patches-466352 ] let mailbox.Maildir tag messages as read Message-ID: Patches item #466352, was opened at 2001-09-29 13:42 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=466352&group_id=5470 Category: Library (Lib) Group: Python 2.3 >Status: Closed >Resolution: Rejected Priority: 5 Submitted By: Nobody/Anonymous (nobody) Assigned to: Fred L. Drake, Jr. (fdrake) Summary: let mailbox.Maildir tag messages as read Initial Comment: http://c0re.jp/c0de/misc/python-maildir2.patch This patch changes python's mailbox.Maildir class to move processed messages from new/ to cur/. Although not explicitly stated in http://cr.yp.to/proto/maildir.html all applications using Maildirs I'm aware of move messages from new/ to cur/ after the first reading of a message. This patch gives you a way to get the same behaviour in python by giving a third parameter to __init__.
See mailbox.Maildir.__init__.__doc__ --drt@un.bewaff.net - http://c0re.jp/

--- Lib-orig/mailbox.py Sat Sep 29 13:03:12 2001
+++ Lib/mailbox.py Sat Sep 29 13:36:36 2001
@@ -201,11 +201,16 @@
 class Maildir:
-    # Qmail directory mailbox
+    # qmail/maildrop directory mailbox
+    # see http://cr.yp.to/proto/maildir.html
-    def __init__(self, dirname, factory=rfc822.Message):
+    def __init__(self, dirname, factory=rfc822.Message, move=0):
+        '''if you supply the constructor with a third parameter which is
+        not equal 0, this class will mark all messages, you processed with
+        next() as read by moving them from new/ to cur/'''
         self.dirname = dirname
         self.factory = factory
+        self.move = move
         # check for new mail
         newdir = os.path.join(self.dirname, 'new')
@@ -225,6 +230,11 @@
         fn = self.boxes[0]
         del self.boxes[0]
         fp = open(fn)
+        if not self.move == 0:
+            # if the message is considered new, mark it as seen
+            (head, tail) = os.path.split(fn)
+            if(head[-3:] == 'new'):
+                os.rename(fn, os.path.join(head[:-3], 'cur', tail + ':2,S'))
         return self.factory(fp)

---------------------------------------------------------------------- >Comment By: Martin v. Löwis (loewis) Date: 2002-08-08 23:05 Message: Logged In: YES user_id=21627 It appears that the original patch has been rejected, so I'm closing it now. ---------------------------------------------------------------------- Comment By: Fred L. Drake, Jr. (fdrake) Date: 2001-11-09 17:07 Message: Logged In: YES user_id=3066 Since this is clearly a new feature for the library and we didn't get to it in time for the Python 2.2 betas, I'm marking this postponed and adding it to the Python 2.3 group. ---------------------------------------------------------------------- Comment By: Barry A.
Warsaw (bwarsaw) Date: 2001-10-19 00:55 Message: Logged In: YES user_id=12800 Assigning back to Fred because he was the last person to put his finger on his nose (see him volunteer in his comment of 2001-10-01 below :) ---------------------------------------------------------------------- Comment By: Nobody/Anonymous (nobody) Date: 2001-10-03 20:18 Message: Logged In: NO fdrake's suggestion seems to me like a very sound suggestion. It is a much cleaner general approach than my hack-to-solve-my-actual-problem. In my opinion, in the medium term Python should support full read and write access to mailboxes, because those are the batteries of mail handling. If there is a good suggestion for a clean interface for that I would be happy to do the Maildir implementation. --drt@un.bewaff.net ---------------------------------------------------------------------- Comment By: Fred L. Drake, Jr. (fdrake) Date: 2001-10-01 18:19 Message: Logged In: YES user_id=3066 Guido: I understood that part; my comment was unclear. I certainly think the patch as proposed isn't a bad thing, but it's only useful for a specific range of applications. Abstracting it differently could make it more widely applicable without adding a lot to the library. I'll make a different proposal, that may work a little better: we can add a new method for all the mailbox formats that represent each message as a separate file, passing in the name of the file. That method is responsible for opening the file and returning the message object (with the default implementation using the registered factory), which next() then returns. An application that needs more than the message object can subclass the mailbox and override that method to do what's needed. That should suffice both for the simple case solved by the patch provided here and many other possible applications as well. If that's reasonable, I'll volunteer to make the patch.
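The hook-method proposal in the comment above might look roughly like the following. This is only a sketch under stated assumptions, not the patch Fred volunteered to write: the class names, the method name open_message(), and the use of email.message_from_file as the default factory are all hypothetical, chosen so the example is self-contained. The subclass reproduces the new/ to cur/ move from the original submission.

```python
import email
import os


class FilePerMessageMailbox:
    """Hypothetical stand-in for a mailbox that stores one message per file."""

    def __init__(self, filenames, factory=email.message_from_file):
        self.boxes = list(filenames)
        self.factory = factory

    def open_message(self, filename):
        # Hook (hypothetical name): open the file and build the message
        # object with the registered factory. Subclasses may override it.
        with open(filename) as fp:
            return self.factory(fp)

    def next(self):
        # Return the next message, or None when the mailbox is exhausted.
        if not self.boxes:
            return None
        return self.open_message(self.boxes.pop(0))


class MarkSeenMailbox(FilePerMessageMailbox):
    """Marks messages as seen by moving them from new/ to cur/ after reading."""

    def open_message(self, filename):
        message = FilePerMessageMailbox.open_message(self, filename)
        head, tail = os.path.split(filename)
        if os.path.basename(head) == 'new':
            curdir = os.path.join(os.path.dirname(head), 'cur')
            # ':2,S' is the info suffix the original patch appends.
            os.rename(filename, os.path.join(curdir, tail + ':2,S'))
        return message
```

With this shape, an application that only wants to search messages keeps using the base class, while one that wants the "seen" behaviour opts in by choosing the subclass.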
---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2001-10-01 17:48 Message: Logged In: YES user_id=6380 I'm -0 on this. But Fred, he *did* make it an option unless I misunderstand the "move=0" default arg value. --Guido ---------------------------------------------------------------------- Comment By: Fred L. Drake, Jr. (fdrake) Date: 2001-10-01 17:44 Message: Logged In: YES user_id=3066 Having this as an option is more reasonable than making it do this by default. It's not at all clear to me that this is the right thing to do; an application may want to search the messages without presenting them to the user, so adding the "seen" flag may not be the right thing. I think it might be better to return a proxy for the message returned by the Message factory which adds methods like get_info() and set_info(s), where s is the new info string. Setting the info string would cause the right renaming to be done. Regardless of mechanism, this would make this module something a little different from the strictly read-only thing it is now. Barry, what do you think? ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2001-09-30 20:49 Message: Logged In: YES user_id=6380 Fred, what do you think? Is this reasonable? 
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=466352&group_id=5470 From noreply@sourceforge.net Thu Aug 8 22:16:25 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 08 Aug 2002 14:16:25 -0700 Subject: [Patches] [ python-Patches-479615 ] Fast-path for interned string compares Message-ID: Patches item #479615, was opened at 2001-11-08 15:19 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=479615&group_id=5470 Category: Core (C code) Group: None Status: Open Resolution: None Priority: 5 Submitted By: M.-A. Lemburg (lemburg) Assigned to: M.-A. Lemburg (lemburg) Summary: Fast-path for interned string compares Initial Comment: This patch adds a fast-path for comparing equality of interned strings. The patch boosts performance for comparing identical string objects by some 20% on my machine while not causing any noticable slow-down for other operations (according to tests done with pybench). More infos and benchmarks later... ---------------------------------------------------------------------- >Comment By: M.-A. Lemburg (lemburg) Date: 2002-08-08 21:16 Message: Logged In: YES user_id=38388 I still consider the patch worth adding. The application space where it helps may be small, but also important: it can massively speed up parsers which use interned strings as tokens. ---------------------------------------------------------------------- Comment By: Martin v. L�wis (loewis) Date: 2002-08-08 21:07 Message: Logged In: YES user_id=21627 Is there any progress on this patch, or should it be considered withdrawn? ---------------------------------------------------------------------- Comment By: Neil Schemenauer (nascheme) Date: 2002-03-23 23:35 Message: Logged In: YES user_id=35752 Attached is an updated version of this patch. 
I'm -0 on it since it doesn't seem to help much except for artificial benchmarks. ---------------------------------------------------------------------- Comment By: M.-A. Lemburg (lemburg) Date: 2001-11-08 15:26 Message: Logged In: YES user_id=38388 Output from pybench comparing today's CVS Python with patch (eqpython) and without patch (stdpython): PYBENCH 1.0 Benchmark: eqpython.bench (rounds=10, warp=20) Tests: per run per oper. diff *) ------------------------------------------------------------------------ BuiltinFunctionCalls: 125.55 ms 0.98 us -1.68% BuiltinMethodLookup: 180.10 ms 0.34 us +1.75% CompareFloats: 107.30 ms 0.24 us +2.04% CompareFloatsIntegers: 185.15 ms 0.41 us -0.05% CompareIntegers: 163.50 ms 0.18 us -1.77% CompareInternedStrings: 79.50 ms 0.16 us -20.78% ^^^^^^^^^^^^^^^^^^^^ This is the interesting line :-) ^^^^^^^^^^^^^^^^^^^^^^^^^^ CompareLongs: 110.25 ms 0.24 us +0.09% CompareStrings: 143.40 ms 0.29 us +2.14% CompareUnicode: 118.00 ms 0.31 us +1.68% ConcatStrings: 189.55 ms 1.26 us -1.61% ConcatUnicode: 226.55 ms 1.51 us +1.34% CreateInstances: 202.35 ms 4.82 us -1.87% CreateStringsWithConcat: 221.00 ms 1.11 us +0.45% CreateUnicodeWithConcat: 240.00 ms 1.20 us +1.27% DictCreation: 213.25 ms 1.42 us +0.47% DictWithFloatKeys: 263.50 ms 0.44 us +1.15% DictWithIntegerKeys: 158.50 ms 0.26 us -1.86% DictWithStringKeys: 147.60 ms 0.25 us +0.75% ForLoops: 144.90 ms 14.49 us -4.64% IfThenElse: 174.15 ms 0.26 us -0.00% ListSlicing: 88.80 ms 25.37 us -1.11% NestedForLoops: 136.95 ms 0.39 us +3.01% NormalClassAttribute: 177.80 ms 0.30 us -2.68% NormalInstanceAttribute: 166.85 ms 0.28 us -0.54% PythonFunctionCalls: 152.20 ms 0.92 us +1.40% PythonMethodCalls: 133.70 ms 1.78 us +1.60% Recursion: 119.45 ms 9.56 us +0.04% SecondImport: 124.65 ms 4.99 us -6.03% SecondPackageImport: 130.70 ms 5.23 us -5.73% SecondSubmoduleImport: 161.65 ms 6.47 us -5.88% SimpleComplexArithmetic: 245.50 ms 1.12 us +2.08% SimpleDictManipulation: 108.50 ms 0.36 us +0.05% 
SimpleFloatArithmetic: 125.80 ms 0.23 us +0.84% SimpleIntFloatArithmetic: 128.50 ms 0.19 us -1.46% SimpleIntegerArithmetic: 128.45 ms 0.19 us -0.77% SimpleListManipulation: 159.15 ms 0.59 us -5.32% SimpleLongArithmetic: 189.55 ms 1.15 us +2.65% SmallLists: 293.70 ms 1.15 us -5.26% SmallTuples: 230.00 ms 0.96 us +0.44% SpecialClassAttribute: 175.70 ms 0.29 us -2.79% SpecialInstanceAttribute: 199.70 ms 0.33 us -1.55% StringMappings: 196.85 ms 1.56 us -2.48% StringPredicates: 133.00 ms 0.48 us -8.28% StringSlicing: 165.45 ms 0.95 us -3.47% TryExcept: 193.60 ms 0.13 us +0.57% TryRaiseExcept: 175.40 ms 11.69 us +0.69% TupleSlicing: 156.85 ms 1.49 us -0.00% UnicodeMappings: 175.90 ms 9.77 us +1.76% UnicodePredicates: 141.35 ms 0.63 us +0.78% UnicodeProperties: 184.35 ms 0.92 us -2.10% UnicodeSlicing: 179.45 ms 1.03 us -1.10% ------------------------------------------------------------------------ Average round time: 9855.00 ms -1.13% *) measured against: stdpython.bench (rounds=10, warp=20) As you can see, the rest of the results don't change much and the ones that do indicate some additional benefit gained by the patch. All slow-downs are way below the noise limit of around 5-10% (depending the platforms/machine/compiler). ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=479615&group_id=5470 From noreply@sourceforge.net Thu Aug 8 22:07:14 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 08 Aug 2002 14:07:14 -0700 Subject: [Patches] [ python-Patches-479615 ] Fast-path for interned string compares Message-ID: Patches item #479615, was opened at 2001-11-08 16:19 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=479615&group_id=5470 Category: Core (C code) Group: None Status: Open Resolution: None Priority: 5 Submitted By: M.-A. Lemburg (lemburg) Assigned to: M.-A. 
Lemburg (lemburg) Summary: Fast-path for interned string compares Initial Comment: This patch adds a fast-path for comparing equality of interned strings. The patch boosts performance for comparing identical string objects by some 20% on my machine while not causing any noticable slow-down for other operations (according to tests done with pybench). More infos and benchmarks later... ---------------------------------------------------------------------- >Comment By: Martin v. L�wis (loewis) Date: 2002-08-08 23:07 Message: Logged In: YES user_id=21627 Is there any progress on this patch, or should it be considered withdrawn? ---------------------------------------------------------------------- Comment By: Neil Schemenauer (nascheme) Date: 2002-03-24 00:35 Message: Logged In: YES user_id=35752 Attached is an updated version of this patch. I'm -0 on it since it doesn't seem to help much except for artificial benchmarks. ---------------------------------------------------------------------- Comment By: M.-A. Lemburg (lemburg) Date: 2001-11-08 16:26 Message: Logged In: YES user_id=38388 Output from pybench comparing today's CVS Python with patch (eqpython) and without patch (stdpython): PYBENCH 1.0 Benchmark: eqpython.bench (rounds=10, warp=20) Tests: per run per oper. 
diff *) ------------------------------------------------------------------------ BuiltinFunctionCalls: 125.55 ms 0.98 us -1.68% BuiltinMethodLookup: 180.10 ms 0.34 us +1.75% CompareFloats: 107.30 ms 0.24 us +2.04% CompareFloatsIntegers: 185.15 ms 0.41 us -0.05% CompareIntegers: 163.50 ms 0.18 us -1.77% CompareInternedStrings: 79.50 ms 0.16 us -20.78% ^^^^^^^^^^^^^^^^^^^^ This is the interesting line :-) ^^^^^^^^^^^^^^^^^^^^^^^^^^ CompareLongs: 110.25 ms 0.24 us +0.09% CompareStrings: 143.40 ms 0.29 us +2.14% CompareUnicode: 118.00 ms 0.31 us +1.68% ConcatStrings: 189.55 ms 1.26 us -1.61% ConcatUnicode: 226.55 ms 1.51 us +1.34% CreateInstances: 202.35 ms 4.82 us -1.87% CreateStringsWithConcat: 221.00 ms 1.11 us +0.45% CreateUnicodeWithConcat: 240.00 ms 1.20 us +1.27% DictCreation: 213.25 ms 1.42 us +0.47% DictWithFloatKeys: 263.50 ms 0.44 us +1.15% DictWithIntegerKeys: 158.50 ms 0.26 us -1.86% DictWithStringKeys: 147.60 ms 0.25 us +0.75% ForLoops: 144.90 ms 14.49 us -4.64% IfThenElse: 174.15 ms 0.26 us -0.00% ListSlicing: 88.80 ms 25.37 us -1.11% NestedForLoops: 136.95 ms 0.39 us +3.01% NormalClassAttribute: 177.80 ms 0.30 us -2.68% NormalInstanceAttribute: 166.85 ms 0.28 us -0.54% PythonFunctionCalls: 152.20 ms 0.92 us +1.40% PythonMethodCalls: 133.70 ms 1.78 us +1.60% Recursion: 119.45 ms 9.56 us +0.04% SecondImport: 124.65 ms 4.99 us -6.03% SecondPackageImport: 130.70 ms 5.23 us -5.73% SecondSubmoduleImport: 161.65 ms 6.47 us -5.88% SimpleComplexArithmetic: 245.50 ms 1.12 us +2.08% SimpleDictManipulation: 108.50 ms 0.36 us +0.05% SimpleFloatArithmetic: 125.80 ms 0.23 us +0.84% SimpleIntFloatArithmetic: 128.50 ms 0.19 us -1.46% SimpleIntegerArithmetic: 128.45 ms 0.19 us -0.77% SimpleListManipulation: 159.15 ms 0.59 us -5.32% SimpleLongArithmetic: 189.55 ms 1.15 us +2.65% SmallLists: 293.70 ms 1.15 us -5.26% SmallTuples: 230.00 ms 0.96 us +0.44% SpecialClassAttribute: 175.70 ms 0.29 us -2.79% SpecialInstanceAttribute: 199.70 ms 0.33 us -1.55% StringMappings: 196.85 
ms 1.56 us -2.48% StringPredicates: 133.00 ms 0.48 us -8.28% StringSlicing: 165.45 ms 0.95 us -3.47% TryExcept: 193.60 ms 0.13 us +0.57% TryRaiseExcept: 175.40 ms 11.69 us +0.69% TupleSlicing: 156.85 ms 1.49 us -0.00% UnicodeMappings: 175.90 ms 9.77 us +1.76% UnicodePredicates: 141.35 ms 0.63 us +0.78% UnicodeProperties: 184.35 ms 0.92 us -2.10% UnicodeSlicing: 179.45 ms 1.03 us -1.10% ------------------------------------------------------------------------ Average round time: 9855.00 ms -1.13% *) measured against: stdpython.bench (rounds=10, warp=20) As you can see, the rest of the results don't change much and the ones that do indicate some additional benefit gained by the patch. All slow-downs are way below the noise limit of around 5-10% (depending the platforms/machine/compiler). ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=479615&group_id=5470 From noreply@sourceforge.net Fri Aug 9 02:05:02 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 08 Aug 2002 18:05:02 -0700 Subject: [Patches] [ python-Patches-592065 ] Cleanup, speedup iterobject Message-ID: Patches item #592065, was opened at 2002-08-07 09:43 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=592065&group_id=5470 Category: Core (C code) Group: Python 2.3 Status: Open >Resolution: Accepted Priority: 5 Submitted By: Raymond Hettinger (rhettinger) >Assigned to: Raymond Hettinger (rhettinger) Summary: Cleanup, speedup iterobject Initial Comment: Moved special case for tuples from iterobject.c to tupleobject.c. Makes the code in iterobject.c cleaner and speeds-up the general case by not checking for tuples everytime. ---------------------------------------------------------------------- >Comment By: Tim Peters (tim_one) Date: 2002-08-08 21:05 Message: Logged In: YES user_id=31435 Looks good! Check it in. 
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=592065&group_id=5470 From noreply@sourceforge.net Fri Aug 9 02:18:24 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 08 Aug 2002 18:18:24 -0700 Subject: [Patches] [ python-Patches-514628 ] bug in pydoc on python 2.2 release Message-ID: Patches item #514628, was opened at 2002-02-07 21:09 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=514628&group_id=5470 Category: Library (Lib) Group: None Status: Open Resolution: None Priority: 5 Submitted By: Raj Kunjithapadam (mmaster25) >Assigned to: Ka-Ping Yee (ping) Summary: bug in pydoc on python 2.2 release Initial Comment: pydoc has a bug when trying to generate html doc more importantly it has bug in the method writedoc() attached is my fix. Here is the diff between my fix and the regular dist 1338c1338 < def writedoc(thing, forceload=0): --- > def writedoc(key, forceload=0): 1340,1346c1340,1343 < object = thing < if type(thing) is type(''): < try: < object = locate(thing, forceload) < except ErrorDuringImport, value: < print value < return --- > try: > object = locate(key, forceload) > except ErrorDuringImport, value: > print value 1351c1348 < file = open(thing.__name__ + '.html', 'w') --- > file = open(key + '.html', 'w') 1354c1351 < print 'wrote', thing.__name__ + '.html' --- > print 'wrote', key + '.html' 1356c1353 < print 'no Python documentation found for %s' % repr(thing) --- > print 'no Python documentation found for %s' % repr(key) ---------------------------------------------------------------------- Comment By: Martin v. L�wis (loewis) Date: 2002-03-18 03:48 Message: Logged In: YES user_id=21627 Can you please provide an example that demonstrates the problem? 
Also, can you please regenerate your changes as context (-c) or unified (-u) diffs, and attach those to this report (do *not* paste them into the comment field)? In its current form, the patch is pretty useless: SF messed up the indentation, and it is an old-style patch, and pydoc.py is already at 1.58. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-03-01 17:45 Message: Logged In: YES user_id=6380 assigned to Tim; this may be Ping's terrain but Ping is typically not responsive. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=514628&group_id=5470 From noreply@sourceforge.net Fri Aug 9 02:31:44 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 08 Aug 2002 18:31:44 -0700 Subject: [Patches] [ python-Patches-592065 ] Cleanup, speedup iterobject Message-ID: Patches item #592065, was opened at 2002-08-07 08:43 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=592065&group_id=5470 Category: Core (C code) Group: Python 2.3 >Status: Closed Resolution: Accepted Priority: 5 Submitted By: Raymond Hettinger (rhettinger) Assigned to: Raymond Hettinger (rhettinger) Summary: Cleanup, speedup iterobject Initial Comment: Moved special case for tuples from iterobject.c to tupleobject.c. Makes the code in iterobject.c cleaner and speeds up the general case by not checking for tuples every time. ---------------------------------------------------------------------- >Comment By: Raymond Hettinger (rhettinger) Date: 2002-08-08 20:31 Message: Logged In: YES user_id=80475 Thanks for the review! Committed as iterobject.c 1.12 and tupleobject.c 2.71. Closing patch. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-08-08 20:05 Message: Logged In: YES user_id=31435 Looks good!
Check it in. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=592065&group_id=5470 From noreply@sourceforge.net Fri Aug 9 03:28:18 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 08 Aug 2002 19:28:18 -0700 Subject: [Patches] [ python-Patches-592529 ] Split-out ntmodule.c Message-ID: Patches item #592529, was opened at 2002-08-08 06:12 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=592529&group_id=5470 Category: Windows Group: None Status: Open Resolution: None Priority: 5 Submitted By: Martin v. Löwis (loewis) >Assigned to: Martin v. Löwis (loewis) Summary: Split-out ntmodule.c Initial Comment: This patch moves the MS_WINDOWS code from posixmodule.c into ntmodule.c. The OS/2 code is left in posixmodule.c. I believe this patch significantly improves readability of both modules (posix and nt), even though it adds a slight code duplication. It also gives Windows developers the chance to adjust the implementation better to the Win32 API without fear of breaking the POSIX versions. Attached are three files: the ntmodule.c source code, the posixmodule.c diff, and the pcbuild diff. Since the patches will outdate quickly, I'd appreciate if that patch could be accepted or rejected quickly. Randomly assigning to Tim. ---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-08 22:28 Message: Logged In: YES user_id=6380 I'm +0.5 on this. Can you bring this up on python-dev to see if there are different viewpoints? I think the resulting code will be easier to maintain; new stuff added at this point is more likely to be unique to Unix anyway (or unique to Windows). I wonder if the os2 code shouldn't be moved to its own file as well (I think Andrew MacIntyre maintains that port, right?).
There are still a bunch of #ifdefs in the nt code. Are those really variable across Windows versions or compilers? If not, I'd suggest to expand those. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-08-08 12:32 Message: Logged In: YES user_id=31435 I'm -0, so assigning to Guido for another opinion. I expect this will actually make it harder to keep the os interface consistent and working across platforms; e.g., somebody adds an os function in one module but forgets to add it in the other (likely because they don't even know it exists); a docstring repair shows up in one but not both; a largefile fix in one doesn't get reflected in the other; etc. Apart from the massive Windows popen pain, there are actually more embedded PYOS_OS2 #ifdefs in posixmodule. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=592529&group_id=5470 From noreply@sourceforge.net Fri Aug 9 05:12:09 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 08 Aug 2002 21:12:09 -0700 Subject: [Patches] [ python-Patches-505705 ] Remove eval in pickle Message-ID: Patches item #505705, was opened at 2002-01-19 04:21 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=505705&group_id=5470 Category: Core (C code) Group: None Status: Open Resolution: None Priority: 5 Submitted By: Martin v. Löwis (loewis) >Assigned to: Guido van Rossum (gvanrossum) Summary: Remove eval in pickle Initial Comment: This patch removes the use of eval in pickle and cPickle.
It does so by: - moving the actual parsing from compile.c:parsestr to PyString_DecodeEscape - introducing a new codec string-escape - removing the code that checks that a string-to-unpickle is properly escaped throughout, and replaces this with a check whether it is properly quoted, - unquoting the string in load_string, then passing it to the codec. This fixes #502503. ---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-09 00:12 Message: Logged In: YES user_id=6380 I'll review this. ---------------------------------------------------------------------- Comment By: Martin v. L�wis (loewis) Date: 2002-01-19 04:25 Message: Logged In: YES user_id=21627 BTW, this patch has #500002 as a prerequisite. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=505705&group_id=5470 From noreply@sourceforge.net Fri Aug 9 15:09:19 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Fri, 09 Aug 2002 07:09:19 -0700 Subject: [Patches] [ python-Patches-492105 ] Import from Zip archive Message-ID: Patches item #492105, was opened at 2001-12-12 12:21 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=492105&group_id=5470 Category: Core (C code) Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: James C. Ahlstrom (ahlstromjc) >Assigned to: Guido van Rossum (gvanrossum) Summary: Import from Zip archive Initial Comment: This is the "final" patch to support imports from zip archives, and directory caching using os.listdir(). It replaces patch 483466 and 476047. It is a separate patch since I can't delete file attachments. It adds support for importing from "" and from relative paths. ---------------------------------------------------------------------- Comment By: Martin v. 
Löwis (loewis) Date: 2002-07-28 06:54 Message: Logged In: YES user_id=21627 Is this patch ready to be applied? ---------------------------------------------------------------------- Comment By: Jeremy Hylton (jhylton) Date: 2002-06-12 11:05 Message: Logged In: YES user_id=31392 Deleting the old diffs that Jim couldn't delete. ---------------------------------------------------------------------- Comment By: James C. Ahlstrom (ahlstromjc) Date: 2002-03-15 12:27 Message: Logged In: YES user_id=64929 I added a diff -c version of the patch. ---------------------------------------------------------------------- Comment By: James C. Ahlstrom (ahlstromjc) Date: 2002-03-15 12:03 Message: Logged In: YES user_id=64929 I still can't delete files, but I added a new file which contains all diffs as a single file, and is made from the current CVS tree (Mar 15, 2002). ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=492105&group_id=5470 From noreply@sourceforge.net Fri Aug 9 15:20:44 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Fri, 09 Aug 2002 07:20:44 -0700 Subject: [Patches] [ python-Patches-593069 ] socketmodule.[ch] downgrade Message-ID: Patches item #593069, was opened at 2002-08-09 18:20 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=593069&group_id=5470 Category: Modules Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Stepan Koltsov (yozh) Assigned to: Nobody/Anonymous (nobody) Summary: socketmodule.[ch] downgrade Initial Comment: 1. Removed the fields 'sock_type' and 'sock_proto' from the structure PySocketSockObject, since they are not used anywhere. 2. Changed the semantics of 'socket.fromfd'. It now ignores its 3rd and 4th arguments; the 2nd argument is optional and, if it is not specified, it is obtained with a getsockname call. 3. Added the constant AF_LOCAL.
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=593069&group_id=5470 From noreply@sourceforge.net Fri Aug 9 15:26:40 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Fri, 09 Aug 2002 07:26:40 -0700 Subject: [Patches] [ python-Patches-538395 ] ae* modules: handle type inheritance Message-ID: Patches item #538395, was opened at 2002-04-02 21:24 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=538395&group_id=5470 Category: Macintosh Group: Python 2.3 >Status: Closed >Resolution: Accepted Priority: 5 Submitted By: Donovan Preston (dsposx) Assigned to: Jack Jansen (jackjansen) Summary: ae* modules: handle type inheritance Initial Comment: The gensuitemodule script creates Python classes out of AppleScript types. It keeps track of properties in _propdict and elements in _elemdict. However, gensuitemodule does not accurately replicate the AppleScript inheritance hierarchy, and __getattr__ only looks in self._propdict and self._elemdict, therefore not finding elements and properties defined in superclasses. Attached is a patch which: 1) Correctly identifies an AppleScript type's superclasses, and defines the Python classes with these superclasses. Since not all names may be defined by the time a new class is defined, this is accomplished by setting a new class' __bases__ after all names are defined. 2) Changes __getattr__ to recurse superclasses while looking through _propdict and _elemdict. It also contains small usability enhancements which will automatically look for a .want or .which property when you are creating specifiers.
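As an illustration of point 2 above, the superclass-aware lookup could be sketched as follows. This is not the code from the patch: the class names and the four-character codes are placeholders, and the sketch simply returns the stored entry, whereas the real ae* modules wrap it in specifier objects. It walks the class hierarchy and checks each class's own _propdict and _elemdict.

```python
class Specifier(object):
    """Hypothetical base class; generated suite classes would derive from it."""

    _propdict = {}
    _elemdict = {}

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails. Check the
        # dictionaries defined directly on each class in the hierarchy,
        # most-derived class first.
        for klass in type(self).__mro__:
            for table in ('_propdict', '_elemdict'):
                entries = klass.__dict__.get(table, {})
                if name in entries:
                    return entries[name]
        raise AttributeError(name)


class Item(Specifier):
    _propdict = {'name': 'pnam'}        # placeholder property code


class Window(Item):
    _propdict = {'bounds': 'pbnd'}      # placeholder property code
    _elemdict = {'document': 'docu'}    # placeholder element code


w = Window()
print(w.name)      # found in Item._propdict, a superclass
print(w.document)  # found in Window._elemdict
```

Walking the hierarchy on every lookup is the simple variant; the flattened-dictionary approach discussed later in this thread trades that for a one-time merge.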
---------------------------------------------------------------------- >Comment By: Jack Jansen (jackjansen) Date: 2002-08-09 16:26 Message: Logged In: YES user_id=45365 Donovan, I checked your mods in, with the following changes: - I had to massage getbaseclasses() a bit - I replaced your aetypes.beginning() and end() by something a bit more in line with the rest of the module - I removed your macresource.open_file mods. If you still feel the latter are needed (I don't seem to need them) please file a separate bug report. If you find any issues with the other stuff please let me know asap. ---------------------------------------------------------------------- Comment By: Donovan Preston (dsposx) Date: 2002-06-23 18:55 Message: Logged In: YES user_id=111050 Jack, Great, thanks for looking at it again. I'm not sure why I never got email with your follow up. I guess sourceforge screwed up? In processfile, I changed the way the resource file is opened because I am running gensuitemodule under Mach-oPython. I believe CurResFile wasn't working for me; switching to macresource.need worked for me but there is one problem: If you run gensuitemodule on one application, it's resource file will be opened but never closed. If you then attempt to run gensuitemodule on another application in the same process, you will generate a suite module for the first application (But everything will be named for the second application!). If opening and closing the resource file is the right thing to do and you can get those calls to work under Mach-oPython, that would be preferrable. getbaseclasses could be declared in aetools or on aetools.TalkTo, whatever you think is appropriate. Sorry I hadn't followed up earlier, I have been very busy and only decided to check the status of the patch today on a whim! Oh, and I never found any information about the 'c@#^' property, I just put in print statements in the right places to observe what was going on, and deduced what it was. 
Donovan ---------------------------------------------------------------------- Comment By: Jack Jansen (jackjansen) Date: 2002-05-24 17:21 Message: Logged In: YES user_id=45365 Donovan, I finally got started with your patch. Two more questions, just to make sure I understand things: - You've changed the way the resource file is opened (in gensuitemodule, processfile(). Why? - The getbaseclasses() function you generate seems prety static. Couldn't it be declared in aetools? Or maybe even be a class method of aetools.TalkTo? Don't bother fixing the patch, just let me know the answers and I'll hack it myself. (And, out of curiosity, where did you find info on the funny properties such as "c@#^"? I've never managed to find anything...) ---------------------------------------------------------------------- Comment By: Donovan Preston (dsposx) Date: 2002-05-02 22:47 Message: Logged In: YES user_id=111050 Hi Jack, I finally got around to reimplementing the functionality of the patch given your suggestions. I think it should be fairly bulletproof. Due to circular imports I ended up having to do more work in the package's __init__.py. The upshot of flattening the lookup dicts is that it gets done once, and then after that name lookups are lightning fast. In our production environment the speedup was impressive :-) Thanks for the suggestions. Donovan ---------------------------------------------------------------------- Comment By: Jack Jansen (jackjansen) Date: 2002-04-03 23:16 Message: Logged In: YES user_id=45365 Donovan, two comments-on-your-comments: - You're absolutely right about the module names. Pickle also uses names, and it's probably the only way to do it. - You're also absolutely right about how to update the _elemdict and _propdict. Or, as Jean-Luc Picard would say: "Make it so!" 
:-) Oh yes, on the production code/merging problem: aside from Martin's comments here's another tip: make a copy of the subtree that contains the conflict section (why not the whole Mac subtree in your case) and make sure you keep the CVS directories. Start hacking in this copy. Once you're satisfied do a commit from there. As long as you keep the CVS directory with the files there's little that can go wrong. ---------------------------------------------------------------------- Comment By: Martin v. L�wis (loewis) Date: 2002-04-03 11:29 Message: Logged In: YES user_id=21627 cvs update will keep a copy of the original file (the one you edited) if it has to merge changes; it will name it .#.. So in no case cvs will destroy your changes. Normally, merging works quite well. If it finds a conflict, it will print a 'C' on update, and put a conflict marker in the file. The stuff above the ===== is your code, the one below is the CVS code. If you want to find out what cvs would do, use 'cvs status'. If you don't want cvs to do merging, the following procedure will work cvs diff -u >patches patch -p0 -R 1.22 by hand without doing a cvs update? I think a cvs update (plus some manual work;-) should solve this. Third: the passing of modules by name (to the decoding routines) seems error prone and not too elegant. Can't you pass the modules themselves in stead of their names? It would also save extra imports in the decoders. Fourth: assigning to __bases__ seems like rather a big hack. Can't we generate the classes with a helper class, similarly to the event helper class in the __init__.py modules: FooApp/Foo_Suite.py would contain the class foo sketched above, and FooApp.__init__.py would contain import othersuite.superfoo import Foo_Suite.foo class foo(Foo_Suite.foo, othersuite.superfoo): pass Fifth, and least important: you're manually iterating over the base classes for lookup. 
Couldn't we statically combine the _propdict and _elemdict's of the base classes during class declaration, so that at lookup time we'd need only a single dictionary lookup? The "class foo" body in __init__.py would then become something like _propdict = aetools.flatten_dicts( foosuite.foo._propdict, othersuite.superfoo._propdict) and similar for elemdict. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=538395&group_id=5470 From noreply@sourceforge.net Fri Aug 9 15:41:58 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Fri, 09 Aug 2002 07:41:58 -0700 Subject: [Patches] [ python-Patches-591551 ] Remove symlink python during install Message-ID: Patches item #591551, was opened at 2002-08-06 15:48 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=591551&group_id=5470 Category: Build Group: Python 2.3 >Status: Closed Resolution: Accepted Priority: 5 Submitted By: Jack Jansen (jackjansen) Assigned to: Jack Jansen (jackjansen) Summary: Remove symlink python during install Initial Comment: During "make bininstall" an existing "python" in the bindir is removed. This fails, however, if the file in question is a symlink instead of a regular file/hardlink. As an OSX framework install can deposit a symlink into /usr/local/bin/python it would be nice if a subsequent normal install would do the right thing. I'm posting this as a patch because I'm not sure how common "test -L file" is. Otherwise "test -e file" may be a better idea. Assigned to Martin as he seems to be one of the major build gurus. ---------------------------------------------------------------------- >Comment By: Jack Jansen (jackjansen) Date: 2002-08-09 16:41 Message: Logged In: YES user_id=45365 Ok, I volunteer... ---------------------------------------------------------------------- Comment By: Martin v.
Löwis (loewis) Date: 2002-08-08 12:25 Message: Logged In: YES user_id=21627 The patch sounds good to me. It might be that the system does not support test -L; in this case, we will need autoconf magic to find whether test supports -L. I think we can defer this until the problem comes up; I'd appreciate if you could volunteer to add the autoconf test if the problem comes up. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=591551&group_id=5470 From noreply@sourceforge.net Fri Aug 9 15:48:47 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Fri, 09 Aug 2002 07:48:47 -0700 Subject: [Patches] [ python-Patches-505705 ] Remove eval in pickle and cPickle Message-ID: Patches item #505705, was opened at 2002-01-19 04:21 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=505705&group_id=5470 Category: Core (C code) Group: None Status: Open >Resolution: Out of Date Priority: 5 Submitted By: Martin v. Löwis (loewis) >Assigned to: Martin v. Löwis (loewis) >Summary: Remove eval in pickle and cPickle Initial Comment: This patch removes the use of eval in pickle and cPickle. It does so by: - moving the actual parsing from compile.c:parsestr to PyString_DecodeEscape - introducing a new codec string-escape - removing the code that checks that a string-to-unpickle is properly escaped throughout, and replaces this with a check whether it is properly quoted, - unquoting the string in load_string, then passing it to the codec. This fixes #502503. ---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-09 10:48 Message: Logged In: YES user_id=6380 I like this idea. But the patch is out of date. Can you rework the patch?
How much faster does this make the test program from http://groups.google.de/groups?hl=en&lr=&ie=UTF-8&selm=mailman.1026940226.16076.python-list%40python.org ??? ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-09 00:12 Message: Logged In: YES user_id=6380 I'll review this. ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-01-19 04:25 Message: Logged In: YES user_id=21627 BTW, this patch has #500002 as a prerequisite. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=505705&group_id=5470 From noreply@sourceforge.net Fri Aug 9 15:49:29 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Fri, 09 Aug 2002 07:49:29 -0700 Subject: [Patches] [ python-Patches-592646 ] Prefer nb_multipy over sq_repeat Message-ID: Patches item #592646, was opened at 2002-08-08 11:35 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=592646&group_id=5470 Category: Core (C code) Group: None Status: Open Resolution: Accepted Priority: 5 Submitted By: Neil Schemenauer (nascheme) Assigned to: Neil Schemenauer (nascheme) Summary: Prefer nb_multipy over sq_repeat Initial Comment: From David Beazley: ============================================== class Foo(object): def __mul__(self,other): print "__mul__" def __rmul__(self,other): print "__rmul__" Python-2.2.1, if you try this, you get the following behavior: >>> f = Foo() >>> f*1.0 __mul__ >>> 1.0*f __rmul__ >>> f*1 __mul__ >>> 1*f __mul__ ============================================== This is because the int object prefers calling sq_repeat over nb_multiply if a type defines both. The attached patch changes int_mul to prefer nb_multiply over sq_repeat. This is a change in behavior and requires some careful thought.
The tests run okay with this change but that doesn't mean other code will not break. ---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-09 10:49 Message: Logged In: YES user_id=6380 Please check this in! ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-08 15:58 Message: Logged In: YES user_id=6380 Yes, I still want it in 2.2. I presume Extensionclasses behave the same way with 2.3? I'll see if Zope 2 has a problem with this patch, and if it does, I'll fix Zope 2. ---------------------------------------------------------------------- Comment By: Neil Schemenauer (nascheme) Date: 2002-08-08 15:54 Message: Logged In: YES user_id=35752 Argh! Stupid typos. # without patch 1*a # prints __mul__ 1.0*a # raises TypeError Extension classes do not define __r* methods. ---------------------------------------------------------------------- Comment By: Neil Schemenauer (nascheme) Date: 2002-08-08 15:52 Message: Logged In: YES user_id=35752 This affects extension classes: import ExtensionClass class A(ExtensionClass.Base): def __mul__(self, other): print '__mul__' return 1 a = A() # without patch a*1 # prints __mul__ a*1.0 # prints __mul__ 1.0*a # prints __mul__ 1.0*a # raises TypeError # with patch a*1 # prints __mul__ a*1.0 # prints __mul__ 1*a # raises TypeError 1.0*a # raises TypeError The new behavior is more consistent. Do you still want it in 2.2? ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-08 14:21 Message: Logged In: YES user_id=6380 Looks good to me, but s/of/if/ in the comment. :-) Despite the semantic change I think this should be backported to 2.2; it's more a bug than a feature... 
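A minimal sketch of the dispatch order this item is after, written for current Python 3 rather than the 2.2 C slots (that substitution is an assumption, as is returning the method name instead of printing it). The point is that `1*f` reaches `__rmul__`, just as `1.0*f` does:

```python
class Foo:
    """Toy class that reports which multiplication hook was called."""

    def __mul__(self, other):
        return "__mul__"

    def __rmul__(self, other):
        return "__rmul__"


f = Foo()

# Left operand is Foo: __mul__ is tried first.
# Left operand is int or float: their multiply returns NotImplemented,
# so Python falls back to Foo.__rmul__.
results = {
    "f*1.0": f * 1.0,
    "1.0*f": 1.0 * f,
    "f*1": f * 1,
    "1*f": 1 * f,
}

for expression, hook in results.items():
    print(expression, "->", hook)
```

With the 2.2.1 behaviour quoted in the initial comment, the last entry would have been `__mul__` instead.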
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=592646&group_id=5470 From noreply@sourceforge.net Fri Aug 9 15:50:03 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Fri, 09 Aug 2002 07:50:03 -0700 Subject: [Patches] [ python-Patches-589982 ] tempfile.py rewrite Message-ID: Patches item #589982, was opened at 2002-08-02 02:38 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=589982&group_id=5470 Category: Library (Lib) Group: Python 2.3 Status: Open >Resolution: Accepted Priority: 5 Submitted By: Zack Weinberg (zackw) >Assigned to: Guido van Rossum (gvanrossum) Summary: tempfile.py rewrite Initial Comment: This rewrite closes a number of security-relevant races in tempfile.py; makes temporary filenames much harder to guess; provides secure interfaces that can be used to close similar races elsewhere; and makes it possible to control the prefix and directory of each temporary created, individually. ---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-09 10:50 Message: Logged In: YES user_id=6380 OK, I'll do something along those lines myself. ---------------------------------------------------------------------- Comment By: Zack Weinberg (zackw) Date: 2002-08-08 15:01 Message: Logged In: YES user_id=580015 I'm afraid my idea of patch size comes from GCC land, where this would be considered not that big at all. Only 1500 lines affected, and more than half of that is documentation and test suite! I tried, and failed, to break up the changes to tempfile.py itself. But there's some larger divisions that could be made. We could check in the new test_tempfile.py now, disabling the tests that refer to nonexistent functions (just comment out the lines that add those tests to the test_classes array). 
The changes to the rest of the test suite are also largely independent of the tempfile.py rewrite (since they replace tempfile.mktemp() with TESTFN, mostly). And the search-and-replace changes in the library can wait until after tempfile.py itself gets reviewed. Unfortunately, I am about to go on vacation for five days, so I don't have time now to do this split-up. I will try to drum up interest on python-dev in reviewing the patch as is. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-05 12:48 Message: Logged In: YES user_id=6380 I like the idea of fixing security holes. This patch is *humungous*. Even just the doc changes and the changes to tempfile.py itself are massive and require very careful reading to review all the consequences. Zack, can you try to interest someone with more time than me in reviewing this patch? What's the point of renaming all imports with a leading underscore? I thought __all__ took care of that. ---------------------------------------------------------------------- Comment By: Zack Weinberg (zackw) Date: 2002-08-03 02:53 Message: Logged In: YES user_id=580015 I've revised the patch; ignore the old one. This version includes a vastly expanded test_tempfile.py which hits every line that I know how to test. The omissions are marked - it's mostly non-Unix issues. Also, I went through the entire CVS repository and replaced all uses of tempfile.mktemp with mkstemp/mkdtemp/NamedTemporaryFile, as appropriate. The sole exception is Lib/os.py, which is addressed by patch #590294. The sole functional change to tempfile.py itself, from the previous, is to throw os.O_NOFOLLOW into the open flags. This closes yet another hole - on some systems, without this flag, open(file, O_CREAT|O_EXCL) will follow a symbolic link that points to a nonexistent file, and create the link target. 
(This has no effect on a symlink in the directory components of the pathname - if the sysadmin has symlinked /tmp to /hugedisk/scratch, that still works.) ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-02 10:45 Message: Logged In: YES user_id=6380 This needs some serious review! Volunteers??? ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=589982&group_id=5470 From noreply@sourceforge.net Fri Aug 9 16:38:14 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Fri, 09 Aug 2002 08:38:14 -0700 Subject: [Patches] [ python-Patches-590682 ] New codecs: html, asciihtml Message-ID: Patches item #590682, was opened at 2002-08-04 04:58 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=590682&group_id=5470 Category: None Group: None Status: Open Resolution: None Priority: 3 Submitted By: Oren Tirosh (orenti) Assigned to: M.-A. Lemburg (lemburg) Summary: New codecs: html, asciihtml Initial Comment: These codecs translate HTML character &entity; references. The html codec may be applied after other codecs such as utf-8 or iso8859_X and preserves their encoding. The asciihtml encoder produces 7-bit ascii and its output is therefore safe for insertion into almost any document regardless of its encoding. ---------------------------------------------------------------------- >Comment By: Oren Tirosh (orenti) Date: 2002-08-09 15:38 Message: Logged In: YES user_id=562624 Case insensitivity fixed. General cleanup. Codecs renamed to htmlescape and htmlescape8bit. Improved error handling. Update unicode_test. 
---------------------------------------------------------------------- Comment By: Oren Tirosh (orenti) Date: 2002-08-05 12:11 Message: Logged In: YES user_id=562624 Yes, entities are supposed to be case sensitive but I'm working with manually-generated html in which &GT; is not so uncommon... I guess life is different in XML world. Case-smashing loses the distinction between some entities. I guess I need a more intelligent solution. > If you apply it to an 8-bit UTF-8 encoded string you'll get garbage! Actually, it works great. The html codec passes characters 128-255 unmodified and therefore can be chained with other codecs. But I now have a more elegant and high-performance approach than codec chaining. See my python-dev posting. ---------------------------------------------------------------------- Comment By: M.-A. Lemburg (lemburg) Date: 2002-08-05 07:59 Message: Logged In: YES user_id=38388 On the htmlentitydefs: yes, these are in use as they are defined now. If you want a mapping from and to Unicode, I'd suggest to provide this as a new table. About the cased key in the entitydefs dict: AFAIK, these have to be cased since entities are case-sensitive. Could be wrong though. On PEP 293: this is going in the final round now.
Your patch doesn't compete with it though, since PEP 293 is a much more general approach. On the general idea: I think the codecs are misnamed. They should be called htmlescape and asciihtmlescape since they don't provide "real" HTML encoding/decoding as Martin already mentioned. There's something wrong with your approach, BTW: the codec should only operate on Unicode (taking only Unicode input and generating Unicode). If you apply it to an 8-bit UTF-8 encoded string you'll get garbage! ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-08-04 15:54 Message: Logged In: YES user_id=21627 I'm in favour of exposing this via a search function, for generated codec names, on top of PEP 293 (I would not like your codec to compete with the alternative mechanism). My dislike for the current patch also comes from the fact that it singles-out ASCII, which the search function would not. You could implement two forms: html.codecname and xml.codecname. The html form would do HTML entity references in both directions, and fall back to character references only if necessary; the XML form would use character references all the time, and entity references only for the builtin entities. And yes, I do recommend users to use codecs.charmap_encode directly, as this is probably the most efficient, yet most compact way to convert Unicode to a less-than-7-bit form. In any case, I'd encourage you to contribute to the progress of PEP 293 first - this has been an issue for several years now, and I would be sorry if it would fail. While you are waiting for PEP 293 to complete, please do consider cleaning up htmlentitydefs to provide mappings from and to Unicode characters.
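For readers following the encode/decode split in this thread, here is a rough sketch of the two directions using only what current Python 3 ships: the "xmlcharrefreplace" error handler from PEP 293 plus the standard `html` module. It is not the htmlescape codec from this patch; the helper names `to_ascii_html` and `from_ascii_html` are made up for illustration.

```python
import html


def to_ascii_html(text):
    """Return 7-bit ASCII text that is safe to paste into markup."""
    # Escape <, > and & first; an error callback alone would never touch
    # them because they are valid in every encoding.
    escaped = html.escape(text, quote=False)
    # Anything outside ASCII becomes a numeric character reference.
    return escaped.encode("ascii", errors="xmlcharrefreplace").decode("ascii")


def from_ascii_html(data):
    """Decode named, decimal and hexadecimal character references."""
    return html.unescape(data)


sample = "café < thé & crème"
encoded = to_ascii_html(sample)
print(encoded)
print(from_ascii_html(encoded) == sample)
```

As noted in the comments above, decoding like this is only appropriate for text that has already been separated from its tags by a real parser.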
---------------------------------------------------------------------- Comment By: Oren Tirosh (orenti) Date: 2002-08-04 15:07 Message: Logged In: YES user_id=562624 >People may be tricked into believing that they can >decode arbitrary HTML with your codec - when your >codec would incorrectly deal with CDATA sections. You don't even need to go as far as CDATA to see that tags must be parsed first and only then tag bodies and attribute values can be individually decoded. If you do it in the reverse order the tag parser will try to parse < as a tag. It should be documented, though. For encoding it's also obvious that encoding must be done first and then the encoded strings can be inserted into tags - < in strings is encoded into &lt; preventing it from being interpreted as a tag. This is a good thing! It prevents insertion attacks. > You can easily enough arrange to get errors on <, >, > and &, by using codecs.charmap_encode with an > appropriate encoding map. If you mean to use this as some internal implementation detail it's ok. Are you actually proposing that this is the way end users should use it? How about this: Install an encoder registry function that responds to any codec name matching "xmlcharref.SPAM" and does all the internal magic you describe to create a codec instance that combines xmlcharref translation including <,>,& and the SPAM encoding. This dynamically-generated codec will do both encoding and decoding and be cached, of course. "Namespaces are one honking great idea -- let's do more of those!" ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-08-04 11:50 Message: Logged In: YES user_id=21627 You can easily enough arrange to get errors on <, >, and &, by using codecs.charmap_encode with an appropriate encoding map. In fact, with that, you can easily get all entity references into the encoded data, without any need for an explicit iteration.
However, I am concerned that you offer decoding as well. People may be tricked into believing that they can decode arbitrary HTML with your codec - when your codec would incorrectly deal with CDATA sections. ---------------------------------------------------------------------- Comment By: Oren Tirosh (orenti) Date: 2002-08-04 11:10 Message: Logged In: YES user_id=562624 PEP 293 and patch #432401 are not a replacement for these codecs - it does decoding as well as encoding and also translates <, >, and & which are valid in all encodings and therefore won't get translated by error callbacks. ---------------------------------------------------------------------- Comment By: Oren Tirosh (orenti) Date: 2002-08-04 11:00 Message: Logged In: YES user_id=562624 Yes, the error callback approach handles strange mixes better than my method of chaining codecs. But it only does encoding - this patch also provides full decoding of named, decimal and hexadecimal character entity references. Assuming PEP 293 is accepted, I'd like to see the asciihtml codec stay for its decoding ability and renamed to xmlcharref. The encoding part of this codec can just call .encode("ascii", errors="xmlcharrefreplace") to make it a full two-way codec. I'd prefer htmlentitydefs.py to use unicode, too. It's not so useful the way it is. Another problem is that it uses mixed case names as keys. The dictionary lookup is likely to miss incoming entities with arbitrary case so it's more-or-less broken. Does anyone actually use it the way it is? Can it be changed to use unicode without breaking anyone's code? ---------------------------------------------------------------------- Comment By: Martin v.
Löwis (loewis) Date: 2002-08-04 08:54 Message: Logged In: YES user_id=21627 This patch is superseded by PEP 293 and patch #432401, which allows you to write unitext.encode("ascii", errors = "xmlcharrefreplace") This probably should be left open until PEP 293 is pronounced upon, and then either rejected or reviewed in detail. I'd encourage a patch that uses Unicode in htmlentitydefs directly, and computes entitydefs from that, instead of vice-versa (or at least exposes a unicode_entitydefs, perhaps even lazily) - perhaps also with a reverse mapping. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=590682&group_id=5470 From noreply@sourceforge.net Fri Aug 9 16:49:18 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Fri, 09 Aug 2002 08:49:18 -0700 Subject: [Patches] [ python-Patches-592646 ] Prefer nb_multipy over sq_repeat Message-ID: Patches item #592646, was opened at 2002-08-08 15:35 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=592646&group_id=5470 Category: Core (C code) Group: None >Status: Closed Resolution: Accepted Priority: 5 Submitted By: Neil Schemenauer (nascheme) Assigned to: Neil Schemenauer (nascheme) Summary: Prefer nb_multipy over sq_repeat Initial Comment: From David Beazley: ============================================== class Foo(object): def __mul__(self,other): print "__mul__" def __rmul__(self,other): print "__rmul__" Python-2.2.1, if you try this, you get the following behavior: >>> f = Foo() >>> f*1.0 __mul__ >>> 1.0*f __rmul__ >>> f*1 __mul__ >>> 1*f __mul__ ============================================== This is because the int object prefers calling sq_repeat over nb_multiply if a type defines both. The attached patch changes int_mul to prefer nb_multiply over sq_repeat. This is a change in behavior and requires some careful thought.
The tests run okay with this change but that doesn't mean other code will not break. ---------------------------------------------------------------------- >Comment By: Neil Schemenauer (nascheme) Date: 2002-08-09 15:49 Message: Logged In: YES user_id=35752 Checked into 2.3 and 2.2 CVS. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-09 14:49 Message: Logged In: YES user_id=6380 Please check this in! ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-08 19:58 Message: Logged In: YES user_id=6380 Yes, I still want it in 2.2. I presume Extensionclasses behave the same way with 2.3? I'll see if Zope 2 has a problem with this patch, and if it does, I'll fix Zope 2. ---------------------------------------------------------------------- Comment By: Neil Schemenauer (nascheme) Date: 2002-08-08 19:54 Message: Logged In: YES user_id=35752 Argh! Stupid typos. # without patch 1*a # prints __mul__ 1.0*a # raises TypeError Extension classes do not define __r* methods. ---------------------------------------------------------------------- Comment By: Neil Schemenauer (nascheme) Date: 2002-08-08 19:52 Message: Logged In: YES user_id=35752 This affects extension classes: import ExtensionClass class A(ExtensionClass.Base): def __mul__(self, other): print '__mul__' return 1 a = A() # without patch a*1 # prints __mul__ a*1.0 # prints __mul__ 1.0*a # prints __mul__ 1.0*a # raises TypeError # with patch a*1 # prints __mul__ a*1.0 # prints __mul__ 1*a # raises TypeError 1.0*a # raises TypeError The new behavior is more consistent. Do you still want it in 2.2? ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-08 18:21 Message: Logged In: YES user_id=6380 Looks good to me, but s/of/if/ in the comment. 
:-) Despite the semantic change I think this should be backported to 2.2; it's more a bug than a feature... ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=592646&group_id=5470 From noreply@sourceforge.net Fri Aug 9 17:40:30 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Fri, 09 Aug 2002 09:40:30 -0700 Subject: [Patches] [ python-Patches-589982 ] tempfile.py rewrite Message-ID: Patches item #589982, was opened at 2002-08-02 02:38 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=589982&group_id=5470 Category: Library (Lib) Group: Python 2.3 >Status: Closed Resolution: Accepted Priority: 5 Submitted By: Zack Weinberg (zackw) Assigned to: Guido van Rossum (gvanrossum) Summary: tempfile.py rewrite Initial Comment: This rewrite closes a number of security-relevant races in tempfile.py; makes temporary filenames much harder to guess; provides secure interfaces that can be used to close similar races elsewhere; and makes it possible to control the prefix and directory of each temporary created, individually. ---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-09 12:40 Message: Logged In: YES user_id=6380 I've checked this all in now. The changes to test_tempfile.py weren't as easily fixable to work without the tempfile.py changes as Zack thought. I hope the community will give it some review. It will probably break some Zope tests. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-09 10:50 Message: Logged In: YES user_id=6380 OK, I'll do something along those lines myself. 
---------------------------------------------------------------------- Comment By: Zack Weinberg (zackw) Date: 2002-08-08 15:01 Message: Logged In: YES user_id=580015 I'm afraid my idea of patch size comes from GCC land, where this would be considered not that big at all. Only 1500 lines affected, and more than half of that is documentation and test suite! I tried, and failed, to break up the changes to tempfile.py itself. But there's some larger divisions that could be made. We could check in the new test_tempfile.py now, disabling the tests that refer to nonexistent functions (just comment out the lines that add those tests to the test_classes array). The changes to the rest of the test suite are also largely independent of the tempfile.py rewrite (since they replace tempfile.mktemp() with TESTFN, mostly). And the search-and-replace changes in the library can wait until after tempfile.py itself gets reviewed. Unfortunately, I am about to go on vacation for five days, so I don't have time now to do this split-up. I will try to drum up interest on python-dev in reviewing the patch as is. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-05 12:48 Message: Logged In: YES user_id=6380 I like the idea of fixing security holes. This patch is *humungous*. Even just the doc changes and the changes to tempfile.py itself are massive and require very careful reading to review all the consequences. Zack, can you try to interest someone with more time than me in reviewing this patch? What's the point of renaming all imports with a leading underscore? I thought __all__ took care of that. ---------------------------------------------------------------------- Comment By: Zack Weinberg (zackw) Date: 2002-08-03 02:53 Message: Logged In: YES user_id=580015 I've revised the patch; ignore the old one. This version includes a vastly expanded test_tempfile.py which hits every line that I know how to test. 
The omissions are marked - it's mostly non-Unix issues. Also, I went through the entire CVS repository and replaced all uses of tempfile.mktemp with mkstemp/mkdtemp/NamedTemporaryFile, as appropriate. The sole exception is Lib/os.py, which is addressed by patch #590294. The sole functional change to tempfile.py itself, from the previous, is to throw os.O_NOFOLLOW into the open flags. This closes yet another hole - on some systems, without this flag, open(file, O_CREAT|O_EXCL) will follow a symbolic link that points to a nonexistent file, and create the link target. (This has no effect on a symlink in the directory components of the pathname - if the sysadmin has symlinked /tmp to /hugedisk/scratch, that still works.) ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-02 10:45 Message: Logged In: YES user_id=6380 This needs some serious review! Volunteers??? ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=589982&group_id=5470 From noreply@sourceforge.net Fri Aug 9 18:02:18 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Fri, 09 Aug 2002 10:02:18 -0700 Subject: [Patches] [ python-Patches-580995 ] new version of Set class Message-ID: Patches item #580995, was opened at 2002-07-13 11:53 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=580995&group_id=5470 Category: Library (Lib) Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Alex Martelli (aleax) >Assigned to: Guido van Rossum (gvanrossum) Summary: new version of Set class Initial Comment: As per python-dev discussion on Sat 13 July 2002, subject "Dict constructor". A version of Greg Wilson's sandbox Set class that avoids the trickiness of implicitly freezing a set when __hash__ is called on it. 
Rather, uses several classes: Set itself has no __hash__ and represents a general, mutable set; BaseSet, its superclass, has all functionality common to mutable and immutable sets; ImmutableSet also subclasses BaseSet and adds __hash__; a wrapper _TemporarilyImmutableSet wraps a Set exposing only __hash__ (identical to that an ImmutableSet built from the Set would have) and __eq__ and __ne__ (delegated to the Set instance). Set.add(self, x) attempts to call x=x._asImmutable() (if AttributeError leaves x alone); Set._asImmutable(self) returns ImmutableSet(self). Membership test BaseSet.__contains__(self, x) attempts to call x = x._asTemporarilyImmutable() (if AttributeError leaves x alone); Set._asTemporarilyImmutable(self) returns TemporarilyImmutableSet(self). I've left Greg's code mostly alone otherwise except for fixing bugs/obsolescent usage (e.g. dictionary rather than dict) and making what were ValueError into TypeError (ValueError was doubtful earlier, is untenable now that mutable and immutable sets are different types). The change in exceptions forced me to change the unit tests in test_set.py, too, but I made no other changes nor additions. ---------------------------------------------------------------------- Comment By: Alex Martelli (aleax) Date: 2002-07-18 16:27 Message: Logged In: YES user_id=60314 Changed as per GvR comments so now sets have-a dict rather than being-a dict. Made code more direct in some places (using list comprehensions rather than loops where appropriate).
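The class layout described in this item can be sketched roughly as follows. This is a Python 3 approximation written from the description only, not the code attached to the patch; the `_data` attribute and the use of `frozenset` for hashing are assumptions, while the class and method names follow the initial comment.

```python
class BaseSet:
    """Behaviour shared by mutable and immutable sets (has-a dict)."""

    def __init__(self, iterable=()):
        self._data = {}
        for item in iterable:
            self._data[item] = True

    def __contains__(self, item):
        # A mutable Set is unhashable, so wrap it for the lookup only.
        convert = getattr(item, "_asTemporarilyImmutable", None)
        if convert is not None:
            item = convert()
        return item in self._data

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)

    def __eq__(self, other):
        if not isinstance(other, BaseSet):
            return NotImplemented
        return self._data.keys() == other._data.keys()


class ImmutableSet(BaseSet):
    def __hash__(self):
        return hash(frozenset(self._data))


class _TemporarilyImmutableSet:
    """Wrapper exposing only __hash__ and __eq__ of a mutable Set."""

    def __init__(self, wrapped):
        self._wrapped = wrapped

    def __hash__(self):
        return hash(ImmutableSet(self._wrapped))

    def __eq__(self, other):
        return self._wrapped == other


class Set(BaseSet):
    __hash__ = None  # mutable sets are deliberately unhashable

    def add(self, item):
        # Freeze nested sets on the way in; leave other items alone.
        convert = getattr(item, "_asImmutable", None)
        if convert is not None:
            item = convert()
        self._data[item] = True

    def _asImmutable(self):
        return ImmutableSet(self)

    def _asTemporarilyImmutable(self):
        return _TemporarilyImmutableSet(self)


inner = Set([1, 2])
outer = Set()
outer.add(inner)       # stored as an ImmutableSet copy
print(inner in outer)  # membership test goes through the temporary wrapper
```

Because the copy is taken at `add` time, later changes to `inner` do not affect what `outer` holds.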
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=580995&group_id=5470 From noreply@sourceforge.net Fri Aug 9 19:04:09 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Fri, 09 Aug 2002 11:04:09 -0700 Subject: [Patches] [ python-Patches-588564 ] _locale library patch Message-ID: Patches item #588564, was opened at 2002-07-30 05:58 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=588564&group_id=5470 Category: Distutils and setup.py Group: None Status: Open Resolution: None Priority: 5 Submitted By: Jason Tishler (jlt63) Assigned to: Jason Tishler (jlt63) Summary: _locale library patch Initial Comment: This patch enables setup.py to find gettext routines when they are located in libintl instead of libc. Although I developed this patch for Cygwin, I hope that it can be easily updated to support other platforms (if necessary). I tested this patch under Cygwin and Red Hat Linux 7.1. ---------------------------------------------------------------------- >Comment By: Jason Tishler (jlt63) Date: 2002-08-09 10:04 Message: Logged In: YES user_id=86216 I presume that you mean to use an autoconf-style approach *in* setup.py. Is this assumption correct? If so, then I know how to search for libintl.h via find_file(). Unfortunately, I do not know how to check that a function (e.g., gettext()) is in a library (i.e., libc.a). Any suggestions? ---------------------------------------------------------------------- Comment By: Martin v.
L�wis (loewis) Date: 2002-08-04 00:37 Message: Logged In: YES user_id=21627 I would really prefer if such problems where solved in an autoconf-style approach: - is libintl.h present (you may ask pyconfig.h for that) - if so, is gettext provided by the C library - if not, is it provided by -lintl - if yes, add -lintl - if no, print an error message and continue ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=588564&group_id=5470 From noreply@sourceforge.net Fri Aug 9 21:53:53 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Fri, 09 Aug 2002 13:53:53 -0700 Subject: [Patches] [ python-Patches-589982 ] tempfile.py rewrite Message-ID: Patches item #589982, was opened at 2002-08-02 02:38 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=589982&group_id=5470 Category: Library (Lib) Group: Python 2.3 Status: Closed Resolution: Accepted Priority: 5 Submitted By: Zack Weinberg (zackw) Assigned to: Guido van Rossum (gvanrossum) Summary: tempfile.py rewrite Initial Comment: This rewrite closes a number of security-relevant races in tempfile.py; makes temporary filenames much harder to guess; provides secure interfaces that can be used to close similar races elsewhere; and makes it possible to control the prefix and directory of each temporary created, individually. ---------------------------------------------------------------------- >Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-09 16:53 Message: Logged In: YES user_id=33168 Guido, there are still 3 uses of mktemp after the checkin. Should these use mkstemp()? Lib/toaiff.py:102 Lib/plat-irix5/torgb.py:95 Lib/plat-irix6/torgb.py:95 ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-09 12:40 Message: Logged In: YES user_id=6380 I've checked this all in now. 
The changes to test_tempfile.py weren't as easily fixable to work without the tempfile.py changes as Zack thought. I hope the community will give it some review. It will probably break some Zope tests. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-09 10:50 Message: Logged In: YES user_id=6380 OK, I'll do something along those lines myself. ---------------------------------------------------------------------- Comment By: Zack Weinberg (zackw) Date: 2002-08-08 15:01 Message: Logged In: YES user_id=580015 I'm afraid my idea of patch size comes from GCC land, where this would be considered not that big at all. Only 1500 lines affected, and more than half of that is documentation and test suite! I tried, and failed, to break up the changes to tempfile.py itself. But there's some larger divisions that could be made. We could check in the new test_tempfile.py now, disabling the tests that refer to nonexistent functions (just comment out the lines that add those tests to the test_classes array). The changes to the rest of the test suite are also largely independent of the tempfile.py rewrite (since they replace tempfile.mktemp() with TESTFN, mostly). And the search-and-replace changes in the library can wait until after tempfile.py itself gets reviewed. Unfortunately, I am about to go on vacation for five days, so I don't have time now to do this split-up. I will try to drum up interest on python-dev in reviewing the patch as is. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-05 12:48 Message: Logged In: YES user_id=6380 I like the idea of fixing security holes. This patch is *humungous*. Even just the doc changes and the changes to tempfile.py itself are massive and require very careful reading to review all the consequences. Zack, can you try to interest someone with more time than me in reviewing this patch? 
What's the point of renaming all imports with a leading underscore? I thought __all__ took care of that. ---------------------------------------------------------------------- Comment By: Zack Weinberg (zackw) Date: 2002-08-03 02:53 Message: Logged In: YES user_id=580015 I've revised the patch; ignore the old one. This version includes a vastly expanded test_tempfile.py which hits every line that I know how to test. The omissions are marked - it's mostly non-Unix issues. Also, I went through the entire CVS repository and replaced all uses of tempfile.mktemp with mkstemp/mkdtemp/NamedTemporaryFile, as appropriate. The sole exception is Lib/os.py, which is addressed by patch #590294. The sole functional change to tempfile.py itself, from the previous, is to throw os.O_NOFOLLOW into the open flags. This closes yet another hole - on some systems, without this flag, open(file, O_CREAT|O_EXCL) will follow a symbolic link that points to a nonexistent file, and create the link target. (This has no effect on a symlink in the directory components of the pathname - if the sysadmin has symlinked /tmp to /hugedisk/scratch, that still works.) ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-02 10:45 Message: Logged In: YES user_id=6380 This needs some serious review! Volunteers??? 
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=589982&group_id=5470 From noreply@sourceforge.net Sat Aug 10 00:49:05 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Fri, 09 Aug 2002 16:49:05 -0700 Subject: [Patches] [ python-Patches-588564 ] _locale library patch Message-ID: Patches item #588564, was opened at 2002-07-30 15:58 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=588564&group_id=5470 Category: Distutils and setup.py Group: None Status: Open Resolution: None Priority: 5 Submitted By: Jason Tishler (jlt63) Assigned to: Jason Tishler (jlt63) Summary: _locale library patch Initial Comment: This patch enables setup.py to find gettext routines when they are located in libintl instead of libc. Although I developed this patch for Cygwin, I hope that it can be easily updated to support other platforms (if necessary). I tested this patch under Cygwin and Red Hat Linux 7.1. ---------------------------------------------------------------------- >Comment By: Martin v. Löwis (loewis) Date: 2002-08-10 01:49 Message: Logged In: YES user_id=21627 Yes, that's what I meant. It eventually results in distutils getting some of the capabilities of autoconf. I agree this is a major undertaking, but one that I think needs to progress over time, in small steps. For the current problem, it might be useful to emulate AC_TRY_LINK: generate a small program, and see whether the compiler manages to link it. You probably need to allow for result-caching as well; I recommend to put the cache file into build/temp.. This may all sound very ad-hoc, but I think it can be made to work with reasonable effort. We probably need to present any results to distutils-sig before committing them. 
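The AC_TRY_LINK emulation suggested above could look something like the sketch below. It is only an illustration under stated assumptions: the function names `link_test_source` and `can_link` are hypothetical, the compiler is assumed to be reachable as `cc` and to accept `-o` and `-l` in the usual Unix way, and an in-memory dict stands in for the cache file mentioned in the comment. It is not how setup.py or distutils does it.

```python
import os
import shutil
import subprocess
import tempfile


def link_test_source(function):
    """Return a tiny C program that merely references `function`."""
    return (
        "char %s(void);\n"
        "int main(void) { %s(); return 0; }\n" % (function, function)
    )


def can_link(function, libraries=(), compiler="cc", cache=None):
    """Return True/False if a link attempt succeeds/fails, None if no compiler.

    `cache` is an optional dict keyed by (function, libraries) so repeated
    questions do not run the compiler again.
    """
    key = (function, tuple(libraries))
    if cache is not None and key in cache:
        return cache[key]
    if shutil.which(compiler) is None:
        return None  # cannot tell without a compiler
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "conftest.c")
        exe = os.path.join(tmp, "conftest")
        with open(src, "w") as f:
            f.write(link_test_source(function))
        cmd = [compiler, src, "-o", exe] + ["-l" + lib for lib in libraries]
        result = subprocess.run(
            cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
        )
        ok = result.returncode == 0
    if cache is not None:
        cache[key] = ok
    return ok
```

The decision list from the earlier comment then becomes: try `can_link("gettext")` first, and only if that fails try `can_link("gettext", ["intl"])` and add `-lintl` when it succeeds.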
---------------------------------------------------------------------- Comment By: Jason Tishler (jlt63) Date: 2002-08-09 20:04 Message: Logged In: YES user_id=86216 I presume that you mean to use an autoconf-style approach *in* setup.py. Is this assumption correct? If so, then I know how to search for libintl.h via find_file(). Unfortunately, I do not know how to check that a function (e.g., gettext()) is in a library (i.e., libc.a). Any suggestions? ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-08-04 10:37 Message: Logged In: YES user_id=21627 I would really prefer if such problems were solved in an autoconf-style approach: - is libintl.h present (you may ask pyconfig.h for that) - if so, is gettext provided by the C library - if not, is it provided by -lintl - if yes, add -lintl - if no, print an error message and continue ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=588564&group_id=5470 From noreply@sourceforge.net Sat Aug 10 01:19:49 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Fri, 09 Aug 2002 17:19:49 -0700 Subject: [Patches] [ python-Patches-589982 ] tempfile.py rewrite Message-ID: Patches item #589982, was opened at 2002-08-02 02:38 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=589982&group_id=5470 Category: Library (Lib) Group: Python 2.3 Status: Closed Resolution: Accepted Priority: 5 Submitted By: Zack Weinberg (zackw) Assigned to: Guido van Rossum (gvanrossum) Summary: tempfile.py rewrite Initial Comment: This rewrite closes a number of security-relevant races in tempfile.py; makes temporary filenames much harder to guess; provides secure interfaces that can be used to close similar races elsewhere; and makes it possible to control the prefix and directory of each temporary created, individually. 
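As an illustration of the interface this summary describes, here is a minimal sketch using `tempfile.mkstemp()` as it exists in today's standard library. The helper name `write_scratch` and the `demo-` prefix are made up for the example; the point is only that the file is created and opened in one step, so there is no window between choosing a name and creating the file.

```python
import os
import tempfile


def write_scratch(data, prefix="demo-", directory=None):
    """Create a temporary file securely, write `data` to it, return its path.

    mkstemp() returns an already-open descriptor, so the name is never
    handed out before the file exists. The caller must delete the file.
    """
    fd, path = tempfile.mkstemp(prefix=prefix, dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
    except Exception:
        os.unlink(path)
        raise
    return path
```

A caller would typically wrap the use of the returned path in try/finally and call `os.unlink(path)` when done.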
---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-09 20:19 Message: Logged In: YES user_id=6380 Oops, looks like typos in the patch. Fixed (I hope). Question for Zack: I noticed that a few times you changed this: temp = tempfile.mktemp() into this: (fd, temp) = tempfile.mkstemp() os.close(fd) If the latter is secure, why can't mktemp() be defined as doing that? ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-09 16:53 Message: Logged In: YES user_id=33168 Guido, there are still 3 uses of mktemp after the checkin. Should these use mkstemp()? Lib/toaiff.py:102 Lib/plat-irix5/torgb.py:95 Lib/plat-irix6/torgb.py:95 ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-09 12:40 Message: Logged In: YES user_id=6380 I've checked this all in now. The changes to test_tempfile.py weren't as easily fixable to work without the tempfile.py changes as Zack thought. I hope the community will give it some review. It will probably break some Zope tests. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-09 10:50 Message: Logged In: YES user_id=6380 OK, I'll do something along those lines myself. ---------------------------------------------------------------------- Comment By: Zack Weinberg (zackw) Date: 2002-08-08 15:01 Message: Logged In: YES user_id=580015 I'm afraid my idea of patch size comes from GCC land, where this would be considered not that big at all. Only 1500 lines affected, and more than half of that is documentation and test suite! I tried, and failed, to break up the changes to tempfile.py itself. But there's some larger divisions that could be made. 
We could check in the new test_tempfile.py now, disabling the tests that refer to nonexistent functions (just comment out the lines that add those tests to the test_classes array). The changes to the rest of the test suite are also largely independent of the tempfile.py rewrite (since they replace tempfile.mktemp() with TESTFN, mostly). And the search-and-replace changes in the library can wait until after tempfile.py itself gets reviewed. Unfortunately, I am about to go on vacation for five days, so I don't have time now to do this split-up. I will try to drum up interest on python-dev in reviewing the patch as is. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-05 12:48 Message: Logged In: YES user_id=6380 I like the idea of fixing security holes. This patch is *humungous*. Even just the doc changes and the changes to tempfile.py itself are massive and require very careful reading to review all the consequences. Zack, can you try to interest someone with more time than me in reviewing this patch? What's the point of renaming all imports with a leading underscore? I thought __all__ took care of that. ---------------------------------------------------------------------- Comment By: Zack Weinberg (zackw) Date: 2002-08-03 02:53 Message: Logged In: YES user_id=580015 I've revised the patch; ignore the old one. This version includes a vastly expanded test_tempfile.py which hits every line that I know how to test. The omissions are marked - it's mostly non-Unix issues. Also, I went through the entire CVS repository and replaced all uses of tempfile.mktemp with mkstemp/mkdtemp/NamedTemporaryFile, as appropriate. The sole exception is Lib/os.py, which is addressed by patch #590294. The sole functional change to tempfile.py itself, from the previous, is to throw os.O_NOFOLLOW into the open flags. 
This closes yet another hole - on some systems, without this flag, open(file, O_CREAT|O_EXCL) will follow a symbolic link that points to a nonexistent file, and create the link target. (This has no effect on a symlink in the directory components of the pathname - if the sysadmin has symlinked /tmp to /hugedisk/scratch, that still works.) ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-02 10:45 Message: Logged In: YES user_id=6380 This needs some serious review! Volunteers??? ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=589982&group_id=5470 From noreply@sourceforge.net Sat Aug 10 02:37:38 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Fri, 09 Aug 2002 18:37:38 -0700 Subject: [Patches] [ python-Patches-592529 ] Split-out ntmodule.c Message-ID: Patches item #592529, was opened at 2002-08-08 20:12 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=592529&group_id=5470 Category: Windows Group: None Status: Open Resolution: None Priority: 5 Submitted By: Martin v. Löwis (loewis) Assigned to: Martin v. Löwis (loewis) Summary: Split-out ntmodule.c Initial Comment: This patch moves the MS_WINDOWS code from posixmodule.c into ntmodule.c. The OS/2 code is left in posixmodule.c. I believe this patch significantly improves readability of both modules (posix and nt), even though it adds a slight code duplication. It also gives Windows developers the chance to adjust the implementation better to the Win32 API without fear of breaking the POSIX versions. Attached are three files: the ntmodule.c source code, the posixmodule.c diff, and the pcbuild diff. Since the patches will outdate quickly, I'd appreciate if that patch could be accepted or rejected quickly. Randomly assigning to Tim. 
---------------------------------------------------------------------- >Comment By: Mark Hammond (mhammond) Date: 2002-08-10 11:37 Message: Logged In: YES user_id=14198 I too am -0 on this, for the exact reasons Tim gives. I think a better strategy would be to: * Move most of the #ifdef cruft at the top of the source file to pyconfig.h and/or pyport.h, making these "HAVE_" macros consistent with the rest of the HAVE_ cruft. Most #defines in this module then move simply to HAVE_FEATURE rather than IS_SPECIFIC_OS * Move popen, and possibly one or 2 other Windows specific functions to a separate source file. Possibly repeat for OS/2, but it is not clear there are huge OS/2 slabs of code that would make it worthwhile. This would be a good start, reflects the existing posixmodule.c comment: /* Various compilers have only certain posix functions */ /* XXX Gosh I wish these were all moved into pyconfig.h */ and does not preclude a more aggressive split in the future. However, my opinions on this are not strong enough to try and -1 it :) ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-09 12:28 Message: Logged In: YES user_id=6380 I'm +0.5 on this. Can you bring this up on python-dev to see if there are different viewpoints? I think the resulting code will be easier to maintain; new stuff added at this point is more likely to be unique to Unix anyway (or unique to Windows). I wonder if the os2 code shouldn't be moved to its own file as well (I think Andrew MacIntyre maintains that port, right?). There are still a bunch of #ifdefs in the nt code. Are those really variable across Windows versions or compilers? If not, I'd suggest to expand those. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-08-09 02:32 Message: Logged In: YES user_id=31435 I'm -0, so assigning to Guido for another opinion. 
I expect this will actually make it harder to keep the os interface consistent and working across platforms; e.g., somebody adds an os function in one module but forgets to add it in the other (likely because they don't even know it exists); a docstring repair shows up in one but not both; a largefile fix in one doesn't get reflected in the other; etc. Apart from the massive Windows popen pain, there are actually more embedded PYOS_OS2 #ifdefs in posixmodule. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=592529&group_id=5470 From noreply@sourceforge.net Sat Aug 10 05:17:10 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Fri, 09 Aug 2002 21:17:10 -0700 Subject: [Patches] [ python-Patches-589982 ] tempfile.py rewrite Message-ID: Patches item #589982, was opened at 2002-08-02 02:38 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=589982&group_id=5470 Category: Library (Lib) Group: Python 2.3 >Status: Open Resolution: Accepted Priority: 5 Submitted By: Zack Weinberg (zackw) Assigned to: Guido van Rossum (gvanrossum) Summary: tempfile.py rewrite Initial Comment: This rewrite closes a number of security-relevant races in tempfile.py; makes temporary filenames much harder to guess; provides secure interfaces that can be used to close similar races elsewhere; and makes it possible to control the prefix and directory of each temporary created, individually. ---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-10 00:17 Message: Logged In: YES user_id=6380 I'm reopening this just as a precaution. 
The snake farm reported two messages on HP-UX 11 when the test suite was run: Exception exceptions.AttributeError: "mkstemped instance has no attribute 'fd'" in > ignored Exception exceptions.AttributeError: "mkstemped instance has no attribute 'fd'" in > ignored The mkstemped class is defined in test_maketemp.py. That error can happen if a mkstemped instance isn't fully initialized, e.g. if the _mkstemp_inner() call in mkstemped.__init__ fails. But then I would have expected a failure reported, which I don't see... ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-09 20:19 Message: Logged In: YES user_id=6380 Oops, looks like typos in the patch. Fixed (I hope). Question for Zack: I noticed that a few times you changed this: temp = tempfile.mktemp() into this: (fd, temp) = tempfile.mkstemp() os.close(fd) If the latter is secure, why can't mktemp() be defined as doing that? ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-09 16:53 Message: Logged In: YES user_id=33168 Guido, there are still 3 uses of mktemp after the checkin. Should these use mkstemp()? Lib/toaiff.py:102 Lib/plat-irix5/torgb.py:95 Lib/plat-irix6/torgb.py:95 ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-09 12:40 Message: Logged In: YES user_id=6380 I've checked this all in now. The changes to test_tempfile.py weren't as easily fixable to work without the tempfile.py changes as Zack thought. I hope the community will give it some review. It will probably break some Zope tests. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-09 10:50 Message: Logged In: YES user_id=6380 OK, I'll do something along those lines myself. 
---------------------------------------------------------------------- Comment By: Zack Weinberg (zackw) Date: 2002-08-08 15:01 Message: Logged In: YES user_id=580015 I'm afraid my idea of patch size comes from GCC land, where this would be considered not that big at all. Only 1500 lines affected, and more than half of that is documentation and test suite! I tried, and failed, to break up the changes to tempfile.py itself. But there's some larger divisions that could be made. We could check in the new test_tempfile.py now, disabling the tests that refer to nonexistent functions (just comment out the lines that add those tests to the test_classes array). The changes to the rest of the test suite are also largely independent of the tempfile.py rewrite (since they replace tempfile.mktemp() with TESTFN, mostly). And the search-and-replace changes in the library can wait until after tempfile.py itself gets reviewed. Unfortunately, I am about to go on vacation for five days, so I don't have time now to do this split-up. I will try to drum up interest on python-dev in reviewing the patch as is. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-05 12:48 Message: Logged In: YES user_id=6380 I like the idea of fixing security holes. This patch is *humungous*. Even just the doc changes and the changes to tempfile.py itself are massive and require very careful reading to review all the consequences. Zack, can you try to interest someone with more time than me in reviewing this patch? What's the point of renaming all imports with a leading underscore? I thought __all__ took care of that. ---------------------------------------------------------------------- Comment By: Zack Weinberg (zackw) Date: 2002-08-03 02:53 Message: Logged In: YES user_id=580015 I've revised the patch; ignore the old one. This version includes a vastly expanded test_tempfile.py which hits every line that I know how to test. 
The omissions are marked - it's mostly non-Unix issues. Also, I went through the entire CVS repository and replaced all uses of tempfile.mktemp with mkstemp/mkdtemp/NamedTemporaryFile, as appropriate. The sole exception is Lib/os.py, which is addressed by patch #590294. The sole functional change to tempfile.py itself, from the previous, is to throw os.O_NOFOLLOW into the open flags. This closes yet another hole - on some systems, without this flag, open(file, O_CREAT|O_EXCL) will follow a symbolic link that points to a nonexistent file, and create the link target. (This has no effect on a symlink in the directory components of the pathname - if the sysadmin has symlinked /tmp to /hugedisk/scratch, that still works.) ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-02 10:45 Message: Logged In: YES user_id=6380 This needs some serious review! Volunteers??? ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=589982&group_id=5470 From noreply@sourceforge.net Sat Aug 10 11:01:39 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sat, 10 Aug 2002 03:01:39 -0700 Subject: [Patches] [ python-Patches-576101 ] Alternative implementation of interning Message-ID: Patches item #576101, was opened at 2002-07-01 19:23 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=576101&group_id=5470 Category: None Group: None Status: Open Resolution: None Priority: 5 Submitted By: Oren Tirosh (orenti) Assigned to: Nobody/Anonymous (nobody) Summary: Alternative implementation of interning Initial Comment: An interned string has a flag set indicating that it is interned instead of a pointer to the interned string. This pointer was almost always either NULL or pointing to the same object. The other cases were rare and ineffective as an optimization. 
This saves an average of 3 bytes per string. Interned strings are no longer immortal. They are automatically destroyed when there are no more references to them except the global dictionary of interned strings. New function (actually a macro) PyString_CheckInterned to check whether a string is interned. There are no more references to ob_sinterned anywhere outside stringobject.c. ---------------------------------------------------------------------- >Comment By: Oren Tirosh (orenti) Date: 2002-08-10 10:01 Message: Logged In: YES user_id=562624 General cleanup. Better handling of immortal interned strings for backward compatibility. It passes regrtest but causes test_gc to leak 20 objects. 13 from test_finalizer_newclass and 7 from test_del_newclass, but only if test_saveall is used. I've tried earlier versions of this patch (which were ok at the time) and they now create this leak too. ---------------------------------------------------------------------- Comment By: Oren Tirosh (orenti) Date: 2002-07-06 16:08 Message: Logged In: YES user_id=562624 Oops, forgot to actually attach the patch. Here it is. ---------------------------------------------------------------------- Comment By: Oren Tirosh (orenti) Date: 2002-07-06 14:35 Message: Logged In: YES user_id=562624 This implementation supports both mortal and immortal interned strings. PyString_InternInPlace creates an immortal interned string for backward compatibility with code that relies on this behavior. PyString_Intern creates a mortal interned string that is deallocated when its refcnt reaches 0. Note that if the string value has been previously interned as immortal this will not make it mortal. Most places in the interpreter were changed to PyString_Intern except those that may be required for compatibility. This version of the patch, like the previous one, disables indirect interning. Is there any evidence that it is still an important optimization for some packages? 
Make sure you rebuild everything after applying this patch because it modifies the size of string object headers. ---------------------------------------------------------------------- Comment By: Raymond Hettinger (rhettinger) Date: 2002-07-02 04:21 Message: Logged In: YES user_id=80475 I like the way you consolidated all of the knowledge about interning into one place. Consider adding an example to the docs of an effective use of interning for optimization. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=576101&group_id=5470 From noreply@sourceforge.net Sat Aug 10 14:58:20 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sat, 10 Aug 2002 06:58:20 -0700 Subject: [Patches] [ python-Patches-576101 ] Alternative implementation of interning Message-ID: Patches item #576101, was opened at 2002-07-01 15:23 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=576101&group_id=5470 Category: None Group: None Status: Open Resolution: None Priority: 5 Submitted By: Oren Tirosh (orenti) >Assigned to: Guido van Rossum (gvanrossum) Summary: Alternative implementation of interning Initial Comment: An interned string has a flag set indicating that it is interned instead of a pointer to the interned string. This pointer was almost always either NULL or pointing to the same object. The other cases were rare and ineffective as an optimization. This saves an average of 3 bytes per string. Interned strings are no longer immortal. They are automatically destroyed when there are no more references to them except the global dictionary of interned strings. New function (actually a macro) PyString_CheckInterned to check whether a string is interned. There are no more references to ob_sinterned anywhere outside stringobject.c. 
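For readers who only see interning from the Python side, the effect being optimized can be illustrated with a small sketch. It uses `sys.intern()` from current Python 3 rather than the C-level functions named in the patch, so it says nothing about the patch's internals; the helper `build` and the sample strings are made up for the example.

```python
import sys


def build(parts):
    """Assemble a string at run time so it is a fresh object, not a literal."""
    return "".join(parts)


a = build(["spam", "_", "eggs"])
b = build(["spam", "_", "eggs"])

# The two strings compare equal; interning maps every equal string to one
# shared object, so later comparisons can be done by identity.
ia = sys.intern(a)
ib = sys.intern(b)
same_object = ia is ib
```

Here `same_object` is true because both calls return the single interned copy, which is what makes repeated dictionary lookups on such keys cheap.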
---------------------------------------------------------------------- Comment By: Oren Tirosh (orenti) Date: 2002-08-10 06:01 Message: Logged In: YES user_id=562624 General cleanup. Better handling of immortal interned strings for backward compatibility. It passes regrtest but causes test_gc to leak 20 objects. 13 from test_finalizer_newclass and 7 from test_del_newclass, but only if test_saveall is used. I've tried earlier versions of this patch (which were ok at the time) and they now create this leak too. ---------------------------------------------------------------------- Comment By: Oren Tirosh (orenti) Date: 2002-07-06 12:08 Message: Logged In: YES user_id=562624 Oops, forgot to actually attach the patch. Here it is. ---------------------------------------------------------------------- Comment By: Oren Tirosh (orenti) Date: 2002-07-06 10:35 Message: Logged In: YES user_id=562624 This implementation supports both mortal and immortal interned strings. PyString_InternInPlace creates an immortal interned string for backward compatibility with code that relies on this behavior. PyString_Intern creates a mortal interned string that is deallocated when its refcnt reaches 0. Note that if the string value has been previously interned as immortal this will not make it mortal. Most places in the interpreter were changed to PyString_Intern except those that may be required for compatibility. This version of the patch, like the previous one, disables indirect interning. Is there any evidence that it is still an important optimization for some packages? Make sure you rebuild everything after applying this patch because it modifies the size of string object headers. ---------------------------------------------------------------------- Comment By: Raymond Hettinger (rhettinger) Date: 2002-07-02 00:21 Message: Logged In: YES user_id=80475 I like the way you consolidated all of the knowledge about interning into one place. 
Consider adding an example to the docs of an effective use of interning for optimization. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=576101&group_id=5470 From noreply@sourceforge.net Sun Aug 11 03:01:13 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sat, 10 Aug 2002 19:01:13 -0700 Subject: [Patches] [ python-Patches-593560 ] bugfixes and cleanup for _strptime.py Message-ID: Patches item #593560, was opened at 2002-08-10 19:01 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=593560&group_id=5470 Category: Library (Lib) Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Brett Cannon (bcannon) Assigned to: Nobody/Anonymous (nobody) Summary: bugfixes and cleanup for _strptime.py Initial Comment: Discovered two bugs in _strptime.py thanks to Mikael Sch?berg of AB Strakt; both were in LocaleTime.__calc_date_time(). One was where if a locale-specific format string represented the month without a leading zero, it would not be caught. The other bug was when a locale just lacked some information (in this case, Swedish's lack of an AM/PM representation); IndexError was thrown because string.replace() was being called with the empty string as the old value. I also took this opportunity to clean up some of the code (namely TimeRE.__getitem__() along with LocaleTime.__calc_date_time()). Added some comments, reformatted some code, etc. All of this was brought on thanks to the Python Cookbook's chapter 1 (good work Alex and David!). I have updated test_strptime.py to check for the second of the mentioned bug explicitly. I also commented the code and added a fxn that creates a PyUnit test suite with all of the tests. 
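The second bug described above comes down to calling replace() with an empty search string. A guard of the following kind avoids it; this is a hedged sketch in current Python, with a hypothetical function name and sample data, and it is not the code from _strptime.py. (In current Python 3 an empty search string no longer raises, but `"ab".replace("", "-")` gives `"-a-b-"`, which would be just as wrong here.)

```python
def substitute_directives(formatted, replacements):
    """Turn a locale-formatted sample string into a strptime format string.

    `replacements` is a sequence of (text, directive) pairs. Pairs whose
    text is empty (for example a locale with no AM/PM string) are skipped
    instead of being passed to str.replace().
    """
    for old, directive in replacements:
        if not old:
            continue  # nothing to search for in this locale
        formatted = formatted.replace(old, directive)
    return formatted
```

For example, `substitute_directives("10:30:00 ", [("10", "%H"), ("30", "%M"), ("00", "%S"), ("", "%p")])` leaves the string untouched by the empty AM/PM entry.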
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=593560&group_id=5470 From noreply@sourceforge.net Sun Aug 11 04:31:41 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sat, 10 Aug 2002 20:31:41 -0700 Subject: [Patches] [ python-Patches-593560 ] bugfixes and cleanup for _strptime.py Message-ID: Patches item #593560, was opened at 2002-08-10 19:01 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=593560&group_id=5470 Category: Library (Lib) Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Brett Cannon (bcannon) Assigned to: Nobody/Anonymous (nobody) Summary: bugfixes and cleanup for _strptime.py Initial Comment: Discovered two bugs in _strptime.py thanks to Mikael Sch?berg of AB Strakt; both were in LocaleTime.__calc_date_time(). One was where if a locale-specific format string represented the month without a leading zero, it would not be caught. The other bug was when a locale just lacked some information (in this case, Swedish's lack of an AM/PM representation); IndexError was thrown because string.replace() was being called with the empty string as the old value. I also took this opportunity to clean up some of the code (namely TimeRE.__getitem__() along with LocaleTime.__calc_date_time()). Added some comments, reformatted some code, etc. All of this was brought on thanks to the Python Cookbook's chapter 1 (good work Alex and David!). I have updated test_strptime.py to check for the second of the mentioned bugs explicitly. I also commented the code and added a function that creates a PyUnit test suite with all of the tests.
---------------------------------------------------------------------- >Comment By: Brett Cannon (bcannon) Date: 2002-08-10 20:31 Message: Logged In: YES user_id=357491 Just when you thought you had something done, tim_one had to go and normalize the whitespace in both _strptime.py and test_strptime.py! =) So to save Tim the time and effort of having to normalize the files again, I went ahead and applied them to the fixed files. I also reformatted test_strptime.py so that lines wrapped around 80 characters (didn't realize Guido had added it to the distro until today). So make sure to use the files that specify whitespace normalization in their descriptions. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=593560&group_id=5470 From noreply@sourceforge.net Sun Aug 11 05:28:46 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sat, 10 Aug 2002 21:28:46 -0700 Subject: [Patches] [ python-Patches-589982 ] tempfile.py rewrite Message-ID: Patches item #589982, was opened at 2002-08-02 02:38 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=589982&group_id=5470 Category: Library (Lib) Group: Python 2.3 Status: Open Resolution: Accepted Priority: 5 Submitted By: Zack Weinberg (zackw) Assigned to: Guido van Rossum (gvanrossum) Summary: tempfile.py rewrite Initial Comment: This rewrite closes a number of security-relevant races in tempfile.py; makes temporary filenames much harder to guess; provides secure interfaces that can be used to close similar races elsewhere; and makes it possible to control the prefix and directory of each temporary created, individually. ---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-11 00:28 Message: Logged In: YES user_id=6380 I'm reopening this just as a precaution. 
The snake farm reported two messages on HP-UX 11 when the test suite was run: Exception exceptions.AttributeError: "mkstemped instance has no attribute 'fd'" in > ignored Exception exceptions.AttributeError: "mkstemped instance has no attribute 'fd'" in > ignored The mkstemped class is defined in test_maketemp.py. That error can happen if a mkstemped instance isn't fully initialized, e.g. if the _mkstemp_inner() call in mkstemped.__init__ fails. But then I would have expected a failure reported, which I don't see... ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-10 00:17 Message: Logged In: YES user_id=6380 I'm reopening this just as a precaution. The snake farm reported two messages on HP-UX 11 when the test suite was run: Exception exceptions.AttributeError: "mkstemped instance has no attribute 'fd'" in > ignored Exception exceptions.AttributeError: "mkstemped instance has no attribute 'fd'" in > ignored The mkstemped class is defined in test_maketemp.py. That error can happen if a mkstemped instance isn't fully initialized, e.g. if the _mkstemp_inner() call in mkstemped.__init__ fails. But then I would have expected a failure reported, which I don't see... ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-09 20:19 Message: Logged In: YES user_id=6380 Oops, looks like typos in the patch. Fixed (I hope). Question for Zack: I noticed that a few times you changed this: temp = tempfile.mktemp() into this: (fd, temp) = tempfile.mkstemp() os.close(fd) If the latter is secure, why can't mktemp() be defined as doing that? ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-09 16:53 Message: Logged In: YES user_id=33168 Guido, there are still 3 uses of mktemp after the checkin. Should these use mkstemp()? 
Lib/toaiff.py:102 Lib/plat-irix5/torgb.py:95 Lib/plat-irix6/torgb.py:95 ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-09 12:40 Message: Logged In: YES user_id=6380 I've checked this all in now. The changes to test_tempfile.py weren't as easily fixable to work without the tempfile.py changes as Zack thought. I hope the community will give it some review. It will probably break some Zope tests. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-09 10:50 Message: Logged In: YES user_id=6380 OK, I'll do something along those lines myself. ---------------------------------------------------------------------- Comment By: Zack Weinberg (zackw) Date: 2002-08-08 15:01 Message: Logged In: YES user_id=580015 I'm afraid my idea of patch size comes from GCC land, where this would be considered not that big at all. Only 1500 lines affected, and more than half of that is documentation and test suite! I tried, and failed, to break up the changes to tempfile.py itself. But there's some larger divisions that could be made. We could check in the new test_tempfile.py now, disabling the tests that refer to nonexistent functions (just comment out the lines that add those tests to the test_classes array). The changes to the rest of the test suite are also largely independent of the tempfile.py rewrite (since they replace tempfile.mktemp() with TESTFN, mostly). And the search-and-replace changes in the library can wait until after tempfile.py itself gets reviewed. Unfortunately, I am about to go on vacation for five days, so I don't have time now to do this split-up. I will try to drum up interest on python-dev in reviewing the patch as is. 
---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-05 12:48 Message: Logged In: YES user_id=6380 I like the idea of fixing security holes. This patch is *humungous*. Even just the doc changes and the changes to tempfile.py itself are massive and require very careful reading to review all the consequences. Zack, can you try to interest someone with more time than me in reviewing this patch? What's the point of renaming all imports with a leading underscore? I thought __all__ took care of that. ---------------------------------------------------------------------- Comment By: Zack Weinberg (zackw) Date: 2002-08-03 02:53 Message: Logged In: YES user_id=580015 I've revised the patch; ignore the old one. This version includes a vastly expanded test_tempfile.py which hits every line that I know how to test. The omissions are marked - it's mostly non-Unix issues. Also, I went through the entire CVS repository and replaced all uses of tempfile.mktemp with mkstemp/mkdtemp/NamedTemporaryFile, as appropriate. The sole exception is Lib/os.py, which is addressed by patch #590294. The sole functional change to tempfile.py itself, from the previous, is to throw os.O_NOFOLLOW into the open flags. This closes yet another hole - on some systems, without this flag, open(file, O_CREAT|O_EXCL) will follow a symbolic link that points to a nonexistent file, and create the link target. (This has no effect on a symlink in the directory components of the pathname - if the sysadmin has symlinked /tmp to /hugedisk/scratch, that still works.) ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-02 10:45 Message: Logged In: YES user_id=6380 This needs some serious review! Volunteers??? 
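For readers following the mktemp()/mkstemp() discussion above, here is a minimal sketch of the mkstemp() pattern in present-day Python. The helper name write_scratch and the "demo-" prefix are assumptions made for illustration; they come from neither the patch nor tempfile.py.

```python
import os
import tempfile


def write_scratch(data, prefix="demo-"):
    """Write data to a freshly created temporary file and return its path.

    Hypothetical helper for illustration. mkstemp() creates the file and
    hands back an already-open descriptor, so there is no gap between
    choosing a name and creating the file, unlike with mktemp().
    """
    fd, path = tempfile.mkstemp(prefix=prefix)
    try:
        # Wrap the descriptor we were given; do not reopen the file by name.
        with os.fdopen(fd, "wb") as handle:
            handle.write(data)
    except BaseException:
        # Do not leave a stray file behind if the write fails.
        os.unlink(path)
        raise
    return path


scratch_path = write_scratch(b"hello")
```

The caller is responsible for removing the file, e.g. with os.unlink(scratch_path), once it is no longer needed.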
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=589982&group_id=5470 From noreply@sourceforge.net Sun Aug 11 11:41:22 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sun, 11 Aug 2002 03:41:22 -0700 Subject: [Patches] [ python-Patches-593627 ] Static names Message-ID: Patches item #593627, was opened at 2002-08-11 10:41 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=593627&group_id=5470 Category: Core (C code) Group: None Status: Open Resolution: None Priority: 5 Submitted By: Oren Tirosh (orenti) Assigned to: Nobody/Anonymous (nobody) Summary: Static names Initial Comment: This patch creates static string objects for all built-in names and interns them on initialization. The macro PyNAME is used to access static names. PyNAME(__spam__) is equivalent to PyString_InternFromString("__spam__") but is a constant expression. It requires the name to be one of the built-in names. A linker error will be generated if it isn't. Most conversions of C strings into temporary string objects can be eliminated (PyString_FromString, PyString_InternFromString). Most string comparisons at runtime can also be eliminated. Instead of: if (strcmp(PyString_AsString(name), "__spam__") == 0) ... This code can be used: PyString_INTERN(name) if (name == PyNAME(__spam__)) ... Where PyString_INTERN is a fast inline check if the string is already interned (and it usually is). To prevent unbounded accumulation of interned strings the mortal interned string patch should also be applied. The patch converts most of the builtin module to this new mode as an example.
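The idea of replacing a string comparison with a pointer comparison can be sketched at the Python level with sys.intern(). This is only an analogy under stated assumptions: PyNAME and PyString_INTERN are C macros from the patch and have no Python equivalent, and the STATIC_NAMES table and is_spam() function below are hypothetical.

```python
import sys

# Hypothetical table of "static names", interned once at start-up,
# standing in for the statically allocated string objects of the patch.
STATIC_NAMES = {name: sys.intern(name) for name in ("__spam__", "__eggs__")}


def is_spam(name):
    """Return True if name equals "__spam__", using an identity check."""
    # Interning maps equal strings to one shared object, so once name is
    # interned, "is" gives the same answer as a character-by-character
    # comparison.
    return sys.intern(name) is STATIC_NAMES["__spam__"]


# A string assembled at run time still matches once it is interned.
built_at_runtime = "".join(["__sp", "am__"])
matches = is_spam(built_at_runtime)
```

The identity check is only valid because both sides are interned first; comparing arbitrary strings with "is" would not be reliable.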
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=593627&group_id=5470 From noreply@sourceforge.net Sun Aug 11 13:27:55 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sun, 11 Aug 2002 05:27:55 -0700 Subject: [Patches] [ python-Patches-514628 ] bug in pydoc on python 2.2 release Message-ID: Patches item #514628, was opened at 2002-02-07 18:09 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=514628&group_id=5470 Category: Library (Lib) Group: None Status: Open >Resolution: Works For Me Priority: 5 Submitted By: Raj Kunjithapadam (mmaster25) Assigned to: Ka-Ping Yee (ping) Summary: bug in pydoc on python 2.2 release Initial Comment: pydoc has a bug when trying to generate html doc more importantly it has bug in the method writedoc() attached is my fix. Here is the diff between my fix and the regular dist 1338c1338 < def writedoc(thing, forceload=0): --- > def writedoc(key, forceload=0): 1340,1346c1340,1343 < object = thing < if type(thing) is type(''): < try: < object = locate(thing, forceload) < except ErrorDuringImport, value: < print value < return --- > try: > object = locate(key, forceload) > except ErrorDuringImport, value: > print value 1351c1348 < file = open(thing.__name__ + '.html', 'w') --- > file = open(key + '.html', 'w') 1354c1351 < print 'wrote', thing.__name__ + '.html' --- > print 'wrote', key + '.html' 1356c1353 < print 'no Python documentation found for %s' % repr(thing) --- > print 'no Python documentation found for %s' % repr(key) ---------------------------------------------------------------------- >Comment By: Ka-Ping Yee (ping) Date: 2002-08-11 05:27 Message: Logged In: YES user_id=45338 I see that your patch changes the functionality of writedoc() so that it accepts other objects as well as strings, but you have not explained how this fixes a bug, or even what the bug is. 
If you feel that the bug is "writedoc() only accepts strings", then I disagree that this is a bug. The designed purpose of writedoc() is to accept a string, because then the filename it writes is directly predictable. ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-03-18 00:48 Message: Logged In: YES user_id=21627 Can you please provide an example that demonstrates the problem? Also, can you please regenerate your changes as context (-c) or unified (-u) diffs, and attach those to this report (do *not* paste them into the comment field)? In its current form, the patch is pretty useless: SF messed up the indentation, and it is an old-style patch, and pydoc.py is already at 1.58. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-03-01 14:45 Message: Logged In: YES user_id=6380 assigned to Tim; this may be Ping's terrain but Ping is typically not responsive. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=514628&group_id=5470 From noreply@sourceforge.net Sun Aug 11 13:28:14 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sun, 11 Aug 2002 05:28:14 -0700 Subject: [Patches] [ python-Patches-514628 ] bug in pydoc on python 2.2 release Message-ID: Patches item #514628, was opened at 2002-02-07 18:09 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=514628&group_id=5470 Category: Library (Lib) Group: None >Status: Closed Resolution: Works For Me Priority: 5 Submitted By: Raj Kunjithapadam (mmaster25) Assigned to: Ka-Ping Yee (ping) Summary: bug in pydoc on python 2.2 release Initial Comment: pydoc has a bug when trying to generate html doc more importantly it has bug in the method writedoc() attached is my fix.
Here is the diff between my fix and the regular dist 1338c1338 < def writedoc(thing, forceload=0): --- > def writedoc(key, forceload=0): 1340,1346c1340,1343 < object = thing < if type(thing) is type(''): < try: < object = locate(thing, forceload) < except ErrorDuringImport, value: < print value < return --- > try: > object = locate(key, forceload) > except ErrorDuringImport, value: > print value 1351c1348 < file = open(thing.__name__ + '.html', 'w') --- > file = open(key + '.html', 'w') 1354c1351 < print 'wrote', thing.__name__ + '.html' --- > print 'wrote', key + '.html' 1356c1353 < print 'no Python documentation found for %s' % repr(thing) --- > print 'no Python documentation found for %s' % repr(key) ---------------------------------------------------------------------- Comment By: Ka-Ping Yee (ping) Date: 2002-08-11 05:27 Message: Logged In: YES user_id=45338 I see that your patch changes the functionality of writedoc() so that it accepts other objects as well as strings, but you have not explained how this fixes a bug, or even what the bug is. If you feel that the bug is "writedoc() only accepts strings", then I disagree that this is a bug. The designed purpose of writedoc() is to accept a string, because then the filename it writes is directly predictable. ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-03-18 00:48 Message: Logged In: YES user_id=21627 Can you please provide an example that demonstrates the problem? Also, can you please regenerate your changes as context (-c) or unified (-u) diffs, and attach those to this report (do *not* paste them into the comment field)? In its current form, the patch is pretty useless: SF messed up the indentation, and it is an old-style patch, and pydoc.py is already at 1.58.
---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-03-01 14:45 Message: Logged In: YES user_id=6380 assigned to Tim; this may be Ping's terrain but Ping is typically not responsive. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=514628&group_id=5470 From noreply@sourceforge.net Sun Aug 11 15:09:58 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sun, 11 Aug 2002 07:09:58 -0700 Subject: [Patches] [ python-Patches-505705 ] Remove eval in pickle and cPickle Message-ID: Patches item #505705, was opened at 2002-01-19 04:21 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=505705&group_id=5470 Category: Core (C code) Group: None Status: Open Resolution: Out of Date Priority: 5 Submitted By: Martin v. Löwis (loewis) Assigned to: Martin v. Löwis (loewis) Summary: Remove eval in pickle and cPickle Initial Comment: This patch removes the use of eval in pickle and cPickle. It does so by: - moving the actual parsing from compile.c:parsestr to PyString_DecodeEscape - introducing a new codec string-escape - removing the code that checks that a string-to-unpickle is properly escaped throughout, and replacing this with a check whether it is properly quoted, - unquoting the string in load_string, then passing it to the codec. This fixes #502503. ---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-11 10:09 Message: Logged In: YES user_id=6380 This would fix bug #593656 too. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-09 10:48 Message: Logged In: YES user_id=6380 I like this idea. But the patch is out of date. Can you rework the patch?
How much faster does this make the test program from http://groups.google.de/groups?hl=en&lr=&ie=UTF-8&selm=mailman.1026940226.16076.python-list%40python.org ??? ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-09 00:12 Message: Logged In: YES user_id=6380 I'll review this. ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-01-19 04:25 Message: Logged In: YES user_id=21627 BTW, this patch has #500002 as a prerequisite. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=505705&group_id=5470 From noreply@sourceforge.net Sun Aug 11 15:46:14 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sun, 11 Aug 2002 07:46:14 -0700 Subject: [Patches] [ python-Patches-593627 ] Static names Message-ID: Patches item #593627, was opened at 2002-08-11 06:41 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=593627&group_id=5470 Category: Core (C code) Group: None Status: Open Resolution: None Priority: 5 Submitted By: Oren Tirosh (orenti) Assigned to: Nobody/Anonymous (nobody) Summary: Static names Initial Comment: This patch creates static string objects for all built-in names and interns them on initialization. The macro PyNAME is used to access static names. PyNAME(__spam__) is equivalent to PyString_InternFromString("__spam__") but is a constant expression. It requires the name to be one of the built-in names. A linker error will be generated if it isn't. Most conversions of C strings into temporary string objects can be eliminated (PyString_FromString, PyString_InternFromString). Most string comparisons at runtime can also be eliminated. Instead of: if (strcmp(PyString_AsString(name), "__spam__") == 0) ...
This code can be used: PyString_INTERN(name) if (name == PyNAME(__spam__)) ... Where PyString_INTERN is a fast inline check if the string is already interned (and it usually is). To prevent unbounded accumulation of interned strings the mortal interned string patch should also be applied. The patch converts most of the builtin module to this new mode as an example. ---------------------------------------------------------------------- >Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-11 10:46 Message: Logged In: YES user_id=33168 Couple of initial comments: * this is a reverse patch * it seems like there are other changes in here - int ob_shash -> long - releasing interned strings? * dictobject.c is removed? * including python headers should use "" not <> Oren, could you generate a new patch with only the changes to support PyNAME? Thanks! ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=593627&group_id=5470 From noreply@sourceforge.net Sun Aug 11 16:44:10 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sun, 11 Aug 2002 08:44:10 -0700 Subject: [Patches] [ python-Patches-593627 ] Static names Message-ID: Patches item #593627, was opened at 2002-08-11 10:41 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=593627&group_id=5470 Category: Core (C code) Group: None Status: Open Resolution: None Priority: 5 Submitted By: Oren Tirosh (orenti) Assigned to: Nobody/Anonymous (nobody) Summary: Static names Initial Comment: This patch creates static string objects for all built-in names and interns them on initialization. The macro PyNAME is used to access static names. PyNAME(__spam__) is equivalent to PyString_InternFromString("__spam__") but is a constant expression. It requires the name to be one of the built-in names. A linker error will be generated if it isn't.
Most conversions of C strings into temporary string objects can be eliminated (PyString_FromString, PyString_InternFromString). Most string comparisons at runtime can also be eliminated. Instead of: if (strcmp(PyString_AsString(name), "__spam__") == 0) ... This code can be used: PyString_INTERN(name) if (name == PyNAME(__spam__)) ... Where PyString_INTERN is a fast inline check if the string is already interned (and it usually is). To prevent unbounded accumulation of interned strings the mortal interned string patch should also be applied. The patch converts most of the builtin module to this new mode as an example. ---------------------------------------------------------------------- >Comment By: Oren Tirosh (orenti) Date: 2002-08-11 15:44 Message: Logged In: YES user_id=562624 Ok, I'll fix the problems with the patch. What's the best way to produce a patch that adds new files? Static string objects cannot be released, of course. This patch will eventually depend on the mortal interned strings patch to fix it but in the meantime I just disabled releasing interned strings because I want to keep the two patches independent. The next version will add a new PyArg_ParseTuple format for an interned string to make it easier to use. ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-11 14:46 Message: Logged In: YES user_id=33168 Couple of initial comments: * this is a reverse patch * it seems like there are other changes in here - int ob_shash -> long - releasing interned strings? * dictobject.c is removed? * including python headers should use "" not <> Oren, could you generate a new patch with only the changes to support PyNAME? Thanks!
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=593627&group_id=5470 From noreply@sourceforge.net Sun Aug 11 16:54:34 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sun, 11 Aug 2002 08:54:34 -0700 Subject: [Patches] [ python-Patches-593627 ] Static names Message-ID: Patches item #593627, was opened at 2002-08-11 06:41 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=593627&group_id=5470 Category: Core (C code) Group: None Status: Open Resolution: None Priority: 5 Submitted By: Oren Tirosh (orenti) Assigned to: Nobody/Anonymous (nobody) Summary: Static names Initial Comment: This patch creates static string objects for all built-in names and interns them on initialization. The macro PyNAME is used to access static names. PyNAME(__spam__) is equivalent to PyString_InternFromString("__spam__") but is a constant expression. It requires the name to be one of the built-in names. A linker error will be generated if it isn't. Most conversions of C strings into temporary string objects can be eliminated (PyString_FromString, PyString_InternFromString). Most string comparisons at runtime can also be eliminated. Instead of: if (strcmp(PyString_AsString(name), "__spam__") == 0) ... This code can be used: PyString_INTERN(name) if (name == PyNAME(__spam__)) ... Where PyString_INTERN is a fast inline check if the string is already interned (and it usually is). To prevent unbounded accumulation of interned strings the mortal interned string patch should also be applied. The patch converts most of the builtin module to this new mode as an example. ---------------------------------------------------------------------- >Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-11 11:54 Message: Logged In: YES user_id=33168 I'd like to see the releasing interned string patch applied. I think it's almost ready, isn't it?
It would make patching easier and seems to be a good idea. For me, the easiest way to produce patches is to use cvs. You can keep multiple cvs trees around easily enough (for having multiple overlapping/independent patches). To create patches with cvs: cvs diff -C 5 [file1] [file2] .... ---------------------------------------------------------------------- Comment By: Oren Tirosh (orenti) Date: 2002-08-11 11:44 Message: Logged In: YES user_id=562624 Ok, I'll fix the problems with the patch. What's the best way to produce a patch that adds new files? Static string objects cannot be released, of course. This patch will eventually depend on the mortal interned strings patch to fix it but in the meantime I just disabled releasing interned strings because I want to keep the two patches independent. The next version will add a new PyArg_ParseTuple format for an interned string to make it easier to use. ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-11 10:46 Message: Logged In: YES user_id=33168 Couple of initial comments: * this is a reverse patch * it seems like there are other changes in here - int ob_shash -> long - releasing interned strings? * dictobject.c is removed? * including python headers should use "" not <> Oren, could you generate a new patch with only the changes to support PyNAME? Thanks!
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=593627&group_id=5470 From noreply@sourceforge.net Sun Aug 11 18:43:24 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sun, 11 Aug 2002 10:43:24 -0700 Subject: [Patches] [ python-Patches-592529 ] Split-out ntmodule.c Message-ID: Patches item #592529, was opened at 2002-08-08 12:12 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=592529&group_id=5470 Category: Windows Group: None >Status: Closed >Resolution: Rejected Priority: 5 Submitted By: Martin v. Löwis (loewis) Assigned to: Martin v. Löwis (loewis) Summary: Split-out ntmodule.c Initial Comment: This patch moves the MS_WINDOWS code from posixmodule.c into ntmodule.c. The OS/2 code is left in posixmodule.c. I believe this patch significantly improves readability of both modules (posix and nt), even though it adds a slight code duplication. It also gives Windows developers the chance to adjust the implementation better to the Win32 API without fear of breaking the POSIX versions. Attached are three files: the ntmodule.c source code, the posixmodule.c diff, and the pcbuild diff. Since the patches will outdate quickly, I'd appreciate if that patch could be accepted or rejected quickly. Randomly assigning to Tim. ---------------------------------------------------------------------- >Comment By: Martin v. Löwis (loewis) Date: 2002-08-11 19:43 Message: Logged In: YES user_id=21627 Ok, I withdraw this patch. ---------------------------------------------------------------------- Comment By: Mark Hammond (mhammond) Date: 2002-08-10 03:37 Message: Logged In: YES user_id=14198 I too am -0 on this, for the exact reasons Tim gives.
I think a better strategy would be to: * Move most of the #ifdef cruft at the top of the source file to pyconfig.h and/or pyport.h, making these "HAVE_" macros consistent with the rest of the HAVE_ cruft. Most #defines in this module then move simply to HAVE_FEATURE rather than IS_SPECIFIC_OS * Move popen, and possibly one or two other Windows specific functions to a separate source file. Possibly repeat for OS/2, but it is not clear there are huge OS/2 slabs of code that would make it worthwhile. This would be a good start, reflects the existing posixmodule.c comment: /* Various compilers have only certain posix functions */ /* XXX Gosh I wish these were all moved into pyconfig.h */ and does not preclude a more aggressive split in the future. However, my opinions on this are not strong enough to try and -1 it :) ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-09 04:28 Message: Logged In: YES user_id=6380 I'm +0.5 on this. Can you bring this up on python-dev to see if there are different viewpoints? I think the resulting code will be easier to maintain; new stuff added at this point is more likely to be unique to Unix anyway (or unique to Windows). I wonder if the os2 code shouldn't be moved to its own file as well (I think Andrew MacIntyre maintains that port, right?). There are still a bunch of #ifdefs in the nt code. Are those really variable across Windows versions or compilers? If not, I'd suggest to expand those. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-08-08 18:32 Message: Logged In: YES user_id=31435 I'm -0, so assigning to Guido for another opinion.
I expect this will actually make it harder to keep the os interface consistent and working across platforms; e.g., somebody adds an os function in one module but forgets to add it in the other (likely because they don't even know it exists); a docstring repair shows up in one but not both; a largefile fix in one doesn't get reflected in the other; etc. Apart from the massive Windows popen pain, there are actually more embedded PYOS_OS2 #ifdefs in posixmodule. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=592529&group_id=5470 From noreply@sourceforge.net Sun Aug 11 18:47:07 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sun, 11 Aug 2002 10:47:07 -0700 Subject: [Patches] [ python-Patches-593560 ] bugfixes and cleanup for _strptime.py Message-ID: Patches item #593560, was opened at 2002-08-11 04:01 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=593560&group_id=5470 Category: Library (Lib) Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Brett Cannon (bcannon) Assigned to: Nobody/Anonymous (nobody) Summary: bugfixes and cleanup for _strptime.py Initial Comment: Discovered two bugs in _strptime.py thanks to Mikael Sch?berg of AB Strakt; both were in LocaleTime.__calc_date_time(). One was where if a locale-specific format string represented the month without a leading zero, it would not be caught. The other bug was when a locale just lacked some information (in this case, Swedish's lack of an AM/PM representation); IndexError was thrown because string.replace() was being called with the empty string as the old value. I also took this opportunity to clean up some of the code (namely TimeRE.__getitem__() along with LocaleTime.__calc_date_time()). Added some comments, reformatted some code, etc. 
All of this was brought on thanks to the Python Cookbook's chapter 1 (good work Alex and David!). I have updated test_strptime.py to check for the second of the mentioned bugs explicitly. I also commented the code and added a function that creates a PyUnit test suite with all of the tests. ---------------------------------------------------------------------- >Comment By: Martin v. Löwis (loewis) Date: 2002-08-11 19:47 Message: Logged In: YES user_id=21627 Please don't post complete files. Instead, post context (-c) or unified (-u) diffs. Ideally, produce them with "cvs diff", as this will result in patches that record the CVS version number they were for. I think it would be good to get a comment from Mikael on that patch. ---------------------------------------------------------------------- Comment By: Brett Cannon (bcannon) Date: 2002-08-11 05:31 Message: Logged In: YES user_id=357491 Just when you thought you had something done, tim_one had to go and normalize the whitespace in both _strptime.py and test_strptime.py! =) So to save Tim the time and effort of having to normalize the files again, I went ahead and applied them to the fixed files. I also reformatted test_strptime.py so that lines wrapped around 80 characters (didn't realize Guido had added it to the distro until today). So make sure to use the files that specify whitespace normalization in their descriptions.
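The second bug in the initial comment (a locale with no AM/PM strings, so that replace() ends up being called with an empty "old" value) can be illustrated with a small sketch. This is not the code from _strptime.py: the function name to_format_string and its tokens argument are hypothetical, made up for illustration, and skipping empty entries is only one plausible way to guard against the problem.

```python
def to_format_string(sample, tokens):
    """Replace locale-specific substrings in `sample` with strptime directives.

    `tokens` is a list of (locale_string, directive) pairs. A locale may
    supply an empty string (for example, no AM/PM representation), so
    empty entries are skipped rather than passed to str.replace().
    """
    result = sample
    for locale_string, directive in tokens:
        if not locale_string:
            # Nothing to look for; replacing "" would be meaningless here.
            continue
        result = result.replace(locale_string, directive)
    return result


# Hypothetical data: the first pair mimics a locale without AM/PM strings.
tokens_without_ampm = [("", "%p"), ("22", "%H"), ("44", "%M"), ("55", "%S")]
print(to_format_string("22:44:55", tokens_without_ampm))
```

The guard sits inside the loop so that callers can hand over whatever the locale reports without filtering it first.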
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=593560&group_id=5470 From noreply@sourceforge.net Sun Aug 11 21:47:00 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sun, 11 Aug 2002 13:47:00 -0700 Subject: [Patches] [ python-Patches-505705 ] Remove eval in pickle and cPickle Message-ID: Patches item #505705, was opened at 2002-01-19 10:21 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=505705&group_id=5470 Category: Core (C code) Group: None Status: Open Resolution: Out of Date Priority: 5 Submitted By: Martin v. Löwis (loewis) Assigned to: Martin v. Löwis (loewis) Summary: Remove eval in pickle and cPickle Initial Comment: This patch removes the use of eval in pickle and cPickle. It does so by: - moving the actual parsing from compile.c:parsestr to PyString_DecodeEscape - introducing a new codec string-escape - removing the code that checks that a string-to-unpickle is properly escaped throughout, and replaces this with a check whether it is properly quoted, - unquoting the string in load_string, then passing it to the codec. This fixes #502503. ---------------------------------------------------------------------- >Comment By: Martin v. Löwis (loewis) Date: 2002-08-11 22:47 Message: Logged In: YES user_id=21627 Updated to current CVS. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-11 16:09 Message: Logged In: YES user_id=6380 This would fix bug #593656 too. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-09 16:48 Message: Logged In: YES user_id=6380 I like this idea. But the patch is out of date. Can you rework the patch?
How much faster does this make the test program from http://groups.google.de/groups?hl=en&lr=&ie=UTF-8&selm=mailman.1026940226.16076.python-list%40python.org ??? ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-09 06:12 Message: Logged In: YES user_id=6380 I'll review this. ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-01-19 10:25 Message: Logged In: YES user_id=21627 BTW, this patch has #500002 as a prerequisite. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=505705&group_id=5470 From noreply@sourceforge.net Sun Aug 11 22:14:16 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sun, 11 Aug 2002 14:14:16 -0700 Subject: [Patches] [ python-Patches-505705 ] Remove eval in pickle and cPickle Message-ID: Patches item #505705, was opened at 2002-01-19 10:21 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=505705&group_id=5470 Category: Core (C code) Group: None Status: Open Resolution: Out of Date Priority: 5 Submitted By: Martin v. Löwis (loewis) >Assigned to: Guido van Rossum (gvanrossum) Summary: Remove eval in pickle and cPickle Initial Comment: This patch removes the use of eval in pickle and cPickle. It does so by: - moving the actual parsing from compile.c:parsestr to PyString_DecodeEscape - introducing a new codec string-escape - removing the code that checks that a string-to-unpickle is properly escaped throughout, and replaces this with a check whether it is properly quoted, - unquoting the string in load_string, then passing it to the codec. This fixes #502503. ---------------------------------------------------------------------- Comment By: Martin v.
Löwis (loewis) Date: 2002-08-11 22:47 Message: Logged In: YES user_id=21627 Updated to current CVS. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-11 16:09 Message: Logged In: YES user_id=6380 This would fix bug #593656 too. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-09 16:48 Message: Logged In: YES user_id=6380 I like this idea. But the patch is out of date. Can you rework the patch? How much faster does this make the test program from http://groups.google.de/groups?hl=en&lr=&ie=UTF-8&selm=mailman.1026940226.16076.python-list%40python.org ??? ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-09 06:12 Message: Logged In: YES user_id=6380 I'll review this. ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-01-19 10:25 Message: Logged In: YES user_id=21627 BTW, this patch has #500002 as a prerequisite.
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=505705&group_id=5470 From noreply@sourceforge.net Sun Aug 11 22:14:50 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sun, 11 Aug 2002 14:14:50 -0700 Subject: [Patches] [ python-Patches-588809 ] LDFLAGS support for build_ext.py Message-ID: Patches item #588809, was opened at 2002-07-30 23:36 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=588809&group_id=5470 Category: Distutils and setup.py Group: Python 2.2.x Status: Open Resolution: None Priority: 5 Submitted By: Robert Weber (chipsforbrains) >Assigned to: Nobody/Anonymous (nobody) Summary: LDFLAGS support for build_ext.py Initial Comment: a hack at best ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-08-07 10:27 Message: Logged In: YES user_id=21627 The patch looks fine to me, but I'd like to hear the opinion of a distutils guru. ---------------------------------------------------------------------- Comment By: Robert Weber (chipsforbrains) Date: 2002-08-06 21:35 Message: Logged In: YES user_id=245624 > As a hack, I think it is unacceptable for Python. > >I'd encourage you to integrate this (and CFLAGS) into >sysconfig.customize_compiler. > >It would be ok if only the Unix compiler honors those >settings for now. > Martin v. Löwis (loewis) I have written a better patch to sysconfig.py that does CFLAGS and all others so that everything works like autoconf. I will post the patch in a sec. ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-08-04 11:05 Message: Logged In: YES user_id=21627 As a hack, I think it is unacceptable for Python. I'd encourage you to integrate this (and CFLAGS) into sysconfig.customize_compiler.
It would be ok if only the Unix compiler honors those settings for now. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=588809&group_id=5470 From noreply@sourceforge.net Sun Aug 11 23:16:48 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sun, 11 Aug 2002 15:16:48 -0700 Subject: [Patches] [ python-Patches-593560 ] bugfixes and cleanup for _strptime.py Message-ID: Patches item #593560, was opened at 2002-08-10 19:01 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=593560&group_id=5470 Category: Library (Lib) Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Brett Cannon (bcannon) Assigned to: Nobody/Anonymous (nobody) Summary: bugfixes and cleanup for _strptime.py Initial Comment: Discovered two bugs in _strptime.py thanks to Mikael Sch?berg of AB Strakt; both were in LocaleTime.__calc_date_time(). One was where if a locale-specific format string represented the month without a leading zero, it would not be caught. The other bug was when a locale just lacked some information (in this case, Swedish's lack of an AM/PM representation); IndexError was thrown because string.replace() was being called with the empty string as the old value. I also took this opportunity to clean up some of the code (namely TimeRE.__getitem__() along with LocaleTime.__calc_date_time()). Added some comments, reformatted some code, etc. All of this was brought on thanks to the Python Cookbook's chapter 1 (good work Alex and David!). I have updated test_strptime.py to check for the second of the mentioned bug explicitly. I also commented the code and added a fxn that creates a PyUnit test suite with all of the tests. ---------------------------------------------------------------------- >Comment By: Brett Cannon (bcannon) Date: 2002-08-11 15:16 Message: Logged In: YES user_id=357491 Sorry, Martin. 
I thought I remembered reading somewhere that for Python files you can just post the whole thing. I will stop doing that. As for Mikael and the patch, he says that it appears to be working. I gave it to him on Tuesday and he said it appeared to be working; he has yet to say otherwise. If you prefer, I can have him post here to verify this. ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-08-11 10:47 Message: Logged In: YES user_id=21627 Please don't post complete files. Instead, post context (-c) or unified (-u) diffs. Ideally, produce them with "cvs diff", as this will result in patches that record the CVS version number they were for. I think it would be good to get a comment from Mikael on that patch. ---------------------------------------------------------------------- Comment By: Brett Cannon (bcannon) Date: 2002-08-10 20:31 Message: Logged In: YES user_id=357491 Just when you thought you had something done, tim_one had to go and normalize the whitespace in both _strptime.py and test_strptime.py! =) So to save Tim the time and effort of having to normalize the files again, I went ahead and applied them to the fixed files. I also reformatted test_strptime.py so that lines wrapped around 80 characters (didn't realize Guido had added it to the distro until today). So make sure to use the files that specify whitespace normalization in their descriptions.
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=593560&group_id=5470 From noreply@sourceforge.net Mon Aug 12 02:51:28 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sun, 11 Aug 2002 18:51:28 -0700 Subject: [Patches] [ python-Patches-505705 ] Remove eval in pickle and cPickle Message-ID: Patches item #505705, was opened at 2002-01-19 04:21 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=505705&group_id=5470 Category: Core (C code) Group: None Status: Open Resolution: Out of Date Priority: 5 Submitted By: Martin v. Löwis (loewis) >Assigned to: Martin v. Löwis (loewis) Summary: Remove eval in pickle and cPickle Initial Comment: This patch removes the use of eval in pickle and cPickle. It does so by: - moving the actual parsing from compile.c:parsestr to PyString_DecodeEscape - introducing a new codec string-escape - removing the code that checks that a string-to-unpickle is properly escaped throughout, and replaces this with a check whether it is properly quoted, - unquoting the string in load_string, then passing it to the codec. This fixes #502503. ---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-11 21:51 Message: Logged In: YES user_id=6380 Closer. - Why bother stripping triple quotes in the pickle/cPickle load code? These will never happen as a result of a pickle dump AFAIK, and the code you are replacing doesn't accept these either AFAICT. - There's something missing (the previous version of the patch had it I believe) that's needed to register the codec; as a consequence, pickle.loads() doesn't work. - escape_encode() uses repr() of a string to do the work.
But that means the outcome for embedding string quotes is confusing, because of the "smarts" in repr() that use " for surrounding quotes when there's a ' in the string, and vice versa. Thus, a single quote or a double quote is returned unquoted; but if they both occur in the same string, the single quote is quoted. I don't think that's particularly useful. Maybe there should be an underlying primitive operation that gives you a choice and which is invoked both by escape_encode() and string repr()? - I don't understand the recode_encoding stuff, but it looks like something like that was present before too. :-) ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-08-11 16:47 Message: Logged In: YES user_id=21627 Updated to current CVS. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-11 10:09 Message: Logged In: YES user_id=6380 This would fix bug #593656 too. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-09 10:48 Message: Logged In: YES user_id=6380 I like this idea. But the patch is out of date. Can you rework the patch? How much faster does this make the test program from http://groups.google.de/groups?hl=en&lr=&ie=UTF-8&selm=mailman.1026940226.16076.python-list%40python.org ??? ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-09 00:12 Message: Logged In: YES user_id=6380 I'll review this. ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-01-19 04:25 Message: Logged In: YES user_id=21627 BTW, this patch has #500002 as a prerequisite.
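Guido's remark in this thread about the quoting "smarts" in repr() can be checked with a short sketch. It uses the built-in repr() on ordinary strings in a current Python, which is assumed here to behave the same way as the string repr() under discussion; the helper name surrounding_quote is hypothetical and exists only for this illustration.

```python
def surrounding_quote(text):
    """Return the quote character that repr() chose to wrap `text` in."""
    return repr(text)[0]


only_single = "it's"
only_double = 'say "hi"'
both_kinds = "both ' and \""

for sample in (only_single, only_double, both_kinds):
    # Show which surrounding quote repr() picks for each kind of content.
    print(surrounding_quote(sample), repr(sample))
```

With only one kind of quote in the string, repr() picks the other kind as the delimiter and leaves the embedded quote alone; with both kinds present it falls back to single quotes and backslash-escapes the embedded single quote, which is the asymmetry the comment objects to.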
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=505705&group_id=5470 From noreply@sourceforge.net Mon Aug 12 03:40:40 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sun, 11 Aug 2002 19:40:40 -0700 Subject: [Patches] [ python-Patches-560379 ] Karatsuba multiplication Message-ID: Patches item #560379, was opened at 2002-05-24 21:07 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=560379&group_id=5470 Category: Core (C code) Group: Python 2.3 Status: Open >Resolution: Accepted Priority: 5 Submitted By: Christopher A. Craig (ccraig) Assigned to: Tim Peters (tim_one) Summary: Karatsuba multiplication Initial Comment: Adds Karatsuba multiplication to Python. Patches longobject.c to use Karatsuba multiplication in place of gradeschool math. ---------------------------------------------------------------------- >Comment By: Tim Peters (tim_one) Date: 2002-08-11 22:40 Message: Logged In: YES user_id=31435 Thanks! I checked in some code building on this. Changes included: + Adjusted whitespace to meet the standard (spaces after "if" and "for", flanking binary operators, etc). + The refcount fiddling in x_mul caused assorted system crashes if KeyboardInterrupt was raised during a multiply. Repaired that. + More comments and asserts. + Removed k_join and built "the answer" piecemeal into the result object in k_mul. This allows to free more chunks of memory sooner, reducing highwater mark and the probable size of the working set. Since the cache behavior is quite different now, it would be cool if you could run your tuning tests again. The cutoff value is now a #define, KARATSUBA_CUTOFF near the top of longobject.c. 
Until I can make time for more thorough testing, k_mul isn't called by default: multiplication invokes k_mul if and only if an environment variable named KARAT exists (its value is irrelevant; just its existence matters). ---------------------------------------------------------------------- Comment By: Christopher A. Craig (ccraig) Date: 2002-07-09 18:43 Message: Logged In: YES user_id=135050 I've brought the code into compliance with the coding standards in PEP 7, and added some comments that I thought were in line with the rest of the file. If there is something else you would like me to do, please tell me. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-06-05 17:38 Message: Logged In: YES user_id=6380 Tim thinks this is cool, but the code can use cleanup and comments. Also, let's not add platform specific hacks (Christian can sell those as an add-on :-). ---------------------------------------------------------------------- Comment By: Christopher A. Craig (ccraig) Date: 2002-05-25 19:41 Message: Logged In: YES user_id=135050 I made the changes needed to split on the bigger number (basically changed to split on the bigger number, and changed all of the places that need to check to see if there are no bits left), and the new one is a little bit faster, so I'm uploading it too. I had been thinking about fixed precision numbers when I wrote it, so I honestly didn't consider the fact that I could just shift the smaller number to 0 and throw it away... :-) ---------------------------------------------------------------------- Comment By: Christopher A. Craig (ccraig) Date: 2002-05-25 12:16 Message: Logged In: YES user_id=135050 I just uploaded a graph with some sample timings in it. Red is a fence of 20. Green is a fence of 40. Blue is a fence of 60. Black is done with unmodified Python 2.2.1.
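The ideas discussed in this thread (splitting on the larger operand, falling back to the ordinary multiply below a fence, and enabling the new path only when the KARAT environment variable exists) can be sketched in Python. This is an illustration on Python ints, not the k_mul code in longobject.c: the names karatsuba, multiply and CUTOFF_BITS are hypothetical, the cutoff value is an arbitrary placeholder rather than the tuned fence, and the built-in * stands in for the gradeschool routine.

```python
import os

# Hypothetical fence, in bits. Not the tuned value from the patch.
CUTOFF_BITS = 600


def karatsuba(a, b):
    """Multiply two non-negative ints, splitting on the larger operand."""
    if a < 0 or b < 0:
        raise ValueError("this sketch handles non-negative ints only")
    if min(a.bit_length(), b.bit_length()) <= CUTOFF_BITS:
        # Small (or zero) operand: the built-in multiply stands in for
        # the gradeschool method here.
        return a * b
    # Split both numbers at half the size of the larger one. If the
    # smaller number has no high part, that part is simply 0.
    half = max(a.bit_length(), b.bit_length()) // 2
    mask = (1 << half) - 1
    a_hi, a_lo = a >> half, a & mask
    b_hi, b_lo = b >> half, b & mask
    high = karatsuba(a_hi, b_hi)
    low = karatsuba(a_lo, b_lo)
    # Three recursive multiplies instead of four.
    cross = karatsuba(a_hi + a_lo, b_hi + b_lo) - high - low
    return (high << (2 * half)) + (cross << half) + low


def multiply(a, b):
    """Take the Karatsuba path only if the KARAT variable exists."""
    if "KARAT" in os.environ:
        return karatsuba(a, b)
    return a * b


x = 3 ** 5000 + 12345
y = 7 ** 3000 + 1
print(karatsuba(x, y) == x * y)
```

Checking only for the existence of KARAT mirrors the description above, where the variable's value is irrelevant.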
---------------------------------------------------------------------- Comment By: Christopher A. Craig (ccraig) Date: 2002-05-25 01:53 Message: Logged In: YES user_id=135050 I got 40 from testing. Basically I generated 250 random numbers each for a series of sizes between 5 and 2990 bits long at 15 bit intervals (i.e. the word size), and stored them in a dictionary. Then timed 249 multiplies at each size for a bunch of fence values and used gdchart to make a pretty graph. It certainly could be optimized better per compiler/platform, but I don't know how much gain you'd see. I split on the smaller number because I guessed it would be better. My thought was that if I split on the smaller number I'm guaranteed to reach the fence, at which point I can use the gradeschool method at a near linear cost (since it's O(n*m) and one of those two is at most the fence size). If I split on the larger number, I may run into a condition where the smaller number is less than half the larger, but I haven't reached the fence yet, and then gradeschool could be much more expensive. ---------------------------------------------------------------------- Comment By: Christian Tismer (tismer) Date: 2002-05-24 23:23 Message: Logged In: YES user_id=105700 Hmm, not bad. Q: You set the split fence at 40. Where does this number come from? I think this could be optimized per compiler/platform. You say that you split based on the smaller number. Why this? My intuitive guess would certainly be to always split on the larger number. I just checked my Python implementation which does this. Open question: how to handle very small by very long the best way? Probably the highschool version is better here, and that might have led you to investigate the smaller one. I'd say both should be checked. Good work!
- cheers chris ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=560379&group_id=5470 From noreply@sourceforge.net Mon Aug 12 04:36:54 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sun, 11 Aug 2002 20:36:54 -0700 Subject: [Patches] [ python-Patches-560379 ] Karatsuba multiplication Message-ID: Patches item #560379, was opened at 2002-05-24 21:07 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=560379&group_id=5470 Category: Core (C code) Group: Python 2.3 Status: Open Resolution: Accepted Priority: 5 Submitted By: Christopher A. Craig (ccraig) Assigned to: Tim Peters (tim_one) Summary: Karatsuba multiplication Initial Comment: Adds Karatsuba multiplication to Python. Patches longobject.c to use Karatsuba multiplication in place of gradeschool math. ---------------------------------------------------------------------- >Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-11 23:36 Message: Logged In: YES user_id=33168 Tim, did you want to leave this open? ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-08-11 22:40 Message: Logged In: YES user_id=31435 Thanks! I checked in some code building on this. Changes included: + Adjusted whitespace to meet the standard (spaces after "if" and "for", flanking binary operators, etc). + The refcount fiddling in x_mul caused assorted system crashes if KeyboardInterrupt was raised during a multiply. Repaired that. + More comments and asserts. + Removed k_join and built "the answer" piecemeal into the result object in k_mul. This allows to free more chunks of memory sooner, reducing highwater mark and the probable size of the working set. Since the cache behavior is quite different now, it would be cool if you could run your tuning tests again. 
The cutoff value is now a #define, KARATSUBA_CUTOFF near the top of longobject.c. Until I can make time for more thorough testing, k_mul isn't called by default: multiplication invokes k_mul if and only if an environment variable named KARAT exists (its value is irrelevant; just its existence matters). ---------------------------------------------------------------------- Comment By: Christopher A. Craig (ccraig) Date: 2002-07-09 18:43 Message: Logged In: YES user_id=135050 I've brought the code into compliance with the coding standards in PEP 7, and added some comments that I thought were in line with the rest of the file. If there is something else you would like me to do, please tell me. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-06-05 17:38 Message: Logged In: YES user_id=6380 Tim thinks this is cool, but the code can use cleanup and comments. Also, let's not add platform specific hacks (Christian can sell those as an add-on :-). ---------------------------------------------------------------------- Comment By: Christopher A. Craig (ccraig) Date: 2002-05-25 19:41 Message: Logged In: YES user_id=135050 I made the changes needed to split on the bigger number (basically changed to split on the bigger number, and changed all of the places that need to check to see if there are no bits left), and the new one is a little bit faster, so I'm uploading it too. I had been thinking about fixed precision numbers when I wrote it, so I honestly didn't consider the fact that I could just shift the smaller number to 0 and throw it away... :-) ---------------------------------------------------------------------- Comment By: Christopher A. Craig (ccraig) Date: 2002-05-25 12:16 Message: Logged In: YES user_id=135050 I just uploaded a graph with some sample timings in it. Red is a fence of 20. Green is a fence of 40. Blue is a fence of 60. Black is done with unmodified Python 2.2.1.
---------------------------------------------------------------------- Comment By: Christopher A. Craig (ccraig) Date: 2002-05-25 01:53 Message: Logged In: YES user_id=135050 I got 40 from testing. Basically I generated 250 random numbers each for a series of sizes between 5 and 2990 bits long at 15 bit intervals (i.e. the word size), and stored them in a dictionary. Then timed 249 multiplies at each size for a bunch of fence values and used gdchart to make a pretty graph. It certainly could be optimized better per compiler/platform, but I don't know how much gain you'd see. I split on the smaller number because I guessed it would be better. My thought was that if I split on the smaller number I'm guaranteed to reach the fence, at which point I can use the gradeschool method at a near linear cost (since it's O(n*m) and one of those two is at most the fence size). If I split on the larger number, I may run into a condition where the smaller number is less than half the larger, but I haven't reached the fence yet, and then gradeschool could be much more expensive. ---------------------------------------------------------------------- Comment By: Christian Tismer (tismer) Date: 2002-05-24 23:23 Message: Logged In: YES user_id=105700 Hmm, not bad. Q: You set the split fence at 40. Where does this number come from? I think this could be optimized per compiler/platform. You say that you split based on the smaller number. Why this? My intuitive guess would certainly be to always split on the larger number. I just checked my Python implementation which does this. Open question: how to handle very small by very long the best way? Probably the highschool version is better here, and that might have led you to investigate the smaller one. I'd say both should be checked. Good work!
- cheers chris ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=560379&group_id=5470 From noreply@sourceforge.net Mon Aug 12 05:19:04 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sun, 11 Aug 2002 21:19:04 -0700 Subject: [Patches] [ python-Patches-560379 ] Karatsuba multiplication Message-ID: Patches item #560379, was opened at 2002-05-24 21:07 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=560379&group_id=5470 Category: Core (C code) Group: Python 2.3 Status: Open Resolution: Accepted Priority: 5 Submitted By: Christopher A. Craig (ccraig) Assigned to: Tim Peters (tim_one) Summary: Karatsuba multiplication Initial Comment: Adds Karatsuba multiplication to Python. Patches longobject.c to use Karatsuba multiplication in place of gradeschool math. ---------------------------------------------------------------------- >Comment By: Tim Peters (tim_one) Date: 2002-08-12 00:19 Message: Logged In: YES user_id=31435 Yes, until the new algorithm is enabled w/o the envar trickery. ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-11 23:36 Message: Logged In: YES user_id=33168 Tim, did you want to leave this open? ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-08-11 22:40 Message: Logged In: YES user_id=31435 Thanks! I checked in some code building on this. Changes included: + Adjusted whitespace to meet the standard (spaces after "if" and "for", flanking binary operators, etc). + The refcount fiddling in x_mul caused assorted system crashes if KeyboardInterrupt was raised during a multiply. Repaired that. + More comments and asserts. + Removed k_join and built "the answer" piecemeal into the result object in k_mul. 
This allows to free more chunks of memory sooner, reducing highwater mark and the probable size of the working set. Since the cache behavior is quite different now, it would be cool if you could run your tuning tests again. The cutoff value is now a #define, KARATSUBA_CUTOFF near the top of longobject.c. Until I can make time for more thorough testing, k_mul isn't called by default: multiplication invokes k_mul if and only if an environment variable named KARAT exists (its value is irrelevant; just its existence matters). ---------------------------------------------------------------------- Comment By: Christopher A. Craig (ccraig) Date: 2002-07-09 18:43 Message: Logged In: YES user_id=135050 I've brought the code into compliance with the coding standards in PEP 7, and added some comments that I thought were in line with the rest of the file. If there is something else you would like me to do, please tell me. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-06-05 17:38 Message: Logged In: YES user_id=6380 Tim thinks this is cool, but the code can use cleanup and comments. Also, let's not add platform specific hacks (Christian can sell those as an add-on :-). ---------------------------------------------------------------------- Comment By: Christopher A. Craig (ccraig) Date: 2002-05-25 19:41 Message: Logged In: YES user_id=135050 I made the changes needed to split on the bigger number (basically changed to split on the bigger number, and changed all of the places that need to check to see if there are no bits left), and the new one is a little bit faster, so I'm uploading it too. I had been thinking about fixed precision numbers when I wrote it, so I honestly didn't consider the fact that I could just shift the smaller number to 0 and throw it away... :-) ---------------------------------------------------------------------- Comment By: Christopher A.
Craig (ccraig) Date: 2002-05-25 12:16 Message: Logged In: YES user_id=135050 I just uploaded a graph with some sample timings in it. Red is a fence of 20. Green is a fence of 40. Blue is a fence of 60. Black is done with unmodified Python 2.2.1. ---------------------------------------------------------------------- Comment By: Christopher A. Craig (ccraig) Date: 2002-05-25 01:53 Message: Logged In: YES user_id=135050 I got 40 from testing. Basically I generated 250 random numbers each for a series of sizes between 5 and 2990 bits long at 15 bit intervals (i.e. the word size), and stored them in a dictionary. Then timed 249 multiplies at each size for a bunch of fence values and used gdchart to make a pretty graph. It certainly could be optimized better per compiler/platform, but I don't know how much gain you'd see. I split on the smaller number because I guessed it would be better. My thought was that if I split on the smaller number I'm guaranteed to reach the fence, at which point I can use the gradeschool method at a near linear cost (since it's O(n*m) and one of those two is at most the fence size). If I split on the larger number, I may run into a condition where the smaller number is less than half the larger, but I haven't reached the fence yet, and then gradeschool could be much more expensive. ---------------------------------------------------------------------- Comment By: Christian Tismer (tismer) Date: 2002-05-24 23:23 Message: Logged In: YES user_id=105700 Hmm, not bad. Q: You set the split fence at 40. Where does this number come from? I think this could be optimized per compiler/platform. You say that you split based on the smaller number. Why this? My intuitive guess would certainly be to always split on the larger number. I just checked my Python implementation which does this. Open question: how to handle very small by very long the best way?
Probably the highschool version is better here, and that might have led you to investigate the smaller one. I'd say both should be checked.

good work! - cheers chris

----------------------------------------------------------------------

You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=560379&group_id=5470

From noreply@sourceforge.net Mon Aug 12 08:22:25 2002
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Mon, 12 Aug 2002 00:22:25 -0700
Subject: [Patches] [ python-Patches-588982 ] Mindless editing, DL_EXPORT/IMPORT
Message-ID:

Patches item #588982, was opened at 2002-07-31 17:47
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=588982&group_id=5470

Category: None
Group: None
>Status: Closed
>Resolution: Fixed
Priority: 5
Submitted By: Kalle Svensson (krftkndl)
Assigned to: Mark Hammond (mhammond)
Summary: Mindless editing, DL_EXPORT/IMPORT

Initial Comment:
In http://mail.python.org/pipermail/python-dev/2002-July/027136.html, Mark Hammond asked for a patch substituting Py_MODINIT_FUNC for DL_EXPORT(void) in Modules/*.c and PyAPI_FUNC/DATA for DL_IMPORT in Include/*.h. Since I'm a sucker for easy fame and fortune, here it is.

----------------------------------------------------------------------

>Comment By: Mark Hammond (mhammond)
Date: 2002-08-12 17:22

Message:
Logged In: YES
user_id=14198

Thanks!
----------------------------------------------------------------------

You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=588982&group_id=5470

From noreply@sourceforge.net Mon Aug 12 08:43:56 2002
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Mon, 12 Aug 2002 00:43:56 -0700
Subject: [Patches] [ python-Patches-593627 ] Static names
Message-ID:

Patches item #593627, was opened at 2002-08-11 12:41
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=593627&group_id=5470

Category: Core (C code)
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Oren Tirosh (orenti)
Assigned to: Nobody/Anonymous (nobody)
Summary: Static names

Initial Comment:
This patch creates static string objects for all built-in names and interns them on initialization. The macro PyNAME is used to access static names. PyNAME(__spam__) is equivalent to PyString_InternFromString("__spam__") but is a constant expression. It requires the name to be one of the built-in names. A linker error will be generated if it isn't.

Most conversions of C strings into temporary string objects can be eliminated (PyString_FromString, PyString_InternFromString).

Most string comparisons at runtime can also be eliminated. Instead of:

    if (strcmp(PyString_AsString(name), "__spam__")) ...

This code can be used:

    PyString_INTERN(name)
    if (name == PyNAME(__spam__)) ...

Where PyString_INTERN is a fast inline check if the string is already interned (and it usually is). To prevent unbounded accumulation of interned strings the mortal interned string patch should also be applied.

The patch converts most of the builtin module to this new mode as an example.

----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2002-08-12 09:43

Message:
Logged In: YES
user_id=21627

What is the rationale for this patch?
If it is for performance, what real speed improvements can you report? ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-11 17:54 Message: Logged In: YES user_id=33168 I'd like to see the releasing interned string patch applied. I think it's almost ready, isn't it? It would make patching easier and seems to be a good idea. For me, the easiest way to produce patches is to use cvs. You can keep multiple cvs trees around easy enough (for having multiple overlapping/independant patches). To create patches with cvs: cvs diff -C 5 [file1] [file2] .... ---------------------------------------------------------------------- Comment By: Oren Tirosh (orenti) Date: 2002-08-11 17:44 Message: Logged In: YES user_id=562624 Ok, I'll fix the problems with the patch. What's the best way to produce a patch that adds new files? Static string objects cannot be released, of course. This patch will eventually depend on on the mortal interned strings patch to fix it but in the meantime I just disabled releasing interned strings because I want to keep the two patches independent. The next version will add a new PyArg_ParseTuple format for an interned string to make it easier to use. ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-11 16:46 Message: Logged In: YES user_id=33168 Couple of initial comments: * this is a reverse patch * it seems like there are other changes in here - int ob_shash -> long - releasing interned strings? * dictobject.c is removed? * including python headers should use "" not <> Oren, could you generate a new patch with only the changes to support PyNAME? Thanks! 
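
The idea behind PyNAME discussed in this thread (resolve a name to one canonical string object, then compare by identity rather than with strcmp) can be illustrated outside CPython. What follows is only a minimal sketch under stated assumptions, not the patch's code: intern(), init_names(), is_spam() and name_spam are hypothetical stand-ins for PyString_InternFromString, interpreter start-up, a call site and PyNAME(__spam__), and the fixed-size table is chosen purely for brevity.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for illustration only; none of this is CPython API. */
#define MAX_NAMES 64

static const char *table[MAX_NAMES];   /* canonical copy of each distinct name */
static size_t n_names = 0;

/* Return the canonical pointer for s, storing a private copy the first time
 * the text is seen. Returns NULL if the table is full or malloc fails. */
static const char *intern(const char *s)
{
    for (size_t i = 0; i < n_names; i++) {
        if (strcmp(table[i], s) == 0)
            return table[i];
    }
    if (n_names == MAX_NAMES)
        return NULL;
    size_t len = strlen(s) + 1;
    char *copy = malloc(len);
    if (copy == NULL)
        return NULL;
    memcpy(copy, s, len);
    table[n_names++] = copy;
    return copy;
}

/* Plays the role of PyNAME(__spam__): resolved once, then reused. */
static const char *name_spam;

static void init_names(void)
{
    name_spam = intern("__spam__");
}

/* With an interned argument, equality of text is equality of pointers,
 * so no strcmp is needed at the call site. */
static int is_spam(const char *interned_name)
{
    return interned_name == name_spam;
}
```

In this sketch a caller would run init_names() once and pass every incoming name through intern() before calling is_spam(); a string that has not been interned compares unequal even when its text matches, which is the case the PyString_INTERN step in the initial comment is meant to cover.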
----------------------------------------------------------------------

You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=593627&group_id=5470

From noreply@sourceforge.net Mon Aug 12 10:08:58 2002
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Mon, 12 Aug 2002 02:08:58 -0700
Subject: [Patches] [ python-Patches-495688 ] Make site.py more friendly to PDAs
Message-ID:

Patches item #495688, was opened at 2001-12-21 01:37
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=495688&group_id=5470

Category: Library (Lib)
Group: Python 2.3
>Status: Closed
>Resolution: Rejected
Priority: 5
Submitted By: Phil Thompson (philthompson)
Assigned to: Nobody/Anonymous (nobody)
Summary: Make site.py more friendly to PDAs

Initial Comment:
site.py requires distutils and pydoc which are both unfriendly to devices with little memory like PDAs. This patch makes site.py cope with distutils and pydoc not being installed.

----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2002-08-12 11:08

Message:
Logged In: YES
user_id=21627

I like the pydoc half less: the problem only happens if you actually invoke help, and invoking it without this patch produces an exception that should give a clear indication of what the problem is.

So I reject the entire patch; if you want formal support for a subsetted Python, you should probably write a Python-for-PDA PEP which defines the subset clearly.

----------------------------------------------------------------------

Comment By: Michael Hudson (mwh)
Date: 2002-01-17 19:01

Message:
Logged In: YES
user_id=6656

Hmm. Semi-approve of the pydoc change. The distutils change seems pointless though -- you're not likely to build Python on a PDA anytime soon, are you? Or are folders called Modules/ very common on PDAs?
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=495688&group_id=5470 From noreply@sourceforge.net Mon Aug 12 10:10:27 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 12 Aug 2002 02:10:27 -0700 Subject: [Patches] [ python-Patches-505846 ] pyport.h, Wince and errno getter/setter Message-ID: Patches item #505846, was opened at 2002-01-19 21:13 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=505846&group_id=5470 Category: Core (C code) Group: None Status: Open Resolution: None Priority: 5 Submitted By: Brad Clements (bkc) Assigned to: Nobody/Anonymous (nobody) Summary: pyport.h, Wince and errno getter/setter Initial Comment: Most of the remaining Windows CE diffs are due to the lack of errno on Windows CE. There are other OS's that do not have errno (but they have a workalike method). At first I had simply commented out all references in the code to errno, but that quickly became unworkable. Wince and NetWare use a function to set the per- thread "errno" value. Although errno #defines (like ERANGE) are not defined for Wince, they are defined for NetWare. Removing references to errno would artificially hobble the NetWare port. These platforms also use a function to retrieve the current errno value. The attached diff for pyport.h attempts to standardize the access method for errno (or it's work-alike) by providing SetErrno(), ClearErrno() and GetErrno() macros. ClearErrno() is SetErrno(0) I've found and changed all direct references to errno to use these macros. This patch must be submitted before the patches for other modules. -- I see two negatives with this approach: 1. It will be a pain to think GetErrno() instead of "errno" when modifying/writing new code. 2. Extension modules will need access to pyport.h for the target platform. 
In the worst case, directly referencing errno instead of using the macros will break only those platforms for which the macros do something different. That is, Wince and NetWare. -- An alternative spelling/capitalization of these macros might make them more appealing. Feel free to make a suggestion. -- It's probably incorrect for me to use SetErrno() as a function, such as SetErrno(1); I think the semi-colon is not needed, but wasn't entirely certain. On better advice, I will fix these statements in the remaining source files if this patch is accepted. ---------------------------------------------------------------------- >Comment By: Martin v. L�wis (loewis) Date: 2002-08-12 11:10 Message: Logged In: YES user_id=21627 Any chances that updates to this patch are forthcoming? If not, it will be rejected by October 1. ---------------------------------------------------------------------- Comment By: Martin v. L�wis (loewis) Date: 2002-06-04 19:33 Message: Logged In: YES user_id=21627 The patch requires further surgery: What is DONT_HAVE_TIME_H? If you want to test for presence of , you need to add HAVE_TIME_H to the autoconf machinery, and all manually-maintained copies of pyconfig.h. Including just the configure.in changes is fine; no need to include changes to generated files. ---------------------------------------------------------------------- Comment By: Jack Jansen (jackjansen) Date: 2002-04-19 16:47 Message: Logged In: YES user_id=45365 Brad, I think this patch might be asking for too much. You're asking that all accesses to errno be replaced by GetErrno() or SetErrno() calls, really... And for many cases there is a workaround, where you don't have to change user code (i.e. the normal C code still uses what it thinks is an errno variable). On my system errno is #define errno (*__error()) and the __error() routine returns a pointer to the errno-variable for the current thread. 
For the GetErrno function this would be good enough, and with a bit of effort you could probably get it to work for the Set function too (possibly by doing the actual Set work in the next Get call). ---------------------------------------------------------------------- Comment By: Brad Clements (bkc) Date: 2002-02-12 00:17 Message: Logged In: YES user_id=4631 Hi folks, I need to proceed with the port to NetWare so I have something to demo at Brainshare in March. Unfortunately future patches from me will include both WINCE and NetWare specific patches, though hopefully there won't be much other than config.h and this patch (which is required for NetWare). Is there anything I can do to make this patch more acceptable? Send a bottle of wine, perhaps? ;-) ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-01-29 00:39 Message: Logged In: YES user_id=33168 Tim, I can check in or do whatever else needs to be done to check this in and move this forward. How do you want to procede? Brad, I think most people are pretty busy right now. ---------------------------------------------------------------------- Comment By: Brad Clements (bkc) Date: 2002-01-29 00:19 Message: Logged In: YES user_id=4631 Hi folks, just wondering if this patch is going to be rejected, or if you're all too busy and I have to be more patient ;-) I have a passle of Python-CE folks waiting on me to finish checking in patches. This is the worst one, I promise! Let me know what you want me to do, when you get a chance. Thanks ---------------------------------------------------------------------- Comment By: Brad Clements (bkc) Date: 2002-01-20 21:17 Message: Logged In: YES user_id=4631 I've eliminated Py_ClearErrno() and updated all the source to use Py_SetErrno(0). 
Attached is an updated diff for pyport.h ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-01-20 20:21 Message: Logged In: YES user_id=31435 Brad, errno is required by ANSI C, which also defines the semantics of a 0 value. Setting errno to 0, and taking errno==0 as meaning "no error", are 100% portable across platforms with a standard-conforming C implementation. If this platform doesn't support standard C, I have to question whether the core should even try to cater to it: the changes needed make no sense to C programmers, so may become a maintenance nightmare. I don't think putting a layer around errno is going to be hard to live with, provided that it merely tries to emulate standard behavior. For that reason, setting errno to 0 is correct, but inventing a new ClearErrno concept is wrong (the latter makes no sense to anyone except its inventor ). ---------------------------------------------------------------------- Comment By: Brad Clements (bkc) Date: 2002-01-20 16:54 Message: Logged In: YES user_id=4631 I can post a new diff for the // or would you be willing to just change the patch you have? I cannot use the same macros for Py_SET_ERANGE_IF_OVERFLOW (X) because Wince doesn't have ERANGE. You'll note the use of Py_SetErrno(1) which is frankly bogus. This is related to your comment on Py_ClearErrno() Using (errno == 0) as meaning "no error" seems to me to be a python source "convention" forced on it by (mostly) floating point side effects. Because the math routines are indicating overflow errors through the side effect of setting errno (rather than returning an explicit NaN that works on all platforms), we must set errno = 0 before calling these math functions. I suppose it's possible that on some platform "clearing the last error value" wouldn't be done this way, but rather might be an explicit function call. 
Since I was going through the source looking for all errno's, I felt it was clearer to say Py_ClearErrno() rather than Py_SetErrno(0), even though in the end they do the same thing on currently supported platforms.

I'm easy, if you want to replace Py_ClearErrno() with Py_SetErrno(0) I can do that too.

--

Regarding goto targets: is it likely that "cleanup" might also collide with local variables? Would _cleanup or __cleanup work for you?

----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2002-01-19 23:57

Message:
Logged In: YES
user_id=33168

Need to change the // comment to /* */. gcc accepts this for C, but it's non-standard (at least it was, it may have changed in C99).

You can have 1 Py_SET_ERANGE_IF_OVERFLOW for both platforms if you do this:

    #ifndef ERANGE
    #define ERANGE 1
    #endif

    #define Py_SET_ERANGE_IF_OVERFLOW(X) \
        do { \
            if (Py_GetErrno() == 0 && ((X) == Py_HUGE_VAL || \
                                       (X) == -Py_HUGE_VAL)) \
                Py_SetErrno(ERANGE); \
        } while(0)

I'm not sure of the usefulness of Py_ClearErrno(), since it's the same on all platforms. If errno might be set to something other than 0 in the future, it would be good to make the change now.

I would suggest changing finally to cleanup.

----------------------------------------------------------------------

Comment By: Brad Clements (bkc)
Date: 2002-01-19 22:47

Message:
Logged In: YES
user_id=4631

Here is an amended diff with the suggested changes. I've tested the semi-colon handling on EVT, it works as suggested.

--

Question: What is the preferred style, #ifdef xyz or #if defined(xyz)? I try to use #ifdef xyz, but sometimes there are multiple possibilities and #if defined(x) || defined(y) is needed. Is that okay?

--

Upcoming issue (hoping you address in your reply). There are many "goto finally" statements in various modules. Unfortunately EVT treats "finally" as a reserved word, even when compiling in non C++ mode. Also, Metrowerks does the same.
I've changed all of these to "goto my_finally" as a quick work-around. I know "my_finally" sounds yucky, what's your recommendation for this? ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-01-19 21:52 Message: Logged In: YES user_id=31435 All identifiers defined in pyport.h must begin with "Py_". pyport.h is (and must be) #include'd by extension modules, and we need the prefix to avoid stomping on their namespace, and to make clear (to them and to us) that the gimmicks are part of Python's portability layer. A name like "SetErrno" is certain to conflict with some other package's attempt to worm around errno problems; Py_SetErrno () is not. Agree with Neal's final suggestion about dealing with semicolons. ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-01-19 21:28 Message: Logged In: YES user_id=33168 Typically, the semi-colon problem is dealt with as in Py_SET_ERANGE_IF_OVERFLOW. So, #define SetErrno(X) do { SetLastError(X); } while (0) I don't think (but can't remember if) there is any problem for single statements like you have. 
You could probably do:

    #ifndef MS_WINCE
    #define SetErrno(X) errno = (X)       /* note no ; */
    #else
    #define SetErrno(X) SetLastError(X)   /* note no ; */
    #endif

----------------------------------------------------------------------

You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=505846&group_id=5470

From noreply@sourceforge.net Mon Aug 12 10:15:43 2002
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Mon, 12 Aug 2002 02:15:43 -0700
Subject: [Patches] [ python-Patches-511219 ] suppress type restrictions on locals()
Message-ID:

Patches item #511219, was opened at 2002-01-31 15:55
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=511219&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Cesar Douady (douady)
Assigned to: Nobody/Anonymous (nobody)
Summary: suppress type restrictions on locals()

Initial Comment:
This patch suppresses the restriction that global and local dictionaries do not access overloaded __getitem__ and __setitem__ if passed an object derived from class dict.

An exception is made for the builtin insertion and reference in the global dict to make sure this object exists and to suppress the need for the derived class to take care of this implementation dependent detail.

The behavior of eval and exec has been updated for code objects which have the CO_NEWLOCALS flag set: if explicitly passed a local dict, a new local dict is not generated. This allows one to pass an explicit local dict to the code object of a function (which otherwise cannot be achieved). If this cannot be done for backward compatibility problems, then an alternative would consist in using the "new" module to create a code object from a function with CO_NEWLOCALS reset but it seems logical to me to use the information explicitly provided.

Free and cell variables are not managed in this version.
If the patch is accepted, I am willing to finish the job and implement free and cell variables, but this requires a serious rework of the Cell object: free variables should be accessed using the method of the dict in which they relies and today, this dict is not accessible from the Cell object. Robustness : Currently, the plain test suite passes (with a modification of test_desctut which precisely verifies that the suppressed restriction is enforced). I have introduced a new test (test_subdict.py) which verifies the new behavior. Because of performance, the plain case (when the local dict is a plain dict) is optimized so that differences in performance are not measurable (within 1%) when run on the test suite (i.e. I timed make test). ---------------------------------------------------------------------- >Comment By: Martin v. L�wis (loewis) Date: 2002-08-12 11:15 Message: Logged In: YES user_id=21627 What is the status of this patch? Could you find people who are interested in using this feature? If not, I'm tempted to reject it. ---------------------------------------------------------------------- Comment By: Cesar Douady (douady) Date: 2002-04-02 20:08 Message: Logged In: YES user_id=428521 Well, I think I am in sync now. 1/ I did take you initial comment as meaning the patch could not be applied to 2.2.x 2/ I decided to generate a new patch to be applied to 2.2.1c2 3/ I realized that the patch could be applied as is 4/ I was lost 5/ I realized the meaning of the group was the one you just mentioned. 6/ I decided to post the result of my trial anyway so people could confidently apply the patch the lastest release (specially because patch outputs some warnings). 7/ I did not understand this place could actually be used as a forum (i.e. reply to previous post rather than general info). Let me apologize for my previous misunderstandings. 
about compatibility : I did not find a way to make it backward binary compatible, however my intent is to make it source compatible for extensions. ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-04-02 16:26 Message: Logged In: YES user_id=6656 So what? Maybe you misunderstand me. This patch was in the group "Python 2.2.x", which is the group we use for patches that are under consideration for being put into a 2.2.x release of Python (or in other words, a bugfix release of Python 2.2). This patch is not going to go into a bugfix release of Python 2.2 for at least two reasons: (1) it adds what is arguably a new feature and (2) it's big and complicated and so might cause bugs. And now I've actually looked at the patch, it has even less chance: it would break binary compaitibilty of extensions. So while I'm not against the patch in general (looks good, from an eyballing), it doesn't belong in the 2.2.x group. ---------------------------------------------------------------------- Comment By: Cesar Douady (douady) Date: 2002-04-02 13:21 Message: Logged In: YES user_id=428521 I successfully applied the patch as is to revision 2.2.1c2 with the following output (and then the same procedure as mentioned for patching revision 2.2) : patching file Include/dictobject.h patching file Include/frameobject.h patching file Include/object.h patching file Lib/test/test_descrtut.py patching file Lib/test/test_subdict.py patching file Modules/cPickle.c patching file Objects/classobject.c patching file Objects/frameobject.c patching file Python/ceval.c Hunk #2 succeeded at 1534 (offset 3 lines). Hunk #4 succeeded at 1613 (offset 3 lines). Hunk #6 succeeded at 1655 (offset 3 lines). Hunk #8 succeeded at 1860 (offset 3 lines). Hunk #10 succeeded at 1889 (offset 3 lines). Hunk #12 succeeded at 2635 (offset 3 lines). Hunk #14 succeeded at 2893 (offset 3 lines). Hunk #16 succeeded at 3038 (offset 3 lines). 
Hunk #18 succeeded at 3657 (offset 3 lines). Hunk #20 succeeded at 3722 (offset 3 lines). patching file Python/compile.c Hunk #1 succeeded at 2916 (offset 12 lines). patching file Python/import.c Hunk #1 succeeded at 1668 (offset -4 lines). Hunk #3 succeeded at 1716 (offset -4 lines). patching file Python/sysmodule.c Hunk #1 succeeded at 238 (offset -4 lines). ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-03-30 12:27 Message: Logged In: YES user_id=6656 And there's precisely no way it's going into 2.2.x. ---------------------------------------------------------------------- Comment By: Cesar Douady (douady) Date: 2002-03-30 01:08 Message: Logged In: YES user_id=428521 to install this patch from python revision 2.2, follow these steps : - get the python.diff file from this page - cd Python-2.2 - run "patch -p1 Patches item #594001, was opened at 2002-08-12 13:33 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=594001&group_id=5470 Category: Windows Group: None Status: Open Resolution: None Priority: 5 Submitted By: Martin v. L�wis (loewis) Assigned to: Nobody/Anonymous (nobody) Summary: PEP 277: Unicode file name support Initial Comment: This patch is in an updated version of the patch [1] mentioned in the PEP. In addition to merging it with the CVS, it fixes a few formatting problems. 
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=594001&group_id=5470 From noreply@sourceforge.net Mon Aug 12 12:53:45 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 12 Aug 2002 04:53:45 -0700 Subject: [Patches] [ python-Patches-588564 ] _locale library patch Message-ID: Patches item #588564, was opened at 2002-07-30 05:58 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=588564&group_id=5470 Category: Distutils and setup.py Group: None Status: Open Resolution: None Priority: 5 Submitted By: Jason Tishler (jlt63) Assigned to: Jason Tishler (jlt63) Summary: _locale library patch Initial Comment: This patch enables setup.py to find gettext routines when they are located in libintl instead of libc. Although I developed this patch for Cygwin, I hope that it can be easily updated to support other platforms (if necessary). I tested this patch under Cygwin and Red Hat Linux 7.1. ---------------------------------------------------------------------- >Comment By: Jason Tishler (jlt63) Date: 2002-08-12 03:53 Message: Logged In: YES user_id=86216 I'm a strong proponent of doing the right thing. The unfortunate reality is that I'm way overcommitted right now. Hence, I don't have the spare cycles to figure out the best way to accomplish this "major undertaking" (to quote you). Additionally, one of the distutils developers could do a better job with much less effort than I could. Please reconsider my original patch. I'm willing to change it (in the future) to be more autoconf-like if someone else is willing to do the underlying work. ---------------------------------------------------------------------- Comment By: Martin v. L�wis (loewis) Date: 2002-08-09 15:49 Message: Logged In: YES user_id=21627 Yes, that's what I meant. It eventually results in distutils getting some of the capabilities of autoconf. 
I agree this is a major undertaking, but one that I think needs to progress over time, in small steps. For the current problem, it might be useful to emulate AC_TRY_LINK: generate a small program, and see whether the compiler manages to link it. You probably need to allow for result-caching as well; I recommend to put the cache file into build/temp.. This may all sound very ad-hoc, but I think it can be made to work with reasonable effort. We probably need to present any results to distutils-sig before committing them. ---------------------------------------------------------------------- Comment By: Jason Tishler (jlt63) Date: 2002-08-09 10:04 Message: Logged In: YES user_id=86216 I presume that you mean to use an autoconf-style approach *in* setup.py. Is this assumption correct? If so, then I know how to search for libintl.h via find_file(). Unfortunately, I do not know how to check that a function (e.g., getext()) is in a library (i.e., libc.a). Any suggestions? ---------------------------------------------------------------------- Comment By: Martin v. 
Löwis (loewis)
Date: 2002-08-04 00:37

Message:
Logged In: YES
user_id=21627

I would really prefer if such problems were solved in an autoconf-style approach:
- is libintl.h present (you may ask pyconfig.h for that)
- if so, is gettext provided by the C library
- if not, is it provided by -lintl
- if yes, add -lintl
- if no, print an error message and continue

----------------------------------------------------------------------

You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=588564&group_id=5470

From noreply@sourceforge.net Mon Aug 12 13:10:51 2002
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Mon, 12 Aug 2002 05:10:51 -0700
Subject: [Patches] [ python-Patches-593627 ] Static names
Message-ID:

Patches item #593627, was opened at 2002-08-11 10:41
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=593627&group_id=5470

Category: Core (C code)
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Oren Tirosh (orenti)
Assigned to: Nobody/Anonymous (nobody)
Summary: Static names

Initial Comment:
This patch creates static string objects for all built-in names and interns them on initialization. The macro PyNAME is used to access static names. PyNAME(__spam__) is equivalent to PyString_InternFromString("__spam__") but is a constant expression. It requires the name to be one of the built-in names. A linker error will be generated if it isn't.

Most conversions of C strings into temporary string objects can be eliminated (PyString_FromString, PyString_InternFromString).

Most string comparisons at runtime can also be eliminated. Instead of:

    if (strcmp(PyString_AsString(name), "__spam__")) ...

This code can be used:

    PyString_INTERN(name)
    if (name == PyNAME(__spam__)) ...

Where PyString_INTERN is a fast inline check if the string is already interned (and it usually is).
To prevent unbounded accumulation of interned strings the mortal interned string patch should also be applied. The patch converts most of the builtin module to this new mode as an example. ---------------------------------------------------------------------- >Comment By: Oren Tirosh (orenti) Date: 2002-08-12 12:10 Message: Logged In: YES user_id=562624 If these changes are applied throughout the interpreter I expect a significant speedup but that is not my immediate goal. I am looking for redability, reduction of code size (both source and binary) and reliability (less things to check or forget to check). I am trying to get rid of code like if (docstr == NULL) { docstr= PyString_InternFromString("__doc__"); if (docstr == NULL) return NULL; And replace it with just PyNAME(__doc__) ---------------------------------------------------------------------- Comment By: Martin v. L�wis (loewis) Date: 2002-08-12 07:43 Message: Logged In: YES user_id=21627 What is the rationale for this patch? If it is for performance, what real speed improvements can you report? ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-11 15:54 Message: Logged In: YES user_id=33168 I'd like to see the releasing interned string patch applied. I think it's almost ready, isn't it? It would make patching easier and seems to be a good idea. For me, the easiest way to produce patches is to use cvs. You can keep multiple cvs trees around easy enough (for having multiple overlapping/independant patches). To create patches with cvs: cvs diff -C 5 [file1] [file2] .... ---------------------------------------------------------------------- Comment By: Oren Tirosh (orenti) Date: 2002-08-11 15:44 Message: Logged In: YES user_id=562624 Ok, I'll fix the problems with the patch. What's the best way to produce a patch that adds new files? Static string objects cannot be released, of course. 
This patch will eventually depend on on the mortal interned strings patch to fix it but in the meantime I just disabled releasing interned strings because I want to keep the two patches independent. The next version will add a new PyArg_ParseTuple format for an interned string to make it easier to use. ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-11 14:46 Message: Logged In: YES user_id=33168 Couple of initial comments: * this is a reverse patch * it seems like there are other changes in here - int ob_shash -> long - releasing interned strings? * dictobject.c is removed? * including python headers should use "" not <> Oren, could you generate a new patch with only the changes to support PyNAME? Thanks! ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=593627&group_id=5470 From noreply@sourceforge.net Mon Aug 12 13:48:07 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 12 Aug 2002 05:48:07 -0700 Subject: [Patches] [ python-Patches-593627 ] Static names Message-ID: Patches item #593627, was opened at 2002-08-11 12:41 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=593627&group_id=5470 Category: Core (C code) Group: None Status: Open Resolution: None Priority: 5 Submitted By: Oren Tirosh (orenti) Assigned to: Nobody/Anonymous (nobody) Summary: Static names Initial Comment: This patch creates static string objects for all built-in names and interns then on initialization. The macro PyNAME is be used to access static names. PyNAME(__spam__) is equivalent to PyString_InternFromString("__spam__") but is a constant expression. It requires the name to be one of the built-in names. A linker error will be generated if it isn't. 
Most conversions of C strings into temporary string objects can be eliminated (PyString_FromString, PyString_InternFromString). Most string comparisons at runtime can also be eliminated. Instead of: if (strcmp(PyString_AsString(name), "__spam__")) ... This code can be used: PyString_INTERN(name) if (name == PyNAME(__spam__)) ... Where PyString_INTERN is a fast inline check if the string is already interned (and it usually is). To prevent unbounded accumulation of interned strings the mortal interned string patch should also be applied. The patch converts most of the builtin module to this new mode as an example. ---------------------------------------------------------------------- >Comment By: Martin v. Löwis (loewis) Date: 2002-08-12 14:48 Message: Logged In: YES user_id=21627 It is difficult to see how this patch achieves the goal of readability: - many variables are not used at all, e.g. ArithmeticError; it is not clear how using them would improve readability. - the change from SETBUILTIN("None", Py_None); to SETBUILTIN(PyNAME(None), Py_None); makes it more difficult to read, not easier. Furthermore, the name "None" isn't used anywhere except this initialisation. - likewise, the changes from {"abs", builtin_abs, METH_O, abs_doc}, to {PyNAMEC(abs), builtin_abs, METH_O, abs_doc}, make the code harder to read. ---------------------------------------------------------------------- Comment By: Oren Tirosh (orenti) Date: 2002-08-12 14:10 Message: Logged In: YES user_id=562624 If these changes are applied throughout the interpreter I expect a significant speedup but that is not my immediate goal. I am looking for readability, reduction of code size (both source and binary) and reliability (fewer things to check or forget to check). 
I am trying to get rid of code like if (docstr == NULL) { docstr= PyString_InternFromString("__doc__"); if (docstr == NULL) return NULL; And replace it with just PyNAME(__doc__) ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-08-12 09:43 Message: Logged In: YES user_id=21627 What is the rationale for this patch? If it is for performance, what real speed improvements can you report? ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-11 17:54 Message: Logged In: YES user_id=33168 I'd like to see the releasing interned string patch applied. I think it's almost ready, isn't it? It would make patching easier and seems to be a good idea. For me, the easiest way to produce patches is to use cvs. You can keep multiple cvs trees around easily enough (for having multiple overlapping/independent patches). To create patches with cvs: cvs diff -C 5 [file1] [file2] .... ---------------------------------------------------------------------- Comment By: Oren Tirosh (orenti) Date: 2002-08-11 17:44 Message: Logged In: YES user_id=562624 Ok, I'll fix the problems with the patch. What's the best way to produce a patch that adds new files? Static string objects cannot be released, of course. This patch will eventually depend on the mortal interned strings patch to fix it but in the meantime I just disabled releasing interned strings because I want to keep the two patches independent. The next version will add a new PyArg_ParseTuple format for an interned string to make it easier to use. ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-11 16:46 Message: Logged In: YES user_id=33168 Couple of initial comments: * this is a reverse patch * it seems like there are other changes in here - int ob_shash -> long - releasing interned strings? * dictobject.c is removed? 
* including python headers should use "" not <> Oren, could you generate a new patch with only the changes to support PyNAME? Thanks! ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=593627&group_id=5470 From noreply@sourceforge.net Mon Aug 12 14:38:09 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 12 Aug 2002 06:38:09 -0700 Subject: [Patches] [ python-Patches-594001 ] PEP 277: Unicode file name support Message-ID: Patches item #594001, was opened at 2002-08-12 07:33 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=594001&group_id=5470 Category: Windows Group: None Status: Open Resolution: None Priority: 5 Submitted By: Martin v. Löwis (loewis) Assigned to: Nobody/Anonymous (nobody) Summary: PEP 277: Unicode file name support Initial Comment: This patch is an updated version of the patch [1] mentioned in the PEP. In addition to merging it with the CVS, it fixes a few formatting problems. ---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-12 09:38 Message: Logged In: YES user_id=6380 I'd make unicode_filenames() a macro that expands to 0 on platforms without this wart. I'd also test for wfunc!=NULL before calling unicode_filenames(). There's a lot of hairy code here. Are you sure that there are test cases in the test suite that exercise all of it? Aren't there some #ifdefs missing? posix_[12]str have code that's only relevant for Windows but isn't #ifdef'ed out like it is elsewhere. There should probably be a separate #define in pyport.h to test for this that's equivalent to defined(MS_WINDOWS) && !defined(Py_UNICODE_WIDE), so this can be uniformly tested to see whether this code is necessary. 
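[Editorial sketch, not part of the tracker thread.] The three suggestions in the comment above (a single feature #define, unicode_filenames() collapsing to 0 elsewhere, and checking wfunc before calling it) might look roughly like this. The names Py_WIN_WIDE_FILENAMES, call_path_func, demo_narrow and demo_wide are hypothetical and chosen only for illustration; the Windows branch is a placeholder, since the thread does not show how the patch makes its run-time decision.

```c
#include <assert.h>
#include <stddef.h>   /* NULL, wchar_t */

/* Hypothetical name: the comment only asks for "a separate #define". */
#if defined(MS_WINDOWS) && !defined(Py_UNICODE_WIDE)
#define Py_WIN_WIDE_FILENAMES 1
#endif

#ifdef Py_WIN_WIDE_FILENAMES
/* Placeholder only: the real patch would decide this at run time. */
static int unicode_filenames(void) { return 1; }
#else
/* Platforms without the wart: a constant 0 lets the compiler
   discard the wide-character branch entirely. */
#define unicode_filenames() 0
#endif

typedef int (*narrow_fn)(const char *path);
typedef int (*wide_fn)(const wchar_t *path);

/* Illustrative dispatcher: test wfunc first, then unicode_filenames(). */
static int call_path_func(narrow_fn func, wide_fn wfunc,
                          const char *path, const wchar_t *wpath)
{
    assert(func != NULL);
    if (wfunc != NULL && unicode_filenames())
        return wfunc(wpath);
    return func(path);
}

/* Stand-in callbacks so the dispatcher can be exercised. */
static int demo_narrow(const char *path) { (void)path; return 1; }
static int demo_wide(const wchar_t *path) { (void)path; return 2; }
```

Because the wfunc test comes first, callers that pass no wide variant never reach unicode_filenames() at all, which is the ordering the comment asks for.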
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=594001&group_id=5470 From noreply@sourceforge.net Mon Aug 12 15:21:17 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 12 Aug 2002 07:21:17 -0700 Subject: [Patches] [ python-Patches-505846 ] pyport.h, Wince and errno getter/setter Message-ID: Patches item #505846, was opened at 2002-01-19 20:13 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=505846&group_id=5470 Category: Core (C code) Group: None Status: Open Resolution: None Priority: 5 Submitted By: Brad Clements (bkc) Assigned to: Nobody/Anonymous (nobody) Summary: pyport.h, Wince and errno getter/setter Initial Comment: Most of the remaining Windows CE diffs are due to the lack of errno on Windows CE. There are other OS's that do not have errno (but they have a workalike method). At first I had simply commented out all references in the code to errno, but that quickly became unworkable. Wince and NetWare use a function to set the per-thread "errno" value. Although errno #defines (like ERANGE) are not defined for Wince, they are defined for NetWare. Removing references to errno would artificially hobble the NetWare port. These platforms also use a function to retrieve the current errno value. The attached diff for pyport.h attempts to standardize the access method for errno (or its work-alike) by providing SetErrno(), ClearErrno() and GetErrno() macros. ClearErrno() is SetErrno(0). I've found and changed all direct references to errno to use these macros. This patch must be submitted before the patches for other modules. -- I see two negatives with this approach: 1. It will be a pain to think GetErrno() instead of "errno" when modifying/writing new code. 2. Extension modules will need access to pyport.h for the target platform. 
In the worst case, directly referencing errno instead of using the macros will break only those platforms for which the macros do something different. That is, Wince and NetWare. -- An alternative spelling/capitalization of these macros might make them more appealing. Feel free to make a suggestion. -- It's probably incorrect for me to use SetErrno() as a function, such as SetErrno(1); I think the semi-colon is not needed, but wasn't entirely certain. On better advice, I will fix these statements in the remaining source files if this patch is accepted. ---------------------------------------------------------------------- >Comment By: Brad Clements (bkc) Date: 2002-08-12 14:21 Message: Logged In: YES user_id=4631 I would very much like to move this forward. Is all you need a refreshed diff without pyconfig.h diffs? I'll have to check why DONT_HAVE_TIME_H is in there. I think perhaps because the CE portion is using the Win32 hand-made config, and only differs by a little bit. What about jackjansen's post from 4-19? I cannot use the macro trick he suggests because there are two different functions for accessing errno, one for get and one for set. Regarding his comment about changing all errno accesses: I'm still committed to submitting appropriate diffs for the core and any other module ported to CE or NetWare. In fact, it's time to refresh my CVS copy. What do you suggest I check out? Head, or a specific revision? Thanks ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-08-12 09:10 Message: Logged In: YES user_id=21627 Any chances that updates to this patch are forthcoming? If not, it will be rejected by October 1. ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-06-04 17:33 Message: Logged In: YES user_id=21627 The patch requires further surgery: What is DONT_HAVE_TIME_H? 
If you want to test for presence of <time.h>, you need to add HAVE_TIME_H to the autoconf machinery, and all manually-maintained copies of pyconfig.h. Including just the configure.in changes is fine; no need to include changes to generated files. ---------------------------------------------------------------------- Comment By: Jack Jansen (jackjansen) Date: 2002-04-19 14:47 Message: Logged In: YES user_id=45365 Brad, I think this patch might be asking for too much. You're asking that all accesses to errno be replaced by GetErrno() or SetErrno() calls, really... And for many cases there is a workaround, where you don't have to change user code (i.e. the normal C code still uses what it thinks is an errno variable). On my system errno is #define errno (*__error()) and the __error() routine returns a pointer to the errno-variable for the current thread. For the GetErrno function this would be good enough, and with a bit of effort you could probably get it to work for the Set function too (possibly by doing the actual Set work in the next Get call). ---------------------------------------------------------------------- Comment By: Brad Clements (bkc) Date: 2002-02-11 23:17 Message: Logged In: YES user_id=4631 Hi folks, I need to proceed with the port to NetWare so I have something to demo at Brainshare in March. Unfortunately future patches from me will include both WINCE and NetWare specific patches, though hopefully there won't be much other than config.h and this patch (which is required for NetWare). Is there anything I can do to make this patch more acceptable? Send a bottle of wine, perhaps? ;-) ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-01-28 23:39 Message: Logged In: YES user_id=33168 Tim, I can check in or do whatever else needs to be done to check this in and move this forward. How do you want to proceed? Brad, I think most people are pretty busy right now. 
---------------------------------------------------------------------- Comment By: Brad Clements (bkc) Date: 2002-01-28 23:19 Message: Logged In: YES user_id=4631 Hi folks, just wondering if this patch is going to be rejected, or if you're all too busy and I have to be more patient ;-) I have a passle of Python-CE folks waiting on me to finish checking in patches. This is the worst one, I promise! Let me know what you want me to do, when you get a chance. Thanks ---------------------------------------------------------------------- Comment By: Brad Clements (bkc) Date: 2002-01-20 20:17 Message: Logged In: YES user_id=4631 I've eliminated Py_ClearErrno() and updated all the source to use Py_SetErrno(0). Attached is an updated diff for pyport.h ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-01-20 19:21 Message: Logged In: YES user_id=31435 Brad, errno is required by ANSI C, which also defines the semantics of a 0 value. Setting errno to 0, and taking errno==0 as meaning "no error", are 100% portable across platforms with a standard-conforming C implementation. If this platform doesn't support standard C, I have to question whether the core should even try to cater to it: the changes needed make no sense to C programmers, so may become a maintenance nightmare. I don't think putting a layer around errno is going to be hard to live with, provided that it merely tries to emulate standard behavior. For that reason, setting errno to 0 is correct, but inventing a new ClearErrno concept is wrong (the latter makes no sense to anyone except its inventor ). ---------------------------------------------------------------------- Comment By: Brad Clements (bkc) Date: 2002-01-20 15:54 Message: Logged In: YES user_id=4631 I can post a new diff for the // or would you be willing to just change the patch you have? I cannot use the same macros for Py_SET_ERANGE_IF_OVERFLOW (X) because Wince doesn't have ERANGE. 
You'll note the use of Py_SetErrno(1) which is frankly bogus. This is related to your comment on Py_ClearErrno() Using (errno == 0) as meaning "no error" seems to me to be a python source "convention" forced on it by (mostly) floating point side effects. Because the math routines are indicating overflow errors through the side effect of setting errno (rather than returning an explicit NaN that works on all platforms), we must set errno = 0 before calling these math functions. I suppose it's possible that on some platform "clearing the last error value" wouldn't be done this way, but rather might be an explicit function call. Since I was going through the source looking for all errno's, I felt it was clearer to say Py_ClearErrno() rather than Py_SetErrno(0), even though in the end they do the same thing on currently supported platforms. I'm easy, if you want to replace Py_ClearErrno() with Py_SetErrno(0) I can do that too. -- Regarding goto targets.. is it likely that "cleanup" might also collide with local variables? would _cleanup or __cleanup work for you? ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-01-19 22:57 Message: Logged In: YES user_id=33168 Need to change the // comment to /* */. gcc accepts this for C, but it's non-standard (at least it was, it may have changed in C99). You can have 1 Py_SET_ERANGE_IF_OVERFLOW for both platforms if you do this: #ifndef ERANGE #define ERANGE 1 #endif #define Py_SET_ERANGE_IF_OVERFLOW(X) \ do { \ if (Py_GetErrno() == 0 && ((X) == Py_HUGE_VAL || \ (X) == -Py_HUGE_VAL)) \ Py_SetErrno(ERANGE); \ } while(0) I'm not sure of the usefulness of Py_ClearErrno(), since it's the same on all platforms. If errno might be set to something other than 0 in the future, it would be good to make the change now. I would suggest changing finally to cleanup. 
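[Editorial sketch, not part of the tracker thread.] The macro quoted in the comment above was flattened onto one line by the archive; laid out again it reads roughly as below. Py_GetErrno(), Py_SetErrno() and Py_HUGE_VAL are given stand-in definitions for a standard C platform so the fragment compiles on its own; the definitions in the actual patch are not shown in the thread and may differ. check_overflow is a hypothetical helper added only to exercise the macro.

```c
#include <assert.h>
#include <errno.h>
#include <math.h>    /* HUGE_VAL */

/* Stand-in definitions (assumptions): on a standard C platform the
   accessors simply wrap errno. */
#define Py_GetErrno()   (errno)
#define Py_SetErrno(X)  do { errno = (X); } while (0)
#define Py_HUGE_VAL     HUGE_VAL

/* Fallback from the comment, for platforms that do not define ERANGE. */
#ifndef ERANGE
#define ERANGE 1
#endif

/* do { ... } while (0) makes the macro behave as a single statement,
   so it takes a trailing semicolon and is safe before an `else`. */
#define Py_SET_ERANGE_IF_OVERFLOW(X) \
    do { \
        if (Py_GetErrno() == 0 && ((X) == Py_HUGE_VAL || \
                                   (X) == -Py_HUGE_VAL)) \
            Py_SetErrno(ERANGE); \
    } while (0)

/* Hypothetical helper: returns 1 if x is +/-HUGE_VAL, else 0. */
static int check_overflow(double x)
{
    Py_SetErrno(0);               /* clear first, as discussed in the thread */
    assert(Py_GetErrno() == 0);
    if (x != 0.0)
        Py_SET_ERANGE_IF_OVERFLOW(x);   /* unbraced use before else */
    else
        return 0;
    return Py_GetErrno() == ERANGE;
}
```

The unbraced if/else in check_overflow would not compile if the macro were a bare `if` statement followed by the caller's semicolon, which is the semicolon problem the do/while(0) wrapper addresses.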
---------------------------------------------------------------------- Comment By: Brad Clements (bkc) Date: 2002-01-19 21:47 Message: Logged In: YES user_id=4631 Here is an amended diff with the suggested changes. I've tested the semi-colon handling on EVT, it works as suggested. -- Question: What is the prefered style, #ifdef xyz or #if defined(xyz) ? I try to use #ifdef xyz, but sometimes there's multiple possibilities and #if defined(x) || defined(y) is needed. Is that okay? -- Upcoming issue (hoping you address in your reply). There are many "goto finally" statements in various modules. Unfortunately EVT treats "finally" as a reserved word, even when compiling in non C++ mode. Also, Metrowerks does the same. I've changed all of these to "goto my_finally" as a quick work-around. I know "my_finally" sounds yucky, what's your recommendation for this? ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-01-19 20:52 Message: Logged In: YES user_id=31435 All identifiers defined in pyport.h must begin with "Py_". pyport.h is (and must be) #include'd by extension modules, and we need the prefix to avoid stomping on their namespace, and to make clear (to them and to us) that the gimmicks are part of Python's portability layer. A name like "SetErrno" is certain to conflict with some other package's attempt to worm around errno problems; Py_SetErrno () is not. Agree with Neal's final suggestion about dealing with semicolons. ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-01-19 20:28 Message: Logged In: YES user_id=33168 Typically, the semi-colon problem is dealt with as in Py_SET_ERANGE_IF_OVERFLOW. So, #define SetErrno(X) do { SetLastError(X); } while (0) I don't think (but can't remember if) there is any problem for single statements like you have. 
You could probably do: #ifndef MS_WINCE #define SetErrno(X) errno = (X) /* note no ; */ #else #define SetErrno(X) SetLastError(X) /* note no ; */ #endif ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=505846&group_id=5470 From noreply@sourceforge.net Mon Aug 12 15:25:47 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 12 Aug 2002 07:25:47 -0700 Subject: [Patches] [ python-Patches-505846 ] pyport.h, Wince and errno getter/setter Message-ID: Patches item #505846, was opened at 2002-01-19 15:13 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=505846&group_id=5470 Category: Core (C code) Group: None Status: Open Resolution: None Priority: 5 Submitted By: Brad Clements (bkc) Assigned to: Nobody/Anonymous (nobody) Summary: pyport.h, Wince and errno getter/setter Initial Comment: Most of the remaining Windows CE diffs are due to the lack of errno on Windows CE. There are other OS's that do not have errno (but they have a workalike method). At first I had simply commented out all references in the code to errno, but that quickly became unworkable. Wince and NetWare use a function to set the per- thread "errno" value. Although errno #defines (like ERANGE) are not defined for Wince, they are defined for NetWare. Removing references to errno would artificially hobble the NetWare port. These platforms also use a function to retrieve the current errno value. The attached diff for pyport.h attempts to standardize the access method for errno (or it's work-alike) by providing SetErrno(), ClearErrno() and GetErrno() macros. ClearErrno() is SetErrno(0) I've found and changed all direct references to errno to use these macros. This patch must be submitted before the patches for other modules. -- I see two negatives with this approach: 1. 
It will be a pain to think GetErrno() instead of "errno" when modifying/writing new code. 2. Extension modules will need access to pyport.h for the target platform. In the worst case, directly referencing errno instead of using the macros will break only those platforms for which the macros do something different. That is, Wince and NetWare. -- An alternative spelling/capitalization of these macros might make them more appealing. Feel free to make a suggestion. -- It's probably incorrect for me to use SetErrno() as a function, such as SetErrno(1); I think the semi-colon is not needed, but wasn't entirely certain. On better advice, I will fix these statements in the remaining source files if this patch is accepted. ---------------------------------------------------------------------- >Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-12 10:25 Message: Logged In: YES user_id=33168 You should work off CVS head. ---------------------------------------------------------------------- Comment By: Brad Clements (bkc) Date: 2002-08-12 10:21 Message: Logged In: YES user_id=4631 I would very much like to move this forward. Is all you need is a refreshed diff without pyconfig.h diffs? I'll have to check why DONT_HAVE_TIME_H is in there. I think perhaps because the CE portion is using the Win32 hand-made config, and only differs by a little bit. What about jackjansen's post from 4-19? I cannot use the macro trick he suggests because there are two different functions for accessing errno, one for get and one for set. Regarding his comment about changing all errno accesses: I'm still committed to submitting appropriate diffs for the core and any other module ported to CE or NetWare. In fact, it's time to refresh my CVS copy. What do you suggest I check out? Head, or a specific revision? Thanks ---------------------------------------------------------------------- Comment By: Martin v. 
Löwis (loewis) Date: 2002-08-12 05:10 Message: Logged In: YES user_id=21627 Any chances that updates to this patch are forthcoming? If not, it will be rejected by October 1. ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-06-04 13:33 Message: Logged In: YES user_id=21627 The patch requires further surgery: What is DONT_HAVE_TIME_H? If you want to test for presence of <time.h>, you need to add HAVE_TIME_H to the autoconf machinery, and all manually-maintained copies of pyconfig.h. Including just the configure.in changes is fine; no need to include changes to generated files. ---------------------------------------------------------------------- Comment By: Jack Jansen (jackjansen) Date: 2002-04-19 10:47 Message: Logged In: YES user_id=45365 Brad, I think this patch might be asking for too much. You're asking that all accesses to errno be replaced by GetErrno() or SetErrno() calls, really... And for many cases there is a workaround, where you don't have to change user code (i.e. the normal C code still uses what it thinks is an errno variable). On my system errno is #define errno (*__error()) and the __error() routine returns a pointer to the errno-variable for the current thread. For the GetErrno function this would be good enough, and with a bit of effort you could probably get it to work for the Set function too (possibly by doing the actual Set work in the next Get call). ---------------------------------------------------------------------- Comment By: Brad Clements (bkc) Date: 2002-02-11 18:17 Message: Logged In: YES user_id=4631 Hi folks, I need to proceed with the port to NetWare so I have something to demo at Brainshare in March. Unfortunately future patches from me will include both WINCE and NetWare specific patches, though hopefully there won't be much other than config.h and this patch (which is required for NetWare). Is there anything I can do to make this patch more acceptable? 
Send a bottle of wine, perhaps? ;-) ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-01-28 18:39 Message: Logged In: YES user_id=33168 Tim, I can check in or do whatever else needs to be done to check this in and move this forward. How do you want to procede? Brad, I think most people are pretty busy right now. ---------------------------------------------------------------------- Comment By: Brad Clements (bkc) Date: 2002-01-28 18:19 Message: Logged In: YES user_id=4631 Hi folks, just wondering if this patch is going to be rejected, or if you're all too busy and I have to be more patient ;-) I have a passle of Python-CE folks waiting on me to finish checking in patches. This is the worst one, I promise! Let me know what you want me to do, when you get a chance. Thanks ---------------------------------------------------------------------- Comment By: Brad Clements (bkc) Date: 2002-01-20 15:17 Message: Logged In: YES user_id=4631 I've eliminated Py_ClearErrno() and updated all the source to use Py_SetErrno(0). Attached is an updated diff for pyport.h ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-01-20 14:21 Message: Logged In: YES user_id=31435 Brad, errno is required by ANSI C, which also defines the semantics of a 0 value. Setting errno to 0, and taking errno==0 as meaning "no error", are 100% portable across platforms with a standard-conforming C implementation. If this platform doesn't support standard C, I have to question whether the core should even try to cater to it: the changes needed make no sense to C programmers, so may become a maintenance nightmare. I don't think putting a layer around errno is going to be hard to live with, provided that it merely tries to emulate standard behavior. 
For that reason, setting errno to 0 is correct, but inventing a new ClearErrno concept is wrong (the latter makes no sense to anyone except its inventor ). ---------------------------------------------------------------------- Comment By: Brad Clements (bkc) Date: 2002-01-20 10:54 Message: Logged In: YES user_id=4631 I can post a new diff for the // or would you be willing to just change the patch you have? I cannot use the same macros for Py_SET_ERANGE_IF_OVERFLOW (X) because Wince doesn't have ERANGE. You'll note the use of Py_SetErrno(1) which is frankly bogus. This is related to your comment on Py_ClearErrno() Using (errno == 0) as meaning "no error" seems to me to be a python source "convention" forced on it by (mostly) floating point side effects. Because the math routines are indicating overflow errors through the side effect of setting errno (rather than returning an explicit NaN that works on all platforms), we must set errno = 0 before calling these math functions. I suppose it's possible that on some platform "clearing the last error value" wouldn't be done this way, but rather might be an explicit function call. Since I was going through the source looking for all errno's, I felt it was clearer to say Py_ClearErrno() rather than Py_SetErrno(0), even though in the end they do the same thing on currently supported platforms. I'm easy, if you want to replace Py_ClearErrno() with Py_SetErrno(0) I can do that too. -- Regarding goto targets.. is it likely that "cleanup" might also collide with local variables? would _cleanup or __cleanup work for you? ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-01-19 17:57 Message: Logged In: YES user_id=33168 Need to change the // comment to /* */. gcc accepts this for C, but it's non-standard (at least it was, it may have changed in C99). 
You can have 1 Py_SET_ERANGE_IF_OVERFLOW for both platforms if you do this: #ifndef ERANGE #define ERANGE 1 #endif #define Py_SET_ERANGE_IF_OVERFLOW(X) \ do { \ if (Py_GetErrno() == 0 && ((X) == Py_HUGE_VAL || \ (X) == -Py_HUGE_VAL)) \ Py_SetErrno(ERANGE); \ } while(0) I'm not sure of the usefulness of Py_ClearErrno(), since it's the same on all platforms. If errno might be set to something other than 0 in the future, it would be good to make the change now. I would suggest changing finally to cleanup. ---------------------------------------------------------------------- Comment By: Brad Clements (bkc) Date: 2002-01-19 16:47 Message: Logged In: YES user_id=4631 Here is an amended diff with the suggested changes. I've tested the semi-colon handling on EVT, it works as suggested. -- Question: What is the prefered style, #ifdef xyz or #if defined(xyz) ? I try to use #ifdef xyz, but sometimes there's multiple possibilities and #if defined(x) || defined(y) is needed. Is that okay? -- Upcoming issue (hoping you address in your reply). There are many "goto finally" statements in various modules. Unfortunately EVT treats "finally" as a reserved word, even when compiling in non C++ mode. Also, Metrowerks does the same. I've changed all of these to "goto my_finally" as a quick work-around. I know "my_finally" sounds yucky, what's your recommendation for this? ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-01-19 15:52 Message: Logged In: YES user_id=31435 All identifiers defined in pyport.h must begin with "Py_". pyport.h is (and must be) #include'd by extension modules, and we need the prefix to avoid stomping on their namespace, and to make clear (to them and to us) that the gimmicks are part of Python's portability layer. A name like "SetErrno" is certain to conflict with some other package's attempt to worm around errno problems; Py_SetErrno () is not. 
Agree with Neal's final suggestion about dealing with semicolons. ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-01-19 15:28 Message: Logged In: YES user_id=33168 Typically, the semi-colon problem is dealt with as in Py_SET_ERANGE_IF_OVERFLOW. So, #define SetErrno(X) do { SetLastError(X); } while (0) I don't think (but can't remember if) there is any problem for single statements like you have. You could probably do: #ifndef MS_WINCE #define SetErrno(X) errno = (X) /* note no ; */ #else #define SetErrno(X) SetLastError(X) /* note no ; */ #endif ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=505846&group_id=5470 From noreply@sourceforge.net Mon Aug 12 16:15:43 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 12 Aug 2002 08:15:43 -0700 Subject: [Patches] [ python-Patches-505846 ] pyport.h, Wince and errno getter/setter Message-ID: Patches item #505846, was opened at 2002-01-19 21:13 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=505846&group_id=5470 Category: Core (C code) Group: None Status: Open Resolution: None Priority: 5 Submitted By: Brad Clements (bkc) Assigned to: Nobody/Anonymous (nobody) Summary: pyport.h, Wince and errno getter/setter Initial Comment: Most of the remaining Windows CE diffs are due to the lack of errno on Windows CE. There are other OS's that do not have errno (but they have a workalike method). At first I had simply commented out all references in the code to errno, but that quickly became unworkable. Wince and NetWare use a function to set the per- thread "errno" value. Although errno #defines (like ERANGE) are not defined for Wince, they are defined for NetWare. Removing references to errno would artificially hobble the NetWare port. 
These platforms also use a function to retrieve the current errno value. The attached diff for pyport.h attempts to standardize the access method for errno (or it's work-alike) by providing SetErrno(), ClearErrno() and GetErrno() macros. ClearErrno() is SetErrno(0) I've found and changed all direct references to errno to use these macros. This patch must be submitted before the patches for other modules. -- I see two negatives with this approach: 1. It will be a pain to think GetErrno() instead of "errno" when modifying/writing new code. 2. Extension modules will need access to pyport.h for the target platform. In the worst case, directly referencing errno instead of using the macros will break only those platforms for which the macros do something different. That is, Wince and NetWare. -- An alternative spelling/capitalization of these macros might make them more appealing. Feel free to make a suggestion. -- It's probably incorrect for me to use SetErrno() as a function, such as SetErrno(1); I think the semi-colon is not needed, but wasn't entirely certain. On better advice, I will fix these statements in the remaining source files if this patch is accepted. ---------------------------------------------------------------------- >Comment By: Jack Jansen (jackjansen) Date: 2002-08-12 17:15 Message: Logged In: YES user_id=45365 Brad, if there's no other way to do your errno magic than by replacing all errno accesses with macros then so be it. *I* definitely don't want to hold off your patch because of that. ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-12 16:25 Message: Logged In: YES user_id=33168 You should work off CVS head. ---------------------------------------------------------------------- Comment By: Brad Clements (bkc) Date: 2002-08-12 16:21 Message: Logged In: YES user_id=4631 I would very much like to move this forward. Is all you need is a refreshed diff without pyconfig.h diffs? 
I'll have to check why DONT_HAVE_TIME_H is in there. I think perhaps because the CE portion is using the Win32 hand-made config, and only differs by a little bit.

What about jackjansen's post from 4-19? I cannot use the macro trick he suggests because there are two different functions for accessing errno, one for get and one for set.

Regarding his comment about changing all errno accesses: I'm still committed to submitting appropriate diffs for the core and any other module ported to CE or NetWare.

In fact, it's time to refresh my CVS copy. What do you suggest I check out? Head, or a specific revision?

Thanks

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2002-08-12 11:10

Message:
Logged In: YES
user_id=21627

Any chances that updates to this patch are forthcoming? If not, it will be rejected by October 1.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2002-06-04 19:33

Message:
Logged In: YES
user_id=21627

The patch requires further surgery: What is DONT_HAVE_TIME_H? If you want to test for presence of <time.h>, you need to add HAVE_TIME_H to the autoconf machinery, and all manually-maintained copies of pyconfig.h.

Including just the configure.in changes is fine; no need to include changes to generated files.

----------------------------------------------------------------------

Comment By: Jack Jansen (jackjansen)
Date: 2002-04-19 16:47

Message:
Logged In: YES
user_id=45365

Brad, I think this patch might be asking for too much. You're asking that all accesses to errno be replaced by GetErrno() or SetErrno() calls, really...

And for many cases there is a workaround, where you don't have to change user code (i.e. the normal C code still uses what it thinks is an errno variable). On my system errno is

    #define errno (*__error())

and the __error() routine returns a pointer to the errno-variable for the current thread.
For the GetErrno function this would be good enough, and with a bit of effort you could probably get it to work for the Set function too (possibly by doing the actual Set work in the next Get call).

----------------------------------------------------------------------

Comment By: Brad Clements (bkc)
Date: 2002-02-12 00:17

Message:
Logged In: YES
user_id=4631

Hi folks, I need to proceed with the port to NetWare so I have something to demo at Brainshare in March.

Unfortunately future patches from me will include both WINCE and NetWare specific patches, though hopefully there won't be much other than config.h and this patch (which is required for NetWare).

Is there anything I can do to make this patch more acceptable? Send a bottle of wine, perhaps? ;-)

----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2002-01-29 00:39

Message:
Logged In: YES
user_id=33168

Tim, I can check in or do whatever else needs to be done to check this in and move this forward. How do you want to proceed?

Brad, I think most people are pretty busy right now.

----------------------------------------------------------------------

Comment By: Brad Clements (bkc)
Date: 2002-01-29 00:19

Message:
Logged In: YES
user_id=4631

Hi folks, just wondering if this patch is going to be rejected, or if you're all too busy and I have to be more patient ;-)

I have a passel of Python-CE folks waiting on me to finish checking in patches. This is the worst one, I promise!

Let me know what you want me to do, when you get a chance.

Thanks

----------------------------------------------------------------------

Comment By: Brad Clements (bkc)
Date: 2002-01-20 21:17

Message:
Logged In: YES
user_id=4631

I've eliminated Py_ClearErrno() and updated all the source to use Py_SetErrno(0).
Attached is an updated diff for pyport.h

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2002-01-20 20:21

Message:
Logged In: YES
user_id=31435

Brad, errno is required by ANSI C, which also defines the semantics of a 0 value. Setting errno to 0, and taking errno==0 as meaning "no error", are 100% portable across platforms with a standard-conforming C implementation. If this platform doesn't support standard C, I have to question whether the core should even try to cater to it: the changes needed make no sense to C programmers, so may become a maintenance nightmare.

I don't think putting a layer around errno is going to be hard to live with, provided that it merely tries to emulate standard behavior. For that reason, setting errno to 0 is correct, but inventing a new ClearErrno concept is wrong (the latter makes no sense to anyone except its inventor).

----------------------------------------------------------------------

Comment By: Brad Clements (bkc)
Date: 2002-01-20 16:54

Message:
Logged In: YES
user_id=4631

I can post a new diff for the // comment, or would you be willing to just change the patch you have?

I cannot use the same macros for Py_SET_ERANGE_IF_OVERFLOW(X) because Wince doesn't have ERANGE. You'll note the use of Py_SetErrno(1) which is frankly bogus. This is related to your comment on Py_ClearErrno().

Using (errno == 0) as meaning "no error" seems to me to be a python source "convention" forced on it by (mostly) floating point side effects. Because the math routines are indicating overflow errors through the side effect of setting errno (rather than returning an explicit NaN that works on all platforms), we must set errno = 0 before calling these math functions.

I suppose it's possible that on some platform "clearing the last error value" wouldn't be done this way, but rather might be an explicit function call.
Since I was going through the source looking for all errno's, I felt it was clearer to say Py_ClearErrno() rather than Py_SetErrno(0), even though in the end they do the same thing on currently supported platforms.

I'm easy, if you want to replace Py_ClearErrno() with Py_SetErrno(0) I can do that too.

--

Regarding goto targets.. is it likely that "cleanup" might also collide with local variables? Would _cleanup or __cleanup work for you?

----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2002-01-19 23:57

Message:
Logged In: YES
user_id=33168

Need to change the // comment to /* */. gcc accepts this for C, but it's non-standard (at least it was, it may have changed in C99).

You can have 1 Py_SET_ERANGE_IF_OVERFLOW for both platforms if you do this:

    #ifndef ERANGE
    #define ERANGE 1
    #endif

    #define Py_SET_ERANGE_IF_OVERFLOW(X) \
        do { \
            if (Py_GetErrno() == 0 && ((X) == Py_HUGE_VAL || \
                                       (X) == -Py_HUGE_VAL)) \
                Py_SetErrno(ERANGE); \
        } while(0)

I'm not sure of the usefulness of Py_ClearErrno(), since it's the same on all platforms. If errno might be set to something other than 0 in the future, it would be good to make the change now.

I would suggest changing finally to cleanup.

----------------------------------------------------------------------

Comment By: Brad Clements (bkc)
Date: 2002-01-19 22:47

Message:
Logged In: YES
user_id=4631

Here is an amended diff with the suggested changes. I've tested the semi-colon handling on EVT, it works as suggested.

--

Question: What is the preferred style, #ifdef xyz or #if defined(xyz)?

I try to use #ifdef xyz, but sometimes there are multiple possibilities and #if defined(x) || defined(y) is needed. Is that okay?

--

Upcoming issue (hoping you address in your reply). There are many "goto finally" statements in various modules. Unfortunately EVT treats "finally" as a reserved word, even when compiling in non C++ mode. Also, Metrowerks does the same.
I've changed all of these to "goto my_finally" as a quick work-around. I know "my_finally" sounds yucky, what's your recommendation for this?

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2002-01-19 21:52

Message:
Logged In: YES
user_id=31435

All identifiers defined in pyport.h must begin with "Py_". pyport.h is (and must be) #include'd by extension modules, and we need the prefix to avoid stomping on their namespace, and to make clear (to them and to us) that the gimmicks are part of Python's portability layer. A name like "SetErrno" is certain to conflict with some other package's attempt to worm around errno problems; Py_SetErrno() is not.

Agree with Neal's final suggestion about dealing with semicolons.

----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2002-01-19 21:28

Message:
Logged In: YES
user_id=33168

Typically, the semi-colon problem is dealt with as in Py_SET_ERANGE_IF_OVERFLOW. So,

    #define SetErrno(X) do { SetLastError(X); } while (0)

I don't think (but can't remember if) there is any problem for single statements like you have.
You could probably do:

    #ifndef MS_WINCE
    #define SetErrno(X) errno = (X) /* note no ; */
    #else
    #define SetErrno(X) SetLastError(X) /* note no ; */
    #endif

----------------------------------------------------------------------

You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=505846&group_id=5470

From noreply@sourceforge.net Mon Aug 12 19:12:50 2002
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Mon, 12 Aug 2002 11:12:50 -0700
Subject: [Patches] [ python-Patches-585913 ] Adds Galeon support to webbrowser.py
Message-ID:

Patches item #585913, was opened at 2002-07-24 08:27
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=585913&group_id=5470

Category: Library (Lib)
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Greg Copeland (oracle)
Assigned to: Nobody/Anonymous (nobody)
Summary: Adds Galeon support to webbrowser.py

Initial Comment:
Simple context diff against current CVS tree to add support for Galeon to webbrowser.py

----------------------------------------------------------------------

Comment By: Greg Copeland (oracle)
Date: 2002-08-12 13:12

Message:
Logged In: YES
user_id=40173

Has the patch been accepted? What's the standard way of letting people know if a patch has been accepted or rejected?

----------------------------------------------------------------------

Comment By: Greg Copeland (oracle)
Date: 2002-07-26 14:56

Message:
Logged In: YES
user_id=40173

Not really sure. I assume it's just a second patch by another author. What can I say, day late and a dollar short. ;)

Having looked at the other patch, it appears mine is a little more well rounded/complete/feature rich, if only slightly. I invite you to take a look for yourself. I'm also not sure what version of webbrowser.py the other patch is against. My patch is against the CVS version so it will be a breeze to apply.

Enjoy!
----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2002-07-26 13:54

Message:
Logged In: YES
user_id=21627

How does this relate to
https://sourceforge.net/tracker/index.php?func=detail&aid=586437&group_id=5470&atid=305470
?

----------------------------------------------------------------------

You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=585913&group_id=5470

From noreply@sourceforge.net Mon Aug 12 20:03:21 2002
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Mon, 12 Aug 2002 12:03:21 -0700
Subject: [Patches] [ python-Patches-594197 ] Patch for bug 592567
Message-ID:

Patches item #594197, was opened at 2002-08-12 19:03
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=594197&group_id=5470

Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Jiba (jiba)
Assigned to: Nobody/Anonymous (nobody)
Summary: Patch for bug 592567

Initial Comment:
This patch fixes the bug 592567 (Bug with deepcopy and new style objects). It uses a different, better approach than the one I proposed in the bug report: it keeps alive ANY object that is deepcopied, automatically.

This also has a positive side effect: if you define your own __deepcopy__ method, you don't need to take care of the underlying implementation and to keep alive your temporary states. The script test3.py illustrates that -- the patch also fixes this "bug" (not strictly speaking a bug). test3.py has been tested with Python 2.2.1.
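
For readers who have not seen the attached patch: the idea of keeping every deepcopied object alive can be sketched roughly as below. This is only an illustration under stated assumptions, not the patch itself. The names tiny_deepcopy and keep_alive are hypothetical, only lists and dicts are copied, and everything else is treated as atomic. The point is that a memo keyed by id() is only safe while the original objects still exist, so the memo holds a reference to each of them.

```python
def keep_alive(obj, memo):
    """Hold a reference to obj for as long as the memo dict lives.

    The references are parked in a list stored under id(memo), a key that
    cannot clash with the id() of any object being copied.
    """
    memo.setdefault(id(memo), []).append(obj)


def tiny_deepcopy(obj, memo=None):
    """Hypothetical, simplified deep copy for lists and dicts only."""
    if memo is None:
        memo = {}
    key = id(obj)
    if key in memo:
        # Already copied: reuse the copy so shared references stay shared.
        return memo[key]
    if isinstance(obj, list):
        clone = []
        memo[key] = clone  # register first so cycles terminate
        for item in obj:
            clone.append(tiny_deepcopy(item, memo))
    elif isinstance(obj, dict):
        clone = {}
        memo[key] = clone
        for k, v in obj.items():
            clone[tiny_deepcopy(k, memo)] = tiny_deepcopy(v, memo)
    else:
        clone = obj  # assumption: everything else is treated as atomic
        memo[key] = clone
    # Without this, a temporary object could be freed mid-copy and its id()
    # reused by a different object, producing a bogus memo hit.
    keep_alive(obj, memo)
    return clone
```

With the keep-alive list in place, a user-defined copying hook that builds temporary state objects would not have to hold on to them itself, which is the side effect described above.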
----------------------------------------------------------------------

You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=594197&group_id=5470

From noreply@sourceforge.net Mon Aug 12 20:12:16 2002
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Mon, 12 Aug 2002 12:12:16 -0700
Subject: [Patches] [ python-Patches-594197 ] Patch for bug 592567
Message-ID:

Patches item #594197, was opened at 2002-08-12 15:03
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=594197&group_id=5470

Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Jiba (jiba)
>Assigned to: Guido van Rossum (gvanrossum)
Summary: Patch for bug 592567

Initial Comment:
This patch fixes the bug 592567 (Bug with deepcopy and new style objects). It uses a different, better approach than the one I proposed in the bug report: it keeps alive ANY object that is deepcopied, automatically.

This also has a positive side effect: if you define your own __deepcopy__ method, you don't need to take care of the underlying implementation and to keep alive your temporary states. The script test3.py illustrates that -- the patch also fixes this "bug" (not strictly speaking a bug). test3.py has been tested with Python 2.2.1.
----------------------------------------------------------------------

You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=594197&group_id=5470

From noreply@sourceforge.net Mon Aug 12 22:17:09 2002
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Mon, 12 Aug 2002 14:17:09 -0700
Subject: [Patches] [ python-Patches-527371 ] Fix for sre bug 470582
Message-ID:

Patches item #527371, was opened at 2002-03-08 04:14
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=527371&group_id=5470

Category: Modules
Group: None
Status: Open
Resolution: Accepted
Priority: 8
Submitted By: Greg Chapman (glchapman)
Assigned to: Fredrik Lundh (effbot)
Summary: Fix for sre bug 470582

Initial Comment:
Bug report 470582 points out that nested groups can produce matches in sre even if the groups within which they are nested do not match:

>>> m = sre.search(r"^((\d)\:)?(\d\d)\.(\d\d\d)$", "34.123")
>>> m.groups()
(None, '3', '34', '123')
>>> m = pre.search(r"^((\d)\:)?(\d\d)\.(\d\d\d)$", "34.123")
>>> m.groups()
(None, None, '34', '123')

I believe this is because in the handling of SRE_OP_MAX_UNTIL, state->lastmark is being reduced (after "((\d)\:)" fails) without NULLing out the now-invalid entries at the end of the state->mark array. In the other two cases where state->lastmark is reduced (specifically in SRE_OP_BRANCH and SRE_OP_REPEAT_ONE) memset is used to NULL out the entries at the end of the array. The attached patch does the same thing for the SRE_OP_MAX_UNTIL case.

This fixes the above case and does not break anything in test_re.py.

----------------------------------------------------------------------

>Comment By: Greg Chapman (glchapman)
Date: 2002-08-12 13:17

Message:
Logged In: YES
user_id=86307

I noticed recently that the lastindex attribute of match objects is now documented, so I believe that the lastindex problem I described in my March 8 posting needs to be fixed.
Simply, lastindex may claim that a group matched when in fact it didn't (because lastindex does not get updated when lastmark is reset to a lower value):

>>> m = sre.match('(\d)?\d\d', '12')
>>> m.groups()
(None,)
>>> m.lastindex
1

----------------------------------------------------------------------

Comment By: Fredrik Lundh (effbot)
Date: 2002-07-12 03:11

Message:
Logged In: YES
user_id=38376

(bumped priority as a reminder to self)

/F

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2002-03-08 09:28

Message:
Logged In: YES
user_id=31435

Assigned to /F -- he's the expert here.

----------------------------------------------------------------------

Comment By: Greg Chapman (glchapman)
Date: 2002-03-08 06:23

Message:
Logged In: YES
user_id=86307

I'm pretty sure the memset is correct; state->lastmark is the index of last mark written to (not the index of the next potential write). Also, it occurred to me that there is another related error here:

>>> m = sre.search(r'^((\d)\:)?\d\d\.\d\d\d$', '34.123')
>>> m.groups()
(None, None)
>>> m.lastindex
2

In other words, lastindex claims that group 2 was the last that matched, even though it didn't really match. Since lastindex is undocumented, this probably doesn't matter too much. Still, it probably should be reset if it is pointing to a group which gets "unmatched" when state->lastmark is reduced.

Perhaps a function like the following should be added for use in the three places where state->lastmark is reset to a previous value:

    void lastmark_restore(SRE_STATE *state, int lastmark)
    {
        assert(lastmark >= 0);
        if (state->lastmark > lastmark) {
            int lastvalidindex =
                (lastmark == 0) ? -1 : (lastmark-1)/2+1;
            if (state->lastindex > lastvalidindex)
                state->lastindex = lastvalidindex;
            memset(
                state->mark + lastmark + 1, 0,
                (state->lastmark - lastmark) * sizeof(void*)
            );
        }
        state->lastmark = lastmark;
    }

----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2002-03-08 04:29

Message:
Logged In: YES
user_id=33168

Confirmed that the test w/o fix fails and the test passes with the fix to _sre.c.

But I'm not sure if the memset can go too far:

    memset(state->mark + lastmark + 1, 0,
           (state->lastmark - lastmark) * sizeof(void*));

I can try under purify, but that doesn't guarantee anything.

----------------------------------------------------------------------

Comment By: Greg Chapman (glchapman)
Date: 2002-03-08 04:20

Message:
Logged In: YES
user_id=86307

I forgot: here's a patch for re_tests.py which adds the case from the bug report as a test.

----------------------------------------------------------------------

You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=527371&group_id=5470

From noreply@sourceforge.net Mon Aug 12 23:13:47 2002
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Mon, 12 Aug 2002 15:13:47 -0700
Subject: [Patches] [ python-Patches-560379 ] Karatsuba multiplication
Message-ID:

Patches item #560379, was opened at 2002-05-24 21:07
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=560379&group_id=5470

Category: Core (C code)
Group: Python 2.3
>Status: Closed
Resolution: Accepted
Priority: 5
Submitted By: Christopher A. Craig (ccraig)
Assigned to: Tim Peters (tim_one)
Summary: Karatsuba multiplication

Initial Comment:
Adds Karatsuba multiplication to Python. Patches longobject.c to use Karatsuba multiplication in place of gradeschool math.
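
As background, the Karatsuba idea can be sketched in a few lines of Python. This is a minimal illustration, not the C code from the patch: it works on Python ints rather than on longobject.c digit arrays, the name karatsuba and the value of CUTOFF_BITS are hypothetical, and it handles non-negative operands only. Each operand is split into a high and a low half, and three recursive products replace the four that the gradeschool method would need.

```python
CUTOFF_BITS = 64  # hypothetical cutoff; below it, fall back to plain multiply


def karatsuba(x, y):
    """Multiply two non-negative ints with Karatsuba's three-product trick."""
    if x < 0 or y < 0:
        raise ValueError("this sketch handles non-negative operands only")
    # Small operands: the bookkeeping costs more than it saves.
    if x.bit_length() <= CUTOFF_BITS or y.bit_length() <= CUTOFF_BITS:
        return x * y

    # Split both operands at half the width of the larger one.
    half = max(x.bit_length(), y.bit_length()) // 2
    mask = (1 << half) - 1
    x_hi, x_lo = x >> half, x & mask
    y_hi, y_lo = y >> half, y & mask

    hi = karatsuba(x_hi, y_hi)
    lo = karatsuba(x_lo, y_lo)
    # (x_hi + x_lo) * (y_hi + y_lo) = hi + lo + cross terms,
    # so one extra product yields both cross terms at once.
    mid = karatsuba(x_hi + x_lo, y_hi + y_lo) - hi - lo

    return (hi << (2 * half)) + (mid << half) + lo
```

For example, karatsuba(3**500, 5**400) should agree with 3**500 * 5**400. Where the best cutoff lies depends on the platform, which is what the timing discussion further down the thread is about.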
----------------------------------------------------------------------

>Comment By: Tim Peters (tim_one)
Date: 2002-08-12 18:13

Message:
Logged In: YES
user_id=31435

Closing this, as I'm happy with the code now. Added a new "lopsided" routine to remove the penalty (relative to 2.2.1) when inputs are of vastly different sizes (that was a degenerate case for k_mul -- it didn't save any work then, but did entail a lot more overheads).

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2002-08-12 00:19

Message:
Logged In: YES
user_id=31435

Yes, until the new algorithm is enabled w/o the envar trickery.

----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2002-08-11 23:36

Message:
Logged In: YES
user_id=33168

Tim, did you want to leave this open?

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2002-08-11 22:40

Message:
Logged In: YES
user_id=31435

Thanks! I checked in some code building on this. Changes included:

+ Adjusted whitespace to meet the standard (spaces after "if" and "for", flanking binary operators, etc).

+ The refcount fiddling in x_mul caused assorted system crashes if KeyboardInterrupt was raised during a multiply. Repaired that.

+ More comments and asserts.

+ Removed k_join and built "the answer" piecemeal into the result object in k_mul. This allows to free more chunks of memory sooner, reducing highwater mark and the probable size of the working set.

Since the cache behavior is quite different now, it would be cool if you could run your tuning tests again. The cutoff value is now a #define, KARATSUBA_CUTOFF near the top of longobject.c.

Until I can make time for more thorough testing, k_mul isn't called by default: multiplication invokes k_mul if and only if an environment variable named KARAT exists (its value is irrelevant; just its existence matters).
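
The environment-variable gate and the "lopsided" case mentioned in the messages above can be pictured with the following sketch. It is an assumption-laden illustration, not the checked-in C code: the function names are hypothetical, balanced_mul is a stand-in that simply uses the built-in multiply, and slicing the larger operand into pieces about as wide as the smaller one is one common way to handle very unequal sizes, assumed here rather than taken from the patch. Only the name KARAT and the rule that its mere existence matters come from the thread.

```python
import os


def balanced_mul(a, b):
    """Stand-in for a balanced multiply such as Karatsuba (here: built-in *)."""
    return a * b


def lopsided_mul(small, big):
    """Multiply non-negative ints by slicing `big` into `small`-sized pieces.

    Each slice is about as wide as `small`, so every partial product is a
    balanced multiplication, which is where Karatsuba-style splitting pays off.
    """
    if small > big:
        small, big = big, small
    width = max(small.bit_length(), 1)
    mask = (1 << width) - 1
    result = 0
    shift = 0
    while big:
        result += balanced_mul(small, big & mask) << shift
        big >>= width
        shift += width
    return result


def multiply(a, b, environ=os.environ):
    """Take the experimental path only if a variable named KARAT exists."""
    if "KARAT" in environ:  # the value is irrelevant, only existence counts
        return lopsided_mul(a, b)
    return a * b
```

Passing the environment in as a parameter keeps the sketch easy to try both ways, for instance multiply(6, 7, environ={}) against multiply(6, 7, environ={"KARAT": ""}).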
----------------------------------------------------------------------

Comment By: Christopher A. Craig (ccraig)
Date: 2002-07-09 18:43

Message:
Logged In: YES
user_id=135050

I've brought the code into compliance with the coding standards in PEP 7, and added some comments that I thought were in line with the rest of the file. If there is something else you would like me to do, please tell me.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2002-06-05 17:38

Message:
Logged In: YES
user_id=6380

Tim thinks this is cool, but the code can use cleanup and comments. Also, let's not add platform specific hacks (Christian can sell those as an add-on :-).

----------------------------------------------------------------------

Comment By: Christopher A. Craig (ccraig)
Date: 2002-05-25 19:41

Message:
Logged In: YES
user_id=135050

I made the needed changes to make it split on the bigger number (basically changed to split on the bigger number, and changed all of the places that need to check to see if there are no bits left), and the new one is a little bit faster, so I'm uploading it too.

I had been thinking about fixed precision numbers when I wrote it, so I honestly didn't consider the fact that I could just shift the smaller number to 0 and throw it away... :-)

----------------------------------------------------------------------

Comment By: Christopher A. Craig (ccraig)
Date: 2002-05-25 12:16

Message:
Logged In: YES
user_id=135050

I just uploaded a graph with some sample timings in it. Red is a fence of 20. Green is a fence of 40. Blue is a fence of 60. Black is done with unmodified Python 2.2.1.

----------------------------------------------------------------------

Comment By: Christopher A. Craig (ccraig)
Date: 2002-05-25 01:53

Message:
Logged In: YES
user_id=135050

I got 40 from testing.
Basically I generated 250 random numbers each for a series of sizes between 5 and 2990 bits long at 15 bit intervals (i.e. the word size), and stored it in a dictionary. Then timed 249 multiplies at each size for a bunch of fence values and used gdchart to make a pretty graph.

It certainly could be optimized better per compiler/platform, but I don't know how much gain you'd see.

I split on the smaller number because I guessed it would be better. My thought was that if I split on the smaller number I'm guaranteed to reach the fence, at which point I can use the gradeschool method at a near linear cost (since it's O(n*m) and one of those two is at most the fence size). If I split on the larger number, I may run into a condition where the smaller number is less than half the larger, but I haven't reached the fence yet, and then gradeschool could be much more expensive.

----------------------------------------------------------------------

Comment By: Christian Tismer (tismer)
Date: 2002-05-24 23:23

Message:
Logged In: YES
user_id=105700

Hmm, not bad.

Q: You set the split fence at 40. Where does this number come from? I think this could be optimized per compiler/platform.

You say that you split based on the smaller number. Why this? My intuitive guess would certainly be to always split on the larger number. I just checked my Python implementation which does this.

Open question: how to handle very small by very long the best way? Probably the highschool version is better here, and that might have led you to investigate the smaller one. I'd say both should be checked.

good work!
- cheers chris

----------------------------------------------------------------------

You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=560379&group_id=5470

From noreply@sourceforge.net Tue Aug 13 00:40:08 2002
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Mon, 12 Aug 2002 16:40:08 -0700
Subject: [Patches] [ python-Patches-505705 ] Remove eval in pickle and cPickle
Message-ID:

Patches item #505705, was opened at 2002-01-19 10:21
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=505705&group_id=5470

Category: Core (C code)
Group: None
Status: Open
Resolution: Out of Date
Priority: 5
Submitted By: Martin v. Löwis (loewis)
Assigned to: Martin v. Löwis (loewis)
Summary: Remove eval in pickle and cPickle

Initial Comment:
This patch removes the use of eval in pickle and cPickle. It does so by:
- moving the actual parsing from compile.c:parsestr to PyString_DecodeEscape
- introducing a new codec string-escape
- removing the code that checks that a string-to-unpickle is properly escaped throughout, and replaces this with a check whether it is properly quoted,
- unquoting the string in load_string, then passing it to the codec.

This fixes #502503.

----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2002-08-13 01:40

Message:
Logged In: YES
user_id=21627

New version:
- removed triple quote support in pickle/cPickle
- added Lib/encodings/string_escape.py again
- added PyString_Repr, which takes a smartquotes argument
- recode_encoding is for PEP 263: the parser generates UTF-8 in the abstract syntax, which needs to be re-encoded with the original encoding. Unfortunately, \-escaping and UTF-8 may interleave, hence the convoluted code.
On the Sam Penrose article:

Without patch:
dumping list of 1000 dicts:
dumped: 0.192386031151
loading list of 1000 dicts:
loaded: 2.46496498585
dumping list of 10000 dicts:
dumped: 1.92456102371
loading list of 10000 dicts:
loaded: 24.6884089708

with patch:
dumping list of 1000 dicts:
dumped: 0.201091051102
loading list of 1000 dicts:
loaded: 0.469774007797
dumping list of 10000 dicts:
dumped: 1.94221496582
loading list of 10000 dicts:
loaded: 4.8661159277

So loading speed is up by a factor of 5, for this benchmark.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2002-08-12 03:51

Message:
Logged In: YES
user_id=6380

Closer.

- Why bother stripping triple quotes in the pickle/cPickle load code? These will never happen as a result of a pickle dump AFAIK, and the code you are replacing doesn't accept these either AFAICT.

- There's something missing (the previous version of the patch had it I believe) that's needed to register the codec; as a consequence, pickle.loads() doesn't work.

- escape_encode() uses repr() of a string to do the work. But that means the outcome for embedding string quotes is confusing, because of the "smarts" in repr() that use " for surrounding quotes when there's a ' in the string, and vice versa. Thus, a single quote or a double quote is returned unquoted; but if they both occur in the same string, the single quote is quoted. I don't think that's particularly useful. Maybe there should be an underlying primitive operation that gives you a choice and which is invoked both by escape_encode() and string repr()?

- I don't understand the recode_encoding stuff, but it looks like something like that was present before too. :-)

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2002-08-11 22:47

Message:
Logged In: YES
user_id=21627

Updated to current CVS.
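
Two points from the discussion above can be tried out in a current Python. The sketch below is an illustration only and uses none of the patch's C code: load_quoted_string is a hypothetical helper, it uses the standard unicode_escape codec as a stand-in for the string-escape codec the patch introduces, and it accepts ASCII input only. It follows the same outline as the patch description (check the quoting, strip the quotes, hand the body to a codec) so that eval() is never called. The second function just exposes the quote-picking "smarts" of repr() that the review comment describes.

```python
import codecs


def load_quoted_string(text):
    """Decode a quoted, backslash-escaped string without calling eval().

    Hypothetical helper: verifies that the text is wrapped in matching
    quotes, strips them, and lets a codec interpret the escapes.
    """
    if len(text) < 2 or text[0] not in "'\"" or text[0] != text[-1]:
        raise ValueError("string is not properly quoted")
    body = text[1:-1]
    # unicode_escape stands in for the patch's string-escape codec;
    # restricting to ASCII keeps the byte/character mapping unambiguous.
    return codecs.decode(body.encode("ascii"), "unicode_escape")


def quote_char_used(value):
    """Return the surrounding quote character that repr() picked."""
    return repr(value)[0]
```

For instance, load_quoted_string(r"'a\nb'") gives a three-character string with a real newline in the middle, while an unquoted expression such as __import__('os') is rejected with ValueError instead of being evaluated. quote_char_used("it's") is a double quote, quote_char_used('say "hi"') is a single quote, and when both kinds occur repr() falls back to single quotes and escapes the embedded single quote.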
----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2002-08-11 16:09

Message:
Logged In: YES
user_id=6380

This would fix bug #593656 too.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2002-08-09 16:48

Message:
Logged In: YES
user_id=6380

I like this idea. But the patch is out of date. Can you rework the patch?

How much faster does this make the test program from
http://groups.google.de/groups?hl=en&lr=&ie=UTF-8&selm=mailman.1026940226.16076.python-list%40python.org
???

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2002-08-09 06:12

Message:
Logged In: YES
user_id=6380

I'll review this.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2002-01-19 10:25

Message:
Logged In: YES
user_id=21627

BTW, this patch has #500002 as a prerequisite.

----------------------------------------------------------------------

You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=505705&group_id=5470

From noreply@sourceforge.net Tue Aug 13 00:42:23 2002
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Mon, 12 Aug 2002 16:42:23 -0700
Subject: [Patches] [ python-Patches-505705 ] Remove eval in pickle and cPickle
Message-ID:

Patches item #505705, was opened at 2002-01-19 10:21
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=505705&group_id=5470

Category: Core (C code)
Group: None
Status: Open
Resolution: Out of Date
Priority: 5
Submitted By: Martin v. Löwis (loewis)
>Assigned to: Guido van Rossum (gvanrossum)
Summary: Remove eval in pickle and cPickle

Initial Comment:
This patch removes the use of eval in pickle and cPickle.
It does so by:
- moving the actual parsing from compile.c:parsestr to PyString_DecodeEscape
- introducing a new codec string-escape
- removing the code that checks that a string-to-unpickle is properly escaped throughout, and replaces this with a check whether it is properly quoted,
- unquoting the string in load_string, then passing it to the codec.

This fixes #502503.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2002-08-13 01:40

Message:
Logged In: YES
user_id=21627

New version:
- removed triple quote support in pickle/cPickle
- added Lib/encodings/string_escape.py again
- added PyString_Repr, which takes a smartquotes argument
- recode_encoding is for PEP 263: the parser generates UTF-8 in the abstract syntax, which needs to be re-encoded with the original encoding. Unfortunately, \-escaping and UTF-8 may interleave, hence the convoluted code.

On the Sam Penrose article:

Without patch:
dumping list of 1000 dicts:
dumped: 0.192386031151
loading list of 1000 dicts:
loaded: 2.46496498585
dumping list of 10000 dicts:
dumped: 1.92456102371
loading list of 10000 dicts:
loaded: 24.6884089708

with patch:
dumping list of 1000 dicts:
dumped: 0.201091051102
loading list of 1000 dicts:
loaded: 0.469774007797
dumping list of 10000 dicts:
dumped: 1.94221496582
loading list of 10000 dicts:
loaded: 4.8661159277

So loading speed is up by a factor of 5, for this benchmark.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2002-08-12 03:51

Message:
Logged In: YES
user_id=6380

Closer.

- Why bother stripping triple quotes in the pickle/cPickle load code? These will never happen as a result of a pickle dump AFAIK, and the code you are replacing doesn't accept these either AFAICT.

- There's something missing (the previous version of the patch had it I believe) that's needed to register the codec; as a consequence, pickle.loads() doesn't work.
- escape_encode() uses repr() of a string to do the work. But that means the outcome for embedding string quotes is confusing, because of the "smarts" in repr() that use " for surrounding quotes when there's a ' in the string, and vice versa. Thus, a single quote or a double quote is returned unquoted; but if they both occur in the same string, the single quote is quoted. I don't think that's particularly useful. Maybe there should be an underlying primitive operation that gives you a choice and which is invoked both by escape_encode() and string repr()? - I don't understand the recode_encoding stuff, but it looks like something like that was present before too. :-) ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-08-11 22:47 Message: Logged In: YES user_id=21627 Updated to current CVS. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-11 16:09 Message: Logged In: YES user_id=6380 This would fix bug #593656 too. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-09 16:48 Message: Logged In: YES user_id=6380 I like this idea. But the patch is out of date. Can you rework the patch? How much faster does this make the test program from http://groups.google.de/groups?hl=en&lr=&ie=UTF-8&selm=mailman.1026940226.16076.python-list%40python.org ??? ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-09 06:12 Message: Logged In: YES user_id=6380 I'll review this. ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-01-19 10:25 Message: Logged In: YES user_id=21627 BTW, this patch has #500002 as a prerequisite.
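The approach described in the initial comment of this item (check that the string to unpickle is properly quoted, strip the quotes, then hand the body to an escape-decoding codec instead of eval) can be sketched roughly as follows in current Python. This is only an illustration under stated assumptions, not the code from the patch: the helper name unquote_pickled is hypothetical, and the standard unicode_escape codec stands in for the string-escape codec that the patch introduces.

```python
def unquote_pickled(text):
    """Decode a quoted, backslash-escaped string without calling eval().

    Hypothetical helper for illustration; not taken from the patch.
    """
    # The string must be properly quoted: at least two characters,
    # with the same quote character at both ends.
    if len(text) < 2 or text[0] != text[-1] or text[0] not in "'\"":
        raise ValueError("insecure string pickle")
    body = text[1:-1]
    # Assumption: the body is ASCII plus backslash escapes. The
    # unicode_escape codec only interprets escapes; it never runs code.
    return body.encode("latin-1").decode("unicode_escape")


print(repr(unquote_pickled("'a\\nb'")))
```

Because the codec only interprets escape sequences, input such as an unquoted expression is rejected by the quoting check rather than evaluated.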
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=505705&group_id=5470 From noreply@sourceforge.net Tue Aug 13 00:48:40 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 12 Aug 2002 16:48:40 -0700 Subject: [Patches] [ python-Patches-585913 ] Adds Galeon support to webbrowser.py Message-ID: Patches item #585913, was opened at 2002-07-24 15:27 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=585913&group_id=5470 Category: Library (Lib) Group: None Status: Open Resolution: None Priority: 5 Submitted By: Greg Copeland (oracle) Assigned to: Nobody/Anonymous (nobody) Summary: Adds Galeon support to webbrowser.py Initial Comment: Simple context diff against current CVS tree to add support for Galeon to webbrowser.py ---------------------------------------------------------------------- >Comment By: Martin v. Löwis (loewis) Date: 2002-08-13 01:48 Message: Logged In: YES user_id=21627 If the patch has been accepted, its Resolution will be set to accepted, and SF will send you a message - so no, it hasn't been accepted yet. I still haven't found the time to compare the two patches, and nobody else has presented any clear analysis of the relative qualities, so I still don't know which one to accept - apparently, nobody else has looked at them, either. ---------------------------------------------------------------------- Comment By: Greg Copeland (oracle) Date: 2002-08-12 20:12 Message: Logged In: YES user_id=40173 Has the patch been accepted? What's the standard way of letting people know if a patch has been accepted or rejected? ---------------------------------------------------------------------- Comment By: Greg Copeland (oracle) Date: 2002-07-26 21:56 Message: Logged In: YES user_id=40173 Not really sure. I assume it's just a second patch by another author.
What can I say, day late and a dollar short. ;) Having looked at the other patch, it appears mine is a little more well rounded/complete/feature rich, if only slightly. I invite you to take a look for yourself. I'm also not sure what version of webbrowser.py the other patch is against. My patch is against the CVS version so it will be a breeze to apply. Enjoy! ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-07-26 20:54 Message: Logged In: YES user_id=21627 How does this relate to https://sourceforge.net/tracker/index.php?func=detail&aid=586437&group_id=5470&atid=305470 ? ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=585913&group_id=5470 From noreply@sourceforge.net Tue Aug 13 12:44:59 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Tue, 13 Aug 2002 04:44:59 -0700 Subject: [Patches] [ python-Patches-588564 ] _locale library patch Message-ID: Patches item #588564, was opened at 2002-07-30 05:58 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=588564&group_id=5470 Category: Distutils and setup.py Group: None Status: Open Resolution: None Priority: 5 Submitted By: Jason Tishler (jlt63) >Assigned to: Martin v. Löwis (loewis) Summary: _locale library patch Initial Comment: This patch enables setup.py to find gettext routines when they are located in libintl instead of libc. Although I developed this patch for Cygwin, I hope that it can be easily updated to support other platforms (if necessary). I tested this patch under Cygwin and Red Hat Linux 7.1. ---------------------------------------------------------------------- Comment By: Jason Tishler (jlt63) Date: 2002-08-12 03:53 Message: Logged In: YES user_id=86216 I'm a strong proponent of doing the right thing. The unfortunate reality is that I'm way overcommitted right now.
Hence, I don't have the spare cycles to figure out the best way to accomplish this "major undertaking" (to quote you). Additionally, one of the distutils developers could do a better job with much less effort than I could. Please reconsider my original patch. I'm willing to change it (in the future) to be more autoconf-like if someone else is willing to do the underlying work. ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-08-09 15:49 Message: Logged In: YES user_id=21627 Yes, that's what I meant. It eventually results in distutils getting some of the capabilities of autoconf. I agree this is a major undertaking, but one that I think needs to progress over time, in small steps. For the current problem, it might be useful to emulate AC_TRY_LINK: generate a small program, and see whether the compiler manages to link it. You probably need to allow for result-caching as well; I recommend to put the cache file into build/temp.. This may all sound very ad-hoc, but I think it can be made to work with reasonable effort. We probably need to present any results to distutils-sig before committing them. ---------------------------------------------------------------------- Comment By: Jason Tishler (jlt63) Date: 2002-08-09 10:04 Message: Logged In: YES user_id=86216 I presume that you mean to use an autoconf-style approach *in* setup.py. Is this assumption correct? If so, then I know how to search for libintl.h via find_file(). Unfortunately, I do not know how to check that a function (e.g., gettext()) is in a library (i.e., libc.a). Any suggestions? ---------------------------------------------------------------------- Comment By: Martin v.
Löwis (loewis) Date: 2002-08-04 00:37 Message: Logged In: YES user_id=21627 I would really prefer if such problems were solved in an autoconf-style approach: - is libintl.h present (you may ask pyconfig.h for that) - if so, is gettext provided by the C library - if not, is it provided by -lintl - if yes, add -lintl - if no, print an error message and continue ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=588564&group_id=5470 From noreply@sourceforge.net Tue Aug 13 19:22:35 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Tue, 13 Aug 2002 11:22:35 -0700 Subject: [Patches] [ python-Patches-511219 ] suppress type restrictions on locals() Message-ID: Patches item #511219, was opened at 2002-01-31 15:55 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=511219&group_id=5470 Category: Core (C code) Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Cesar Douady (douady) Assigned to: Nobody/Anonymous (nobody) Summary: suppress type restrictions on locals() Initial Comment: This patch suppresses the restriction that global and local dictionaries do not access overloaded __getitem__ and __setitem__ if passed an object derived from class dict. An exception is made for the builtin insertion and reference in the global dict to make sure this object exists and to suppress the need for the derived class to take care of this implementation dependent detail. The behavior of eval and exec has been updated for code objects which have the CO_NEWLOCALS flag set : if explicitly passed a local dict, a new local dict is not generated. This allows one to pass an explicit local dict to the code object of a function (which otherwise cannot be achieved).
If this cannot be done for backward compatibility problems, then an alternative would consist in using the "new" module to create a code object from a function with CO_NEWLOCALS reset but it seems logical to me to use the information explicitly provided. Free and cell variables are not managed in this version. If the patch is accepted, I am willing to finish the job and implement free and cell variables, but this requires a serious rework of the Cell object: free variables should be accessed using the method of the dict in which they reside and today, this dict is not accessible from the Cell object. Robustness : Currently, the plain test suite passes (with a modification of test_descrtut which precisely verifies that the suppressed restriction is enforced). I have introduced a new test (test_subdict.py) which verifies the new behavior. Because of performance, the plain case (when the local dict is a plain dict) is optimized so that differences in performance are not measurable (within 1%) when run on the test suite (i.e. I timed make test). ---------------------------------------------------------------------- >Comment By: Cesar Douady (douady) Date: 2002-08-13 20:22 Message: Logged In: YES user_id=428521 I am personally in favor of rejecting it for the following reasons : - I managed to work around this restriction (so that I could keep using the standard python) - I am not 100% sure the patch I proposed is bullet proof. - It is partial anyway (some combinations of nested scopes with non standard local dicts are not implemented). - It is very complicated. - It does not seem a lot of people are interested. ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-08-12 11:15 Message: Logged In: YES user_id=21627 What is the status of this patch? Could you find people who are interested in using this feature? If not, I'm tempted to reject it.
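For readers wondering what "accessing overloaded __getitem__ and __setitem__" on a locals dict looks like in practice, here is a small sketch. It does not reproduce the patch discussed in this item; it only shows, as far as I understand current CPython, that exec() consults a dict subclass passed as the locals argument for name lookups and stores in module-level code. The class name TracingDict is made up for this illustration.

```python
class TracingDict(dict):
    """Hypothetical dict subclass that records name reads and writes."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.reads = []
        self.writes = []

    def __getitem__(self, key):
        self.reads.append(key)
        return super().__getitem__(key)

    def __setitem__(self, key, value):
        self.writes.append(key)
        super().__setitem__(key, value)


namespace = TracingDict(x=41)
# Globals is a plain dict; only the locals mapping is the subclass.
exec("y = x + 1", {}, namespace)
print(namespace.reads, namespace.writes)
```

Note that this covers only plain name access in the executed code; free and cell variables, which the initial comment says are not handled, are outside the scope of this sketch.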
---------------------------------------------------------------------- Comment By: Cesar Douady (douady) Date: 2002-04-02 20:08 Message: Logged In: YES user_id=428521 Well, I think I am in sync now. 1/ I did take your initial comment as meaning the patch could not be applied to 2.2.x 2/ I decided to generate a new patch to be applied to 2.2.1c2 3/ I realized that the patch could be applied as is 4/ I was lost 5/ I realized the meaning of the group was the one you just mentioned. 6/ I decided to post the result of my trial anyway so people could confidently apply the patch to the latest release (especially because patch outputs some warnings). 7/ I did not understand this place could actually be used as a forum (i.e. reply to previous post rather than general info). Let me apologize for my previous misunderstandings. about compatibility : I did not find a way to make it backward binary compatible, however my intent is to make it source compatible for extensions. ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-04-02 16:26 Message: Logged In: YES user_id=6656 So what? Maybe you misunderstand me. This patch was in the group "Python 2.2.x", which is the group we use for patches that are under consideration for being put into a 2.2.x release of Python (or in other words, a bugfix release of Python 2.2). This patch is not going to go into a bugfix release of Python 2.2 for at least two reasons: (1) it adds what is arguably a new feature and (2) it's big and complicated and so might cause bugs. And now I've actually looked at the patch, it has even less chance: it would break binary compatibility of extensions. So while I'm not against the patch in general (looks good, from an eyeballing), it doesn't belong in the 2.2.x group.
---------------------------------------------------------------------- Comment By: Cesar Douady (douady) Date: 2002-04-02 13:21 Message: Logged In: YES user_id=428521 I successfully applied the patch as is to revision 2.2.1c2 with the following output (and then the same procedure as mentioned for patching revision 2.2) : patching file Include/dictobject.h patching file Include/frameobject.h patching file Include/object.h patching file Lib/test/test_descrtut.py patching file Lib/test/test_subdict.py patching file Modules/cPickle.c patching file Objects/classobject.c patching file Objects/frameobject.c patching file Python/ceval.c Hunk #2 succeeded at 1534 (offset 3 lines). Hunk #4 succeeded at 1613 (offset 3 lines). Hunk #6 succeeded at 1655 (offset 3 lines). Hunk #8 succeeded at 1860 (offset 3 lines). Hunk #10 succeeded at 1889 (offset 3 lines). Hunk #12 succeeded at 2635 (offset 3 lines). Hunk #14 succeeded at 2893 (offset 3 lines). Hunk #16 succeeded at 3038 (offset 3 lines). Hunk #18 succeeded at 3657 (offset 3 lines). Hunk #20 succeeded at 3722 (offset 3 lines). patching file Python/compile.c Hunk #1 succeeded at 2916 (offset 12 lines). patching file Python/import.c Hunk #1 succeeded at 1668 (offset -4 lines). Hunk #3 succeeded at 1716 (offset -4 lines). patching file Python/sysmodule.c Hunk #1 succeeded at 238 (offset -4 lines). ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-03-30 12:27 Message: Logged In: YES user_id=6656 And there's precisely no way it's going into 2.2.x. 
---------------------------------------------------------------------- Comment By: Cesar Douady (douady) Date: 2002-03-30 01:08 Message: Logged In: YES user_id=428521 to install this patch from python revision 2.2, follow these steps : - get the python.diff file from this page - cd Python-2.2 - run "patch -p1 Patches item #511219, was opened at 2002-01-31 15:55 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=511219&group_id=5470 Category: Core (C code) Group: Python 2.3 >Status: Closed >Resolution: Rejected Priority: 5 Submitted By: Cesar Douady (douady) Assigned to: Nobody/Anonymous (nobody) Summary: suppress type restrictions on locals() Initial Comment: This patch suppresses the restriction that global and local dictionaries do not access overloaded __getitem__ and __setitem__ if passed an object derived from class dict. An exception is made for the builtin insertion and reference in the global dict to make sure this object exists and to suppress the need for the derived class to take care of this implementation dependent detail. The behavior of eval and exec has been updated for code objects which have the CO_NEWLOCALS flag set : if explicitely passed a local dict, a new local dict is not generated. This allows one to pass an explicit local dict to the code object of a function (which otherwise cannot be achieved). If this cannot be done for backward compatibility problems, then an alternative would consist in using the "new" module to create a code object from a function with CO_NEWLOCALS reset but it seems logical to me to use the information explicitely provided. Free and cell variables are not managed in this version. If the patch is accepted, I am willing to finish the job and implement free and cell variables, but this requires a serious rework of the Cell object: free variables should be accessed using the method of the dict in which they relies and today, this dict is not accessible from the Cell object. 
Robustness : Currently, the plain test suite passes (with a modification of test_desctut which precisely verifies that the suppressed restriction is enforced). I have introduced a new test (test_subdict.py) which verifies the new behavior. Because of performance, the plain case (when the local dict is a plain dict) is optimized so that differences in performance are not measurable (within 1%) when run on the test suite (i.e. I timed make test). ---------------------------------------------------------------------- >Comment By: Martin v. L�wis (loewis) Date: 2002-08-13 20:25 Message: Logged In: YES user_id=21627 Thanks for the update. Closing it. ---------------------------------------------------------------------- Comment By: Cesar Douady (douady) Date: 2002-08-13 20:22 Message: Logged In: YES user_id=428521 I am personally in favor of rejecting it for the following reasons : - I managed to work around this restriction (so that I could keep using the standard python) - I am not 100% sure the patch I proposed is bullet proof. - It is partial anyway (some combination of nested scopes with non standard local dicts are not implemented). - It is very complicated. - It does not seem a lot of people is interested. ---------------------------------------------------------------------- Comment By: Martin v. L�wis (loewis) Date: 2002-08-12 11:15 Message: Logged In: YES user_id=21627 What is the status of this patch? Could you find people who are interested in using this feature? If not, I'm tempted to reject it. ---------------------------------------------------------------------- Comment By: Cesar Douady (douady) Date: 2002-04-02 20:08 Message: Logged In: YES user_id=428521 Well, I think I am in sync now. 
1/ I did take you initial comment as meaning the patch could not be applied to 2.2.x 2/ I decided to generate a new patch to be applied to 2.2.1c2 3/ I realized that the patch could be applied as is 4/ I was lost 5/ I realized the meaning of the group was the one you just mentioned. 6/ I decided to post the result of my trial anyway so people could confidently apply the patch the lastest release (specially because patch outputs some warnings). 7/ I did not understand this place could actually be used as a forum (i.e. reply to previous post rather than general info). Let me apologize for my previous misunderstandings. about compatibility : I did not find a way to make it backward binary compatible, however my intent is to make it source compatible for extensions. ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-04-02 16:26 Message: Logged In: YES user_id=6656 So what? Maybe you misunderstand me. This patch was in the group "Python 2.2.x", which is the group we use for patches that are under consideration for being put into a 2.2.x release of Python (or in other words, a bugfix release of Python 2.2). This patch is not going to go into a bugfix release of Python 2.2 for at least two reasons: (1) it adds what is arguably a new feature and (2) it's big and complicated and so might cause bugs. And now I've actually looked at the patch, it has even less chance: it would break binary compaitibilty of extensions. So while I'm not against the patch in general (looks good, from an eyballing), it doesn't belong in the 2.2.x group. 
---------------------------------------------------------------------- Comment By: Cesar Douady (douady) Date: 2002-04-02 13:21 Message: Logged In: YES user_id=428521 I successfully applied the patch as is to revision 2.2.1c2 with the following output (and then the same procedure as mentioned for patching revision 2.2) : patching file Include/dictobject.h patching file Include/frameobject.h patching file Include/object.h patching file Lib/test/test_descrtut.py patching file Lib/test/test_subdict.py patching file Modules/cPickle.c patching file Objects/classobject.c patching file Objects/frameobject.c patching file Python/ceval.c Hunk #2 succeeded at 1534 (offset 3 lines). Hunk #4 succeeded at 1613 (offset 3 lines). Hunk #6 succeeded at 1655 (offset 3 lines). Hunk #8 succeeded at 1860 (offset 3 lines). Hunk #10 succeeded at 1889 (offset 3 lines). Hunk #12 succeeded at 2635 (offset 3 lines). Hunk #14 succeeded at 2893 (offset 3 lines). Hunk #16 succeeded at 3038 (offset 3 lines). Hunk #18 succeeded at 3657 (offset 3 lines). Hunk #20 succeeded at 3722 (offset 3 lines). patching file Python/compile.c Hunk #1 succeeded at 2916 (offset 12 lines). patching file Python/import.c Hunk #1 succeeded at 1668 (offset -4 lines). Hunk #3 succeeded at 1716 (offset -4 lines). patching file Python/sysmodule.c Hunk #1 succeeded at 238 (offset -4 lines). ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-03-30 12:27 Message: Logged In: YES user_id=6656 And there's precisely no way it's going into 2.2.x. 
---------------------------------------------------------------------- Comment By: Cesar Douady (douady) Date: 2002-03-30 01:08 Message: Logged In: YES user_id=428521 to install this patch from python revision 2.2, follow these steps : - get the python.diff file from this page - cd Python-2.2 - run "patch -p1 Patches item #505705, was opened at 2002-01-19 04:21 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=505705&group_id=5470 Category: Core (C code) Group: None Status: Open >Resolution: Accepted Priority: 5 Submitted By: Martin v. L�wis (loewis) >Assigned to: Martin v. L�wis (loewis) Summary: Remove eval in pickle and cPickle Initial Comment: This patch removes the use of eval in pickle and cPickle. It does so by: - moving the actual parsing from compile.c:parsestr to PyString_DecodeEscape - introducing a new codec string-escape - removing the code that checks that a string-to-unpickle is properly escaped throughout, and replaces this with a check whether it is properly quoted, - unquoting the string in load_string, then passing it to the codec. This fixes #502503. ---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-13 16:43 Message: Logged In: YES user_id=6380 This looks good. I measured a similar speedup. Go for it! ---------------------------------------------------------------------- Comment By: Martin v. L�wis (loewis) Date: 2002-08-12 19:40 Message: Logged In: YES user_id=21627 New version: - removed tripple quote support in pickle/cPickle - added Lib/encodings/string_escape.py again - added PyString_Repr, which takes a smartquotes argument - recode_encoding is for PEP 263: the parser generates UTF-8 in the abstract syntax, which needs to be re-encoded with the original encoding. Unfortunately, \-escaping and UTF-8 may interleave, hence the convoluted code. 
On the Sam Penrose article: Without patch: dumping list of 1000 dicts: dumped: 0.192386031151 loading list of 1000 dicts: loaded: 2.46496498585 dumping list of 10000 dicts: dumped: 1.92456102371 loading list of 10000 dicts: loaded: 24.6884089708 with patch: dumping list of 1000 dicts: dumped: 0.201091051102 loading list of 1000 dicts: loaded: 0.469774007797 dumping list of 10000 dicts: dumped: 1.94221496582 loading list of 10000 dicts: loaded: 4.8661159277 So loading speed is up by a factor of 5, for this benchmark. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-11 21:51 Message: Logged In: YES user_id=6380 Closer. - Why bother stripping triple quotes in the pickle/cPickle load code? These will never happen as a result of a pickle dump AFAIK, and the code you are replacing doesn't accept these either AFAICT. - There's something missing (the previous version of the patch had it I believe) that's needed to register the codec; as a consequence, pickle.loads() doesn't work. - escape_encode() uses repr() of a string to do the work. But that means the outcome for embedding string quotes is confusing, because of the "smarts" in repr() that use " for surrounding quotes when there's a ' in the string, and vice versa. Thus, a single quote or a double quote is returned unquoted; but if they both occur in the same string, the single quote is quoted. I don't think that's particularly useful. Maybe there should be an underlying primitive operation that gives you a choice and which is invoked both by escape_encode() and string repr()? - I don't understand the recode_encoding stuff, but it looks like something like that was present before too. :-) ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-08-11 16:47 Message: Logged In: YES user_id=21627 Updated to current CVS.
---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-11 10:09 Message: Logged In: YES user_id=6380 This would fix bug #593656 too. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-09 10:48 Message: Logged In: YES user_id=6380 I like this idea. But the patch is out of date. Can you rework the patch? How much faster does this make the test program from http://groups.google.de/groups?hl=en&lr=&ie=UTF-8&selm=mailman.1026940226.16076.python-list%40python.org ??? ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-09 00:12 Message: Logged In: YES user_id=6380 I'll review this. ---------------------------------------------------------------------- Comment By: Martin v. L�wis (loewis) Date: 2002-01-19 04:25 Message: Logged In: YES user_id=21627 BTW, this patch has #500002 as a prerequisite. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=505705&group_id=5470 From noreply@sourceforge.net Wed Aug 14 03:48:41 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Tue, 13 Aug 2002 19:48:41 -0700 Subject: [Patches] [ python-Patches-594869 ] Nuke 32-bit-isms in gettext Message-ID: Patches item #594869, was opened at 2002-08-13 22:48 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=594869&group_id=5470 Category: Library (Lib) Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Tim Peters (tim_one) Assigned to: Barry A. Warsaw (bwarsaw) Summary: Nuke 32-bit-isms in gettext Initial Comment: I don't know how to test this module, so assigning to Barry. 
The intent is to get away from treating 32-bit bit patterns as signed ints in gettext.py, in particular to stop the current warning msgs whenever this test is run. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=594869&group_id=5470 From noreply@sourceforge.net Wed Aug 14 09:14:33 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Wed, 14 Aug 2002 01:14:33 -0700 Subject: [Patches] [ python-Patches-505705 ] Remove eval in pickle and cPickle Message-ID: Patches item #505705, was opened at 2002-01-19 10:21 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=505705&group_id=5470 Category: Core (C code) Group: None >Status: Closed Resolution: Accepted Priority: 5 Submitted By: Martin v. Löwis (loewis) Assigned to: Martin v. Löwis (loewis) Summary: Remove eval in pickle and cPickle Initial Comment: This patch removes the use of eval in pickle and cPickle. It does so by: - moving the actual parsing from compile.c:parsestr to PyString_DecodeEscape - introducing a new codec string-escape - removing the code that checks that a string-to-unpickle is properly escaped throughout, and replaces this with a check whether it is properly quoted, - unquoting the string in load_string, then passing it to the codec. This fixes #502503. ---------------------------------------------------------------------- >Comment By: Martin v. Löwis (loewis) Date: 2002-08-14 10:14 Message: Logged In: YES user_id=21627 Applied as stringobject.h 2.36; pickle.py 1.69; string_escape.py 1.1; pickletester.py 1.18; _codecsmodule.c 2.14; cPickle.c 2.95; stringobject.c 2.177; compile.c 2.256; It turns out that #593656 isn't fixed: nobody checks the trailing \, so it will be unpickled as '\\x00' (consistently in pickle and cPickle, though).
---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-13 22:43 Message: Logged In: YES user_id=6380 This looks good. I measured a similar speedup. Go for it! ---------------------------------------------------------------------- Comment By: Martin v. L�wis (loewis) Date: 2002-08-13 01:40 Message: Logged In: YES user_id=21627 New version: - removed tripple quote support in pickle/cPickle - added Lib/encodings/string_escape.py again - added PyString_Repr, which takes a smartquotes argument - recode_encoding is for PEP 263: the parser generates UTF-8 in the abstract syntax, which needs to be re-encoded with the original encoding. Unfortunately, \-escaping and UTF-8 may interleave, hence the convoluted code. On the Sam Penrose article: Without patch: dumping list of 1000 dicts: dumped: 0.192386031151 loading list of 1000 dicts: loaded: 2.46496498585 dumping list of 10000 dicts: dumped: 1.92456102371 loading list of 10000 dicts: loaded: 24.6884089708 with patch: dumping list of 1000 dicts: dumped: 0.201091051102 loading list of 1000 dicts: loaded: 0.469774007797 dumping list of 10000 dicts: dumped: 1.94221496582 loading list of 10000 dicts: loaded: 4.8661159277 So loading speed is up by a factor of 5, for this benchmark. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-12 03:51 Message: Logged In: YES user_id=6380 Closer. - Why bother stripping triple quotes in the pickle/cPickle load code? These will never happen as a result of a pickle dump AFAIK, and the code you are replacing doesn't accept these either AFAICT. - There's something missing (the previous version of the patch had it I believe) that's needed to register the codec; as a consequence, pickle.loads() doesn't work. - escape_encode() uses repr() of a string to do the work. 
But that means the outcome for embedding string quotes is confusing, because of the "smarts" in repr() that use " for surrounding quotes when there's a ' in the string, and vice versa. Thus, a single quote or a double quote is returned unquoted; but if they both occur in the same string, the single quote is quoted. I don't think that's particularly useful. Maybe there should be an underlying primitive operation that gives you a choice and which is invoked both by escape_encode() and string repr()? - I don't understand the recode_encoding stuff, but it looks like something like that was present before too. :-) ---------------------------------------------------------------------- Comment By: Martin v. L�wis (loewis) Date: 2002-08-11 22:47 Message: Logged In: YES user_id=21627 Updated to current CVS. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-11 16:09 Message: Logged In: YES user_id=6380 This would fix bug #593656 too. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-09 16:48 Message: Logged In: YES user_id=6380 I like this idea. But the patch is out of date. Can you rework the patch? How much faster does this make the test program from http://groups.google.de/groups?hl=en&lr=&ie=UTF-8&selm=mailman.1026940226.16076.python-list%40python.org ??? ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-09 06:12 Message: Logged In: YES user_id=6380 I'll review this. ---------------------------------------------------------------------- Comment By: Martin v. L�wis (loewis) Date: 2002-01-19 10:25 Message: Logged In: YES user_id=21627 BTW, this patch has #500002 as a prerequisite. 
----------------------------------------------------------------------

You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=505705&group_id=5470

From noreply@sourceforge.net Wed Aug 14 10:07:53 2002
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Wed, 14 Aug 2002 02:07:53 -0700
Subject: [Patches] [ python-Patches-593560 ] bugfixes and cleanup for _strptime.py
Message-ID:

Patches item #593560, was opened at 2002-08-10 19:01
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=593560&group_id=5470

Category: Library (Lib)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Brett Cannon (bcannon)
Assigned to: Nobody/Anonymous (nobody)
Summary: bugfixes and cleanup for _strptime.py

Initial Comment:
Discovered two bugs in _strptime.py thanks to Mikael Sch?berg of AB Strakt; both were in LocaleTime.__calc_date_time(). The first: if a locale-specific format string represented the month without a leading zero, it would not be caught. The second occurred when a locale simply lacked some information (in this case, Swedish's lack of an AM/PM representation); IndexError was thrown because string.replace() was being called with the empty string as the old value.

I also took this opportunity to clean up some of the code (namely TimeRE.__getitem__() along with LocaleTime.__calc_date_time()). Added some comments, reformatted some code, etc. All of this was brought on thanks to the Python Cookbook's chapter 1 (good work Alex and David!).

I have updated test_strptime.py to check for the second of the mentioned bugs explicitly. I also commented the code and added a function that creates a PyUnit test suite with all of the tests.
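
[Editorial illustration] The second bug comes down to calling a replace operation with an empty "old" value when the locale has no text for some field. A minimal sketch of the guard follows, written for Python 3. The helper name `substitute_directives` and the sample data are hypothetical and are not taken from _strptime.py.

```python
def substitute_directives(sample, replacements):
    """Replace locale-specific substrings in `sample` with format directives.

    `replacements` is a list of (old, directive) pairs. Pairs whose `old`
    is empty (the locale has no such text, e.g. no AM/PM marker) are skipped
    instead of being handed to str.replace().
    """
    for old, directive in replacements:
        if not old:
            continue  # nothing to look for in this locale
        sample = sample.replace(old, directive)
    return sample


# Hypothetical sample: a time string from a locale without an AM/PM marker.
formatted = "22:44:55"
pairs = [("", "%p"), ("22", "%H"), ("44", "%M"), ("55", "%S")]
print(substitute_directives(formatted, pairs))
```

Skipping the empty pair leaves the remaining substitutions untouched, so a locale that lacks AM/PM still yields a usable format string.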
----------------------------------------------------------------------

>Comment By: Brett Cannon (bcannon)
Date: 2002-08-14 02:07

Message:
Logged In: YES
user_id=357491

Just as a follow-up, I got an email from Mikael on Mon., 2002-08-12, letting me know that the patch seems to have worked for the bug he discovered.

----------------------------------------------------------------------

Comment By: Brett Cannon (bcannon)
Date: 2002-08-11 15:16

Message:
Logged In: YES
user_id=357491

Sorry, Martin. I thought I remembered reading somewhere that for Python files you can just post the whole thing. I will stop doing that.

As for Mikael and the patch, he says that it appears to be working. I gave it to him on Tuesday and he said it appeared to be working; he has yet to say otherwise. If you prefer, I can have him post here to verify this.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2002-08-11 10:47

Message:
Logged In: YES
user_id=21627

Please don't post complete files. Instead, post context (-c) or unified (-u) diffs. Ideally, produce them with "cvs diff", as this will result in patches that record the CVS version number they were for.

I think it would be good to get a comment from Mikael on that patch.

----------------------------------------------------------------------

Comment By: Brett Cannon (bcannon)
Date: 2002-08-10 20:31

Message:
Logged In: YES
user_id=357491

Just when you thought you had something done, tim_one had to go and normalize the whitespace in both _strptime.py and test_strptime.py! =) So to save Tim the time and effort of having to normalize the files again, I went ahead and applied them to the fixed files. I also reformatted test_strptime.py so that lines wrapped around 80 characters (didn't realize Guido had added it to the distro until today).

So make sure to use the files that specify whitespace normalization in their descriptions.
----------------------------------------------------------------------

You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=593560&group_id=5470

From noreply@sourceforge.net Wed Aug 14 10:21:39 2002
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Wed, 14 Aug 2002 02:21:39 -0700
Subject: [Patches] [ python-Patches-588564 ] _locale library patch
Message-ID:

Patches item #588564, was opened at 2002-07-30 15:58
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=588564&group_id=5470

Category: Distutils and setup.py
Group: None
Status: Open
>Resolution: Accepted
Priority: 5
Submitted By: Jason Tishler (jlt63)
>Assigned to: Jason Tishler (jlt63)
Summary: _locale library patch

Initial Comment:
This patch enables setup.py to find gettext routines when they are located in libintl instead of libc. Although I developed this patch for Cygwin, I hope that it can be easily updated to support other platforms (if necessary).

I tested this patch under Cygwin and Red Hat Linux 7.1.

----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2002-08-14 11:21

Message:
Logged In: YES
user_id=21627

Ok, I'll accept the patch - please apply it.

As for distutils developers - there are none left, so unless interested users develop it, it is stuck right where it is now.

----------------------------------------------------------------------

Comment By: Jason Tishler (jlt63)
Date: 2002-08-12 13:53

Message:
Logged In: YES
user_id=86216

I'm a strong proponent of doing the right thing. The unfortunate reality is that I'm way overcommitted right now. Hence, I don't have the spare cycles to figure out the best way to accomplish this "major undertaking" (to quote you). Additionally, one of the distutils developers could do a better job with much less effort than I could.

Please reconsider my original patch.
I'm willing to change it (in the future) to be more autoconf-like if someone else is willing to do the underlying work.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2002-08-10 01:49

Message:
Logged In: YES
user_id=21627

Yes, that's what I meant. It eventually results in distutils getting some of the capabilities of autoconf. I agree this is a major undertaking, but one that I think needs to progress over time, in small steps.

For the current problem, it might be useful to emulate AC_TRY_LINK: generate a small program, and see whether the compiler manages to link it. You probably need to allow for result-caching as well; I recommend putting the cache file into build/temp..

This may all sound very ad-hoc, but I think it can be made to work with reasonable effort. We probably need to present any results to distutils-sig before committing them.

----------------------------------------------------------------------

Comment By: Jason Tishler (jlt63)
Date: 2002-08-09 20:04

Message:
Logged In: YES
user_id=86216

I presume that you mean to use an autoconf-style approach *in* setup.py. Is this assumption correct?

If so, then I know how to search for libintl.h via find_file(). Unfortunately, I do not know how to check that a function (e.g., gettext()) is in a library (i.e., libc.a). Any suggestions?

----------------------------------------------------------------------

Comment By: Martin v.
Löwis (loewis)
Date: 2002-08-04 10:37

Message:
Logged In: YES
user_id=21627

I would really prefer if such problems were solved in an autoconf-style approach:
- is libintl.h present (you may ask pyconfig.h for that)
- if so, is gettext provided by the C library
- if not, is it provided by -lintl
- if yes, add -lintl
- if no, print an error message and continue

----------------------------------------------------------------------

You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=588564&group_id=5470

From noreply@sourceforge.net Wed Aug 14 11:23:23 2002
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Wed, 14 Aug 2002 03:23:23 -0700
Subject: [Patches] [ python-Patches-432401 ] unicode encoding error callbacks
Message-ID:

Patches item #432401, was opened at 2001-06-12 15:43
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=432401&group_id=5470

Category: Core (C code)
Group: None
Status: Open
Resolution: Postponed
Priority: 6
Submitted By: Walter Dörwald (doerwalter)
Assigned to: M.-A. Lemburg (lemburg)
Summary: unicode encoding error callbacks

Initial Comment:
This patch adds unicode error handling callbacks to the encode functionality. With this patch it's possible to pass not only 'strict', 'ignore' or 'replace' as the errors argument to encode, but also a callable that will be called with the encoding name, the original unicode object and the position of the unencodable character. The callback must return a replacement unicode object that will be encoded instead of the original character.

For example, replacing unencodable characters with XML character references can be done in the following way.
u"a�o�u��".encode( "ascii", lambda enc, uni, pos: u"&#x%x;" % ord(uni[pos]) ) ---------------------------------------------------------------------- >Comment By: Walter D�rwald (doerwalter) Date: 2002-08-14 12:23 Message: Logged In: YES user_id=89016 This new version diff13.txt moves the initialization of codec.strict_errors etc. from Modules/_codecsmodule.c to Lib/codecs.py. The error logic for the accessor function is inverted (now its 0 for success and -1 for error). Updated the prototypes to use the new PyAPI_FUNC macro. Enhanced the docstrings for str.(de|en)code and unicode.encode. There seems to be a new string decoding function PyString_DecodeEscape in current CVS. This function has to be updated too. ---------------------------------------------------------------------- Comment By: Walter D�rwald (doerwalter) Date: 2002-07-26 17:41 Message: Logged In: YES user_id=89016 The attached new version of the test script add test for wrong parameter passed to the callbacks or wrong results returned from the callback. It also add tests to the long string tests for copies of the builtin error handlers, so the codec does not recognize the name and goes through the general callback machinery. UTF-7 decoding still has a flaw inherited from the current implementation: >>> "+xxx".decode("utf-7") Traceback (most recent call last): File "", line 1, in ? UnicodeDecodeError: 'utf7' codec can't decode bytes in position 0-3: unterminated shift sequence *>>> "+xxx".decode("utf-7", "ignore") u'\uc71c' The decoder should consider the whole sequence "+xxx" as undecodable, so "Ignore" should return an empty string. Currently the correct sequence will be passed to the callback, but the faulty sequence has already been emitted to the result string. ---------------------------------------------------------------------- Comment By: Walter D�rwald (doerwalter) Date: 2002-07-24 21:04 Message: Logged In: YES user_id=89016 Attached is a new version of the test script. 
But we need more tests. UTF-7 is completely untested, and using codecs that pass wrong arguments to the handler and handlers that return wrong or out-of-bounds results is untested too.

----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2002-07-24 20:55

Message:
Logged In: YES
user_id=89016

diff12.txt finally implements the PEP 293 specification (i.e. using exceptions for the communication between codec and handler).

----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2002-05-30 18:30

Message:
Logged In: YES
user_id=89016

diff11.txt fixes two refcounting bugs in codecs.c. speedtest.py is a little test script that checks the speed of various string/encoding/error combinations.

----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2002-05-29 22:50

Message:
Logged In: YES
user_id=89016

This new version diff10.txt fixes a memory overwrite/reallocation bug in PyUnicode_EncodeCharmap and moves the error handling out of PyUnicode_EncodeCharmap. A new version of the test script is included too.

----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2002-05-16 21:06

Message:
Logged In: YES
user_id=89016

OK, PyUnicode_TranslateCharmap is finished too. As the errors argument is again not exposed to Python it can't really be tested. Should we add errors as an optional argument to unicode.translate?

----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2002-05-01 19:57

Message:
Logged In: YES
user_id=89016

OK, PyUnicode_EncodeDecimal is done (diff8.txt), but as the errors argument can't be accessed from Python code, there's not much testing for this.
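
[Editorial illustration] The exception-based protocol mentioned above (PEP 293) is what later shipped as `codecs.register_error`. As a hedged sketch of what a handler looks like under that protocol, here is the XML character reference example from the initial comment rewritten for Python 3. The handler name "sketch-xmlcharref" is made up for this example, and this is not code from any of the attached diffs.

```python
import codecs


def xml_charref(exc):
    """Error handler: replace unencodable characters with XML character references."""
    if not isinstance(exc, UnicodeEncodeError):
        raise exc  # this sketch only handles encoding errors
    bad = exc.object[exc.start:exc.end]  # the slice the codec could not encode
    replacement = "".join("&#x%x;" % ord(ch) for ch in bad)
    return replacement, exc.end  # (replacement text, position to resume at)


# "sketch-xmlcharref" is an arbitrary name chosen for this example.
codecs.register_error("sketch-xmlcharref", xml_charref)

encoded = "a\xe4o\xf6u\xfc".encode("ascii", "sketch-xmlcharref")
print(encoded)
```

The handler receives the exception object, reads `object`, `start` and `end` from it, and returns the replacement together with the position at which encoding should resume. The built-in "xmlcharrefreplace" handler does the same job with decimal references.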
----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2002-04-20 17:34

Message:
Logged In: YES
user_id=89016

A new idea for the interface between the codec and the callback:

Maybe we could have new exception classes UnicodeEncodeError, UnicodeDecodeError and UnicodeTranslateError derived from UnicodeError. They have all the attributes that are passed as an argument tuple in the current version:

string: the original string
start: the start position of the unencodable characters/undecodable bytes
end: the end position+1 of the unencodable characters/undecodable bytes
reason: a string that explains why the encoding/decoding doesn't work

There is no data object, because when a codec wants to pass extended information to the callback it can do this via a derived class.

It might be better to move these attributes to the base class UnicodeError, but this might have backwards compatibility problems.

With this method we really can have one global registry for all callbacks, because for callback names that must work with encoding *and* decoding *and* translating (i.e. "strict", "replace" and "ignore"), the callback can check which type of exception was passed, so "replace" can e.g. look like this:

    def replace(exc):
        if isinstance(exc, UnicodeDecodeError):
            return ("?", exc.end)
        else:
            return (u"?"*(exc.end-exc.start), exc.end)

Another possibility would be to do the communication callback->codec by assigning to attributes of the exception object. The resynchronisation position could even be preassigned to end, so the callback only needs to specify the replacement in most cases:

    def replace(exc):
        if isinstance(exc, UnicodeDecodeError):
            exc.replacement = "?"
        else:
            exc.replacement = u"?"*(exc.end-exc.start)

As many of the assignments can now be done on the C level without having to allocate Python objects (except for the replacement string and the reason), this version might even be faster, especially if we allow the codec to reuse the exception object for the next call to the callback.

Does this make sense, or is this too fancy?

----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2002-04-18 21:24

Message:
Logged In: YES
user_id=89016

And here is the test script (test_codeccallbacks.py)

----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2002-04-18 21:22

Message:
Logged In: YES
user_id=89016

OK, here is the current version of the patch (diff7.txt). PyUnicode_EncodeDecimal and PyUnicode_TranslateCharmap are still missing.

----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2002-04-17 22:50

Message:
Logged In: YES
user_id=89016

> About the difference between encoding
> and decoding: you shouldn't just look
> at the case where you work with Unicode
> and strings, e.g. take the rot-13 codec
> which works on strings only or other
> codecs which translate objects into
> strings and vice-versa.

unicode.encode encodes to str and str.decode decodes to unicode, even for rot-13:

>>> u"gürk".encode("rot13")
't\xfcex'
>>> "gürk".decode("rot13")
u't\xfcex'
>>> u"gürk".decode("rot13")
Traceback (most recent call last):
  File "", line 1, in ?
AttributeError: 'unicode' object has no attribute 'decode'
>>> "gürk".encode("rot13")
Traceback (most recent call last):
  File "", line 1, in ?
File "/home/walter/Python-current- readonly/dist/src/Lib/encodings/rot_13.py", line 18, in encode return codecs.charmap_encode(input,errors,encoding_map) UnicodeError: ASCII decoding error: ordinal not in range (128) Here the str is converted to unicode first, before encode is called, but the conversion to unicode fails. Is there an example where something else happens? > Error handling has to be flexible enough > to handle all these situations. Since > the codecs know best how to handle the > situations, I'd make this an implementation > detail of the codec and leave the > behaviour undefined in the general case. OK, but we should suggest, that for encoding unencodable characters are collected and for decoding seperate byte sequences that are considered broken by the codec are passed to the callback: i.e for decoding the handler will never get all broken data in one call, e.g. for "\u30\Uffffffff".decode("unicode-escape") the handler will be called twice (once for "\u30" and "truncated \u escape" as the reason and once for "\Uffffffff" and "illegal character" as the reason.) > For the existing codecs, backward > compatibility should be maintained, > if at all possible. If the patch gets > overly complicated because of this, > we may have to provide a downgrade solution > for this particular problem (I don't think > replace is used in any computational context, > though, since you can never be sure how > many replacement character do get > inserted, so the case may not be > that realistic). > > Raising an exception for the charmap codec > is the right way to go, IMHO. I would > consider the current behaviour a bug. OK, this is implemented in PyUnicode_EncodeCharmap now, and collecting unencodable characters works too. I completely changed the implementation, because the stack approach would have gotten much more complicated when unencodable characters are collected. 
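
[Editorial illustration] The asymmetry described above (encoders hand a whole run of unencodable characters to the handler, decoders call it once per broken byte sequence) can be observed with a handler that records its calls. This is a sketch for CPython 3 using `codecs.register_error`. The handler name "sketch-record" is hypothetical, and the exact call granularity is a property of the current CPython ASCII codec, not something the thread guarantees.

```python
import codecs

calls = []


def record(exc):
    """Log the slice each call receives, then substitute a placeholder."""
    calls.append((type(exc).__name__, exc.start, exc.end))
    if isinstance(exc, UnicodeDecodeError):
        return "\ufffd", exc.end              # one replacement per broken sequence
    return "?" * (exc.end - exc.start), exc.end  # one "?" per unencodable character


# "sketch-record" is an arbitrary name chosen for this example.
codecs.register_error("sketch-record", record)

encoded = "a\xe4\xf6b".encode("ascii", "sketch-record")   # two unencodable characters in a row
decoded = b"a\xff\xfeb".decode("ascii", "sketch-record")  # two undecodable bytes in a row
print(calls)
```

With the ASCII codec the encoder makes a single call covering both characters, while the decoder makes one call per byte, which matches the behaviour proposed in this comment.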
> For new codecs, I think we should > suggest that replace tries to collect > as much illegal data as possible before > invoking the error handler. The handler > should be aware of the fact that it > won't necessarily get all the broken > data in one call. OK for encoders, for decoders see above. > About the codec error handling > registry: You seem to be using a > Unicode specific approach here. > I'd rather like to see a generic > approach which uses the API > we discussed earlier. Would that be possible? The handlers in the registry are all Unicode specific. and they are different for encoding and for decoding. I renamed the function because of your comment from 2001-06-13 10:05 (which becomes exceedingly difficult to find on this long page! ;)). > In that case, the codec API should > probably be called > codecs.register_error('myhandler', myhandler). > > Does that make sense ? We could require that unique names are used for custom handlers, but for the standard handlers we do have name collisions. To prevent them, we could either remove them from the registry and require that the codec implements the error handling for those itself, or we could to some fiddling, so that u"��".encode("ascii", "replace") becomes u"��".encode("ascii", "unicodeencodereplace") behind the scenes. But I think two unicode specific registries are much simpler to handle. > BTW, the patch which uses the callback > registry does not seem to be available > on this SF page (the last patch still > converts the errors argument to a > PyObject, which shouldn't be needed > anymore with the new approach). > Can you please upload your > latest version? OK, I'll upload a preliminary version tomorrow. PyUnicode_EncodeDecimal and PyUnicode_TranslateCharmap are still missing, but otherwise the patch seems to be finished. All decoders work and the encoders collect unencodable characters and implement the handling of known callback handler names themselves. 
As PyUnicode_EncodeDecimal is only used by the int, long, float, and complex constructors, I'd love to get rid of the errors argument, but for completeness sake, I'll implement the callback functionality. > Note that the highlighting codec > would make a nice example > for the new feature. This could be part of the codec callback test script, which I've started to write. We could kill two birds with one stone here: 1. Test the implementation. 2. Document and advocate what is possible with the patch. Another idea: we could have as an example a decoding handler that relaxes the UTF-8 minimal encoding restriction, e.g. def relaxedutf8(enc, uni, startpos, endpos, reason, data): if uni[startpos:startpos+2] == u"\xc0\x80": return (u"\x00", startpos+2) else: raise UnicodeError(...) ---------------------------------------------------------------------- Comment By: M.-A. Lemburg (lemburg) Date: 2002-04-17 21:40 Message: Logged In: YES user_id=38388 Sorry for the late response. About the difference between encoding and decoding: you shouldn't just look at the case where you work with Unicode and strings, e.g. take the rot-13 codec which works on strings only or other codecs which translate objects into strings and vice-versa. Error handling has to be flexible enough to handle all these situations. Since the codecs know best how to handle the situations, I'd make this an implementation detail of the codec and leave the behaviour undefined in the general case. For the existing codecs, backward compatibility should be maintained, if at all possible. If the patch gets overly complicated because of this, we may have to provide a downgrade solution for this particular problem (I don't think replace is used in any computational context, though, since you can never be sure how many replacement character do get inserted, so the case may not be that realistic). Raising an exception for the charmap codec is the right way to go, IMHO. I would consider the current behaviour a bug. 
For new codecs, I think we should suggest that replace tries to collect as much illegal data as possible before invoking the error handler. The handler should be aware of the fact that it won't necessarily get all the broken data in one call. About the codec error handling registry: You seem to be using a Unicode specific approach here. I'd rather like to see a generic approach which uses the API we discussed earlier. Would that be possible ? In that case, the codec API should probably be called codecs.register_error('myhandler', myhandler). Does that make sense ? BTW, the patch which uses the callback registry does not seem to be available on this SF page (the last patch still converts the errors argument to a PyObject, which shouldn't be needed anymore with the new approach). Can you please upload your latest version ? Note that the highlighting codec would make a nice example for the new feature. Thanks. ---------------------------------------------------------------------- Comment By: Walter D�rwald (doerwalter) Date: 2002-04-17 12:21 Message: Logged In: YES user_id=89016 Another note: the patch will change the meaning of charmap encoding slightly: currently "replace" will put a ? into the output, even if ? is not in the mapping, i.e. codecs.charmap_encode(u"c", "replace", {ord("a"): ord ("b")}) will return ('?', 1). With the patch the above example will raise an exception. Off course with the patch many more replace characters can appear, so it is vital that for the replacement string the mapping is done. Is this semantic change OK? (I guess all of the existing codecs have a mapping ord("?")->ord("?")) ---------------------------------------------------------------------- Comment By: Walter D�rwald (doerwalter) Date: 2002-03-15 18:19 Message: Logged In: YES user_id=89016 So this means that the encoder can collect illegal characters and pass it to the callback. "replace" will replace this with (end-start)*u"?". 
Decoders don't collect all illegal byte sequences, but call the callback once for every byte sequence that has been found illegal and "replace" will replace it with u"?". Does this make sense? ---------------------------------------------------------------------- Comment By: Walter D�rwald (doerwalter) Date: 2002-03-15 18:06 Message: Logged In: YES user_id=89016 For encoding it's always (end-start)*u"?": >>> u"��".encode("ascii", "replace") '??' But for decoding, it is neither nor: >>> "\Ux\U".decode("unicode-escape", "replace") u'\ufffd\ufffd' i.e. a sequence of 5 illegal characters was replace by two replacement characters. This might mean that decoders can't collect all the illegal characters and call the callback once. They might have to call the callback for every single illegal byte sequence to get the old behaviour. (It seems that this patch would be much, much simpler, if we only change the encoders) ---------------------------------------------------------------------- Comment By: M.-A. Lemburg (lemburg) Date: 2002-03-08 19:36 Message: Logged In: YES user_id=38388 Hmm, whatever it takes to maintain backwards compatibility. Do you have an example ? ---------------------------------------------------------------------- Comment By: Walter D�rwald (doerwalter) Date: 2002-03-08 18:31 Message: Logged In: YES user_id=89016 What should replace do: Return u"?" or (end-start)*u"?" ---------------------------------------------------------------------- Comment By: M.-A. Lemburg (lemburg) Date: 2002-03-08 16:15 Message: Logged In: YES user_id=38388 Sounds like a good idea. Please keep the encoder and decoder APIs symmetric, though, ie. add the slice information to both APIs. The slice should use the same format as Python's standard slices, that is left inclusive, right exclusive. I like the highlighting feature ! 
----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2002-03-08 00:09

Message:
Logged In: YES
user_id=89016

I'm thinking about extending the API a little bit:

Consider the following example:

>>> "\u1".decode("unicode-escape")
Traceback (most recent call last):
  File "", line 1, in ?
UnicodeError: encoding 'unicodeescape' can't decode byte 0x31 in position 2: truncated \uXXXX escape

The error message is a lie: the problem is not the '1' in position 2, but the complete truncated sequence '\u1'. For this the decoder should pass a start and an end position to the handler.

For encoding this would be useful too: Suppose I want to have an encoder that colors the unencodable characters via ANSI escape sequences. Then I could do the following:

>>> import codecs
>>> def color(enc, uni, pos, why, sta):
...     return (u"\033[1m<%d>\033[0m" % ord(uni[pos]), pos+1)
...
>>> codecs.register_unicodeencodeerrorhandler("color", color)
>>> u"aäüöo".encode("ascii", "color")
'a\x1b[1m<228>\x1b[0m\x1b[1m<252>\x1b[0m\x1b[1m<246>\x1b[0mo'

But here the sequences "\x1b[0m\x1b[1m" are not needed. To fix this problem the encoder could collect as many unencodable characters as possible and pass those to the error callback in one go (passing a start and end+1 position).

This fixes the above problem and reduces the number of calls to the callback, so it should speed up the algorithms in case of custom encoding names. (And it makes the implementation very interesting ;))

What do you think?
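
[Editorial illustration] With a start and end position available, the highlighting handler can wrap the whole run in a single pair of escape sequences, which removes the redundant "\x1b[0m\x1b[1m" pairs noted above. A minimal sketch for Python 3 using `codecs.register_error`, the API that eventually resulted from this patch. The handler name "sketch-color" is made up for this example.

```python
import codecs


def color(exc):
    """Wrap a whole run of unencodable characters in one bold on/off pair."""
    if not isinstance(exc, UnicodeEncodeError):
        raise exc  # this sketch only handles encoding errors
    run = exc.object[exc.start:exc.end]          # all collected characters
    body = "".join("<%d>" % ord(ch) for ch in run)
    return "\033[1m" + body + "\033[0m", exc.end


# "sketch-color" is an arbitrary name chosen for this example.
codecs.register_error("sketch-color", color)

highlighted = "a\xe4\xfc\xf6o".encode("ascii", "sketch-color")
print(highlighted)
```

Because the encoder passes the three characters in one call, the bold sequence is opened and closed once instead of three times.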
---------------------------------------------------------------------- Comment By: Walter D�rwald (doerwalter) Date: 2002-03-07 02:29 Message: Logged In: YES user_id=89016 I started from scratch, and the current state is this: Encoding mostly works (except that I haven't changed TranslateCharmap and EncodeDecimal yet) and most of the decoding stuff works (DecodeASCII and DecodeCharmap are still unchanged) and the decoding callback helper isn't optimized for the "builtin" names yet (i.e. it still calls the handler). For encoding the callback helper knows how to handle "strict", "replace", "ignore" and "xmlcharrefreplace" itself and won't call the callback. This should make the encoder fast enough. As callback name string comparison results are cached it might even be faster than the original. The patch so far didn't require any changes to unicodeobject.h, stringobject.h or stringobject.c ---------------------------------------------------------------------- Comment By: M.-A. Lemburg (lemburg) Date: 2002-03-05 17:49 Message: Logged In: YES user_id=38388 Walter, are you making any progress on the new scheme we discussed on the mailing list (adding an error handler registry much like the codec registry itself instead of trying to redo the complete codec API) ? ---------------------------------------------------------------------- Comment By: M.-A. Lemburg (lemburg) Date: 2001-09-20 12:38 Message: Logged In: YES user_id=38388 I am postponing this patch until the PEP process has started. This feature won't make it into Python 2.2. Walter, you may want to reference this patch in the PEP. ---------------------------------------------------------------------- Comment By: M.-A. Lemburg (lemburg) Date: 2001-08-16 12:53 Message: Logged In: YES user_id=38388 I think we ought to summarize these changes in a PEP to get some more feedback and testing from others as well. I'll look into this after I'm back from vacation on the 10.09. 
Given the release schedule I am not sure whether this feature will make it into 2.2. The size of the patch is huge and probably needs a lot of testing first. ---------------------------------------------------------------------- Comment By: Walter D�rwald (doerwalter) Date: 2001-07-27 05:55 Message: Logged In: YES user_id=89016 Changing the decoding API is done now. There are new functions codec.register_unicodedecodeerrorhandler and codec.lookup_unicodedecodeerrorhandler. Only the standard handlers for 'strict', 'ignore' and 'replace' are preregistered. There may be many reasons for decoding errors in the byte string, so I added an additional argument to the decoding API: reason, which gives the reason for the failure, e.g.: >>> "\U1111111".decode("unicode_escape") Traceback (most recent call last): File "", line 1, in ? UnicodeError: encoding 'unicodeescape' can't decode byte 0x31 in position 8: truncated \UXXXXXXXX escape >>> "\U11111111".decode("unicode_escape") Traceback (most recent call last): File "", line 1, in ? UnicodeError: encoding 'unicodeescape' can't decode byte 0x31 in position 9: illegal Unicode character For symmetry I added this to the encoding API too: >>> u"\xff".encode("ascii") Traceback (most recent call last): File "", line 1, in ? UnicodeError: encoding 'ascii' can't decode byte 0xff in position 0: ordinal not in range(128) The parameters passed to the callbacks now are: encoding, unicode, position, reason, state. The encoding and decoding API for strings has been adapted too, so now the new API should be usable everywhere: >>> unicode("a\xffb\xffc", "ascii", ... lambda enc, uni, pos, rea, sta: (u"", pos+1)) u'abc' >>> "a\xffb\xffc".decode("ascii", ... lambda enc, uni, pos, rea, sta: (u"", pos+1)) u'abc' I had a problem with the decoding API: all the functions in _codecsmodule.c used the t# format specifier. I changed that to O! 
with &PyString_Type, because otherwise we would have the problem that the decoding API would must pass buffer object around instead of strings, and the callback would have to call str() on the buffer anyway to access a specific character, so this wouldn't be any faster than calling str() on the buffer before decoding. It seems that buffers aren't used anyway. I changed all the old function to call the new ones so bugfixes don't have to be done in two places. There are two exceptions: I didn't change PyString_AsEncodedString and PyString_AsDecodedString because they are documented as deprecated anyway (although they are called in a few spots) This means that I duplicated part of their functionality in PyString_AsEncodedObjectEx and PyString_AsDecodedObjectEx. There are still a few spots that call the old API: E.g. PyString_Format still calls PyUnicode_Decode (but with strict decoding) because it passes the rest of the format string to PyUnicode_Format when it encounters a Unicode object. Should we switch to the new API everywhere even if strict encoding/decoding is used? The size of this patch begins to scare me. I guess we need an extensive test script for all the new features and documentation. I hope you have time to do that, as I'll be busy with other projects in the next weeks. (BTW, I have't touched PyUnicode_TranslateCharmap yet.) ---------------------------------------------------------------------- Comment By: Walter D�rwald (doerwalter) Date: 2001-07-23 19:03 Message: Logged In: YES user_id=89016 New version of the patch with the error handling callback registry. > > OK, done, now there's a > > PyCodec_EscapeReplaceUnicodeEncodeErrors/ > > codecs.escapereplace_unicodeencode_errors > > that uses \u (or \U if x>0xffff (with a wide build > > of Python)). > > Great! Now PyCodec_EscapeReplaceUnicodeEncodeErrors uses \x in addition to \u and \U where appropriate. > > [...] 
> > But for special one-shot error handlers, it might still be
> > useful to pass the error handler directly, so maybe we
> > should leave error as PyObject *, but implement the
> > registry anyway?
>
> Good idea !
>
> One minor nit: codecs.registerError() should be named
> codecs.register_errorhandler() to be more inline with
> the Python coding style guide.

OK, but these functions are specific to unicode encoding, so now the functions are called:
   codecs.register_unicodeencodeerrorhandler
   codecs.lookup_unicodeencodeerrorhandler

Now all callbacks (including the new ones: "xmlcharrefreplace" and "escapereplace") are registered in the codecs.c/_PyCodecRegistry_Init so using them is really simple:
   u"gürk".encode("ascii", "xmlcharrefreplace")

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2001-07-13 13:26

Message:
Logged In: YES
user_id=38388

> > > > > BTW, I guess PyUnicode_EncodeUnicodeEscape
> > > > > could be reimplemented as PyUnicode_EncodeASCII
> > > > > with \uxxxx replacement callback.
> > > >
> > > > Hmm, wouldn't that result in a slowdown ? If so,
> > > > I'd rather leave the special encoder in place,
> > > > since it is being used a lot in Python and
> > > > probably some applications too.
> > >
> > > It would be a slowdown. But callbacks open many
> > > possibilities.
> >
> > True, but in this case I believe that we should stick with
> > the native implementation for "unicode-escape". Having
> > a standard callback error handler which does the \uXXXX
> > replacement would be nice to have though, since this would
> > also be usable with lots of other codecs (e.g. all the
> > code page ones).
>
> OK, done, now there's a
> PyCodec_EscapeReplaceUnicodeEncodeErrors/
> codecs.escapereplace_unicodeencode_errors
> that uses \u (or \U if x>0xffff (with a wide build
> of Python)).

Great !

> > [...]
> > > Should the old TranslateCharmap map to the new > > > TranslateCharmapEx and inherit the > > > "multicharacter replacement" feature, > > > or should I leave it as it is? > > > > If possible, please also add the multichar replacement > > to the old API. I think it is very useful and since the > > old APIs work on raw buffers it would be a benefit to have > > the functionality in the old implementation too. > > OK! I will try to find the time to implement that in the > next days. Good. > > [Decoding error callbacks] > > > > About the return value: > > > > I'd suggest to always use the same tuple interface, e.g. > > > > callback(encoding, input_data, input_position, > state) -> > > (output_to_be_appended, new_input_position) > > > > (I think it's better to use absolute values for the > > position rather than offsets.) > > > > Perhaps the encoding callbacks should use the same > > interface... what do you think ? > > This would make the callback feature hypergeneric and a > little slower, because tuples have to be created, but it > (almost) unifies the encoding and decoding API. ("almost" > because, for the encoder output_to_be_appended will be > reencoded, for the decoder it will simply be appended.), > so I'm for it. That's the point. Note that I don't think the tuple creation will hurt much (see the make_tuple() API in codecs.c) since small tuples are cached by Python internally. > I implemented this and changed the encoders to only > lookup the error handler on the first error. The UCS1 > encoder now no longer uses the two-item stack strategy. > (This strategy only makes sense for those encoder where > the encoding itself is much more complicated than the > looping/callback etc.) So now memory overflow tests are > only done, when an unencodable error occurs, so now the > UCS1 encoder should be as fast as it was without > error callbacks. > > Do we want to enforce new_input_position>input_position, > or should jumping back be allowed? 
No; moving backwards should be allowed (this may be useful in order to resynchronize with the input data). > Here's is the current todo list: > 1. implement a new TranslateCharmap and fix the old. > 2. New encoding API for string objects too. > 3. Decoding > 4. Documentation > 5. Test cases > > I'm thinking about a different strategy for implementing > callbacks > (see http://mail.python.org/pipermail/i18n-sig/2001- > July/001262.html) > > We coould have a error handler registry, which maps names > to error handlers, then it would be possible to keep the > errors argument as "const char *" instead of "PyObject *". > Currently PyCodec_UnicodeEncodeHandlerForObject is a > backwards compatibility hack that will never go away, > because > it's always more convenient to type > u"...".encode("...", "strict") > instead of > import codecs > u"...".encode("...", codecs.raise_encode_errors) > > But with an error handler registry this function would > become the official lookup method for error handlers. > (PyCodec_LookupUnicodeEncodeErrorHandler?) > Python code would look like this: > --- > def xmlreplace(encoding, unicode, pos, state): > return (u"&#%d;" % ord(uni[pos]), pos+1) > > import codec > > codec.registerError("xmlreplace",xmlreplace) > --- > and then the following call can be made: > u"��".encode("ascii", "xmlreplace") > As soon as the first error is encountered, the encoder uses > its builtin error handling method if it recognizes the name > ("strict", "replace" or "ignore") or looks up the error > handling function in the registry if it doesn't. In this way > the speed for the backwards compatible features is the same > as before and "const char *error" can be kept as the > parameter to all encoding functions. For speed common error > handling names could even be implemented in the encoder > itself. 
> > But for special one-shot error handlers, it might still be > useful to pass the error handler directly, so maybe we > should leave error as PyObject *, but implement the > registry anyway? Good idea ! One minor nit: codecs.registerError() should be named codecs.register_errorhandler() to be more inline with the Python coding style guide. ---------------------------------------------------------------------- Comment By: Walter D�rwald (doerwalter) Date: 2001-07-12 13:03 Message: Logged In: YES user_id=89016 > > [...] > > so I guess we could change the replace handler > > to always return u'?'. This would make the > > implementation a little bit simpler, but the > > explanation of the callback feature *a lot* > > simpler. > > Go for it. OK, done! > [...] > > > Could you add these docs to the Misc/unicode.txt > > > file ? I will eventually take that file and turn > > > it into a PEP which will then serve as general > > > documentation for these things. > > > > I could, but first we should work out how the > > decoding callback API will work. > > Ok. BTW, Barry Warsaw already did the work of converting > the unicode.txt to PEP 100, so the docs should eventually > go there. OK. I guess it would be best to do this when everything is finished. > > > > BTW, I guess PyUnicode_EncodeUnicodeEscape > > > > could be reimplemented as PyUnicode_EncodeASCII > > > > with \uxxxx replacement callback. > > > > > > Hmm, wouldn't that result in a slowdown ? If so, > > > I'd rather leave the special encoder in place, > > > since it is being used a lot in Python and > > > probably some applications too. > > > > It would be a slowdown. But callbacks open many > > possiblities. > > True, but in this case I believe that we should stick with > the native implementation for "unicode-escape". Having > a standard callback error handler which does the \uXXXX > replacement would be nice to have though, since this would > also be usable with lots of other codecs (e.g. all the > code page ones). 
OK, done, now there's a PyCodec_EscapeReplaceUnicodeEncodeErrors/ codecs.escapereplace_unicodeencode_errors that uses \u (or \U if x>0xffff (with a wide build of Python)). > > For example: > > > > Why can't I print u"g�rk"? > > > > is probably one of the most frequently asked > > questions in comp.lang.python. For printing > > Unicode stuff, print could be extended the use an > > error handling callback for Unicode strings (or > > objects where __str__ or tp_str returns a Unicode > > object) instead of using str() which always > > returns an 8bit string and uses strict encoding. > > There might even be a > > sys.setprintencodehandler()/sys.getprintencodehandler () > > There already is a print callback in Python (forgot the > name of the hook though), so this should be possible by > providing the encoding logic in the hook. True: sys.displayhook > [...] > > Should the old TranslateCharmap map to the new > > TranslateCharmapEx and inherit the > > "multicharacter replacement" feature, > > or should I leave it as it is? > > If possible, please also add the multichar replacement > to the old API. I think it is very useful and since the > old APIs work on raw buffers it would be a benefit to have > the functionality in the old implementation too. OK! I will try to find the time to implement that in the next days. > [Decoding error callbacks] > > About the return value: > > I'd suggest to always use the same tuple interface, e.g. > > callback(encoding, input_data, input_position, state) -> > (output_to_be_appended, new_input_position) > > (I think it's better to use absolute values for the > position rather than offsets.) > > Perhaps the encoding callbacks should use the same > interface... what do you think ? This would make the callback feature hypergeneric and a little slower, because tuples have to be created, but it (almost) unifies the encoding and decoding API. 
("almost" because, for the encoder output_to_be_appended will be reencoded, for the decoder it will simply be appended.), so I'm for it. I implemented this and changed the encoders to only lookup the error handler on the first error. The UCS1 encoder now no longer uses the two-item stack strategy. (This strategy only makes sense for those encoder where the encoding itself is much more complicated than the looping/callback etc.) So now memory overflow tests are only done, when an unencodable error occurs, so now the UCS1 encoder should be as fast as it was without error callbacks. Do we want to enforce new_input_position>input_position, or should jumping back be allowed? > > > > One additional note: It is vital that errors > > > > is an assignable attribute of the StreamWriter. > > > > > > It is already ! > > > > I know, but IMHO it should be documented that an > > assignable errors attribute must be supported > > as part of the official codec API. > > > > Misc/unicode.txt is not clear on that: > > """ > > It is not required by the Unicode implementation > > to use these base classes, only the interfaces must > > match; this allows writing Codecs as extension types. > > """ > > Good point. I'll add that to the PEP 100. OK. Here's is the current todo list: 1. implement a new TranslateCharmap and fix the old. 2. New encoding API for string objects too. 3. Decoding 4. Documentation 5. Test cases I'm thinking about a different strategy for implementing callbacks (see http://mail.python.org/pipermail/i18n-sig/2001- July/001262.html) We coould have a error handler registry, which maps names to error handlers, then it would be possible to keep the errors argument as "const char *" instead of "PyObject *". 
Currently PyCodec_UnicodeEncodeHandlerForObject is a backwards compatibility hack that will never go away, because it's always more convenient to type
   u"...".encode("...", "strict")
instead of
   import codecs
   u"...".encode("...", codecs.raise_encode_errors)

But with an error handler registry this function would become the official lookup method for error handlers. (PyCodec_LookupUnicodeEncodeErrorHandler?)
Python code would look like this:
---
def xmlreplace(encoding, uni, pos, state):
    return (u"&#%d;" % ord(uni[pos]), pos+1)

import codecs

codecs.registerError("xmlreplace", xmlreplace)
---
and then the following call can be made:
   u"��".encode("ascii", "xmlreplace")

As soon as the first error is encountered, the encoder uses its builtin error handling method if it recognizes the name ("strict", "replace" or "ignore") or looks up the error handling function in the registry if it doesn't. In this way the speed for the backwards compatible features is the same as before and "const char *error" can be kept as the parameter to all encoding functions. For speed, common error handling names could even be implemented in the encoder itself.

But for special one-shot error handlers, it might still be useful to pass the error handler directly, so maybe we should leave error as PyObject *, but implement the registry anyway?

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2001-07-10 14:29

Message:
Logged In: YES
user_id=38388

Ok, here we go...

> > > raise an exception). U+FFFD characters in the replacement
> > > string will be replaced with a character that the encoder
> > > chooses ('?' in all cases).
> >
> > Nice.
>
> But the special casing of U+FFFD makes the interface somewhat
> less clean than it could be. It was only done to be 100%
> backwards compatible. With the original "replace" error
> handling the codec chose the replacement character.
But as > far as I can tell none of the codecs uses anything other > than '?', True. > so I guess we could change the replace handler > to always return u'?'. This would make the implementation a > little bit simpler, but the explanation of the callback > feature *a lot* simpler. Go for it. > And if you still want to handle > an unencodable U+FFFD, you can write a special callback for > that, e.g. > > def FFFDreplace(enc, uni, pos): > if uni[pos] == "\ufffd": > return u"?" > else: > raise UnicodeError(...) > > > ...docs... > > > > Could you add these docs to the Misc/unicode.txt file ? I > > will eventually take that file and turn it into a PEP > which > > will then serve as general documentation for these things. > > I could, but first we should work out how the decoding > callback API will work. Ok. BTW, Barry Warsaw already did the work of converting the unicode.txt to PEP 100, so the docs should eventually go there. > > > BTW, I guess PyUnicode_EncodeUnicodeEscape could be > > > reimplemented as PyUnicode_EncodeASCII with a \uxxxx > > > replacement callback. > > > > Hmm, wouldn't that result in a slowdown ? If so, I'd > rather > > leave the special encoder in place, since it is being > used a > > lot in Python and probably some applications too. > > It would be a slowdown. But callbacks open many > possiblities. True, but in this case I believe that we should stick with the native implementation for "unicode-escape". Having a standard callback error handler which does the \uXXXX replacement would be nice to have though, since this would also be usable with lots of other codecs (e.g. all the code page ones). > For example: > > Why can't I print u"g�rk"? > > is probably one of the most frequently asked questions in > comp.lang.python. 
For printing Unicode stuff, print could be > extended the use an error handling callback for Unicode > strings (or objects where __str__ or tp_str returns a > Unicode object) instead of using str() which always returns > an 8bit string and uses strict encoding. There might even > be a > sys.setprintencodehandler()/sys.getprintencodehandler() There already is a print callback in Python (forgot the name of the hook though), so this should be possible by providing the encoding logic in the hook. > > > I have not touched PyUnicode_TranslateCharmap yet, > > > should this function also support error callbacks? Why > > > would one want the insert None into the mapping to > call > > > the callback? > > > > 1. Yes. > > 2. The user may want to e.g. restrict usage of certain > > character ranges. In this case the codec would be used to > > verify the input and an exception would indeed be useful > > (e.g. say you want to restrict input to Hangul + ASCII). > > OK, do we want TranslateCharmap to work exactly like > encoding, > i.e. in case of an error should the returned replacement > string again be mapped through the translation mapping or > should it be copied to the output directly? The former would > be more in line with encoding, but IMHO the latter would > be much more useful. It's better to take the second approach (copy the callback output directly to the output string) to avoid endless recursion and other pitfalls. I suppose this will also simplify the implementation somewhat. > BTW, when I implement it I can implement patch #403100 > ("Multicharacter replacements in > PyUnicode_TranslateCharmap") > along the way. I've seen it; will comment on it later. > Should the old TranslateCharmap map to the new > TranslateCharmapEx > and inherit the "multicharacter replacement" feature, > or > should I leave it as it is? If possible, please also add the multichar replacement to the old API. 
I think it is very useful and since the old APIs work on raw buffers it would be a benefit to have the functionality in the old implementation too. [Decoding error callbacks] > > > A remaining problem is how to implement decoding error > > > callbacks. In Python 2.1 encoding and decoding errors > are > > > handled in the same way with a string value. But with > > > callbacks it doesn't make sense to use the same > callback > > > for encoding and decoding (like > codecs.StreamReaderWriter > > > and codecs.StreamRecoder do). Decoding callbacks have > a > > > different API. Which arguments should be passed to the > > > decoding callback, and what is the decoding callback > > > supposed to do? > > > > I'd suggest adding another set of PyCodec_UnicodeDecode... > () > > APIs for this. We'd then have to augment the base classes > of > > the StreamCodecs to provide two attributes for .errors > with > > a fallback solution for the string case (i.s. "strict" > can > > still be used for both directions). > > Sounds good. Now what is the decoding callback supposed to > do? > I guess it will be called in the same way as the encoding > callback, i.e. with encoding name, original string and > position of the error. It might returns a Unicode string > (i.e. an object of the decoding target type), that will be > emitted from the codec instead of the one offending byte. Or > it might return a tuple with replacement Unicode object and > a resynchronisation offset, i.e. returning (u"?", 1) > means > emit a '?' and skip the offending character. But to make > the offset really useful the callback has to know something > about the encoding, perhaps the codec should be allowed to > pass an additional state object to the callback? > > Maybe the same should be added to the encoding callbacks to? 
> Maybe the encoding callback should be able to tell the
> encoder if the replacement returned should be reencoded
> (in which case it's a Unicode object), or directly emitted
> (in which case it's an 8bit string)?

I like the idea of having an optional state object (basically this should be a codec-defined arbitrary Python object) which then allows the callback to apply additional tricks. The object should be documented to be modifiable in place (simplifies the interface).

About the return value:

I'd suggest to always use the same tuple interface, e.g.

   callback(encoding, input_data, input_position, state) ->
       (output_to_be_appended, new_input_position)

(I think it's better to use absolute values for the position rather than offsets.)

Perhaps the encoding callbacks should use the same interface... what do you think ?

> > > One additional note: It is vital that errors is an
> > > assignable attribute of the StreamWriter.
> >
> > It is already !
>
> I know, but IMHO it should be documented that an assignable
> errors attribute must be supported as part of the official
> codec API.
>
> Misc/unicode.txt is not clear on that:
> """
> It is not required by the Unicode implementation to use
> these base classes, only the interfaces must match; this
> allows writing Codecs as extension types.
> """

Good point. I'll add that to the PEP 100.

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2001-06-22 22:51

Message:
Logged In: YES
user_id=38388

Sorry to keep you waiting, Walter. I will look into this again next week -- this week was way too busy...

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2001-06-13 19:00

Message:
Logged In: YES
user_id=38388

On your comment about the non-Unicode codecs: let's keep this separated from the current patch.

Don't have much time today. I'll comment on the other things tomorrow.
----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2001-06-13 17:49

Message:
Logged In: YES
user_id=89016

Guido van Rossum wrote in python-dev:

> True, the "codec" pattern can be used for other
> encodings than Unicode. But it seems to me that the
> entire codecs architecture is rather strongly geared
> towards en/decoding Unicode, and it's not clear
> how well other codecs fit in this pattern (e.g. I
> noticed that all the non-Unicode codecs ignore the
> error handling parameter or assert that
> it is set to 'strict').

I noticed that too. Asserting that errors=='strict' would mean that the encoder is not able to deal in any other way with unencodable stuff than by raising an error. But that is not the problem here, because for zlib, base64, quopri, hex and uu encoding there can be no unencodable characters. The encoders can simply ignore the errors parameter. Should I remove the asserts from those codecs and change the docstrings accordingly, or will this be done separately?

----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2001-06-13 15:57

Message:
Logged In: YES
user_id=89016

> > [...]
> > raise an exception). U+FFFD characters in the replacement
> > string will be replaced with a character that the encoder
> > chooses ('?' in all cases).
>
> Nice.

But the special casing of U+FFFD makes the interface somewhat less clean than it could be. It was only done to be 100% backwards compatible. With the original "replace" error handling the codec chose the replacement character. But as far as I can tell none of the codecs uses anything other than '?', so I guess we could change the replace handler to always return u'?'. This would make the implementation a little bit simpler, but the explanation of the callback feature *a lot* simpler. And if you still want to handle an unencodable U+FFFD, you can write a special callback for that, e.g.
def FFFDreplace(enc, uni, pos):
    if uni[pos] == u"\ufffd":
        return u"?"
    else:
        raise UnicodeError(...)

> > The implementation of the loop through the string is done
> > in the following way. A stack with two strings is kept
> > and the loop always encodes a character from the string
> > at the stacktop. If an error is encountered and the stack
> > has only one entry (during encoding of the original string)
> > the callback is called and the unicode object returned is
> > pushed on the stack, so the encoding continues with the
> > replacement string. If the stack has two entries when an
> > error is encountered, the replacement string itself has
> > an unencodable character and a normal exception is raised.
> > When the encoder has reached the end of its current string
> > there are two possibilities: when the stack contains two
> > entries, this was the replacement string, so the replacement
> > string will be popped from the stack and encoding continues
> > with the next character from the original string. If the
> > stack had only one entry, encoding is finished.
>
> Very elegant solution !

I'll put it as a comment in the source.

> > (I hope that's enough explanation of the API and implementation)
>
> Could you add these docs to the Misc/unicode.txt file ? I
> will eventually take that file and turn it into a PEP which
> will then serve as general documentation for these things.

I could, but first we should work out how the decoding callback API will work.

> > I have renamed the static ...121 function to all lowercase
> > names.
>
> Ok.
>
> > BTW, I guess PyUnicode_EncodeUnicodeEscape could be
> > reimplemented as PyUnicode_EncodeASCII with a \uxxxx
> > replacement callback.
>
> Hmm, wouldn't that result in a slowdown ? If so, I'd rather
> leave the special encoder in place, since it is being used a
> lot in Python and probably some applications too.

It would be a slowdown. But callbacks open many possibilities.

For example:

   Why can't I print u"gürk"?
is probably one of the most frequently asked questions in comp.lang.python. For printing Unicode stuff, print could be extended the use an error handling callback for Unicode strings (or objects where __str__ or tp_str returns a Unicode object) instead of using str() which always returns an 8bit string and uses strict encoding. There might even be a sys.setprintencodehandler()/sys.getprintencodehandler() > [...] > I think it would be worthwhile to rename the callbacks to > include "Unicode" somewhere, e.g. > PyCodec_UnicodeReplaceEncodeErrors(). It's a long name, but > then it points out the application field of the callback > rather well. Same for the callbacks exposed through the > _codecsmodule. OK, done (and PyCodec_XMLCharRefReplaceUnicodeEncodeErrors really is a long name ;)) > > I have not touched PyUnicode_TranslateCharmap yet, > > should this function also support error callbacks? Why > > would one want the insert None into the mapping to call > > the callback? > > 1. Yes. > 2. The user may want to e.g. restrict usage of certain > character ranges. In this case the codec would be used to > verify the input and an exception would indeed be useful > (e.g. say you want to restrict input to Hangul + ASCII). OK, do we want TranslateCharmap to work exactly like encoding, i.e. in case of an error should the returned replacement string again be mapped through the translation mapping or should it be copied to the output directly? The former would be more in line with encoding, but IMHO the latter would be much more useful. BTW, when I implement it I can implement patch #403100 ("Multicharacter replacements in PyUnicode_TranslateCharmap") along the way. Should the old TranslateCharmap map to the new TranslateCharmapEx and inherit the "multicharacter replacement" feature, or should I leave it as it is? > > A remaining problem is how to implement decoding error > > callbacks. 
In Python 2.1 encoding and decoding errors are > > handled in the same way with a string value. But with > > callbacks it doesn't make sense to use the same callback > > for encoding and decoding (like codecs.StreamReaderWriter > > and codecs.StreamRecoder do). Decoding callbacks have a > > different API. Which arguments should be passed to the > > decoding callback, and what is the decoding callback > > supposed to do? > > I'd suggest adding another set of PyCodec_UnicodeDecode... () > APIs for this. We'd then have to augment the base classes of > the StreamCodecs to provide two attributes for .errors with > a fallback solution for the string case (i.s. "strict" can > still be used for both directions). Sounds good. Now what is the decoding callback supposed to do? I guess it will be called in the same way as the encoding callback, i.e. with encoding name, original string and position of the error. It might returns a Unicode string (i.e. an object of the decoding target type), that will be emitted from the codec instead of the one offending byte. Or it might return a tuple with replacement Unicode object and a resynchronisation offset, i.e. returning (u"?", 1) means emit a '?' and skip the offending character. But to make the offset really useful the callback has to know something about the encoding, perhaps the codec should be allowed to pass an additional state object to the callback? Maybe the same should be added to the encoding callbacks to? Maybe the encoding callback should be able to tell the encoder if the replacement returned should be reencoded (in which case it's a Unicode object), or directly emitted (in which case it's an 8bit string)? > > One additional note: It is vital that errors is an > > assignable attribute of the StreamWriter. > > It is already ! I know, but IMHO it should be documented that an assignable errors attribute must be supported as part of the official codec API. 
Misc/unicode.txt is not clear on that: """ It is not required by the Unicode implementation to use these base classes, only the interfaces must match; this allows writing Codecs as extension types. """ ---------------------------------------------------------------------- Comment By: M.-A. Lemburg (lemburg) Date: 2001-06-13 10:05 Message: Logged In: YES user_id=38388 > How the callbacks work: > > A PyObject * named errors is passed in. This may by NULL, > Py_None, 'strict', u'strict', 'ignore', u'ignore', > 'replace', u'replace' or a callable object. > PyCodec_EncodeHandlerForObject maps all of these objects to > one of the three builtin error callbacks > PyCodec_RaiseEncodeErrors (raises an exception), > PyCodec_IgnoreEncodeErrors (returns an empty replacement > string, in effect ignoring the error), > PyCodec_ReplaceEncodeErrors (returns U+FFFD, the Unicode > replacement character to signify to the encoder that it > should choose a suitable replacement character) or directly > returns errors if it is a callable object. When an > unencodable character is encounterd the error handling > callback will be called with the encoding name, the original > unicode object and the error position and must return a > unicode object that will be encoded instead of the offending > character (or the callback may of course raise an > exception). U+FFFD characters in the replacement string will > be replaced with a character that the encoder chooses ('?' > in all cases). Nice. > The implementation of the loop through the string is done in > the following way. A stack with two strings is kept and the > loop always encodes a character from the string at the > stacktop. If an error is encountered and the stack has only > one entry (during encoding of the original string) the > callback is called and the unicode object returned is pushed > on the stack, so the encoding continues with the replacement > string. 
If the stack has two entries when an error is > encountered, the replacement string itself has an > unencodable character and a normal exception raised. When > the encoder has reached the end of it's current string there > are two possibilities: when the stack contains two entries, > this was the replacement string, so the replacement string > will be poppep from the stack and encoding continues with > the next character from the original string. If the stack > had only one entry, encoding is finished. Very elegant solution ! > (I hope that's enough explanation of the API and implementation) Could you add these docs to the Misc/unicode.txt file ? I will eventually take that file and turn it into a PEP which will then serve as general documentation for these things. > I have renamed the static ...121 function to all lowercase > names. Ok. > BTW, I guess PyUnicode_EncodeUnicodeEscape could be > reimplemented as PyUnicode_EncodeASCII with a \uxxxx > replacement callback. Hmm, wouldn't that result in a slowdown ? If so, I'd rather leave the special encoder in place, since it is being used a lot in Python and probably some applications too. > PyCodec_RaiseEncodeErrors, PyCodec_IgnoreEncodeErrors, > PyCodec_ReplaceEncodeErrors are globally visible because > they have to be available in _codecsmodule.c to wrap them as > Python function objects, but they can't be implemented in > _codecsmodule, because they need to be available to the > encoders in unicodeobject.c (through > PyCodec_EncodeHandlerForObject), but importing the codecs > module might result in an endless recursion, because > importing a module requires unpickling of the bytecode, > which might require decoding utf8, which ... (but this will > only happen, if we implement the same mechanism for the > decoding API) I think that codecs.c is the right place for these APIs. _codecsmodule.c is only meant as Python access wrapper for the internal codecs and nothing more. 
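
For readers following the quoted description, the two-entry stack loop can be sketched in present-day Python 3 roughly as below. This is only an illustration under stated assumptions, not the patch's C code: encode_with_callback and xml_charref are hypothetical names, the callback receives (encoding, original text, error position) as in the description, and it returns just the replacement string.

```python
def encode_with_callback(text, encoding, callback):
    """Toy encoder illustrating the two-entry stack loop (hypothetical helper)."""
    out = bytearray()
    # Each stack entry is [string, position]. Entry 0 is the original input;
    # entry 1, when present, is a replacement string returned by the callback.
    stack = [[text, 0]]
    while stack:
        current, pos = stack[-1]
        if pos == len(current):
            # End of the string on top of the stack: pop it. If it was the
            # replacement, encoding resumes with the original string;
            # if it was the original, the loop ends.
            stack.pop()
            continue
        try:
            out += current[pos].encode(encoding)
            stack[-1][1] = pos + 1
        except UnicodeEncodeError:
            if len(stack) == 2:
                # The replacement string itself is unencodable: give up.
                raise
            replacement = callback(encoding, text, pos)
            # Skip the offending character, then encode the replacement.
            stack[-1][1] = pos + 1
            stack.append([replacement, 0])
    return bytes(out)


def xml_charref(encoding, text, pos):
    # Replace the unencodable character with an XML character reference.
    return "&#%d;" % ord(text[pos])


print(encode_with_callback("g\xfcrk", "ascii", xml_charref))  # b'g&#252;rk'
```

Because the replacement is pushed on the same stack and run through the same loop, it is re-encoded like ordinary input, and a second failure while the stack holds two entries surfaces as a normal exception.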
One thing I noted about the callbacks: they assume that they will always get Unicode objects as input. This is certainly not true in the general case (it is for the codecs you touch in the patch). I think it would be worthwhile to rename the callbacks to include "Unicode" somewhere, e.g. PyCodec_UnicodeReplaceEncodeErrors(). It's a long name, but then it points out the application field of the callback rather well. Same for the callbacks exposed through the _codecsmodule.

> I have not touched PyUnicode_TranslateCharmap yet,
> should this function also support error callbacks? Why would
> one want to insert None into the mapping to call the callback?

1. Yes.
2. The user may want to e.g. restrict usage of certain character ranges. In this case the codec would be used to verify the input and an exception would indeed be useful (e.g. say you want to restrict input to Hangul + ASCII).

> A remaining problem is how to implement decoding error
> callbacks. In Python 2.1 encoding and decoding errors are
> handled in the same way with a string value. But with
> callbacks it doesn't make sense to use the same callback for
> encoding and decoding (like codecs.StreamReaderWriter and
> codecs.StreamRecoder do). Decoding callbacks have a
> different API. Which arguments should be passed to the
> decoding callback, and what is the decoding callback
> supposed to do?

I'd suggest adding another set of PyCodec_UnicodeDecode...() APIs for this. We'd then have to augment the base classes of the StreamCodecs to provide two attributes for .errors with a fallback solution for the string case (i.e. "strict" can still be used for both directions).

> One additional note: It is vital that errors is an
> assignable attribute of the StreamWriter.

It is already !

> Consider the XML example: For writing an XML DOM tree one
> StreamWriter object is used. When a text node is written,
> the error handling has to be set to
> codecs.xmlreplace_encode_errors, but inside a comment or
> processing instruction replacing unencodable characters with
> charrefs is not possible, so here codecs.raise_encode_errors
> should be used (or better a custom error handler that raises
> an error that says "sorry, you can't have unencodable
> characters inside a comment")

Sure.

> BTW, should we continue the discussion in the i18n SIG
> mailing list? An email program is much more comfortable than
> a HTML textarea! ;)

I'd rather keep the discussions on this patch here -- forking it off to the i18n sig will make it very hard to follow up on it. (This HTML area is indeed damn small ;-)

----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2001-06-12 21:18

Message:
Logged In: YES
user_id=89016

One additional note: It is vital that errors is an assignable attribute of the StreamWriter.

Consider the XML example: For writing an XML DOM tree one StreamWriter object is used. When a text node is written, the error handling has to be set to codecs.xmlreplace_encode_errors, but inside a comment or processing instruction replacing unencodable characters with charrefs is not possible, so here codecs.raise_encode_errors should be used (or better a custom error handler that raises an error that says "sorry, you can't have unencodable characters inside a comment")

BTW, should we continue the discussion in the i18n SIG mailing list? An email program is much more comfortable than a HTML textarea! ;)

----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2001-06-12 20:59

Message:
Logged In: YES
user_id=89016

How the callbacks work:

A PyObject * named errors is passed in. This may be NULL, Py_None, 'strict', u'strict', 'ignore', u'ignore', 'replace', u'replace' or a callable object.
PyCodec_EncodeHandlerForObject maps all of these objects to one of the three builtin error callbacks PyCodec_RaiseEncodeErrors (raises an exception), PyCodec_IgnoreEncodeErrors (returns an empty replacement string, in effect ignoring the error), PyCodec_ReplaceEncodeErrors (returns U+FFFD, the Unicode replacement character to signify to the encoder that it should choose a suitable replacement character) or directly returns errors if it is a callable object. When an unencodable character is encountered the error handling callback will be called with the encoding name, the original unicode object and the error position and must return a unicode object that will be encoded instead of the offending character (or the callback may of course raise an exception). U+FFFD characters in the replacement string will be replaced with a character that the encoder chooses ('?' in all cases).

The implementation of the loop through the string is done in the following way. A stack with two strings is kept and the loop always encodes a character from the string at the stacktop. If an error is encountered and the stack has only one entry (during encoding of the original string) the callback is called and the unicode object returned is pushed on the stack, so the encoding continues with the replacement string. If the stack has two entries when an error is encountered, the replacement string itself has an unencodable character and a normal exception is raised. When the encoder has reached the end of its current string there are two possibilities: when the stack contains two entries, this was the replacement string, so the replacement string will be popped from the stack and encoding continues with the next character from the original string. If the stack had only one entry, encoding is finished.

(I hope that's enough explanation of the API and implementation)

I have renamed the static ...121 function to all lowercase names.
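
The callback contract and the two-entry stack described above can be sketched in a few lines of present-day Python. This is only an illustration under stated assumptions, not the C code from the patch: the names encode_ascii, ignore_errors, replace_errors and xml_charref_errors are hypothetical, and ASCII is picked as the target encoding purely to keep the example short. Only the callback arguments (encoding name, original string, error position), the U+FFFD convention and the stack behaviour are taken from the description.

```python
REPLACEMENT = "\ufffd"  # U+FFFD asks the encoder to pick its own substitute


def ignore_errors(encoding, text, pos):
    """Return an empty replacement, which drops the offending character."""
    return ""


def replace_errors(encoding, text, pos):
    """Return U+FFFD so the encoder substitutes its own character ('?')."""
    return REPLACEMENT


def xml_charref_errors(encoding, text, pos):
    """Hypothetical custom callback: emit an XML character reference."""
    return "&#%d;" % ord(text[pos])


def encode_ascii(text, callback):
    """Encode text as ASCII, consulting callback for unencodable characters."""
    out = bytearray()
    # Each stack entry is [string, position]. The original string sits at
    # the bottom; at most one replacement string sits on top of it.
    stack = [[text, 0]]
    while stack:
        current, pos = stack[-1]
        if pos == len(current):
            # End of the current string: pop it. If it was a replacement,
            # encoding resumes in the original; otherwise we are finished.
            stack.pop()
            continue
        ch = current[pos]
        stack[-1][1] = pos + 1
        if ord(ch) < 128:
            out.append(ord(ch))
        elif len(stack) == 2:
            if ch == REPLACEMENT:
                out.append(ord("?"))  # encoder-chosen substitute
            else:
                # The replacement itself is unencodable: give up.
                raise UnicodeError("unencodable character in replacement string")
        else:
            # Error in the original string: ask the callback and continue
            # encoding with whatever it returns.
            stack.append([callback("ascii", text, pos), 0])
    return bytes(out)
```

With these definitions, encode_ascii("a\xe4b", replace_errors) yields b"a?b", ignore_errors yields b"ab", and xml_charref_errors yields b"a&#228;b". A callback that returns another unencodable character triggers the plain exception, matching the two-entry case above.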
BTW, I guess PyUnicode_EncodeUnicodeEscape could be reimplemented as PyUnicode_EncodeASCII with a \uxxxx replacement callback.

PyCodec_RaiseEncodeErrors, PyCodec_IgnoreEncodeErrors, PyCodec_ReplaceEncodeErrors are globally visible because they have to be available in _codecsmodule.c to wrap them as Python function objects, but they can't be implemented in _codecsmodule, because they need to be available to the encoders in unicodeobject.c (through PyCodec_EncodeHandlerForObject), but importing the codecs module might result in an endless recursion, because importing a module requires unpickling of the bytecode, which might require decoding utf8, which ... (but this will only happen, if we implement the same mechanism for the decoding API)

I have not touched PyUnicode_TranslateCharmap yet, should this function also support error callbacks? Why would one want to insert None into the mapping to call the callback?

A remaining problem is how to implement decoding error callbacks. In Python 2.1 encoding and decoding errors are handled in the same way with a string value. But with callbacks it doesn't make sense to use the same callback for encoding and decoding (like codecs.StreamReaderWriter and codecs.StreamRecoder do). Decoding callbacks have a different API. Which arguments should be passed to the decoding callback, and what is the decoding callback supposed to do?

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2001-06-12 20:00

Message:
Logged In: YES
user_id=38388

About the Py_UNICODE*data, int size APIs: Ok, point taken.

In general, I think we ought to keep the callback feature as open as possible, so passing in pointers and sizes would not be very useful.

BTW, could you summarize how the callback works in a few lines ?

About _Encode121: I'd name this _EncodeUCS1 since that's what it is ;-)

About the new functions: I was referring to the new static functions which you gave PyUnicode_... names.
If these are not supposed to turn into non-static functions, I'd rather have them use lower case names (since that's how the Python internals work too -- most of the times).

----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2001-06-12 18:56

Message:
Logged In: YES
user_id=89016

> One thing which I don't like about your API change is that
> you removed the Py_UNICODE*data, int size style arguments
> --
> this makes it impossible to use the new APIs on non-Python
> data or data which is not available as Unicode object.

Another problem is, that the callback requires a Python object, so in the PyObject *version, the refcount is incref'd and the object is passed to the callback. The Py_UNICODE*/int version would have to create a new Unicode object from the data.

----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2001-06-12 18:32

Message:
Logged In: YES
user_id=89016

> * please don't place more than one C statement on one line
> like in:
> """
> + unicode = unicode2; unicodepos =
> unicode2pos;
> + unicode2 = NULL; unicode2pos = 0;
> """

OK, done!

> * Comments should start with a capital letter and be
> prepended
> to the section they apply to

Fixed!

> * There should be spaces between arguments in compares
> (a == b) not (a==b)

Fixed!

> * Where does the name "...Encode121" originate ?

encode one-to-one, it implements both ASCII and latin-1 encoding.

> * module internal APIs should use lower case names (you
> converted some of these to PyUnicode_...() -- this is
> normally reserved for APIs which are either marked as
> potential candidates for the public API or are very
> prominent in the code)

Which ones?
I introduced a new function for every old one, that had a "const char *errors" argument, and a few new ones in codecs.h, of those PyCodec_EncodeHandlerForObject is vital, because it is used to map old string arguments to the new function objects. PyCodec_RaiseEncodeErrors can be used in the encoder implementation to raise an encode error, but it could be made static in unicodeobject.h so only those encoders implemented there have access to it.

> One thing which I don't like about your API change is that
> you removed the Py_UNICODE*data, int size style arguments
> --
> this makes it impossible to use the new APIs on non-Python
> data or data which is not available as Unicode object.

I looked through the code and found no situation where the Py_UNICODE*/int version is really used and having two (PyObject *)s (the original and the replacement string), instead of UNICODE*/int and PyObject * made the implementation a little easier, but I can fix that.

> Please separate the errors.c patch from this patch -- it
> seems totally unrelated to Unicode.

PyCodec_RaiseEncodeErrors uses this to have a \Uxxxx with four hex digits. I removed it.

I'll upload a revised patch as soon as it's done.

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2001-06-12 16:29

Message:
Logged In: YES
user_id=38388

Thanks for the patch -- it looks very impressive !. I'll give it a try later this week.

Some first cosmetic tidbits:

* please don't place more than one C statement on one line like in:
"""
+ unicode = unicode2; unicodepos = unicode2pos;
+ unicode2 = NULL; unicode2pos = 0;
"""
* Comments should start with a capital letter and be prepended to the section they apply to
* There should be spaces between arguments in compares (a == b) not (a==b)
* Where does the name "...Encode121" originate ?
* module internal APIs should use lower case names (you converted some of these to PyUnicode_...() -- this is normally reserved for APIs which are either marked as potential candidates for the public API or are very prominent in the code) One thing which I don't like about your API change is that you removed the Py_UNICODE*data, int size style arguments -- this makes it impossible to use the new APIs on non-Python data or data which is not available as Unicode object. Please separate the errors.c patch from this patch -- it seems totally unrelated to Unicode. Thanks. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=432401&group_id=5470 From noreply@sourceforge.net Wed Aug 14 12:17:24 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Wed, 14 Aug 2002 04:17:24 -0700 Subject: [Patches] [ python-Patches-588564 ] _locale library patch Message-ID: Patches item #588564, was opened at 2002-07-30 05:58 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=588564&group_id=5470 Category: Distutils and setup.py Group: None >Status: Closed Resolution: Accepted Priority: 5 Submitted By: Jason Tishler (jlt63) Assigned to: Jason Tishler (jlt63) Summary: _locale library patch Initial Comment: This patch enables setup.py to find gettext routines when they are located in libintl instead of libc. Although I developed this patch for Cygwin, I hope that it can be easily updated to support other platforms (if necessary). I tested this patch under Cygwin and Red Hat Linux 7.1. ---------------------------------------------------------------------- >Comment By: Jason Tishler (jlt63) Date: 2002-08-14 03:17 Message: Logged In: YES user_id=86216 > Ok, I'll accept the patch - please apply it. Committed as setup.py 1.107. 
> As for distutils developers - there are none
> left, so unless interested users develop it, it
> is stuck right where it is now.

Ouch! Given that the Python build is dependent on distutils this is especially painful...

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2002-08-14 01:21

Message:
Logged In: YES
user_id=21627

Ok, I'll accept the patch - please apply it.

As for distutils developers - there are none left, so unless interested users develop it, it is stuck right where it is now.

----------------------------------------------------------------------

Comment By: Jason Tishler (jlt63)
Date: 2002-08-12 03:53

Message:
Logged In: YES
user_id=86216

I'm a strong proponent of doing the right thing. The unfortunate reality is that I'm way overcommitted right now. Hence, I don't have the spare cycles to figure out the best way to accomplish this "major undertaking" (to quote you). Additionally, one of the distutils developers could do a better job with much less effort than I could.

Please reconsider my original patch. I'm willing to change it (in the future) to be more autoconf-like if someone else is willing to do the underlying work.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2002-08-09 15:49

Message:
Logged In: YES
user_id=21627

Yes, that's what I meant. It eventually results in distutils getting some of the capabilities of autoconf. I agree this is a major undertaking, but one that I think needs to progress over time, in small steps.

For the current problem, it might be useful to emulate AC_TRY_LINK: generate a small program, and see whether the compiler manages to link it. You probably need to allow for result-caching as well; I recommend to put the cache file into build/temp..

This may all sound very ad-hoc, but I think it can be made to work with reasonable effort.
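
A link probe of the kind suggested here (generate a small program and see whether it links) might look roughly like the following in present-day Python. It is a sketch under stated assumptions, not distutils code: make_probe_source, try_link and libraries_for_gettext are hypothetical names, the compiler is assumed to be reachable as "cc" and to accept the usual "-o" and "-l" options, and the result caching mentioned above is left out.

```python
import os
import subprocess
import tempfile


def make_probe_source(function_name):
    """Return a tiny C program that references function_name."""
    # Declaring the symbol ourselves, as autoconf does, avoids needing headers.
    return (
        "char %s(void);\n"
        "int main(void) { return (int) %s(); }\n" % (function_name, function_name)
    )


def try_link(function_name, libraries=(), compiler="cc"):
    """Return True if a program calling function_name links successfully."""
    with tempfile.TemporaryDirectory() as workdir:
        source = os.path.join(workdir, "probe.c")
        binary = os.path.join(workdir, "probe")
        with open(source, "w") as handle:
            handle.write(make_probe_source(function_name))
        command = [compiler, source, "-o", binary]
        command += ["-l" + name for name in libraries]
        try:
            result = subprocess.run(
                command, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
            )
        except OSError:
            return False  # no usable compiler found
        return result.returncode == 0


def libraries_for_gettext(compiler="cc"):
    """Return extra libraries needed for gettext, or None if unavailable."""
    if try_link("gettext", compiler=compiler):
        return []  # provided by the C library
    if try_link("gettext", ["intl"], compiler=compiler):
        return ["intl"]  # provided by libintl
    return None
```

A setup script could call libraries_for_gettext() once and pass the result on as the extension's library list, printing a message and continuing when it returns None.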
We probably need to present any results to distutils-sig before committing them.

----------------------------------------------------------------------

Comment By: Jason Tishler (jlt63)
Date: 2002-08-09 10:04

Message:
Logged In: YES
user_id=86216

I presume that you mean to use an autoconf-style approach *in* setup.py. Is this assumption correct?

If so, then I know how to search for libintl.h via find_file(). Unfortunately, I do not know how to check that a function (e.g., gettext()) is in a library (i.e., libc.a). Any suggestions?

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2002-08-04 00:37

Message:
Logged In: YES
user_id=21627

I would really prefer if such problems were solved in an autoconf-style approach:
- is libintl.h present (you may ask pyconfig.h for that)
- if so, is gettext provided by the C library
- if not, is it provided by -lintl
- if yes, add -lintl
- if no, print an error message and continue

----------------------------------------------------------------------

You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=588564&group_id=5470

From noreply@sourceforge.net Wed Aug 14 12:35:43 2002
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Wed, 14 Aug 2002 04:35:43 -0700
Subject: [Patches] [ python-Patches-595014 ] Cygwin tempfile patch
Message-ID:

Patches item #595014, was opened at 2002-08-14 03:35
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=595014&group_id=5470

Category: Library (Lib)
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Jason Tishler (jlt63)
Assigned to: Martin v. Löwis (loewis)
Summary: Cygwin tempfile patch

Initial Comment:
Although Cygwin attempts to be as Posix compliant as possible, it has difficulties unlinking open files. This is not surprising given that Cygwin is dependent on Win32 which in turn has this problem itself.
The attached tempfile patch acknowledges this Cygwin limitation. Without this patch, Cygwin fails test_tempfile (i.e., test_has_no_name) as follows: $ ./python -E -tt ../Lib/test/regrtest.py -l test_tempfile test_tempfile test test_tempfile failed -- Traceback (most recent call last): File "/home/jt/src/PythonCvs/Lib/test/test_tempfile.py", line 689, in test_has_no_name self.failOnException("rmdir", ei) File "/home/jt/src/PythonCvs/Lib/test/test_tempfile.py", line 33, in failOnException self.fail("%s raised %s: %s" % (what, ei[0], ei[1])) File "/home/jt/src/PythonCvs/Lib/unittest.py", line 260, in fail raise self.failureException, msg AssertionError: rmdir raised exceptions.OSError: [Errno 90] Directory not empty: '/mnt/c/DOCUME~1/jatis/LOCALS~1/Temp/tmpM_z8nj' ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=595014&group_id=5470 From noreply@sourceforge.net Wed Aug 14 14:02:03 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Wed, 14 Aug 2002 06:02:03 -0700 Subject: [Patches] [ python-Patches-589982 ] tempfile.py rewrite Message-ID: Patches item #589982, was opened at 2002-08-02 02:38 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=589982&group_id=5470 Category: Library (Lib) Group: Python 2.3 Status: Open Resolution: Accepted Priority: 5 Submitted By: Zack Weinberg (zackw) Assigned to: Guido van Rossum (gvanrossum) Summary: tempfile.py rewrite Initial Comment: This rewrite closes a number of security-relevant races in tempfile.py; makes temporary filenames much harder to guess; provides secure interfaces that can be used to close similar races elsewhere; and makes it possible to control the prefix and directory of each temporary created, individually. 
---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-14 09:02 Message: Logged In: YES user_id=6380 I'd like to change the binary=True argument to mkstemp into text=False; that seems easier to explain. News about the HP errors from Kalle Svensson: Traceback (most recent call last): File "../python/dist/src/Lib/test/test_tempfile.py", line 719, in ? test_main() File "../python/dist/src/Lib/test/test_tempfile.py", line 716, in test_main test_support.run_suite(suite) File "/mp/slaskdisk/tmp/sfarmer/python/dist/src/Lib/test/test_support.py", line 188, in run_suite raise TestFailed(err) test.test_support.TestFailed: Traceback (most recent call last): File "../python/dist/src/Lib/test/test_tempfile.py", line 295, in test_basic_many File "../python/dist/src/Lib/test/test_tempfile.py", line 278, in do_create File "../python/dist/src/Lib/test/test_tempfile.py", line 33, in failOnException File "/mp/slaskdisk/tmp/sfarmer/python/dist/src/Lib/unittest.py", line 260, in fail AssertionError: _mkstemp_inner raised exceptions.OSError: [Errno 24] Too many open files: '/tmp/aaU3irrA' ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-11 00:28 Message: Logged In: YES user_id=6380 I'm reopening this just as a precaution. The snake farm reported two messages on HP-UX 11 when the test suite was run: Exception exceptions.AttributeError: "mkstemped instance has no attribute 'fd'" in > ignored Exception exceptions.AttributeError: "mkstemped instance has no attribute 'fd'" in > ignored The mkstemped class is defined in test_maketemp.py. That error can happen if a mkstemped instance isn't fully initialized, e.g. if the _mkstemp_inner() call in mkstemped.__init__ fails. But then I would have expected a failure reported, which I don't see... 
---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-10 00:17 Message: Logged In: YES user_id=6380 I'm reopening this just as a precaution. The snake farm reported two messages on HP-UX 11 when the test suite was run: Exception exceptions.AttributeError: "mkstemped instance has no attribute 'fd'" in > ignored Exception exceptions.AttributeError: "mkstemped instance has no attribute 'fd'" in > ignored The mkstemped class is defined in test_maketemp.py. That error can happen if a mkstemped instance isn't fully initialized, e.g. if the _mkstemp_inner() call in mkstemped.__init__ fails. But then I would have expected a failure reported, which I don't see... ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-09 20:19 Message: Logged In: YES user_id=6380 Oops, looks like typos in the patch. Fixed (I hope). Question for Zack: I noticed that a few times you changed this: temp = tempfile.mktemp() into this: (fd, temp) = tempfile.mkstemp() os.close(fd) If the latter is secure, why can't mktemp() be defined as doing that? ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-09 16:53 Message: Logged In: YES user_id=33168 Guido, there are still 3 uses of mktemp after the checkin. Should these use mkstemp()? Lib/toaiff.py:102 Lib/plat-irix5/torgb.py:95 Lib/plat-irix6/torgb.py:95 ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-09 12:40 Message: Logged In: YES user_id=6380 I've checked this all in now. The changes to test_tempfile.py weren't as easily fixable to work without the tempfile.py changes as Zack thought. I hope the community will give it some review. It will probably break some Zope tests. 
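
For readers following the mktemp() to mkstemp() substitution discussed above, here is a minimal sketch of the mkstemp() pattern as it exists in present-day Python. The prefix, suffix and file contents are made up for the example; it does not try to answer the question of how mktemp() itself should be defined.

```python
import os
import tempfile

# mkstemp() creates the file and returns an open descriptor plus its path,
# so no other process can claim the name between choosing it and opening it.
fd, path = tempfile.mkstemp(prefix="demo-", suffix=".txt")
try:
    with os.fdopen(fd, "w") as handle:  # fdopen takes ownership of fd
        handle.write("scratch data\n")
    with open(path) as handle:
        contents = handle.read()
finally:
    os.remove(path)  # the caller is responsible for cleanup
```

Unlike mktemp(), which only returns a name, the file already exists when mkstemp() returns, and removing it is left to the caller.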
---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-09 10:50 Message: Logged In: YES user_id=6380 OK, I'll do something along those lines myself. ---------------------------------------------------------------------- Comment By: Zack Weinberg (zackw) Date: 2002-08-08 15:01 Message: Logged In: YES user_id=580015 I'm afraid my idea of patch size comes from GCC land, where this would be considered not that big at all. Only 1500 lines affected, and more than half of that is documentation and test suite! I tried, and failed, to break up the changes to tempfile.py itself. But there's some larger divisions that could be made. We could check in the new test_tempfile.py now, disabling the tests that refer to nonexistent functions (just comment out the lines that add those tests to the test_classes array). The changes to the rest of the test suite are also largely independent of the tempfile.py rewrite (since they replace tempfile.mktemp() with TESTFN, mostly). And the search-and-replace changes in the library can wait until after tempfile.py itself gets reviewed. Unfortunately, I am about to go on vacation for five days, so I don't have time now to do this split-up. I will try to drum up interest on python-dev in reviewing the patch as is. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-05 12:48 Message: Logged In: YES user_id=6380 I like the idea of fixing security holes. This patch is *humungous*. Even just the doc changes and the changes to tempfile.py itself are massive and require very careful reading to review all the consequences. Zack, can you try to interest someone with more time than me in reviewing this patch? What's the point of renaming all imports with a leading underscore? I thought __all__ took care of that. 
---------------------------------------------------------------------- Comment By: Zack Weinberg (zackw) Date: 2002-08-03 02:53 Message: Logged In: YES user_id=580015 I've revised the patch; ignore the old one. This version includes a vastly expanded test_tempfile.py which hits every line that I know how to test. The omissions are marked - it's mostly non-Unix issues. Also, I went through the entire CVS repository and replaced all uses of tempfile.mktemp with mkstemp/mkdtemp/NamedTemporaryFile, as appropriate. The sole exception is Lib/os.py, which is addressed by patch #590294. The sole functional change to tempfile.py itself, from the previous, is to throw os.O_NOFOLLOW into the open flags. This closes yet another hole - on some systems, without this flag, open(file, O_CREAT|O_EXCL) will follow a symbolic link that points to a nonexistent file, and create the link target. (This has no effect on a symlink in the directory components of the pathname - if the sysadmin has symlinked /tmp to /hugedisk/scratch, that still works.) ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-02 10:45 Message: Logged In: YES user_id=6380 This needs some serious review! Volunteers??? 
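
The open flags discussed in this thread (O_CREAT|O_EXCL, with O_NOFOLLOW added where the platform offers it) can be illustrated with a short sketch in present-day Python. The helper name create_exclusive is hypothetical and this is not the code from tempfile.py; it only shows the flag combination under the assumption that a missing O_NOFOLLOW should simply be skipped.

```python
import os
import tempfile


def create_exclusive(path, mode=0o600):
    """Create path for read/write, failing if anything already exists there."""
    flags = os.O_RDWR | os.O_CREAT | os.O_EXCL
    # O_NOFOLLOW is not available on every platform, so add it only if present.
    flags |= getattr(os, "O_NOFOLLOW", 0)
    return os.open(path, flags, mode)


# Usage: the first call creates the file, a second call on the same name fails.
with tempfile.TemporaryDirectory() as workdir:
    target = os.path.join(workdir, "scratch")
    os.close(create_exclusive(target))
    try:
        create_exclusive(target)
        second_call_failed = False
    except FileExistsError:
        second_call_failed = True
```

Because the existence check and the creation happen in one system call, there is no window in which another process can slip a file in under the chosen name.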
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=589982&group_id=5470 From noreply@sourceforge.net Wed Aug 14 14:16:58 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Wed, 14 Aug 2002 06:16:58 -0700 Subject: [Patches] [ python-Patches-589982 ] tempfile.py rewrite Message-ID: Patches item #589982, was opened at 2002-08-02 02:38 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=589982&group_id=5470 Category: Library (Lib) Group: Python 2.3 Status: Open Resolution: Accepted Priority: 5 Submitted By: Zack Weinberg (zackw) Assigned to: Guido van Rossum (gvanrossum) Summary: tempfile.py rewrite Initial Comment: This rewrite closes a number of security-relevant races in tempfile.py; makes temporary filenames much harder to guess; provides secure interfaces that can be used to close similar races elsewhere; and makes it possible to control the prefix and directory of each temporary created, individually. ---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-14 09:16 Message: Logged In: YES user_id=6380 The "Too many open files" problem is solved. The HP system was configured to allow only 200 open file descriptors per process. But maybe the test would work just as well if it tried to create 100 instead of 1000 temp files? I expect that this would cause failures on other systems with conservative limits.
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=589982&group_id=5470 From noreply@sourceforge.net Wed Aug 14 15:48:05 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Wed, 14 Aug 2002 07:48:05 -0700 Subject: [Patches] [ python-Patches-595014 ] Cygwin tempfile patch Message-ID: Patches item #595014, was opened at 2002-08-14 13:35 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=595014&group_id=5470 Category: Library (Lib) Group: None Status: Open >Resolution: Accepted Priority: 5 Submitted By: Jason Tishler (jlt63) >Assigned to: Jason Tishler (jlt63) Summary: Cygwin tempfile patch Initial Comment: Although Cygwin attempts to be as Posix compliant as possible, it has difficulties unlinking open files. This is not surprising given that Cygwin is dependent on Win32 which in turn has this problem itself. The attached tempfile patch acknowledges this Cygwin limitation.
Without this patch, Cygwin fails test_tempfile (i.e., test_has_no_name) as follows: $ ./python -E -tt ../Lib/test/regrtest.py -l test_tempfile test_tempfile test test_tempfile failed -- Traceback (most recent call last): File "/home/jt/src/PythonCvs/Lib/test/test_tempfile.py", line 689, in test_has_no_name self.failOnException("rmdir", ei) File "/home/jt/src/PythonCvs/Lib/test/test_tempfile.py", line 33, in failOnException self.fail("%s raised %s: %s" % (what, ei[0], ei[1])) File "/home/jt/src/PythonCvs/Lib/unittest.py", line 260, in fail raise self.failureException, msg AssertionError: rmdir raised exceptions.OSError: [Errno 90] Directory not empty: '/mnt/c/DOCUME~1/jatis/LOCALS~1/Temp/tmpM_z8nj' ---------------------------------------------------------------------- >Comment By: Martin v. Löwis (loewis) Date: 2002-08-14 16:48 Message: Logged In: YES user_id=21627 This is fine, please apply it. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=595014&group_id=5470 From noreply@sourceforge.net Wed Aug 14 15:56:01 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Wed, 14 Aug 2002 07:56:01 -0700 Subject: [Patches] [ python-Patches-463656 ] setup.py, --with-includepath, and LD_LIB Message-ID: Patches item #463656, was opened at 2001-09-21 21:12 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=463656&group_id=5470 Category: Distutils and setup.py Group: None Status: Open Resolution: None Priority: 5 Submitted By: Frederic Giacometti (giacometti) Assigned to: Nobody/Anonymous (nobody) Summary: setup.py, --with-includepath, and LD_LIB Initial Comment: This patch improves the module detection capability in setup.py. The following improvements are implemented: - directories listed in LD_LIBRARY_PATH are also searched for shared libraries.
- the --with-includepath option has been added to configure, to specify additional non-standard directories where the include files are to be searched for. The corresponding changes were added to setup.py (new function detect_include(), find_library_file() augmented, detect_tkinter() improved). I manually back-ported the changes from configure into configure.in, but I did not run autoconf; you might want to double-check this. Sample application: ./configure --prefix=/something --with-includepath='/mgl/apps/include:/mgl/share/include' With this patch, I get Tkinter to build correctly without editing the Setup files, with non-standard tcl/tk 8.0 to 8.3 installations, where the only tcl.h file is in /mgl/share/include/tcl8.0 (therefore, tkinter is built with tcl8.0 on this configuration). FG ---------------------------------------------------------------------- >Comment By: Martin v. Löwis (loewis) Date: 2002-08-14 16:56 Message: Logged In: YES user_id=21627 Frederic, do you still think there is a need for this patch? ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-01-17 18:58 Message: Logged In: YES user_id=6656 You do know that you can pass -I and -L options to setup.py? That might be a less involved way of doing what you want. ---------------------------------------------------------------------- Comment By: Frederic Giacometti (giacometti) Date: 2001-09-28 02:17 Message: Logged In: YES user_id=93657 I moved the functions find_library_file() and detect_include() to distutils.sysconfig(), so that they can be reused for configuring third party modules too (e.g.: PyOpenGL...). Let me know if you wish a patch for this. Frederic Giacometti ---------------------------------------------------------------------- Comment By: Frederic Giacometti (giacometti) Date: 2001-09-27 00:56 Message: Logged In: YES user_id=93657 I'm replacing the patch with an improved version (against main line as of 09/26/01).
New features: - configure is generated from configure.in, with autoconf - detect_tkinter also checks the version number inside the tcl.h and tk.h files (#define TCL_VERSION, #define TK_VERSION...). The 'tk_detect' improvement is in this same patch as the '--include-patch' feature; since the second one was written to get the first one working. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=463656&group_id=5470 From noreply@sourceforge.net Wed Aug 14 16:02:10 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Wed, 14 Aug 2002 08:02:10 -0700 Subject: [Patches] [ python-Patches-485572 ] Distutils -- set runtime library path Message-ID: Patches item #485572, was opened at 2001-11-26 13:03 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=485572&group_id=5470 Category: None Group: None >Status: Closed >Resolution: Out of Date Priority: 5 Submitted By: Richard Everson (reverson) Assigned to: M.-A. Lemburg (lemburg) Summary: Distutils -- set runtime library path Initial Comment: The runtime libraries option to Distutils Extension currently fails with gcc (on Linux at least) because, despite what "info gcc" says, gcc -R/run/time/lib/path doesn't get relayed to the loader to set the runtime libraries search path. In order to correctly pass the -R to the loader you need gcc -Wl,-R/run/time/lib/path There is something mentioning the -Wl, option in cygwinccompiler.py but it is commented out. The attached patch works for me on a Linux or Solaris box, but I haven't tested it more extensively than that. It is against $Id: extension.py,v 1.8 2001/03/22 03:48:31 akuchling Exp $ from the python-2.1.1 distribution. Also I'm not sure of the "correct" way of deducing what compiler is being used -- I hope the get_config_var() route is acceptable. 
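The -R versus -Wl,-R distinction described in the initial comment can be sketched in Python. This is an illustration only, not the attached patch: the helper name runtime_library_dir_option is hypothetical, and checking for "gcc" in the compiler name taken from sysconfig.get_config_var("CC") is an assumption about how the compiler might be detected.

```python
import os
import sysconfig


def runtime_library_dir_option(directory, compiler=None):
    """Return the flag that sets the runtime library search path.

    Hypothetical helper: gcc does not forward a bare -R to the loader,
    so the option is wrapped in -Wl, when the compiler looks like gcc.
    """
    if compiler is None:
        # CC may be unset, or may carry extra flags such as "gcc -pthread".
        compiler = sysconfig.get_config_var("CC") or ""
    parts = compiler.split()
    name = os.path.basename(parts[0]) if parts else ""
    if "gcc" in name or "g++" in name:
        return "-Wl,-R" + directory
    return "-R" + directory
```

With compiler="gcc" the sketch yields -Wl,-R/run/time/lib/path; any other compiler name falls back to the plain -R form.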
---------------------------------------------------------------------- >Comment By: Martin v. Löwis (loewis) Date: 2002-08-14 17:02 Message: Logged In: YES user_id=21627 It appears that this has been fixed since unixccompiler.py 1.38. Closing it as out-of-date. ---------------------------------------------------------------------- Comment By: Richard Everson (reverson) Date: 2001-11-26 14:43 Message: Logged In: YES user_id=385419 Sorry about the lack of the patch -- I pressed submit too soon. It should now be attached. ---------------------------------------------------------------------- Comment By: M.-A. Lemburg (lemburg) Date: 2001-11-26 14:15 Message: Logged In: YES user_id=38388 I'll look into this after the Python 2.2 feature freeze. Thanks. ---------------------------------------------------------------------- Comment By: M.-A. Lemburg (lemburg) Date: 2001-11-26 14:14 Message: Logged In: YES user_id=38388 There's no uploaded file! You have to check the checkbox labeled "Check to Upload & Attach File" when you upload a file. Please try again. (This is a SourceForge annoyance that we can do nothing about.
:-( ) ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=485572&group_id=5470 From noreply@sourceforge.net Wed Aug 14 16:03:39 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Wed, 14 Aug 2002 08:03:39 -0700 Subject: [Patches] [ python-Patches-586561 ] Better token-related error messages Message-ID: Patches item #586561, was opened at 2002-07-25 16:21 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=586561&group_id=5470 Category: Parser/Compiler Group: Python 2.3 Status: Open >Resolution: Accepted Priority: 5 Submitted By: Skip Montanaro (montanaro) >Assigned to: Skip Montanaro (montanaro) Summary: Better token-related error messages Initial Comment: There were some complaints recently on c.l.py about the rather non-informative error messages emitted as a result of the tokenizer detecting a problem. In many situations it simply returns E_TOKEN which generates a fairly benign, but often unhelpful "invalid token" message. This patch adds several new E_* macros to Include/errorcode.h, returns them from the appropriate places in Parser/tokenizer.c and generates more specific messages in Python/pythonrun.c. I think the error messages are always better, though in some situations they may still not be strictly correct. Assigning to Jeremy since he's the compiler wiz. Skip ---------------------------------------------------------------------- >Comment By: Jeremy Hylton (jhylton) Date: 2002-08-14 15:03 Message: Logged In: YES user_id=31392 Looks good to me! Sorry I took so long to review it again.
---------------------------------------------------------------------- Comment By: Skip Montanaro (montanaro) Date: 2002-08-04 21:39 Message: Logged In: YES user_id=44345 here's a new patch - deletes all but the EOFC & EOFS macros and adds a test_eof.py test module ---------------------------------------------------------------------- Comment By: Jeremy Hylton (jhylton) Date: 2002-08-02 18:17 Message: Logged In: YES user_id=31392 The current error message for the complex case seems clear enough since it identifies exactly the offending character. >>> 3i+2 File "<stdin>", line 1 3i+2 ^ SyntaxError: invalid syntax The error message for runaway triple quoted strings is much more puzzling, since the line of context doesn't have anything useful on it. I guess we should think about the others, too: E_EOLS is marginal, since you do get the line with the error in the exception. E_EOFC is a win for the same reason that E_EOFS is, although I expect it's a less common case. E_EXP and E_SLASH are borderline -- again because the current syntax error identifies exactly the line and character that are causing the problem. We should get a third opinion, but I'd probably settle for just E_EOFC and E_EOFS. ---------------------------------------------------------------------- Comment By: Skip Montanaro (montanaro) Date: 2002-08-02 18:05 Message: Logged In: YES user_id=44345 re: i vs. j Perhaps it's not needed. The patch was originally designed to address the case of runaway triple-quoted strings. Someone on c.l.py ranted about that. While I was in there, I recalled someone else (perhaps more than one person) had berated Python in the past because imaginary numbers use 'j' instead of 'i' and decided to stick it in. It's no big deal to take it out. (When you think about it, they are all corner cases, since most of the time the code is syntactically correct.
;-) S ---------------------------------------------------------------------- Comment By: Jeremy Hylton (jhylton) Date: 2002-08-02 17:54 Message: Logged In: YES user_id=31392 Is the warning about i vs. j for complex numbers really necessary? It seems like it adds extra, well, complexity for a tiny corner case. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=586561&group_id=5470 From noreply@sourceforge.net Wed Aug 14 16:07:56 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Wed, 14 Aug 2002 08:07:56 -0700 Subject: [Patches] [ python-Patches-594869 ] Nuke 32-bit-isms in gettext Message-ID: Patches item #594869, was opened at 2002-08-13 22:48 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=594869&group_id=5470 Category: Library (Lib) Group: Python 2.3 >Status: Closed >Resolution: Accepted Priority: 5 Submitted By: Tim Peters (tim_one) Assigned to: Barry A. Warsaw (bwarsaw) Summary: Nuke 32-bit-isms in gettext Initial Comment: I don't know how to test this module, so assigning to Barry. The intent is to get away from treating 32-bit bit patterns as signed ints in gettext.py, in particular to stop the current warning msgs whenever this test is run. 
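The signed-versus-unsigned problem described above can be illustrated with the struct module. This is a minimal sketch, assuming the trouble comes from reading a 32-bit pattern such as the .mo magic number as a signed int; the variable names are illustrative and this is not the attached patch.

```python
import struct

# Magic number of a little-endian GNU .mo file; its top bit is set,
# so it does not fit in a signed 32-bit int.
LE_MAGIC = 0x950412DE

raw = struct.pack("<I", LE_MAGIC)

# "<i" interprets the four bytes as a signed int and yields a negative value.
as_signed = struct.unpack("<i", raw)[0]

# "<I" interprets the same bytes as unsigned and round-trips the constant.
as_unsigned = struct.unpack("<I", raw)[0]

# Masking recovers the unsigned bit pattern from the signed reading.
recovered = as_signed & 0xFFFFFFFF
```

Reading with the unsigned format code avoids any comparison between a negative number and a hexadecimal constant written with its high bit set.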
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=594869&group_id=5470 From noreply@sourceforge.net Wed Aug 14 16:11:41 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Wed, 14 Aug 2002 08:11:41 -0700 Subject: [Patches] [ python-Patches-595014 ] Cygwin tempfile patch Message-ID: Patches item #595014, was opened at 2002-08-14 03:35 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=595014&group_id=5470 Category: Library (Lib) Group: None >Status: Closed Resolution: Accepted Priority: 5 Submitted By: Jason Tishler (jlt63) Assigned to: Jason Tishler (jlt63) Summary: Cygwin tempfile patch Initial Comment: Although Cygwin attempts to be as Posix compliant as possible, it has difficulties unlinking open files. This is not surprising given that Cygwin is dependent on Win32 which in turn has this problem itself. The attached tempfile patch acknowledges this Cygwin limitation. Without this patch, Cygwin fails test_tempfile (i.e., test_has_no_name) as follows: $ ./python -E -tt ../Lib/test/regrtest.py -l test_tempfile test_tempfile test test_tempfile failed -- Traceback (most recent call last): File "/home/jt/src/PythonCvs/Lib/test/test_tempfile.py", line 689, in test_has_no_name self.failOnException("rmdir", ei) File "/home/jt/src/PythonCvs/Lib/test/test_tempfile.py", line 33, in failOnException self.fail("%s raised %s: %s" % (what, ei[0], ei[1])) File "/home/jt/src/PythonCvs/Lib/unittest.py", line 260, in fail raise self.failureException, msg AssertionError: rmdir raised exceptions.OSError: [Errno 90] Directory not empty: '/mnt/c/DOCUME~1/jatis/LOCALS~1/Temp/tmpM_z8nj' ---------------------------------------------------------------------- >Comment By: Jason Tishler (jlt63) Date: 2002-08-14 07:11 Message: Logged In: YES user_id=86216 Committed as Lib/tempfile.py 1.48. 
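The limitation described in the initial comment amounts to a platform check before relying on unlink-while-open. The following is only a sketch of that idea, not the committed change: the function name can_unlink_open_files is hypothetical, and treating every non-POSIX platform the same way as Cygwin is an assumption.

```python
import os
import sys


def can_unlink_open_files(os_name=os.name, platform=sys.platform):
    """Guess whether a file may be removed while it is still open.

    Hypothetical helper: POSIX systems normally allow it, but Cygwin
    sits on top of Win32 and is treated here like a non-POSIX system.
    """
    return os_name == "posix" and platform != "cygwin"
```

Code that creates anonymous temporary files could fall back to a named file that is deleted on close whenever this check returns False.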
---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-08-14 06:48 Message: Logged In: YES user_id=21627 This is fine, please apply it. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=595014&group_id=5470 From noreply@sourceforge.net Wed Aug 14 16:11:59 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Wed, 14 Aug 2002 08:11:59 -0700 Subject: [Patches] [ python-Patches-534862 ] help asyncore recover from repr() probs Message-ID: Patches item #534862, was opened at 2002-03-25 22:12 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=534862&group_id=5470 Category: Library (Lib) Group: Python 2.3 >Status: Closed >Resolution: Out of Date Priority: 5 Submitted By: Skip Montanaro (montanaro) Assigned to: Jeremy Hylton (jhylton) Summary: help asyncore recover from repr() probs Initial Comment: I've had this patch in my copy of asyncore.py for quite a while. It works for me as a way to recover from repr() bogosities, though I'm unfamiliar enough with repr/str issues and asyncore to know if this is the right way to make it more bulletproof (or if it should even be made more bulletproof). Skip ---------------------------------------------------------------------- >Comment By: Martin v. Löwis (loewis) Date: 2002-08-14 17:11 Message: Logged In: YES user_id=21627 That patch is out of date. In the code you replace, self is not printed, so there is no need to protect against repr failures. ---------------------------------------------------------------------- Comment By: Skip Montanaro (montanaro) Date: 2002-07-10 01:46 Message: Logged In: YES user_id=44345 Looking for a vote up or down so I can get rid of the "M" when I execute "cvs up"...
S ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-04 19:12 Message: Logged In: YES user_id=6380 Jeremy, what do you think of this? Looks harmless to me... ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=534862&group_id=5470 From noreply@sourceforge.net Wed Aug 14 16:46:42 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Wed, 14 Aug 2002 08:46:42 -0700 Subject: [Patches] [ python-Patches-550192 ] Set softspace to 0 in raw_input() Message-ID: Patches item #550192, was opened at 2002-04-29 16:39 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=550192&group_id=5470 Category: Core (C code) Group: Python 2.3 >Status: Closed >Resolution: Accepted Priority: 5 Submitted By: Gustavo Niemeyer (niemeyer) Assigned to: Martin v. Löwis (loewis) Summary: Set softspace to 0 in raw_input() Initial Comment: Setting softspace to 0 in raw_input() makes it behave as expected when a "print 'something'," precedes the raw_input() call, with or without a prompt argument. ---------------------------------------------------------------------- >Comment By: Martin v. Löwis (loewis) Date: 2002-08-14 17:46 Message: Logged In: YES user_id=21627 Thanks for the patch, applied as bltinmodule.c 2.263. ---------------------------------------------------------------------- Comment By: Gustavo Niemeyer (niemeyer) Date: 2002-05-03 21:07 Message: Logged In: YES user_id=7887 Ok.. now it outputs an extra space if softspace was true, as expected after a "print 'something',". Thanks again. ---------------------------------------------------------------------- Comment By: Gustavo Niemeyer (niemeyer) Date: 2002-05-03 20:45 Message: Logged In: YES user_id=7887 Please, don't apply it yet. I'm testing some aspects of the patch.
---------------------------------------------------------------------- Comment By: Gustavo Niemeyer (niemeyer) Date: 2002-05-03 19:53 Message: Logged In: YES user_id=7887 Sure! Here's a fixed patch including those cleanups. Thank you! ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-05-03 08:04 Message: Logged In: YES user_id=21627 The checking logic for a lost stdout appears to be broken: it should already check for an exception right when verifying whether stdout isatty. Can you incorporate such cleanup in your patch? ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=550192&group_id=5470 From noreply@sourceforge.net Wed Aug 14 17:24:37 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Wed, 14 Aug 2002 09:24:37 -0700 Subject: [Patches] [ python-Patches-595111 ] turtle tracer bugfixes and new functions Message-ID: Patches item #595111, was opened at 2002-08-14 18:24 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=595111&group_id=5470 Category: Tkinter Group: Python 2.2.x Status: Open Resolution: None Priority: 5 Submitted By: Attila Babo (ababo) Assigned to: Nobody/Anonymous (nobody) Summary: turtle tracer bugfixes and new functions Initial Comment: Environment: Python 2.2.1 on Windows Bug fixes: There is no output without tracing, i.e. with tracer(0) the module does not update the canvas. Fixed in _draw_turtle and circle. Now after tracer(0) the head turns off immediately, and all drawing functions work as required without tracer mode. A few duplicates eliminated in _goto to clean up the code. Cosmetic changes with adding and removing empty lines.
New functions: heading(): returns the current heading of the pen setheading(angle): sets the heading of the pen position(): returns the current position; you can later reuse it with goto() window_width(): returns the width of the window in pixels window_height(): returns the height of the window in pixels setx(xpos): moves the pen such that its y-position remains the same but its x-position is the given value. The heading of the pen is not changed. If the pen is currently down, it will draw a line along its path. sety(ypos): moves the pen such that its x-position remains the same but its y-position is the given value. The heading of the pen is not changed. If the pen is currently down, it will draw a line along its path. With these changes the turtle module maintains better functionality with Logo turtle graphics. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=595111&group_id=5470 From noreply@sourceforge.net Wed Aug 14 17:51:48 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Wed, 14 Aug 2002 09:51:48 -0700 Subject: [Patches] [ python-Patches-594197 ] Patch for bug 592567 Message-ID: Patches item #594197, was opened at 2002-08-12 15:03 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=594197&group_id=5470 Category: None Group: None >Status: Closed >Resolution: Accepted Priority: 5 Submitted By: Jiba (jiba) Assigned to: Guido van Rossum (gvanrossum) Summary: Patch for bug 592567 Initial Comment: This patch fixes the bug 592567 (Bug with deepcopy and new style objects). It uses a different, better, approach than the one I proposed in the bug report: it keeps alive ANY object that is deepcopied, automatically. This also has a positive side effect: if you define your own __deepcopy__ method, you don't need to take care of the underlying implementation or keep your temporary states alive.
The script test3.py illustrates that -- the patch fixes also this "bug" (not strictly speaking a bug). test3.py has been tested with Python 2.2.1. ---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-14 12:51 Message: Logged In: YES user_id=6380 Thanks! Accepted and checked into CVS. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=594197&group_id=5470 From noreply@sourceforge.net Wed Aug 14 17:52:49 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Wed, 14 Aug 2002 09:52:49 -0700 Subject: [Patches] [ python-Patches-589982 ] tempfile.py rewrite Message-ID: Patches item #589982, was opened at 2002-08-02 02:38 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=589982&group_id=5470 Category: Library (Lib) Group: Python 2.3 >Status: Closed Resolution: Accepted Priority: 5 Submitted By: Zack Weinberg (zackw) Assigned to: Guido van Rossum (gvanrossum) Summary: tempfile.py rewrite Initial Comment: This rewrite closes a number of security-relevant races in tempfile.py; makes temporary filenames much harder to guess; provides secure interfaces that can be used to close similar races elsewhere; and makes it possible to control the prefix and directory of each temporary created, individually. ---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-14 12:52 Message: Logged In: YES user_id=6380 Closing again. Reduced the number of temp files to 100. Changed 'binary=True' to 'text=False' default on mkstemp(). ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-14 09:16 Message: Logged In: YES user_id=6380 The "Too many open files" problem is solved. 
The HP system was configured to allow only 200 open file descriptors per process. But maybe the test would work just as well if it tried to create 100 instead of 1000 temp files? I expect that this would cause failures on other systems with conservative limits. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-14 09:02 Message: Logged In: YES user_id=6380 I'd like to change the binary=True argument to mkstemp into text=False; that seems easier to explain. News about the HP errors from Kalle Svensson: Traceback (most recent call last): File "../python/dist/src/Lib/test/test_tempfile.py", line 719, in ? test_main() File "../python/dist/src/Lib/test/test_tempfile.py", line 716, in test_main test_support.run_suite(suite) File "/mp/slaskdisk/tmp/sfarmer/python/dist/src/Lib/test/test_support.py", line 188, in run_suite raise TestFailed(err) test.test_support.TestFailed: Traceback (most recent call last): File "../python/dist/src/Lib/test/test_tempfile.py", line 295, in test_basic_many File "../python/dist/src/Lib/test/test_tempfile.py", line 278, in do_create File "../python/dist/src/Lib/test/test_tempfile.py", line 33, in failOnException File "/mp/slaskdisk/tmp/sfarmer/python/dist/src/Lib/unittest.py", line 260, in fail AssertionError: _mkstemp_inner raised exceptions.OSError: [Errno 24] Too many open files: '/tmp/aaU3irrA' ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-11 00:28 Message: Logged In: YES user_id=6380 I'm reopening this just as a precaution. The snake farm reported two messages on HP-UX 11 when the test suite was run: Exception exceptions.AttributeError: "mkstemped instance has no attribute 'fd'" in > ignored Exception exceptions.AttributeError: "mkstemped instance has no attribute 'fd'" in > ignored The mkstemped class is defined in test_maketemp.py. 
That error can happen if a mkstemped instance isn't fully initialized, e.g. if the _mkstemp_inner() call in mkstemped.__init__ fails. But then I would have expected a failure reported, which I don't see... ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-10 00:17 Message: Logged In: YES user_id=6380 I'm reopening this just as a precaution. The snake farm reported two messages on HP-UX 11 when the test suite was run: Exception exceptions.AttributeError: "mkstemped instance has no attribute 'fd'" in > ignored Exception exceptions.AttributeError: "mkstemped instance has no attribute 'fd'" in > ignored The mkstemped class is defined in test_maketemp.py. That error can happen if a mkstemped instance isn't fully initialized, e.g. if the _mkstemp_inner() call in mkstemped.__init__ fails. But then I would have expected a failure reported, which I don't see... ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-09 20:19 Message: Logged In: YES user_id=6380 Oops, looks like typos in the patch. Fixed (I hope). Question for Zack: I noticed that a few times you changed this: temp = tempfile.mktemp() into this: (fd, temp) = tempfile.mkstemp() os.close(fd) If the latter is secure, why can't mktemp() be defined as doing that? ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-09 16:53 Message: Logged In: YES user_id=33168 Guido, there are still 3 uses of mktemp after the checkin. Should these use mkstemp()? Lib/toaiff.py:102 Lib/plat-irix5/torgb.py:95 Lib/plat-irix6/torgb.py:95 ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-09 12:40 Message: Logged In: YES user_id=6380 I've checked this all in now. 
The changes to test_tempfile.py weren't as easily fixable to work without the tempfile.py changes as Zack thought. I hope the community will give it some review. It will probably break some Zope tests. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-09 10:50 Message: Logged In: YES user_id=6380 OK, I'll do something along those lines myself. ---------------------------------------------------------------------- Comment By: Zack Weinberg (zackw) Date: 2002-08-08 15:01 Message: Logged In: YES user_id=580015 I'm afraid my idea of patch size comes from GCC land, where this would be considered not that big at all. Only 1500 lines affected, and more than half of that is documentation and test suite! I tried, and failed, to break up the changes to tempfile.py itself. But there's some larger divisions that could be made. We could check in the new test_tempfile.py now, disabling the tests that refer to nonexistent functions (just comment out the lines that add those tests to the test_classes array). The changes to the rest of the test suite are also largely independent of the tempfile.py rewrite (since they replace tempfile.mktemp() with TESTFN, mostly). And the search-and-replace changes in the library can wait until after tempfile.py itself gets reviewed. Unfortunately, I am about to go on vacation for five days, so I don't have time now to do this split-up. I will try to drum up interest on python-dev in reviewing the patch as is. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-05 12:48 Message: Logged In: YES user_id=6380 I like the idea of fixing security holes. This patch is *humungous*. Even just the doc changes and the changes to tempfile.py itself are massive and require very careful reading to review all the consequences. Zack, can you try to interest someone with more time than me in reviewing this patch? 
What's the point of renaming all imports with a leading underscore? I thought __all__ took care of that. ---------------------------------------------------------------------- Comment By: Zack Weinberg (zackw) Date: 2002-08-03 02:53 Message: Logged In: YES user_id=580015 I've revised the patch; ignore the old one. This version includes a vastly expanded test_tempfile.py which hits every line that I know how to test. The omissions are marked - it's mostly non-Unix issues. Also, I went through the entire CVS repository and replaced all uses of tempfile.mktemp with mkstemp/mkdtemp/NamedTemporaryFile, as appropriate. The sole exception is Lib/os.py, which is addressed by patch #590294. The sole functional change to tempfile.py itself, from the previous, is to throw os.O_NOFOLLOW into the open flags. This closes yet another hole - on some systems, without this flag, open(file, O_CREAT|O_EXCL) will follow a symbolic link that points to a nonexistent file, and create the link target. (This has no effect on a symlink in the directory components of the pathname - if the sysadmin has symlinked /tmp to /hugedisk/scratch, that still works.) ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-02 10:45 Message: Logged In: YES user_id=6380 This needs some serious review! Volunteers??? 
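The race that the rewrite closes, and the open flags mentioned above, can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in the patch: create_exclusive and make_scratch_file are hypothetical names, and O_NOFOLLOW is looked up with getattr because not every platform defines it.

```python
import os
import tempfile


def create_exclusive(path):
    """Create path atomically, failing if anything already exists there."""
    flags = os.O_RDWR | os.O_CREAT | os.O_EXCL
    # O_NOFOLLOW keeps open() from creating the target of a dangling symlink.
    flags |= getattr(os, "O_NOFOLLOW", 0)
    return os.open(path, flags, 0o600)


def make_scratch_file(data):
    """Write data to a fresh temporary file and return its name."""
    # mkstemp() creates and opens the file in one step, so there is no
    # window between choosing a name and creating the file.
    fd, path = tempfile.mkstemp(prefix="demo-")
    with os.fdopen(fd, "wb") as handle:
        handle.write(data)
    return path
```

By contrast, mktemp() only returns a name, leaving the caller to create the file later, which is where another process can get in first.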
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=589982&group_id=5470 From noreply@sourceforge.net Wed Aug 14 19:43:59 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Wed, 14 Aug 2002 11:43:59 -0700 Subject: [Patches] [ python-Patches-576101 ] Alternative implementation of interning Message-ID: Patches item #576101, was opened at 2002-07-01 15:23 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=576101&group_id=5470 Category: None Group: None Status: Open Resolution: None Priority: 5 Submitted By: Oren Tirosh (orenti) Assigned to: Guido van Rossum (gvanrossum) Summary: Alternative implementation of interning Initial Comment: An interned string has a flag set indicating that it is interned instead of a pointer to the interned string. This pointer was almost always either NULL or pointing to the same object. The other cases were rare and ineffective as an optimization. This saves an average of 3 bytes per string. Interned strings are no longer immortal. They are automatically destroyed when there are no more references to them except the global dictionary of interned strings. New function (actually a macro) PyString_CheckInterned to check whether a string is interned. There are no more references to ob_sinterned anywhere outside stringobject.c. ---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-14 14:43 Message: Logged In: YES user_id=6380 Here's an update of the patch for current CVS (stringobject.h failed due to changes for PyAPI_DATA/PyAPI_FUNC). Could you add documentation to Doc/api/concrete.tex for PyString_Intern() and explain how PyString_InternInPlace() differs? (AFAICT it makes the interned string immortal -- I suppose this is a B/W compat feature?)
The variables PYTHON_API_VERSION and PYTHON_API_STRING in modsupport.h need to be updated -- many extensions use the PyString_AS_STRING() macro which relies on the string object format. If an extension compiled with the old code is linked with the new interpreter, it will miss the first three bytes of string objects -- or even store into memory it doesn't own! (I've already added this to the patch I am uploading.) (The test_gc failures were unrelated; Tim has fixed this already in CVS.) I'm tempted to say that except for the API doc issue this is complete. Thanks! ---------------------------------------------------------------------- Comment By: Oren Tirosh (orenti) Date: 2002-08-10 06:01 Message: Logged In: YES user_id=562624 General cleanup. Better handling of immortal interned strings for backward compatibility. It passes regrtest but causes test_gc to leak 20 objects. 13 from test_finalizer_newclass and 7 from test_del_newclass, but only if test_saveall is used. I've tried earlier versions of this patch (which were ok at the time) and they now create this leak too. ---------------------------------------------------------------------- Comment By: Oren Tirosh (orenti) Date: 2002-07-06 12:08 Message: Logged In: YES user_id=562624 Oops, forgot to actually attach the patch. Here it is. ---------------------------------------------------------------------- Comment By: Oren Tirosh (orenti) Date: 2002-07-06 10:35 Message: Logged In: YES user_id=562624 This implementation supports both mortal and immortal interned strings. PyString_InternInPlace creates an immortal interned string for backward compatibility with code that relies on this behavior. PyString_Intern creates a mortal interned string that is deallocated when its refcnt reaches 0. Note that if the string value has been previously interned as immortal this will not make it mortal. Most places in the interpreter were changed to PyString_Intern except those that may be required for compatibility. 
This version of the patch, like the previous one, disables indirect interning. Is there any evidence that it is still an important optimization for some packages? Make sure you rebuild everything after applying this patch because it modifies the size of string object headers. ---------------------------------------------------------------------- Comment By: Raymond Hettinger (rhettinger) Date: 2002-07-02 00:21 Message: Logged In: YES user_id=80475 I like the way you consolidated all of the knowledge about interning into one place. Consider adding an example to the docs of an effective use of interning for optimization. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=576101&group_id=5470 From noreply@sourceforge.net Wed Aug 14 20:32:03 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Wed, 14 Aug 2002 12:32:03 -0700 Subject: [Patches] [ python-Patches-576101 ] Alternative implementation of interning Message-ID: Patches item #576101, was opened at 2002-07-01 15:23 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=576101&group_id=5470 Category: None Group: None Status: Open Resolution: None Priority: 5 Submitted By: Oren Tirosh (orenti) Assigned to: Guido van Rossum (gvanrossum) Summary: Alternative implementation of interning Initial Comment: An interned string has a flag set indicating that it is interned instead of a pointer to the interned string. This pointer was almost always either NULL or pointing to the same object. The other cases were rare and ineffective as an optimization. This saves an average of 3 bytes per string. Interned strings are no longer immortal. They are automatically destroyed when there are no more references to them except the global dictionary of interned strings. New function (actually a macro) PyString_CheckInterned to check whether a string is interned. 
There are no more references to ob_sinterned anywhere outside stringobject.c. ---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-14 15:32 Message: Logged In: YES user_id=6380 Question for all other reviewers. Why not replace all calls to PyString_InternInplace() [which creates immortal strings] with PyString_Intern(), making all (core) uses of interning yield mortal strings? E.g. the call in PyObject_SetAttr() will immortalize all strings that are ever used as a key on a setattr operation; in a long-lived server like Zope this is a concern, since setattr keys are often user-provided data: an endless stream of user-provided data will grow the interned dict indefinitely. And having the builtin intern() always return an immortal string also limits the usability of intern(). Most of the uses I could find of PyString_InternFromString() hold on to a global ref to the object, making it immortal anyway; but why should that function itself force the string to be immortal? (Especially since the exceptions are things like PyObject_GetAttrString() and PyObject_SetItemString(), which have the same concerns as PyObject_SetItem(). ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=576101&group_id=5470 From noreply@sourceforge.net Wed Aug 14 20:42:28 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Wed, 14 Aug 2002 12:42:28 -0700 Subject: [Patches] [ python-Patches-589982 ] tempfile.py rewrite Message-ID: Patches item #589982, was opened at 2002-08-02 08:38 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=589982&group_id=5470 Category: Library (Lib) Group: Python 2.3 Status: Closed Resolution: Accepted Priority: 5 Submitted By: Zack Weinberg (zackw) Assigned to: Guido van Rossum (gvanrossum) Summary: tempfile.py rewrite Initial Comment: This rewrite closes a number of security-relevant races in tempfile.py; makes temporary filenames much harder to guess; provides secure interfaces that can be used to close similar races elsewhere; and makes it possible to control the prefix and directory of each temporary created, individually. ---------------------------------------------------------------------- >Comment By: Jack Jansen (jackjansen) Date: 2002-08-14 21:42 Message: Logged In: YES user_id=45365 Isn't it much more logical to give mkstemp() a mode="w+b" argument? The other routines have that as well, and it is also more in line with open() and such...
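For readers following the mode question above, it may help to see what mkstemp() hands back. The sketch below uses the tempfile module as it exists in a current Python 3, which may differ in detail from the 2002 patch under review; `write_then_read` is a hypothetical helper name. It shows that the call returns a raw OS-level descriptor plus a path, so any mode string is chosen later, when the descriptor is wrapped.

```python
import os
import tempfile

def write_then_read(payload: bytes) -> bytes:
    """Hypothetical helper: round-trip bytes through a mkstemp() file."""
    # mkstemp() returns an integer file descriptor and the file's path,
    # not a file object, so there is no stdio-style mode involved yet.
    fd, path = tempfile.mkstemp(prefix="demo-")
    try:
        # The caller picks the mode when wrapping the descriptor.
        with os.fdopen(fd, "w+b") as f:
            f.write(payload)
            f.seek(0)
            return f.read()
    finally:
        # mkstemp() leaves cleanup to the caller.
        os.remove(path)

result = write_then_read(b"spam and eggs")
```

Keeping the mode out of mkstemp() itself leaves the caller free to use the descriptor directly with os.write() and os.read() instead of wrapping it at all.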
---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-14 18:52 Message: Logged In: YES user_id=6380 Closing again. Reduced the number of temp files to 100. Changed 'binary=True' to 'text=False' default on mkstemp(). ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-14 15:16 Message: Logged In: YES user_id=6380 The "Too many open files" problem is solved. The HP system was configured to allow only 200 open file descriptors per process. But maybe the test would work just as well if it tried to create 100 instead of 1000 temp files? I expect that this would cause failures on other systems with conservative limits. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-14 15:02 Message: Logged In: YES user_id=6380 I'd like to change the binary=True argument to mkstemp into text=False; that seems easier to explain. News about the HP errors from Kalle Svensson: Traceback (most recent call last): File "../python/dist/src/Lib/test/test_tempfile.py", line 719, in ? 
test_main() File "../python/dist/src/Lib/test/test_tempfile.py", line 716, in test_main test_support.run_suite(suite) File "/mp/slaskdisk/tmp/sfarmer/python/dist/src/Lib/test/test_support.py", line 188, in run_suite raise TestFailed(err) test.test_support.TestFailed: Traceback (most recent call last): File "../python/dist/src/Lib/test/test_tempfile.py", line 295, in test_basic_many File "../python/dist/src/Lib/test/test_tempfile.py", line 278, in do_create File "../python/dist/src/Lib/test/test_tempfile.py", line 33, in failOnException File "/mp/slaskdisk/tmp/sfarmer/python/dist/src/Lib/unittest.py", line 260, in fail AssertionError: _mkstemp_inner raised exceptions.OSError: [Errno 24] Too many open files: '/tmp/aaU3irrA' ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-11 06:28 Message: Logged In: YES user_id=6380 I'm reopening this just as a precaution. The snake farm reported two messages on HP-UX 11 when the test suite was run: Exception exceptions.AttributeError: "mkstemped instance has no attribute 'fd'" in > ignored Exception exceptions.AttributeError: "mkstemped instance has no attribute 'fd'" in > ignored The mkstemped class is defined in test_maketemp.py. That error can happen if a mkstemped instance isn't fully initialized, e.g. if the _mkstemp_inner() call in mkstemped.__init__ fails. But then I would have expected a failure reported, which I don't see... ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-10 06:17 Message: Logged In: YES user_id=6380 I'm reopening this just as a precaution. The snake farm reported two messages on HP-UX 11 when the test suite was run: Exception exceptions.AttributeError: "mkstemped instance has no attribute 'fd'" in > ignored Exception exceptions.AttributeError: "mkstemped instance has no attribute 'fd'" in > ignored The mkstemped class is defined in test_maketemp.py. 
That error can happen if a mkstemped instance isn't fully initialized, e.g. if the _mkstemp_inner() call in mkstemped.__init__ fails. But then I would have expected a failure reported, which I don't see... ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-10 02:19 Message: Logged In: YES user_id=6380 Oops, looks like typos in the patch. Fixed (I hope). Question for Zack: I noticed that a few times you changed this: temp = tempfile.mktemp() into this: (fd, temp) = tempfile.mkstemp() os.close(fd) If the latter is secure, why can't mktemp() be defined as doing that? ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-09 22:53 Message: Logged In: YES user_id=33168 Guido, there are still 3 uses of mktemp after the checkin. Should these use mkstemp()? Lib/toaiff.py:102 Lib/plat-irix5/torgb.py:95 Lib/plat-irix6/torgb.py:95 ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-09 18:40 Message: Logged In: YES user_id=6380 I've checked this all in now. The changes to test_tempfile.py weren't as easily fixable to work without the tempfile.py changes as Zack thought. I hope the community will give it some review. It will probably break some Zope tests. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-09 16:50 Message: Logged In: YES user_id=6380 OK, I'll do something along those lines myself. ---------------------------------------------------------------------- Comment By: Zack Weinberg (zackw) Date: 2002-08-08 21:01 Message: Logged In: YES user_id=580015 I'm afraid my idea of patch size comes from GCC land, where this would be considered not that big at all. Only 1500 lines affected, and more than half of that is documentation and test suite! 
I tried, and failed, to break up the changes to tempfile.py itself. But there's some larger divisions that could be made. We could check in the new test_tempfile.py now, disabling the tests that refer to nonexistent functions (just comment out the lines that add those tests to the test_classes array). The changes to the rest of the test suite are also largely independent of the tempfile.py rewrite (since they replace tempfile.mktemp() with TESTFN, mostly). And the search-and-replace changes in the library can wait until after tempfile.py itself gets reviewed. Unfortunately, I am about to go on vacation for five days, so I don't have time now to do this split-up. I will try to drum up interest on python-dev in reviewing the patch as is. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-05 18:48 Message: Logged In: YES user_id=6380 I like the idea of fixing security holes. This patch is *humungous*. Even just the doc changes and the changes to tempfile.py itself are massive and require very careful reading to review all the consequences. Zack, can you try to interest someone with more time than me in reviewing this patch? What's the point of renaming all imports with a leading underscore? I thought __all__ took care of that. ---------------------------------------------------------------------- Comment By: Zack Weinberg (zackw) Date: 2002-08-03 08:53 Message: Logged In: YES user_id=580015 I've revised the patch; ignore the old one. This version includes a vastly expanded test_tempfile.py which hits every line that I know how to test. The omissions are marked - it's mostly non-Unix issues. Also, I went through the entire CVS repository and replaced all uses of tempfile.mktemp with mkstemp/mkdtemp/NamedTemporaryFile, as appropriate. The sole exception is Lib/os.py, which is addressed by patch #590294. 
The sole functional change to tempfile.py itself, from the previous, is to throw os.O_NOFOLLOW into the open flags. This closes yet another hole - on some systems, without this flag, open(file, O_CREAT|O_EXCL) will follow a symbolic link that points to a nonexistent file, and create the link target. (This has no effect on a symlink in the directory components of the pathname - if the sysadmin has symlinked /tmp to /hugedisk/scratch, that still works.) ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-02 16:45 Message: Logged In: YES user_id=6380 This needs some serious review! Volunteers??? ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=589982&group_id=5470 From noreply@sourceforge.net Wed Aug 14 20:44:49 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Wed, 14 Aug 2002 12:44:49 -0700 Subject: [Patches] [ python-Patches-589982 ] tempfile.py rewrite Message-ID: Patches item #589982, was opened at 2002-08-02 08:38 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=589982&group_id=5470 Category: Library (Lib) Group: Python 2.3 Status: Closed Resolution: Accepted Priority: 5 Submitted By: Zack Weinberg (zackw) Assigned to: Guido van Rossum (gvanrossum) Summary: tempfile.py rewrite Initial Comment: This rewrite closes a number of security-relevant races in tempfile.py; makes temporary filenames much harder to guess; provides secure interfaces that can be used to close similar races elsewhere; and makes it possible to control the prefix and directory of each temporary created, individually. ---------------------------------------------------------------------- >Comment By: Jack Jansen (jackjansen) Date: 2002-08-14 21:44 Message: Logged In: YES user_id=45365 Nevermind. 
Just saw the discussion on python-dev (this is a file descriptor returned, not a file pointer, so stdio is nowhere in sight). ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=589982&group_id=5470 From noreply@sourceforge.net Wed Aug 14 21:16:10 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Wed, 14 Aug 2002 13:16:10 -0700 Subject: [Patches] [ python-Patches-587993 ] SET_LINENO killer Message-ID: Patches item #587993, was opened at 2002-07-29 07:27 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=587993&group_id=5470 Category: Core (C code) Group: Python 2.3 Status: Open >Resolution: Accepted Priority: 5 Submitted By: Michael Hudson (mwh) >Assigned to: Michael Hudson (mwh) >Summary: SET_LINENO killer Initial Comment: This patch is a proof-of-concept of another way to remove the SET_LINENO patch (as opposed to Vladimir's ancient one). Instead of rewriting bytecode (ick!) we poke into the c_lnotab to see if we've moved onto a different line. The c_lnotab is not the most transparent of data structures, it has to be said. I'm not sure this patch is 100% correct -- but I think the idea can definitely fly. There will be some more overhead to tracing than before, but I hope not too much. I haven't tested these aspects. Comments welcome!
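To make the line-number-table idea above a little more concrete, here is a small Python sketch. It is not the patch's C code, which is not shown in this thread. It assumes the table layout CPython used in that era: a flat sequence of unsigned byte pairs, each an increment to the bytecode offset followed by an increment to the line number. The names `addr_to_line`, `crossed_line` and `TABLE` are made up for the illustration, and the table is hand-built rather than taken from a real code object.

```python
def addr_to_line(lnotab: bytes, first_line: int, addr: int) -> int:
    """Map a bytecode offset to a source line using (offset, line) increments."""
    line = first_line
    offset = 0
    for i in range(0, len(lnotab), 2):
        offset += lnotab[i]          # running bytecode offset
        if offset > addr:            # this entry starts past the target
            break
        line += lnotab[i + 1]        # running line number
    return line

def crossed_line(lnotab: bytes, first_line: int, prev_addr: int, addr: int) -> bool:
    """True if moving from prev_addr to addr lands on a different line."""
    return (addr_to_line(lnotab, first_line, prev_addr)
            != addr_to_line(lnotab, first_line, addr))

# Hand-built table: offsets 0, 6, 50, 350, 361 start lines 1, 2, 7, 307, 308.
# Increments above 255 are split across several pairs.
TABLE = bytes([6, 1, 44, 5, 255, 0, 45, 255, 0, 45, 11, 1])
```

With this table, offsets 6 through 49 all belong to line 2, so a tracer built this way would only report a line event when execution reaches offset 50. A real implementation would likely cache the current line's offset range instead of rescanning the table on every instruction; that refinement is left out here.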
---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-14 16:16 Message: Logged In: YES user_id=6380 Looks good to me. Michael, why don't you hang onto it for another day and then check it in unless someone speaks out against it? ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-05 10:42 Message: Logged In: YES user_id=6656 That's good to know :) No great hurry with the review. I wanted to finish the patch while everything was still fresh in my mind, and I think I've done this now. Of course, I've said that before... ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-05 10:39 Message: Logged In: YES user_id=6380 That's pretty much what I suggested, yes. I'll have to take more time to review all this code... :-( ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-05 10:29 Message: Logged In: YES user_id=6656 It was discussed on python-dev: http://mail.python.org/pipermail/python-dev/2002-August/027261.html and followups. Adding a new opcode was Guido's idea, but I'm not totally sure this is what he meant. Also see the comments around line 2933 of ceval.c. ---------------------------------------------------------------------- Comment By: Neil Schemenauer (nascheme) Date: 2002-08-05 10:23 Message: Logged In: YES user_id=35752 Why was the RETURN_NONE opcode added? ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-05 08:20 Message: Logged In: YES user_id=6656 Here's another update. I've also deleted the old patches. Changes this time: + doesn't include pystone LOOPS boost + finish RETURN_NONE off: - add to dis.py - add to libdis.tex + removed now-unnecessary co_lnotab grovelling in inspect and traceback.
left the functions for compatibility. + removed the WE_WANT_THE_HACK hack + includes Neal's test_hotshot patch + more trace.py tweaks. Lib/compiler is still broken. ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-05 07:56 Message: Logged In: YES user_id=33168 Re-assigning to Guido. I think what happened was I was reviewing the patch before it was assigned to Guido. Sorry about that. You're right about the set_lineno, I thought the opcode was also generated in there. I agree about the code reuse--it's probably too hard. If I had a better idea, I would have shared it. :-( ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-05 05:09 Message: Logged In: YES user_id=6656 Thanks for the comments, Neal! I'm not sure it's possible to separate the changes in the way you describe. For instance, com_addoparg -> com_set_lineno stops SET_LINENO being generated, so breaks tracing without the VM changes, but the VM changes make SET_LINENO into an unknown opcode... I didn't intend to upload the pystone change. About the comments: - the last little RETURN_NONE changes are easy. Thanks for the pointer. - agree there are too many bits of code grovelling co_lnotab. however, it's not clear that they can easily be refactored. maybe a generator would be useful... hmm. let me think about this one. - did I take out the initialisation of frame.f_lineno entirely? oops. probably set it to co_firstlineno. - fixed trace.py docstring. reusing code not all that easy, because different uses have subtly different requirements. hmm, inspect.getlineno is now pointless... was there any particular reason you unassigned the patch? Just curious as the main reason it was assigned to Guido was to make sure he saw it.
---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-04 15:50 Message: Logged In: YES user_id=33168 Attached is a patch to test_hotshot which is necessary because the duplicate line #s at the beginning of the function no longer exist. With this patch test_hotshot passes. ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-03 16:06 Message: Logged In: YES user_id=33168 Overall, the patch looks very good. It would be nice to check in some of the changes separate from this patch: * com_addoparg(SET_LINENO) -> com_set_lineno() * the updated comments and whitespace * pystone * dis.py, disassemble(), perhaps * RETURN_NONE, perhaps Here are other minor comments: * Need to add RETURN_NONE opcode into dis.py * Need to add doc for RETURN_NONE in libdis.tex * It would be nice if the code from inspect.getlineno could be reused in disassemble() (not sure this can be done, though--I saw at least 4 variations of this code, 2 in python and 2 in C) * Should frame.f_lineno be initialized to 0, -1 or something invalid? * scripts/trace.py reimplements getting the lineno, why not reuse code? * docstring in scripts/trace._find_LINENO_from_code needs to remove ref to SET_LINENO ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-03 11:52 Message: Logged In: YES user_id=6656 Update. New this time: - adds RETURN_NONE opcode for the epilogue. - don't call line trace events on it. - bumps MAGIC this is the output of "cvs diff" at the top level, so it's all in there.
---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-02 09:34 Message: Logged In: YES user_id=6656 Here's an update (just for ceval.c): - moves tracing code out of line - more expository comments - don't cause line trace events in the function epilogue ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-01 11:43 Message: Logged In: YES user_id=6656 Hang on, let's get all the doc changes: Doc/lib/libdis.tex Doc/tut/tut.tex Doc/whatsnew/whatsnew23.tex Misc/NEWS ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-01 11:41 Message: Logged In: YES user_id=6656 And finally, docs. This touches: Doc/lib/libdis.tex Doc/tut/tut.tex ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-01 11:40 Message: Logged In: YES user_id=6656 Here's the "boring" patch, more mechanical stuff. It touches: Include/opcode.h Lib/traceback.py Modules/_hotshot.c Python/frozen.c Python/traceback.c This should still be reviewed, of course, but I really shouldn't have messed any of this up. ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-01 11:38 Message: Logged In: YES user_id=6656 Yeah, the obvious response to that was "delete your .pycs"... Right, I'm going to upload my latest, and hopefully final patch in three bits. First, I'm attaching the "interesting" patch. This touches: Objects/frameobject.c -- adds f_lineno descr Python/ceval.c -- the obvious Python/compile.c -- don't emit SET_LINENO Lib/dis.py -- use c_lnotab for line numbers Tools/scripts/trace.py -- ditto This is the most subtle patch, and the one that I'd most like review on.
---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-07-30 18:09 Message: Logged In: YES user_id=6380 Never mind the errors, I hadn't done a cvs update in weeks on the machine where I tested it. :-( ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-07-30 17:54 Message: Logged In: YES user_id=6380 Cool idea, but I get "unknown opcode" errors... Keep me posted though! (I will now see any changes to this patch item.) ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-07-30 05:59 Message: Logged In: YES user_id=6656 Guido should see this, assuming he still isn't subscribed to patches@python.org. ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-07-30 05:58 Message: Logged In: YES user_id=6656 I worked out why some of the code in ceval.c wasn't making sense to me -- it didn't make sense, period. I've also fixed a number of silly and not so silly bugs in my patch. I'm now 99% certain this idea can fly. The patch isn't *finished* but the hard bit is done, IMHO. There are some other points to make, but I think I'll raise them on python-dev. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-07-29 16:18 Message: Logged In: YES user_id=31435 Dropping out of "vacation mode" long enough to say "mondo cool!" and encourage this. Guido may not agree, but I also encourage you to redefine c_lnotab if it can make life easier and quicker here. That subtle compression scheme has been the source of several nasty bugs, both in the core C code and in Jeremy's compiler pkg (cut 'n paste bugs abound here, because few people understand what's really needed, so flawed code gets copied with little thought).
---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-07-29 11:34 Message: Logged In: YES user_id=6656 Uhh, the instr_[lu]b variables need to keep their values around the loop; otherwise we might just as well call PyCode_Addr2Line each time around. I have another version of the patch that does that, but I assumed the overhead of doing so was deemed too high, or someone else would have done this by now. It's certainly easier. Glad to hear it doesn't affect python -O too much. I was doing this away from the internet and forgot to keep a clean copy of the source around for doing comparisons with... ---------------------------------------------------------------------- Comment By: Neil Schemenauer (nascheme) Date: 2002-07-29 11:18 Message: Logged In: YES user_id=35752 Moving the "int io, instr_ub = -1, instr_lb = 0;" declaration and the "io = INSTR_OFFSET();" statement below the "if (tstate->c_tracefunc ...)" test gives a small speedup on my machine and is a little neater, IMHO. I was worried that this would slow down "python -O". That doesn't seem to be the case (at least I can't measure it). Well done. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=587993&group_id=5470 From noreply@sourceforge.net Wed Aug 14 21:50:53 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Wed, 14 Aug 2002 13:50:53 -0700 Subject: [Patches] [ python-Patches-492105 ] Import from Zip archive Message-ID: Patches item #492105, was opened at 2001-12-12 12:21 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=492105&group_id=5470 Category: Core (C code) Group: Python 2.3 Status: Open >Resolution: Out of Date Priority: 5 Submitted By: James C.
Ahlstrom (ahlstromjc) Assigned to: Guido van Rossum (gvanrossum) Summary: Import from Zip archive Initial Comment: This is the "final" patch to support imports from zip archives, and directory caching using os.listdir(). It replaces patch 483466 and 476047. It is a separate patch since I can't delete file attachments. It adds support for importing from "" and from relative paths. ---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-14 16:50 Message: Logged In: YES user_id=6380 Sigh. We waited too long for this one, and now the patch is hopelessly out of date. I managed to fix most of the failing hunks, but the remaining hunk that fails (#11 in import.c) is a whopper: the saved import.c.rej is 270 lines long. I'm going to sleep on that, but I could use some help. In the meantime, I'm uploading an edited version of dashc.diff to help the next person. ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-07-28 06:54 Message: Logged In: YES user_id=21627 Is this patch ready to be applied? ---------------------------------------------------------------------- Comment By: Jeremy Hylton (jhylton) Date: 2002-06-12 11:05 Message: Logged In: YES user_id=31392 Deleting the old diffs that Jim couldn't delete. ---------------------------------------------------------------------- Comment By: James C. Ahlstrom (ahlstromjc) Date: 2002-03-15 12:27 Message: Logged In: YES user_id=64929 I added a diff -c version of the patch. ---------------------------------------------------------------------- Comment By: James C. Ahlstrom (ahlstromjc) Date: 2002-03-15 12:03 Message: Logged In: YES user_id=64929 I still can't delete files, but I added a new file which contains all diffs as a single file, and is made from the current CVS tree (Mar 15, 2002).
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=492105&group_id=5470 From noreply@sourceforge.net Wed Aug 14 23:24:29 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Wed, 14 Aug 2002 15:24:29 -0700 Subject: [Patches] [ python-Patches-576101 ] Alternative implementation of interning Message-ID: Patches item #576101, was opened at 2002-07-01 21:23 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=576101&group_id=5470 Category: None Group: None Status: Open Resolution: None Priority: 5 Submitted By: Oren Tirosh (orenti) Assigned to: Guido van Rossum (gvanrossum) Summary: Alternative implementation of interning Initial Comment: An interned string has a flag set indicating that it is interned instead of a pointer to the interned string. This pointer was almost always either NULL or pointing to the same object. The other cases were rare and ineffective as an optimization. This saves an average of 3 bytes per string. Interned strings are no longer immortal. They are automatically destroyed when there are no more references to them except the global dictionary of interned strings. New function (actually a macro) PyString_CheckInterned to check whether a string is interned. There are no more references to ob_sinterned anywhere outside stringobject.c. ---------------------------------------------------------------------- >Comment By: Martin v. Löwis (loewis) Date: 2002-08-15 00:24 Message: Logged In: YES user_id=21627 Some mutually unrelated comments: - the GC_UnTrack call for interned is not needed: GC won't be able to explain the reference that stringobject.c holds. - why does this try to "fix" the problem of dangling interned strings?
AFAICT: if there is a reference to an interned string at the time _Py_ReleaseInternedStrings is called, that reference is silently dropped, and a later DECREF will result in memory corruption. IOW: it should merely set the state of all strings to normal, and clear the dict. - Replacing PyString_InternInPlace with PyString_Intern seems dangerous. AFAICT, the fragment PyString_InternInPlace(&name); Py_DECREF(name); return PyString_AS_STRING(name); from getclassname would break: Intern() would return the only reference to the interned string (assuming this is the first usage), and getclassname drops this reference, returning a pointer to deallocated memory. I'm not sure though why getclassname interns the result in the first place. Selectively replacing them might be a good idea, though. For intern(), I think an optional argument strongref needs to be provided (the interned dict essentially weak-references the strings). Perhaps the default even needs to be weakref. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-14 21:32 Message: Logged In: YES user_id=6380 Question for all other reviewers. Why not replace all calls to PyString_InternInplace() [which creates immortal strings] with PyString_Intern(), making all (core) uses of interning yield mortal strings? E.g. the call in PyObject_SetAttr() will immortalize all strings that are ever used as a key on a setattr operation; in a long-lived server like Zope this is a concern, since setattr keys are often user-provided data: an endless stream of user-provided data will grow the interned dict indefinitely. And having the builtin intern() always return an immortal string also limits the usability of intern(). Most of the uses I could find of PyString_InternFromString() hold on to a global ref to the object, making it immortal anyway; but why should that function itself force the string to be immortal? 
(Especially since the exceptions are things like PyObject_GetAttrString() and PyObject_SetItemString(), which have the same concerns as PyObject_SetItem().) ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-14 20:43 Message: Logged In: YES user_id=6380 Here's an update of the patch for current CVS (stringobject.h failed due to changes for PyAPI_DATA/PyAPI_FUNC). Could you add documentation to Doc/api/concrete.tex for PyString_Intern() and explain how PyString_InternInPlace() differs? (AFAICT it makes the interned string immortal -- I suppose this is a B/W compat feature?) The variables PYTHON_API_VERSION and PYTHON_API_STRING in modsupport.h need to be updated -- many extensions use the PyString_AS_STRING() macro which relies on the string object format. If an extension compiled with the old code is linked with the new interpreter, it will miss the first three bytes of string objects -- or even store into memory it doesn't own! (I've already added this to the patch I am uploading.) (The test_gc failures were unrelated; Tim has fixed this already in CVS.) I'm tempted to say that except for the API doc issue this is complete. Thanks! ---------------------------------------------------------------------- Comment By: Oren Tirosh (orenti) Date: 2002-08-10 12:01 Message: Logged In: YES user_id=562624 General cleanup. Better handling of immortal interned strings for backward compatibility. It passes regrtest but causes test_gc to leak 20 objects. 13 from test_finalizer_newclass and 7 from test_del_newclass, but only if test_saveall is used. I've tried earlier versions of this patch (which were ok at the time) and they now create this leak too. ---------------------------------------------------------------------- Comment By: Oren Tirosh (orenti) Date: 2002-07-06 18:08 Message: Logged In: YES user_id=562624 Oops, forgot to actually attach the patch. Here it is.
---------------------------------------------------------------------- Comment By: Oren Tirosh (orenti) Date: 2002-07-06 16:35 Message: Logged In: YES user_id=562624 This implementation supports both mortal and immortal interned strings. PyString_InternInPlace creates an immortal interned string for backward compatibility with code that relies on this behavior. PyString_Intern creates a mortal interned string that is deallocated when its refcnt reaches 0. Note that if the string value has been previously interned as immortal this will not make it mortal. Most places in the interpreter were changed to PyString_Intern except those that may be required for compatibility. This version of the patch, like the previous one, disables indirect interning. Is there any evidence that it is still an important optimization for some packages? Make sure you rebuild everything after applying this patch because it modifies the size of string object headers. ---------------------------------------------------------------------- Comment By: Raymond Hettinger (rhettinger) Date: 2002-07-02 06:21 Message: Logged In: YES user_id=80475 I like the way you consolidated all of the knowledge about interning into one place. Consider adding an example to the docs of an effective use of interning for optimization. 
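A minimal sketch of the kind of example Raymond asks for, written against present-day Python, where the builtin intern() of the 2.3 era is exposed as sys.intern(). The helper make_key and the sample strings are hypothetical; the point is only that two equal strings built at runtime collapse to one object once interned, so later comparisons can use identity.

```python
import sys


def make_key(prefix, name):
    # Built at runtime, so each call produces a new string object.
    return "".join([prefix, "_", name])


a = sys.intern(make_key("user", "id"))
b = sys.intern(make_key("user", "id"))

# Equal interned strings are the same object, so an identity test
# can stand in for a character-by-character comparison.
same_object = a is b
print(same_object)
```

Dictionary lookups on attribute names benefit in the same way, since a key that is the identical object can be matched without comparing contents.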
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=576101&group_id=5470 From noreply@sourceforge.net Thu Aug 15 00:08:16 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Wed, 14 Aug 2002 16:08:16 -0700 Subject: [Patches] [ python-Patches-587993 ] SET_LINENO killer Message-ID: Patches item #587993, was opened at 2002-07-29 13:27 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=587993&group_id=5470 Category: Core (C code) Group: Python 2.3 Status: Open Resolution: Accepted Priority: 5 Submitted By: Michael Hudson (mwh) Assigned to: Michael Hudson (mwh) Summary: SET_LINENO killer Initial Comment: This patch is a proof-of-concept of another way to remove the SET_LINENO patch (as opposed to Vladimir's ancient one). Instead of rewriting bytecode (ick!) we poke into the c_lnotab to see if we've moved onto a different line. The c_lnotab is not the most transparent of data structures, it has to be said. I'm not sure this patch is 100% correct -- but I think the idea can definitely fly. There will be some more overhead to tracing than before, but I hope not too much. I haven't tested these aspects. Comments welcome! ---------------------------------------------------------------------- >Comment By: Martin v. Löwis (loewis) Date: 2002-08-15 01:08 Message: Logged In: YES user_id=21627 Random comments: - whatsnew23.tex: replace VERSION - ceval.c; lltrace: Why does it drop the comparison with NULL: lltrace is int, not PyObject* - it appears that RETURN_NONE does more than just returning None; it also suppresses trace calls. This should be carefully documented, or else somebody might suggest to generate RETURN_NONE for a plain return statement.
---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-07-29 17:34 Message: Logged In: YES user_id=6656 Uhh, the instr_[lu]b variables need to keep their values around the loop; otherwise we might just as well call PyCode_Addr2Line each time around. I have another version of the patch that does that, but I assumed the overhead of doing so was deemed too high, or someone else would have done this by now. It's certainly easier. Glad to hear it doesn't affect python -O too much. I was doing this away from the internet and forgot to keep a clean copy of the source around for doing comparisons with... ---------------------------------------------------------------------- Comment By: Neil Schemenauer (nascheme) Date: 2002-07-29 17:18 Message: Logged In: YES user_id=35752 Moving the "int io, instr_ub = -1, instr_lb = 0;" declaration and the "io = INSTR_OFFSET();" statement below the "if (tstate->c_tracefunc ...)" test gives a small speedup on my machine and is a little neater, IMHO. I was worried that this would slow down "python -O". That doesn't seem to be the case (at least I can't measure it). Well done. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=587993&group_id=5470 From noreply@sourceforge.net Thu Aug 15 00:30:53 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Wed, 14 Aug 2002 16:30:53 -0700 Subject: [Patches] [ python-Patches-492105 ] Import from Zip archive Message-ID: Patches item #492105, was opened at 2001-12-12 12:21 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=492105&group_id=5470 Category: Core (C code) Group: Python 2.3 Status: Open Resolution: Out of Date Priority: 5 Submitted By: James C.
Ahlstrom (ahlstromjc) Assigned to: Guido van Rossum (gvanrossum) Summary: Import from Zip archive Initial Comment: This is the "final" patch to support imports from zip archives, and directory caching using os.listdir(). It replaces patch 483466 and 476047. It is a separate patch since I can't delete file attachments. It adds support for importing from "" and from relative paths. ---------------------------------------------------------------------- >Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-14 19:30 Message: Logged In: YES user_id=33168 I've reviewed most of the updated version. Quick summary: * strncpy() doesn't null terminate strings, this needs to be added * there seem to be many leaked references Here are more details: * strncpy(dest, str, size) doesn't guarantee the result to be null terminated, need dest[size] = '\0'; after all strncpy()s (I see a bunch in getpath.c) * getpath.c:get_sys_path_0(), I think some compilers warn when you do char[i] = 0; (use '\0' instead) (there may be other places of this) * import.c:PyImport_InitZip(), have_zlib is an alias to zlib and isn't really helpful/necessary * import.c:get_path_type() can leak a reference to pyobj if it's an int * import.c:add_directory_names() pylist reference is leaked * import.c:add_zip_names(), memcmp(path+i, ".zip", 4) is clearer to me than path[i] == '.' .... * import.c:add_zip_names, there's a lot of magic #s in this function * " " : leak refs when doing PyTuple_SetItem I think there were other leaked references. I'll try to spend some more time later. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-14 16:50 Message: Logged In: YES user_id=6380 Sigh. We waited too long for this one, and now the patch is hopelessly out of date. I managed to fix most of the failing hunks, but the remaining hunk that fails (#11 in import.c) is a whopper: the saved import.c.rej is 270 lines long.
I'm going to sleep on that, but I could use some help. In the meantime, I'm uploading an edited version of dashc.diff to help the next person.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2002-07-28 06:54

Message:
Logged In: YES
user_id=21627

Is this patch ready to be applied?

----------------------------------------------------------------------

Comment By: Jeremy Hylton (jhylton)
Date: 2002-06-12 11:05

Message:
Logged In: YES
user_id=31392

Deleting the old diffs that Jim couldn't delete.

----------------------------------------------------------------------

Comment By: James C. Ahlstrom (ahlstromjc)
Date: 2002-03-15 12:27

Message:
Logged In: YES
user_id=64929

I added a diff -c version of the patch.

----------------------------------------------------------------------

Comment By: James C. Ahlstrom (ahlstromjc)
Date: 2002-03-15 12:03

Message:
Logged In: YES
user_id=64929

I still can't delete files, but I added a new file which contains all diffs as a single file, and is made from the current CVS tree (Mar 15, 2002).
----------------------------------------------------------------------

You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=492105&group_id=5470

From noreply@sourceforge.net Thu Aug 15 02:25:49 2002
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Wed, 14 Aug 2002 18:25:49 -0700
Subject: [Patches] [ python-Patches-586561 ] Better token-related error messages
Message-ID:

Patches item #586561, was opened at 2002-07-25 11:21
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=586561&group_id=5470

Category: Parser/Compiler
Group: Python 2.3
>Status: Closed
Resolution: Accepted
Priority: 5
Submitted By: Skip Montanaro (montanaro)
Assigned to: Skip Montanaro (montanaro)
Summary: Better token-related error messages

Initial Comment:
There were some complaints recently on c.l.py about the rather non-informative error messages emitted as a result of the tokenizer detecting a problem. In many situations it simply returns E_TOKEN which generates a fairly benign, but often unhelpful "invalid token" message.

This patch adds several new E_* macros to Include/errcode.h, returns them from the appropriate places in Parser/tokenizer.c and generates more specific messages in Python/pythonrun.c. I think the error messages are always better, though in some situations they may still not be strictly correct.

Assigning to Jeremy since he's the compiler wiz.

Skip

----------------------------------------------------------------------

>Comment By: Skip Montanaro (montanaro)
Date: 2002-08-14 20:25

Message:
Logged In: YES
user_id=44345

implemented in

Python/pythonrun.c 2.166
Parser/tokenizer.c 2.64
Include/errcode.h 2.16

----------------------------------------------------------------------

Comment By: Jeremy Hylton (jhylton)
Date: 2002-08-14 10:03

Message:
Logged In: YES
user_id=31392

Looks good to me! Sorry I took so long to review it again.
---------------------------------------------------------------------- Comment By: Skip Montanaro (montanaro) Date: 2002-08-04 16:39 Message: Logged In: YES user_id=44345 here's a new patch - deletes all but the EOFC & EOFS macros and adds a test_eof.py test module ---------------------------------------------------------------------- Comment By: Jeremy Hylton (jhylton) Date: 2002-08-02 13:17 Message: Logged In: YES user_id=31392 The current error message for the complex case seems clear enough since it identifies exactly the offending character. >>> 3i+2 File "", line 1 3i+2 ^ SyntaxError: invalid syntax The error message for runaway triple quoted strings is much more puzzling, since the line of context doesn't have anything useful on it. I guess we should think about the others, too: E_EOLS is marginal, since you do get the line with the error in the exception. E_EOFC is a win for the same reason that E_EOFS is, although I expect it's a less common case. E_EXP and E_SLASH are borderline -- again because the current syntax error identifies exactly the line and character that are causing the problem. We should get a third opinion, but I'd probably settle for just E_EOFC and E_EOFS. ---------------------------------------------------------------------- Comment By: Skip Montanaro (montanaro) Date: 2002-08-02 13:05 Message: Logged In: YES user_id=44345 re: i vs. j Perhaps it's not needed. The patch was originally designed to address the case of runaway triple-quoted strings. Someone on c.l.py ranted about that. While I was in there, I recalled someone else (perhaps more than one person) had berated Python in the past because imaginary numbers use 'j' instead of 'i' and decided to stick it in. It's no big deal to take it out. (When you think about it, they are all corner cases, since most of the time the code is syntactically correct. 
;-) S ---------------------------------------------------------------------- Comment By: Jeremy Hylton (jhylton) Date: 2002-08-02 12:54 Message: Logged In: YES user_id=31392 Is the warning about i vs. j for complex numbers really necessary? It seems like it adds extra, well, complexity for a tiny corner case. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=586561&group_id=5470 From noreply@sourceforge.net Thu Aug 15 02:26:50 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Wed, 14 Aug 2002 18:26:50 -0700 Subject: [Patches] [ python-Patches-576101 ] Alternative implementation of interning Message-ID: Patches item #576101, was opened at 2002-07-01 15:23 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=576101&group_id=5470 Category: None Group: None Status: Open Resolution: None Priority: 5 Submitted By: Oren Tirosh (orenti) Assigned to: Guido van Rossum (gvanrossum) Summary: Alternative implementation of interning Initial Comment: An interned string has a flag set indicating that it is interned instead of a pointer to the interned string. This pointer was almost always either NULL or pointing to the same object. The other cases were rare and ineffective as an optimization. This saves an average of 3 bytes per string. Interned strings are no longer immortal. They are automatically destroyed when there are no more references to them except the global dictionary of interned strings. New function (actually a macro) PyString_CheckInterned to check whether a string is interned. There are no more references to ob_sinterned anywhere outside stringobject.c. ---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-14 21:26 Message: Logged In: YES user_id=6380 > - why does this try to "fix" the problem of > dangling interned strings? 
AFAICT: if there is a > reference to an interned string at the time > _Py_ReleaseInternedStrings is called, that > reference is silently dropped, and a later > DECREF will result in memory corruption. IOW: it > should merely set the state of all strings to > normal, and clear the dict. Note that the *only* time when _Py_ReleaseInternedStrings() can ever be called is at program exit, just before you run a memory leak detector. There's no way Python can be resurrected after _Py_ReleaseInternedStrings() has run. > - Replacing PyString_InternInPlace with > PyString_Intern seems dangerous. AFAICT, the > fragment > > PyString_InternInPlace(&name); > Py_DECREF(name); > return PyString_AS_STRING(name); > > from getclassname would break: Intern() would > return the only reference to the interned string > (assuming this is the first usage), and > getclassname drops this reference, returning a > pointer to deallocated memory. I'm not sure > though why getclassname interns the result in > the first place. getclassname() is doing something very unsavory here! I expect that its API will have to be changed to copy the name into a buffer provided by the caller. We'll have to scrutinize all calls for tricks like this. > Selectively replacing them might be a good idea, > though. For intern(), I think an optional > argument strongref needs to be provided (the > interned dict essentially weak-references the > strings). Perhaps the default even needs to be > weakref. So do you think there's a need for immortal strings? What is that need? ---------------------------------------------------------------------- Comment By: Martin v. L�wis (loewis) Date: 2002-08-14 18:24 Message: Logged In: YES user_id=21627 Some mutually unrelated comments: - the GC_UnTrack call for interned is not need: GC won't be able to explain the reference that stringobject.c holds. - why does this try to "fix" the problem of dangling interned strings? 
AFAICT: if there is a reference to an interned string at the time _Py_ReleaseInternedStrings is called, that reference is silently dropped, and a later DECREF will result in memory corruption. IOW: it should merely set the state of all strings to normal, and clear the dict. - Replacing PyString_InternInPlace with PyString_Intern seems dangerous. AFAICT, the fragment PyString_InternInPlace(&name); Py_DECREF(name); return PyString_AS_STRING(name); from getclassname would break: Intern() would return the only reference to the interned string (assuming this is the first usage), and getclassname drops this reference, returning a pointer to deallocated memory. I'm not sure though why getclassname interns the result in the first place. Selectively replacing them might be a good idea, though. For intern(), I think an optional argument strongref needs to be provided (the interned dict essentially weak-references the strings). Perhaps the default even needs to be weakref. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-14 15:32 Message: Logged In: YES user_id=6380 Question for all other reviewers. Why not replace all calls to PyString_InternInplace() [which creates immortal strings] with PyString_Intern(), making all (core) uses of interning yield mortal strings? E.g. the call in PyObject_SetAttr() will immortalize all strings that are ever used as a key on a setattr operation; in a long-lived server like Zope this is a concern, since setattr keys are often user-provided data: an endless stream of user-provided data will grow the interned dict indefinitely. And having the builtin intern() always return an immortal string also limits the usability of intern(). Most of the uses I could find of PyString_InternFromString() hold on to a global ref to the object, making it immortal anyway; but why should that function itself force the string to be immortal? 
(Especially since the exceptions are things like PyObject_GetAttrString() and PyObject_SetItemString(), which have the same concerns as PyObject_SetItem(). ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-14 14:43 Message: Logged In: YES user_id=6380 Here's an update of the patch for current CVS (stringobject.h failed due to changes for PyAPI_DATA/PyAPI_FUNC). Could you add documentation to Doc/api/concrete.tex for PyString_Intern() and explains how PyString_InternInPlace() differs? (AFAICT it makes the interned string immortal -- I suppose this is a B/W compat feature?) The variables PYTHON_API_VERSION and PYTHON_API_STRING in modsupport.h need to be updated -- many extensions use the PyString_AS_STRING() macro which relies on the string object format. If an extension compiled with the old code is linked with the new interpreter, it will miss the first three bytes of string objects -- or even store into memory it doesn't own! (I've already added this to the patch I am uploading.) (The test_gc failures were unrelated; Tim has fixed this already in CVS.) I'm tempted to say that except for the API doc issue this is complete. Thanks! ---------------------------------------------------------------------- Comment By: Oren Tirosh (orenti) Date: 2002-08-10 06:01 Message: Logged In: YES user_id=562624 General cleanup. Better handling of immortal interned strings for backward compatibility. It passes regrtest but causes test_gc to leak 20 objects. 13 from test_finalizer_newclass and 7 from test_del_newclass, but only if test_saveall is used. I've tried earlier versions of this patch (which were ok at the time) and they now create this leak too. ---------------------------------------------------------------------- Comment By: Oren Tirosh (orenti) Date: 2002-07-06 12:08 Message: Logged In: YES user_id=562624 Oops, forgot to actually attach the patch. Here it is. 
---------------------------------------------------------------------- Comment By: Oren Tirosh (orenti) Date: 2002-07-06 10:35 Message: Logged In: YES user_id=562624 This implementation supports both mortal and immortal interned strings. PyString_InternInPlace creates an immortal interned string for backward compatibility with code that relies on this behavior. PyString_Intern creates a mortal interned string that is deallocated when its refcnt reaches 0. Note that if the string value has been previously interned as immortal this will not make it mortal. Most places in the interpreter were changed to PyString_Intern except those that may be required for compatibility. This version of the patch, like the previous one, disables indirect interning. Is there any evidence that it is still an important optimization for some packages? Make sure you rebuild everything after applying this patch because it modifies the size of string object headers. ---------------------------------------------------------------------- Comment By: Raymond Hettinger (rhettinger) Date: 2002-07-02 00:21 Message: Logged In: YES user_id=80475 I like the way you consolidated all of the knowledge about interning into one place. Consider adding an example to the docs of an effective use of interning for optimization. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=576101&group_id=5470 From noreply@sourceforge.net Thu Aug 15 02:32:12 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Wed, 14 Aug 2002 18:32:12 -0700 Subject: [Patches] [ python-Patches-492105 ] Import from Zip archive Message-ID: Patches item #492105, was opened at 2001-12-12 12:21 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=492105&group_id=5470 Category: Core (C code) Group: Python 2.3 Status: Open Resolution: Out of Date Priority: 5 Submitted By: James C. 
Ahlstrom (ahlstromjc) Assigned to: Guido van Rossum (gvanrossum) Summary: Import from Zip archive Initial Comment: This is the "final" patch to support imports from zip archives, and directory caching using os.listdir(). It replaces patch 483466 and 476047. It is a separate patch since I can't delete file attachments. It adds support for importing from "" and from relative paths. ---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-14 21:32 Message: Logged In: YES user_id=6380 Whoa... That's a lot. Neal, do you think you could come up with a reworked version of this? That would be great! ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-14 19:30 Message: Logged In: YES user_id=33168 I'm reviewed most of the updated version. Quick summary: * strncpy() doesn't null terminate strings, this needs to be added * seems to be many leaked references Here are more details: * strncpy(dest, str, size) doesn't guarantee the result to be null terminated, need dest[size] = '\0'; after all strncpy()s (I see a bunch in getpath.c) * getpath.c:get_sys_path_0(), I think some compilers warn when you do char[i] = 0; (use '\0' instead) (there may be other places of this) * import.c:PyImport_InitZip(), have_zlib is an alias to zlib and isn't really helpful/necessary * import.c:get_path_type() can leak a reference to pyobj if it's an int * import.c:add_directory_names() pylist reference is leaked * import.c:add_zip_names(), memcmp(path+i, ".zip", 4) is clearer to me than path[i] == '.' .... * import.c:add_zip_names, there's a lot of magic #s in this function * " " : leak refs when doing PyTuple_SetItem I think there were other leaked references. I'll try to spend some more time later. 
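
[Editorial sketch] The strncpy() point in the review above can be illustrated with a small helper. This is only a sketch and is not taken from the patch: the name copy_terminated and its dest_size parameter are hypothetical. It assumes dest has room for dest_size bytes in total, so it terminates at dest[dest_size - 1]; the review's "dest[size] = '\0'" form is equivalent only when dest holds size + 1 bytes.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from the patch): copy src into dest, which has
 * room for dest_size bytes in total, and always leave dest terminated. */
static void
copy_terminated(char *dest, size_t dest_size, const char *src)
{
    if (dest_size == 0)
        return;                     /* no room even for the terminator */
    /* strncpy() writes no '\0' when strlen(src) >= the count it is given,
     * so copy at most dest_size - 1 bytes ... */
    strncpy(dest, src, dest_size - 1);
    /* ... and terminate explicitly in the last byte of the buffer. */
    dest[dest_size - 1] = '\0';
}
```

With an 8-byte buffer, copy_terminated(buf, sizeof(buf), "Lib/site-packages") leaves the truncated but properly terminated string "Lib/sit" in buf.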
----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2002-08-14 16:50

Message:
Logged In: YES
user_id=6380

Sigh. We waited too long for this one, and now the patch is hopelessly out of date. I managed to fix most of the failing hunks, but the remaining hunk that fails (#11 in import.c) is a whopper: the saved import.c.rej is 270 lines long. I'm going to sleep on that, but I could use some help. In the meantime, I'm uploading an edited version of dashc.diff to help the next person.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2002-07-28 06:54

Message:
Logged In: YES
user_id=21627

Is this patch ready to be applied?

----------------------------------------------------------------------

Comment By: Jeremy Hylton (jhylton)
Date: 2002-06-12 11:05

Message:
Logged In: YES
user_id=31392

Deleting the old diffs that Jim couldn't delete.

----------------------------------------------------------------------

Comment By: James C. Ahlstrom (ahlstromjc)
Date: 2002-03-15 12:27

Message:
Logged In: YES
user_id=64929

I added a diff -c version of the patch.

----------------------------------------------------------------------

Comment By: James C. Ahlstrom (ahlstromjc)
Date: 2002-03-15 12:03

Message:
Logged In: YES
user_id=64929

I still can't delete files, but I added a new file which contains all diffs as a single file, and is made from the current CVS tree (Mar 15, 2002).
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=492105&group_id=5470 From noreply@sourceforge.net Thu Aug 15 02:38:05 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Wed, 14 Aug 2002 18:38:05 -0700 Subject: [Patches] [ python-Patches-492105 ] Import from Zip archive Message-ID: Patches item #492105, was opened at 2001-12-12 12:21 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=492105&group_id=5470 Category: Core (C code) Group: Python 2.3 Status: Open Resolution: Out of Date Priority: 5 Submitted By: James C. Ahlstrom (ahlstromjc) >Assigned to: Neal Norwitz (nnorwitz) Summary: Import from Zip archive Initial Comment: This is the "final" patch to support imports from zip archives, and directory caching using os.listdir(). It replaces patch 483466 and 476047. It is a separate patch since I can't delete file attachments. It adds support for importing from "" and from relative paths. ---------------------------------------------------------------------- >Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-14 21:38 Message: Logged In: YES user_id=33168 Reassigning to me. I'll give it a shot. It will take a while. Do I understand you correctly that your updated patch (dashc-2) has all the necessary pieces, except for the import hunk 11 that was rejected? And I need to get that failed hunk from the original patch? ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-14 21:32 Message: Logged In: YES user_id=6380 Whoa... That's a lot. Neal, do you think you could come up with a reworked version of this? That would be great! ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-14 19:30 Message: Logged In: YES user_id=33168 I'm reviewed most of the updated version. 
Quick summary: * strncpy() doesn't null terminate strings, this needs to be added * seems to be many leaked references Here are more details: * strncpy(dest, str, size) doesn't guarantee the result to be null terminated, need dest[size] = '\0'; after all strncpy()s (I see a bunch in getpath.c) * getpath.c:get_sys_path_0(), I think some compilers warn when you do char[i] = 0; (use '\0' instead) (there may be other places of this) * import.c:PyImport_InitZip(), have_zlib is an alias to zlib and isn't really helpful/necessary * import.c:get_path_type() can leak a reference to pyobj if it's an int * import.c:add_directory_names() pylist reference is leaked * import.c:add_zip_names(), memcmp(path+i, ".zip", 4) is clearer to me than path[i] == '.' .... * import.c:add_zip_names, there's a lot of magic #s in this function * " " : leak refs when doing PyTuple_SetItem I think there were other leaked references. I'll try to spend some more time later. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-14 16:50 Message: Logged In: YES user_id=6380 Sigh. We waited too long for this one, and now the patch is hopelessly out of date. I managed to fix most of the failing hunks, but the remaining hunk that fails (#11 in import.c) is a whopper: the saved import.c.rej is 270 lines long. I'm going to sleep on that, but I could use some help. In the mean time, I'm uploading an edited version of dashc.diff to help the next person. ---------------------------------------------------------------------- Comment By: Martin v. L�wis (loewis) Date: 2002-07-28 06:54 Message: Logged In: YES user_id=21627 Is this patch ready to be applied? ---------------------------------------------------------------------- Comment By: Jeremy Hylton (jhylton) Date: 2002-06-12 11:05 Message: Logged In: YES user_id=31392 Deleteing the old diffs that Jim couldn't delete. 
---------------------------------------------------------------------- Comment By: James C. Ahlstrom (ahlstromjc) Date: 2002-03-15 12:27 Message: Logged In: YES user_id=64929 I added a diff -c version of the patch. ---------------------------------------------------------------------- Comment By: James C. Ahlstrom (ahlstromjc) Date: 2002-03-15 12:03 Message: Logged In: YES user_id=64929 I still can't delete files, but I added a new file which contains all diffs as a single file, and is made from the current CVS tree (Mar 15, 2002). ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=492105&group_id=5470 From noreply@sourceforge.net Thu Aug 15 03:40:15 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Wed, 14 Aug 2002 19:40:15 -0700 Subject: [Patches] [ python-Patches-492105 ] Import from Zip archive Message-ID: Patches item #492105, was opened at 2001-12-12 12:21 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=492105&group_id=5470 Category: Core (C code) Group: Python 2.3 Status: Open Resolution: Out of Date Priority: 5 Submitted By: James C. Ahlstrom (ahlstromjc) Assigned to: Neal Norwitz (nnorwitz) Summary: Import from Zip archive Initial Comment: This is the "final" patch to support imports from zip archives, and directory caching using os.listdir(). It replaces patch 483466 and 476047. It is a separate patch since I can't delete file attachments. It adds support for importing from "" and from relative paths. ---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-14 22:40 Message: Logged In: YES user_id=6380 No, the failing hunk is included in dashc-2. I actually *edited* the patch file until all but that one hunk succeeded. Thanks for looking into this! If you need help don't fail to ask on python-dev, I read it daily. 
:-) ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-14 21:38 Message: Logged In: YES user_id=33168 Reassigning to me. I'll give it a shot. It will take a while. Do I understand you correctly that your updated patch (dashc-2) has all the necessary pieces, except for the import hunk 11 that was rejected? And I need to get that failed hunk from the original patch? ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-14 21:32 Message: Logged In: YES user_id=6380 Whoa... That's a lot. Neal, do you think you could come up with a reworked version of this? That would be great! ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-14 19:30 Message: Logged In: YES user_id=33168 I'm reviewed most of the updated version. Quick summary: * strncpy() doesn't null terminate strings, this needs to be added * seems to be many leaked references Here are more details: * strncpy(dest, str, size) doesn't guarantee the result to be null terminated, need dest[size] = '\0'; after all strncpy()s (I see a bunch in getpath.c) * getpath.c:get_sys_path_0(), I think some compilers warn when you do char[i] = 0; (use '\0' instead) (there may be other places of this) * import.c:PyImport_InitZip(), have_zlib is an alias to zlib and isn't really helpful/necessary * import.c:get_path_type() can leak a reference to pyobj if it's an int * import.c:add_directory_names() pylist reference is leaked * import.c:add_zip_names(), memcmp(path+i, ".zip", 4) is clearer to me than path[i] == '.' .... * import.c:add_zip_names, there's a lot of magic #s in this function * " " : leak refs when doing PyTuple_SetItem I think there were other leaked references. I'll try to spend some more time later. 
----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2002-08-14 16:50

Message:
Logged In: YES
user_id=6380

Sigh. We waited too long for this one, and now the patch is hopelessly out of date. I managed to fix most of the failing hunks, but the remaining hunk that fails (#11 in import.c) is a whopper: the saved import.c.rej is 270 lines long. I'm going to sleep on that, but I could use some help. In the meantime, I'm uploading an edited version of dashc.diff to help the next person.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2002-07-28 06:54

Message:
Logged In: YES
user_id=21627

Is this patch ready to be applied?

----------------------------------------------------------------------

Comment By: Jeremy Hylton (jhylton)
Date: 2002-06-12 11:05

Message:
Logged In: YES
user_id=31392

Deleting the old diffs that Jim couldn't delete.

----------------------------------------------------------------------

Comment By: James C. Ahlstrom (ahlstromjc)
Date: 2002-03-15 12:27

Message:
Logged In: YES
user_id=64929

I added a diff -c version of the patch.

----------------------------------------------------------------------

Comment By: James C. Ahlstrom (ahlstromjc)
Date: 2002-03-15 12:03

Message:
Logged In: YES
user_id=64929

I still can't delete files, but I added a new file which contains all diffs as a single file, and is made from the current CVS tree (Mar 15, 2002).
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=492105&group_id=5470 From noreply@sourceforge.net Thu Aug 15 04:03:31 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Wed, 14 Aug 2002 20:03:31 -0700 Subject: [Patches] [ python-Patches-576101 ] Alternative implementation of interning Message-ID: Patches item #576101, was opened at 2002-07-01 15:23 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=576101&group_id=5470 Category: None Group: None Status: Open Resolution: None Priority: 5 Submitted By: Oren Tirosh (orenti) Assigned to: Guido van Rossum (gvanrossum) Summary: Alternative implementation of interning Initial Comment: An interned string has a flag set indicating that it is interned instead of a pointer to the interned string. This pointer was almost always either NULL or pointing to the same object. The other cases were rare and ineffective as an optimization. This saves an average of 3 bytes per string. Interned strings are no longer immortal. They are automatically destroyed when there are no more references to them except the global dictionary of interned strings. New function (actually a macro) PyString_CheckInterned to check whether a string is interned. There are no more references to ob_sinterned anywhere outside stringobject.c. ---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-14 23:03 Message: Logged In: YES user_id=6380 string_dealloc() is a bit optimistic in that it doesn't check the DelItem for an error; but I don't know what it should do when it gets an error at that point. Probably call Py_FatalError(); if it wanted to recover, it would have to call PyErr_Fetch() / PyErr_Restore() around the DelItem() call, because we're in a dealloc handler here and that shouldn't change the exception state. 
_Py_ReleaseInternedStrings() should use PyDict_ methods, not PyMapping_ methods. And it should do more careful error checking. But maybe it's best to delete this function -- it's not needed except when you want to run Insure++, and we're not using that any more. I note that the whole patch needs to be scrutinized carefully looking for missing error checking and things like that. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-14 21:26 Message: Logged In: YES user_id=6380 > - why does this try to "fix" the problem of > dangling interned strings? AFAICT: if there is a > reference to an interned string at the time > _Py_ReleaseInternedStrings is called, that > reference is silently dropped, and a later > DECREF will result in memory corruption. IOW: it > should merely set the state of all strings to > normal, and clear the dict. Note that the *only* time when _Py_ReleaseInternedStrings() can ever be called is at program exit, just before you run a memory leak detector. There's no way Python can be resurrected after _Py_ReleaseInternedStrings() has run. > - Replacing PyString_InternInPlace with > PyString_Intern seems dangerous. AFAICT, the > fragment > > PyString_InternInPlace(&name); > Py_DECREF(name); > return PyString_AS_STRING(name); > > from getclassname would break: Intern() would > return the only reference to the interned string > (assuming this is the first usage), and > getclassname drops this reference, returning a > pointer to deallocated memory. I'm not sure > though why getclassname interns the result in > the first place. getclassname() is doing something very unsavory here! I expect that its API will have to be changed to copy the name into a buffer provided by the caller. We'll have to scrutinize all calls for tricks like this. > Selectively replacing them might be a good idea, > though. 
For intern(), I think an optional > argument strongref needs to be provided (the > interned dict essentially weak-references the > strings). Perhaps the default even needs to be > weakref. So do you think there's a need for immortal strings? What is that need? ---------------------------------------------------------------------- Comment By: Martin v. L�wis (loewis) Date: 2002-08-14 18:24 Message: Logged In: YES user_id=21627 Some mutually unrelated comments: - the GC_UnTrack call for interned is not need: GC won't be able to explain the reference that stringobject.c holds. - why does this try to "fix" the problem of dangling interned strings? AFAICT: if there is a reference to an interned string at the time _Py_ReleaseInternedStrings is called, that reference is silently dropped, and a later DECREF will result in memory corruption. IOW: it should merely set the state of all strings to normal, and clear the dict. - Replacing PyString_InternInPlace with PyString_Intern seems dangerous. AFAICT, the fragment PyString_InternInPlace(&name); Py_DECREF(name); return PyString_AS_STRING(name); from getclassname would break: Intern() would return the only reference to the interned string (assuming this is the first usage), and getclassname drops this reference, returning a pointer to deallocated memory. I'm not sure though why getclassname interns the result in the first place. Selectively replacing them might be a good idea, though. For intern(), I think an optional argument strongref needs to be provided (the interned dict essentially weak-references the strings). Perhaps the default even needs to be weakref. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-14 15:32 Message: Logged In: YES user_id=6380 Question for all other reviewers. 
Why not replace all calls to PyString_InternInplace() [which creates immortal strings] with PyString_Intern(), making all (core) uses of interning yield mortal strings? E.g. the call in PyObject_SetAttr() will immortalize all strings that are ever used as a key on a setattr operation; in a long-lived server like Zope this is a concern, since setattr keys are often user-provided data: an endless stream of user-provided data will grow the interned dict indefinitely. And having the builtin intern() always return an immortal string also limits the usability of intern(). Most of the uses I could find of PyString_InternFromString() hold on to a global ref to the object, making it immortal anyway; but why should that function itself force the string to be immortal? (Especially since the exceptions are things like PyObject_GetAttrString() and PyObject_SetItemString(), which have the same concerns as PyObject_SetItem(). ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-14 14:43 Message: Logged In: YES user_id=6380 Here's an update of the patch for current CVS (stringobject.h failed due to changes for PyAPI_DATA/PyAPI_FUNC). Could you add documentation to Doc/api/concrete.tex for PyString_Intern() and explains how PyString_InternInPlace() differs? (AFAICT it makes the interned string immortal -- I suppose this is a B/W compat feature?) The variables PYTHON_API_VERSION and PYTHON_API_STRING in modsupport.h need to be updated -- many extensions use the PyString_AS_STRING() macro which relies on the string object format. If an extension compiled with the old code is linked with the new interpreter, it will miss the first three bytes of string objects -- or even store into memory it doesn't own! (I've already added this to the patch I am uploading.) (The test_gc failures were unrelated; Tim has fixed this already in CVS.) I'm tempted to say that except for the API doc issue this is complete. Thanks! 
---------------------------------------------------------------------- Comment By: Oren Tirosh (orenti) Date: 2002-08-10 06:01 Message: Logged In: YES user_id=562624 General cleanup. Better handling of immortal interned strings for backward compatibility. It passes regrtest but causes test_gc to leak 20 objects. 13 from test_finalizer_newclass and 7 from test_del_newclass, but only if test_saveall is used. I've tried earlier versions of this patch (which were ok at the time) and they now create this leak too. ---------------------------------------------------------------------- Comment By: Oren Tirosh (orenti) Date: 2002-07-06 12:08 Message: Logged In: YES user_id=562624 Oops, forgot to actually attach the patch. Here it is. ---------------------------------------------------------------------- Comment By: Oren Tirosh (orenti) Date: 2002-07-06 10:35 Message: Logged In: YES user_id=562624 This implementation supports both mortal and immortal interned strings. PyString_InternInPlace creates an immortal interned string for backward compatibility with code that relies on this behavior. PyString_Intern creates a mortal interned string that is deallocated when its refcnt reaches 0. Note that if the string value has been previously interned as immortal this will not make it mortal. Most places in the interpreter were changed to PyString_Intern except those that may be required for compatibility. This version of the patch, like the previous one, disables indirect interning. Is there any evidence that it is still an important optimization for some packages? Make sure you rebuild everything after applying this patch because it modifies the size of string object headers. ---------------------------------------------------------------------- Comment By: Raymond Hettinger (rhettinger) Date: 2002-07-02 00:21 Message: Logged In: YES user_id=80475 I like the way you consolidated all of the knowledge about interning into one place. 
Consider adding an example to the docs of an effective use of interning for optimization. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=576101&group_id=5470 From noreply@sourceforge.net Thu Aug 15 04:44:50 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Wed, 14 Aug 2002 20:44:50 -0700 Subject: [Patches] [ python-Patches-576101 ] Alternative implementation of interning Message-ID: Patches item #576101, was opened at 2002-07-01 19:23 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=576101&group_id=5470 Category: None Group: None Status: Open Resolution: None Priority: 5 Submitted By: Oren Tirosh (orenti) Assigned to: Guido van Rossum (gvanrossum) Summary: Alternative implementation of interning Initial Comment: An interned string has a flag set indicating that it is interned instead of a pointer to the interned string. This pointer was almost always either NULL or pointing to the same object. The other cases were rare and ineffective as an optimization. This saves an average of 3 bytes per string. Interned strings are no longer immortal. They are automatically destroyed when there are no more references to them except the global dictionary of interned strings. New function (actually a macro) PyString_CheckInterned to check whether a string is interned. There are no more references to ob_sinterned anywhere outside stringobject.c. ---------------------------------------------------------------------- >Comment By: Oren Tirosh (orenti) Date: 2002-08-15 03:44 Message: Logged In: YES user_id=562624 Yes, PyString_InternInPlace is for backward compatibility. How conservative do we need to be about compatibility? My work copy has an option for making strings binary compatible. Which is more important: binary compatibility or saving 3 bytes? 
A related patch (static names) provides a possible alternative to most PyString_InternFromString calls. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-15 03:03 Message: Logged In: YES user_id=6380 string_dealloc() is a bit optimistic in that it doesn't check the DelItem for an error; but I don't know what it should do when it gets an error at that point. Probably call Py_FatalError(); if it wanted to recover, it would have to call PyErr_Fetch() / PyErr_Restore() around the DelItem() call, because we're in a dealloc handler here and that shouldn't change the exception state. _Py_ReleaseInternedStrings() should use PyDict_ methods, not PyMapping_ methods. And it should do more careful error checking. But maybe it's best to delete this function -- it's not needed except when you want to run Insure++, and we're not using that any more. I note that the whole patch needs to be scrutinized carefully looking for missing error checking and things like that. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-15 01:26 Message: Logged In: YES user_id=6380 > - why does this try to "fix" the problem of > dangling interned strings? AFAICT: if there is a > reference to an interned string at the time > _Py_ReleaseInternedStrings is called, that > reference is silently dropped, and a later > DECREF will result in memory corruption. IOW: it > should merely set the state of all strings to > normal, and clear the dict. Note that the *only* time when _Py_ReleaseInternedStrings() can ever be called is at program exit, just before you run a memory leak detector. There's no way Python can be resurrected after _Py_ReleaseInternedStrings() has run. > - Replacing PyString_InternInPlace with > PyString_Intern seems dangerous. 
AFAICT, the > fragment > > PyString_InternInPlace(&name); > Py_DECREF(name); > return PyString_AS_STRING(name); > > from getclassname would break: Intern() would > return the only reference to the interned string > (assuming this is the first usage), and > getclassname drops this reference, returning a > pointer to deallocated memory. I'm not sure > though why getclassname interns the result in > the first place. getclassname() is doing something very unsavory here! I expect that its API will have to be changed to copy the name into a buffer provided by the caller. We'll have to scrutinize all calls for tricks like this. > Selectively replacing them might be a good idea, > though. For intern(), I think an optional > argument strongref needs to be provided (the > interned dict essentially weak-references the > strings). Perhaps the default even needs to be > weakref. So do you think there's a need for immortal strings? What is that need? ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-08-14 22:24 Message: Logged In: YES user_id=21627 Some mutually unrelated comments: - the GC_UnTrack call for interned is not needed: GC won't be able to explain the reference that stringobject.c holds. - why does this try to "fix" the problem of dangling interned strings? AFAICT: if there is a reference to an interned string at the time _Py_ReleaseInternedStrings is called, that reference is silently dropped, and a later DECREF will result in memory corruption. IOW: it should merely set the state of all strings to normal, and clear the dict. - Replacing PyString_InternInPlace with PyString_Intern seems dangerous.
AFAICT, the fragment PyString_InternInPlace(&name); Py_DECREF(name); return PyString_AS_STRING(name); from getclassname would break: Intern() would return the only reference to the interned string (assuming this is the first usage), and getclassname drops this reference, returning a pointer to deallocated memory. I'm not sure though why getclassname interns the result in the first place. Selectively replacing them might be a good idea, though. For intern(), I think an optional argument strongref needs to be provided (the interned dict essentially weak-references the strings). Perhaps the default even needs to be weakref. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-14 19:32 Message: Logged In: YES user_id=6380 Question for all other reviewers. Why not replace all calls to PyString_InternInplace() [which creates immortal strings] with PyString_Intern(), making all (core) uses of interning yield mortal strings? E.g. the call in PyObject_SetAttr() will immortalize all strings that are ever used as a key on a setattr operation; in a long-lived server like Zope this is a concern, since setattr keys are often user-provided data: an endless stream of user-provided data will grow the interned dict indefinitely. And having the builtin intern() always return an immortal string also limits the usability of intern(). Most of the uses I could find of PyString_InternFromString() hold on to a global ref to the object, making it immortal anyway; but why should that function itself force the string to be immortal? (Especially since the exceptions are things like PyObject_GetAttrString() and PyObject_SetItemString(), which have the same concerns as PyObject_SetItem(). 
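A minimal sketch of the mortal-interning idea raised above, written in modern Python rather than the C of the patch. Everything in it is illustrative: `Name`, `_interned` and `intern_mortal` are hypothetical names, and a `WeakValueDictionary` merely stands in for an interned dict that does not keep its strings alive. Plain `str` objects cannot be weakly referenced in CPython, so the sketch uses a small subclass.

```python
import gc
import weakref


class Name(str):
    """Hypothetical str subclass; plain str objects cannot be weakly referenced."""


# Hypothetical intern table: it holds its values only weakly, so an entry
# vanishes once the last outside reference to the string is dropped.
_interned = weakref.WeakValueDictionary()


def intern_mortal(text):
    """Return the canonical Name for text, creating it on first use."""
    name = _interned.get(text)
    if name is None:
        name = Name(text)
        _interned[text] = name
    return name


a = intern_mortal("user_supplied_key")
b = intern_mortal("user_supplied_key")
same_object = a is b               # both calls return the one canonical object
size_while_alive = len(_interned)

del a, b                           # drop the only strong references
gc.collect()                       # not required under CPython refcounting, but harmless
size_after_release = len(_interned)
```

With a table like this, an endless stream of user-provided attribute names would not grow the table indefinitely, which is the Zope concern described above.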
---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-14 18:43 Message: Logged In: YES user_id=6380 Here's an update of the patch for current CVS (stringobject.h failed due to changes for PyAPI_DATA/PyAPI_FUNC). Could you add documentation to Doc/api/concrete.tex for PyString_Intern() and explains how PyString_InternInPlace() differs? (AFAICT it makes the interned string immortal -- I suppose this is a B/W compat feature?) The variables PYTHON_API_VERSION and PYTHON_API_STRING in modsupport.h need to be updated -- many extensions use the PyString_AS_STRING() macro which relies on the string object format. If an extension compiled with the old code is linked with the new interpreter, it will miss the first three bytes of string objects -- or even store into memory it doesn't own! (I've already added this to the patch I am uploading.) (The test_gc failures were unrelated; Tim has fixed this already in CVS.) I'm tempted to say that except for the API doc issue this is complete. Thanks! ---------------------------------------------------------------------- Comment By: Oren Tirosh (orenti) Date: 2002-08-10 10:01 Message: Logged In: YES user_id=562624 General cleanup. Better handling of immortal interned strings for backward compatibility. It passes regrtest but causes test_gc to leak 20 objects. 13 from test_finalizer_newclass and 7 from test_del_newclass, but only if test_saveall is used. I've tried earlier versions of this patch (which were ok at the time) and they now create this leak too. ---------------------------------------------------------------------- Comment By: Oren Tirosh (orenti) Date: 2002-07-06 16:08 Message: Logged In: YES user_id=562624 Oops, forgot to actually attach the patch. Here it is. 
---------------------------------------------------------------------- Comment By: Oren Tirosh (orenti) Date: 2002-07-06 14:35 Message: Logged In: YES user_id=562624 This implementation supports both mortal and immortal interned strings. PyString_InternInPlace creates an immortal interned string for backward compatibility with code that relies on this behavior. PyString_Intern creates a mortal interned string that is deallocated when its refcnt reaches 0. Note that if the string value has been previously interned as immortal this will not make it mortal. Most places in the interpreter were changed to PyString_Intern except those that may be required for compatibility. This version of the patch, like the previous one, disables indirect interning. Is there any evidence that it is still an important optimization for some packages? Make sure you rebuild everything after applying this patch because it modifies the size of string object headers. ---------------------------------------------------------------------- Comment By: Raymond Hettinger (rhettinger) Date: 2002-07-02 04:21 Message: Logged In: YES user_id=80475 I like the way you consolidated all of the knowledge about interning into one place. Consider adding an example to the docs of an effective use of interning for optimization. 
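One possible illustration of the kind of example suggested here, offered as a sketch and not as text from the patch or the docs. In Python 2.3 the builtin was spelled `intern()`; the sketch uses `sys.intern()`, its Python 3 spelling. Equal strings built at run time are not guaranteed to be the same object; interning maps them to one shared object, so later comparisons can succeed on identity alone.

```python
import sys

# Build two equal strings at run time; they need not be the same object.
parts = ["attr", "name"]
first = "_".join(parts)
second = "_".join(parts)
equal_before = first == second      # compares character by character

# After interning, equal strings are guaranteed to be one shared object.
first = sys.intern(first)
second = sys.intern(second)
identical_after = first is second   # a pointer comparison is now enough
```

This is the property dictionary lookups on attribute names benefit from: an identity check succeeds before any character comparison is needed.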
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=576101&group_id=5470 From noreply@sourceforge.net Thu Aug 15 09:45:43 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 15 Aug 2002 01:45:43 -0700 Subject: [Patches] [ python-Patches-576101 ] Alternative implementation of interning Message-ID: Patches item #576101, was opened at 2002-07-01 21:23 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=576101&group_id=5470 Category: None Group: None Status: Open Resolution: None Priority: 5 Submitted By: Oren Tirosh (orenti) Assigned to: Guido van Rossum (gvanrossum) Summary: Alternative implementation of interning Initial Comment: An interned string has a flag set indicating that it is interned instead of a pointer to the interned string. This pointer was almost always either NULL or pointing to the same object. The other cases were rare and ineffective as an optimization. This saves an average of 3 bytes per string. Interned strings are no longer immortal. They are automatically destroyed when there are no more references to them except the global dictionary of interned strings. New function (actually a macro) PyString_CheckInterned to check whether a string is interned. There are no more references to ob_sinterned anywhere outside stringobject.c. ---------------------------------------------------------------------- >Comment By: Martin v. Löwis (loewis) Date: 2002-08-15 10:45 Message: Logged In: YES user_id=21627 _Py_ReleaseInternedStrings: it might be that embedded applications use it. It would not be fair to cause heap corruption for them - it would be better to break them at link time, by removing the function entirely. I see no need to do either - it should just release immortal strings, as it always did, if there are any left.
intern creates immortal strings: It might be that an application saves the id() of an interned string and releases the interned strings; then expects to get the same id back later. If you ask people whether they do that they won't tell, because they don't know that they do that. You could explicitly decide to break such applications (which would be reasonable), but then this needs to be documented. binary compatibility: I'm neutral here. If the API is bumped, people get sufficient warning. PyString_InternInPlace: I think it needs to be preserved, since applications may not hold explicit references (trusting that the interned dictionary will hold the reference). Of course, the InPlace name signals that there is no return value, so it is better than _Intern for new users. ---------------------------------------------------------------------- Comment By: Oren Tirosh (orenti) Date: 2002-08-15 05:44 Message: Logged In: YES user_id=562624 Yes, PyString_InternInPlace is for backward compatibility. How conservative do we need to be about compatibility? My work copy has an option for making strings binary compatible. Which is more important: binary compatibility or saving 3 bytes? A related patch (static names) provides a possible alternative to most PyString_InternFromString calls. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-15 05:03 Message: Logged In: YES user_id=6380 string_dealloc() is a bit optimistic in that it doesn't check the DelItem for an error; but I don't know what it should do when it gets an error at that point. Probably call Py_FatalError(); if it wanted to recover, it would have to call PyErr_Fetch() / PyErr_Restore() around the DelItem() call, because we're in a dealloc handler here and that shouldn't change the exception state. _Py_ReleaseInternedStrings() should use PyDict_ methods, not PyMapping_ methods. And it should do more careful error checking. 
But maybe it's best to delete this function -- it's not needed except when you want to run Insure++, and we're not using that any more. I note that the whole patch needs to be scrutinized carefully looking for missing error checking and things like that. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-15 03:26 Message: Logged In: YES user_id=6380 > - why does this try to "fix" the problem of > dangling interned strings? AFAICT: if there is a > reference to an interned string at the time > _Py_ReleaseInternedStrings is called, that > reference is silently dropped, and a later > DECREF will result in memory corruption. IOW: it > should merely set the state of all strings to > normal, and clear the dict. Note that the *only* time when _Py_ReleaseInternedStrings() can ever be called is at program exit, just before you run a memory leak detector. There's no way Python can be resurrected after _Py_ReleaseInternedStrings() has run. > - Replacing PyString_InternInPlace with > PyString_Intern seems dangerous. AFAICT, the > fragment > > PyString_InternInPlace(&name); > Py_DECREF(name); > return PyString_AS_STRING(name); > > from getclassname would break: Intern() would > return the only reference to the interned string > (assuming this is the first usage), and > getclassname drops this reference, returning a > pointer to deallocated memory. I'm not sure > though why getclassname interns the result in > the first place. getclassname() is doing something very unsavory here! I expect that its API will have to be changed to copy the name into a buffer provided by the caller. We'll have to scrutinize all calls for tricks like this. > Selectively replacing them might be a good idea, > though. For intern(), I think an optional > argument strongref needs to be provided (the > interned dict essentially weak-references the > strings). Perhaps the default even needs to be > weakref. 
So do you think there's a need for immortal strings? What is that need? ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-08-15 00:24 Message: Logged In: YES user_id=21627 Some mutually unrelated comments: - the GC_UnTrack call for interned is not needed: GC won't be able to explain the reference that stringobject.c holds. - why does this try to "fix" the problem of dangling interned strings? AFAICT: if there is a reference to an interned string at the time _Py_ReleaseInternedStrings is called, that reference is silently dropped, and a later DECREF will result in memory corruption. IOW: it should merely set the state of all strings to normal, and clear the dict. - Replacing PyString_InternInPlace with PyString_Intern seems dangerous. AFAICT, the fragment PyString_InternInPlace(&name); Py_DECREF(name); return PyString_AS_STRING(name); from getclassname would break: Intern() would return the only reference to the interned string (assuming this is the first usage), and getclassname drops this reference, returning a pointer to deallocated memory. I'm not sure though why getclassname interns the result in the first place. Selectively replacing them might be a good idea, though. For intern(), I think an optional argument strongref needs to be provided (the interned dict essentially weak-references the strings). Perhaps the default even needs to be weakref. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-14 21:32 Message: Logged In: YES user_id=6380 Question for all other reviewers. Why not replace all calls to PyString_InternInplace() [which creates immortal strings] with PyString_Intern(), making all (core) uses of interning yield mortal strings? E.g.
the call in PyObject_SetAttr() will immortalize all strings that are ever used as a key on a setattr operation; in a long-lived server like Zope this is a concern, since setattr keys are often user-provided data: an endless stream of user-provided data will grow the interned dict indefinitely. And having the builtin intern() always return an immortal string also limits the usability of intern(). Most of the uses I could find of PyString_InternFromString() hold on to a global ref to the object, making it immortal anyway; but why should that function itself force the string to be immortal? (Especially since the exceptions are things like PyObject_GetAttrString() and PyObject_SetItemString(), which have the same concerns as PyObject_SetItem(). ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-14 20:43 Message: Logged In: YES user_id=6380 Here's an update of the patch for current CVS (stringobject.h failed due to changes for PyAPI_DATA/PyAPI_FUNC). Could you add documentation to Doc/api/concrete.tex for PyString_Intern() and explains how PyString_InternInPlace() differs? (AFAICT it makes the interned string immortal -- I suppose this is a B/W compat feature?) The variables PYTHON_API_VERSION and PYTHON_API_STRING in modsupport.h need to be updated -- many extensions use the PyString_AS_STRING() macro which relies on the string object format. If an extension compiled with the old code is linked with the new interpreter, it will miss the first three bytes of string objects -- or even store into memory it doesn't own! (I've already added this to the patch I am uploading.) (The test_gc failures were unrelated; Tim has fixed this already in CVS.) I'm tempted to say that except for the API doc issue this is complete. Thanks! 
---------------------------------------------------------------------- Comment By: Oren Tirosh (orenti) Date: 2002-08-10 12:01 Message: Logged In: YES user_id=562624 General cleanup. Better handling of immortal interned strings for backward compatibility. It passes regrtest but causes test_gc to leak 20 objects. 13 from test_finalizer_newclass and 7 from test_del_newclass, but only if test_saveall is used. I've tried earlier versions of this patch (which were ok at the time) and they now create this leak too. ---------------------------------------------------------------------- Comment By: Oren Tirosh (orenti) Date: 2002-07-06 18:08 Message: Logged In: YES user_id=562624 Oops, forgot to actually attach the patch. Here it is. ---------------------------------------------------------------------- Comment By: Oren Tirosh (orenti) Date: 2002-07-06 16:35 Message: Logged In: YES user_id=562624 This implementation supports both mortal and immortal interned strings. PyString_InternInPlace creates an immortal interned string for backward compatibility with code that relies on this behavior. PyString_Intern creates a mortal interned string that is deallocated when its refcnt reaches 0. Note that if the string value has been previously interned as immortal this will not make it mortal. Most places in the interpreter were changed to PyString_Intern except those that may be required for compatibility. This version of the patch, like the previous one, disables indirect interning. Is there any evidence that it is still an important optimization for some packages? Make sure you rebuild everything after applying this patch because it modifies the size of string object headers. ---------------------------------------------------------------------- Comment By: Raymond Hettinger (rhettinger) Date: 2002-07-02 06:21 Message: Logged In: YES user_id=80475 I like the way you consolidated all of the knowledge about interning into one place. 
Consider adding an example to the docs of an effective use of interning for optimization. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=576101&group_id=5470 From noreply@sourceforge.net Thu Aug 15 10:23:24 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 15 Aug 2002 02:23:24 -0700 Subject: [Patches] [ python-Patches-587993 ] SET_LINENO killer Message-ID: Patches item #587993, was opened at 2002-07-29 11:27 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=587993&group_id=5470 Category: Core (C code) Group: Python 2.3 Status: Open Resolution: Accepted Priority: 5 Submitted By: Michael Hudson (mwh) Assigned to: Michael Hudson (mwh) Summary: SET_LINENO killer Initial Comment: This patch is a proof-of-concept of another way to remove the SET_LINENO patch (as opposed to Vladimir's ancient one). Instead of rewriting bytecode (ick!) we poke into the c_lnotab to see if we've moved onto a different line. The c_lnotab is not the most transparent of data structures, it has to be said. I'm not sure this patch is 100% correct -- but I think the idea can definitely fly. There will be some more overhead to tracing than before, but I hope not too much. I haven't tested these aspects. Comments welcome! ---------------------------------------------------------------------- >Comment By: Michael Hudson (mwh) Date: 2002-08-15 09:23 Message: Logged In: YES user_id=6656 Yeah, the whatsnew section needs expanding. I'll get to this, but it probably shouldn't hold up the rest of the patch. ceval.c: oversight. Good spot. There's an attempt at explaining the restrictions on RETURN_NONE in opcode.h, but it's very short. I'll expand the comments in maybe_call_line_trace & and pointers to this in the dis docs and other places. ---------------------------------------------------------------------- Comment By: Martin v. 
Löwis (loewis) Date: 2002-08-14 23:08 Message: Logged In: YES user_id=21627 Random comments: - whatsnew23.tex: replace VERSION - ceval.c; lltrace: Why does it drop the comparison with NULL: lltrace is int, not PyObject* - it appears that RETURN_NONE does more than just returning None; it also suppresses trace calls. This should be carefully documented, or else somebody might suggest to generate RETURN_NONE for a plain return statement. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-14 20:16 Message: Logged In: YES user_id=6380 Looks good to me. Michael, why don't you hang onto it for another day and then check it in unless someone speaks out against it? ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-05 14:42 Message: Logged In: YES user_id=6656 That's good to know :) No great hurry with the review. I wanted to finish the patch while everything was still fresh in my mind, and I think I've done this now. Of course, I've said that before... ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-05 14:39 Message: Logged In: YES user_id=6380 That's pretty much what I suggested, yes. I'll have to take more time to review all this code... :-( ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-05 14:29 Message: Logged In: YES user_id=6656 It was discussed on python-dev: http://mail.python.org/pipermail/python-dev/2002-August/027261.html and followups. Adding a new opcode was Guido's idea, but I'm not totally sure this is what he meant. Also see the comments around line 2933 of ceval.c.
---------------------------------------------------------------------- Comment By: Neil Schemenauer (nascheme) Date: 2002-08-05 14:23 Message: Logged In: YES user_id=35752 Why was the RETURN_NONE opcode added? ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-05 12:20 Message: Logged In: YES user_id=6656 Here's another update. I've also deleted the old patches. Changes this time: + doesn't include pystone LOOPS boost + finish RETURN_NONE off: - add to dis.py - add to libdis.tex + removed now-unnecessary co_lnotab grovelling in inspect and traceback. left the functions for compatibility. + removed the WE_WANT_THE_HACK hack + includes Neal's test_hotshot patch + more trace.py tweaks. Lib/compiler is still broken. ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-05 11:56 Message: Logged In: YES user_id=33168 Re-assigning to Guido. I think what happened was I was reviewing the patch before it was assigned to Guido. Sorry about that. You're right about the set_lineno, I thought the opcode was also generated in there. I agree about the code reuse--it's probably too hard. If I had a better idea, I would have shared it. :-( ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-05 09:09 Message: Logged In: YES user_id=6656 Thanks for the comments, Neal! I'm not sure it's possible to separate the changes in the way you describe. For instance, com_addoparg -> com_set_lineno stops SET_LINENO being generated, so breaks tracing without the VM changes, but the VM changes make SET_LINENO into an unknown opcode... I didn't intend to upload the pystone change. About the comments: - the last little RETURN_NONE changes are easy. Thanks for the pointer. - agree there are too many bits of code grovelling co_lnotab. however, it's not clear that they can easily be refactored.
maybe a generator would be useful... hmm. let me think about this one. - did I take out the initialisation of frame.f_lineno entirely? oops. probably set it to co_firstlineno. - fixed trace.py docstring. reusing code not all that easy, because different uses have subtly different requirements. hmm, inspect.getlineno is now pointless... was there any particular reason you unassigned the patch? Just curious as the main reason it was assigned to Guido was to make sure he saw it. ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-04 19:50 Message: Logged In: YES user_id=33168 Attached is a patch to test_hotshot which is necessary because the duplicate line #s at the beginning of the function no longer exist. With this patch test_hotshot passes. ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-03 20:06 Message: Logged In: YES user_id=33168 Overall, the patch looks very good. It would be nice to check in some of the changes separate from this patch: * com_addoparg(SET_LINENO) -> com_set_lineno() * the updated comments and whitespace * pystone * dis.py, disassemble(), perhaps * RETURN_NONE, perhaps Here's other minor comments: * Need to add RETURN_NONE opcode into dis.py * Need to add doc for RETURN_NONE in libdis.tex * It would be nice if the code from inspect.getlineno could be reused in disassemble() (not sure this can be done, though--I saw at least 4 variations of this code, 2 in python and 2 in C) * Should frame.f_lineno be initialized to 0, -1 or something invalid? * scripts/trace.py reimplements getting the lineno, why not reuse code? * docstring in scripts/trace._find_LINENO_from_code needs to remove ref to SET_LINENO ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-03 15:52 Message: Logged In: YES user_id=6656 Update.
New this time: - adds RETURN_NONE opcode for the epilogue. - don't call line trace events on it. - bumps MAGIC this is the output of "cvs diff" at the top level, so it's all in there. ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-02 13:34 Message: Logged In: YES user_id=6656 Here's an update (just for ceval.c): - moves tracing code out of line - more expository comments - don't cause line trace events in the function epilogue ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-01 15:43 Message: Logged In: YES user_id=6656 Hang on, let's get all the doc changes: Doc/lib/libdis.tex Doc/tut/tut.tex Doc/whatsnew/whatsnew23.tex Misc/NEWS ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-01 15:41 Message: Logged In: YES user_id=6656 And finally, docs. This touches: Doc/lib/libdis.tex Doc/tut/tut.tex ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-01 15:40 Message: Logged In: YES user_id=6656 Here's the "boring" patch, more mechanical stuff. It touches: Include/opcode.h Lib/traceback.py Modules/_hotshot.c Python/frozen.c Python/traceback.c This should still be reviewed, of course, but I really shouldn't have messed any of this up. ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-01 15:38 Message: Logged In: YES user_id=6656 Yeah, the obvious response to that was "delete your .pycs"... Right, I'm going to upload my latest, and hopefully final patch in three bits. First, I'm attaching the "interesting" patch. 
This touches: Objects/frameobject.c -- adds f_lineno descr Python/ceval.c -- the obvious Python/compile.c -- don't emit SET_LINENO Lib/dis.py -- use c_lnotab for line numbers Tools/scripts/trace.py -- ditto This is the most subtle patch, and the one that I'd most like review on. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-07-30 22:09 Message: Logged In: YES user_id=6380 Never mind the errors, I hadn't done a cvs update in weeks on the machine where I tested it. :-( ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-07-30 21:54 Message: Logged In: YES user_id=6380 Cool idea, but I get "unknown opcode" errors... Keep me posted though! (I will now see any changes to this patch item.) ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-07-30 09:59 Message: Logged In: YES user_id=6656 Guido should see this, assuming he still isn't subscribed to patches@python.org. ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-07-30 09:58 Message: Logged In: YES user_id=6656 I worked out why some of the code in ceval.c wasn't making sense to me -- it didn't make sense, period. I've also fixed a number of silly and not so silly bugs in my patch. I'm now 99% certain this idea can fly. The patch isn't *finished* but the hard bit is done, IMHO. There are some other points to make, but I think I'll raise them on python-dev. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-07-29 20:18 Message: Logged In: YES user_id=31435 Dropping out of "vacation mode" long enough to say "mondo cool!" and encourage this. Guido may not agree, but I also encourage you to redefine c_lnotab if it can make life easier and quicker here. 
That subtle compression scheme has been the source of several nasty bugs, both in the core C code and in Jeremy's compiler pkg (cut 'n paste bugs abound here, because few people understand what's really needed, so flawed code gets copied with little thought). ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-07-29 15:34 Message: Logged In: YES user_id=6656 Uhh, the instr_[lu]b variables need to keep their values around the loop; otherwise we might just as well call PyCode_Addr2Line each time around. I have another version of the patch that does that, but I assumed the overhead of doing so was deemed too high, or someone else would have done this by now. It's certainly easier. Glad to hear it doesn't affect python -O too much. I was doing this away from the internet and forgot to keep a clean copy of the source around for doing comparisons with... ---------------------------------------------------------------------- Comment By: Neil Schemenauer (nascheme) Date: 2002-07-29 15:18 Message: Logged In: YES user_id=35752 Moving the "int io, instr_ub = -1, instr_lb = 0;" declaration and the "io = INSTR_OFFSET();" statement below the "if (tstate->c_tracefunc ...)" test gives a small speedup on my machine and is a little neater, IMHO. I was worried that this would slow down "python -O". That doesn't seem to be the case (at least I can't measure it). Well done. 
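The idea discussed in this thread (decode the line-number table, then cache the instruction-offset bounds of the current line so the table is consulted only when execution leaves that range) can be sketched in Python. This is only an illustration, not the C code from the patch: it assumes the 2.x-era table layout of unsigned (address increment, line increment) byte pairs, and line_starts, LineTracker and on_instruction are hypothetical names. It also simplifies the event rule to "fire whenever the offset leaves the cached range".

```python
def line_starts(lnotab, first_lineno):
    """Decode (addr_incr, line_incr) byte pairs into (offset, line) pairs.

    Assumes the old layout: unsigned bytes, one pair per table entry.
    """
    starts = []
    addr, line, last = 0, first_lineno, None
    for i in range(0, len(lnotab), 2):
        addr_incr, line_incr = lnotab[i], lnotab[i + 1]
        if addr_incr:
            if line != last:
                starts.append((addr, line))
                last = line
            addr += addr_incr
        line += line_incr
    if line != last:
        starts.append((addr, line))
    return starts


class LineTracker:
    """Hypothetical helper: caches the [lb, ub) offset range of the current line."""

    def __init__(self, lnotab, first_lineno, code_len):
        self.starts = line_starts(lnotab, first_lineno)
        self.code_len = code_len
        self.lb, self.ub = 0, -1   # empty range, so the first call does a lookup
        self.line = None
        self.lookups = 0           # counts how often the table is scanned

    def _lookup(self, offset):
        self.lookups += 1
        for idx, (start, line) in enumerate(self.starts):
            if idx + 1 < len(self.starts):
                end = self.starts[idx + 1][0]
            else:
                end = self.code_len
            if start <= offset < end:
                self.lb, self.ub, self.line = start, end, line
                return
        raise ValueError("offset %d is outside the code" % offset)

    def on_instruction(self, offset):
        """Return the line number if a line event is due, else None."""
        if self.lb <= offset < self.ub:
            return None            # still on the same line: no table scan
        self._lookup(offset)
        return self.line


tracker = LineTracker(bytes([6, 1, 8, 2]), first_lineno=10, code_len=20)
events = [tracker.on_instruction(off) for off in (0, 3, 6, 9, 12, 14, 17)]
print(events, tracker.lookups)
```

Because the bounds survive from one instruction to the next, the table is scanned once per line change rather than once per instruction, which is the point made about keeping instr_lb and instr_ub alive around the loop.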
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=587993&group_id=5470 From noreply@sourceforge.net Thu Aug 15 11:36:59 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 15 Aug 2002 03:36:59 -0700 Subject: [Patches] [ python-Patches-587993 ] SET_LINENO killer Message-ID: Patches item #587993, was opened at 2002-07-29 11:27 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=587993&group_id=5470 Category: Core (C code) Group: Python 2.3 Status: Open Resolution: Accepted Priority: 5 Submitted By: Michael Hudson (mwh) Assigned to: Michael Hudson (mwh) Summary: SET_LINENO killer Initial Comment: This patch is a proof-of-concept of another way to remove the SET_LINENO patch (as opposed to Vladimir's ancient one). Instead of rewriting bytecode (ick!) we poke into the c_lnotab to see if we've moved onto a different line. The c_lnotab is not the most transparent of data structures, it has to be said. I'm not sure this patch is 100% correct -- but I think the idea can definitely fly. There will be some more overhead to tracing than before, but I hope not too much. I haven't tested these aspects. Comments welcome! ---------------------------------------------------------------------- >Comment By: Michael Hudson (mwh) Date: 2002-08-15 10:36 Message: Logged In: YES user_id=6656 OK, update. ceval.c: rewrote the comments about the exceptions for POP_TOP and RETURN_NONE somewhat. Fixed up the lltrace test. compile.c: use RETURN_NONE a touch more freely. Remove a com_pop that shouldn't have been there anymore. whatsnew23.tex: expanded, clarified, moved from "Other Language Changes" to "Other Changes And Fixes". Not sure this is the right section either... opcode.h, libdis.tex: point people interested in RETURN_NONE to the comments in ceval.c. It will be nice to stop having to de-conflict Misc/NEWS every day... 
---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-15 09:23 Message: Logged In: YES user_id=6656 Yeah, the whatsnew section needs expanding. I'll get to this, but it probably shouldn't hold up the rest of the patch. ceval.c: oversight. Good spot. There's an attempt at explaining the restrictions on RETURN_NONE in opcode.h, but it's very short. I'll expand the comments in maybe_call_line_trace and add pointers to this in the dis docs and other places. ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-08-14 23:08 Message: Logged In: YES user_id=21627 Random comments: - whatsnew23.tex: replace VERSION - ceval.c; lltrace: Why does it drop the comparison with NULL: lltrace is int, not PyObject* - it appears that RETURN_NONE does more than just returning None; it also suppresses trace calls. This should be carefully documented, or else somebody might suggest to generate RETURN_NONE for a plain return statement. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-14 20:16 Message: Logged In: YES user_id=6380 Looks good to me. Michael, why don't you hang onto it for another day and then check it in unless someone speaks out against it? ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-05 14:42 Message: Logged In: YES user_id=6656 That's good to know :) No great hurry with the review. I wanted to finish the patch while everything was still fresh in my mind, and I think I've done this now. Of course, I've said that before... ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-05 14:39 Message: Logged In: YES user_id=6380 That's pretty much what I suggested, yes. I'll have to take more time to review all this code... 
:-( ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-05 14:29 Message: Logged In: YES user_id=6656 It was discussed on python-dev: http://mail.python.org/pipermail/python-dev/2002-August/027261.html and followups. Adding a new opcode was Guido's idea, but I'm not totally sure this is what he meant. Also see the comments around line 2933 of ceval.c. ---------------------------------------------------------------------- Comment By: Neil Schemenauer (nascheme) Date: 2002-08-05 14:23 Message: Logged In: YES user_id=35752 Why was the RETURN_NONE opcode added? ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-05 12:20 Message: Logged In: YES user_id=6656 Here's another update. I've also deleted the old patches. Changes this time: + doesn't include pystone LOOPS boost + finish RETURN_NONE off: - add to dis.py - add to libdis.tex + removed now-unnecessary co_lnotab grovelling in inspect and traceback. left the functions for compatibility. + removed the WE_WANT_THE_HACK hack + includes Neal's test_hotshot patch + more trace.py tweaks. Lib/compiler is still broken. ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-05 11:56 Message: Logged In: YES user_id=33168 Re-assigning to Guido. I think what happened was I was reviewing the patch before it was assigned to Guido. Sorry about that. You're right about the set_lineno, I thought the opcode was also generated in there. I agree about the code reuse--it's probably too hard. If I had a better idea, I would have shared it. :-( ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-05 09:09 Message: Logged In: YES user_id=6656 Thanks for the comments, Neal! I'm not sure it's possible to separate the changes in the way you describe. 
For instance, com_addoparg -> com_set_lineno stops SET_LINENO being generated, so breaks tracing without the VM changes, but the VM changes make SET_LINENO into an unknown opcode... I didn't intend to upload the pystone change. About the comments: - the last little RETURN_NONE changes are easy. Thanks for the pointer. - agree there are too many bits of code grovelling co_lnotab. however, it's not clear that they can easily be refactored. maybe a generator would be useful... hmm. let me think about this one. - did I take out the initialisation of frame.f_lineno entirely? oops. probably set it to co_firstlineno. - fixed trace.py docstring. reusing code not all that easy, because different uses have subtly different requirements. hmm, inspect.getlineno is now pointless... was there any particular reason you unassigned the patch? Just curious as the main reason it was assigned to Guido was to make sure he saw it. ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-04 19:50 Message: Logged In: YES user_id=33168 Attached is a patch to test_hotshot which is necessary because the duplicate line #s at the beginning of the function no longer exist. WIth this patch test_hotshot passes. ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-03 20:06 Message: Logged In: YES user_id=33168 Overall, the patch looks very good. 
It would be nice to check in some of the changes separate from this patch: * com_addoparg(SET_LINENO) -> com_set_lineno() * the updated comments and whitespace * pystone * dis.py, disassemble(), perhaps * RETURN_NONE, perhaps Here's other minor comments: * Need to add RETURN_NONE opcode into dis.py * Need to add doc for RETURN_NONE in libdis.tex * It would be nice if the code from inspect.getlineno could be reused in disassemble() (not sure this can be done, though--i saw at least 4 variations of this code, 2 in python and 2 in C) * Should frame.f_lineno be initialized to 0, -1 or something invalid? * scripts/trace.py reimplements getting the lineno, why not reuse code? * docstring in scripts/trace._find_LINENO_from_code needs to remove ref to SET_LINENO ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-03 15:52 Message: Logged In: YES user_id=6656 Update. New this time: - adds RETURN_NONE opcode for the epilogue. - don't call line trace events on it. - bumps MAGIC this is the output of "cvs diff" at the top level, so it's all in there. ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-02 13:34 Message: Logged In: YES user_id=6656 Here's an update (just for ceval.c): - moves tracing code out of line - more expository comments - don't cause line trace events in the function epilogue ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-01 15:43 Message: Logged In: YES user_id=6656 Hang on, let's get all the doc changes: Doc/lib/libdis.tex Doc/tut/tut.tex Doc/whatsnew/whatsnew23.tex Misc/NEWS ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-01 15:41 Message: Logged In: YES user_id=6656 And finally, docs. 
This touches: Doc/lib/libdis.tex Doc/tut/tut.tex ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-01 15:40 Message: Logged In: YES user_id=6656 Here's the "boring" patch, more mechanical stuff. It touches: Include/opcode.h Lib/traceback.py Modules/_hotshot.c Python/frozen.c Python/traceback.c This should still be reviewed, of course, but I really shouldn't have messed any of this up. ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-01 15:38 Message: Logged In: YES user_id=6656 Yeah, the obvious response to that was "delete your .pycs"... Right, I'm going to upload my latest, and hopefully final patch in three bits. First, I'm attaching the "interesting" patch. This touches: Objects/frameobject.c -- adds f_lineno descr Python/ceval.c -- the obvious Python/compile.c -- don't emit SET_LINENO Lib/dis.py -- use c_lnotab for line numbers Tools/scripts/trace.py -- ditto This is the most subtle patch, and the one that I'd most like review on. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-07-30 22:09 Message: Logged In: YES user_id=6380 Never mind the errors, I hadn't done a cvs update in weeks on the machine where I tested it. :-( ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-07-30 21:54 Message: Logged In: YES user_id=6380 Cool idea, but I get "unknown opcode" errors... Keep me posted though! (I will now see any changes to this patch item.) ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-07-30 09:59 Message: Logged In: YES user_id=6656 Guido should see this, assuming he still isn't subscribed to patches@python.org. 
---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-07-30 09:58 Message: Logged In: YES user_id=6656 I worked out why some of the code in ceval.c wasn't making sense to me -- it didn't make sense, period. I've also fixed a number of silly and not so silly bugs in my patch. I'm now 99% certain this idea can fly. The patch isn't *finished* but the hard bit is done, IMHO. There are some other points to make, but I think I'll raise them on python-dev. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-07-29 20:18 Message: Logged In: YES user_id=31435 Dropping out of "vacation mode" long enough to say "mondo cool!" and encourage this. Guido may not agree, but I also encourage you to redefine c_lnotab if it can make life easier and quicker here. That subtle compression scheme has been the source of several nasty bugs, both in the core C code and in Jeremy's compiler pkg (cut 'n paste bugs abound here, because few people understand what's really needed, so flawed code gets copied with little thought). ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-07-29 15:34 Message: Logged In: YES user_id=6656 Uhh, the instr_[lu]b variables need to keep their values around the loop; otherwise we might just as well call PyCode_Addr2Line each time around. I have another version of the patch that does that, but I assumed the overhead of doing so was deemed too high, or someone else would have done this by now. It's certainly easier. Glad to hear it doesn't affect python -O too much. I was doing this away from the internet and forgot to keep a clean copy of the source around for doing comparisons with... 
---------------------------------------------------------------------- Comment By: Neil Schemenauer (nascheme) Date: 2002-07-29 15:18 Message: Logged In: YES user_id=35752 Moving the "int io, instr_ub = -1, instr_lb = 0;" declaration and the "io = INSTR_OFFSET();" statement below the "if (tstate->c_tracefunc ...)" test gives a small speedup on my machine and is a little neater, IMHO. I was worried that this would slow down "python -O". That doesn't seem to be the case (at least I can't measure it). Well done. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=587993&group_id=5470 From noreply@sourceforge.net Thu Aug 15 13:00:51 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 15 Aug 2002 05:00:51 -0700 Subject: [Patches] [ python-Patches-593627 ] Static names Message-ID: Patches item #593627, was opened at 2002-08-11 06:41 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=593627&group_id=5470 Category: Core (C code) Group: None Status: Open Resolution: None Priority: 5 Submitted By: Oren Tirosh (orenti) >Assigned to: Guido van Rossum (gvanrossum) Summary: Static names Initial Comment: This patch creates static string objects for all built-in names and interns them on initialization. The macro PyNAME is used to access static names. PyNAME(__spam__) is equivalent to PyString_InternFromString("__spam__") but is a constant expression. It requires the name to be one of the built-in names. A linker error will be generated if it isn't. Most conversions of C strings into temporary string objects can be eliminated (PyString_FromString, PyString_InternFromString). Most string comparisons at runtime can also be eliminated. Instead of: if (strcmp(PyString_AsString(name), "__spam__")) ... This code can be used: PyString_INTERN(name) if (name == PyNAME(__spam__)) ... 
Where PyString_INTERN is a fast inline check if the string is already interned (and it usually is). To prevent unbounded accumulation of interned strings the mortal interned string patch should also be applied. The patch converts most of the builtin module to this new mode as an example. ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-08-12 08:48 Message: Logged In: YES user_id=21627 It is difficult to see how this patch achieves the goal of readability: - many variables are not used at all, e.g. ArithmethicError; it is not clear how using them would improve readability. - the change from SETBUILTIN("None", Py_None); to SETBUILTIN(PyNAME(None), Py_None); makes it more difficult to read, not easier. Furthermore, the name "None" isn't used anywhere except this initialisation. - likewise, the changes from {"abs", builtin_abs, METH_O, abs_doc}, to {PyNAMEC(abs), builtin_abs, METH_O, abs_doc}, make the code harder to read. ---------------------------------------------------------------------- Comment By: Oren Tirosh (orenti) Date: 2002-08-12 08:10 Message: Logged In: YES user_id=562624 If these changes are applied throughout the interpreter I expect a significant speedup but that is not my immediate goal. I am looking for readability, reduction of code size (both source and binary) and reliability (less things to check or forget to check). I am trying to get rid of code like if (docstr == NULL) { docstr= PyString_InternFromString("__doc__"); if (docstr == NULL) return NULL; And replace it with just PyNAME(__doc__) ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-08-12 03:43 Message: Logged In: YES user_id=21627 What is the rationale for this patch? If it is for performance, what real speed improvements can you report? 
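The core of the static-names proposal (intern each well-known name once, then compare by identity instead of by content) can be illustrated at the Python level with sys.intern. This is only an analogy under stated assumptions, not the C macro from the patch: py_name, is_spam and the _STATIC table are hypothetical, and the LookupError merely stands in for the link-time failure the patch describes for unknown names.

```python
import sys

# Hypothetical stand-in for the patch's static string objects:
# each registered name is interned exactly once, up front.
_BUILTIN_NAMES = ("__doc__", "__spam__", "None", "abs")
_STATIC = {name: sys.intern(name) for name in _BUILTIN_NAMES}


def py_name(name):
    """Return the pre-interned object for a registered name.

    Unknown names are rejected, loosely mirroring the linker error
    described for PyNAME (here it can only happen at run time).
    """
    if name not in _STATIC:
        raise LookupError("not a registered static name: %r" % (name,))
    return _STATIC[name]


def is_spam(name):
    """Identity test after interning, instead of a content comparison."""
    return sys.intern(name) is py_name("__spam__")


# A string built at run time is a distinct object until it is interned.
runtime_name = "".join(["__sp", "am__"])
print(is_spam(runtime_name), is_spam("__doc__"))
```

Once both sides are interned, the check is a single pointer comparison, which is the saving the initial comment is after when it replaces strcmp with ==.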
---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-11 11:54 Message: Logged In: YES user_id=33168 I'd like to see the releasing interned string patch applied. I think it's almost ready, isn't it? It would make patching easier and seems to be a good idea. For me, the easiest way to produce patches is to use cvs. You can keep multiple cvs trees around easy enough (for having multiple overlapping/independent patches). To create patches with cvs: cvs diff -C 5 [file1] [file2] .... ---------------------------------------------------------------------- Comment By: Oren Tirosh (orenti) Date: 2002-08-11 11:44 Message: Logged In: YES user_id=562624 Ok, I'll fix the problems with the patch. What's the best way to produce a patch that adds new files? Static string objects cannot be released, of course. This patch will eventually depend on the mortal interned strings patch to fix it but in the meantime I just disabled releasing interned strings because I want to keep the two patches independent. The next version will add a new PyArg_ParseTuple format for an interned string to make it easier to use. ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-11 10:46 Message: Logged In: YES user_id=33168 Couple of initial comments: * this is a reverse patch * it seems like there are other changes in here - int ob_shash -> long - releasing interned strings? * dictobject.c is removed? * including python headers should use "" not <> Oren, could you generate a new patch with only the changes to support PyNAME? Thanks! 
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=593627&group_id=5470 From noreply@sourceforge.net Thu Aug 15 14:37:04 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 15 Aug 2002 06:37:04 -0700 Subject: [Patches] [ python-Patches-492105 ] Import from Zip archive Message-ID: Patches item #492105, was opened at 2001-12-12 17:21 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=492105&group_id=5470 Category: Core (C code) Group: Python 2.3 Status: Open Resolution: Out of Date Priority: 5 Submitted By: James C. Ahlstrom (ahlstromjc) Assigned to: Neal Norwitz (nnorwitz) Summary: Import from Zip archive Initial Comment: This is the "final" patch to support imports from zip archives, and directory caching using os.listdir(). It replaces patch 483466 and 476047. It is a separate patch since I can't delete file attachments. It adds support for importing from "" and from relative paths. ---------------------------------------------------------------------- >Comment By: James C. Ahlstrom (ahlstromjc) Date: 2002-08-15 13:37 Message: Logged In: YES user_id=64929 This patch is old. I can provide a new patch against Python 2.2.1 if that would help. Or a new patch just for import.c against 2.2.1. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-15 02:40 Message: Logged In: YES user_id=6380 No, the failing hunk is included in dashc-2. I actually *edited* the patch file until all but that one hunk succeeded. Thanks for looking into this! If you need help don't fail to ask on python-dev, I read it daily. :-) ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-15 01:38 Message: Logged In: YES user_id=33168 Reassigning to me. I'll give it a shot. It will take a while. 
Do I understand you correctly that your updated patch (dashc-2) has all the necessary pieces, except for the import hunk 11 that was rejected? And I need to get that failed hunk from the original patch? ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-15 01:32 Message: Logged In: YES user_id=6380 Whoa... That's a lot. Neal, do you think you could come up with a reworked version of this? That would be great! ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-14 23:30 Message: Logged In: YES user_id=33168 I've reviewed most of the updated version. Quick summary: * strncpy() doesn't null terminate strings, this needs to be added * seems to be many leaked references Here are more details: * strncpy(dest, str, size) doesn't guarantee the result to be null terminated, need dest[size] = '\0'; after all strncpy()s (I see a bunch in getpath.c) * getpath.c:get_sys_path_0(), I think some compilers warn when you do char[i] = 0; (use '\0' instead) (there may be other places of this) * import.c:PyImport_InitZip(), have_zlib is an alias to zlib and isn't really helpful/necessary * import.c:get_path_type() can leak a reference to pyobj if it's an int * import.c:add_directory_names() pylist reference is leaked * import.c:add_zip_names(), memcmp(path+i, ".zip", 4) is clearer to me than path[i] == '.' .... * import.c:add_zip_names, there's a lot of magic #s in this function * " " : leak refs when doing PyTuple_SetItem I think there were other leaked references. I'll try to spend some more time later. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-14 20:50 Message: Logged In: YES user_id=6380 Sigh. We waited too long for this one, and now the patch is hopelessly out of date. 
I managed to fix most of the failing hunks, but the remaining hunk that fails (#11 in import.c) is a whopper: the saved import.c.rej is 270 lines long. I'm going to sleep on that, but I could use some help. In the mean time, I'm uploading an edited version of dashc.diff to help the next person. ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-07-28 10:54 Message: Logged In: YES user_id=21627 Is this patch ready to be applied? ---------------------------------------------------------------------- Comment By: Jeremy Hylton (jhylton) Date: 2002-06-12 15:05 Message: Logged In: YES user_id=31392 Deleting the old diffs that Jim couldn't delete. ---------------------------------------------------------------------- Comment By: James C. Ahlstrom (ahlstromjc) Date: 2002-03-15 17:27 Message: Logged In: YES user_id=64929 I added a diff -c version of the patch. ---------------------------------------------------------------------- Comment By: James C. Ahlstrom (ahlstromjc) Date: 2002-03-15 17:03 Message: Logged In: YES user_id=64929 I still can't delete files, but I added a new file which contains all diffs as a single file, and is made from the current CVS tree (Mar 15, 2002). ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=492105&group_id=5470 From noreply@sourceforge.net Thu Aug 15 15:09:21 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 15 Aug 2002 07:09:21 -0700 Subject: [Patches] [ python-Patches-492105 ] Import from Zip archive Message-ID: Patches item #492105, was opened at 2001-12-12 12:21 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=492105&group_id=5470 Category: Core (C code) Group: Python 2.3 Status: Open Resolution: Out of Date Priority: 5 Submitted By: James C. 
Ahlstrom (ahlstromjc) Assigned to: Neal Norwitz (nnorwitz) Summary: Import from Zip archive Initial Comment: This is the "final" patch to support imports from zip archives, and directory caching using os.listdir(). It replaces patch 483466 and 476047. It is a separate patch since I can't delete file attachments. It adds support for importing from "" and from relative paths. ---------------------------------------------------------------------- >Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-15 10:09 Message: Logged In: YES user_id=33168 James, could you look at what Guido reworked? If that is fine, I can push it forward. Otherwise, feel free to update the patch. If I do any work on it, I'll make comments here so we don't duplicate work. Thanks. ---------------------------------------------------------------------- Comment By: James C. Ahlstrom (ahlstromjc) Date: 2002-08-15 09:37 Message: Logged In: YES user_id=64929 This patch is old. I can provide a new patch against Python 2.2.1 if that would help. Or a new patch just for import.c against 2.2.1. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-14 22:40 Message: Logged In: YES user_id=6380 No, the failing hunk is included in dashc-2. I actually *edited* the patch file until all but that one hunk succeeded. Thanks for looking into this! If you need help don't fail to ask on python-dev, I read it daily. :-) ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-14 21:38 Message: Logged In: YES user_id=33168 Reassigning to me. I'll give it a shot. It will take a while. Do I understand you correctly that your updated patch (dashc-2) has all the necessary pieces, except for the import hunk 11 that was rejected? And I need to get that failed hunk from the original patch? 
---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-14 21:32 Message: Logged In: YES user_id=6380 Whoa... That's a lot. Neal, do you think you could come up with a reworked version of this? That would be great! ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-14 19:30 Message: Logged In: YES user_id=33168 I've reviewed most of the updated version. Quick summary: * strncpy() doesn't null terminate strings, this needs to be added * seems to be many leaked references Here are more details: * strncpy(dest, str, size) doesn't guarantee the result to be null terminated, need dest[size] = '\0'; after all strncpy()s (I see a bunch in getpath.c) * getpath.c:get_sys_path_0(), I think some compilers warn when you do char[i] = 0; (use '\0' instead) (there may be other places of this) * import.c:PyImport_InitZip(), have_zlib is an alias to zlib and isn't really helpful/necessary * import.c:get_path_type() can leak a reference to pyobj if it's an int * import.c:add_directory_names() pylist reference is leaked * import.c:add_zip_names(), memcmp(path+i, ".zip", 4) is clearer to me than path[i] == '.' .... * import.c:add_zip_names, there's a lot of magic #s in this function * " " : leak refs when doing PyTuple_SetItem I think there were other leaked references. I'll try to spend some more time later. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-14 16:50 Message: Logged In: YES user_id=6380 Sigh. We waited too long for this one, and now the patch is hopelessly out of date. I managed to fix most of the failing hunks, but the remaining hunk that fails (#11 in import.c) is a whopper: the saved import.c.rej is 270 lines long. I'm going to sleep on that, but I could use some help. 
In the meantime, I'm uploading an edited version of dashc.diff to help the next person. ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-07-28 06:54 Message: Logged In: YES user_id=21627 Is this patch ready to be applied? ---------------------------------------------------------------------- Comment By: Jeremy Hylton (jhylton) Date: 2002-06-12 11:05 Message: Logged In: YES user_id=31392 Deleting the old diffs that Jim couldn't delete. ---------------------------------------------------------------------- Comment By: James C. Ahlstrom (ahlstromjc) Date: 2002-03-15 12:27 Message: Logged In: YES user_id=64929 I added a diff -c version of the patch. ---------------------------------------------------------------------- Comment By: James C. Ahlstrom (ahlstromjc) Date: 2002-03-15 12:03 Message: Logged In: YES user_id=64929 I still can't delete files, but I added a new file which contains all diffs as a single file, and is made from the current CVS tree (Mar 15, 2002). ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=492105&group_id=5470 From noreply@sourceforge.net Thu Aug 15 15:20:55 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 15 Aug 2002 07:20:55 -0700 Subject: [Patches] [ python-Patches-492105 ] Import from Zip archive Message-ID: Patches item #492105, was opened at 2001-12-12 12:21 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=492105&group_id=5470 Category: Core (C code) Group: Python 2.3 Status: Open Resolution: Out of Date Priority: 5 Submitted By: James C. Ahlstrom (ahlstromjc) Assigned to: Neal Norwitz (nnorwitz) Summary: Import from Zip archive Initial Comment: This is the "final" patch to support imports from zip archives, and directory caching using os.listdir(). It replaces patch 483466 and 476047.
It is a separate patch since I can't delete file attachments. It adds support for importing from "" and from relative paths. ---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-15 10:20 Message: Logged In: YES user_id=6380 I think a new patch just for import.c would be helpful. ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-15 10:09 Message: Logged In: YES user_id=33168 James, could you look at what Guido reworked? If that is fine, I can push it forward. Otherwise, feel free to update the patch. If I do any work on it, I'll make comments here so we don't duplicate work. Thanks. ---------------------------------------------------------------------- Comment By: James C. Ahlstrom (ahlstromjc) Date: 2002-08-15 09:37 Message: Logged In: YES user_id=64929 This patch is old. I can provide a new patch against Python 2.2.1 if that would help. Or a new patch just for import.c against 2.2.1. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-14 22:40 Message: Logged In: YES user_id=6380 No, the failing hunk is included in dashc-2. I actually *edited* the patch file until all but that one hunk succeeded. Thanks for looking into this! If you need help don't fail to ask on python-dev, I read it daily. :-) ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-14 21:38 Message: Logged In: YES user_id=33168 Reassigning to me. I'll give it a shot. It will take a while. Do I understand you correctly that your updated patch (dashc-2) has all the necessary pieces, except for the import hunk 11 that was rejected? And I need to get that failed hunk from the original patch? 
---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-14 21:32 Message: Logged In: YES user_id=6380 Whoa... That's a lot. Neal, do you think you could come up with a reworked version of this? That would be great! ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-14 19:30 Message: Logged In: YES user_id=33168 I've reviewed most of the updated version. Quick summary: * strncpy() doesn't null terminate strings, this needs to be added * there seem to be many leaked references Here are more details: * strncpy(dest, str, size) doesn't guarantee the result to be null terminated, need dest[size] = '\0'; after all strncpy()s (I see a bunch in getpath.c) * getpath.c:get_sys_path_0(), I think some compilers warn when you do char[i] = 0; (use '\0' instead) (there may be other places of this) * import.c:PyImport_InitZip(), have_zlib is an alias to zlib and isn't really helpful/necessary * import.c:get_path_type() can leak a reference to pyobj if it's an int * import.c:add_directory_names() pylist reference is leaked * import.c:add_zip_names(), memcmp(path+i, ".zip", 4) is clearer to me than path[i] == '.' .... * import.c:add_zip_names, there's a lot of magic #s in this function * " " : leak refs when doing PyTuple_SetItem I think there were other leaked references. I'll try to spend some more time later. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-14 16:50 Message: Logged In: YES user_id=6380 Sigh. We waited too long for this one, and now the patch is hopelessly out of date. I managed to fix most of the failing hunks, but the remaining hunk that fails (#11 in import.c) is a whopper: the saved import.c.rej is 270 lines long. I'm going to sleep on that, but I could use some help.
In the meantime, I'm uploading an edited version of dashc.diff to help the next person. ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-07-28 06:54 Message: Logged In: YES user_id=21627 Is this patch ready to be applied? ---------------------------------------------------------------------- Comment By: Jeremy Hylton (jhylton) Date: 2002-06-12 11:05 Message: Logged In: YES user_id=31392 Deleting the old diffs that Jim couldn't delete. ---------------------------------------------------------------------- Comment By: James C. Ahlstrom (ahlstromjc) Date: 2002-03-15 12:27 Message: Logged In: YES user_id=64929 I added a diff -c version of the patch. ---------------------------------------------------------------------- Comment By: James C. Ahlstrom (ahlstromjc) Date: 2002-03-15 12:03 Message: Logged In: YES user_id=64929 I still can't delete files, but I added a new file which contains all diffs as a single file, and is made from the current CVS tree (Mar 15, 2002). ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=492105&group_id=5470 From noreply@sourceforge.net Thu Aug 15 15:44:07 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 15 Aug 2002 07:44:07 -0700 Subject: [Patches] [ python-Patches-492105 ] Import from Zip archive Message-ID: Patches item #492105, was opened at 2001-12-12 17:21 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=492105&group_id=5470 Category: Core (C code) Group: Python 2.3 Status: Open Resolution: Out of Date Priority: 5 Submitted By: James C. Ahlstrom (ahlstromjc) Assigned to: Neal Norwitz (nnorwitz) Summary: Import from Zip archive Initial Comment: This is the "final" patch to support imports from zip archives, and directory caching using os.listdir(). It replaces patch 483466 and 476047.
It is a separate patch since I can't delete file attachments. It adds support for importing from "" and from relative paths. ---------------------------------------------------------------------- >Comment By: James C. Ahlstrom (ahlstromjc) Date: 2002-08-15 14:44 Message: Logged In: YES user_id=64929 Here is the import.c diff -c against Python-2.2.1. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-15 14:20 Message: Logged In: YES user_id=6380 I think a new patch just for import.c would be helpful. ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-15 14:09 Message: Logged In: YES user_id=33168 James, could you look at what Guido reworked? If that is fine, I can push it forward. Otherwise, feel free to update the patch. If I do any work on it, I'll make comments here so we don't duplicate work. Thanks. ---------------------------------------------------------------------- Comment By: James C. Ahlstrom (ahlstromjc) Date: 2002-08-15 13:37 Message: Logged In: YES user_id=64929 This patch is old. I can provide a new patch against Python 2.2.1 if that would help. Or a new patch just for import.c against 2.2.1. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-15 02:40 Message: Logged In: YES user_id=6380 No, the failing hunk is included in dashc-2. I actually *edited* the patch file until all but that one hunk succeeded. Thanks for looking into this! If you need help don't fail to ask on python-dev, I read it daily. :-) ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-15 01:38 Message: Logged In: YES user_id=33168 Reassigning to me. I'll give it a shot. It will take a while. 
Do I understand you correctly that your updated patch (dashc-2) has all the necessary pieces, except for the import hunk 11 that was rejected? And I need to get that failed hunk from the original patch? ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-15 01:32 Message: Logged In: YES user_id=6380 Whoa... That's a lot. Neal, do you think you could come up with a reworked version of this? That would be great! ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-14 23:30 Message: Logged In: YES user_id=33168 I've reviewed most of the updated version. Quick summary: * strncpy() doesn't null terminate strings, this needs to be added * there seem to be many leaked references Here are more details: * strncpy(dest, str, size) doesn't guarantee the result to be null terminated, need dest[size] = '\0'; after all strncpy()s (I see a bunch in getpath.c) * getpath.c:get_sys_path_0(), I think some compilers warn when you do char[i] = 0; (use '\0' instead) (there may be other places of this) * import.c:PyImport_InitZip(), have_zlib is an alias to zlib and isn't really helpful/necessary * import.c:get_path_type() can leak a reference to pyobj if it's an int * import.c:add_directory_names() pylist reference is leaked * import.c:add_zip_names(), memcmp(path+i, ".zip", 4) is clearer to me than path[i] == '.' .... * import.c:add_zip_names, there's a lot of magic #s in this function * " " : leak refs when doing PyTuple_SetItem I think there were other leaked references. I'll try to spend some more time later. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-14 20:50 Message: Logged In: YES user_id=6380 Sigh. We waited too long for this one, and now the patch is hopelessly out of date.
I managed to fix most of the failing hunks, but the remaining hunk that fails (#11 in import.c) is a whopper: the saved import.c.rej is 270 lines long. I'm going to sleep on that, but I could use some help. In the meantime, I'm uploading an edited version of dashc.diff to help the next person. ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-07-28 10:54 Message: Logged In: YES user_id=21627 Is this patch ready to be applied? ---------------------------------------------------------------------- Comment By: Jeremy Hylton (jhylton) Date: 2002-06-12 15:05 Message: Logged In: YES user_id=31392 Deleting the old diffs that Jim couldn't delete. ---------------------------------------------------------------------- Comment By: James C. Ahlstrom (ahlstromjc) Date: 2002-03-15 17:27 Message: Logged In: YES user_id=64929 I added a diff -c version of the patch. ---------------------------------------------------------------------- Comment By: James C. Ahlstrom (ahlstromjc) Date: 2002-03-15 17:03 Message: Logged In: YES user_id=64929 I still can't delete files, but I added a new file which contains all diffs as a single file, and is made from the current CVS tree (Mar 15, 2002).
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=492105&group_id=5470 From noreply@sourceforge.net Thu Aug 15 15:50:20 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 15 Aug 2002 07:50:20 -0700 Subject: [Patches] [ python-Patches-587993 ] SET_LINENO killer Message-ID: Patches item #587993, was opened at 2002-07-29 07:27 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=587993&group_id=5470 Category: Core (C code) Group: Python 2.3 Status: Open Resolution: Accepted Priority: 5 Submitted By: Michael Hudson (mwh) Assigned to: Michael Hudson (mwh) Summary: SET_LINENO killer Initial Comment: This patch is a proof-of-concept of another way to remove the SET_LINENO patch (as opposed to Vladimir's ancient one). Instead of rewriting bytecode (ick!) we poke into the c_lnotab to see if we've moved onto a different line. The c_lnotab is not the most transparent of data structures, it has to be said. I'm not sure this patch is 100% correct -- but I think the idea can definitely fly. There will be some more overhead to tracing than before, but I hope not too much. I haven't tested these aspects. Comments welcome! ---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-15 10:50 Message: Logged In: YES user_id=6380 Michael, please check this in. We can perfect it more easily when it's in CVS, and it'll get more eyeballs that way too. What's here looks very good to me. ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-15 06:36 Message: Logged In: YES user_id=6656 OK, update. ceval.c: rewrote the comments about the exceptions for POP_TOP and RETURN_NONE somewhat. Fixed up the lltrace test. compile.c: use RETURN_NONE a touch more freely.
Remove a com_pop that shouldn't have been there anymore. whatsnew23.tex: expanded, clarified, moved from "Other Language Changes" to "Other Changes And Fixes". Not sure this is the right section either... opcode.h, libdis.tex: point people interested in RETURN_NONE to the comments in ceval.c. It will be nice to stop having to de-conflict Misc/NEWS every day... ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-15 05:23 Message: Logged In: YES user_id=6656 Yeah, the whatsnew section needs expanding. I'll get to this, but it probably shouldn't hold up the rest of the patch. ceval.c: oversight. Good spot. There's an attempt at explaining the restrictions on RETURN_NONE in opcode.h, but it's very short. I'll expand the comments in maybe_call_line_trace and add pointers to this in the dis docs and other places. ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-08-14 19:08 Message: Logged In: YES user_id=21627 Random comments: - whatsnew23.tex: replace VERSION - ceval.c; lltrace: Why does it drop the comparison with NULL: lltrace is int, not PyObject* - it appears that RETURN_NONE does more than just returning None; it also suppresses trace calls. This should be carefully documented, or else somebody might suggest to generate RETURN_NONE for a plain return statement. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-14 16:16 Message: Logged In: YES user_id=6380 Looks good to me. Michael, why don't you hang onto it for another day and then check it in unless someone speaks out against it? ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-05 10:42 Message: Logged In: YES user_id=6656 That's good to know :) No great hurry with the review.
I wanted to finish the patch while everything was still fresh in my mind, and I think I've done this now. Of course, I've said that before... ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-05 10:39 Message: Logged In: YES user_id=6380 That's pretty much what I suggested, yes. I'll have to take more time to review all this code... :-( ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-05 10:29 Message: Logged In: YES user_id=6656 It was discussed on python-dev: http://mail.python.org/pipermail/python-dev/2002-August/027261.html and followups. Adding a new opcode was Guido's idea, but I'm not totally sure this is what he meant. Also see the comments around line 2933 of ceval.c. ---------------------------------------------------------------------- Comment By: Neil Schemenauer (nascheme) Date: 2002-08-05 10:23 Message: Logged In: YES user_id=35752 Why was the RETURN_NONE opcode added? ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-05 08:20 Message: Logged In: YES user_id=6656 Here's another update. I've also deleted the old patches. Changes this time: + doesn't include pystone LOOPS boost + finish RETURN_NONE off: - add to dis.py - add to libdis.tex + removed now-unnecessary co_lnotab grovelling in inspect and traceback. left the functions for compatibility. + removed the WE_WANT_THE_HACK hack + includes Neal's test_hotshot patch + more trace.py tweaks. Lib/compiler is still broken. ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-05 07:56 Message: Logged In: YES user_id=33168 Re-assigning to Guido. I think what happened was I was reviewing the patch before it was assigned to Guido. Sorry about that. You're right about the set_lineno, I thought the opcode was also generated in there.
I agree about the code reuse--it's probably too hard. If I had a better idea, I would have shared it. :-( ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-05 05:09 Message: Logged In: YES user_id=6656 Thanks for the comments, Neal! I'm not sure it's possible to separate the changes in the way you describe. For instance, com_addoparg -> com_set_lineno stops SET_LINENO being generated, so breaks tracing without the VM changes, but the VM changes make SET_LINENO into an unknown opcode... I didn't intend to upload the pystone change. About the comments: - the last little RETURN_NONE changes are easy. Thanks for the pointer. - agree there are too many bits of code grovelling co_lnotab. however, it's not clear that they can easily be refactored. maybe a generator would be useful... hmm. let me think about this one. - did I take out the initialisation of frame.f_lineno entirely? oops. probably set it to co_firstlineno. - fixed trace.py docstring. reusing code not all that easy, because different uses have subtly different requirements. hmm, inspect.getlineno is now pointless... was there any particular reason you unassigned the patch? Just curious as the main reason it was assigned to Guido was to make sure he saw it. ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-04 15:50 Message: Logged In: YES user_id=33168 Attached is a patch to test_hotshot which is necessary because the duplicate line #s at the beginning of the function no longer exist. With this patch test_hotshot passes. ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-03 16:06 Message: Logged In: YES user_id=33168 Overall, the patch looks very good.
It would be nice to check in some of the changes separate from this patch: * com_addoparg(SET_LINENO) -> com_set_lineno() * the updated comments and whitespace * pystone * dis.py, disassemble(), perhaps * RETURN_NONE, perhaps Here's other minor comments: * Need to add RETURN_NONE opcode into dis.py * Need to add doc for RETURN_NONE in libdis.tex * It would be nice if the code from inspect.getlineno could be reused in disassemble() (not sure this can be done, though--I saw at least 4 variations of this code, 2 in python and 2 in C) * Should frame.f_lineno be initialized to 0, -1 or something invalid? * scripts/trace.py reimplements getting the lineno, why not reuse code? * docstring in scripts/trace._find_LINENO_from_code needs to remove ref to SET_LINENO ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-03 11:52 Message: Logged In: YES user_id=6656 Update. New this time: - adds RETURN_NONE opcode for the epilogue. - don't call line trace events on it. - bumps MAGIC this is the output of "cvs diff" at the top level, so it's all in there. ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-02 09:34 Message: Logged In: YES user_id=6656 Here's an update (just for ceval.c): - moves tracing code out of line - more expository comments - don't cause line trace events in the function epilogue ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-01 11:43 Message: Logged In: YES user_id=6656 Hang on, let's get all the doc changes: Doc/lib/libdis.tex Doc/tut/tut.tex Doc/whatsnew/whatsnew23.tex Misc/NEWS ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-01 11:41 Message: Logged In: YES user_id=6656 And finally, docs.
This touches: Doc/lib/libdis.tex Doc/tut/tut.tex ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-01 11:40 Message: Logged In: YES user_id=6656 Here's the "boring" patch, more mechanical stuff. It touches: Include/opcode.h Lib/traceback.py Modules/_hotshot.c Python/frozen.c Python/traceback.c This should still be reviewed, of course, but I really shouldn't have messed any of this up. ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-01 11:38 Message: Logged In: YES user_id=6656 Yeah, the obvious response to that was "delete your .pycs"... Right, I'm going to upload my latest, and hopefully final patch in three bits. First, I'm attaching the "interesting" patch. This touches: Objects/frameobject.c -- adds f_lineno descr Python/ceval.c -- the obvious Python/compile.c -- don't emit SET_LINENO Lib/dis.py -- use c_lnotab for line numbers Tools/scripts/trace.py -- ditto This is the most subtle patch, and the one that I'd most like review on. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-07-30 18:09 Message: Logged In: YES user_id=6380 Never mind the errors, I hadn't done a cvs update in weeks on the machine where I tested it. :-( ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-07-30 17:54 Message: Logged In: YES user_id=6380 Cool idea, but I get "unknown opcode" errors... Keep me posted though! (I will now see any changes to this patch item.) ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-07-30 05:59 Message: Logged In: YES user_id=6656 Guido should see this, assuming he still isn't subscribed to patches@python.org.
---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-07-30 05:58 Message: Logged In: YES user_id=6656 I worked out why some of the code in ceval.c wasn't making sense to me -- it didn't make sense, period. I've also fixed a number of silly and not so silly bugs in my patch. I'm now 99% certain this idea can fly. The patch isn't *finished* but the hard bit is done, IMHO. There are some other points to make, but I think I'll raise them on python-dev. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-07-29 16:18 Message: Logged In: YES user_id=31435 Dropping out of "vacation mode" long enough to say "mondo cool!" and encourage this. Guido may not agree, but I also encourage you to redefine c_lnotab if it can make life easier and quicker here. That subtle compression scheme has been the source of several nasty bugs, both in the core C code and in Jeremy's compiler pkg (cut 'n paste bugs abound here, because few people understand what's really needed, so flawed code gets copied with little thought). ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-07-29 11:34 Message: Logged In: YES user_id=6656 Uhh, the instr_[lu]b variables need to keep their values around the loop; otherwise we might just as well call PyCode_Addr2Line each time around. I have another version of the patch that does that, but I assumed the overhead of doing so was deemed too high, or someone else would have done this by now. It's certainly easier. Glad to hear it doesn't affect python -O too much. I was doing this away from the internet and forgot to keep a clean copy of the source around for doing comparisons with... 
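The line-number table discussed in the comments above (written c_lnotab in this thread; the attribute on code objects is co_lnotab) may be easier to follow with a small sketch. Assuming the encoding in use at the time, a string of unsigned byte pairs (bytecode-address increment, line-number increment), the lookup that PyCode_Addr2Line performs looks roughly like the following. The function name addr2line and the sample table are made up for illustration; this is not code from the patch.

```python
def addr2line(lnotab, firstlineno, addrq):
    """Map a bytecode offset to a source line number.

    lnotab is assumed to be a bytes object of (addr_incr, line_incr)
    pairs, both unsigned, as in the 2002-era co_lnotab format.
    """
    line = firstlineno
    addr = 0
    # Walk the pairs, accumulating increments until we pass the
    # queried address; the line reached so far is the answer.
    for i in range(0, len(lnotab) - 1, 2):
        addr += lnotab[i]
        if addr > addrq:
            break
        line += lnotab[i + 1]
    return line


# Hypothetical table: the line changes at offsets 6 and 50.
sample = bytes([6, 1, 44, 5])
print(addr2line(sample, 10, 0))   # first line of the function
print(addr2line(sample, 10, 6))   # one line further on
print(addr2line(sample, 10, 50))  # five more lines further on
```

Scanning the table like this for every instruction is the overhead Michael refers to; keeping the address bounds of the current line in instr_lb and instr_ub across loop iterations presumably lets the interpreter skip the scan until execution leaves that range.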
---------------------------------------------------------------------- Comment By: Neil Schemenauer (nascheme) Date: 2002-07-29 11:18 Message: Logged In: YES user_id=35752 Moving the "int io, instr_ub = -1, instr_lb = 0;" declaration and the "io = INSTR_OFFSET();" statement below the "if (tstate->c_tracefunc ...)" test gives a small speedup on my machine and is a little neater, IMHO. I was worried that this would slow down "python -O". That doesn't seem to be the case (at least I can't measure it). Well done. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=587993&group_id=5470 From noreply@sourceforge.net Thu Aug 15 16:00:16 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 15 Aug 2002 08:00:16 -0700 Subject: [Patches] [ python-Patches-492105 ] Import from Zip archive Message-ID: Patches item #492105, was opened at 2001-12-12 12:21 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=492105&group_id=5470 Category: Core (C code) Group: Python 2.3 Status: Open Resolution: Out of Date Priority: 5 Submitted By: James C. Ahlstrom (ahlstromjc) Assigned to: Neal Norwitz (nnorwitz) Summary: Import from Zip archive Initial Comment: This is the "final" patch to support imports from zip archives, and directory caching using os.listdir(). It replaces patch 483466 and 476047. It is a separate patch since I can't delete file attachments. It adds support for importing from "" and from relative paths. ---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-15 11:00 Message: Logged In: YES user_id=6380 Alas, the 2.2.1 diff doesn't help much. Current CVS is what we need. :-( ---------------------------------------------------------------------- Comment By: James C.
Ahlstrom (ahlstromjc) Date: 2002-08-15 10:44 Message: Logged In: YES user_id=64929 Here is the import.c diff -c against Python-2.2.1. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-15 10:20 Message: Logged In: YES user_id=6380 I think a new patch just for import.c would be helpful. ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-15 10:09 Message: Logged In: YES user_id=33168 James, could you look at what Guido reworked? If that is fine, I can push it forward. Otherwise, feel free to update the patch. If I do any work on it, I'll make comments here so we don't duplicate work. Thanks. ---------------------------------------------------------------------- Comment By: James C. Ahlstrom (ahlstromjc) Date: 2002-08-15 09:37 Message: Logged In: YES user_id=64929 This patch is old. I can provide a new patch against Python 2.2.1 if that would help. Or a new patch just for import.c against 2.2.1. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-14 22:40 Message: Logged In: YES user_id=6380 No, the failing hunk is included in dashc-2. I actually *edited* the patch file until all but that one hunk succeeded. Thanks for looking into this! If you need help don't fail to ask on python-dev, I read it daily. :-) ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-14 21:38 Message: Logged In: YES user_id=33168 Reassigning to me. I'll give it a shot. It will take a while. Do I understand you correctly that your updated patch (dashc-2) has all the necessary pieces, except for the import hunk 11 that was rejected? And I need to get that failed hunk from the original patch? 
---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-14 21:32 Message: Logged In: YES user_id=6380 Whoa... That's a lot. Neal, do you think you could come up with a reworked version of this? That would be great! ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-14 19:30 Message: Logged In: YES user_id=33168 I've reviewed most of the updated version. Quick summary: * strncpy() doesn't null terminate strings, this needs to be added * there seem to be many leaked references Here are more details: * strncpy(dest, str, size) doesn't guarantee the result to be null terminated, need dest[size] = '\0'; after all strncpy()s (I see a bunch in getpath.c) * getpath.c:get_sys_path_0(), I think some compilers warn when you do char[i] = 0; (use '\0' instead) (there may be other places of this) * import.c:PyImport_InitZip(), have_zlib is an alias to zlib and isn't really helpful/necessary * import.c:get_path_type() can leak a reference to pyobj if it's an int * import.c:add_directory_names() pylist reference is leaked * import.c:add_zip_names(), memcmp(path+i, ".zip", 4) is clearer to me than path[i] == '.' .... * import.c:add_zip_names, there's a lot of magic #s in this function * " " : leak refs when doing PyTuple_SetItem I think there were other leaked references. I'll try to spend some more time later. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-14 16:50 Message: Logged In: YES user_id=6380 Sigh. We waited too long for this one, and now the patch is hopelessly out of date. I managed to fix most of the failing hunks, but the remaining hunk that fails (#11 in import.c) is a whopper: the saved import.c.rej is 270 lines long. I'm going to sleep on that, but I could use some help.
In the meantime, I'm uploading an edited version of dashc.diff to help the next person. ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-07-28 06:54 Message: Logged In: YES user_id=21627 Is this patch ready to be applied? ---------------------------------------------------------------------- Comment By: Jeremy Hylton (jhylton) Date: 2002-06-12 11:05 Message: Logged In: YES user_id=31392 Deleting the old diffs that Jim couldn't delete. ---------------------------------------------------------------------- Comment By: James C. Ahlstrom (ahlstromjc) Date: 2002-03-15 12:27 Message: Logged In: YES user_id=64929 I added a diff -c version of the patch. ---------------------------------------------------------------------- Comment By: James C. Ahlstrom (ahlstromjc) Date: 2002-03-15 12:03 Message: Logged In: YES user_id=64929 I still can't delete files, but I added a new file which contains all diffs as a single file, and is made from the current CVS tree (Mar 15, 2002). ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=492105&group_id=5470 From noreply@sourceforge.net Thu Aug 15 16:03:34 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 15 Aug 2002 08:03:34 -0700 Subject: [Patches] [ python-Patches-587993 ] SET_LINENO killer Message-ID: Patches item #587993, was opened at 2002-07-29 11:27 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=587993&group_id=5470 Category: Core (C code) Group: Python 2.3 Status: Open Resolution: Accepted Priority: 5 Submitted By: Michael Hudson (mwh) Assigned to: Michael Hudson (mwh) Summary: SET_LINENO killer Initial Comment: This patch is a proof-of-concept of another way to remove the SET_LINENO patch (as opposed to Vladimir's ancient one). Instead of rewriting bytecode (ick!)
we poke into the c_lnotab to see if we've moved onto a different line. The c_lnotab is not the most transparent of data structures, it has to be said. I'm not sure this patch is 100% correct -- but I think the idea can definitely fly. There will be some more overhead to tracing than before, but I hope not too much. I haven't tested these aspects. Comments welcome! ---------------------------------------------------------------------- >Comment By: Michael Hudson (mwh) Date: 2002-08-15 15:03 Message: Logged In: YES user_id=6656 Checked in as: Doc/lib/libdis.tex revision 1.39 Doc/lib/libtraceback.tex revision 1.16 Doc/tut/tut.tex revision 1.170 Doc/whatsnew/whatsnew23.tex revision 1.45 Include/opcode.h revision 2.40 Lib/dis.py revision 1.42 Lib/inspect.py revision 1.38 Lib/pdb.py revision 1.55 Lib/traceback.py revision 1.28 Lib/test/test_hotshot.py revision 1.12 Misc/NEWS revision 1.470 Modules/_hotshot.c revision 1.26 Objects/frameobject.c revision 2.64 Python/ceval.c revision 2.324 Python/compile.c revision 2.258 Python/frozen.c revision 1.13 Python/import.c revision 2.209 Python/traceback.c revision 2.39 Tools/scripts/trace.py revision 1.8 ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-15 14:50 Message: Logged In: YES user_id=6380 Michael, please check this in. We can perfect it more easily when it's in CVS, and it'll get more eyeballs that way too. What's here looks very good to me. ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-15 10:36 Message: Logged In: YES user_id=6656 OK, update. ceval.c: rewrote the comments about the exceptions for POP_TOP and RETURN_NONE somewhat. Fixed up the lltrace test. compile.c: use RETURN_NONE a touch more freely. Remove a com_pop that shouldn't have been there anymore. whatsnew23.tex: expanded, clarified, moved from "Other Language Changes" to "Other Changes And Fixes". 
Not sure this is the right section either... opcode.h, libdis.tex: point people interested in RETURN_NONE to the comments in ceval.c. It will be nice to stop having to de-conflict Misc/NEWS every day... ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-15 09:23 Message: Logged In: YES user_id=6656 Yeah, the whatsnew section needs expanding. I'll get to this, but it probably shouldn't hold up the rest of the patch. ceval.c: oversight. Good spot. There's an attempt at explaining the restrictions on RETURN_NONE in opcode.h, but it's very short. I'll expand the comments in maybe_call_line_trace and add pointers to this in the dis docs and other places. ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-08-14 23:08 Message: Logged In: YES user_id=21627 Random comments: - whatsnew23.tex: replace VERSION - ceval.c; lltrace: Why does it drop the comparison with NULL: lltrace is int, not PyObject* - it appears that RETURN_NONE does more than just returning None; it also suppresses trace calls. This should be carefully documented, or else somebody might suggest to generate RETURN_NONE for a plain return statement. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-14 20:16 Message: Logged In: YES user_id=6380 Looks good to me. Michael, why don't you hang onto it for another day and then check it in unless someone speaks out against it? ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-05 14:42 Message: Logged In: YES user_id=6656 That's good to know :) No great hurry with the review. I wanted to finish the patch while everything was still fresh in my mind, and I think I've done this now. Of course, I've said that before... 
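[Editor's sketch] The comments above turn on which instructions produce "line" trace events. As a rough illustration of how such events are observed from Python, the following uses sys.settrace to record them for one function. The helper name collect_line_events and the sample function are hypothetical, invented here for illustration; nothing in this block comes from the patch itself, and the exact sequence of events can differ between interpreter versions.

```python
import sys


def collect_line_events(func, *args):
    """Call func(*args) under a trace function and return the 'line'
    events seen in func's own frame, as offsets from its first line.
    (Hypothetical helper, not part of the patch under discussion.)"""
    events = []
    target = func.__code__

    def tracer(frame, event, arg):
        # Ignore every frame except the one running the target code object.
        if frame.f_code is not target:
            return None
        if event == "line":
            events.append(frame.f_lineno - target.co_firstlineno)
        return tracer  # keep receiving events for this frame

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        func(*args)
    finally:
        sys.settrace(previous)  # restore whatever tracer was installed
    return events


def sample(n):
    total = 0
    for i in range(n):
        total += i
    return total
```

Calling collect_line_events(sample, 2) returns one offset per executed line, with the loop header and body repeated for each iteration.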
---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-05 14:39 Message: Logged In: YES user_id=6380 That's pretty much what I suggested, yes. I'll have to take more time to review all this code... :-( ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-05 14:29 Message: Logged In: YES user_id=6656 It was discussed on python-dev: http://mail.python.org/pipermail/python-dev/2002-August/027261.html and followups. Adding a new opcode was Guido's idea, but I'm not totally sure this is what he meant. Also see the comments around line 2933 of ceval.c. ---------------------------------------------------------------------- Comment By: Neil Schemenauer (nascheme) Date: 2002-08-05 14:23 Message: Logged In: YES user_id=35752 Why was the RETURN_NONE opcode added? ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-05 12:20 Message: Logged In: YES user_id=6656 Here's another update. I've also deleted the old patches. Changes this time: + doesn't include pystone LOOPS boost + finish RETURN_NONE off: - add to dis.py - add to libdis.tex + removed now-unnecessary co_lnotab grovelling in inspect and traceback. left the functions for compatibility. + removed the WE_WANT_THE_HACK hack + includes Neal's test_hotshot patch + more trace.py tweaks. Lib/compiler is still broken. ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-05 11:56 Message: Logged In: YES user_id=33168 Re-assigning to Guido. I think what happened was I was reviewing the patch before it was assigned to Guido. Sorry about that. You're right about the set_lineno, I thought the opcode was also generated in there. I agree about the code reuse--it's probably too hard. If I had a better idea, I would have shared it. 
:-( ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-05 09:09 Message: Logged In: YES user_id=6656 Thanks for the comments, Neal! I'm not sure it's possible to separate the changes in the way you describe. For instance, com_addoparg -> com_set_lineno stops SET_LINENO being generated, so breaks tracing without the VM changes, but the VM changes make SET_LINENO into an unknown opcode... I didn't intend to upload the pystone change. About the comments: - the last little RETURN_NONE changes are easy. Thanks for the pointer. - agree there are too many bits of code grovelling co_lnotab. however, it's not clear that they can easily be refactored. maybe a generator would be useful... hmm. let me think about this one. - did I take out the initialisation of frame.f_lineno entirely? oops. probably set it to co_firstlineno. - fixed trace.py docstring. reusing code not all that easy, because different uses have subtly different requirements. hmm, inspect.getlineno is now pointless... was there any particular reason you unassigned the patch? Just curious as the main reason it was assigned to Guido was to make sure he saw it. ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-04 19:50 Message: Logged In: YES user_id=33168 Attached is a patch to test_hotshot which is necessary because the duplicate line #s at the beginning of the function no longer exist. With this patch test_hotshot passes. ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-03 20:06 Message: Logged In: YES user_id=33168 Overall, the patch looks very good. 
It would be nice to check in some of the changes separate from this patch: * com_addoparg(SET_LINENO) -> com_set_lineno() * the updated comments and whitespace * pystone * dis.py, disassemble(), perhaps * RETURN_NONE, perhaps Here are other minor comments: * Need to add RETURN_NONE opcode into dis.py * Need to add doc for RETURN_NONE in libdis.tex * It would be nice if the code from inspect.getlineno could be reused in disassemble() (not sure this can be done, though--i saw at least 4 variations of this code, 2 in python and 2 in C) * Should frame.f_lineno be initialized to 0, -1 or something invalid? * scripts/trace.py reimplements getting the lineno, why not reuse code? * docstring in scripts/trace._find_LINENO_from_code needs to remove ref to SET_LINENO ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-03 15:52 Message: Logged In: YES user_id=6656 Update. New this time: - adds RETURN_NONE opcode for the epilogue. - don't call line trace events on it. - bumps MAGIC this is the output of "cvs diff" at the top level, so it's all in there. ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-02 13:34 Message: Logged In: YES user_id=6656 Here's an update (just for ceval.c): - moves tracing code out of line - more expository comments - don't cause line trace events in the function epilogue ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-01 15:43 Message: Logged In: YES user_id=6656 Hang on, let's get all the doc changes: Doc/lib/libdis.tex Doc/tut/tut.tex Doc/whatsnew/whatsnew23.tex Misc/NEWS ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-01 15:41 Message: Logged In: YES user_id=6656 And finally, docs. 
This touches: Doc/lib/libdis.tex Doc/tut/tut.tex ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-01 15:40 Message: Logged In: YES user_id=6656 Here's the "boring" patch, more mechanical stuff. It touches: Include/opcode.h Lib/traceback.py Modules/_hotshot.c Python/frozen.c Python/traceback.c This should still be reviewed, of course, but I really shouldn't have messed any of this up. ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-01 15:38 Message: Logged In: YES user_id=6656 Yeah, the obvious response to that was "delete your .pycs"... Right, I'm going to upload my latest, and hopefully final patch in three bits. First, I'm attaching the "interesting" patch. This touches: Objects/frameobject.c -- adds f_lineno descr Python/ceval.c -- the obvious Python/compile.c -- don't emit SET_LINENO Lib/dis.py -- use c_lnotab for line numbers Tools/scripts/trace.py -- ditto This is the most subtle patch, and the one that I'd most like review on. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-07-30 22:09 Message: Logged In: YES user_id=6380 Never mind the errors, I hadn't done a cvs update in weeks on the machine where I tested it. :-( ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-07-30 21:54 Message: Logged In: YES user_id=6380 Cool idea, but I get "unknown opcode" errors... Keep me posted though! (I will now see any changes to this patch item.) ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-07-30 09:59 Message: Logged In: YES user_id=6656 Guido should see this, assuming he still isn't subscribed to patches@python.org. 
---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-07-30 09:58 Message: Logged In: YES user_id=6656 I worked out why some of the code in ceval.c wasn't making sense to me -- it didn't make sense, period. I've also fixed a number of silly and not so silly bugs in my patch. I'm now 99% certain this idea can fly. The patch isn't *finished* but the hard bit is done, IMHO. There are some other points to make, but I think I'll raise them on python-dev. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-07-29 20:18 Message: Logged In: YES user_id=31435 Dropping out of "vacation mode" long enough to say "mondo cool!" and encourage this. Guido may not agree, but I also encourage you to redefine c_lnotab if it can make life easier and quicker here. That subtle compression scheme has been the source of several nasty bugs, both in the core C code and in Jeremy's compiler pkg (cut 'n paste bugs abound here, because few people understand what's really needed, so flawed code gets copied with little thought). ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-07-29 15:34 Message: Logged In: YES user_id=6656 Uhh, the instr_[lu]b variables need to keep their values around the loop; otherwise we might just as well call PyCode_Addr2Line each time around. I have another version of the patch that does that, but I assumed the overhead of doing so was deemed too high, or someone else would have done this by now. It's certainly easier. Glad to hear it doesn't affect python -O too much. I was doing this away from the internet and forgot to keep a clean copy of the source around for doing comparisons with... 
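[Editor's sketch] For readers unfamiliar with the line number table discussed above: in CPython of this era, as far as I understand it, co_lnotab is a string of unsigned byte pairs (bytecode address increment, line number increment), and PyCode_Addr2Line walks it from the start. The following mirrors that walk in Python on a hand-built table. The function name addr_to_line and the example bytes are hypothetical, the sketch assumes the simple pair format only, and it does not apply to the line tables of recent Python versions.

```python
def addr_to_line(lnotab, first_lineno, addrq):
    """Map bytecode offset addrq to a source line number.

    lnotab is assumed to be a bytes object of (addr_incr, line_incr)
    pairs, each an unsigned byte; first_lineno plays the role of
    co_firstlineno. (Hypothetical helper for illustration only.)
    """
    line = first_lineno
    addr = 0
    for i in range(0, len(lnotab) - 1, 2):
        addr += lnotab[i]
        if addr > addrq:
            # The next line starts beyond addrq, so the current line wins.
            break
        line += lnotab[i + 1]
    return line


# Hand-built table: the line after first_lineno starts at offset 6,
# and the line two further on starts at offset 14.
example_table = bytes([6, 1, 8, 2])
```

A scan like this costs time proportional to the table length on every lookup, which is presumably why the patch caches the lower and upper instruction bounds of the current line (instr_lb, instr_ub) instead of calling PyCode_Addr2Line each time around the loop.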
---------------------------------------------------------------------- Comment By: Neil Schemenauer (nascheme) Date: 2002-07-29 15:18 Message: Logged In: YES user_id=35752 Moving the "int io, instr_ub = -1, instr_lb = 0;" declaration and the "io = INSTR_OFFSET();" statement below the "if (tstate->c_tracefunc ...)" test gives a small speedup on my machine and is a little neater, IMHO. I was worried that this would slow down "python -O". That doesn't seem to be the case (at least I can't measure it). Well done. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=587993&group_id=5470 From noreply@sourceforge.net Thu Aug 15 16:06:36 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 15 Aug 2002 08:06:36 -0700 Subject: [Patches] [ python-Patches-587993 ] SET_LINENO killer Message-ID: Patches item #587993, was opened at 2002-07-29 11:27 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=587993&group_id=5470 Category: Core (C code) Group: Python 2.3 >Status: Closed Resolution: Accepted Priority: 5 Submitted By: Michael Hudson (mwh) Assigned to: Michael Hudson (mwh) Summary: SET_LINENO killer Initial Comment: This patch is a proof-of-concept of another way to remove the SET_LINENO patch (as opposed to Vladimir's ancient one). Instead of rewriting bytecode (ick!) we poke into the c_lnotab to see if we've moved onto a different line. The c_lnotab is not the most transparent of data structures, it has to be said. I'm not sure this patch is 100% correct -- but I think the idea can definitely fly. There will be some more overhead to tracing than before, but I hope not too much. I haven't tested these aspects. Comments welcome! 
---------------------------------------------------------------------- Comment By: Neil Schemenauer (nascheme) Date: 2002-07-29 15:18 Message: Logged In: YES user_id=35752 Moving the "int io, instr_ub = -1, instr_lb = 0;" declaration and the "io = INSTR_OFFSET();" statement below the "if (tstate->c_tracefunc ...)" test gives a small speedup on my machine and is a little neater, IMHO. I was worried that this would slow down "python -O". That doesn't seem to be the case (at least I can't measure it). Well done. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=587993&group_id=5470 From noreply@sourceforge.net Thu Aug 15 16:16:36 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 15 Aug 2002 08:16:36 -0700 Subject: [Patches] [ python-Patches-587993 ] SET_LINENO killer Message-ID: Patches item #587993, was opened at 2002-07-29 07:27 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=587993&group_id=5470 Category: Core (C code) Group: Python 2.3 Status: Closed Resolution: Accepted Priority: 5 Submitted By: Michael Hudson (mwh) Assigned to: Michael Hudson (mwh) Summary: SET_LINENO killer Initial Comment: This patch is a proof-of-concept of another way to remove the SET_LINENO patch (as opposed to Vladimir's ancient one). Instead of rewriting bytecode (ick!) we poke into the c_lnotab to see if we've moved onto a different line. The c_lnotab is not the most transparent of data structures, it has to be said. I'm not sure this patch is 100% correct -- but I think the idea can definitely fly. There will be some more overhead to tracing than before, but I hope not too much. I haven't tested these aspects. Comments welcome! ---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-15 11:16 Message: Logged In: YES user_id=6380 Thanks! 
Woo hoo! ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-15 11:03 Message: Logged In: YES user_id=6656 Checked in as: Doc/lib/libdis.tex revision 1.39 Doc/lib/libtraceback.tex revision 1.16 Doc/tut/tut.tex revision 1.170 Doc/whatsnew/whatsnew23.tex revision 1.45 Include/opcode.h revision 2.40 Lib/dis.py revision 1.42 Lib/inspect.py revision 1.38 Lib/pdb.py revision 1.55 Lib/traceback.py revision 1.28 Lib/test/test_hotshot.py revision 1.12 Misc/NEWS revision 1.470 Modules/_hotshot.c revision 1.26 Objects/frameobject.c revision 2.64 Python/ceval.c revision 2.324 Python/compile.c revision 2.258 Python/frozen.c revision 1.13 Python/import.c revision 2.209 Python/traceback.c revision 2.39 Tools/scripts/trace.py revision 1.8 ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-15 10:50 Message: Logged In: YES user_id=6380 Michael, please check this in. We can perfect it more easily when it's in CVS, and it'll get more eyeballs that way too. What's here looks very good to me. ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-15 06:36 Message: Logged In: YES user_id=6656 OK, update. ceval.c: rewrote the comments about the exceptions for POP_TOP and RETURN_NONE somewhat. Fixed up the lltrace test. compile.c: use RETURN_NONE a touch more freely. Remove a com_pop that shouldn't have been there anymore. whatsnew23.tex: expanded, clarified, moved from "Other Language Changes" to "Other Changes And Fixes". Not sure this is the right section either... opcode.h, libdis.tex: point people interested in RETURN_NONE to the comments in ceval.c. It will be nice to stop having to de-conflict Misc/NEWS every day... 
---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-15 05:23 Message: Logged In: YES user_id=6656 Yeah, the whatsnew section needs expanding. I'll get to this, but it probably shouldn't hold up the rest of the patch. ceval.c: oversight. Good spot. There's an attempt at explaining the restrictions on RETURN_NONE in opcode.h, but it's very short. I'll expand the comments in maybe_call_line_trace and add pointers to this in the dis docs and other places. ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-08-14 19:08 Message: Logged In: YES user_id=21627 Random comments: - whatsnew23.tex: replace VERSION - ceval.c; lltrace: Why does it drop the comparison with NULL: lltrace is int, not PyObject* - it appears that RETURN_NONE does more than just returning None; it also suppresses trace calls. This should be carefully documented, or else somebody might suggest to generate RETURN_NONE for a plain return statement. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-14 16:16 Message: Logged In: YES user_id=6380 Looks good to me. Michael, why don't you hang onto it for another day and then check it in unless someone speaks out against it? ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-05 10:42 Message: Logged In: YES user_id=6656 That's good to know :) No great hurry with the review. I wanted to finish the patch while everything was still fresh in my mind, and I think I've done this now. Of course, I've said that before... ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-05 10:39 Message: Logged In: YES user_id=6380 That's pretty much what I suggested, yes. I'll have to take more time to review all this code... 
:-( ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-05 10:29 Message: Logged In: YES user_id=6656 It was discussed on python-dev: http://mail.python.org/pipermail/python-dev/2002-August/027261.html and followups. Adding a new opcode was Guido's idea, but I'm not totally sure this is what he meant. Also see the comments around line 2933 of ceval.c. ---------------------------------------------------------------------- Comment By: Neil Schemenauer (nascheme) Date: 2002-08-05 10:23 Message: Logged In: YES user_id=35752 Why was the RETURN_NONE opcode added? ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-05 08:20 Message: Logged In: YES user_id=6656 Here's another update. I've also deleted the old patches. Changes this time: + doesn't include pystone LOOPS boost + finish RETURN_NONE off: - add to dis.py - add to libdis.tex + removed now-unnecessary co_lnotab grovelling in inspect and traceback. left the functions for compatibility. + removed the WE_WANT_THE_HACK hack + includes Neal's test_hotshot patch + more trace.py tweaks. Lib/compiler is still broken. ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-05 07:56 Message: Logged In: YES user_id=33168 Re-assigning to Guido. I think what happened was I was reviewing the patch before it was assigned to Guido. Sorry about that. You're right about the set_lineno, I thought the opcode was also generated in there. I agree about the code reuse--it's probably too hard. If I had a better idea, I would have shared it. :-( ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-05 05:09 Message: Logged In: YES user_id=6656 Thanks for the comments, Neal! I'm not sure it's possible to separate the changes in the way you describe. 
For instance, com_addoparg -> com_set_lineno stops SET_LINENO being generated, so breaks tracing without the VM changes, but the VM changes make SET_LINENO into an unknown opcode... I didn't intend to upload the pystone change. About the comments: - the last little RETURN_NONE changes are easy. Thanks for the pointer. - agree there are too many bits of code grovelling co_lnotab. however, it's not clear that they can easily be refactored. maybe a generator would be useful... hmm. let me think about this one. - did I take out the initialisation of frame.f_lineno entirely? oops. probably set it to co_firstlineno. - fixed trace.py docstring. reusing code not all that easy, because different uses have subtly different requirements. hmm, inspect.getlineno is now pointless... was there any particular reason you unassigned the patch? Just curious as the main reason it was assigned to Guido was to make sure he saw it. ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-04 15:50 Message: Logged In: YES user_id=33168 Attached is a patch to test_hotshot which is necessary because the duplicate line #s at the beginning of the function no longer exist. With this patch test_hotshot passes. ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-03 16:06 Message: Logged In: YES user_id=33168 Overall, the patch looks very good. 
It would be nice to check in some of the changes separate from this patch: * com_addoparg(SET_LINENO) -> com_set_lineno() * the updated comments and whitespace * pystone * dis.py, disassemble(), perhaps * RETURN_NONE, perhaps Here are other minor comments: * Need to add RETURN_NONE opcode into dis.py * Need to add doc for RETURN_NONE in libdis.tex * It would be nice if the code from inspect.getlineno could be reused in disassemble() (not sure this can be done, though--I saw at least 4 variations of this code, 2 in python and 2 in C) * Should frame.f_lineno be initialized to 0, -1 or something invalid? * scripts/trace.py reimplements getting the lineno, why not reuse code? * docstring in scripts/trace._find_LINENO_from_code needs to remove ref to SET_LINENO ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-03 11:52 Message: Logged In: YES user_id=6656 Update. New this time: - adds RETURN_NONE opcode for the epilogue. - don't call line trace events on it. - bumps MAGIC this is the output of "cvs diff" at the top level, so it's all in there. ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-02 09:34 Message: Logged In: YES user_id=6656 Here's an update (just for ceval.c): - moves tracing code out of line - more expository comments - don't cause line trace events in the function epilogue ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-01 11:43 Message: Logged In: YES user_id=6656 Hang on, let's get all the doc changes: Doc/lib/libdis.tex Doc/tut/tut.tex Doc/whatsnew/whatsnew23.tex Misc/NEWS ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-01 11:41 Message: Logged In: YES user_id=6656 And finally, docs. 
This touches: Doc/lib/libdis.tex Doc/tut/tut.tex ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-01 11:40 Message: Logged In: YES user_id=6656 Here's the "boring" patch, more mechanical stuff. It touches: Include/opcode.h Lib/traceback.py Modules/_hotshot.c Python/frozen.c Python/traceback.c This should still be reviewed, of course, but I really shouldn't have messed any of this up. ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-08-01 11:38 Message: Logged In: YES user_id=6656 Yeah, the obvious response to that was "delete your .pycs"... Right, I'm going to upload my latest, and hopefully final patch in three bits. First, I'm attaching the "interesting" patch. This touches: Objects/frameobject.c -- adds f_lineno descr Python/ceval.c -- the obvious Python/compile.c -- don't emit SET_LINENO Lib/dis.py -- use co_lnotab for line numbers Tools/scripts/trace.py -- ditto This is the most subtle patch, and the one that I'd most like review on. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-07-30 18:09 Message: Logged In: YES user_id=6380 Never mind the errors, I hadn't done a cvs update in weeks on the machine where I tested it. :-( ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-07-30 17:54 Message: Logged In: YES user_id=6380 Cool idea, but I get "unknown opcode" errors... Keep me posted though! (I will now see any changes to this patch item.) ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-07-30 05:59 Message: Logged In: YES user_id=6656 Guido should see this, assuming he still isn't subscribed to patches@python.org. 
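[Editorial illustration] The "interesting" patch above has Lib/dis.py take line numbers from the code object's line table instead of from SET_LINENO opcodes. The following is a minimal sketch of that idea in present-day Python, not code from the patch. It assumes a current CPython in which dis.findlinestarts() is available; the sample source and the name line_starts are invented for the example.

```python
import dis

# Hypothetical three-line module, used only for this illustration.
SOURCE = "x = 1\ny = 2\nz = x + y\n"
code = compile(SOURCE, "<demo>", "exec")

# findlinestarts() yields (bytecode offset, line number) pairs decoded from
# the code object's line table; no per-line opcode is involved.
# Entries without a real source line (None or 0) are skipped.
line_starts = [(off, ln) for off, ln in dis.findlinestarts(code) if ln]

for off, ln in line_starts:
    print(f"offset {off:3d} starts line {ln}")
```

The exact offsets depend on the interpreter version, so the sketch prints them rather than assuming particular values.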
---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-07-30 05:58 Message: Logged In: YES user_id=6656 I worked out why some of the code in ceval.c wasn't making sense to me -- it didn't make sense, period. I've also fixed a number of silly and not so silly bugs in my patch. I'm now 99% certain this idea can fly. The patch isn't *finished* but the hard bit is done, IMHO. There are some other points to make, but I think I'll raise them on python-dev. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-07-29 16:18 Message: Logged In: YES user_id=31435 Dropping out of "vacation mode" long enough to say "mondo cool!" and encourage this. Guido may not agree, but I also encourage you to redefine c_lnotab if it can make life easier and quicker here. That subtle compression scheme has been the source of several nasty bugs, both in the core C code and in Jeremy's compiler pkg (cut 'n paste bugs abound here, because few people understand what's really needed, so flawed code gets copied with little thought). ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-07-29 11:34 Message: Logged In: YES user_id=6656 Uhh, the instr_[lu]b variables need to keep their values around the loop; otherwise we might just as well call PyCode_Addr2Line each time around. I have another version of the patch that does that, but I assumed the overhead of doing so was deemed too high, or someone else would have done this by now. It's certainly easier. Glad to hear it doesn't affect python -O too much. I was doing this away from the internet and forgot to keep a clean copy of the source around for doing comparisons with... 
---------------------------------------------------------------------- Comment By: Neil Schemenauer (nascheme) Date: 2002-07-29 11:18 Message: Logged In: YES user_id=35752 Moving the "int io, instr_ub = -1, instr_lb = 0;" declaration and the "io = INSTR_OFFSET();" statement below the "if (tstate->c_tracefunc ...)" test gives a small speedup on my machine and is a little neater, IMHO. I was worried that this would slow down "python -O". That doesn't seem to be the case (at least I can't measure it). Well done. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=587993&group_id=5470 From noreply@sourceforge.net Thu Aug 15 16:18:42 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 15 Aug 2002 08:18:42 -0700 Subject: [Patches] [ python-Patches-492105 ] Import from Zip archive Message-ID: Patches item #492105, was opened at 2001-12-12 17:21 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=492105&group_id=5470 Category: Core (C code) Group: Python 2.3 Status: Open Resolution: Out of Date Priority: 5 Submitted By: James C. Ahlstrom (ahlstromjc) Assigned to: Neal Norwitz (nnorwitz) Summary: Import from Zip archive Initial Comment: This is the "final" patch to support imports from zip archives, and directory caching using os.listdir(). It replaces patch 483466 and 476047. It is a separate patch since I can't delete file attachments. It adds support for importing from "" and from relative paths. ---------------------------------------------------------------------- >Comment By: James C. Ahlstrom (ahlstromjc) Date: 2002-08-15 15:18 Message: Logged In: YES user_id=64929 I just grabbed the CVS import.c (only). I will edit this to add my changes and submit it as a new import.c patch. This should help, although I can't test it unless I download the whole tree. 
---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-15 15:00 Message: Logged In: YES user_id=6380 Alas, the 2.2.1 diff doesn't help much. Current CVS is what we need. :-( ---------------------------------------------------------------------- Comment By: James C. Ahlstrom (ahlstromjc) Date: 2002-08-15 14:44 Message: Logged In: YES user_id=64929 Here is the import.c diff -c against Python-2.2.1. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-15 14:20 Message: Logged In: YES user_id=6380 I think a new patch just for import.c would be helpful. ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-15 14:09 Message: Logged In: YES user_id=33168 James, could you look at what Guido reworked? If that is fine, I can push it forward. Otherwise, feel free to update the patch. If I do any work on it, I'll make comments here so we don't duplicate work. Thanks. ---------------------------------------------------------------------- Comment By: James C. Ahlstrom (ahlstromjc) Date: 2002-08-15 13:37 Message: Logged In: YES user_id=64929 This patch is old. I can provide a new patch against Python 2.2.1 if that would help. Or a new patch just for import.c against 2.2.1. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-15 02:40 Message: Logged In: YES user_id=6380 No, the failing hunk is included in dashc-2. I actually *edited* the patch file until all but that one hunk succeeded. Thanks for looking into this! If you need help don't fail to ask on python-dev, I read it daily. :-) ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-15 01:38 Message: Logged In: YES user_id=33168 Reassigning to me. I'll give it a shot. 
It will take a while. Do I understand you correctly that your updated patch (dashc-2) has all the necessary pieces, except for the import hunk 11 that was rejected? And I need to get that failed hunk from the original patch? ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-15 01:32 Message: Logged In: YES user_id=6380 Whoa... That's a lot. Neal, do you think you could come up with a reworked version of this? That would be great! ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-14 23:30 Message: Logged In: YES user_id=33168 I've reviewed most of the updated version. Quick summary: * strncpy() doesn't null terminate strings, this needs to be added * seems to be many leaked references Here are more details: * strncpy(dest, str, size) doesn't guarantee the result to be null terminated, need dest[size] = '\0'; after all strncpy()s (I see a bunch in getpath.c) * getpath.c:get_sys_path_0(), I think some compilers warn when you do char[i] = 0; (use '\0' instead) (there may be other places of this) * import.c:PyImport_InitZip(), have_zlib is an alias to zlib and isn't really helpful/necessary * import.c:get_path_type() can leak a reference to pyobj if it's an int * import.c:add_directory_names() pylist reference is leaked * import.c:add_zip_names(), memcmp(path+i, ".zip", 4) is clearer to me than path[i] == '.' .... * import.c:add_zip_names, there's a lot of magic #s in this function * " " : leak refs when doing PyTuple_SetItem I think there were other leaked references. I'll try to spend some more time later. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-14 20:50 Message: Logged In: YES user_id=6380 Sigh. We waited too long for this one, and now the patch is hopelessly out of date. 
I managed to fix most of the failing hunks, but the remaining hunk that fails (#11 in import.c) is a whopper: the saved import.c.rej is 270 lines long. I'm going to sleep on that, but I could use some help. In the meantime, I'm uploading an edited version of dashc.diff to help the next person. ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-07-28 10:54 Message: Logged In: YES user_id=21627 Is this patch ready to be applied? ---------------------------------------------------------------------- Comment By: Jeremy Hylton (jhylton) Date: 2002-06-12 15:05 Message: Logged In: YES user_id=31392 Deleting the old diffs that Jim couldn't delete. ---------------------------------------------------------------------- Comment By: James C. Ahlstrom (ahlstromjc) Date: 2002-03-15 17:27 Message: Logged In: YES user_id=64929 I added a diff -c version of the patch. ---------------------------------------------------------------------- Comment By: James C. Ahlstrom (ahlstromjc) Date: 2002-03-15 17:03 Message: Logged In: YES user_id=64929 I still can't delete files, but I added a new file which contains all diffs as a single file, and is made from the current CVS tree (Mar 15, 2002). 
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=492105&group_id=5470 From noreply@sourceforge.net Thu Aug 15 16:22:32 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 15 Aug 2002 08:22:32 -0700 Subject: [Patches] [ python-Patches-593627 ] Static names Message-ID: Patches item #593627, was opened at 2002-08-11 06:41 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=593627&group_id=5470 Category: Core (C code) Group: None Status: Open Resolution: None Priority: 5 Submitted By: Oren Tirosh (orenti) Assigned to: Guido van Rossum (gvanrossum) Summary: Static names Initial Comment: This patch creates static string objects for all built-in names and interns them on initialization. The macro PyNAME is used to access static names. PyNAME(__spam__) is equivalent to PyString_InternFromString("__spam__") but is a constant expression. It requires the name to be one of the built-in names. A linker error will be generated if it isn't. Most conversions of C strings into temporary string objects can be eliminated (PyString_FromString, PyString_InternFromString). Most string comparisons at runtime can also be eliminated. Instead of: if (strcmp(PyString_AsString(name), "__spam__")) ... This code can be used: PyString_INTERN(name) if (name == PyNAME(__spam__)) ... Where PyString_INTERN is a fast inline check if the string is already interned (and it usually is). To prevent unbounded accumulation of interned strings the mortal interned string patch should also be applied. The patch converts most of the builtin module to this new mode as an example. 
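[Editorial illustration] The comparison trick described in the initial comment (intern once, then compare identities instead of string contents) can be sketched at the Python level. This is only an analogy under stated assumptions: it uses sys.intern() from modern Python, not the PyNAME or PyString_INTERN macros proposed by the patch, and the names DOC_NAME and is_doc_name are hypothetical.

```python
import sys

# Intern the well-known name once, up front (loosely analogous to a static name).
DOC_NAME = sys.intern("__doc__")


def is_doc_name(name: str) -> bool:
    """Return True if name equals "__doc__", decided by an identity check."""
    # sys.intern() returns the one canonical object for any equal string,
    # so after interning, equality reduces to comparing object identities.
    return sys.intern(name) is DOC_NAME


# Build the string at runtime so it is not simply the same literal object.
built_at_runtime = "".join(["__", "doc", "__"])
print(is_doc_name(built_at_runtime))
print(is_doc_name("__spam__"))
```

The design point is the same one the patch argues for: the cost of interning is paid once per string, after which each comparison is a single pointer test.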
---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-15 11:22 Message: Logged In: YES user_id=6380 Just for kicks I produced a forward diff, also adding the necessary changes to Makefile.pre.in that were mysteriously missing from the original, and fixing this for the very latest CVS (up to and including Michael Hudson's set_lineno patch). ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-08-12 08:48 Message: Logged In: YES user_id=21627 It is difficult to see how this patch achieves the goal of readability: - many variables are not used at all, e.g. ArithmeticError; it is not clear how using them would improve readability. - the change from SETBUILTIN("None", Py_None); to SETBUILTIN(PyNAME(None), Py_None); makes it more difficult to read, not easier. Furthermore, the name "None" isn't used anywhere except this initialisation. - likewise, the changes from {"abs", builtin_abs, METH_O, abs_doc}, to {PyNAMEC(abs), builtin_abs, METH_O, abs_doc}, make the code harder to read. ---------------------------------------------------------------------- Comment By: Oren Tirosh (orenti) Date: 2002-08-12 08:10 Message: Logged In: YES user_id=562624 If these changes are applied throughout the interpreter I expect a significant speedup but that is not my immediate goal. I am looking for readability, reduction of code size (both source and binary) and reliability (fewer things to check or forget to check). I am trying to get rid of code like if (docstr == NULL) { docstr = PyString_InternFromString("__doc__"); if (docstr == NULL) return NULL; And replace it with just PyNAME(__doc__) ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-08-12 03:43 Message: Logged In: YES user_id=21627 What is the rationale for this patch? 
If it is for performance, what real speed improvements can you report? ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-11 11:54 Message: Logged In: YES user_id=33168 I'd like to see the releasing interned string patch applied. I think it's almost ready, isn't it? It would make patching easier and seems to be a good idea. For me, the easiest way to produce patches is to use cvs. You can keep multiple cvs trees around easily enough (for having multiple overlapping/independent patches). To create patches with cvs: cvs diff -C 5 [file1] [file2] .... ---------------------------------------------------------------------- Comment By: Oren Tirosh (orenti) Date: 2002-08-11 11:44 Message: Logged In: YES user_id=562624 Ok, I'll fix the problems with the patch. What's the best way to produce a patch that adds new files? Static string objects cannot be released, of course. This patch will eventually depend on the mortal interned strings patch to fix it but in the meantime I just disabled releasing interned strings because I want to keep the two patches independent. The next version will add a new PyArg_ParseTuple format for an interned string to make it easier to use. ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-11 10:46 Message: Logged In: YES user_id=33168 Couple of initial comments: * this is a reverse patch * it seems like there are other changes in here - int ob_shash -> long - releasing interned strings? * dictobject.c is removed? * including python headers should use "" not <> Oren, could you generate a new patch with only the changes to support PyNAME? Thanks! 
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=593627&group_id=5470 From noreply@sourceforge.net Thu Aug 15 16:58:01 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 15 Aug 2002 08:58:01 -0700 Subject: [Patches] [ python-Patches-593627 ] Static names Message-ID: Patches item #593627, was opened at 2002-08-11 06:41 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=593627&group_id=5470 Category: Core (C code) Group: None Status: Open Resolution: None >Priority: 3 Submitted By: Oren Tirosh (orenti) Assigned to: Guido van Rossum (gvanrossum) Summary: Static names Initial Comment: This patch creates static string objects for all built-in names and interns them on initialization. The macro PyNAME is used to access static names. PyNAME(__spam__) is equivalent to PyString_InternFromString("__spam__") but is a constant expression. It requires the name to be one of the built-in names. A linker error will be generated if it isn't. Most conversions of C strings into temporary string objects can be eliminated (PyString_FromString, PyString_InternFromString). Most string comparisons at runtime can also be eliminated. Instead of: if (strcmp(PyString_AsString(name), "__spam__")) ... This code can be used: PyString_INTERN(name) if (name == PyNAME(__spam__)) ... Where PyString_INTERN is a fast inline check if the string is already interned (and it usually is). To prevent unbounded accumulation of interned strings the mortal interned string patch should also be applied. The patch converts most of the builtin module to this new mode as an example. ---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-15 11:58 Message: Logged In: YES user_id=6380 Strangely, I measured a code size *increase*. 
Strangely, because most object files didn't increase in text size, but the resulting binary did, adding about 20K text and 17K data. The only object file that changed sizes at all (according to "size */*.o" on Linux) was Python/bltinmodule.o, which grew less than 500 bytes in text size. The only new object file, Python/staticnames.o, has 48 bytes text and 17K bytes data. Maybe the added text size could be because of more cross-file references added by the linker??? (Files that referenced a local static char string constant now reference a static object in Python/staticnames.o.) I do see about a 1% speed increase for pystone. But I agree with Martin's comments on the "readability" issue. There's also a localization property that's lost: whenever a new name is added, you must update staticnames.h, staticnames.c, *and* the file where it is used. That's not nice (and not just because it forces a recompilation of the world because a header file was touched that everybody includes). *If* this were ever accepted, the mechanism to (re)generate staticnames.h automatically should be checked in as well. In general, I've found that string literals hidden inside macros using stringification (#) are a detriment to code maintainability -- I've often had the situation where I *knew* there had to be a string literal for some name somewhere, but I couldn't find it because of this. Same for name concatenation (##); it often means that you know there's a function name somewhere but a grep through the sources won't find it. Very painful when tracking down problems. I deployed a bunch of tricks like this in early versions of typeobject.c, and ended up expanding almost all of them: a little more typing perhaps, but explicit is better than implicit, and a search for slot_nb_add will at least find the macro that defines it; ditto a search for "__add__" (with the quotes) will find where it is used. I guess that's about a -0.5 from me. 
Unless someone else steps up to champion this soon, it's dead. :-) PS. I used cvs add for new files and then cvs diff -c -N to create diffs that include new files; cvs produces output looking like a diff against /dev/null, and patch understands those (see the fixed patch I uploaded). But maybe if you only have anonymous CVS the cvs add won't work, and cvs diff won't give you diffs for files it knows nothing about. IOW YMMV. :-) ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-15 11:22 Message: Logged In: YES user_id=6380 Just for kicks I produced a forward diff, also adding the necessary changes to Makefile.pre.in that were mysteriously missing from the original, and fixing this for the very latest CVS (up to and including Michael Hudson's set_lineno patch). ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-08-12 08:48 Message: Logged In: YES user_id=21627 It is difficult to see how this patch achieves the goal of readability: - many variables are not used at all, e.g. ArithmeticError; it is not clear how using them would improve readability. - the change from SETBUILTIN("None", Py_None); to SETBUILTIN(PyNAME(None), Py_None); makes it more difficult to read, not easier. Furthermore, the name "None" isn't used anywhere except this initialisation. - likewise, the changes from {"abs", builtin_abs, METH_O, abs_doc}, to {PyNAMEC(abs), builtin_abs, METH_O, abs_doc}, make the code harder to read. ---------------------------------------------------------------------- Comment By: Oren Tirosh (orenti) Date: 2002-08-12 08:10 Message: Logged In: YES user_id=562624 If these changes are applied throughout the interpreter I expect a significant speedup but that is not my immediate goal. I am looking for readability, reduction of code size (both source and binary) and reliability (fewer things to check or forget to check). 
I am trying to get rid of code like if (docstr == NULL) { docstr = PyString_InternFromString("__doc__"); if (docstr == NULL) return NULL; And replace it with just PyNAME(__doc__) ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-08-12 03:43 Message: Logged In: YES user_id=21627 What is the rationale for this patch? If it is for performance, what real speed improvements can you report? ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-11 11:54 Message: Logged In: YES user_id=33168 I'd like to see the releasing interned string patch applied. I think it's almost ready, isn't it? It would make patching easier and seems to be a good idea. For me, the easiest way to produce patches is to use cvs. You can keep multiple cvs trees around easily enough (for having multiple overlapping/independent patches). To create patches with cvs: cvs diff -C 5 [file1] [file2] .... ---------------------------------------------------------------------- Comment By: Oren Tirosh (orenti) Date: 2002-08-11 11:44 Message: Logged In: YES user_id=562624 Ok, I'll fix the problems with the patch. What's the best way to produce a patch that adds new files? Static string objects cannot be released, of course. This patch will eventually depend on the mortal interned strings patch to fix it but in the meantime I just disabled releasing interned strings because I want to keep the two patches independent. The next version will add a new PyArg_ParseTuple format for an interned string to make it easier to use. ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-11 10:46 Message: Logged In: YES user_id=33168 Couple of initial comments: * this is a reverse patch * it seems like there are other changes in here - int ob_shash -> long - releasing interned strings? * dictobject.c is removed? 
* including python headers should use "" not <> Oren, could you generate a new patch with only the changes to support PyNAME? Thanks! ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=593627&group_id=5470 From noreply@sourceforge.net Thu Aug 15 18:40:13 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 15 Aug 2002 10:40:13 -0700 Subject: [Patches] [ python-Patches-554192 ] mimetypes: all extensions for a type Message-ID: Patches item #554192, was opened at 2002-05-09 19:31 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=554192&group_id=5470 Category: Library (Lib) Group: None Status: Open Resolution: None Priority: 5 Submitted By: Walter Dörwald (doerwalter) Assigned to: Nobody/Anonymous (nobody) Summary: mimetypes: all extensions for a type Initial Comment: This patch adds a function guess_all_extensions to mimetypes.py. This function returns all known extensions for a given type, not just the first one found in the types_map dictionary. guess_extension is still present and returns the first from the list. ---------------------------------------------------------------------- >Comment By: Walter Dörwald (doerwalter) Date: 2002-08-15 19:40 Message: Logged In: YES user_id=89016 diff2.txt adds the global version of add_type and the documentation in Doc/lib/libmimetypes.tex. ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-07-31 13:24 Message: Logged In: YES user_id=89016 OK, I'll change the patch and post the question to python-dev next week (I'm on vacation right now). ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-07-30 14:34 Message: Logged In: YES user_id=21627 I'm in favour of exposing it on the module level. 
If you are uncertain, you might want to ask on python-dev. ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-07-30 13:00 Message: Logged In: YES user_id=89016 It *is* used in two spots: The constructor and the readfp method. But exposing it at the module level could make sense, because it is the atomic method of adding mime type information. So should I change the patch to expose it at the module level and change the LaTeX documentation accordingly? ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-07-29 10:44 Message: Logged In: YES user_id=21627 I can't see the point of making it private, since it is not used inside the module. If you plan to use it, that usage certainly is outside of the module, so the method would be public. If it is public, it needs to be exposed on the module level, and it needs to be documented. ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-07-29 10:23 Message: Logged In: YES user_id=89016 The patch adds an inverted mapping (i.e. mapping from type to a list of extensions). add_type simplifies adding a type<->ext mapping to both dictionaries. If this method should not be exposed we could make the name private. (_add_type) ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-07-28 12:30 Message: Logged In: YES user_id=21627 What is the role of add_type in this patch? 
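[Editorial illustration] The behaviour discussed in this thread (add_type feeding both the forward and the inverted mapping, guess_all_extensions returning every known extension, guess_extension returning one of them) can be sketched with the mimetypes module as it exists in current Python. This is not the code of the patch; the MIME type and the two extensions are hypothetical values chosen for the example.

```python
import mimetypes

# A private MimeTypes instance keeps the example away from the
# module-level tables.
db = mimetypes.MimeTypes()

# Hypothetical type and extensions, used only for this illustration.
db.add_type("application/x-example", ".exa")
db.add_type("application/x-example", ".exb")

# All extensions registered for the type, via the inverted mapping.
all_exts = db.guess_all_extensions("application/x-example")

# A single extension for the same type.
first_ext = db.guess_extension("application/x-example")

print(all_exts, first_ext)
```

The module-level functions mimetypes.add_type(), mimetypes.guess_all_extensions() and mimetypes.guess_extension() behave the same way but operate on the shared global tables.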
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=554192&group_id=5470 From noreply@sourceforge.net Thu Aug 15 18:42:42 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 15 Aug 2002 10:42:42 -0700 Subject: [Patches] [ python-Patches-554192 ] mimetypes: all extensions for a type Message-ID: Patches item #554192, was opened at 2002-05-09 19:31 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=554192&group_id=5470 Category: Library (Lib) Group: None Status: Open Resolution: None Priority: 5 Submitted By: Walter Dörwald (doerwalter) >Assigned to: Martin v. Löwis (loewis) Summary: mimetypes: all extensions for a type Initial Comment: This patch adds a function guess_all_extensions to mimetypes.py. This function returns all known extensions for a given type, not just the first one found in the types_map dictionary. guess_extension is still present and returns the first from the list. ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-08-15 19:40 Message: Logged In: YES user_id=89016 diff2.txt adds the global version of add_type and the documentation in Doc/lib/libmimetypes.tex. ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-07-31 13:24 Message: Logged In: YES user_id=89016 OK, I'll change the patch and post the question to python-dev next week (I'm on vacation right now). ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-07-30 14:34 Message: Logged In: YES user_id=21627 I'm in favour of exposing it on the module level. If you are uncertain, you might want to ask on python-dev.
---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-07-30 13:00 Message: Logged In: YES user_id=89016 It *is* used in two spots: The constructor and the readfp method. But exposing it at the module level could make sense, because it is the atomic method of adding mime type information. So should I change the patch to expose it at the module level and change the LaTeX documentation accordingly? ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-07-29 10:44 Message: Logged In: YES user_id=21627 I can't see the point of making it private, since it is not used inside the module. If you plan to use it, that usage certainly is outside of the module, so the method would be public. If it is public, it needs to be exposed on the module level, and it needs to be documented. ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-07-29 10:23 Message: Logged In: YES user_id=89016 The patch adds an inverted mapping (i.e. mapping from type to a list of extensions). add_type simplifies adding a type<->ext mapping to both dictionaries. If this method should not be exposed we could make the name private. (_add_type) ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-07-28 12:30 Message: Logged In: YES user_id=21627 What is the role of add_type in this patch?
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=554192&group_id=5470 From noreply@sourceforge.net Thu Aug 15 18:45:17 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 15 Aug 2002 10:45:17 -0700 Subject: [Patches] [ python-Patches-593627 ] Static names Message-ID: Patches item #593627, was opened at 2002-08-11 10:41 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=593627&group_id=5470 Category: Core (C code) Group: None Status: Open Resolution: None Priority: 3 Submitted By: Oren Tirosh (orenti) Assigned to: Guido van Rossum (gvanrossum) Summary: Static names Initial Comment: This patch creates static string objects for all built-in names and interns them on initialization. The macro PyNAME is used to access static names. PyNAME(__spam__) is equivalent to PyString_InternFromString("__spam__") but is a constant expression. It requires the name to be one of the built-in names. A linker error will be generated if it isn't. Most conversions of C strings into temporary string objects can be eliminated (PyString_FromString, PyString_InternFromString). Most string comparisons at runtime can also be eliminated. Instead of: if (strcmp(PyString_AsString(name), "__spam__") == 0) ... This code can be used: PyString_INTERN(name); if (name == PyNAME(__spam__)) ... Where PyString_INTERN is a fast inline check if the string is already interned (and it usually is). To prevent unbounded accumulation of interned strings the mortal interned string patch should also be applied. The patch converts most of the builtin module to this new mode as an example.
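The idea in the comment above, replacing strcmp() with a pointer comparison once names are interned, can be sketched in plain C without the interpreter. This is only an illustration under stated assumptions: the Name struct, name_spam, name_doc, intern_lookup() and is_spam() are hypothetical stand-ins, not the patch's PyNAME machinery and not CPython's string object. Note that strcmp() returns 0 on a match, so the string form of the test needs "== 0".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an interned string object; not CPython's type. */
typedef struct {
    const char *text;
} Name;

/* Static names defined once, loosely analogous to the patch's built-in names. */
static Name name_spam = { "__spam__" };
static Name name_doc  = { "__doc__" };

static Name *intern_table[] = { &name_spam, &name_doc };

/* Return the canonical Name for text, or NULL if it is not a known name.
   This is the only place that compares characters. */
static Name *intern_lookup(const char *text)
{
    size_t i;
    for (i = 0; i < sizeof intern_table / sizeof intern_table[0]; i++) {
        if (strcmp(intern_table[i]->text, text) == 0)
            return intern_table[i];
    }
    return NULL;
}

/* Once a name is interned, equality is a pointer comparison. */
static int is_spam(const Name *name)
{
    return name == &name_spam;
}

/* Small self-check showing the intended use. */
static void interning_demo(void)
{
    char buf[] = "__spam__";   /* same characters, different address */
    assert(is_spam(intern_lookup(buf)));
    assert(!is_spam(intern_lookup("__doc__")));
}
```

The sketch also shows why the static object has to be the one stored in the table: if a second object with the same characters were registered first, the pointer comparison in is_spam() would fail even though the text matches.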
---------------------------------------------------------------------- >Comment By: Oren Tirosh (orenti) Date: 2002-08-15 17:45 Message: Logged In: YES user_id=562624 The code size increase is not surprising - all names appear twice in the executable: once as C strings and again as static PyStringObjects. This duplication can be eliminated. I'm surprised that there is *any* speed increase because I barely changed any code to make use of this. This is very encouraging. The localization and forced recompilation issues you raise are not really relevant because this MUST NOT be used for anything but builtin names and builtins are not added so frequently. Even standard modules should not declare static names. The interning of static strings must be done before the interpreter is initialized to ensure that the static name is the interned name. If you intern a static name after the same name has already been interned elsewhere the static object will not be the one true interned version and static references to it will be incorrect. Actually, the macro PyNAME is not required any more and the actual symbol name can be used. I used the macro to do typecasting but it's no longer necessary because I found a way to make the static names real PyObjects (probably the only place where something is actually defined as a PyObject!) ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-15 15:58 Message: Logged In: YES user_id=6380 Strangely, I measured a code size *increase*. Strangely, because most object files didn't increase in text size, but the resulting binary did, adding about 20K text and 17K data. The only object file that changed sizes at all (according to "size */*.o" on Linux) was Python/bltinmodule.o, which grew less than 500 bytes in text size. The only new object file, Python/staticnames.o, has 48 bytes text and 17K bytes data.
Maybe the added text size could be because of more cross-file references added by the linker??? (Files that referenced a local static char string constant now reference a static object in Python/staticnames.o.) I do see about a 1% speed increase for pystone. But I agree with Martin's comments on the "readability" issue. There's also a localization property that's lost: whenever a new name is added, you must update staticnames.h, staticnames.c, *and* the file where it is used. That's not nice (and not just because it forces a recompilation of the world because a header file was touched that everybody includes). *If* this were ever accepted, the mechanism to (re)generate staticnames.h automatically should be checked in as well. In general, I've found that string literals hidden inside macros using stringification (#) are a detriment to code maintainability -- I've often had the situation where I *knew* there had to be a string literal for some name somewhere, but I couldn't find it because of this. Same for name concatenation (##); it often means that you know there's a function name somewhere but a grep through the sources won't find it. Very painful when tracking down problems. I deployed a bunch of tricks like this in early versions of typeobject.c, and ended up expanding almost all of them: a little more typing perhaps, but explicit is better than implicit, and a search for slot_nb_add will at least find the macro that defines it; ditto a search for "__add__" (with the quotes) will find where it is used. I guess that's about a -0.5 from me. Unless someone else steps up to champion this soon, it's dead. :-) PS. I used cvs add for new files and then cvs diff -c -N to create diffs that include new files; cvs produces output looking like a diff against /dev/null, and patch understands those (see the fixed patch I uploaded). But maybe if you only have anonymous CVS the cvs add won't work, and cvs diff won't give you diffs for files it knows nothing about. IOW YMMV.
:-) ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-15 15:22 Message: Logged In: YES user_id=6380 Just for kicks I produced a forward diff, also adding the necessary changes to Makefile.pre.in that were mysteriously missing from the original, and fixing this for the very latest CVS (up to and including Michael Hudson's set_lineno patch). ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-08-12 12:48 Message: Logged In: YES user_id=21627 It is difficult to see how this patch achieves the goal of readability: - many variables are not used at all, e.g. ArithmethicError; it is not clear how using them would improve readability. - the change from SETBUILTIN("None", Py_None); to SETBUILTIN(PyNAME(None), Py_None); makes it more difficult to read, not easier. Furthermore, the name "None" isn't used anywhere except this initialisation. - likewise, the changes from {"abs", builtin_abs, METH_O, abs_doc}, to {PyNAMEC(abs), builtin_abs, METH_O, abs_doc}, make the code harder to read. ---------------------------------------------------------------------- Comment By: Oren Tirosh (orenti) Date: 2002-08-12 12:10 Message: Logged In: YES user_id=562624 If these changes are applied throughout the interpreter I expect a significant speedup but that is not my immediate goal. I am looking for readability, reduction of code size (both source and binary) and reliability (fewer things to check or forget to check). I am trying to get rid of code like if (docstr == NULL) { docstr = PyString_InternFromString("__doc__"); if (docstr == NULL) return NULL; } And replace it with just PyNAME(__doc__) ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-08-12 07:43 Message: Logged In: YES user_id=21627 What is the rationale for this patch?
If it is for performance, what real speed improvements can you report? ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-11 15:54 Message: Logged In: YES user_id=33168 I'd like to see the releasing interned string patch applied. I think it's almost ready, isn't it? It would make patching easier and seems to be a good idea. For me, the easiest way to produce patches is to use cvs. You can keep multiple cvs trees around easily enough (for having multiple overlapping/independent patches). To create patches with cvs: cvs diff -C 5 [file1] [file2] .... ---------------------------------------------------------------------- Comment By: Oren Tirosh (orenti) Date: 2002-08-11 15:44 Message: Logged In: YES user_id=562624 Ok, I'll fix the problems with the patch. What's the best way to produce a patch that adds new files? Static string objects cannot be released, of course. This patch will eventually depend on the mortal interned strings patch to fix it but in the meantime I just disabled releasing interned strings because I want to keep the two patches independent. The next version will add a new PyArg_ParseTuple format for an interned string to make it easier to use. ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-11 14:46 Message: Logged In: YES user_id=33168 Couple of initial comments: * this is a reverse patch * it seems like there are other changes in here - int ob_shash -> long - releasing interned strings? * dictobject.c is removed? * including python headers should use "" not <> Oren, could you generate a new patch with only the changes to support PyNAME? Thanks!
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=593627&group_id=5470 From noreply@sourceforge.net Thu Aug 15 19:09:01 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 15 Aug 2002 11:09:01 -0700 Subject: [Patches] [ python-Patches-593627 ] Static names Message-ID: Patches item #593627, was opened at 2002-08-11 06:41 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=593627&group_id=5470 Category: Core (C code) Group: None Status: Open Resolution: None Priority: 3 Submitted By: Oren Tirosh (orenti) Assigned to: Guido van Rossum (gvanrossum) Summary: Static names Initial Comment: This patch creates static string objects for all built-in names and interns them on initialization. The macro PyNAME is used to access static names. PyNAME(__spam__) is equivalent to PyString_InternFromString("__spam__") but is a constant expression. It requires the name to be one of the built-in names. A linker error will be generated if it isn't. Most conversions of C strings into temporary string objects can be eliminated (PyString_FromString, PyString_InternFromString). Most string comparisons at runtime can also be eliminated. Instead of: if (strcmp(PyString_AsString(name), "__spam__") == 0) ... This code can be used: PyString_INTERN(name); if (name == PyNAME(__spam__)) ... Where PyString_INTERN is a fast inline check if the string is already interned (and it usually is). To prevent unbounded accumulation of interned strings the mortal interned string patch should also be applied. The patch converts most of the builtin module to this new mode as an example.
---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-15 14:09 Message: Logged In: YES user_id=6380 > I'm surprised that there is *any* speed increase > because I barely changed any code to make use of > this. This is very encouraging. Don't get too excited. Speedups and slowdowns in the order of 1% are usually random cache effects having to do with common portions of the VM main loop having a cache line conflict; I've seen a case where adding an *unreachable* printf() call predictably changed the pystone speed by 1%. > The localization and forced recompilation > issues you raise are not really relevant because > this MUST NOT be used for anything but builtin > names and builtins are not added so > frequently. Even standard modules should not > declare static names. Then why do I see all signal names in your list? And all exception names? > Actually, the macro PyNAME is not required any > more and the actual symbol name can be used. I > used the macro to do typecasting but it's no > longer necessary because I found a way to make > the static names real PyObjects (probably the > only place where something is actually defined > as a PyObject!) But the string is still more helpful in the code than the symbol name. Sorry, but none of this changes my position; you'll have to find another champion. ---------------------------------------------------------------------- Comment By: Oren Tirosh (orenti) Date: 2002-08-15 13:45 Message: Logged In: YES user_id=562624 The code size increase is not surprising - all names appear twice in the executable: once as C strings and again as static PyStringObjects. This duplication can be eliminated. I'm surprised that there is *any* speed increase because I barely changed any code to make use of this. This is very encouraging.
The localization and forced recompilation issues you raise are not really relevant because this MUST NOT be used for anything but builtin names and builtins are not added so frequently. Even standard modules should not declare static names. The interning of static strings must be done before the interpreter is initialized to ensure that the static name is the interned name. If you intern a static name after the same name has already been interned elsewhere the static object will not be the one true interned version and static references to it will be incorrect. Actually, the macro PyNAME is not required any more and the actual symbol name can be used. I used the macro to do typecasting but it's no longer necessary because I found a way to make the static names real PyObjects (probably the only place where something is actually defined as a PyObject!) ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-15 11:58 Message: Logged In: YES user_id=6380 Strangely, I measured a code size *increase*. Strangely, because most object files didn't increase in text size, but the resulting binary did, adding about 20K text and 17K data. The only object file that changed sizes at all (according to "size */*.o" on Linux) was Python/bltinmodule.o, which grew less than 500 bytes in text size. The only new object file, Python/staticnames.o, has 48 bytes text and 17K bytes data. Maybe the added text size could be because of more cross-file references added by the linker??? (Files that referenced a local static char string constant now reference a static object in Python/staticnames.o.) I do see about a 1% speed increase for pystone. But I agree with Martin's comments on the "readability" issue. There's also a localization property that's lost: whenever a new name is added, you must update staticnames.h, staticnames.c, *and* the file where it is used.
That's not nice (and not just because it forces a recompilation of the world because a header file was touched that everybody includes). *If* this were ever accepted, the mechanism to (re)generate staticnames.h automatically should be checked in as well. In general, I've found that string literals hidden inside macros using stringification (#) are a detriment to code maintainability -- I've often had the situation where I *knew* there had to be a string literal for some name somewhere, but I couldn't find it because of this. Same for name concatenation (##); it often means that you know there's a function name somewhere but a grep through the sources won't find it. Very painful when tracking down problems. I deployed a bunch of tricks like this in early versions of typeobject.c, and ended up expanding almost all of them: a little more typing perhaps, but explicit is better than implicit, and a search for slot_nb_add will at least find the macro that defines it; ditto a search for "__add__" (with the quotes) will find where it is used. I guess that's about a -0.5 from me. Unless someone else steps up to champion this soon, it's dead. :-) PS. I used cvs add for new files and then cvs diff -c -N to create diffs that include new files; cvs produces output looking like a diff against /dev/null, and patch understands those (see the fixed patch I uploaded). But maybe if you only have anonymous CVS the cvs add won't work, and cvs diff won't give you diffs for files it knows nothing about. IOW YMMV. :-) ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-15 11:22 Message: Logged In: YES user_id=6380 Just for kicks I produced a forward diff, also adding the necessary changes to Makefile.pre.in that were mysteriously missing from the original, and fixing this for the very latest CVS (up to and including Michael Hudson's set_lineno patch).
---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-08-12 08:48 Message: Logged In: YES user_id=21627 It is difficult to see how this patch achieves the goal of readability: - many variables are not used at all, e.g. ArithmethicError; it is not clear how using them would improve readability. - the change from SETBUILTIN("None", Py_None); to SETBUILTIN(PyNAME(None), Py_None); makes it more difficult to read, not easier. Furthermore, the name "None" isn't used anywhere except this initialisation. - likewise, the changes from {"abs", builtin_abs, METH_O, abs_doc}, to {PyNAMEC(abs), builtin_abs, METH_O, abs_doc}, make the code harder to read. ---------------------------------------------------------------------- Comment By: Oren Tirosh (orenti) Date: 2002-08-12 08:10 Message: Logged In: YES user_id=562624 If these changes are applied throughout the interpreter I expect a significant speedup but that is not my immediate goal. I am looking for readability, reduction of code size (both source and binary) and reliability (fewer things to check or forget to check). I am trying to get rid of code like if (docstr == NULL) { docstr = PyString_InternFromString("__doc__"); if (docstr == NULL) return NULL; } And replace it with just PyNAME(__doc__) ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-08-12 03:43 Message: Logged In: YES user_id=21627 What is the rationale for this patch? If it is for performance, what real speed improvements can you report? ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-11 11:54 Message: Logged In: YES user_id=33168 I'd like to see the releasing interned string patch applied. I think it's almost ready, isn't it? It would make patching easier and seems to be a good idea. For me, the easiest way to produce patches is to use cvs.
You can keep multiple cvs trees around easily enough (for having multiple overlapping/independent patches). To create patches with cvs: cvs diff -C 5 [file1] [file2] .... ---------------------------------------------------------------------- Comment By: Oren Tirosh (orenti) Date: 2002-08-11 11:44 Message: Logged In: YES user_id=562624 Ok, I'll fix the problems with the patch. What's the best way to produce a patch that adds new files? Static string objects cannot be released, of course. This patch will eventually depend on the mortal interned strings patch to fix it but in the meantime I just disabled releasing interned strings because I want to keep the two patches independent. The next version will add a new PyArg_ParseTuple format for an interned string to make it easier to use. ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-11 10:46 Message: Logged In: YES user_id=33168 Couple of initial comments: * this is a reverse patch * it seems like there are other changes in here - int ob_shash -> long - releasing interned strings? * dictobject.c is removed? * including python headers should use "" not <> Oren, could you generate a new patch with only the changes to support PyNAME? Thanks!
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=593627&group_id=5470 From noreply@sourceforge.net Thu Aug 15 19:18:31 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 15 Aug 2002 11:18:31 -0700 Subject: [Patches] [ python-Patches-554192 ] mimetypes: all extensions for a type Message-ID: Patches item #554192, was opened at 2002-05-09 13:31 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=554192&group_id=5470 Category: Library (Lib) Group: None Status: Open Resolution: None Priority: 5 Submitted By: Walter Dörwald (doerwalter) Assigned to: Martin v. Löwis (loewis) Summary: mimetypes: all extensions for a type Initial Comment: This patch adds a function guess_all_extensions to mimetypes.py. This function returns all known extensions for a given type, not just the first one found in the types_map dictionary. guess_extension is still present and returns the first from the list. ---------------------------------------------------------------------- >Comment By: Barry A. Warsaw (bwarsaw) Date: 2002-08-15 14:18 Message: Logged In: YES user_id=12800 If add_type() is going to be public, shouldn't it have a "strict" flag to decide whether to add it to the standard types dict or the common types dict? ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-08-15 13:40 Message: Logged In: YES user_id=89016 diff2.txt adds the global version of add_type and the documentation in Doc/lib/libmimetypes.tex. ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-07-31 07:24 Message: Logged In: YES user_id=89016 OK, I'll change the patch and post the question to python-dev next week (I'm on vacation right now).
---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-07-30 08:34 Message: Logged In: YES user_id=21627 I'm in favour of exposing it on the module level. If you are uncertain, you might want to ask on python-dev. ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-07-30 07:00 Message: Logged In: YES user_id=89016 It *is* used in two spots: The constructor and the readfp method. But exposing it at the module level could make sense, because it is the atomic method of adding mime type information. So should I change the patch to expose it at the module level and change the LaTeX documentation accordingly? ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-07-29 04:44 Message: Logged In: YES user_id=21627 I can't see the point of making it private, since it is not used inside the module. If you plan to use it, that usage certainly is outside of the module, so the method would be public. If it is public, it needs to be exposed on the module level, and it needs to be documented. ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-07-29 04:23 Message: Logged In: YES user_id=89016 The patch adds an inverted mapping (i.e. mapping from type to a list of extensions). add_type simplifies adding a type<->ext mapping to both dictionaries. If this method should not be exposed we could make the name private. (_add_type) ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-07-28 06:30 Message: Logged In: YES user_id=21627 What is the role of add_type in this patch?
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=554192&group_id=5470 From noreply@sourceforge.net Thu Aug 15 19:47:38 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 15 Aug 2002 11:47:38 -0700 Subject: [Patches] [ python-Patches-492105 ] Import from Zip archive Message-ID: Patches item #492105, was opened at 2001-12-12 17:21 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=492105&group_id=5470 Category: Core (C code) Group: Python 2.3 Status: Open Resolution: Out of Date Priority: 5 Submitted By: James C. Ahlstrom (ahlstromjc) Assigned to: Neal Norwitz (nnorwitz) Summary: Import from Zip archive Initial Comment: This is the "final" patch to support imports from zip archives, and directory caching using os.listdir(). It replaces patch 483466 and 476047. It is a separate patch since I can't delete file attachments. It adds support for importing from "" and from relative paths. ---------------------------------------------------------------------- >Comment By: James C. Ahlstrom (ahlstromjc) Date: 2002-08-15 18:47 Message: Logged In: YES user_id=64929 Here is a diff -c against today's import.c. It is untested because I didn't get the whole tree, but it is way closer than the old patch. I moved the find_module() loop over file suffixes into search_using_fopen(), a trivial change which produces a large diff. To make the change obvious, I didn't correct the indenting, so please do that after looking at the patch. Next I will download the whole tree and check the other changes once I get a little time. ---------------------------------------------------------------------- Comment By: James C. Ahlstrom (ahlstromjc) Date: 2002-08-15 15:18 Message: Logged In: YES user_id=64929 I just grabbed the CVS import.c (only). I will edit this to add my changes and submit it as a new import.c patch. 
This should help, although I can't test it unless I download the whole tree. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-15 15:00 Message: Logged In: YES user_id=6380 Alas, the 2.2.1 diff doesn't help much. Current CVS is what we need. :-( ---------------------------------------------------------------------- Comment By: James C. Ahlstrom (ahlstromjc) Date: 2002-08-15 14:44 Message: Logged In: YES user_id=64929 Here is the import.c diff -c against Python-2.2.1. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-15 14:20 Message: Logged In: YES user_id=6380 I think a new patch just for import.c would be helpful. ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-15 14:09 Message: Logged In: YES user_id=33168 James, could you look at what Guido reworked? If that is fine, I can push it forward. Otherwise, feel free to update the patch. If I do any work on it, I'll make comments here so we don't duplicate work. Thanks. ---------------------------------------------------------------------- Comment By: James C. Ahlstrom (ahlstromjc) Date: 2002-08-15 13:37 Message: Logged In: YES user_id=64929 This patch is old. I can provide a new patch against Python 2.2.1 if that would help. Or a new patch just for import.c against 2.2.1. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-15 02:40 Message: Logged In: YES user_id=6380 No, the failing hunk is included in dashc-2. I actually *edited* the patch file until all but that one hunk succeeded. Thanks for looking into this! If you need help don't fail to ask on python-dev, I read it daily. 
:-) ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-15 01:38 Message: Logged In: YES user_id=33168 Reassigning to me. I'll give it a shot. It will take a while. Do I understand you correctly that your updated patch (dashc-2) has all the necessary pieces, except for the import hunk 11 that was rejected? And I need to get that failed hunk from the original patch? ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-15 01:32 Message: Logged In: YES user_id=6380 Whoa... That's a lot. Neal, do you think you could come up with a reworked version of this? That would be great! ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-14 23:30 Message: Logged In: YES user_id=33168 I've reviewed most of the updated version. Quick summary: * strncpy() doesn't always null terminate strings, this needs to be added * there seem to be many leaked references Here are more details: * strncpy(dest, str, size) doesn't guarantee the result to be null terminated, need dest[size] = '\0'; after all strncpy()s (I see a bunch in getpath.c) * getpath.c:get_sys_path_0(), I think some compilers warn when you do char[i] = 0; (use '\0' instead) (there may be other places of this) * import.c:PyImport_InitZip(), have_zlib is an alias to zlib and isn't really helpful/necessary * import.c:get_path_type() can leak a reference to pyobj if it's an int * import.c:add_directory_names() pylist reference is leaked * import.c:add_zip_names(), memcmp(path+i, ".zip", 4) is clearer to me than path[i] == '.' .... * import.c:add_zip_names, there's a lot of magic #s in this function * import.c:add_zip_names: leaks refs when doing PyTuple_SetItem I think there were other leaked references. I'll try to spend some more time later.
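The strncpy() point in the review above can be illustrated with a small sketch. The helper name copy_terminated is hypothetical and does not come from the patch. It assumes destsize is the full capacity of dest, so the terminator goes at dest[destsize - 1]; writing dest[size] = '\0' as in the review is only safe when the buffer holds at least size + 1 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy src into dest, whose total capacity is destsize
   bytes, and always leave dest null terminated. */
static void copy_terminated(char *dest, const char *src, size_t destsize)
{
    if (destsize == 0)
        return;                       /* no room even for the terminator */
    /* strncpy stops after destsize - 1 bytes and writes no terminator
       when src is that long or longer ... */
    strncpy(dest, src, destsize - 1);
    /* ... so terminate explicitly in the last byte of the buffer. */
    dest[destsize - 1] = '\0';
}

/* Small self-check showing truncation and the short-source case. */
static void copy_terminated_demo(void)
{
    char buf[4];
    copy_terminated(buf, "abcdefgh", sizeof buf);
    assert(strcmp(buf, "abc") == 0);
    copy_terminated(buf, "hi", sizeof buf);
    assert(strcmp(buf, "hi") == 0);
}
```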
---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-14 20:50 Message: Logged In: YES user_id=6380 Sigh. We waited too long for this one, and now the patch is hopelessly out of date. I managed to fix most of the failing hunks, but the remaining hunk that fails (#11 in import.c) is a whopper: the saved import.c.rej is 270 lines long. I'm going to sleep on that, but I could use some help. In the meantime, I'm uploading an edited version of dashc.diff to help the next person. ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-07-28 10:54 Message: Logged In: YES user_id=21627 Is this patch ready to be applied? ---------------------------------------------------------------------- Comment By: Jeremy Hylton (jhylton) Date: 2002-06-12 15:05 Message: Logged In: YES user_id=31392 Deleting the old diffs that Jim couldn't delete. ---------------------------------------------------------------------- Comment By: James C. Ahlstrom (ahlstromjc) Date: 2002-03-15 17:27 Message: Logged In: YES user_id=64929 I added a diff -c version of the patch. ---------------------------------------------------------------------- Comment By: James C. Ahlstrom (ahlstromjc) Date: 2002-03-15 17:03 Message: Logged In: YES user_id=64929 I still can't delete files, but I added a new file which contains all diffs as a single file, and is made from the current CVS tree (Mar 15, 2002).
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=492105&group_id=5470 From noreply@sourceforge.net Thu Aug 15 20:27:57 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 15 Aug 2002 12:27:57 -0700 Subject: [Patches] [ python-Patches-595703 ] Replace (most) strncpy calls w/ strlcpy Message-ID: Patches item #595703, was opened at 2002-08-15 19:27 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=595703&group_id=5470 Category: Core (C code) Group: None Status: Open Resolution: None Priority: 5 Submitted By: Neil Schemenauer (nascheme) Assigned to: Nobody/Anonymous (nobody) Summary: Replace (most) strncpy calls w/ strlcpy Initial Comment: I thought there was a bug or patch regarding this issue but I can't find it now. The Python interpreter has quite a few calls to strncpy. Most of the calls intend to copy a string without overflowing the destination buffer. strncpy is ill suited for this purpose. It copies too much data and does not guarantee that the destination string is null terminated. strlcpy has been designed for this purpose and should be used instead. Since strlcpy is not available on all platforms I've written a version that can be used if it is missing. The BSD version unfortunately carries the annoying advertising requirement so it can't be used. Please review the strlcpy implementation. The patches to change the interpreter to use it are coming.
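For readers who have not met strlcpy, a fallback along the lines described above might look roughly like this. It is a minimal sketch written from the commonly documented BSD semantics (copy at most size-1 bytes, always terminate when size is nonzero, return the length of the source); it is not the implementation attached to this patch, and the name sketch_strlcpy is made up here to avoid clashing with a system strlcpy.

```c
#include <stddef.h>
#include <string.h>

/* Sketch of a strlcpy-style copy (hypothetical name, not the patch's code).
   Copies at most size-1 bytes of src into dest, NUL-terminates dest when
   size > 0, and returns strlen(src) so the caller can detect truncation. */
static size_t
sketch_strlcpy(char *dest, const char *src, size_t size)
{
    size_t srclen = strlen(src);

    if (size > 0) {
        size_t n = srclen < size - 1 ? srclen : size - 1;
        memcpy(dest, src, n);   /* only the bytes that fit */
        dest[n] = '\0';         /* always terminated */
    }
    return srclen;
}
```

A caller would typically treat a return value greater than or equal to the buffer size as a sign that the copy was truncated.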
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=595703&group_id=5470 From noreply@sourceforge.net Thu Aug 15 20:43:59 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 15 Aug 2002 12:43:59 -0700 Subject: [Patches] [ python-Patches-595703 ] Replace (most) strncpy calls w/ strlcpy Message-ID: Patches item #595703, was opened at 2002-08-15 19:27 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=595703&group_id=5470 Category: Core (C code) Group: None Status: Open Resolution: None >Priority: 3 Submitted By: Neil Schemenauer (nascheme) >Assigned to: Martin v. Löwis (loewis) Summary: Replace (most) strncpy calls w/ strlcpy Initial Comment: I thought there was a bug or patch regarding this issue but I can't find it now. The Python interpreter has quite a few calls to strncpy. Most of the calls intend to copy a string without overflowing the destination buffer. strncpy is ill suited for this purpose. It copies too much data and does not guarantee that the destination string is null terminated. strlcpy has been designed for this purpose and should be used instead. Since strlcpy is not available on all platforms I've written a version that can be used if it is missing. The BSD version unfortunately carries the annoying advertising requirement so it can't be used. Please review the strlcpy implementation. The patches to change the interpreter to use it are coming. ---------------------------------------------------------------------- >Comment By: Neil Schemenauer (nascheme) Date: 2002-08-15 19:43 Message: Logged In: YES user_id=35752 Patch to make strlcpy available to the interpreter. No calls to strncpy have been changed.
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=595703&group_id=5470 From noreply@sourceforge.net Thu Aug 15 21:35:45 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 15 Aug 2002 13:35:45 -0700 Subject: [Patches] [ python-Patches-595703 ] Replace (most) strncpy calls w/ strlcpy Message-ID: Patches item #595703, was opened at 2002-08-15 15:27 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=595703&group_id=5470 Category: Core (C code) Group: None Status: Open Resolution: None Priority: 3 Submitted By: Neil Schemenauer (nascheme) Assigned to: Martin v. Löwis (loewis) Summary: Replace (most) strncpy calls w/ strlcpy Initial Comment: I thought there was a bug or patch regarding this issue but I can't find it now. The Python interpreter has quite a few calls to strncpy. Most of the calls intend to copy a string without overflowing the destination buffer. strncpy is ill suited for this purpose. It copies too much data and does not guarantee that the destination string is null terminated. strlcpy has been designed for this purpose and should be used instead. Since strlcpy is not available on all platforms I've written a version that can be used if it is missing. The BSD version unfortunately carries the annoying advertising requirement so it can't be used. Please review the strlcpy implementation. The patches to change the interpreter to use it are coming. ---------------------------------------------------------------------- >Comment By: Tim Peters (tim_one) Date: 2002-08-15 16:35 Message: Logged In: YES user_id=31435 Neil, the patch you're thinking of is attached to this bug tracker item: [487703] Replace strcat, strcpy We should close one of these guys as a duplicate. I apologize for sitting on that bug for so long! It just hasn't seemed a priority.
BTW, I expect this is so straightforward that you should just check in appropriate changes at will. ---------------------------------------------------------------------- Comment By: Neil Schemenauer (nascheme) Date: 2002-08-15 15:43 Message: Logged In: YES user_id=35752 Patch to make strlcpy available to the interpreter. No calls to strncpy have been changed. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=595703&group_id=5470 From noreply@sourceforge.net Thu Aug 15 21:37:51 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 15 Aug 2002 13:37:51 -0700 Subject: [Patches] [ python-Patches-580995 ] new version of Set class Message-ID: Patches item #580995, was opened at 2002-07-13 11:53 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=580995&group_id=5470 Category: Library (Lib) Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Alex Martelli (aleax) Assigned to: Guido van Rossum (gvanrossum) Summary: new version of Set class Initial Comment: As per python-dev discussion on Sat 13 July 2002, subject "Dict constructor". A version of Greg Wilson's sandbox Set class that avoids the trickiness of implicitly freezing a set when __hash__ is called on it. Rather, uses several classes: Set itself has no __hash__ and represents a general, mutable set; BaseSet, its superclass, has all functionality common to mutable and immutable sets; ImmutableSet also subclasses BaseSet and adds __hash__; a wrapper _TemporarilyImmutableSet wraps a Set exposing only __hash__ (identical to that an ImmutableSet built from the Set would have) and __eq__ and __ne__ (delegated to the Set instance). Set.add(self, x) attempts to call x=x._asImmutable() (if AttributeError leaves x alone); Set._asImmutable(self) returns ImmutableSet(self). 
Membership test BaseSet.__contains__(self, x) attempts to call x = x._asTemporarilyImmutable() (if AttributeError leaves x alone); Set._asTemporarilyImmutable(self) returns TemporarilyImmutableSet(self). I've left Greg's code mostly alone otherwise except for fixing bugs/obsolescent usage (e.g. dictionary rather than dict) and making what were ValueError into TypeError (ValueError was doubtful earlier, is untenable now that mutable and immutable sets are different types). The change in exceptions forced me to change the unit tests in test_set.py, too, but I made no other changes nor additions. ---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-15 16:37 Message: Logged In: YES user_id=6380 Thanks, Alex! I've checked in a major rewrite of this in /nondist/sandbox/sets/set.py, replacing Greg V. Wilson's version. ---------------------------------------------------------------------- Comment By: Alex Martelli (aleax) Date: 2002-07-18 16:27 Message: Logged In: YES user_id=60314 Changed as per GvR comments so now sets have-a dict rather than being-a dict. Made code more direct in some places (using list comprehensions rather than loops where appropriate).
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=580995&group_id=5470 From noreply@sourceforge.net Thu Aug 15 21:48:08 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 15 Aug 2002 13:48:08 -0700 Subject: [Patches] [ python-Patches-595703 ] Replace (most) strncpy calls w/ strlcpy Message-ID: Patches item #595703, was opened at 2002-08-15 21:27 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=595703&group_id=5470 Category: Core (C code) Group: None Status: Open Resolution: None Priority: 3 Submitted By: Neil Schemenauer (nascheme) >Assigned to: Nobody/Anonymous (nobody) Summary: Replace (most) strncpy calls w/ strlcpy Initial Comment: I thought there was a bug or patch regarding this issue but I can't find it now. The Python interpreter has quite a few calls to strncpy. Most of the calls intend to copy a string without overflowing the destination buffer. strncpy is ill suited for this purpose. It copies too much data and does not guarantee that the destination string is null terminated. strlcpy has been designed for this purpose and should be used instead. Since strlcpy is not available on all platforms I've written a version that can be used if it is missing. The BSD version unfortunately carries the annoying advertising requirement so it can't be used. Please review the strlcpy implementation. The patches to change the interpreter to use it are coming. ---------------------------------------------------------------------- >Comment By: Martin v. Löwis (loewis) Date: 2002-08-15 22:48 Message: Logged In: YES user_id=21627 I'm strongly opposed to strlcpy. It's an invention that serves no real purpose, and I hope it won't find its way into Python. Instead, it should be sufficient to review all calls to strncpy for correctness.
It *is* possible to use strncpy in a safe way, and I suggest that the places where it is used unsafely are corrected. Since I'm with prejudice, I'm not really qualified to review the patch. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-08-15 22:35 Message: Logged In: YES user_id=31435 Neil, the patch you're thinking of is attached to this bug tracker item: [487703] Replace strcat, strcpy We should close one of these guys as a duplicate. I apologize for sitting on that bug for so long! It just hasn't seemed a priority. BTW, I expect this is so straightforward that you should just check in appropriate changes at will. ---------------------------------------------------------------------- Comment By: Neil Schemenauer (nascheme) Date: 2002-08-15 21:43 Message: Logged In: YES user_id=35752 Patch to make strlcpy available to the interpreter. No calls to strncpy have been changed. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=595703&group_id=5470 From noreply@sourceforge.net Thu Aug 15 22:31:57 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 15 Aug 2002 14:31:57 -0700 Subject: [Patches] [ python-Patches-595703 ] Replace (most) strncpy calls w/ strlcpy Message-ID: Patches item #595703, was opened at 2002-08-15 19:27 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=595703&group_id=5470 Category: Core (C code) Group: None Status: Open Resolution: None Priority: 3 Submitted By: Neil Schemenauer (nascheme) Assigned to: Nobody/Anonymous (nobody) Summary: Replace (most) strncpy calls w/ strlcpy Initial Comment: I thought there was a bug or patch regarding this issue but I can't find it now. The Python interpreter has quite a few calls to strncpy. Most of the calls intend to copy a string without overflowing the destination buffer.
strncpy is ill suited for this purpose. It copies too much data and does not guarantee that the destination string is null terminated. strlcpy has been designed for this purpose and should be used instead. Since strlcpy is not available on all platforms I've written a version that can be used if it is missing. The BSD version unfortunately carries the annoying advertising requirement so it can't be used. Please review the strlcpy implementation. The patches to change the interpreter to use it are coming. ---------------------------------------------------------------------- >Comment By: Neil Schemenauer (nascheme) Date: 2002-08-15 21:31 Message: Logged In: YES user_id=35752 See http://www.usenix.org/events/usenix99/full_papers/millert/millert_html/ for a paper on strlcpy. ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-08-15 20:48 Message: Logged In: YES user_id=21627 I'm strongly opposed to strlcpy. It's an invention that serves no real purpose, and I hope it won't find its way into Python. Instead, it should be sufficient to review all calls to strncpy for correctness. It *is* possible to use strncpy in a safe way, and I suggest that the places where it is used unsafely are corrected. Since I'm with prejudice, I'm not really qualified to review the patch. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-08-15 20:35 Message: Logged In: YES user_id=31435 Neil, the patch you're thinking of is attached to this bug tracker item: [487703] Replace strcat, strcpy We should close one of these guys as a duplicate. I apologize for sitting on that bug for so long! It just hasn't seemed a priority. BTW, I expect this is so straightforward that you should just check in appropriate changes at will.
---------------------------------------------------------------------- Comment By: Neil Schemenauer (nascheme) Date: 2002-08-15 19:43 Message: Logged In: YES user_id=35752 Patch to make strlcpy available to the interpreter. No calls to strncpy have been changed. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=595703&group_id=5470 From noreply@sourceforge.net Thu Aug 15 23:04:49 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 15 Aug 2002 15:04:49 -0700 Subject: [Patches] [ python-Patches-595703 ] Replace (most) strncpy calls w/ strlcpy Message-ID: Patches item #595703, was opened at 2002-08-15 19:27 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=595703&group_id=5470 Category: Core (C code) Group: None >Status: Closed Resolution: None Priority: 3 Submitted By: Neil Schemenauer (nascheme) Assigned to: Nobody/Anonymous (nobody) Summary: Replace (most) strncpy calls w/ strlcpy Initial Comment: I thought there was a bug or patch regarding this issue but I can't find it now. The Python interpreter has quite a few calls to strncpy. Most of the calls intend to copy a string without overflowing the destination buffer. strncpy is ill suited for this purpose. It copies too much data and does not guarantee that the destination string is null terminated. strlcpy has been designed for this purpose and should be used instead. Since strlcpy is not available on all platforms I've written a version that can be used if it is missing. The BSD version unfortunately carries the annoying advertising requirement so it can't be used. Please review the strlcpy implementation. The patches to change the interpreter to use it are coming.
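The two complaints about strncpy in the comment above ("copies too much data" and no guaranteed terminator) both follow from standard C behaviour: when the source is shorter than the count, strncpy pads the rest of the destination with NUL bytes, and when the source is at least as long as the count, no NUL is written at all. A small sketch that makes this visible, using a hypothetical helper name and an arbitrary 8-byte buffer:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: strncpy src into an 8-byte buffer that was
   pre-filled with 'x', then count how many NUL bytes the buffer holds. */
static size_t
nul_bytes_after_strncpy(const char *src)
{
    char buf[8];
    size_t i, count = 0;

    memset(buf, 'x', sizeof buf);    /* so untouched bytes are visible */
    strncpy(buf, src, sizeof buf);   /* pads with NULs, or writes none */
    for (i = 0; i < sizeof buf; i++) {
        if (buf[i] == '\0')
            count++;
    }
    return count;
}
```

For a 2-character source this reports 6 NUL bytes (the padding); for a source of 8 or more characters it reports 0, meaning the buffer is not a valid C string.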
---------------------------------------------------------------------- >Comment By: Neil Schemenauer (nascheme) Date: 2002-08-15 22:04 Message: Logged In: YES user_id=35752 See bug 487703 for some more discussion. It seems there is some controversy surrounding the strlcpy and strlcat functions. ---------------------------------------------------------------------- Comment By: Neil Schemenauer (nascheme) Date: 2002-08-15 21:31 Message: Logged In: YES user_id=35752 See http://www.usenix.org/events/usenix99/full_papers/millert/millert_html/ for a paper on strlcpy. ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-08-15 20:48 Message: Logged In: YES user_id=21627 I'm strongly opposed to strlcpy. It's an invention that serves no real purpose, and I hope it won't find its way into Python. Instead, it should be sufficient to review all calls to strncpy for correctness. It *is* possible to use strncpy in a safe way, and I suggest that the places where it is used unsafely are corrected. Since I'm with prejudice, I'm not really qualified to review the patch. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-08-15 20:35 Message: Logged In: YES user_id=31435 Neil, the patch you're thinking of is attached to this bug tracker item: [487703] Replace strcat, strcpy We should close one of these guys as a duplicate. I apologize for sitting on that bug for so long! It just hasn't seemed a priority. BTW, I expect this is so straightforward that you should just check in appropriate changes at will. ---------------------------------------------------------------------- Comment By: Neil Schemenauer (nascheme) Date: 2002-08-15 19:43 Message: Logged In: YES user_id=35752 Patch to make strlcpy available to the interpreter. No calls to strncpy have been changed.
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=595703&group_id=5470 From noreply@sourceforge.net Thu Aug 15 23:07:39 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 15 Aug 2002 15:07:39 -0700 Subject: [Patches] [ python-Patches-595703 ] Replace (most) strncpy calls w/ strlcpy Message-ID: Patches item #595703, was opened at 2002-08-15 19:27 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=595703&group_id=5470 Category: Core (C code) Group: None >Status: Open Resolution: None >Priority: 2 Submitted By: Neil Schemenauer (nascheme) Assigned to: Nobody/Anonymous (nobody) Summary: Replace (most) strncpy calls w/ strlcpy Initial Comment: I thought there was a bug or patch regarding this issue but I can't find it now. The Python interpreter has quite a few calls to strncpy. Most of the calls intend to copy a string without overflowing the destination buffer. strncpy is ill suited for this purpose. It copies too much data and does not guarantee that the destination string is null terminated. strlcpy has been designed for this purpose and should be used instead. Since strlcpy is not available on all platforms I've written a version that can be used if it is missing. The BSD version unfortunately carries the annoying advertising requirement so it can't be used. Please review the strlcpy implementation. The patches to change the interpreter to use it are coming. ---------------------------------------------------------------------- >Comment By: Neil Schemenauer (nascheme) Date: 2002-08-15 22:07 Message: Logged In: YES user_id=35752 Oops, didn't mean to close this just yet. ---------------------------------------------------------------------- Comment By: Neil Schemenauer (nascheme) Date: 2002-08-15 22:04 Message: Logged In: YES user_id=35752 See bug 487703 for some more discussion.
It seems there is some controversy surrounding the strlcpy and strlcat functions. ---------------------------------------------------------------------- Comment By: Neil Schemenauer (nascheme) Date: 2002-08-15 21:31 Message: Logged In: YES user_id=35752 See http://www.usenix.org/events/usenix99/full_papers/millert/millert_html/ for a paper on strlcpy. ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-08-15 20:48 Message: Logged In: YES user_id=21627 I'm strongly opposed to strlcpy. It's an invention that serves no real purpose, and I hope it won't find its way into Python. Instead, it should be sufficient to review all calls to strncpy for correctness. It *is* possible to use strncpy in a safe way, and I suggest that the places where it is used unsafely are corrected. Since I'm with prejudice, I'm not really qualified to review the patch. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-08-15 20:35 Message: Logged In: YES user_id=31435 Neil, the patch you're thinking of is attached to this bug tracker item: [487703] Replace strcat, strcpy We should close one of these guys as a duplicate. I apologize for sitting on that bug for so long! It just hasn't seemed a priority. BTW, I expect this is so straightforward that you should just check in appropriate changes at will. ---------------------------------------------------------------------- Comment By: Neil Schemenauer (nascheme) Date: 2002-08-15 19:43 Message: Logged In: YES user_id=35752 Patch to make strlcpy available to the interpreter. No calls to strncpy have been changed.
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=595703&group_id=5470 From noreply@sourceforge.net Thu Aug 15 23:23:13 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 15 Aug 2002 15:23:13 -0700 Subject: [Patches] [ python-Patches-472593 ] Changing the preferences mechanism Message-ID: Patches item #472593, was opened at 2001-10-19 00:34 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=472593&group_id=5470 Category: Macintosh Group: None >Status: Closed >Resolution: Wont Fix Priority: 5 Submitted By: Alexandre Parenteau (aubonbeurre) Assigned to: Jack Jansen (jackjansen) Summary: Changing the preferences mechanism Initial Comment: Proposal to enhance MacPython preferences: ------------------------------------------ - Motivation : when embedding MacPython in MacCvs, I realized the way MacPython is storing the preferences in a solid Mac handle is a serious problem for MacCvs in order to use several versions of MacPython, and still being able to control the MacPython resources. - The patch : it is not complete, it is *only* a proposal for an under mechanism which stores individually "Persistent" values, or values which have the ability to be retained/loaded/saved across several MacPython sessions. - The C side: example: defining a new persistent value is as simple as: static CPersistentInt version("version", POPT_VERSION_CURRENT); The value gets automatically linked to all the other persistent values so they can be loaded and stored all together. There is a set of pre-defined types of persistent values (int, bool, string) - The Python side : I have included a sample testpersistent.py which illustrates how the script can access, load, store the values. Note : the C++ implementation is just for convenience and RTTI is not used.
---------------------------------------------------------------------- >Comment By: Jack Jansen (jackjansen) Date: 2002-08-16 00:23 Message: Logged In: YES user_id=45365 Alexandre, I think that I won't fix this. OS9-based MacPython is quickly becoming a thing of the past, especially for developers (and 99% of the users of MacCVS will be developers, methinks). If you disagree reopen the report and convince me:-) ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=472593&group_id=5470 From noreply@sourceforge.net Fri Aug 16 01:26:27 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 15 Aug 2002 17:26:27 -0700 Subject: [Patches] [ python-Patches-595703 ] Replace (most) strncpy calls w/ strlcpy Message-ID: Patches item #595703, was opened at 2002-08-15 15:27 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=595703&group_id=5470 Category: Core (C code) Group: None >Status: Closed Resolution: None Priority: 2 Submitted By: Neil Schemenauer (nascheme) Assigned to: Nobody/Anonymous (nobody) Summary: Replace (most) strncpy calls w/ strlcpy Initial Comment: I thought there was a bug or patch regarding this issue but I can't find it now. The Python interpreter has quite a few calls to strncpy. Most of the calls intend to copy a string without overflowing the destination buffer. strncpy is ill suited for this purpose. It copies too much data and does not guarantee that the destination string is null terminated. strlcpy has been designed for this purpose and should be used instead. Since strlcpy is not available on all platforms I've written a version that can be used if it is missing. The BSD version unfortunately carries the annoying advertising requirement so it can't be used. Please review the strlcpy implementation. The patches to change the interpreter to use it are coming.
---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-15 20:26 Message: Logged In: YES user_id=6380 Well, *I* say we ignore it. ---------------------------------------------------------------------- Comment By: Neil Schemenauer (nascheme) Date: 2002-08-15 18:07 Message: Logged In: YES user_id=35752 Oops, didn't mean to close this just yet. ---------------------------------------------------------------------- Comment By: Neil Schemenauer (nascheme) Date: 2002-08-15 18:04 Message: Logged In: YES user_id=35752 See bug 487703 for some more discussion. It seems there is some controversy surrounding the strlcpy and strlcat functions. ---------------------------------------------------------------------- Comment By: Neil Schemenauer (nascheme) Date: 2002-08-15 17:31 Message: Logged In: YES user_id=35752 See http://www.usenix.org/events/usenix99/full_papers/millert/millert_html/ for a paper on strlcpy. ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-08-15 16:48 Message: Logged In: YES user_id=21627 I'm strongly opposed to strlcpy. It's an invention that serves no real purpose, and I hope it won't find its way into Python. Instead, it should be sufficient to review all calls to strncpy for correctness. It *is* possible to use strncpy in a safe way, and I suggest that the places where it is used unsafely are corrected. Since I'm with prejudice, I'm not really qualified to review the patch. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-08-15 16:35 Message: Logged In: YES user_id=31435 Neil, the patch you're thinking of is attached to this bug tracker item: [487703] Replace strcat, strcpy We should close one of these guys as a duplicate. I apologize for sitting on that bug for so long! It just hasn't seemed a priority.
BTW, I expect this is so straightforward that you should just check in appropriate changes at will. ---------------------------------------------------------------------- Comment By: Neil Schemenauer (nascheme) Date: 2002-08-15 15:43 Message: Logged In: YES user_id=35752 Patch to make strlcpy available to the interpreter. No calls to strncpy have been changed. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=595703&group_id=5470 From noreply@sourceforge.net Fri Aug 16 03:06:24 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 15 Aug 2002 19:06:24 -0700 Subject: [Patches] [ python-Patches-595821 ] --witch-cxx=c++ and correct LINKCC Message-ID: Patches item #595821, was opened at 2002-08-15 22:06 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=595821&group_id=5470 Category: Build Group: Python 2.2.x Status: Open Resolution: None Priority: 5 Submitted By: Jochen Küpper (kuepper) Assigned to: Nobody/Anonymous (nobody) Summary: --witch-cxx=c++ and correct LINKCC Initial Comment: Here LINKCC is not determined correctly when using configure --with-cxx=c++ This is a RedHat-7.1 based system, gcc-3.2, python from cvs.
The following patch against configure.in from "release22-maint" might be a little overkill, but it makes sure that the linker uses (and finds) libstdc++: Index: configure.in =================================================================== RCS file: /cvsroot/python/python/dist/src/configure.in,v retrieving revision 1.288.6.6 diff -u -r1.288.6.6 configure.in --- configure.in 2 Jun 2002 17:34:47 -0000 1.288.6.6 +++ configure.in 16 Aug 2002 02:04:47 -0000 @@ -279,7 +279,7 @@ if test -z "$CXX"; then LINKCC="\ \" else - echo 'int main(){return 0;}' > conftest.$ac_ext + echo '#include <string> int main(){string c('Hello'); return 0;}' > conftest.$ac_ext $CXX -c conftest.$ac_ext 2>&5 if $CC -o conftest$ac_exeext conftest.$ac_objext 2>&5 \ && test -s conftest$ac_exeext && ./conftest$ac_exeext ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=595821&group_id=5470 From noreply@sourceforge.net Fri Aug 16 03:42:21 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 15 Aug 2002 19:42:21 -0700 Subject: [Patches] [ python-Patches-595703 ] Replace (most) strncpy calls w/ strlcpy Message-ID: Patches item #595703, was opened at 2002-08-15 15:27 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=595703&group_id=5470 Category: Core (C code) Group: None Status: Closed >Resolution: Rejected Priority: 2 Submitted By: Neil Schemenauer (nascheme) Assigned to: Nobody/Anonymous (nobody) Summary: Replace (most) strncpy calls w/ strlcpy Initial Comment: I thought there was a bug or patch regarding this issue but I can't find it now. The Python interpreter has quite a few calls to strncpy. Most of the calls intend to copy a string without overflowing the destination buffer. strncpy is ill suited for this purpose. It copies too much data and does not guarantee that the destination string is null terminated.
strlcpy has been designed for this purpose and should be used instead. Since strlcpy is not available on all platforms I've written a version that can be used if it is missing. The BSD version unfortunately carries the annoying advertising requirement so it can't be used. Please review the strlcpy implementation. The patches to change the interpreter to use it are coming. ---------------------------------------------------------------------- >Comment By: Tim Peters (tim_one) Date: 2002-08-15 22:42 Message: Logged In: YES user_id=31435 Since Guido closed this again, based on his comment I expect he intended to reject it, so changed Resolution to Rejected. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-15 20:26 Message: Logged In: YES user_id=6380 Well, *I* say we ignore it. ---------------------------------------------------------------------- Comment By: Neil Schemenauer (nascheme) Date: 2002-08-15 18:07 Message: Logged In: YES user_id=35752 Oops, didn't mean to close this just yet. ---------------------------------------------------------------------- Comment By: Neil Schemenauer (nascheme) Date: 2002-08-15 18:04 Message: Logged In: YES user_id=35752 See bug 487703 for some more discussion. It seems there is some controversy surrounding the strlcpy and strlcat functions. ---------------------------------------------------------------------- Comment By: Neil Schemenauer (nascheme) Date: 2002-08-15 17:31 Message: Logged In: YES user_id=35752 See http://www.usenix.org/events/usenix99/full_papers/millert/millert_html/ for a paper on strlcpy. ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-08-15 16:48 Message: Logged In: YES user_id=21627 I'm strongly opposed to strlcpy. It's an invention that serves no real purpose, and I hope it won't find its way into Python.
Instead, it should be sufficient to review all calls to strncpy for correctness. It *is* possible to use strncpy in a safe way, and I suggest that the places where it is used unsafely are corrected. Since I'm prejudiced, I'm not really qualified to review the patch. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-08-15 16:35 Message: Logged In: YES user_id=31435 Neil, the patch you're thinking of is attached to this bug tracker item: [487703] Replace strcat, strcpy We should close one of these guys as a duplicate. I apologize for sitting on that bug for so long! It just hasn't seemed a priority. BTW, I expect this is so straightforward that you should just check in appropriate changes at will. ---------------------------------------------------------------------- Comment By: Neil Schemenauer (nascheme) Date: 2002-08-15 15:43 Message: Logged In: YES user_id=35752 Patch to make strlcpy available to the interpreter. No calls to strncpy have been changed. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=595703&group_id=5470 From noreply@sourceforge.net Fri Aug 16 05:26:11 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 15 Aug 2002 21:26:11 -0700 Subject: [Patches] [ python-Patches-595846 ] Update environ for CGIHTTPServer.py Message-ID: Patches item #595846, was opened at 2002-08-15 21:26 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=595846&group_id=5470 Category: Library (Lib) Group: Python 2.2.x Status: Open Resolution: None Priority: 5 Submitted By: Brett Cannon (bcannon) Assigned to: Nobody/Anonymous (nobody) Summary: Update environ for CGIHTTPServer.py Initial Comment: This patch causes CGIHTTPServer to update os.environ regardless of how it tries to handle calls (fork, popen*, etc.).
I discovered this when trying to run Quixote through CGIHTTPServer and getting errors reported by Quixote saying that the SCRIPT_NAME environment variable was not being updated. I noticed that if self.has_fork was true, then os.environ was never explicitly updated; os.execve() has the env dict passed to it but I guess that is not enough or OS X's os.execve() is broken. So this patch just calls os.environ.update(env) after the last change to the env dict but before the method has decided how it is going to deal with the call. It also removes the now extraneous calls previously used. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=595846&group_id=5470 From noreply@sourceforge.net Fri Aug 16 08:14:43 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Fri, 16 Aug 2002 00:14:43 -0700 Subject: [Patches] [ python-Patches-595821 ] --witch-cxx=c++ and correct LINKCC Message-ID: Patches item #595821, was opened at 2002-08-16 04:06 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=595821&group_id=5470 Category: Build Group: Python 2.2.x Status: Open Resolution: None Priority: 5 Submitted By: Jochen Küpper (kuepper) Assigned to: Nobody/Anonymous (nobody) Summary: --witch-cxx=c++ and correct LINKCC Initial Comment: Here LINKCC is not determined correctly when using configure --with-cxx=c++ This is a RedHat-7.1 based system, gcc-3.2, python from cvs.
The following patch against configure.in from "release22-maint" might be a little overkill, but it makes sure that the linker uses (and finds) libstdc++:
Index: configure.in
===================================================================
RCS file: /cvsroot/python/python/dist/src/configure.in,v
retrieving revision 1.288.6.6
diff -u -r1.288.6.6 configure.in
--- configure.in 2 Jun 2002 17:34:47 -0000 1.288.6.6
+++ configure.in 16 Aug 2002 02:04:47 -0000
@@ -279,7 +279,7 @@
 if test -z "$CXX"; then
 LINKCC="\ \"
 else
- echo 'int main(){return 0;}' > conftest.$ac_ext
+ echo '#include <string> int main(){string c('Hello'); return 0;}' > conftest.$ac_ext
 $CXX -c conftest.$ac_ext 2>&5
 if $CC -o conftest$ac_exeext conftest.$ac_objext 2>&5 \
 && test -s conftest$ac_exeext && ./conftest$ac_exeext
---------------------------------------------------------------------- >Comment By: Martin v. Löwis (loewis) Date: 2002-08-16 09:14 Message: Logged In: YES user_id=21627 Can you please elaborate what exactly you mean by "not determined correctly"? What did you do, what happened, what did you expect to happen, why do you think the observed behaviour is incorrect?
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=595821&group_id=5470 From noreply@sourceforge.net Fri Aug 16 10:00:40 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Fri, 16 Aug 2002 02:00:40 -0700 Subject: [Patches] [ python-Patches-589982 ] tempfile.py rewrite Message-ID: Patches item #589982, was opened at 2002-08-02 16:38 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=589982&group_id=5470 Category: Library (Lib) Group: Python 2.3 >Status: Open Resolution: Accepted Priority: 5 Submitted By: Zack Weinberg (zackw) Assigned to: Guido van Rossum (gvanrossum) Summary: tempfile.py rewrite Initial Comment: This rewrite closes a number of security-relevant races in tempfile.py; makes temporary filenames much harder to guess; provides secure interfaces that can be used to close similar races elsewhere; and makes it possible to control the prefix and directory of each temporary created, individually. ---------------------------------------------------------------------- >Comment By: Andrew I MacIntyre (aimacintyre) Date: 2002-08-16 19:00 Message: Logged In: YES user_id=250749 I hope I'm doing the right thing by re-opening this rather than opening a bug report... I'm seeing a very odd failure in test_tempfile on FreeBSD 4.4. The failure occurs when I run the full regression test with TESTOPTS="-l -u network" but succeeds when TESTOPTS="-l". ./python -E -tt Lib/test/regrtest.py -l -u network test_tempfile succeeds too, as does: ./python -E -tt Lib/test/regrtest.py -v -u network test_tempfile At this point, I haven't tried other -u option combinations.
The error log shows: test test_tempfile failed -- Traceback (most recent call last): File "/home/andymac/cvs/python/python-cvs/Lib/test/test_tempfile.py", line 345, in test_noinherit "child process exited successfully") File "/home/andymac/cvs/python/python-cvs/Lib/unittest.py", line 268, in failUnless if not expr: raise self.failureException, msg AssertionError: child process exited successfully Unfortunately Real Job has wiped out any time I might have had to try and debug this, and I won't have much if any time to look at this for about 3 weeks :-( Intuitive guesses about where to start looking would be welcome! The OS/2 EMX port has the mkstemped problem noted below for HP-UX, but I think I might not have picked up the fixes when I tested, so I'll have to check that again. ---------------------------------------------------------------------- Comment By: Jack Jansen (jackjansen) Date: 2002-08-15 05:44 Message: Logged In: YES user_id=45365 Never mind. Just saw the discussion on python-dev (this is a file descriptor returned, not a file pointer, so stdio is nowhere in sight). ---------------------------------------------------------------------- Comment By: Jack Jansen (jackjansen) Date: 2002-08-15 05:42 Message: Logged In: YES user_id=45365 Isn't it much more logical to give mkstemp() a mode="w+b" argument? The other routines have that as well, and it is also more in line with open() and such... ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-15 02:52 Message: Logged In: YES user_id=6380 Closing again. Reduced the number of temp files to 100. Changed 'binary=True' to 'text=False' default on mkstemp(). ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-14 23:16 Message: Logged In: YES user_id=6380 The "Too many open files" problem is solved.
The HP system was configured to allow only 200 open file descriptors per process. But maybe the test would work just as well if it tried to create 100 instead of 1000 temp files? I expect that this would cause failures on other systems with conservative limits. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-14 23:02 Message: Logged In: YES user_id=6380 I'd like to change the binary=True argument to mkstemp into text=False; that seems easier to explain. News about the HP errors from Kalle Svensson: Traceback (most recent call last): File "../python/dist/src/Lib/test/test_tempfile.py", line 719, in ? test_main() File "../python/dist/src/Lib/test/test_tempfile.py", line 716, in test_main test_support.run_suite(suite) File "/mp/slaskdisk/tmp/sfarmer/python/dist/src/Lib/test/test_support.py", line 188, in run_suite raise TestFailed(err) test.test_support.TestFailed: Traceback (most recent call last): File "../python/dist/src/Lib/test/test_tempfile.py", line 295, in test_basic_many File "../python/dist/src/Lib/test/test_tempfile.py", line 278, in do_create File "../python/dist/src/Lib/test/test_tempfile.py", line 33, in failOnException File "/mp/slaskdisk/tmp/sfarmer/python/dist/src/Lib/unittest.py", line 260, in fail AssertionError: _mkstemp_inner raised exceptions.OSError: [Errno 24] Too many open files: '/tmp/aaU3irrA' ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-11 14:28 Message: Logged In: YES user_id=6380 I'm reopening this just as a precaution. The snake farm reported two messages on HP-UX 11 when the test suite was run: Exception exceptions.AttributeError: "mkstemped instance has no attribute 'fd'" in > ignored Exception exceptions.AttributeError: "mkstemped instance has no attribute 'fd'" in > ignored The mkstemped class is defined in test_maketemp.py. 
That error can happen if a mkstemped instance isn't fully initialized, e.g. if the _mkstemp_inner() call in mkstemped.__init__ fails. But then I would have expected a failure reported, which I don't see... ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-10 14:17 Message: Logged In: YES user_id=6380 I'm reopening this just as a precaution. The snake farm reported two messages on HP-UX 11 when the test suite was run: Exception exceptions.AttributeError: "mkstemped instance has no attribute 'fd'" in > ignored Exception exceptions.AttributeError: "mkstemped instance has no attribute 'fd'" in > ignored The mkstemped class is defined in test_maketemp.py. That error can happen if a mkstemped instance isn't fully initialized, e.g. if the _mkstemp_inner() call in mkstemped.__init__ fails. But then I would have expected a failure reported, which I don't see... ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-10 10:19 Message: Logged In: YES user_id=6380 Oops, looks like typos in the patch. Fixed (I hope). Question for Zack: I noticed that a few times you changed this: temp = tempfile.mktemp() into this: (fd, temp) = tempfile.mkstemp() os.close(fd) If the latter is secure, why can't mktemp() be defined as doing that? ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-10 06:53 Message: Logged In: YES user_id=33168 Guido, there are still 3 uses of mktemp after the checkin. Should these use mkstemp()? Lib/toaiff.py:102 Lib/plat-irix5/torgb.py:95 Lib/plat-irix6/torgb.py:95 ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-10 02:40 Message: Logged In: YES user_id=6380 I've checked this all in now. 
The changes to test_tempfile.py weren't as easily fixable to work without the tempfile.py changes as Zack thought. I hope the community will give it some review. It will probably break some Zope tests. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-10 00:50 Message: Logged In: YES user_id=6380 OK, I'll do something along those lines myself. ---------------------------------------------------------------------- Comment By: Zack Weinberg (zackw) Date: 2002-08-09 05:01 Message: Logged In: YES user_id=580015 I'm afraid my idea of patch size comes from GCC land, where this would be considered not that big at all. Only 1500 lines affected, and more than half of that is documentation and test suite! I tried, and failed, to break up the changes to tempfile.py itself. But there's some larger divisions that could be made. We could check in the new test_tempfile.py now, disabling the tests that refer to nonexistent functions (just comment out the lines that add those tests to the test_classes array). The changes to the rest of the test suite are also largely independent of the tempfile.py rewrite (since they replace tempfile.mktemp() with TESTFN, mostly). And the search-and-replace changes in the library can wait until after tempfile.py itself gets reviewed. Unfortunately, I am about to go on vacation for five days, so I don't have time now to do this split-up. I will try to drum up interest on python-dev in reviewing the patch as is. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-06 02:48 Message: Logged In: YES user_id=6380 I like the idea of fixing security holes. This patch is *humungous*. Even just the doc changes and the changes to tempfile.py itself are massive and require very careful reading to review all the consequences. Zack, can you try to interest someone with more time than me in reviewing this patch? 
What's the point of renaming all imports with a leading underscore? I thought __all__ took care of that. ---------------------------------------------------------------------- Comment By: Zack Weinberg (zackw) Date: 2002-08-03 16:53 Message: Logged In: YES user_id=580015 I've revised the patch; ignore the old one. This version includes a vastly expanded test_tempfile.py which hits every line that I know how to test. The omissions are marked - it's mostly non-Unix issues. Also, I went through the entire CVS repository and replaced all uses of tempfile.mktemp with mkstemp/mkdtemp/NamedTemporaryFile, as appropriate. The sole exception is Lib/os.py, which is addressed by patch #590294. The sole functional change to tempfile.py itself, from the previous, is to throw os.O_NOFOLLOW into the open flags. This closes yet another hole - on some systems, without this flag, open(file, O_CREAT|O_EXCL) will follow a symbolic link that points to a nonexistent file, and create the link target. (This has no effect on a symlink in the directory components of the pathname - if the sysadmin has symlinked /tmp to /hugedisk/scratch, that still works.) ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-03 00:45 Message: Logged In: YES user_id=6380 This needs some serious review! Volunteers??? ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=589982&group_id=5470 From noreply@sourceforge.net Fri Aug 16 13:39:07 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Fri, 16 Aug 2002 05:39:07 -0700 Subject: [Patches] [ python-Patches-590513 ] Add popen2 like functionality to pty.py. 
Message-ID: Patches item #590513, was opened at 2002-08-04 00:16 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=590513&group_id=5470 Category: Library (Lib) Group: Python 2.2.x Status: Open Resolution: None Priority: 5 Submitted By: Rasjid Wilcox (rasjidw) Assigned to: Nobody/Anonymous (nobody) Summary: Add popen2 like functionality to pty.py. Initial Comment: This patch adds a popen2 like function to pty.py. Due to use of os.execlp in pty.spawn, it is not quite the same, as all the arguments (including the command to be run) must be passed as a tuple. It is only a first draft, and may need some more work, which I am willing to do if some direction is indicated. Tested on Python2.2, under RedHat Linux 7.3. Rasjid. ---------------------------------------------------------------------- >Comment By: Rasjid Wilcox (rasjidw) Date: 2002-08-16 22:39 Message: Logged In: YES user_id=39640 Okay. This is essentially my final version, subject to an issue around the naming of a couple of the module's functions. My issue is that the pty.spawn function does not return. My new 'th_spawn' function does. I believe quite strongly that to be consistent with the rest of python, 'th_spawn' should be renamed 'spawn', and the current spawn function should be renamed something like '_master' since it takes the place of the pty master. The issue of course is backward compatibility. However, I believe this to be relatively minor for two reasons. 1. I think very few (if any) people are using it, since it was a) largely undocumented, b) didn't always work, c) wasn't particularly useful, since it only allowed control from an existing pty (so what was the point?) 2. If anyone is using it, they would _almost_ certainly be starting it on a new thread, so all that would happen if the functions were renamed would be that an extra (redundant) sub-thread is created. I have posted to comp.lang.python to see what other pty.py users think.
Subject to the response here and there, I may post to python-dev in due course. Rasjid. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-06 01:20 Message: Logged In: YES user_id=6380 I mostly concur with Martin von Loewis's comments, though I'm not sure this is big enough for a PEP. I think that you're right in answering (yes, no, yes) but I have no information (portability of this module is already limited to IRIX and Linux, according to the docs). The docs use the word "baffle" -- I wonder if you could substitute something else or generally clarify that sentence; it's not very clear from the docs what spawn() does (nor what fork() does, to tell the truth -- all these could really use some examples). ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-08-05 21:44 Message: Logged In: YES user_id=21627 I'm not a regular pty user. Please ask those questions in comp.lang.python, and python-dev. You can also ask previous authors to pty for comments. Uncertainty in such areas might be a hint that a library PEP is needed, to justify the rationale for all the details. There is no need to hurry - Python 2.3 is still months away. That said, I do think that this functionality is desirable, so I'd encourage you to complete this task. ---------------------------------------------------------------------- Comment By: Rasjid Wilcox (rasjidw) Date: 2002-08-05 21:34 Message: Logged In: YES user_id=39640 Before I do docs etc, I have a few questions: 1. I could make it more popen2 like by changing the args to def popen2(cmd, ....) and adding argv=('/bin/sh','-c',cmd) Is this a better idea? Does it reduce portability? Is it safe to assume that all posix systems have /bin/sh? (My guess is yes, no and yes.) 2. Should the threading done in the pty.popen2 function be moved to a separate function, to allow more direct access to spawn.
(The current spawn function does not return until the child exits or the parent closes the pipe). 3. Should I worry about how keyboard interrupts are handled? In some cases an uncontrolled process may be left hanging around. Or is it the job of the calling process to deal with that? Lastly, I am away for a week from Wednesday, so I won't be able to do much until I get back, but I will try and finish this off then. Cheers, Rasjid. ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-08-04 18:56 Message: Logged In: YES user_id=21627 Can you please write documentation and a test case? ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=590513&group_id=5470 From noreply@sourceforge.net Fri Aug 16 14:23:09 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Fri, 16 Aug 2002 06:23:09 -0700 Subject: [Patches] [ python-Patches-590513 ] Add popen2 like functionality to pty.py. Message-ID: Patches item #590513, was opened at 2002-08-03 10:16 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=590513&group_id=5470 Category: Library (Lib) Group: Python 2.2.x Status: Open Resolution: None Priority: 5 Submitted By: Rasjid Wilcox (rasjidw) >Assigned to: Guido van Rossum (gvanrossum) Summary: Add popen2 like functionality to pty.py. Initial Comment: This patch adds a popen2 like function to pty.py. Due to use of os.execlp in pty.spawn, it is not quite the same, as all the arguments (including the command to be run) must be passed as a tuple. It is only a first draft, and may need some more work, which I am willing to do if some direction is indicated. Tested on Python2.2, under RedHat Linux 7.3. Rasjid.
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=590513&group_id=5470 From noreply@sourceforge.net Fri Aug 16 14:47:36 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Fri, 16 Aug 2002 06:47:36 -0700 Subject: [Patches] [ python-Patches-590513 ] Add popen2 like functionality to pty.py. Message-ID: Patches item #590513, was opened at 2002-08-03 16:16 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=590513&group_id=5470 Category: Library (Lib) Group: Python 2.2.x Status: Open Resolution: None Priority: 5 Submitted By: Rasjid Wilcox (rasjidw) Assigned to: Guido van Rossum (gvanrossum) Summary: Add popen2 like functionality to pty.py. Initial Comment: This patch adds a popen2 like function to pty.py. Due to use of os.execlp in pty.spawn, it is not quite the same, as all the arguments (including the command to be run) must be passed as a tuple. It is only a first draft, and may need some more work, which I am willing to do if some direction is indicated. Tested on Python2.2, under RedHat Linux 7.3. Rasjid. ---------------------------------------------------------------------- >Comment By: Thomas Wouters (twouters) Date: 2002-08-16 15:47 Message: Logged In: YES user_id=34209 The pty module actually works on all platforms that provide openpty(). I know at least BSDI and FreeBSD do, in addition to Linux, and I suspect OpenBSD and NetBSD as well, and possibly other BSD-derived systems. Still, it is very unlikely to work on non-POSIX-ish systems, so /bin/sh is a safe bet. I can take a look at the patch, and test it on a variety of machines, but not before I leave on a two-week vacation, this Sunday :) I have plenty of time to look at it when I'm back though, in the first week of September. Feel free to assign the patch to me, as a reminder.
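The approach discussed in this thread (run the command through /bin/sh -c on a freshly forked pty and let the parent read from the master side) can be sketched roughly as follows. This is only an illustration in present-day Python 3, not the code from the attached patch: the helper name run_on_pty is hypothetical, and the sketch assumes a POSIX-ish system where pty.fork() works and /bin/sh exists. It uses only pty.fork(), os.execv(), os.read() and os.waitpid() from the standard library.

```python
import os
import pty


def run_on_pty(cmd):
    """Run cmd via /bin/sh -c on a new pty; return (exit_status, output_bytes).

    Hypothetical helper for illustration only, not part of pty.py.
    """
    pid, master_fd = pty.fork()
    if pid == 0:
        # Child: stdin/stdout/stderr are already connected to the pty slave.
        try:
            os.execv("/bin/sh", ["/bin/sh", "-c", cmd])
        finally:
            # Only reached if execv itself failed.
            os._exit(127)

    # Parent: collect everything the child writes to its terminal.
    chunks = []
    while True:
        try:
            data = os.read(master_fd, 1024)
        except OSError:
            # Linux reports EIO once the child has closed the slave side.
            break
        if not data:
            break
        chunks.append(data)
    os.close(master_fd)

    _, status = os.waitpid(pid, 0)
    exit_status = os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1
    return exit_status, b"".join(chunks)
```

Unlike pty.spawn, this sketch returns to the caller once the child has finished, which is the behaviour the thread asks for. Because the child writes to a terminal rather than a pipe, newlines normally arrive as \r\n, so the captured bytes are not identical to what popen2 would deliver.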
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=590513&group_id=5470 From noreply@sourceforge.net Fri Aug 16 15:32:46 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Fri, 16 Aug 2002 07:32:46 -0700 Subject: [Patches] [ python-Patches-595821 ] --witch-cxx=c++ and correct LINKCC Message-ID: Patches item #595821, was opened at 2002-08-15 22:06 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=595821&group_id=5470 Category: Build Group: Python 2.2.x Status: Open Resolution: None Priority: 5 Submitted By: Jochen K�pper (kuepper) Assigned to: Nobody/Anonymous (nobody) Summary: --witch-cxx=c++ and correct LINKCC Initial Comment: Here LINKCC is not determined correctly when using configure --with-cxx=c++ This is a RedHat-7.1 based system, gcc-3.2, python from cvs. The following patch against configure.in from "release22-maint" might be a little overkill, but it makes sure that the linker uses (and finds) libstdc++: Index: configure.in =================================================================== RCS file: /cvsroot/python/python/dist/src/configure.in,v retrieving revision 1.288.6.6 diff -u -r1.288.6.6 configure.in --- configure.in 2 Jun 2002 17:34:47 -0000 1.288.6.6 +++ configure.in 16 Aug 2002 02:04:47 -0000 @@ -279,7 +279,7 @@ if test -z "$CXX"; then LINKCC="\ \" else - echo 'int main(){return 0;}' > conftest.$ac_ext + echo '#include int main(){string c('Hello'); return 0;}' > conftest.$ac_ext $CXX -c conftest.$ac_ext 2>&5 if $CC -o conftest$ac_exeext conftest.$ac_objext 2>&5 \ && test -s conftest$ac_exeext && ./conftest$ac_exeext ---------------------------------------------------------------------- >Comment By: Jochen K�pper (kuepper) Date: 2002-08-16 10:32 Message: Logged In: YES user_id=19849 I did ./configure --with-cxx=c++ && make and LINKCC was set to gcc. 
So later on when linking the executable I get unresolved references, i.e. Modules/ccpython.o(.eh_frame+0x11): undefined reference to `__gxx_personality_v0' These are defined in libstdc++. If LINKCC is set to c++, these go away, 'cause it knows where to find these symbols. In the current test for LINKCC nevertheless gcc ($CC) is good enough to link the conftest, so it is decided to use that for python. The patch merely supplies a conftest that really requires a C++ linker. ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-08-16 03:14 Message: Logged In: YES user_id=21627 Can you please elaborate what exactly you mean by "not determined correctly"? What did you do, what happened, what did you expect to happen, why do you think the observed behaviour is incorrect? ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=595821&group_id=5470 From noreply@sourceforge.net Fri Aug 16 16:16:35 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Fri, 16 Aug 2002 08:16:35 -0700 Subject: [Patches] [ python-Patches-595821 ] --witch-cxx=c++ and correct LINKCC Message-ID: Patches item #595821, was opened at 2002-08-16 04:06 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=595821&group_id=5470 Category: Build Group: Python 2.2.x Status: Open Resolution: None Priority: 5 Submitted By: Jochen Küpper (kuepper) Assigned to: Nobody/Anonymous (nobody) Summary: --witch-cxx=c++ and correct LINKCC Initial Comment: Here LINKCC is not determined correctly when using configure --with-cxx=c++ This is a RedHat-7.1 based system, gcc-3.2, python from cvs.
The following patch against configure.in from "release22-maint" might be a little overkill, but it makes sure that the linker uses (and finds) libstdc++: Index: configure.in =================================================================== RCS file: /cvsroot/python/python/dist/src/configure.in,v retrieving revision 1.288.6.6 diff -u -r1.288.6.6 configure.in --- configure.in 2 Jun 2002 17:34:47 -0000 1.288.6.6 +++ configure.in 16 Aug 2002 02:04:47 -0000 @@ -279,7 +279,7 @@ if test -z "$CXX"; then LINKCC="\ \" else - echo 'int main(){return 0;}' > conftest.$ac_ext + echo '#include int main(){string c('Hello'); return 0;}' > conftest.$ac_ext $CXX -c conftest.$ac_ext 2>&5 if $CC -o conftest$ac_exeext conftest.$ac_objext 2>&5 \ && test -s conftest$ac_exeext && ./conftest$ac_exeext ---------------------------------------------------------------------- >Comment By: Martin v. Löwis (loewis) Date: 2002-08-16 17:16 Message: Logged In: YES user_id=21627 Can you please try the mainline CVS? This appears to be a duplicate of bug #559429, which has been fixed with configure.in 1.319. ---------------------------------------------------------------------- Comment By: Jochen Küpper (kuepper) Date: 2002-08-16 16:32 Message: Logged In: YES user_id=19849 I did ./configure --with-cxx=c++ && make and LINKCC was set to gcc. So later on when linking the executable I get unresolved references, i.e. Modules/ccpython.o(.eh_frame+0x11): undefined reference to `__gxx_personality_v0' These are defined in libstdc++. If LINKCC is set to c++, these go away, 'cause it knows where to find these symbols. In the current test for LINKCC nevertheless gcc ($CC) is good enough to link the conftest, so it is decided to use that for python. The patch merely supplies a conftest that really requires a C++ linker. ---------------------------------------------------------------------- Comment By: Martin v.
Löwis (loewis) Date: 2002-08-16 09:14 Message: Logged In: YES user_id=21627 Can you please elaborate what exactly you mean by "not determined correctly"? What did you do, what happened, what did you expect to happen, why do you think the observed behaviour is incorrect? ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=595821&group_id=5470 From noreply@sourceforge.net Fri Aug 16 18:17:08 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Fri, 16 Aug 2002 10:17:08 -0700 Subject: [Patches] [ python-Patches-593627 ] Static names Message-ID: Patches item #593627, was opened at 2002-08-11 06:41 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=593627&group_id=5470 Category: Core (C code) Group: None >Status: Closed >Resolution: Rejected Priority: 3 Submitted By: Oren Tirosh (orenti) Assigned to: Guido van Rossum (gvanrossum) Summary: Static names Initial Comment: This patch creates static string objects for all built-in names and interns them on initialization. The macro PyNAME is used to access static names. PyNAME(__spam__) is equivalent to PyString_InternFromString("__spam__") but is a constant expression. It requires the name to be one of the built-in names. A linker error will be generated if it isn't. Most conversions of C strings into temporary string objects can be eliminated (PyString_FromString, PyString_InternFromString). Most string comparisons at runtime can also be eliminated. Instead of : if (strcmp(PyString_AsString(name), "__spam__")) ... This code can be used: PyString_INTERN(name) if (name == PyNAME(__spam__)) ... Where PyString_INTERN is a fast inline check if the string is already interned (and it usually is). To prevent unbounded accumulation of interned strings the mortal interned string patch should also be applied.
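Editorial illustration, not part of the tracker thread: the idea of replacing a strcmp() with a pointer comparison on interned strings has a rough Python-level analogue, sketched below. This is not the patch's C code; PyNAME and PyString_INTERN are the patch's own macros and are not reproduced here. The names SPAM and is_dunder_spam are hypothetical, and sys.intern() is the modern spelling of the intern() builtin that Python 2.3 provided.

```python
import sys

# Intern the well-known name once, up front. This plays the role of the
# static name object that the patch creates at initialization.
SPAM = sys.intern("__spam__")


def is_dunder_spam(name):
    """Return True if name equals "__spam__", using an identity test.

    Interning maps equal strings to a single shared object, so once the
    argument has been interned an `is` comparison is enough.
    """
    return sys.intern(name) is SPAM
```

A string assembled at run time, such as "".join(["__sp", "am__"]), is initially a distinct object from SPAM; interning it yields the shared object, which is what makes the identity test valid.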
The patch converts most of the builtin module to this new mode as an example. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-15 14:09 Message: Logged In: YES user_id=6380 > I'm surprised that there is *any* speed increase > because I barely changed any code to make use of > this. This is very encouraging. Don't get too excited. Speedups and slowdowns in the order of 1% are usually random cache effects having to do with common portions of the VM main loop having a cache line conflict; I've seen a case where adding an *unreachable* printf() call predictably changed the pystone speed by 1%. > The localization and forced recompilation > issues you raise are not really relevant because > this MUST NOT be used for anything but builtin > names and builtins are not added so > frequently. Even standard modules should not > declare static names. Then why do I see all signal names in your list? And all exception names? > Actually, the macro PyNAME is not required any > more and the actual symbol name can be used. I > used the macro to do typecasting but it's no > longer necessary because I found a way to make > the static names real PyObjects (probably the > only place where something is actually defined > as a PyObject!) But the string is still more helpful in the code than the symbol name. Sorry, but none of this changes my position; you'll have to find another champion. ---------------------------------------------------------------------- Comment By: Oren Tirosh (orenti) Date: 2002-08-15 13:45 Message: Logged In: YES user_id=562624 The code size increase is not surprising - all names appear twice in the executable: once as C strings and again as static PyStringObjects. This duplication can be eliminated. I'm surprised that there is *any* speed increase because I barely changed any code to make use of this. This is very encouraging.
The localization and forced recompilation issues you raise are not really relevant because this MUST NOT be used for anything but builtin names and builtins are not added so frequently. Even standard modules should not declare static names. The interning of static strings must be done before the interpreter is initialized to ensure that the static name is the interned name. If you intern a static name after the same name has already been interned elsewhere the static object will not be the one true interned version and static references to it will be incorrect. Actually, the macro PyNAME is not required any more and the actual symbol name can be used. I used the macro to do typecasting but it's no longer necessary because I found a way to make the static names real PyObjects (probably the only place where something is actually defined as a PyObject!) ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-15 11:58 Message: Logged In: YES user_id=6380 Strangely, I measured a code size *increase*. Strangely, because most object files didn't increase in text size, but the resulting binary did, adding about 20K text and 17K data. The only object file that changed sizes at all (according to "size */*.o" on Linux) was Python/bltinmodule.o, which grew less than 500 bytes in text size. The only new object file, Python/staticnames.o, has 48 bytes text and 17K bytes data. Maybe the added text size could be because of more cross-file references added by the linker??? (Files that referenced a local static char string constant now reference a static object in Python/staticnames.o.) I do see about a 1% speed increase for pystone. But I agree with Martin's comments on the "readability" issue. There's also a localization property that's lost: whenever a new name is added, you must update staticnames.h, staticnames.c, *and* the file where it is used.
That's not nice (and not just because it forces a recompilation of the world because a header file was touched that everybody includes). *If* this were ever accepted, the mechanism to (re)generate staticnames.h automatically should be checked in as well. In general, I've found that string literals hidden inside macros using stringification (#) are a detriment to code maintainability -- I've often had the situation where I *knew* there had to be a string literal for some name somewhere, but I couldn't find it because of this. Same for name concatenation (##); it often means that you know there's a function name somewhere but a grep through the sources won't find it. Very painful when tracking down problems. I deployed a bunch of tricks like this in early versions of typeobject.c, and ended up expanding almost all of them: a little more typing perhaps, but explicit is better than implicit, and a search for slot_nb_add will at least find the macro that defines it; ditto a search for "__add__" (with the quotes) will find where it is used. I guess that's about a -0.5 from me. Unless someone else steps up to champion this soon, it's dead. :-) PS. I used cvs add for new files and then cvs diff -c -N to create diffs that include new files; cvs produces output looking like a diff against /dev/null, and patch understands those (see the fixed patch I uploaded). But maybe if you only have anonymous CVS the cvs add won't work, and cvs diff won't give you diffs for files it knows nothing about. IOW YMMV. :-) ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-15 11:22 Message: Logged In: YES user_id=6380 Just for kicks I produced a forward diff, also adding the necessary changes to Makefile.pre.in that were mysteriously missing from the original, and fixing this for the very latest CVS (up to and including Michael Hudson's set_lineno patch).
---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-08-12 08:48 Message: Logged In: YES user_id=21627 It is difficult to see how this patch achieves the goal of readability: - many variables are not used at all, e.g. ArithmethicError; it is not clear how using them would improve readability. - the change from SETBUILTIN("None", Py_None); to SETBUILTIN(PyNAME(None), Py_None); makes it more difficult to read, not easier. Furthermore, the name "None" isn't used anywhere except this initialisation. - likewise, the changes from {"abs", builtin_abs, METH_O, abs_doc}, to {PyNAMEC(abs), builtin_abs, METH_O, abs_doc}, make the code harder to read. ---------------------------------------------------------------------- Comment By: Oren Tirosh (orenti) Date: 2002-08-12 08:10 Message: Logged In: YES user_id=562624 If these changes are applied throughout the interpreter I expect a significant speedup but that is not my immediate goal. I am looking for readability, reduction of code size (both source and binary) and reliability (fewer things to check or forget to check). I am trying to get rid of code like if (docstr == NULL) { docstr= PyString_InternFromString("__doc__"); if (docstr == NULL) return NULL; And replace it with just PyNAME(__doc__) ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-08-12 03:43 Message: Logged In: YES user_id=21627 What is the rationale for this patch? If it is for performance, what real speed improvements can you report? ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-11 11:54 Message: Logged In: YES user_id=33168 I'd like to see the releasing interned string patch applied. I think it's almost ready, isn't it? It would make patching easier and seems to be a good idea. For me, the easiest way to produce patches is to use cvs.
You can keep multiple cvs trees around easily enough (for having multiple overlapping/independent patches). To create patches with cvs: cvs diff -C 5 [file1] [file2] .... ---------------------------------------------------------------------- Comment By: Oren Tirosh (orenti) Date: 2002-08-11 11:44 Message: Logged In: YES user_id=562624 Ok, I'll fix the problems with the patch. What's the best way to produce a patch that adds new files? Static string objects cannot be released, of course. This patch will eventually depend on the mortal interned strings patch to fix it but in the meantime I just disabled releasing interned strings because I want to keep the two patches independent. The next version will add a new PyArg_ParseTuple format for an interned string to make it easier to use. ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-11 10:46 Message: Logged In: YES user_id=33168 Couple of initial comments: * this is a reverse patch * it seems like there are other changes in here - int ob_shash -> long - releasing interned strings? * dictobject.c is removed? * including python headers should use "" not <> Oren, could you generate a new patch with only the changes to support PyNAME? Thanks!
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=593627&group_id=5470 From noreply@sourceforge.net Fri Aug 16 19:12:11 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Fri, 16 Aug 2002 11:12:11 -0700 Subject: [Patches] [ python-Patches-589982 ] tempfile.py rewrite Message-ID: Patches item #589982, was opened at 2002-08-01 23:38 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=589982&group_id=5470 Category: Library (Lib) Group: Python 2.3 Status: Open Resolution: Accepted Priority: 5 Submitted By: Zack Weinberg (zackw) Assigned to: Guido van Rossum (gvanrossum) Summary: tempfile.py rewrite Initial Comment: This rewrite closes a number of security-relevant races in tempfile.py; makes temporary filenames much harder to guess; provides secure interfaces that can be used to close similar races elsewhere; and makes it possible to control the prefix and directory of each temporary created, individually. ---------------------------------------------------------------------- >Comment By: Zack Weinberg (zackw) Date: 2002-08-16 11:12 Message: Logged In: YES user_id=580015 It sounds like some other test -- probably one of the ones conditioned on -u network -- is causing the child process to have a stale file descriptor open. Can you reproduce the problem with ./python -E -tt ./Lib/test/regrtest.py -l -u network test_socket_ssl test_socketserver test_tempfile ? ---------------------------------------------------------------------- Comment By: Andrew I MacIntyre (aimacintyre) Date: 2002-08-16 02:00 Message: Logged In: YES user_id=250749 I hope I'm doing the right thing by re-opening this rather than opening a bug report... I'm seeing a very odd failure in test_tempfile on FreeBSD 4.4. The failure occurs when I run the full regression test with TESTOPTS="-l -u network" but succeeds when TESTOPTS="-l".
./python -E -tt Lib/test/regrtest.py -l -u network test_tempfile succeeds too, as does: ./python -E -tt Lib/test/regrtest.py -v -u network test_tempfile At this point, I haven't tried other -u option combinations. The error log shows: test test_tempfile failed -- Traceback (most recent call last): File "/home/andymac/cvs/python/python- cvs/Lib/test/test_tempfile.py", line 345, in test_noinherit "child process exited successfully") File "/home/andymac/cvs/python/python- cvs/Lib/unittest.py", line 268, in failUnless if not expr: raise self.failureException, msg AssertionError: child process exited successfully Unfortunately Real Job has wiped out any time I might have had to try and debug this, and I won't have much if any time to look at this for about 3 weeks :-( Intuitive guesses about where to start looking would be welcome! The OS/2 EMX port has the mkstemped problem noted below for HP-UX, but I think I might not have picked up the fixes when I tested, so I'll have to check that again. ---------------------------------------------------------------------- Comment By: Jack Jansen (jackjansen) Date: 2002-08-14 12:44 Message: Logged In: YES user_id=45365 Nevermind. Just saw the discussion on python-dev (this is a file descriptor returned, not a file pointer, so stdio is nowhere in sight). ---------------------------------------------------------------------- Comment By: Jack Jansen (jackjansen) Date: 2002-08-14 12:42 Message: Logged In: YES user_id=45365 Isn't it much more logical to give mkstemp() a mode="w+b" argument? The other routines have that as well, and it is also more in line with open() and such... ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-14 09:52 Message: Logged In: YES user_id=6380 Closing again. Reduced the number of temp files to 100. Changed 'binary=True' to 'text=False' default on mkstemp(). 
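Editorial illustration, not part of the tracker thread: a minimal sketch of how the mkstemp() interface discussed above is used, assuming the signature that ships in the standard library today (suffix, prefix, dir, text=False) and the point made above that it returns an OS-level file descriptor rather than a file object. The helper name write_scratch and the "scratch-" prefix are hypothetical.

```python
import os
import tempfile


def write_scratch(payload, prefix="scratch-", directory=None):
    """Write bytes to a freshly created temporary file and return its path.

    mkstemp() creates the file and opens it in one step, so there is no
    window between choosing a name and opening it. With text=False the
    descriptor is opened in binary mode.
    """
    fd, path = tempfile.mkstemp(prefix=prefix, dir=directory, text=False)
    # Wrap the raw descriptor; closing the file object closes fd as well.
    with os.fdopen(fd, "wb") as handle:
        handle.write(payload)
    return path
```

The caller owns the returned path and is responsible for removing the file when it is no longer needed.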
---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-14 06:16 Message: Logged In: YES user_id=6380 The "Too many open files" problem is solved. The HP system was configured to allow only 200 open file descriptors per process. But maybe the test would work just as well if it tried to create 100 instead of 1000 temp files? I expect that this would cause failures on other systems with conservative limits. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-14 06:02 Message: Logged In: YES user_id=6380 I'd like to change the binary=True argument to mkstemp into text=False; that seems easier to explain. News about the HP errors from Kalle Svensson: Traceback (most recent call last): File "../python/dist/src/Lib/test/test_tempfile.py", line 719, in ? test_main() File "../python/dist/src/Lib/test/test_tempfile.py", line 716, in test_main test_support.run_suite(suite) File "/mp/slaskdisk/tmp/sfarmer/python/dist/src/Lib/test/test_support.py", line 188, in run_suite raise TestFailed(err) test.test_support.TestFailed: Traceback (most recent call last): File "../python/dist/src/Lib/test/test_tempfile.py", line 295, in test_basic_many File "../python/dist/src/Lib/test/test_tempfile.py", line 278, in do_create File "../python/dist/src/Lib/test/test_tempfile.py", line 33, in failOnException File "/mp/slaskdisk/tmp/sfarmer/python/dist/src/Lib/unittest.py", line 260, in fail AssertionError: _mkstemp_inner raised exceptions.OSError: [Errno 24] Too many open files: '/tmp/aaU3irrA' ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-10 21:28 Message: Logged In: YES user_id=6380 I'm reopening this just as a precaution. 
The snake farm reported two messages on HP-UX 11 when the test suite was run: Exception exceptions.AttributeError: "mkstemped instance has no attribute 'fd'" in > ignored Exception exceptions.AttributeError: "mkstemped instance has no attribute 'fd'" in > ignored The mkstemped class is defined in test_maketemp.py. That error can happen if a mkstemped instance isn't fully initialized, e.g. if the _mkstemp_inner() call in mkstemped.__init__ fails. But then I would have expected a failure reported, which I don't see... ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-09 21:17 Message: Logged In: YES user_id=6380 I'm reopening this just as a precaution. The snake farm reported two messages on HP-UX 11 when the test suite was run: Exception exceptions.AttributeError: "mkstemped instance has no attribute 'fd'" in > ignored Exception exceptions.AttributeError: "mkstemped instance has no attribute 'fd'" in > ignored The mkstemped class is defined in test_maketemp.py. That error can happen if a mkstemped instance isn't fully initialized, e.g. if the _mkstemp_inner() call in mkstemped.__init__ fails. But then I would have expected a failure reported, which I don't see... ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-09 17:19 Message: Logged In: YES user_id=6380 Oops, looks like typos in the patch. Fixed (I hope). Question for Zack: I noticed that a few times you changed this: temp = tempfile.mktemp() into this: (fd, temp) = tempfile.mkstemp() os.close(fd) If the latter is secure, why can't mktemp() be defined as doing that? ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-09 13:53 Message: Logged In: YES user_id=33168 Guido, there are still 3 uses of mktemp after the checkin. Should these use mkstemp()? 
Lib/toaiff.py:102 Lib/plat-irix5/torgb.py:95 Lib/plat-irix6/torgb.py:95 ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-09 09:40 Message: Logged In: YES user_id=6380 I've checked this all in now. The changes to test_tempfile.py weren't as easily fixable to work without the tempfile.py changes as Zack thought. I hope the community will give it some review. It will probably break some Zope tests. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-09 07:50 Message: Logged In: YES user_id=6380 OK, I'll do something along those lines myself. ---------------------------------------------------------------------- Comment By: Zack Weinberg (zackw) Date: 2002-08-08 12:01 Message: Logged In: YES user_id=580015 I'm afraid my idea of patch size comes from GCC land, where this would be considered not that big at all. Only 1500 lines affected, and more than half of that is documentation and test suite! I tried, and failed, to break up the changes to tempfile.py itself. But there's some larger divisions that could be made. We could check in the new test_tempfile.py now, disabling the tests that refer to nonexistent functions (just comment out the lines that add those tests to the test_classes array). The changes to the rest of the test suite are also largely independent of the tempfile.py rewrite (since they replace tempfile.mktemp() with TESTFN, mostly). And the search-and-replace changes in the library can wait until after tempfile.py itself gets reviewed. Unfortunately, I am about to go on vacation for five days, so I don't have time now to do this split-up. I will try to drum up interest on python-dev in reviewing the patch as is. 
---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-05 09:48 Message: Logged In: YES user_id=6380 I like the idea of fixing security holes. This patch is *humungous*. Even just the doc changes and the changes to tempfile.py itself are massive and require very careful reading to review all the consequences. Zack, can you try to interest someone with more time than me in reviewing this patch? What's the point of renaming all imports with a leading underscore? I thought __all__ took care of that. ---------------------------------------------------------------------- Comment By: Zack Weinberg (zackw) Date: 2002-08-02 23:53 Message: Logged In: YES user_id=580015 I've revised the patch; ignore the old one. This version includes a vastly expanded test_tempfile.py which hits every line that I know how to test. The omissions are marked - it's mostly non-Unix issues. Also, I went through the entire CVS repository and replaced all uses of tempfile.mktemp with mkstemp/mkdtemp/NamedTemporaryFile, as appropriate. The sole exception is Lib/os.py, which is addressed by patch #590294. The sole functional change to tempfile.py itself, from the previous, is to throw os.O_NOFOLLOW into the open flags. This closes yet another hole - on some systems, without this flag, open(file, O_CREAT|O_EXCL) will follow a symbolic link that points to a nonexistent file, and create the link target. (This has no effect on a symlink in the directory components of the pathname - if the sysadmin has symlinked /tmp to /hugedisk/scratch, that still works.) ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-02 07:45 Message: Logged In: YES user_id=6380 This needs some serious review! Volunteers??? 
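Editorial illustration, not part of the tracker thread: the 2002-08-02 comment above describes adding os.O_NOFOLLOW to the open flags so that exclusive creation cannot be redirected through a symbolic link. A minimal sketch of that flag combination follows; it is not the code from tempfile.py. The helper name create_exclusive is hypothetical, and because O_NOFOLLOW does not exist on every platform the sketch falls back to 0 there.

```python
import os
import tempfile


def create_exclusive(path, mode=0o600):
    """Create path and return a descriptor, failing if it already exists.

    O_CREAT | O_EXCL makes creation atomic: the call fails rather than
    reusing an existing file. O_NOFOLLOW additionally refuses to follow a
    symbolic link in the final path component, where the platform has it.
    """
    flags = os.O_RDWR | os.O_CREAT | os.O_EXCL
    flags |= getattr(os, "O_NOFOLLOW", 0)  # absent on some platforms
    return os.open(path, flags, mode)
```

As the comment above notes, the flag only concerns the final component of the path, so a temporary directory that is itself reached through a symlink keeps working.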
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=589982&group_id=5470 From noreply@sourceforge.net Fri Aug 16 19:57:07 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Fri, 16 Aug 2002 11:57:07 -0700 Subject: [Patches] [ python-Patches-576101 ] Alternative implementation of interning Message-ID: Patches item #576101, was opened at 2002-07-01 15:23 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=576101&group_id=5470 Category: None Group: None Status: Open Resolution: None Priority: 5 Submitted By: Oren Tirosh (orenti) Assigned to: Guido van Rossum (gvanrossum) Summary: Alternative implementation of interning Initial Comment: An interned string has a flag set indicating that it is interned instead of a pointer to the interned string. This pointer was almost always either NULL or pointing to the same object. The other cases were rare and ineffective as an optimization. This saves an average of 3 bytes per string. Interned strings are no longer immortal. They are automatically destroyed when there are no more references to them except the global dictionary of interned strings. New function (actually a macro) PyString_CheckInterned to check whether a string is interned. There are no more references to ob_sinterned anywhere outside stringobject.c. ---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-16 14:57 Message: Logged In: YES user_id=6380 Here's a new version (#6) that makes all interned strings mortal unless explicitly requested with PyString_InternImmortal(). There are no calls to that function in the core. I'm very tempted to check this in and see how it goes. - Leave all the calls to PyString_InternInPlace(), since that is still the recommended API. - Got rid of the macro PyString_INTERN(), it was unused. 
- Fixed the issue with getclassname() through an API change (it's static so doesn't matter). - Rewrote _Py_ReleaseInternedStrings(); it now simply clears the immortality status, restores the stolen refcounts, and then clears and decrefs the interned dict. ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-08-15 04:45 Message: Logged In: YES user_id=21627 _Py_ReleaseInternedStrings: it might be that embedded applications use it. It would not be fair to cause heap corruption for them - it would be better to break them at link time, by removing the function entirely. I see no need to do either - it should just release immortal strings, as it always did, if there are any left. intern creates immortal strings: It might be that an application saves the id() of an interned string and releases the interned strings; then expects to get the same id back later. If you ask people whether they do that they won't tell, because they don't know that they do that. You could explicitly decide to break such applications (which would be reasonable), but then this needs to be documented. binary compatibility: I'm neutral here. If the API is bumped, people get sufficient warning. PyString_InternInPlace: I think it needs to be preserved, since applications may not hold explicit references (trusting that the interned dictionary will hold the reference). Of course, the InPlace name signals that there is no return value, so it is better than _Intern for new users. ---------------------------------------------------------------------- Comment By: Oren Tirosh (orenti) Date: 2002-08-14 23:44 Message: Logged In: YES user_id=562624 Yes, PyString_InternInPlace is for backward compatibility. How conservative do we need to be about compatibility? My work copy has an option for making strings binary compatible. Which is more important: binary compatibility or saving 3 bytes?
A related patch (static names) provides a possible alternative to most PyString_InternFromString calls. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-14 23:03 Message: Logged In: YES user_id=6380 string_dealloc() is a bit optimistic in that it doesn't check the DelItem for an error; but I don't know what it should do when it gets an error at that point. Probably call Py_FatalError(); if it wanted to recover, it would have to call PyErr_Fetch() / PyErr_Restore() around the DelItem() call, because we're in a dealloc handler here and that shouldn't change the exception state. _Py_ReleaseInternedStrings() should use PyDict_ methods, not PyMapping_ methods. And it should do more careful error checking. But maybe it's best to delete this function -- it's not needed except when you want to run Insure++, and we're not using that any more. I note that the whole patch needs to be scrutinized carefully looking for missing error checking and things like that. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-14 21:26 Message: Logged In: YES user_id=6380 > - why does this try to "fix" the problem of > dangling interned strings? AFAICT: if there is a > reference to an interned string at the time > _Py_ReleaseInternedStrings is called, that > reference is silently dropped, and a later > DECREF will result in memory corruption. IOW: it > should merely set the state of all strings to > normal, and clear the dict. Note that the *only* time when _Py_ReleaseInternedStrings() can ever be called is at program exit, just before you run a memory leak detector. There's no way Python can be resurrected after _Py_ReleaseInternedStrings() has run. > - Replacing PyString_InternInPlace with > PyString_Intern seems dangerous. 
AFAICT, the > fragment > > PyString_InternInPlace(&name); > Py_DECREF(name); > return PyString_AS_STRING(name); > > from getclassname would break: Intern() would > return the only reference to the interned string > (assuming this is the first usage), and > getclassname drops this reference, returning a > pointer to deallocated memory. I'm not sure > though why getclassname interns the result in > the first place. getclassname() is doing something very unsavory here! I expect that its API will have to be changed to copy the name into a buffer provided by the caller. We'll have to scrutinize all calls for tricks like this. > Selectively replacing them might be a good idea, > though. For intern(), I think an optional > argument strongref needs to be provided (the > interned dict essentially weak-references the > strings). Perhaps the default even needs to be > weakref. So do you think there's a need for immortal strings? What is that need? ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-08-14 18:24 Message: Logged In: YES user_id=21627 Some mutually unrelated comments: - the GC_UnTrack call for interned is not needed: GC won't be able to explain the reference that stringobject.c holds. - why does this try to "fix" the problem of dangling interned strings? AFAICT: if there is a reference to an interned string at the time _Py_ReleaseInternedStrings is called, that reference is silently dropped, and a later DECREF will result in memory corruption. IOW: it should merely set the state of all strings to normal, and clear the dict. - Replacing PyString_InternInPlace with PyString_Intern seems dangerous.
AFAICT, the fragment PyString_InternInPlace(&name); Py_DECREF(name); return PyString_AS_STRING(name); from getclassname would break: Intern() would return the only reference to the interned string (assuming this is the first usage), and getclassname drops this reference, returning a pointer to deallocated memory. I'm not sure though why getclassname interns the result in the first place. Selectively replacing them might be a good idea, though. For intern(), I think an optional argument strongref needs to be provided (the interned dict essentially weak-references the strings). Perhaps the default even needs to be weakref. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-14 15:32 Message: Logged In: YES user_id=6380 Question for all other reviewers. Why not replace all calls to PyString_InternInPlace() [which creates immortal strings] with PyString_Intern(), making all (core) uses of interning yield mortal strings? E.g. the call in PyObject_SetAttr() will immortalize all strings that are ever used as a key on a setattr operation; in a long-lived server like Zope this is a concern, since setattr keys are often user-provided data: an endless stream of user-provided data will grow the interned dict indefinitely. And having the builtin intern() always return an immortal string also limits the usability of intern(). Most of the uses I could find of PyString_InternFromString() hold on to a global ref to the object, making it immortal anyway; but why should that function itself force the string to be immortal? (Especially since the exceptions are things like PyObject_GetAttrString() and PyObject_SetItemString(), which have the same concerns as PyObject_SetItem().)
---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-14 14:43 Message: Logged In: YES user_id=6380 Here's an update of the patch for current CVS (stringobject.h failed due to changes for PyAPI_DATA/PyAPI_FUNC). Could you add documentation to Doc/api/concrete.tex for PyString_Intern() and explain how PyString_InternInPlace() differs? (AFAICT it makes the interned string immortal -- I suppose this is a B/W compat feature?) The variables PYTHON_API_VERSION and PYTHON_API_STRING in modsupport.h need to be updated -- many extensions use the PyString_AS_STRING() macro which relies on the string object format. If an extension compiled with the old code is linked with the new interpreter, it will miss the first three bytes of string objects -- or even store into memory it doesn't own! (I've already added this to the patch I am uploading.) (The test_gc failures were unrelated; Tim has fixed this already in CVS.) I'm tempted to say that except for the API doc issue this is complete. Thanks! ---------------------------------------------------------------------- Comment By: Oren Tirosh (orenti) Date: 2002-08-10 06:01 Message: Logged In: YES user_id=562624 General cleanup. Better handling of immortal interned strings for backward compatibility. It passes regrtest but causes test_gc to leak 20 objects. 13 from test_finalizer_newclass and 7 from test_del_newclass, but only if test_saveall is used. I've tried earlier versions of this patch (which were ok at the time) and they now create this leak too. ---------------------------------------------------------------------- Comment By: Oren Tirosh (orenti) Date: 2002-07-06 12:08 Message: Logged In: YES user_id=562624 Oops, forgot to actually attach the patch. Here it is.
---------------------------------------------------------------------- Comment By: Oren Tirosh (orenti) Date: 2002-07-06 10:35 Message: Logged In: YES user_id=562624 This implementation supports both mortal and immortal interned strings. PyString_InternInPlace creates an immortal interned string for backward compatibility with code that relies on this behavior. PyString_Intern creates a mortal interned string that is deallocated when its refcnt reaches 0. Note that if the string value has been previously interned as immortal this will not make it mortal. Most places in the interpreter were changed to PyString_Intern except those that may be required for compatibility. This version of the patch, like the previous one, disables indirect interning. Is there any evidence that it is still an important optimization for some packages? Make sure you rebuild everything after applying this patch because it modifies the size of string object headers. ---------------------------------------------------------------------- Comment By: Raymond Hettinger (rhettinger) Date: 2002-07-02 00:21 Message: Logged In: YES user_id=80475 I like the way you consolidated all of the knowledge about interning into one place. Consider adding an example to the docs of an effective use of interning for optimization. 
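A minimal sketch of the kind of interning example suggested above, assuming nothing about the patch itself. The helper name count_tags and the sample strings are hypothetical; the only library call used is intern(), a builtin in the Python 2.x discussed in this thread and available as sys.intern() in later versions.

```python
try:
    from sys import intern  # later Python versions keep intern() in sys
except ImportError:
    pass  # Python 2.x: intern() is a builtin, nothing to import


def count_tags(tags):
    """Count tag strings, using one shared interned object per distinct value.

    count_tags is a hypothetical helper for illustration only.
    """
    counts = {}
    for tag in tags:
        key = intern(tag)  # equal strings map to the same string object
        counts[key] = counts.get(key, 0) + 1
    return counts


# Build two equal strings at run time, then intern both.
parts = ["al", "pha"]
a = "".join(parts)
b = "".join(parts)
ia = intern(a)
ib = intern(b)

# After interning, equality can be checked by identity.
same_object = ia is ib
```

Because every equal string is mapped to one shared object, long-lived tables keyed on many repeated strings hold a single copy of each key. Whether that copy can ever be freed is exactly the mortal/immortal question debated in this thread.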
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=576101&group_id=5470 From noreply@sourceforge.net Sat Aug 17 03:23:47 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Fri, 16 Aug 2002 19:23:47 -0700 Subject: [Patches] [ python-Patches-595821 ] --witch-cxx=c++ and correct LINKCC Message-ID: Patches item #595821, was opened at 2002-08-15 22:06 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=595821&group_id=5470 Category: Build Group: Python 2.2.x Status: Open Resolution: None Priority: 5 Submitted By: Jochen Küpper (kuepper) Assigned to: Nobody/Anonymous (nobody) Summary: --witch-cxx=c++ and correct LINKCC Initial Comment: Here LINKCC is not determined correctly when using configure --with-cxx=c++ This is a RedHat-7.1 based system, gcc-3.2, python from cvs. The following patch against configure.in from "release22-maint" might be a little overkill, but it makes sure that the linker uses (and finds) libstdc++: Index: configure.in =================================================================== RCS file: /cvsroot/python/python/dist/src/configure.in,v retrieving revision 1.288.6.6 diff -u -r1.288.6.6 configure.in --- configure.in 2 Jun 2002 17:34:47 -0000 1.288.6.6 +++ configure.in 16 Aug 2002 02:04:47 -0000 @@ -279,7 +279,7 @@ if test -z "$CXX"; then LINKCC="\ \" else - echo 'int main(){return 0;}' > conftest.$ac_ext + echo '#include int main(){string c('Hello'); return 0;}' > conftest.$ac_ext $CXX -c conftest.$ac_ext 2>&5 if $CC -o conftest$ac_exeext conftest.$ac_objext 2>&5 \ && test -s conftest$ac_exeext && ./conftest$ac_exeext ---------------------------------------------------------------------- >Comment By: Jochen Küpper (kuepper) Date: 2002-08-16 22:23 Message: Logged In: YES user_id=19849 Since I have a slow modem...
I applied the mainline patch to my maint22-branch sources, ignoring the one failed hunk for the cvs Id: cvs diff -u -r 1.318 -r 1.319 configure.in | patch -p0 Works just as well. Yes, it is a duplicate of the bug #559429 fix:O Besides the fact that I marked it as python-2.2 and it isn't fixed there, yet. Please back-port it to 2.2, though. (Who's in charge of that?) ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-08-16 11:16 Message: Logged In: YES user_id=21627 Can you please try the mainline CVS? This appears to be a duplicate of bug #559429, which has been fixed with configure.in 1.319. ---------------------------------------------------------------------- Comment By: Jochen Küpper (kuepper) Date: 2002-08-16 10:32 Message: Logged In: YES user_id=19849 I did ./configure --with-cxx=c++ && make and LINKCC was set to gcc. So later on when linking the executable I get unresolved references, i.e. Modules/ccpython.o(.eh_frame+0x11): undefined reference to `__gxx_personality_v0' These are defined in libstdc++. If LINKCC is set to c++, these go away, 'cause it knows where to find these symbols. In the current test for LINKCC nevertheless gcc ($CC) is good enough to link the conftest, so it is decided to use that for python. The patch merely supplies a conftest that really requires a C++ linker. ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-08-16 03:14 Message: Logged In: YES user_id=21627 Can you please elaborate what exactly you mean by "not determined correctly"? What did you do, what happened, what did you expect to happen, why do you think the observed behaviour is incorrect?
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=595821&group_id=5470 From noreply@sourceforge.net Sat Aug 17 08:49:00 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sat, 17 Aug 2002 00:49:00 -0700 Subject: [Patches] [ python-Patches-595821 ] --witch-cxx=c++ and correct LINKCC Message-ID: Patches item #595821, was opened at 2002-08-16 04:06 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=595821&group_id=5470 Category: Build Group: Python 2.2.x >Status: Closed >Resolution: Duplicate Priority: 5 Submitted By: Jochen Küpper (kuepper) Assigned to: Nobody/Anonymous (nobody) Summary: --witch-cxx=c++ and correct LINKCC Initial Comment: Here LINKCC is not determined correctly when using configure --with-cxx=c++ This is a RedHat-7.1 based system, gcc-3.2, python from cvs. The following patch against configure.in from "release22-maint" might be a little overkill, but it makes sure that the linker uses (and finds) libstdc++: Index: configure.in =================================================================== RCS file: /cvsroot/python/python/dist/src/configure.in,v retrieving revision 1.288.6.6 diff -u -r1.288.6.6 configure.in --- configure.in 2 Jun 2002 17:34:47 -0000 1.288.6.6 +++ configure.in 16 Aug 2002 02:04:47 -0000 @@ -279,7 +279,7 @@ if test -z "$CXX"; then LINKCC="\ \" else - echo 'int main(){return 0;}' > conftest.$ac_ext + echo '#include int main(){string c('Hello'); return 0;}' > conftest.$ac_ext $CXX -c conftest.$ac_ext 2>&5 if $CC -o conftest$ac_exeext conftest.$ac_objext 2>&5 \ && test -s conftest$ac_exeext && ./conftest$ac_exeext ---------------------------------------------------------------------- >Comment By: Martin v. Löwis (loewis) Date: 2002-08-17 09:48 Message: Logged In: YES user_id=21627 The patch is already marked as a 2.2 candidate.
I understand the PBF will manage the next release of 2.2; there is no schedule for that at the moment, thus no urgency. I'll close this report. ---------------------------------------------------------------------- Comment By: Jochen Küpper (kuepper) Date: 2002-08-17 04:23 Message: Logged In: YES user_id=19849 Since I have a slow modem... I applied the mainline patch to my maint22-branch sources, ignoring the one failed hunk for the cvs Id: cvs diff -u -r 1.318 -r 1.319 configure.in | patch -p0 Works just as well. Yes, it is a duplicate of the bug #559429 fix:O Besides the fact that I marked it as python-2.2 and it isn't fixed there, yet. Please back-port it to 2.2, though. (Who's in charge of that?) ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-08-16 17:16 Message: Logged In: YES user_id=21627 Can you please try the mainline CVS? This appears to be a duplicate of bug #559429, which has been fixed with configure.in 1.319. ---------------------------------------------------------------------- Comment By: Jochen Küpper (kuepper) Date: 2002-08-16 16:32 Message: Logged In: YES user_id=19849 I did ./configure --with-cxx=c++ && make and LINKCC was set to gcc. So later on when linking the executable I get unresolved references, i.e. Modules/ccpython.o(.eh_frame+0x11): undefined reference to `__gxx_personality_v0' These are defined in libstdc++. If LINKCC is set to c++, these go away, 'cause it knows where to find these symbols. In the current test for LINKCC nevertheless gcc ($CC) is good enough to link the conftest, so it is decided to use that for python. The patch merely supplies a conftest that really requires a C++ linker. ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-08-16 09:14 Message: Logged In: YES user_id=21627 Can you please elaborate what exactly you mean by "not determined correctly"?
What did you do, what happened, what did you expect to happen, why do you think the observed behaviour is incorrect? ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=595821&group_id=5470 From noreply@sourceforge.net Sat Aug 17 12:46:11 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sat, 17 Aug 2002 04:46:11 -0700 Subject: [Patches] [ python-Patches-589982 ] tempfile.py rewrite Message-ID: Patches item #589982, was opened at 2002-08-02 02:38 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=589982&group_id=5470 Category: Library (Lib) Group: Python 2.3 Status: Open Resolution: Accepted Priority: 5 Submitted By: Zack Weinberg (zackw) Assigned to: Guido van Rossum (gvanrossum) Summary: tempfile.py rewrite Initial Comment: This rewrite closes a number of security-relevant races in tempfile.py; makes temporary filenames much harder to guess; provides secure interfaces that can be used to close similar races elsewhere; and makes it possible to control the prefix and directory of each temporary created, individually. ---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-17 07:46 Message: Logged In: YES user_id=6380 Andrew, can you check again with current CVS? I checked in a fix to test_tempfile.py (and an auxiliary file tf_inherit_check.py) that makes the failing test much more robust (we hope). ---------------------------------------------------------------------- Comment By: Zack Weinberg (zackw) Date: 2002-08-16 14:12 Message: Logged In: YES user_id=580015 It sounds like some other test -- probably one of the ones conditioned on -u network -- is causing the child process to have a stale file descriptor open. 
Can you reproduce the problem with ./python -E -tt ./Lib/test/regrtest.py -l -u network test_socket_ssl test_socketserver test_tempfile ? ---------------------------------------------------------------------- Comment By: Andrew I MacIntyre (aimacintyre) Date: 2002-08-16 05:00 Message: Logged In: YES user_id=250749 I hope I'm doing the right thing by re-opening this rather than opening a bug report... I'm seeing a very odd failure in test_tempfile on FreeBSD 4.4. The failure occurs when I run the full regression test with TESTOPTS="-l -u network" but succeeds when TESTOPTS="-l". ./python -E -tt Lib/test/regrtest.py -l -u network test_tempfile succeeds too, as does: ./python -E -tt Lib/test/regrtest.py -v -u network test_tempfile At this point, I haven't tried other -u option combinations. The error log shows: test test_tempfile failed -- Traceback (most recent call last): File "/home/andymac/cvs/python/python-cvs/Lib/test/test_tempfile.py", line 345, in test_noinherit "child process exited successfully") File "/home/andymac/cvs/python/python-cvs/Lib/unittest.py", line 268, in failUnless if not expr: raise self.failureException, msg AssertionError: child process exited successfully Unfortunately Real Job has wiped out any time I might have had to try and debug this, and I won't have much if any time to look at this for about 3 weeks :-( Intuitive guesses about where to start looking would be welcome! The OS/2 EMX port has the mkstemped problem noted below for HP-UX, but I think I might not have picked up the fixes when I tested, so I'll have to check that again. ---------------------------------------------------------------------- Comment By: Jack Jansen (jackjansen) Date: 2002-08-14 15:44 Message: Logged In: YES user_id=45365 Nevermind. Just saw the discussion on python-dev (this is a file descriptor returned, not a file pointer, so stdio is nowhere in sight).
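The descriptor-versus-file-object point can be sketched as follows. This is an illustration only, not code from the patch: the prefix, suffix, and payload are made up, while tempfile.mkstemp() (with the text=False argument mentioned elsewhere in this thread), os.fdopen(), and os.remove() are standard library calls.

```python
import os
import tempfile

# mkstemp() returns (fd, path): an OS-level descriptor plus the file's name,
# not a stdio-style file object, so there is no "w+b" mode to pass to it.
fd, path = tempfile.mkstemp(prefix="demo-", suffix=".tmp", text=False)
try:
    # The stdio-style mode is chosen here, when wrapping the descriptor.
    with os.fdopen(fd, "w+b") as f:
        f.write(b"hello")
        f.seek(0)
        data = f.read()
    # Leaving the with-block closes the file object and the descriptor.
finally:
    os.remove(path)  # mkstemp() leaves cleanup to the caller
```

Callers that only need the name can close the descriptor immediately with os.close(fd), which is the pattern asked about elsewhere in this thread.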
---------------------------------------------------------------------- Comment By: Jack Jansen (jackjansen) Date: 2002-08-14 15:42 Message: Logged In: YES user_id=45365 Isn't it much more logical to give mkstemp() a mode="w+b" argument? The other routines have that as well, and it is also more in line with open() and such... ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-14 12:52 Message: Logged In: YES user_id=6380 Closing again. Reduced the number of temp files to 100. Changed 'binary=True' to 'text=False' default on mkstemp(). ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-14 09:16 Message: Logged In: YES user_id=6380 The "Too many open files" problem is solved. The HP system was configured to allow only 200 open file descriptors per process. But maybe the test would work just as well if it tried to create 100 instead of 1000 temp files? I expect that this would cause failures on other systems with conservative limits. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-14 09:02 Message: Logged In: YES user_id=6380 I'd like to change the binary=True argument to mkstemp into text=False; that seems easier to explain. News about the HP errors from Kalle Svensson: Traceback (most recent call last): File "../python/dist/src/Lib/test/test_tempfile.py", line 719, in ? 
test_main() File "../python/dist/src/Lib/test/test_tempfile.py", line 716, in test_main test_support.run_suite(suite) File "/mp/slaskdisk/tmp/sfarmer/python/dist/src/Lib/test/test_support.py", line 188, in run_suite raise TestFailed(err) test.test_support.TestFailed: Traceback (most recent call last): File "../python/dist/src/Lib/test/test_tempfile.py", line 295, in test_basic_many File "../python/dist/src/Lib/test/test_tempfile.py", line 278, in do_create File "../python/dist/src/Lib/test/test_tempfile.py", line 33, in failOnException File "/mp/slaskdisk/tmp/sfarmer/python/dist/src/Lib/unittest.py", line 260, in fail AssertionError: _mkstemp_inner raised exceptions.OSError: [Errno 24] Too many open files: '/tmp/aaU3irrA' ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-11 00:28 Message: Logged In: YES user_id=6380 I'm reopening this just as a precaution. The snake farm reported two messages on HP-UX 11 when the test suite was run: Exception exceptions.AttributeError: "mkstemped instance has no attribute 'fd'" in > ignored Exception exceptions.AttributeError: "mkstemped instance has no attribute 'fd'" in > ignored The mkstemped class is defined in test_maketemp.py. That error can happen if a mkstemped instance isn't fully initialized, e.g. if the _mkstemp_inner() call in mkstemped.__init__ fails. But then I would have expected a failure reported, which I don't see... ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-10 00:17 Message: Logged In: YES user_id=6380 I'm reopening this just as a precaution. The snake farm reported two messages on HP-UX 11 when the test suite was run: Exception exceptions.AttributeError: "mkstemped instance has no attribute 'fd'" in > ignored Exception exceptions.AttributeError: "mkstemped instance has no attribute 'fd'" in > ignored The mkstemped class is defined in test_maketemp.py. 
That error can happen if a mkstemped instance isn't fully initialized, e.g. if the _mkstemp_inner() call in mkstemped.__init__ fails. But then I would have expected a failure reported, which I don't see... ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-09 20:19 Message: Logged In: YES user_id=6380 Oops, looks like typos in the patch. Fixed (I hope). Question for Zack: I noticed that a few times you changed this: temp = tempfile.mktemp() into this: (fd, temp) = tempfile.mkstemp() os.close(fd) If the latter is secure, why can't mktemp() be defined as doing that? ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-09 16:53 Message: Logged In: YES user_id=33168 Guido, there are still 3 uses of mktemp after the checkin. Should these use mkstemp()? Lib/toaiff.py:102 Lib/plat-irix5/torgb.py:95 Lib/plat-irix6/torgb.py:95 ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-09 12:40 Message: Logged In: YES user_id=6380 I've checked this all in now. The changes to test_tempfile.py weren't as easily fixable to work without the tempfile.py changes as Zack thought. I hope the community will give it some review. It will probably break some Zope tests. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-09 10:50 Message: Logged In: YES user_id=6380 OK, I'll do something along those lines myself. ---------------------------------------------------------------------- Comment By: Zack Weinberg (zackw) Date: 2002-08-08 15:01 Message: Logged In: YES user_id=580015 I'm afraid my idea of patch size comes from GCC land, where this would be considered not that big at all. Only 1500 lines affected, and more than half of that is documentation and test suite! 
I tried, and failed, to break up the changes to tempfile.py itself. But there's some larger divisions that could be made. We could check in the new test_tempfile.py now, disabling the tests that refer to nonexistent functions (just comment out the lines that add those tests to the test_classes array). The changes to the rest of the test suite are also largely independent of the tempfile.py rewrite (since they replace tempfile.mktemp() with TESTFN, mostly). And the search-and-replace changes in the library can wait until after tempfile.py itself gets reviewed. Unfortunately, I am about to go on vacation for five days, so I don't have time now to do this split-up. I will try to drum up interest on python-dev in reviewing the patch as is. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-05 12:48 Message: Logged In: YES user_id=6380 I like the idea of fixing security holes. This patch is *humungous*. Even just the doc changes and the changes to tempfile.py itself are massive and require very careful reading to review all the consequences. Zack, can you try to interest someone with more time than me in reviewing this patch? What's the point of renaming all imports with a leading underscore? I thought __all__ took care of that. ---------------------------------------------------------------------- Comment By: Zack Weinberg (zackw) Date: 2002-08-03 02:53 Message: Logged In: YES user_id=580015 I've revised the patch; ignore the old one. This version includes a vastly expanded test_tempfile.py which hits every line that I know how to test. The omissions are marked - it's mostly non-Unix issues. Also, I went through the entire CVS repository and replaced all uses of tempfile.mktemp with mkstemp/mkdtemp/NamedTemporaryFile, as appropriate. The sole exception is Lib/os.py, which is addressed by patch #590294. 
The sole functional change to tempfile.py itself, from the previous, is to throw os.O_NOFOLLOW into the open flags. This closes yet another hole - on some systems, without this flag, open(file, O_CREAT|O_EXCL) will follow a symbolic link that points to a nonexistent file, and create the link target. (This has no effect on a symlink in the directory components of the pathname - if the sysadmin has symlinked /tmp to /hugedisk/scratch, that still works.) ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-02 10:45 Message: Logged In: YES user_id=6380 This needs some serious review! Volunteers??? ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=589982&group_id=5470 From noreply@sourceforge.net Sat Aug 17 18:39:24 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sat, 17 Aug 2002 10:39:24 -0700 Subject: [Patches] [ python-Patches-580995 ] new version of Set class Message-ID: Patches item #580995, was opened at 2002-07-13 07:53 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=580995&group_id=5470 Category: Library (Lib) Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Alex Martelli (aleax) Assigned to: Guido van Rossum (gvanrossum) Summary: new version of Set class Initial Comment: As per python-dev discussion on Sat 13 July 2002, subject "Dict constructor". A version of Greg Wilson's sandbox Set class that avoids the trickiness of implicitly freezing a set when __hash__ is called on it. 
Rather, uses several classes: Set itself has no __hash__ and represents a general, mutable set; BaseSet, its superclass, has all functionality common to mutable and immutable sets; ImmutableSet also subclasses BaseSet and adds __hash__; a wrapper _TemporarilyImmutableSet wraps a Set exposing only __hash__ (identical to that an ImmutableSet built from the Set would have) and __eq__ and __ne__ (delegated to the Set instance). Set.add(self, x) attempts to call x=x._asImmutable() (if AttributeError leaves x alone); Set._asImmutable(self) returns ImmutableSet(self). Membership test BaseSet.__contains__(self, x) attempt to call x = x._asTemporarilyImmutable() (if AttributeError leaves x alone); Set._asTemporarilyImmutable(self) returns TemporarilyImmutableSet(self). I've left Greg's code mostly alone otherwise except for fixing bugs/obsolescent usage (e.g. dictionary rather than dict) and making what were ValueError into TypeError (ValueError was doubtful earlier, is untenable now that mutable and immutable sets are different types). The change in exceptions forced me to change the unit tests in test_set.py, too, but I made no other changes nor additions. ---------------------------------------------------------------------- Comment By: Greg Chapman (glchapman) Date: 2002-08-17 09:39 Message: Logged In: YES user_id=86307 I wonder if the binary operators perhaps should be changed to allow other to be a sequence (or really, anything iterable). That is, if other is not a set type, try to create an ImmutableSet using other and, if successful, use that for the operation. Of course, the conversion of iterable to set can be done explicitly by the code which uses the set operation, but it might be desirable to make this conversion implicit (under the "be generous in what you accept" principle). ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-15 12:37 Message: Logged In: YES user_id=6380 Thanks, Alex! 
I've checked in a major rewrite of this in /nondist/sandbox/sets/set.py, replacing of Greg V. Wilson's version. ---------------------------------------------------------------------- Comment By: Alex Martelli (aleax) Date: 2002-07-18 12:27 Message: Logged In: YES user_id=60314 Changed as per GvR comments so now sets have-a dict rather than being-a dict. Made code more direct in some places (using list comprehensions rather than loops where appropriate). ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=580995&group_id=5470 From noreply@sourceforge.net Sat Aug 17 18:49:01 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sat, 17 Aug 2002 10:49:01 -0700 Subject: [Patches] [ python-Patches-580995 ] new version of Set class Message-ID: Patches item #580995, was opened at 2002-07-13 11:53 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=580995&group_id=5470 Category: Library (Lib) Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Alex Martelli (aleax) Assigned to: Guido van Rossum (gvanrossum) Summary: new version of Set class Initial Comment: As per python-dev discussion on Sat 13 July 2002, subject "Dict constructor". A version of Greg Wilson's sandbox Set class that avoids the trickiness of implicitly freezing a set when __hash__ is called on it. Rather, uses several classes: Set itself has no __hash__ and represents a general, mutable set; BaseSet, its superclass, has all functionality common to mutable and immutable sets; ImmutableSet also subclasses BaseSet and adds __hash__; a wrapper _TemporarilyImmutableSet wraps a Set exposing only __hash__ (identical to that an ImmutableSet built from the Set would have) and __eq__ and __ne__ (delegated to the Set instance). 
Set.add(self, x) attempts to call x=x._asImmutable() (if AttributeError leaves x alone); Set._asImmutable(self) returns ImmutableSet(self). Membership test BaseSet.__contains__(self, x) attempt to call x = x._asTemporarilyImmutable() (if AttributeError leaves x alone); Set._asTemporarilyImmutable(self) returns TemporarilyImmutableSet(self). I've left Greg's code mostly alone otherwise except for fixing bugs/obsolescent usage (e.g. dictionary rather than dict) and making what were ValueError into TypeError (ValueError was doubtful earlier, is untenable now that mutable and immutable sets are different types). The change in exceptions forced me to change the unit tests in test_set.py, too, but I made no other changes nor additions. ---------------------------------------------------------------------- >Comment By: Tim Peters (tim_one) Date: 2002-08-17 13:49 Message: Logged In: YES user_id=31435 -1 on that, Greg. If Python sees list + set it's got no more reason to believe that the programmer intended the list elements to be inserted into the set than to believe the intent was to append the set elements to the list. "Be generous in what you accept" comes from the networking world, where the chances are good that the program on the other end was written by people who can't code confused by an ambiguous spec written by people who can't write <0.9 wink>. ---------------------------------------------------------------------- Comment By: Greg Chapman (glchapman) Date: 2002-08-17 13:39 Message: Logged In: YES user_id=86307 I wonder if the binary operators perhaps should be changed to allow other to be a sequence (or really, anything iterable). That is, if other is not a set type, try to create an ImmutableSet using other and, if successful, use that for the operation. 
Of course, the conversion of iterable to set can be done explicitly by the code which uses the set operation, but it might be desirable to make this conversion implicit (under the "be generous in what you accept" principle). ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-15 16:37 Message: Logged In: YES user_id=6380 Thanks, Alex! I've checked in a major rewrite of this in /nondist/sandbox/sets/set.py, replacing of Greg V. Wilson's version. ---------------------------------------------------------------------- Comment By: Alex Martelli (aleax) Date: 2002-07-18 16:27 Message: Logged In: YES user_id=60314 Changed as per GvR comments so now sets have-a dict rather than being-a dict. Made code more direct in some places (using list comprehensions rather than loops where appropriate). ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=580995&group_id=5470 From noreply@sourceforge.net Sat Aug 17 20:24:03 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sat, 17 Aug 2002 12:24:03 -0700 Subject: [Patches] [ python-Patches-463656 ] setup.py, --with-includepath, and LD_LIB Message-ID: Patches item #463656, was opened at 2001-09-21 12:12 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=463656&group_id=5470 Category: Distutils and setup.py Group: None Status: Open Resolution: None Priority: 5 Submitted By: Frederic Giacometti (giacometti) Assigned to: Nobody/Anonymous (nobody) Summary: setup.py, --with-includepath, and LD_LIB Initial Comment: This patch improves the module detection capability in setup.py. The following improvements are implemented: - directories listed in LD_LIBRARY_PATH are also searched for shared libraries. 
- the --with-includepath option has been added to configure, to specify additional non-standard directories where the include files are to be searched for. The corresponding changes were added to setup.py (new function detect_include(), find_library_file() augmented, detect_tkinter() improved) I manually back-ported the changes from configure into configure.in, but I did not run autoconf; you might want to double-check this. Sample application: ./configure --prefix=/something --with-includepath='/mgl/apps/include:/mgl/share/include' With this patch, I get Tkinter to build correctly without editing the Setup files, with non-standard tcl/tk 8.0 to 8.3 installations, where the only tcl.h file is in /mgl/share/include/tcl8.0 (therefore, tkinter is built with tcl8.0 on this configuration). FG ---------------------------------------------------------------------- >Comment By: Frederic Giacometti (giacometti) Date: 2002-08-17 12:24 Message: Logged In: YES user_id=93657 On one hand, using -I and -L from setup.py would simplify the setup.py code, but on the other hand, in the present build system, this will require coding in configure.in some of what we can write in python in setup.py (e.g. the use of LD_SHARED_LIBRARY by default). I know we get the eternal story (I already hear it every day with Perl, awk, and various forms of platform-dependent shell scripting), that autoconf is a great and well-known programming language which everybody uses, and why should we do in Python what we can already do in autoconf, and so on... I wish I could put a smiley on this, but I stopped laughing at these things, since I meet them endlessly. Regards, FG ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-08-14 07:56 Message: Logged In: YES user_id=21627 Frederic, do you still think there is a need for this patch?
---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-01-17 09:58 Message: Logged In: YES user_id=6656 You do know that you can pass -I and -L options to setup.py? That might be a less involved way of doing what you want. ---------------------------------------------------------------------- Comment By: Frederic Giacometti (giacometti) Date: 2001-09-27 17:17 Message: Logged In: YES user_id=93657 I moved the functions find_library_file() and detect_include() to distutils.sysconfig(), so that they can be reused for configuring third party modules too (e.g.: PyOpenGL...). Let me know if you wish a patch for this. Frederic Giacometti ---------------------------------------------------------------------- Comment By: Frederic Giacometti (giacometti) Date: 2001-09-26 15:56 Message: Logged In: YES user_id=93657 I'm replacing the patch with an improved version (against main line as of 09/26/01). New features: - configure is generated from configure.in, with autoconf - detect_tkinter also checks the version number inside the tcl.h and tk.h files (#define TCL_VERSION, #define TK_VERSION...). The 'tk_detect' improvement is in this same patch as the '--include-patch' feature; since the second one was written to get the first one working. 
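The version check described in the comment above (reading #define TCL_VERSION out of tcl.h) could look roughly like the following. This is only a sketch, not the code from the patch: the helper name read_define and the sample header text are made up for illustration, and it assumes the macro is defined as a quoted string on a single line.

```python
import re


def read_define(header_text, name):
    """Return the quoted value of `#define NAME "value"`, or None if absent.

    Hypothetical helper for illustration; not part of setup.py.
    """
    pattern = r'^\s*#\s*define\s+' + re.escape(name) + r'\s+"([^"]*)"'
    match = re.search(pattern, header_text, re.MULTILINE)
    return match.group(1) if match else None


# Made-up excerpt standing in for the contents of a tcl.h file.
SAMPLE_TCL_H = '''
#define TCL_MAJOR_VERSION 8
#define TCL_MINOR_VERSION 0
#define TCL_VERSION "8.0"
'''

version = read_define(SAMPLE_TCL_H, "TCL_VERSION")
missing = read_define(SAMPLE_TCL_H, "TK_VERSION")
print(version, missing)
```

A real detection routine would read the header from each candidate include directory and compare the value against the version of the library it found.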
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=463656&group_id=5470 From noreply@sourceforge.net Sat Aug 17 21:07:17 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sat, 17 Aug 2002 13:07:17 -0700 Subject: [Patches] [ python-Patches-580995 ] new version of Set class Message-ID: Patches item #580995, was opened at 2002-07-13 11:53 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=580995&group_id=5470 Category: Library (Lib) Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Alex Martelli (aleax) Assigned to: Guido van Rossum (gvanrossum) Summary: new version of Set class Initial Comment: As per python-dev discussion on Sat 13 July 2002, subject "Dict constructor". A version of Greg Wilson's sandbox Set class that avoids the trickiness of implicitly freezing a set when __hash__ is called on it. Rather, uses several classes: Set itself has no __hash__ and represents a general, mutable set; BaseSet, its superclass, has all functionality common to mutable and immutable sets; ImmutableSet also subclasses BaseSet and adds __hash__; a wrapper _TemporarilyImmutableSet wraps a Set exposing only __hash__ (identical to that an ImmutableSet built from the Set would have) and __eq__ and __ne__ (delegated to the Set instance). Set.add(self, x) attempts to call x=x._asImmutable() (if AttributeError leaves x alone); Set._asImmutable(self) returns ImmutableSet(self). Membership test BaseSet.__contains__(self, x) attempt to call x = x._asTemporarilyImmutable() (if AttributeError leaves x alone); Set._asTemporarilyImmutable(self) returns TemporarilyImmutableSet(self). I've left Greg's code mostly alone otherwise except for fixing bugs/obsolescent usage (e.g. 
dictionary rather than dict) and making what were ValueError into TypeError (ValueError was doubtful earlier, is untenable now that mutable and immutable sets are different types). The change in exceptions forced me to change the unit tests in test_set.py, too, but I made no other changes nor additions. ---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-17 16:07 Message: Logged In: YES user_id=6380 I concur with Tim on that one, even though the set union operator is |, not +. In general, Python's container types don't mix in operations that much -- you can't even concatenate a list to a tuple, so we're already being generous by allowing Set and ImmutableSet to mix. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-08-17 13:49 Message: Logged In: YES user_id=31435 -1 on that, Greg. If Python sees list + set it's got no more reason to believe that the programmer intended the list elements to be inserted into the set than to believe the intent was to append the set elements to the list. "Be generous in what you accept" comes from the networking world, where the chances are good that the program on the other end was written by people who can't code confused by an ambiguous spec written by people who can't write <0.9 wink>. ---------------------------------------------------------------------- Comment By: Greg Chapman (glchapman) Date: 2002-08-17 13:39 Message: Logged In: YES user_id=86307 I wonder if the binary operators perhaps should be changed to allow other to be a sequence (or really, anything iterable). That is, if other is not a set type, try to create an ImmutableSet using other and, if successful, use that for the operation.
Of course, the conversion of iterable to set can be done explicitly by the code which uses the set operation, but it might be desirable to make this conversion implicit (under the "be generous in what you accept" principle). ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-15 16:37 Message: Logged In: YES user_id=6380 Thanks, Alex! I've checked in a major rewrite of this in /nondist/sandbox/sets/set.py, replacing of Greg V. Wilson's version. ---------------------------------------------------------------------- Comment By: Alex Martelli (aleax) Date: 2002-07-18 16:27 Message: Logged In: YES user_id=60314 Changed as per GvR comments so now sets have-a dict rather than being-a dict. Made code more direct in some places (using list comprehensions rather than loops where appropriate). ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=580995&group_id=5470 From noreply@sourceforge.net Sat Aug 17 22:55:10 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sat, 17 Aug 2002 14:55:10 -0700 Subject: [Patches] [ python-Patches-596581 ] urllib.splituser(): '@' in usrname Message-ID: Patches item #596581, was opened at 2002-08-17 16:55 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=596581&group_id=5470 Category: Modules Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Jonathan Simms (slyphon) Assigned to: Nobody/Anonymous (nobody) Summary: urllib.splituser(): '@' in usrname Initial Comment: This was in response to "[ 581529 ] bug in splituser(host) in urllib". This was a one-line change. The issue was that if a username contained the '@' symbol, the urllib.splituser(host) method wouldn't return the correct information. 
if you were to try urllib.splituser(Jones@CrunchyFrog.net@Whizzo.com), you'd get ['Jones', 'CrunchyFrog.net@Whizzo.com'] after applying this patch, you get ['Jones@CrunchyFrog.net', 'Whizzo.com'] So, there you are... :-) ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=596581&group_id=5470 From noreply@sourceforge.net Sun Aug 18 01:49:51 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sat, 17 Aug 2002 17:49:51 -0700 Subject: [Patches] [ python-Patches-596581 ] urllib.splituser(): '@' in usrname Message-ID: Patches item #596581, was opened at 2002-08-17 16:55 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=596581&group_id=5470 Category: Modules Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Jonathan Simms (slyphon) >Assigned to: Raymond Hettinger (rhettinger) Summary: urllib.splituser(): '@' in usrname Initial Comment: This was in response to "[ 581529 ] bug in splituser(host) in urllib". This was a one-line change. The issue was that if a username contained the '@' symbol, the urllib.splituser(host) method wouldn't return the correct information. if you were to try urllib.splituser(Jones@CrunchyFrog.net@Whizzo.com), you'd get ['Jones', 'CrunchyFrog.net@Whizzo.com'] after applying this patch, you get ['Jones@CrunchyFrog.net', 'Whizzo.com'] So, there you are... 
:-) ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=596581&group_id=5470 From noreply@sourceforge.net Sun Aug 18 07:52:25 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sat, 17 Aug 2002 23:52:25 -0700 Subject: [Patches] [ python-Patches-589982 ] tempfile.py rewrite Message-ID: Patches item #589982, was opened at 2002-08-02 16:38 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=589982&group_id=5470 Category: Library (Lib) Group: Python 2.3 >Status: Closed Resolution: Accepted Priority: 5 Submitted By: Zack Weinberg (zackw) Assigned to: Guido van Rossum (gvanrossum) Summary: tempfile.py rewrite Initial Comment: This rewrite closes a number of security-relevant races in tempfile.py; makes temporary filenames much harder to guess; provides secure interfaces that can be used to close similar races elsewhere; and makes it possible to control the prefix and directory of each temporary created, individually. ---------------------------------------------------------------------- >Comment By: Andrew I MacIntyre (aimacintyre) Date: 2002-08-18 16:52 Message: Logged In: YES user_id=250749 I didn't get to try Zack's suggestion before my FreeBSD auto build/test setup caught up with Guido's checkin. With Guido's checkin, test_tempfile passes the TESTOPT="-l -u network" test run that was previously failing. OS/2 EMX actually had 2 problems: - file/directory permissions behave like Windows, not Unix; - EMX defaults to only 40 file handles. I've checked in a small change to test_tempfile.py to deal with the first issue (making it behave like Windows), and checked in a Makefile change that ups the number of file handles to 250. I've also added notes to the port README about ways of overriding the number of file handles. I'm closing this patch as my issues are now resolved. 
---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-17 21:46 Message: Logged In: YES user_id=6380 Andrew, can you check again with current CVS? I checked in a fix to test_tempfile.py (and an auxiliary file tf_inherit_check.py) that makes the failing test much more robust (we hope). ---------------------------------------------------------------------- Comment By: Zack Weinberg (zackw) Date: 2002-08-17 04:12 Message: Logged In: YES user_id=580015 It sounds like some other test -- probably one of the ones conditioned on -u network -- is causing the child process to have a stale file descriptor open. Can you reproduce the problem with ./python -E -tt ./Lib/test/regrtest.py -l -u network test_socket_ssl test_socketserver test_tempfile ? ---------------------------------------------------------------------- Comment By: Andrew I MacIntyre (aimacintyre) Date: 2002-08-16 19:00 Message: Logged In: YES user_id=250749 I hope I'm doing the right thing by re-opening this rather than opening a bug report... I'm seeing a very odd failure in test_tempfile on FreeBSD 4.4. The failure occurs when I run the full regression test with TESTOPTS="-l -u network" but succeeds when TESTOPTS="-l". ./python -E -tt Lib/test/regrtest.py -l -u network test_tempfile succeeds too, as does: ./python -E -tt Lib/test/regrtest.py -v -u network test_tempfile At this point, I haven't tried other -u option combinations.
The error log shows: test test_tempfile failed -- Traceback (most recent call last): File "/home/andymac/cvs/python/python-cvs/Lib/test/test_tempfile.py", line 345, in test_noinherit "child process exited successfully") File "/home/andymac/cvs/python/python-cvs/Lib/unittest.py", line 268, in failUnless if not expr: raise self.failureException, msg AssertionError: child process exited successfully Unfortunately Real Job has wiped out any time I might have had to try and debug this, and I won't have much if any time to look at this for about 3 weeks :-( Intuitive guesses about where to start looking would be welcome! The OS/2 EMX port has the mkstemped problem noted below for HP-UX, but I think I might not have picked up the fixes when I tested, so I'll have to check that again. ---------------------------------------------------------------------- Comment By: Jack Jansen (jackjansen) Date: 2002-08-15 05:44 Message: Logged In: YES user_id=45365 Nevermind. Just saw the discussion on python-dev (this is a file descriptor returned, not a file pointer, so stdio is nowhere in sight). ---------------------------------------------------------------------- Comment By: Jack Jansen (jackjansen) Date: 2002-08-15 05:42 Message: Logged In: YES user_id=45365 Isn't it much more logical to give mkstemp() a mode="w+b" argument? The other routines have that as well, and it is also more in line with open() and such... ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-15 02:52 Message: Logged In: YES user_id=6380 Closing again. Reduced the number of temp files to 100. Changed 'binary=True' to 'text=False' default on mkstemp(). ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-14 23:16 Message: Logged In: YES user_id=6380 The "Too many open files" problem is solved.
The HP system was configured to allow only 200 open file descriptors per process. But maybe the test would work just as well if it tried to create 100 instead of 1000 temp files? I expect that this would cause failures on other systems with conservative limits. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-14 23:02 Message: Logged In: YES user_id=6380 I'd like to change the binary=True argument to mkstemp into text=False; that seems easier to explain. News about the HP errors from Kalle Svensson: Traceback (most recent call last): File "../python/dist/src/Lib/test/test_tempfile.py", line 719, in ? test_main() File "../python/dist/src/Lib/test/test_tempfile.py", line 716, in test_main test_support.run_suite(suite) File "/mp/slaskdisk/tmp/sfarmer/python/dist/src/Lib/test/test_support.py", line 188, in run_suite raise TestFailed(err) test.test_support.TestFailed: Traceback (most recent call last): File "../python/dist/src/Lib/test/test_tempfile.py", line 295, in test_basic_many File "../python/dist/src/Lib/test/test_tempfile.py", line 278, in do_create File "../python/dist/src/Lib/test/test_tempfile.py", line 33, in failOnException File "/mp/slaskdisk/tmp/sfarmer/python/dist/src/Lib/unittest.py", line 260, in fail AssertionError: _mkstemp_inner raised exceptions.OSError: [Errno 24] Too many open files: '/tmp/aaU3irrA' ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-11 14:28 Message: Logged In: YES user_id=6380 I'm reopening this just as a precaution. The snake farm reported two messages on HP-UX 11 when the test suite was run: Exception exceptions.AttributeError: "mkstemped instance has no attribute 'fd'" in > ignored Exception exceptions.AttributeError: "mkstemped instance has no attribute 'fd'" in > ignored The mkstemped class is defined in test_maketemp.py. 
That error can happen if a mkstemped instance isn't fully initialized, e.g. if the _mkstemp_inner() call in mkstemped.__init__ fails. But then I would have expected a failure reported, which I don't see... ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-10 14:17 Message: Logged In: YES user_id=6380 I'm reopening this just as a precaution. The snake farm reported two messages on HP-UX 11 when the test suite was run: Exception exceptions.AttributeError: "mkstemped instance has no attribute 'fd'" in > ignored Exception exceptions.AttributeError: "mkstemped instance has no attribute 'fd'" in > ignored The mkstemped class is defined in test_maketemp.py. That error can happen if a mkstemped instance isn't fully initialized, e.g. if the _mkstemp_inner() call in mkstemped.__init__ fails. But then I would have expected a failure reported, which I don't see... ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-10 10:19 Message: Logged In: YES user_id=6380 Oops, looks like typos in the patch. Fixed (I hope). Question for Zack: I noticed that a few times you changed this: temp = tempfile.mktemp() into this: (fd, temp) = tempfile.mkstemp() os.close(fd) If the latter is secure, why can't mktemp() be defined as doing that? ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-08-10 06:53 Message: Logged In: YES user_id=33168 Guido, there are still 3 uses of mktemp after the checkin. Should these use mkstemp()? Lib/toaiff.py:102 Lib/plat-irix5/torgb.py:95 Lib/plat-irix6/torgb.py:95 ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-10 02:40 Message: Logged In: YES user_id=6380 I've checked this all in now. 
The changes to test_tempfile.py weren't as easily fixable to work without the tempfile.py changes as Zack thought. I hope the community will give it some review. It will probably break some Zope tests. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-10 00:50 Message: Logged In: YES user_id=6380 OK, I'll do something along those lines myself. ---------------------------------------------------------------------- Comment By: Zack Weinberg (zackw) Date: 2002-08-09 05:01 Message: Logged In: YES user_id=580015 I'm afraid my idea of patch size comes from GCC land, where this would be considered not that big at all. Only 1500 lines affected, and more than half of that is documentation and test suite! I tried, and failed, to break up the changes to tempfile.py itself. But there's some larger divisions that could be made. We could check in the new test_tempfile.py now, disabling the tests that refer to nonexistent functions (just comment out the lines that add those tests to the test_classes array). The changes to the rest of the test suite are also largely independent of the tempfile.py rewrite (since they replace tempfile.mktemp() with TESTFN, mostly). And the search-and-replace changes in the library can wait until after tempfile.py itself gets reviewed. Unfortunately, I am about to go on vacation for five days, so I don't have time now to do this split-up. I will try to drum up interest on python-dev in reviewing the patch as is. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-06 02:48 Message: Logged In: YES user_id=6380 I like the idea of fixing security holes. This patch is *humungous*. Even just the doc changes and the changes to tempfile.py itself are massive and require very careful reading to review all the consequences. Zack, can you try to interest someone with more time than me in reviewing this patch? 
What's the point of renaming all imports with a leading underscore? I thought __all__ took care of that. ---------------------------------------------------------------------- Comment By: Zack Weinberg (zackw) Date: 2002-08-03 16:53 Message: Logged In: YES user_id=580015 I've revised the patch; ignore the old one. This version includes a vastly expanded test_tempfile.py which hits every line that I know how to test. The omissions are marked - it's mostly non-Unix issues. Also, I went through the entire CVS repository and replaced all uses of tempfile.mktemp with mkstemp/mkdtemp/NamedTemporaryFile, as appropriate. The sole exception is Lib/os.py, which is addressed by patch #590294. The sole functional change to tempfile.py itself, from the previous, is to throw os.O_NOFOLLOW into the open flags. This closes yet another hole - on some systems, without this flag, open(file, O_CREAT|O_EXCL) will follow a symbolic link that points to a nonexistent file, and create the link target. (This has no effect on a symlink in the directory components of the pathname - if the sysadmin has symlinked /tmp to /hugedisk/scratch, that still works.) ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-03 00:45 Message: Logged In: YES user_id=6380 This needs some serious review! Volunteers??? 
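For readers following the mktemp()-to-mkstemp() replacements discussed in this thread, here is a minimal sketch of the pattern using the tempfile functions as they exist in the present-day standard library (which may differ in detail from the patch under review). The prefix, suffix and directory names are arbitrary choices for the example.

```python
import os
import tempfile

# mkdtemp() creates a private scratch directory; mkstemp() then creates the
# file itself and hands back an already-open OS-level file descriptor plus
# the path, so there is no window between picking a name and creating the file.
scratch_dir = tempfile.mkdtemp(prefix="demo-")
fd, path = tempfile.mkstemp(prefix="report-", suffix=".txt", dir=scratch_dir)
try:
    # Wrap the descriptor in a file object; closing it closes the descriptor.
    with os.fdopen(fd, "w") as handle:
        handle.write("hello\n")
    with open(path) as handle:
        contents = handle.read()
finally:
    # mkstemp()/mkdtemp() never clean up after themselves; the caller must.
    os.remove(path)
    os.rmdir(scratch_dir)

print(contents)
```

Note that mkstemp() returns a file descriptor rather than a file object, which is the point Jack Jansen's follow-up comment in this thread picks up on.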
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=589982&group_id=5470 From noreply@sourceforge.net Sun Aug 18 21:11:57 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sun, 18 Aug 2002 13:11:57 -0700 Subject: [Patches] [ python-Patches-596581 ] urllib.splituser(): '@' in usrname Message-ID: Patches item #596581, was opened at 2002-08-17 16:55 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=596581&group_id=5470 Category: Modules Group: Python 2.3 >Status: Closed >Resolution: Fixed Priority: 5 Submitted By: Jonathan Simms (slyphon) Assigned to: Raymond Hettinger (rhettinger) Summary: urllib.splituser(): '@' in usrname Initial Comment: This was in response to "[ 581529 ] bug in splituser(host) in urllib". This was a one-line change. The issue was that if a username contained the '@' symbol, the urllib.splituser(host) method wouldn't return the correct information. if you were to try urllib.splituser(Jones@CrunchyFrog.net@Whizzo.com), you'd get ['Jones', 'CrunchyFrog.net@Whizzo.com'] after applying this patch, you get ['Jones@CrunchyFrog.net', 'Whizzo.com'] So, there you are... :-) ---------------------------------------------------------------------- >Comment By: Raymond Hettinger (rhettinger) Date: 2002-08-18 15:11 Message: Logged In: YES user_id=80475 Applied as urllib.py 1.150 and 1.135.6.4 Closing patch and related bug. 
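The behaviour change described in this item amounts to splitting at the last '@' instead of the first. The following is only an illustration of that idea, not the one-line change that was applied to urllib.py; the function name split_user is hypothetical, and it returns a tuple rather than the list shown in the report.

```python
def split_user(host):
    """Split 'user@host' at the LAST '@'.

    Returns (user, host), or (None, host) when there is no '@' at all.
    Hypothetical stand-in for illustration; not urllib's own function.
    """
    user, sep, rest = host.rpartition("@")
    if not sep:
        return None, host
    return user, rest


print(split_user("Jones@CrunchyFrog.net@Whizzo.com"))
print(split_user("Whizzo.com"))
```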
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=596581&group_id=5470 From noreply@sourceforge.net Mon Aug 19 13:57:03 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 19 Aug 2002 05:57:03 -0700 Subject: [Patches] [ python-Patches-554192 ] mimetypes: all extensions for a type Message-ID: Patches item #554192, was opened at 2002-05-09 19:31 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=554192&group_id=5470 Category: Library (Lib) Group: None Status: Open Resolution: None Priority: 5 Submitted By: Walter Dörwald (doerwalter) Assigned to: Martin v. Löwis (loewis) Summary: mimetypes: all extensions for a type Initial Comment: This patch adds a function guess_all_extensions to mimetypes.py. This function returns all known extensions for a given type, not just the first one found in the types_map dictionary. guess_extension is still present and returns the first from the list. ---------------------------------------------------------------------- >Comment By: Walter Dörwald (doerwalter) Date: 2002-08-19 14:57 Message: Logged In: YES user_id=89016 diff3.txt adds a strict=True to add_type and to all methods that call add_type (i.e. read() and readfp()). types_map and common_types are combined into a dict tuple (in the class, on the module level they are still two dicts, to be backwards compatible.) What about adding a guess_all_types(), that returns a list of all registered mimetypes for an extension? This way we would be able to handle duplicates. ---------------------------------------------------------------------- Comment By: Barry A. Warsaw (bwarsaw) Date: 2002-08-15 20:18 Message: Logged In: YES user_id=12800 If add_type() is going to be public, shouldn't it have a "strict" flag to decide whether to add it to the standard types dict or the common types dict?
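As a rough illustration of the interface being discussed (add_type() with a strict flag, and guess_all_extensions() returning every extension for a type), the sketch below uses the mimetypes module as it exists in the present-day standard library, which may not match the diffs attached to this item. The MIME type and the extensions are invented for the example, and a private MimeTypes instance is used so the module-level tables are left alone.

```python
import mimetypes

# Private database, so the example does not touch the module-level maps.
db = mimetypes.MimeTypes()

# Register two extensions for an invented type in the non-standard
# ("common") table by passing strict=False.
db.add_type("application/x-hypothetical", ".hyp", strict=False)
db.add_type("application/x-hypothetical", ".hypo", strict=False)

# With the default strict=True only the standard table is consulted.
strict_exts = db.guess_all_extensions("application/x-hypothetical")
# With strict=False the non-standard registrations are included as well.
all_exts = db.guess_all_extensions("application/x-hypothetical", strict=False)
# guess_extension() still returns a single extension, the first in the list.
first_ext = db.guess_extension("application/x-hypothetical", strict=False)

print(strict_exts, all_exts, first_ext)
```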
---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-08-15 19:40 Message: Logged In: YES user_id=89016 diff2.txt adds the global version of add_type and the documentation in Doc/lib/libmimetypes.tex. ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-07-31 13:24 Message: Logged In: YES user_id=89016 OK, I'll change the patch and post the question to python-dev next week (I'm on vacation right now). ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-07-30 14:34 Message: Logged In: YES user_id=21627 I'm in favour of exposing it on the module level. If you are uncertain, you might want to ask on python-dev. ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-07-30 13:00 Message: Logged In: YES user_id=89016 It *is* used in two spots: The constructor and the readfp method. But exposing it at the module level could make sense, because it is the atomic method of adding mime type information. So should I change the patch to expose it at the module level and change the LaTeX documentation accordingly? ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-07-29 10:44 Message: Logged In: YES user_id=21627 I can't see the point of making it private, since it is not used inside the module. If you plan to use it, that usage certainly is outside of the module, so the method would be public. If it is public, it needs to be exposed on the module level, and it needs to be documented. ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-07-29 10:23 Message: Logged In: YES user_id=89016 The patch adds an inverted mapping (i.e. mapping from type to a list of extensions).
add_type simplifies adding a type<->ext mapping to both dictionaries. If this method should not be exposed we could make the name private. (_add_type) ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-07-28 12:30 Message: Logged In: YES user_id=21627 What is the role of add_type in this patch? ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=554192&group_id=5470 From noreply@sourceforge.net Mon Aug 19 17:40:54 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 19 Aug 2002 09:40:54 -0700 Subject: [Patches] [ python-Patches-597220 ] frameobject.c cache friendliness patch Message-ID: Patches item #597220, was opened at 2002-08-19 16:40 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=597220&group_id=5470 Category: Core (C code) Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Michael Hudson (mwh) Assigned to: Guido van Rossum (gvanrossum) Summary: frameobject.c cache friendliness patch Initial Comment: I was playing around with cachegrind, a cache profiler, and noticed that some error checking code in PyFrame_New generated quite a lot of cache misses. As the test is never going to pass without very dodgy C code, I've moved it inside Py_DEBUG.
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=597220&group_id=5470 From noreply@sourceforge.net Mon Aug 19 17:44:13 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 19 Aug 2002 09:44:13 -0700 Subject: [Patches] [ python-Patches-597221 ] "simplification" to ceval.c Message-ID: Patches item #597221, was opened at 2002-08-19 16:44 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=597221&group_id=5470 Category: Core (C code) Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Michael Hudson (mwh) Assigned to: Nobody/Anonymous (nobody) Summary: "simplification" to ceval.c Initial Comment: Here's a simple, yet subtle change to ceval.c, made possible by my setlineno work. Includes a comment block explaining why it works. The main motivation is simplification -- I'm not expecting any speed up (or slow down, or whatever). ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=597221&group_id=5470 From noreply@sourceforge.net Mon Aug 19 17:49:03 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 19 Aug 2002 09:49:03 -0700 Subject: [Patches] [ python-Patches-597220 ] frameobject.c cache friendliness patch Message-ID: Patches item #597220, was opened at 2002-08-19 12:40 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=597220&group_id=5470 Category: Core (C code) Group: Python 2.3 Status: Open >Resolution: Accepted Priority: 5 Submitted By: Michael Hudson (mwh) >Assigned to: Michael Hudson (mwh) Summary: frameobject.c cache friendliness patch Initial Comment: I was playing around with cachegrind, a cache profiler, and noticed that some error checking code in PyFrame_New generated quite a lot of cache misses. 
As the test is never going to pass without very dodgy C code, I've moved it inside Py_DEBUG. ---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2002-08-19 12:49 Message: Logged In: YES user_id=6380 Hm. The type tests on code and back are clearly bogus, since the variable types are already proof enough. So those can go altogether. For the rest I agree. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=597220&group_id=5470 From noreply@sourceforge.net Mon Aug 19 17:55:57 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 19 Aug 2002 09:55:57 -0700 Subject: [Patches] [ python-Patches-597220 ] frameobject.c cache friendliness patch Message-ID: