From noreply@sourceforge.net Mon Apr 1 02:46:16 2002
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Sun, 31 Mar 2002 18:46:16 -0800
Subject: [Patches] [ python-Patches-528022 ] PEP 285 - Adding a bool type
Message-ID:

Patches item #528022, was opened at 2002-03-10 00:45
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=528022&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Guido van Rossum (gvanrossum)
>Assigned to: Guido van Rossum (gvanrossum)
Summary: PEP 285 - Adding a bool type

Initial Comment:
Here's a preliminary implementation of the PEP, including unittests checking the promises made in the PEP (test_bool.py) and (some) documentation. With this, 12 tests fail for me (on Linux); I'll look into these later. They appear shallow (mostly doctests dying on True or False where 1 or 0 was expected).

Note: the presence of this patch does not mean that the PEP is accepted -- it just means that a sample implementation exists in case someone wants to explore the effects of the PEP on their code.

----------------------------------------------------------------------

>Comment By: Guido van Rossum (gvanrossum)
Date: 2002-03-31 21:46

Message:
Logged In: YES
user_id=6380

Here's an updated diff (booldiff2.txt). It fixes a refcount bug in bool_repr(), and works with current CVS. With this patch set, 10 standard tests fail for shallow reasons having to do with str() or repr() returning False or True instead of 0 or 1.
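A quick check in today's Python (where the PEP was long since accepted) shows why these failures are shallow: bool subclasses int, so values compare and compute exactly like 0 and 1, and only str()/repr() changed.

```python
import marshal, pickle

# bool subclasses int: arithmetic and comparisons are unchanged,
# only str()/repr() differ -- which is exactly what broke doctests
# written to expect '1' or '0'.
assert isinstance(True, int)
assert True == 1 and False == 0
assert True + True == 2
assert repr(True) == 'True' and str(False) == 'False'

# In today's Python, pickle and marshal both round-trip bools
# (a point raised later in this thread about the 2002 patch).
assert pickle.loads(pickle.dumps(True)) is True
assert marshal.loads(marshal.dumps(False)) is False
```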
Here are the failed tests: test_descr test_descrtut test_difflib test_doctest test_extcall test_generators test_gettext test_richcmp test_richcompare test_unicode

----------------------------------------------------------------------

You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=528022&group_id=5470

From noreply@sourceforge.net Mon Apr 1 09:38:19 2002
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Mon, 01 Apr 2002 01:38:19 -0800
Subject: [Patches] [ python-Patches-537536 ] bug 535444 super() broken w/classmethods
Message-ID:

Patches item #537536, was opened at 2002-03-31 23:12
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=537536&group_id=5470

Category: Core (C code)
Group: Python 2.2.x
Status: Open
Resolution: None
Priority: 5
Submitted By: Phillip J. Eby (pje)
>Assigned to: Guido van Rossum (gvanrossum)
Summary: bug 535444 super() broken w/classmethods

Initial Comment:
This patch fixes bug #535444. It is against the current CVS version of Python, and addresses the problem by adding a 'starttype' variable to 'super_getattro', which works the same as 'starttype' in the pure-Python version of super in the descriptor tutorial. This variable is then passed to the descriptor __get__ function, à la 'descriptor.__get__(self.__obj__, starttype)'.

This patch does not correct the pure-Python version of 'super' in the descriptor tutorial; I don't know where that file is or how to submit a patch for it. This patch also does not include a regression test for the bug. I do not know what would be considered the appropriate test script to place this in. Thanks.

----------------------------------------------------------------------

>Comment By: Michael Hudson (mwh)
Date: 2002-04-01 09:38

Message:
Logged In: YES
user_id=6656

Guido gets the fix too.
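For readers without the descriptor tutorial at hand, the change can be sketched in pure Python (today's syntax; the class below is an illustrative model of super, not the C implementation): the MRO is walked past the start class, and the descriptor's __get__ receives the start type as its second argument, which is what makes classmethods bind to the right class.

```python
class Super:
    # Pure-Python sketch of super() in the spirit of the descriptor
    # tutorial; names and structure are illustrative only.
    def __init__(self, type, obj=None):
        self.__type__ = type
        self.__obj__ = obj

    def __getattr__(self, attr):
        if isinstance(self.__obj__, self.__type__):
            starttype = self.__obj__.__class__   # bound to an instance
        else:
            starttype = self.__obj__             # classmethod case: obj is a class
        mro = iter(starttype.__mro__)
        for cls in mro:
            if cls is self.__type__:
                break
        for cls in mro:                          # classes *after* self.__type__
            if attr in cls.__dict__:
                x = cls.__dict__[attr]
                if hasattr(x, '__get__'):
                    # The fix: pass starttype as the second argument,
                    # so classmethod.__get__ binds to the right class.
                    return x.__get__(self.__obj__, starttype)
                return x
        raise AttributeError(attr)

class A:
    @classmethod
    def who(cls):
        return 'A sees ' + cls.__name__

class B(A):
    @classmethod
    def who(cls):
        return 'B -> ' + Super(B, cls).who()

assert B.who() == 'B -> A sees B'
```

The classmethod found on A is bound to B because starttype is forwarded; without it, __get__ has no class to bind to, which is the bug the patch fixes in super_getattro.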
----------------------------------------------------------------------

You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=537536&group_id=5470

From noreply@sourceforge.net Mon Apr 1 11:44:31 2002
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Mon, 01 Apr 2002 03:44:31 -0800
Subject: [Patches] [ python-Patches-528022 ] PEP 285 - Adding a bool type
Message-ID:

Patches item #528022, was opened at 2002-03-10 06:45
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=528022&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Guido van Rossum (gvanrossum)
Assigned to: Guido van Rossum (gvanrossum)
Summary: PEP 285 - Adding a bool type

Initial Comment:
Here's a preliminary implementation of the PEP, including unittests checking the promises made in the PEP (test_bool.py) and (some) documentation. With this, 12 tests fail for me (on Linux); I'll look into these later. They appear shallow (mostly doctests dying on True or False where 1 or 0 was expected).

Note: the presence of this patch does not mean that the PEP is accepted -- it just means that a sample implementation exists in case someone wants to explore the effects of the PEP on their code.

----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2002-04-01 13:44

Message:
Logged In: YES
user_id=21627

This patch does not support pickling of bools (the PEP should probably spell out how they are pickled). Marshalling of bools does not round-trip (you get back an int).

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2002-04-01 04:46

Message:
Logged In: YES
user_id=6380

Here's an updated diff (booldiff2.txt). It fixes a refcount bug in bool_repr(), and works with current CVS.
With this patch set, 10 standard tests fail for shallow reasons having to do with str() or repr() returning False or True instead of 0 or 1. Here are the failed tests: test_descr test_descrtut test_difflib test_doctest test_extcall test_generators test_gettext test_richcmp test_richcompare test_unicode

----------------------------------------------------------------------

You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=528022&group_id=5470

From noreply@sourceforge.net Mon Apr 1 11:55:01 2002
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Mon, 01 Apr 2002 03:55:01 -0800
Subject: [Patches] [ python-Patches-537536 ] bug 535444 super() broken w/classmethods
Message-ID:

Patches item #537536, was opened at 2002-04-01 01:12
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=537536&group_id=5470

Category: Core (C code)
Group: Python 2.2.x
Status: Open
Resolution: None
Priority: 5
Submitted By: Phillip J. Eby (pje)
Assigned to: Guido van Rossum (gvanrossum)
Summary: bug 535444 super() broken w/classmethods

Initial Comment:
This patch fixes bug #535444. It is against the current CVS version of Python, and addresses the problem by adding a 'starttype' variable to 'super_getattro', which works the same as 'starttype' in the pure-Python version of super in the descriptor tutorial. This variable is then passed to the descriptor __get__ function, à la 'descriptor.__get__(self.__obj__, starttype)'.

This patch does not correct the pure-Python version of 'super' in the descriptor tutorial; I don't know where that file is or how to submit a patch for it. This patch also does not include a regression test for the bug. I do not know what would be considered the appropriate test script to place this in. Thanks.

----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2002-04-01 13:55

Message:
Logged In: YES
user_id=21627

Please put tests for this stuff into test_descr.py.

----------------------------------------------------------------------

Comment By: Michael Hudson (mwh)
Date: 2002-04-01 11:38

Message:
Logged In: YES
user_id=6656

Guido gets the fix too.

----------------------------------------------------------------------

You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=537536&group_id=5470

From noreply@sourceforge.net Mon Apr 1 19:41:18 2002
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Mon, 01 Apr 2002 11:41:18 -0800
Subject: [Patches] [ python-Patches-537536 ] bug 535444 super() broken w/classmethods
Message-ID:

Patches item #537536, was opened at 2002-03-31 23:12
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=537536&group_id=5470

Category: Core (C code)
Group: Python 2.2.x
Status: Open
Resolution: None
Priority: 5
Submitted By: Phillip J. Eby (pje)
Assigned to: Guido van Rossum (gvanrossum)
Summary: bug 535444 super() broken w/classmethods

Initial Comment:
This patch fixes bug #535444. It is against the current CVS version of Python, and addresses the problem by adding a 'starttype' variable to 'super_getattro', which works the same as 'starttype' in the pure-Python version of super in the descriptor tutorial. This variable is then passed to the descriptor __get__ function, à la 'descriptor.__get__(self.__obj__, starttype)'.

This patch does not correct the pure-Python version of 'super' in the descriptor tutorial; I don't know where that file is or how to submit a patch for it. This patch also does not include a regression test for the bug. I do not know what would be considered the appropriate test script to place this in. Thanks.

----------------------------------------------------------------------

>Comment By: Phillip J. Eby (pje)
Date: 2002-04-01 19:41

Message:
Logged In: YES
user_id=56214

Here's the regression test. It asserts 6 things, 5 of which will fail without the typeobject.c patch to super() in place. Thanks.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2002-04-01 11:55

Message:
Logged In: YES
user_id=21627

Please put tests for this stuff into test_descr.py.

----------------------------------------------------------------------

Comment By: Michael Hudson (mwh)
Date: 2002-04-01 09:38

Message:
Logged In: YES
user_id=6656

Guido gets the fix too.

----------------------------------------------------------------------

You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=537536&group_id=5470

From noreply@sourceforge.net Tue Apr 2 04:11:24 2002
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Mon, 01 Apr 2002 20:11:24 -0800
Subject: [Patches] [ python-Patches-537536 ] bug 535444 super() broken w/classmethods
Message-ID:

Patches item #537536, was opened at 2002-03-31 18:12
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=537536&group_id=5470

Category: Core (C code)
Group: Python 2.2.x
Status: Open
>Resolution: Accepted
Priority: 5
Submitted By: Phillip J. Eby (pje)
Assigned to: Guido van Rossum (gvanrossum)
Summary: bug 535444 super() broken w/classmethods

Initial Comment:
This patch fixes bug #535444. It is against the current CVS version of Python, and addresses the problem by adding a 'starttype' variable to 'super_getattro', which works the same as 'starttype' in the pure-Python version of super in the descriptor tutorial. This variable is then passed to the descriptor __get__ function, à la 'descriptor.__get__(self.__obj__, starttype)'.

This patch does not correct the pure-Python version of 'super' in the descriptor tutorial; I don't know where that file is or how to submit a patch for it. This patch also does not include a regression test for the bug. I do not know what would be considered the appropriate test script to place this in. Thanks.

----------------------------------------------------------------------

>Comment By: Guido van Rossum (gvanrossum)
Date: 2002-04-01 23:11

Message:
Logged In: YES
user_id=6380

Accepted, also as a bugfix for 2.2.1 (assuming it works there, not tested). I can check this in in the morning. Thanks all!

----------------------------------------------------------------------

Comment By: Phillip J. Eby (pje)
Date: 2002-04-01 14:41

Message:
Logged In: YES
user_id=56214

Here's the regression test. It asserts 6 things, 5 of which will fail without the typeobject.c patch to super() in place. Thanks.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2002-04-01 06:55

Message:
Logged In: YES
user_id=21627

Please put tests for this stuff into test_descr.py.

----------------------------------------------------------------------

Comment By: Michael Hudson (mwh)
Date: 2002-04-01 04:38

Message:
Logged In: YES
user_id=6656

Guido gets the fix too.
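In today's syntax, the kind of check Martin asked for in test_descr.py looks roughly like this (class names are illustrative, not those of the test that was actually committed):

```python
class Base:
    @classmethod
    def tag(cls):
        return cls.__name__

class Derived(Base):
    @classmethod
    def tag(cls):
        # Before the fix, super() mishandled classmethods found along
        # the MRO; the method found on Base must be bound to Derived.
        return super().tag() + '!'

assert Base.tag() == 'Base'
assert Derived.tag() == 'Derived!'
assert Derived().tag() == 'Derived!'   # also binds correctly via an instance
```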
----------------------------------------------------------------------

You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=537536&group_id=5470

From noreply@sourceforge.net Tue Apr 2 07:15:15 2002
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Mon, 01 Apr 2002 23:15:15 -0800
Subject: [Patches] [ python-Patches-536661 ] splitext performances improvement
Message-ID:

Patches item #536661, was opened at 2002-03-29 09:06
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=536661&group_id=5470

Category: Library (Lib)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Sebastien Keim (s_keim)
Assigned to: Nobody/Anonymous (nobody)
Summary: splitext performances improvement

Initial Comment:
After more thought, I must admit that the behavior change in splitext that I proposed with patch 536120 is not acceptable. So I would instead propose this one, which should only improve performance without modifying behavior. The following benchmark says that the patched splitext is between 2x (for l1) and 25x (for l2) faster than the original one. The diff also patches test_posixpath.py to check the pitfall described by Tim's comments on the patch 536120 page.

def splitext(p):
    root, ext = '', ''
    for c in p:
        if c == '/':
            root, ext = root + ext + c, ''
        elif c == '.':
            if ext:
                root, ext = root + ext, c
            else:
                ext = c
        elif ext:
            ext = ext + c
        else:
            root = root + c
    return root, ext

def splitext2(p):
    i = p.rfind('.')
    if i <= p.rfind('/'):
        return p, ''
    else:
        return p[:i], p[i:]

l1 = ('t', '.t', 'a.b/', 'a.b', '/a.b', 'a.b/.c', 'a.b/c.d')
l2 = (
    'usr/tmp.doc/list/home/sebastien/foo/bar/hghgt/yttyutyuyuttyuyut.tyyttyt',
    'usr/tmp.doc/list/home/sebastien/foo/bar/hghgt/yttyutyuyuttyuyut.',
    'usr/tmp.doc/list/home/sebastien/foo/bar/hghgt/.tyyttyt',
    'usr/tmp.doc/list/home/sebastien/foo/bar/hghgt/yttyutyuyuttyuyut',
    'reeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeyttyutyuyuttyuyut.tyyttyt',
    '/iuouiiuuoiiuiikhjzekezhjzekejkejkzejkhejkhzejzehjkhjezhjkehzkhjezh.tyyttyt',
)

for i in l1 + l2:
    assert splitext2(i) == splitext(i)

import time

def test(f, args):
    t = time.clock()
    for p in args:
        for i in range(1000):
            f(p)
    return time.clock() - t

def f(p): pass

a = test(splitext, l1)
b = test(splitext2, l1)
c = test(f, l1)
print a, b, c, (a - c) / (b - c)

a = test(splitext, l2)
b = test(splitext2, l2)
c = test(f, l2)
print a, b, c, (a - c) / (b - c)

----------------------------------------------------------------------

>Comment By: Sebastien Keim (s_keim)
Date: 2002-04-02 09:15

Message:
Logged In: YES
user_id=498191

I have taken a look at macpath, dospath and ntpath. I have found quite a lot of code duplication. What would be your opinion if I tried to do a little refactoring on this?

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2002-03-29 19:56

Message:
Logged In: YES
user_id=31435

I like it fine so far as it goes, but I'd like it a lot more if it also patched the splitext and test implementations for other platforms. It's not good that, e.g., posixpath.py and ntpath.py get more and more out of synch over time, and that their test suites also diverge.
----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2002-03-29 10:49

Message:
Logged In: YES
user_id=21627

The patch looks good to me.

----------------------------------------------------------------------

You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=536661&group_id=5470

From noreply@sourceforge.net Tue Apr 2 07:28:31 2002
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Mon, 01 Apr 2002 23:28:31 -0800
Subject: [Patches] [ python-Patches-536661 ] splitext performances improvement
Message-ID:

Patches item #536661, was opened at 2002-03-29 09:06
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=536661&group_id=5470

Category: Library (Lib)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Sebastien Keim (s_keim)
Assigned to: Nobody/Anonymous (nobody)
Summary: splitext performances improvement

Initial Comment:
After more thought, I must admit that the behavior change in splitext that I proposed with patch 536120 is not acceptable. So I would instead propose this one, which should only improve performance without modifying behavior. The following benchmark says that the patched splitext is between 2x (for l1) and 25x (for l2) faster than the original one. The diff also patches test_posixpath.py to check the pitfall described by Tim's comments on the patch 536120 page.

def splitext(p):
    root, ext = '', ''
    for c in p:
        if c == '/':
            root, ext = root + ext + c, ''
        elif c == '.':
            if ext:
                root, ext = root + ext, c
            else:
                ext = c
        elif ext:
            ext = ext + c
        else:
            root = root + c
    return root, ext

def splitext2(p):
    i = p.rfind('.')
    if i <= p.rfind('/'):
        return p, ''
    else:
        return p[:i], p[i:]

l1 = ('t', '.t', 'a.b/', 'a.b', '/a.b', 'a.b/.c', 'a.b/c.d')
l2 = (
    'usr/tmp.doc/list/home/sebastien/foo/bar/hghgt/yttyutyuyuttyuyut.tyyttyt',
    'usr/tmp.doc/list/home/sebastien/foo/bar/hghgt/yttyutyuyuttyuyut.',
    'usr/tmp.doc/list/home/sebastien/foo/bar/hghgt/.tyyttyt',
    'usr/tmp.doc/list/home/sebastien/foo/bar/hghgt/yttyutyuyuttyuyut',
    'reeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeyttyutyuyuttyuyut.tyyttyt',
    '/iuouiiuuoiiuiikhjzekezhjzekejkejkzejkhejkhzejzehjkhjezhjkehzkhjezh.tyyttyt',
)

for i in l1 + l2:
    assert splitext2(i) == splitext(i)

import time

def test(f, args):
    t = time.clock()
    for p in args:
        for i in range(1000):
            f(p)
    return time.clock() - t

def f(p): pass

a = test(splitext, l1)
b = test(splitext2, l1)
c = test(f, l1)
print a, b, c, (a - c) / (b - c)

a = test(splitext, l2)
b = test(splitext2, l2)
c = test(f, l2)
print a, b, c, (a - c) / (b - c)

----------------------------------------------------------------------

>Comment By: Sebastien Keim (s_keim)
Date: 2002-04-02 09:28

Message:
Logged In: YES
user_id=498191

I have taken a look at macpath, dospath and ntpath. I have found quite a lot of code duplication. What would be your opinion if I tried to do a little refactoring on this?

----------------------------------------------------------------------

Comment By: Sebastien Keim (s_keim)
Date: 2002-04-02 09:15

Message:
Logged In: YES
user_id=498191

I have taken a look at macpath, dospath and ntpath. I have found quite a lot of code duplication. What would be your opinion if I tried to do a little refactoring on this?
----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2002-03-29 19:56

Message:
Logged In: YES
user_id=31435

I like it fine so far as it goes, but I'd like it a lot more if it also patched the splitext and test implementations for other platforms. It's not good that, e.g., posixpath.py and ntpath.py get more and more out of synch over time, and that their test suites also diverge.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2002-03-29 10:49

Message:
Logged In: YES
user_id=21627

The patch looks good to me.

----------------------------------------------------------------------

You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=536661&group_id=5470

From noreply@sourceforge.net Tue Apr 2 09:24:27 2002
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Tue, 02 Apr 2002 01:24:27 -0800
Subject: [Patches] [ python-Patches-536661 ] splitext performances improvement
Message-ID:

Patches item #536661, was opened at 2002-03-29 09:06
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=536661&group_id=5470

Category: Library (Lib)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Sebastien Keim (s_keim)
Assigned to: Nobody/Anonymous (nobody)
Summary: splitext performances improvement

Initial Comment:
After more thought, I must admit that the behavior change in splitext that I proposed with patch 536120 is not acceptable. So I would instead propose this one, which should only improve performance without modifying behavior. The following benchmark says that the patched splitext is between 2x (for l1) and 25x (for l2) faster than the original one. The diff also patches test_posixpath.py to check the pitfall described by Tim's comments on the patch 536120 page.

def splitext(p):
    root, ext = '', ''
    for c in p:
        if c == '/':
            root, ext = root + ext + c, ''
        elif c == '.':
            if ext:
                root, ext = root + ext, c
            else:
                ext = c
        elif ext:
            ext = ext + c
        else:
            root = root + c
    return root, ext

def splitext2(p):
    i = p.rfind('.')
    if i <= p.rfind('/'):
        return p, ''
    else:
        return p[:i], p[i:]

l1 = ('t', '.t', 'a.b/', 'a.b', '/a.b', 'a.b/.c', 'a.b/c.d')
l2 = (
    'usr/tmp.doc/list/home/sebastien/foo/bar/hghgt/yttyutyuyuttyuyut.tyyttyt',
    'usr/tmp.doc/list/home/sebastien/foo/bar/hghgt/yttyutyuyuttyuyut.',
    'usr/tmp.doc/list/home/sebastien/foo/bar/hghgt/.tyyttyt',
    'usr/tmp.doc/list/home/sebastien/foo/bar/hghgt/yttyutyuyuttyuyut',
    'reeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeyttyutyuyuttyuyut.tyyttyt',
    '/iuouiiuuoiiuiikhjzekezhjzekejkejkzejkhejkhzejzehjkhjezhjkehzkhjezh.tyyttyt',
)

for i in l1 + l2:
    assert splitext2(i) == splitext(i)

import time

def test(f, args):
    t = time.clock()
    for p in args:
        for i in range(1000):
            f(p)
    return time.clock() - t

def f(p): pass

a = test(splitext, l1)
b = test(splitext2, l1)
c = test(f, l1)
print a, b, c, (a - c) / (b - c)

a = test(splitext, l2)
b = test(splitext2, l2)
c = test(f, l2)
print a, b, c, (a - c) / (b - c)

----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2002-04-02 11:24

Message:
Logged In: YES
user_id=21627

Sharing code is a good thing. However, how exactly this is done would be critical, since os is such a central module. If you start now and don't get agreement immediately, it may well be that you cannot complete it until Python 2.3.

----------------------------------------------------------------------

Comment By: Sebastien Keim (s_keim)
Date: 2002-04-02 09:28

Message:
Logged In: YES
user_id=498191

I have taken a look at macpath, dospath and ntpath. I have found quite a lot of code duplication. What would be your opinion if I tried to do a little refactoring on this?
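The duplication Sebastien mentions could plausibly be factored the way the stdlib later did with genericpath.py: one generic rfind()-based helper parameterized by the platform's separator characters. The helper and wrapper names below are hypothetical, and the semantics deliberately match splitext2 above (a leading dot counts as an extension, as in 2002):

```python
def _splitext(p, seps, extsep='.'):
    # Generic version of the rfind()-based splitext: the extension
    # starts at the last extsep, provided it comes after every separator.
    sepindex = max(p.rfind(s) for s in seps)
    dotindex = p.rfind(extsep)
    if dotindex <= sepindex:
        return p, ''
    return p[:dotindex], p[dotindex:]

def posix_splitext(p):      # posixpath would pass just '/'
    return _splitext(p, '/')

def nt_splitext(p):         # ntpath would fold in '\\' and ':'
    return _splitext(p, '/\\:')

# Same answers as the benchmark's splitext2:
assert posix_splitext('a.b/c.d') == ('a.b/c', '.d')
assert posix_splitext('a.b/') == ('a.b/', '')
assert nt_splitext('C:\\x.y') == ('C:\\x', '.y')
```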
----------------------------------------------------------------------

Comment By: Sebastien Keim (s_keim)
Date: 2002-04-02 09:15

Message:
Logged In: YES
user_id=498191

I have taken a look at macpath, dospath and ntpath. I have found quite a lot of code duplication. What would be your opinion if I tried to do a little refactoring on this?

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2002-03-29 19:56

Message:
Logged In: YES
user_id=31435

I like it fine so far as it goes, but I'd like it a lot more if it also patched the splitext and test implementations for other platforms. It's not good that, e.g., posixpath.py and ntpath.py get more and more out of synch over time, and that their test suites also diverge.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2002-03-29 10:49

Message:
Logged In: YES
user_id=21627

The patch looks good to me.

----------------------------------------------------------------------

You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=536661&group_id=5470

From noreply@sourceforge.net Tue Apr 2 11:21:29 2002
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Tue, 02 Apr 2002 03:21:29 -0800
Subject: [Patches] [ python-Patches-511219 ] suppress type restrictions on locals()
Message-ID:

Patches item #511219, was opened at 2002-01-31 15:55
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=511219&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Cesar Douady (douady)
Assigned to: Nobody/Anonymous (nobody)
Summary: suppress type restrictions on locals()

Initial Comment:
This patch suppresses the restriction that global and local dictionaries do not access overloaded __getitem__ and __setitem__ if passed an object derived from class dict. An exception is made for the builtin insertion and reference in the global dict, to make sure this object exists and to suppress the need for the derived class to take care of this implementation-dependent detail.

The behavior of eval and exec has been updated for code objects which have the CO_NEWLOCALS flag set: if explicitly passed a local dict, a new local dict is not generated. This allows one to pass an explicit local dict to the code object of a function (which otherwise cannot be achieved). If this cannot be done for backward compatibility reasons, then an alternative would consist in using the "new" module to create a code object from a function with CO_NEWLOCALS reset, but it seems logical to me to use the information explicitly provided.

Free and cell variables are not managed in this version. If the patch is accepted, I am willing to finish the job and implement free and cell variables, but this requires a serious rework of the Cell object: free variables should be accessed using the methods of the dict in which they reside, and today this dict is not accessible from the Cell object.

Robustness: Currently, the plain test suite passes (with a modification of test_descrtut which precisely verifies that the suppressed restriction is enforced). I have introduced a new test (test_subdict.py) which verifies the new behavior. For performance, the plain case (when the local dict is a plain dict) is optimized so that differences in performance are not measurable (within 1%) when run on the test suite (i.e. I timed make test).
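As a point of comparison, today's Python 3 routes exec()/eval() name lookups through overridden mapping methods whenever the namespace is not a plain dict, which is essentially the hook this patch proposed. A dict subclass with __missing__ can observe every failed lookup (a sketch of the idea, not of the 2002 patch itself):

```python
class Tracing(dict):
    # Every name the executed code fails to find lands here.
    def __missing__(self, key):
        return '<%s?>' % key

ns = Tracing()
exec("result = some_undefined_name", ns)
assert ns['result'] == '<some_undefined_name?>'
```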
----------------------------------------------------------------------

>Comment By: Cesar Douady (douady)
Date: 2002-04-02 13:21

Message:
Logged In: YES
user_id=428521

I successfully applied the patch as is to revision 2.2.1c2 with the following output (and then the same procedure as mentioned for patching revision 2.2):

patching file Include/dictobject.h
patching file Include/frameobject.h
patching file Include/object.h
patching file Lib/test/test_descrtut.py
patching file Lib/test/test_subdict.py
patching file Modules/cPickle.c
patching file Objects/classobject.c
patching file Objects/frameobject.c
patching file Python/ceval.c
Hunk #2 succeeded at 1534 (offset 3 lines).
Hunk #4 succeeded at 1613 (offset 3 lines).
Hunk #6 succeeded at 1655 (offset 3 lines).
Hunk #8 succeeded at 1860 (offset 3 lines).
Hunk #10 succeeded at 1889 (offset 3 lines).
Hunk #12 succeeded at 2635 (offset 3 lines).
Hunk #14 succeeded at 2893 (offset 3 lines).
Hunk #16 succeeded at 3038 (offset 3 lines).
Hunk #18 succeeded at 3657 (offset 3 lines).
Hunk #20 succeeded at 3722 (offset 3 lines).
patching file Python/compile.c
Hunk #1 succeeded at 2916 (offset 12 lines).
patching file Python/import.c
Hunk #1 succeeded at 1668 (offset -4 lines).
Hunk #3 succeeded at 1716 (offset -4 lines).
patching file Python/sysmodule.c
Hunk #1 succeeded at 238 (offset -4 lines).

----------------------------------------------------------------------

Comment By: Michael Hudson (mwh)
Date: 2002-03-30 12:27

Message:
Logged In: YES
user_id=6656

And there's precisely no way it's going into 2.2.x.

----------------------------------------------------------------------

Comment By: Cesar Douady (douady)
Date: 2002-03-30 01:08

Message:
Logged In: YES
user_id=428521

To install this patch from Python revision 2.2, follow these steps:
- get the python.diff file from this page
- cd Python-2.2
- run "patch -p1 < python.diff"

----------------------------------------------------------------------

Patches item #511219, was opened at 2002-01-31 14:55
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=511219&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Cesar Douady (douady)
Assigned to: Nobody/Anonymous (nobody)
Summary: suppress type restrictions on locals()

Initial Comment:
This patch suppresses the restriction that global and local dictionaries do not access overloaded __getitem__ and __setitem__ if passed an object derived from class dict. An exception is made for the builtin insertion and reference in the global dict, to make sure this object exists and to suppress the need for the derived class to take care of this implementation-dependent detail.

The behavior of eval and exec has been updated for code objects which have the CO_NEWLOCALS flag set: if explicitly passed a local dict, a new local dict is not generated. This allows one to pass an explicit local dict to the code object of a function (which otherwise cannot be achieved). If this cannot be done for backward compatibility reasons, then an alternative would consist in using the "new" module to create a code object from a function with CO_NEWLOCALS reset, but it seems logical to me to use the information explicitly provided.

Free and cell variables are not managed in this version. If the patch is accepted, I am willing to finish the job and implement free and cell variables, but this requires a serious rework of the Cell object: free variables should be accessed using the methods of the dict in which they reside, and today this dict is not accessible from the Cell object.
Robustness: Currently, the plain test suite passes (with a modification of test_descrtut which precisely verifies that the suppressed restriction is enforced). I have introduced a new test (test_subdict.py) which verifies the new behavior. For performance, the plain case (when the local dict is a plain dict) is optimized so that differences in performance are not measurable (within 1%) when run on the test suite (i.e. I timed make test).

----------------------------------------------------------------------

>Comment By: Michael Hudson (mwh)
Date: 2002-04-02 14:26

Message:
Logged In: YES
user_id=6656

So what? Maybe you misunderstand me. This patch was in the group "Python 2.2.x", which is the group we use for patches that are under consideration for being put into a 2.2.x release of Python (or in other words, a bugfix release of Python 2.2). This patch is not going to go into a bugfix release of Python 2.2 for at least two reasons: (1) it adds what is arguably a new feature and (2) it's big and complicated and so might cause bugs. And now that I've actually looked at the patch, it has even less chance: it would break binary compatibility of extensions. So while I'm not against the patch in general (it looks good, from an eyeballing), it doesn't belong in the 2.2.x group.

----------------------------------------------------------------------

Comment By: Cesar Douady (douady)
Date: 2002-04-02 11:21

Message:
Logged In: YES
user_id=428521

I successfully applied the patch as is to revision 2.2.1c2 with the following output (and then the same procedure as mentioned for patching revision 2.2):

patching file Include/dictobject.h
patching file Include/frameobject.h
patching file Include/object.h
patching file Lib/test/test_descrtut.py
patching file Lib/test/test_subdict.py
patching file Modules/cPickle.c
patching file Objects/classobject.c
patching file Objects/frameobject.c
patching file Python/ceval.c
Hunk #2 succeeded at 1534 (offset 3 lines).
Hunk #4 succeeded at 1613 (offset 3 lines).
Hunk #6 succeeded at 1655 (offset 3 lines).
Hunk #8 succeeded at 1860 (offset 3 lines).
Hunk #10 succeeded at 1889 (offset 3 lines).
Hunk #12 succeeded at 2635 (offset 3 lines).
Hunk #14 succeeded at 2893 (offset 3 lines).
Hunk #16 succeeded at 3038 (offset 3 lines).
Hunk #18 succeeded at 3657 (offset 3 lines).
Hunk #20 succeeded at 3722 (offset 3 lines).
patching file Python/compile.c
Hunk #1 succeeded at 2916 (offset 12 lines).
patching file Python/import.c
Hunk #1 succeeded at 1668 (offset -4 lines).
Hunk #3 succeeded at 1716 (offset -4 lines).
patching file Python/sysmodule.c
Hunk #1 succeeded at 238 (offset -4 lines).

----------------------------------------------------------------------

Comment By: Michael Hudson (mwh)
Date: 2002-03-30 11:27

Message:
Logged In: YES
user_id=6656

And there's precisely no way it's going into 2.2.x.

----------------------------------------------------------------------

Comment By: Cesar Douady (douady)
Date: 2002-03-30 00:08

Message:
Logged In: YES
user_id=428521

To install this patch from Python revision 2.2, follow these steps:
- get the python.diff file from this page
- cd Python-2.2
- run "patch -p1 < python.diff"

----------------------------------------------------------------------

Patches item #511219, was opened at 2002-01-31 15:55
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=511219&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Cesar Douady (douady)
Assigned to: Nobody/Anonymous (nobody)
Summary: suppress type restrictions on locals()

Initial Comment:
This patch suppresses the restriction that global and local dictionaries do not access overloaded __getitem__ and __setitem__ if passed an object derived from class dict. An exception is made for the builtin insertion and reference in the global dict, to make sure this object exists and to suppress the need for the derived class to take care of this implementation-dependent detail.
The behavior of eval and exec has been updated for code objects which have the CO_NEWLOCALS flag set: if explicitly passed a local dict, a new local dict is not generated. This allows one to pass an explicit local dict to the code object of a function (which otherwise cannot be achieved). If this cannot be done because of backward compatibility problems, then an alternative would consist in using the "new" module to create a code object from a function with CO_NEWLOCALS reset, but it seems logical to me to use the information explicitly provided. Free and cell variables are not managed in this version. If the patch is accepted, I am willing to finish the job and implement free and cell variables, but this requires a serious rework of the Cell object: free variables should be accessed using the methods of the dict in which they reside, and today this dict is not accessible from the Cell object. Robustness: Currently, the plain test suite passes (with a modification of test_descrtut which precisely verifies that the suppressed restriction is enforced). I have introduced a new test (test_subdict.py) which verifies the new behavior. For performance, the plain case (when the local dict is a plain dict) is optimized so that differences in performance are not measurable (within 1%) when run on the test suite (i.e. I timed make test).
----------------------------------------------------------------------
>Comment By: Cesar Douady (douady) Date: 2002-04-02 20:08 Message: Logged In: YES user_id=428521
Well, I think I am in sync now.
1/ I took your initial comment as meaning the patch could not be applied to 2.2.x.
2/ I decided to generate a new patch to be applied to 2.2.1c2.
3/ I realized that the patch could be applied as is.
4/ I was lost.
5/ I realized the meaning of the group was the one you just mentioned.
6/ I decided to post the result of my trial anyway so people could confidently apply the patch to the latest release (especially because patch outputs some warnings).
7/ I did not understand that this place could actually be used as a forum (i.e. reply to previous post rather than general info).
Let me apologize for my previous misunderstandings.
About compatibility: I did not find a way to make it backward binary compatible; however, my intent is to make it source compatible for extensions.
----------------------------------------------------------------------
Comment By: Michael Hudson (mwh) Date: 2002-04-02 16:26 Message: Logged In: YES user_id=6656
So what? Maybe you misunderstand me. This patch was in the group "Python 2.2.x", which is the group we use for patches that are under consideration for being put into a 2.2.x release of Python (or in other words, a bugfix release of Python 2.2). This patch is not going to go into a bugfix release of Python 2.2 for at least two reasons: (1) it adds what is arguably a new feature, and (2) it's big and complicated and so might cause bugs. And now that I've actually looked at the patch, it has even less chance: it would break binary compatibility of extensions. So while I'm not against the patch in general (it looks good, from an eyeballing), it doesn't belong in the 2.2.x group.
----------------------------------------------------------------------
Comment By: Cesar Douady (douady) Date: 2002-04-02 13:21 Message: Logged In: YES user_id=428521
I successfully applied the patch as is to revision 2.2.1c2 with the following output (and then the same procedure as mentioned for patching revision 2.2):
patching file Include/dictobject.h
patching file Include/frameobject.h
patching file Include/object.h
patching file Lib/test/test_descrtut.py
patching file Lib/test/test_subdict.py
patching file Modules/cPickle.c
patching file Objects/classobject.c
patching file Objects/frameobject.c
patching file Python/ceval.c
Hunk #2 succeeded at 1534 (offset 3 lines).
Hunk #4 succeeded at 1613 (offset 3 lines).
Hunk #6 succeeded at 1655 (offset 3 lines).
Hunk #8 succeeded at 1860 (offset 3 lines).
Hunk #10 succeeded at 1889 (offset 3 lines).
Hunk #12 succeeded at 2635 (offset 3 lines).
Hunk #14 succeeded at 2893 (offset 3 lines).
Hunk #16 succeeded at 3038 (offset 3 lines).
Hunk #18 succeeded at 3657 (offset 3 lines).
Hunk #20 succeeded at 3722 (offset 3 lines).
patching file Python/compile.c
Hunk #1 succeeded at 2916 (offset 12 lines).
patching file Python/import.c
Hunk #1 succeeded at 1668 (offset -4 lines).
Hunk #3 succeeded at 1716 (offset -4 lines).
patching file Python/sysmodule.c
Hunk #1 succeeded at 238 (offset -4 lines).
----------------------------------------------------------------------
Comment By: Michael Hudson (mwh) Date: 2002-03-30 12:27 Message: Logged In: YES user_id=6656
And there's precisely no way it's going into 2.2.x.
----------------------------------------------------------------------
Comment By: Cesar Douady (douady) Date: 2002-03-30 01:08 Message: Logged In: YES user_id=428521
To install this patch against Python revision 2.2, follow these steps:
- get the python.diff file from this page
- cd Python-2.2
- run "patch -p1 < python.diff"
----------------------------------------------------------------------
Patches item #537536, was opened at 2002-03-31 18:12 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=537536&group_id=5470 Category: Core (C code) Group: Python 2.2.x >Status: Closed Resolution: Accepted Priority: 5 Submitted By: Phillip J. Eby (pje) Assigned to: Guido van Rossum (gvanrossum) Summary: bug 535444 super() broken w/classmethods Initial Comment: This patch fixes bug #535444. It is against the current CVS version of Python, and addresses the problem by adding a 'starttype' variable to 'super_getattro', which works the same as 'starttype' in the pure-Python version of super in the descriptor tutorial. This variable is then passed to the descriptor __get__ function, a la 'descriptor.__get__(self.__obj__, starttype)'.
This patch does not correct the pure-Python version of 'super' in the descriptor tutorial; I don't know where that file is or how to submit a patch for it. This patch also does not include a regression test for the bug. I do not know what would be considered the appropriate test script to place this in. Thanks. ---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-02 14:04 Message: Logged In: YES user_id=6380 Committed to the trunk. I'll leave it to Michael to commit it to 2.2.1. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-01 23:11 Message: Logged In: YES user_id=6380 Accepted, also as bugfix for 2.2.1 (assuming it works there, not tested). I can check this in in the morning. Thanks all! ---------------------------------------------------------------------- Comment By: Phillip J. Eby (pje) Date: 2002-04-01 14:41 Message: Logged In: YES user_id=56214 Here's the regression test. It asserts 6 things, 5 of which will fail without the typeobject.c patch to super() in place. Thanks. ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-04-01 06:55 Message: Logged In: YES user_id=21627 Please put tests for this stuff into test_descr.py. ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-04-01 04:38 Message: Logged In: YES user_id=6656 Guido gets the fix too. 
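For reference, the behavior this patch fixes can be illustrated with a small example (shown in modern Python 3 spelling; at the time the same call would have been written as super(B, cls).create()). The point is that super's attribute lookup must hand the starting type to the descriptor's __get__, so an inherited classmethod still sees the subclass:

```python
class A:
    @classmethod
    def create(cls):
        # cls should be the class the call started from, not necessarily A
        return cls.__name__

class B(A):
    @classmethod
    def create(cls):
        # Before the fix, super's lookup did not pass the starting type
        # to the descriptor, so the classmethod was bound to the wrong class.
        return "B+" + super().create()

print(B.create())  # "B+B": A.create runs with cls == B
```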
----------------------------------------------------------------------
You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=537536&group_id=5470
From noreply@sourceforge.net Tue Apr 2 19:24:39 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Tue, 02 Apr 2002 11:24:39 -0800 Subject: [Patches] [ python-Patches-538395 ] ae* modules: handle type inheritance Message-ID: Patches item #538395, was opened at 2002-04-02 10:24 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=538395&group_id=5470 Category: Macintosh Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Donovan Preston (dsposx) Assigned to: Jack Jansen (jackjansen) Summary: ae* modules: handle type inheritance Initial Comment: The gensuitemodule script creates Python classes out of AppleScript types. It keeps track of properties in _propdict and elements in _elemdict. However, gensuitemodule does not accurately replicate the AppleScript inheritance hierarchy, and __getattr__ only looks in self._propdict and self._elemdict, therefore not finding elements and properties defined in superclasses. Attached is a patch which: 1) Correctly identifies an AppleScript type's superclasses, and defines the Python classes with these superclasses. Since not all names may be defined by the time a new class is defined, this is accomplished by setting a new class' __bases__ after all names are defined. 2) Changes __getattr__ to recurse superclasses while looking through _propdict and _elemdict. It also contains small usability enhancements which will automatically look for a .want or .which property when you are creating specifiers.
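The recursive-__getattr__ part of the change can be sketched as follows (illustrative Python with hypothetical class names, not the actual aetools code):

```python
class AEBase:
    """Minimal stand-in for a generated suite class."""
    def __getattr__(self, name):
        # Walk the class and its superclasses, consulting each class's
        # own _propdict -- the superclass search the patch adds.
        for klass in type(self).__mro__:
            propdict = klass.__dict__.get('_propdict', {})
            if name in propdict:
                return propdict[name]
        raise AttributeError(name)

class Document(AEBase):
    _propdict = {'name': 'name property'}

class StyledDocument(Document):
    _propdict = {'style': 'style property'}

d = StyledDocument()
print(d.style)  # found in the class's own _propdict
print(d.name)   # found in the superclass's _propdict
```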
----------------------------------------------------------------------
You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=538395&group_id=5470
From noreply@sourceforge.net Tue Apr 2 21:47:13 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Tue, 02 Apr 2002 13:47:13 -0800 Subject: [Patches] [ python-Patches-538395 ] ae* modules: handle type inheritance Message-ID: Patches item #538395, was opened at 2002-04-02 21:24 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=538395&group_id=5470 Category: Macintosh Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Donovan Preston (dsposx) Assigned to: Jack Jansen (jackjansen) Summary: ae* modules: handle type inheritance Initial Comment: The gensuitemodule script creates Python classes out of AppleScript types. It keeps track of properties in _propdict and elements in _elemdict. However, gensuitemodule does not accurately replicate the AppleScript inheritance hierarchy, and __getattr__ only looks in self._propdict and self._elemdict, therefore not finding elements and properties defined in superclasses. Attached is a patch which: 1) Correctly identifies an AppleScript type's superclasses, and defines the Python classes with these superclasses. Since not all names may be defined by the time a new class is defined, this is accomplished by setting a new class' __bases__ after all names are defined. 2) Changes __getattr__ to recurse superclasses while looking through _propdict and _elemdict. It also contains small usability enhancements which will automatically look for a .want or .which property when you are creating specifiers.
----------------------------------------------------------------------
>Comment By: Jack Jansen (jackjansen) Date: 2002-04-02 23:47 Message: Logged In: YES user_id=45365
Donovan, I love the functionality of your patch, but I would humbly request you make a couple of changes.
Alternatively I'll make them, but that will delay the patch (as I have to find the time to do them).
First: please make it a context diff (cvs diff -c), as straight diffs are too error-prone for moving targets. There are also mods I can't judge this way (such as why you moved the 'utxt' support in aepack.py to a different place. Or is this a whitespace mismatch?)
Second: you've diffed against a different version than the one you patched. See gensuitemodule, for instance: it appears as if you've modified 1.22 but diffed against 1.21. Maybe you applied my 1.21->1.22 by hand without doing a cvs update? I think a cvs update (plus some manual work;-) should solve this.
Third: the passing of modules by name (to the decoding routines) seems error-prone and not too elegant. Can't you pass the modules themselves instead of their names? It would also save extra imports in the decoders.
Fourth: assigning to __bases__ seems like rather a big hack. Can't we generate the classes with a helper class, similarly to the event helper class in the __init__.py modules: FooApp/Foo_Suite.py would contain the class foo sketched above, and FooApp.__init__.py would contain
import othersuite.superfoo
import Foo_Suite.foo
class foo(Foo_Suite.foo, othersuite.superfoo): pass
Fifth, and least important: you're manually iterating over the base classes for lookup. Couldn't we statically combine the _propdict and _elemdict's of the base classes during class declaration, so that at lookup time we'd need only a single dictionary lookup? The "class foo" body in __init__.py would then become something like
_propdict = aetools.flatten_dicts( foosuite.foo._propdict, othersuite.superfoo._propdict)
and similar for elemdict.
----------------------------------------------------------------------
You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=538395&group_id=5470
From noreply@sourceforge.net Wed Apr 3 06:50:29 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Tue, 02 Apr 2002 22:50:29 -0800 Subject: [Patches] [ python-Patches-538395 ] ae* modules: handle type inheritance Message-ID: Patches item #538395, was opened at 2002-04-02 10:24 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=538395&group_id=5470 Category: Macintosh Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Donovan Preston (dsposx) Assigned to: Jack Jansen (jackjansen) Summary: ae* modules: handle type inheritance Initial Comment: The gensuitemodule script creates Python classes out of AppleScript types. It keeps track of properties in _propdict and elements in _elemdict. However, gensuitemodule does not accurately replicate the AppleScript inheritance hierarchy, and __getattr__ only looks in self._propdict and self._elemdict, therefore not finding elements and properties defined in superclasses. Attached is a patch which: 1) Correctly identifies an AppleScript type's superclasses, and defines the Python classes with these superclasses. Since not all names may be defined by the time a new class is defined, this is accomplished by setting a new class' __bases__ after all names are defined. 2) Changes __getattr__ to recurse superclasses while looking through _propdict and _elemdict. It also contains small usability enhancements which will automatically look for a .want or .which property when you are creating specifiers.
----------------------------------------------------------------------
>Comment By: Donovan Preston (dsposx) Date: 2002-04-02 21:50 Message: Logged In: YES user_id=111050
Jack: Thanks a lot for your comments! You're right on target on most of them.
Not sure why I didn't make a context diff this time -- last time it was screwed up, I thought it was because of that, but it was really just the tabs-vs-spaces issue. cvs is still very new and ugly to me. I did indeed manually apply your patches to my tree, because I was afraid of what an update would do to the production code that my boss would kill me for breaking... I'll do an update on another machine and reapply the patches to that checkout. Is there any way I can get a log of what the update has done to my files, so I can check them manually?
Hmm. I hadn't thought about passing the module itself; how would I get a reference to a package from inside of that package's __init__.py? From aetools, I can get away with saying __import__(modulename), but inside of __init__.py, what do I use to get a reference to the module that __init__.py is initializing?
Finally, after thinking about it a bit, the fourth and fifth points may be better solved by a construct like this (AppleScript type bar inherits from foo):
bar._elemdict = copy(foo._elemdict)
bar._elemdict.update({ dict of new keyword/class mappings })
This has the advantage of flattening out the inheritance tree again, since all elements and properties each class needs to know about are in its own copy of _elemdict and _propdict, and therefore no Python inheritance relationship needs to be made. I had wanted to build the inheritance hierarchy properly, and dynamically look through bases, because it was "cool", but in retrospect, speed is more important :-)
Donovan
----------------------------------------------------------------------
Comment By: Jack Jansen (jackjansen) Date: 2002-04-02 12:47 Message: Logged In: YES user_id=45365
Donovan, I love the functionality of your patch, but I would humbly request you make a couple of changes. Alternatively I'll make them, but that will delay the patch (as I have to find the time to do them).
First: please make it a context diff (cvs diff -c), as straight diffs are too error-prone for moving targets. There are also mods I can't judge this way (such as why you moved the 'utxt' support in aepack.py to a different place. Or is this a whitespace mismatch?)
Second: you've diffed against a different version than the one you patched. See gensuitemodule, for instance: it appears as if you've modified 1.22 but diffed against 1.21. Maybe you applied my 1.21->1.22 by hand without doing a cvs update? I think a cvs update (plus some manual work;-) should solve this.
Third: the passing of modules by name (to the decoding routines) seems error-prone and not too elegant. Can't you pass the modules themselves instead of their names? It would also save extra imports in the decoders.
Fourth: assigning to __bases__ seems like rather a big hack. Can't we generate the classes with a helper class, similarly to the event helper class in the __init__.py modules: FooApp/Foo_Suite.py would contain the class foo sketched above, and FooApp.__init__.py would contain
import othersuite.superfoo
import Foo_Suite.foo
class foo(Foo_Suite.foo, othersuite.superfoo): pass
Fifth, and least important: you're manually iterating over the base classes for lookup. Couldn't we statically combine the _propdict and _elemdict's of the base classes during class declaration, so that at lookup time we'd need only a single dictionary lookup? The "class foo" body in __init__.py would then become something like
_propdict = aetools.flatten_dicts( foosuite.foo._propdict, othersuite.superfoo._propdict)
and similar for elemdict.
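Donovan's copy-and-update construct from the comment above can be sketched with hypothetical suite classes (foo, bar, and the dict values here are stand-ins, not real generated code):

```python
import copy

class foo:
    _elemdict = {'paragraph': 'paragraph-class'}

class bar(foo):
    # Flatten the AppleScript inheritance tree at generation time: bar
    # carries its own complete _elemdict, so attribute lookup never has
    # to search base classes.
    _elemdict = copy.copy(foo._elemdict)
    _elemdict.update({'word': 'word-class'})

print(bar._elemdict)  # contains both the inherited and the new entries
print(foo._elemdict)  # the parent's dict is left untouched
```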
----------------------------------------------------------------------
You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=538395&group_id=5470
From noreply@sourceforge.net Wed Apr 3 07:03:21 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Tue, 02 Apr 2002 23:03:21 -0800 Subject: [Patches] [ python-Patches-536661 ] splitext performances improvement Message-ID: Patches item #536661, was opened at 2002-03-29 09:06 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=536661&group_id=5470 Category: Library (Lib) Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Sebastien Keim (s_keim) Assigned to: Nobody/Anonymous (nobody) Summary: splitext performances improvement Initial Comment: After more thought, I must admit that the behavior change in splitext I proposed with patch 536120 is not acceptable. So I would instead propose this one, which should only improve performance without modifying behavior. The benchmark below says that the patched splitext is between 2x (for l1) and 25x (for l2) faster than the original one. The diff also patches test_posixpath.py to check the pitfall described by Tim's comments on the patch 536120 page.
def splitext(p):
    root, ext = '', ''
    for c in p:
        if c == '/':
            root, ext = root + ext + c, ''
        elif c == '.':
            if ext:
                root, ext = root + ext, c
            else:
                ext = c
        elif ext:
            ext = ext + c
        else:
            root = root + c
    return root, ext

def splitext2(p):
    i = p.rfind('.')
    if i <= p.rfind('/'):
        return p, ''
    else:
        return p[:i], p[i:]

l1 = ('t', '.t', 'a.b/', 'a.b', '/a.b', 'a.b/.c', 'a.b/c.d')
l2 = (
    'usr/tmp.doc/list/home/sebastien/foo/bar/hghgt/yttyutyuyuttyuyut.tyyttyt',
    'usr/tmp.doc/list/home/sebastien/foo/bar/hghgt/yttyutyuyuttyuyut.',
    'usr/tmp.doc/list/home/sebastien/foo/bar/hghgt/.tyyttyt',
    'usr/tmp.doc/list/home/sebastien/foo/bar/hghgt/yttyutyuyuttyuyut',
    'reeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeyttyutyuyuttyuyut.tyyttyt',
    '/iuouiiuuoiiuiikhjzekezhjzekejkejkzejkhejkhzejzehjkhjezhjkehzkhjezh.tyyttyt',
)

for i in l1 + l2:
    assert splitext2(i) == splitext(i)

import time
def test(f, args):
    t = time.clock()
    for p in args:
        for i in range(1000):
            f(p)
    return time.clock() - t

def f(p): pass

a = test(splitext, l1)
b = test(splitext2, l1)
c = test(f, l1)
print a, b, c, (a-c)/(b-c)
a = test(splitext, l2)
b = test(splitext2, l2)
c = test(f, l2)
print a, b, c, (a-c)/(b-c)
----------------------------------------------------------------------
>Comment By: Sebastien Keim (s_keim) Date: 2002-04-03 09:03 Message: Logged In: YES user_id=498191
xxxpath.dif contains the splitext patch for posixpath, ntpath, dospath and macpath, and the corresponding test files (I have added a test file for macpath). I thought it better not to attempt to modify riscospath.py, since I don't know this platform. Anyway, it already uses an rfind strategy.
----------------------------------------------------------------------
Comment By: Martin v. Löwis (loewis) Date: 2002-04-02 11:24 Message: Logged In: YES user_id=21627
Sharing code is a good thing. However, it would be critical how exactly this is done, since os is such a central module.
If you start now, and don't get agreement immediately, it may well be that you cannot complete it until Python 2.3.
----------------------------------------------------------------------
Comment By: Sebastien Keim (s_keim) Date: 2002-04-02 09:28 Message: Logged In: YES user_id=498191
I have taken a look at macpath, dospath and ntpath. I have found quite a lot of code duplication. What would be your opinion if I tried to do a little refactoring on this?
----------------------------------------------------------------------
Comment By: Sebastien Keim (s_keim) Date: 2002-04-02 09:15 Message: Logged In: YES user_id=498191
I have taken a look at macpath, dospath and ntpath. I have found quite a lot of code duplication. What would be your opinion if I tried to do a little refactoring on this?
----------------------------------------------------------------------
Comment By: Tim Peters (tim_one) Date: 2002-03-29 19:56 Message: Logged In: YES user_id=31435
I like it fine so far as it goes, but I'd like it a lot more if it also patched the splitext and test implementations for other platforms. It's not good that, e.g., posixpath.py and ntpath.py get more and more out of synch over time, and that their test suites also diverge.
----------------------------------------------------------------------
Comment By: Martin v. Löwis (loewis) Date: 2002-03-29 10:49 Message: Logged In: YES user_id=21627
The patch looks good to me.
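The code-sharing refactoring under discussion could start from a separator-parameterized version of the rfind implementation. This is only a sketch with a hypothetical helper name, and it keeps the semantics of the era, where a leading dot still counts as starting an extension (the special-casing of dotfiles came to the path modules later):

```python
def generic_splitext(p, sep, extsep='.'):
    # The extension starts at the last extsep that comes after the last
    # occurrence of the platform's directory separator; each platform
    # module would call this with its own sep.
    i = p.rfind(extsep)
    if i <= p.rfind(sep):
        return p, ''
    return p[:i], p[i:]

print(generic_splitext('/a.b/c.d', '/'))      # ('/a.b/c', '.d')
print(generic_splitext('a.b/', '/'))          # ('a.b/', '')
print(generic_splitext('disk:doc.txt', ':'))  # ('disk:doc', '.txt')
```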
----------------------------------------------------------------------
You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=536661&group_id=5470
From noreply@sourceforge.net Wed Apr 3 09:29:44 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Wed, 03 Apr 2002 01:29:44 -0800 Subject: [Patches] [ python-Patches-538395 ] ae* modules: handle type inheritance Message-ID: Patches item #538395, was opened at 2002-04-02 21:24 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=538395&group_id=5470 Category: Macintosh Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Donovan Preston (dsposx) Assigned to: Jack Jansen (jackjansen) Summary: ae* modules: handle type inheritance Initial Comment: The gensuitemodule script creates Python classes out of AppleScript types. It keeps track of properties in _propdict and elements in _elemdict. However, gensuitemodule does not accurately replicate the AppleScript inheritance hierarchy, and __getattr__ only looks in self._propdict and self._elemdict, therefore not finding elements and properties defined in superclasses. Attached is a patch which: 1) Correctly identifies an AppleScript type's superclasses, and defines the Python classes with these superclasses. Since not all names may be defined by the time a new class is defined, this is accomplished by setting a new class' __bases__ after all names are defined. 2) Changes __getattr__ to recurse superclasses while looking through _propdict and _elemdict. It also contains small usability enhancements which will automatically look for a .want or .which property when you are creating specifiers.
----------------------------------------------------------------------
>Comment By: Martin v. Löwis (loewis) Date: 2002-04-03 11:29 Message: Logged In: YES user_id=21627
cvs update will keep a copy of the original file (the one you edited) if it has to merge changes; it will name it .#filename.revision.
So in no case will cvs destroy your changes. Normally, merging works quite well. If it finds a conflict, it will print a 'C' on update, and put a conflict marker in the file. The stuff above the ===== is your code, the one below is the CVS code. If you want to find out what cvs would do, use 'cvs status'. If you don't want cvs to do merging, the following procedure will work:
cvs diff -u > patches
patch -p0 -R < patches
...
----------------------------------------------------------------------
Comment By: Jack Jansen (jackjansen)
...you've modified 1.22 but diffed against 1.21. Maybe you applied my 1.21->1.22 by hand without doing a cvs update? I think a cvs update (plus some manual work;-) should solve this.
Third: the passing of modules by name (to the decoding routines) seems error-prone and not too elegant. Can't you pass the modules themselves instead of their names? It would also save extra imports in the decoders.
Fourth: assigning to __bases__ seems like rather a big hack. Can't we generate the classes with a helper class, similarly to the event helper class in the __init__.py modules: FooApp/Foo_Suite.py would contain the class foo sketched above, and FooApp.__init__.py would contain
import othersuite.superfoo
import Foo_Suite.foo
class foo(Foo_Suite.foo, othersuite.superfoo): pass
Fifth, and least important: you're manually iterating over the base classes for lookup. Couldn't we statically combine the _propdict and _elemdict's of the base classes during class declaration, so that at lookup time we'd need only a single dictionary lookup? The "class foo" body in __init__.py would then become something like
_propdict = aetools.flatten_dicts( foosuite.foo._propdict, othersuite.superfoo._propdict)
and similar for elemdict.
----------------------------------------------------------------------
You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=538395&group_id=5470
From noreply@sourceforge.net Wed Apr 3 20:43:47 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Wed, 03 Apr 2002 12:43:47 -0800 Subject: [Patches] [ python-Patches-528022 ] PEP 285 - Adding a bool type Message-ID: Patches item #528022, was opened at 2002-03-10 00:45 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=528022&group_id=5470 Category: Core (C code) Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Guido van Rossum (gvanrossum) Assigned to: Guido van Rossum (gvanrossum) Summary: PEP 285 - Adding a bool type Initial Comment: Here's a preliminary implementation of the PEP, including unittests checking the promises made in the PEP (test_bool.py) and (some) documentation. With this, 12 tests fail for me (on Linux); I'll look into these later. They appear shallow (mostly doctests dying on True or False where 1 or 0 was expected). Note: the presence of this patch does not mean that the PEP is accepted -- it just means that a sample implementation exists in case someone wants to explore the effects of the PEP on their code.
----------------------------------------------------------------------
>Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-03 15:43 Message: Logged In: YES user_id=6380
I've attached a new patch, booldiff3.txt, that solves the two remaining problems:
- pickle, cPickle and marshal round-trip
- the test suite succeeds (a total of 12 tests had to be fixed, all because of True/False vs. 1/0 in printed output)
I'm ready to check this in, but I'll first update the PEP.
----------------------------------------------------------------------
Comment By: Martin v.
Löwis (loewis) Date: 2002-04-01 06:44 Message: Logged In: YES user_id=21627 This patch does not support pickling of bools (the PEP should probably spell out how they are pickled). marshalling of bools does not round-trip (you get back an int). ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-03-31 21:46 Message: Logged In: YES user_id=6380 Here's an updated diff (booldiff2.txt). It fixes a refcount bug in bool_repr(), and works with current CVS. With this patch set, 10 standard tests fail for shallow reasons having to do with str() or repr() returning False or True instead of 0 or 1. Here are the failed tests: test_descr test_descrtut test_difflib test_doctest test_extcall test_generators test_gettext test_richcmp test_richcompare test_unicode ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=528022&group_id=5470 From noreply@sourceforge.net Wed Apr 3 21:16:42 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Wed, 03 Apr 2002 13:16:42 -0800 Subject: [Patches] [ python-Patches-538395 ] ae* modules: handle type inheritance Message-ID: Patches item #538395, was opened at 2002-04-02 21:24 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=538395&group_id=5470 Category: Macintosh Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Donovan Preston (dsposx) Assigned to: Jack Jansen (jackjansen) Summary: ae* modules: handle type inheritance Initial Comment: The gensuitemodule script creates Python classes out of AppleScript types. It keeps track of properties in _propdict and elements in _elemdict. 
However, gensuitemodule does not accurately replicate the AppleScript inheritance hierarchy, and __getattr__ only looks in self._propdict and self._elemdict, therefore not finding elements and properties defined in superclasses. Attached is a patch which: 1) Correctly identifies an AppleScript type's superclasses, and defines the Python classes with these superclasses. Since not all names may be defined by the time a new class is defined, this is accomplished by setting a new class' __bases__ after all names are defined. 2) Changes __getattr__ to recurse superclasses while looking through _propdict and _elemdict. It also contains small usability enhancements which will automatically look for a .want or .which property when you are creating specifiers.
----------------------------------------------------------------------
>Comment By: Jack Jansen (jackjansen) Date: 2002-04-03 23:16 Message: Logged In: YES user_id=45365
Donovan, two comments on your comments:
- You're absolutely right about the module names. Pickle also uses names, and it's probably the only way to do it.
- You're also absolutely right about how to update the _elemdict and _propdict. Or, as Jean-Luc Picard would say: "Make it so!" :-)
Oh yes, on the production code/merging problem: aside from Martin's comments, here's another tip: make a copy of the subtree that contains the conflict section (why not the whole Mac subtree in your case) and make sure you keep the CVS directories. Start hacking in this copy. Once you're satisfied, do a commit from there. As long as you keep the CVS directory with the files there's little that can go wrong.
----------------------------------------------------------------------
Comment By: Martin v. Löwis (loewis) Date: 2002-04-03 11:29 Message: Logged In: YES user_id=21627
cvs update will keep a copy of the original file (the one you edited) if it has to merge changes; it will name it .#filename.revision. So in no case will cvs destroy your changes. Normally, merging works quite well.
If it finds a conflict, it will print a 'C' on update, and put a conflict marker in the file. The stuff above the ===== is your code, the one below is the CVS code. If you want to find out what cvs would do, use 'cvs status'. If you don't want cvs to do merging, the following procedure will work: cvs diff -u >patches; patch -p0 -R <patches; cvs update; patch -p0 <patches. 1.22 by hand without doing a cvs update? I think a cvs update (plus some manual work;-) should solve this. Third: the passing of modules by name (to the decoding routines) seems error-prone and not too elegant. Can't you pass the modules themselves instead of their names? It would also save extra imports in the decoders. Fourth: assigning to __bases__ seems like rather a big hack. Can't we generate the classes with a helper class, similarly to the event helper class in the __init__.py modules: FooApp/Foo_Suite.py would contain the class foo sketched above, and FooApp.__init__.py would contain import othersuite.superfoo import Foo_Suite.foo class foo(Foo_Suite.foo, othersuite.superfoo): pass Fifth, and least important: you're manually iterating over the base classes for lookup. Couldn't we statically combine the _propdict and _elemdict's of the base classes during class declaration, so that at lookup time we'd need only a single dictionary lookup? The "class foo" body in __init__.py would then become something like _propdict = aetools.flatten_dicts( foosuite.foo._propdict, othersuite.superfoo._propdict) and similar for elemdict.
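Martin's fifth point lends itself to a short sketch. `flatten_dicts` is the hypothetical helper name from his own example; the precedence rule (entries from the derived class shadow those of its bases) is an assumption about the intended semantics:

```python
def flatten_dicts(*dicts):
    """Combine per-class _propdict/_elemdict tables into one dict at
    class-definition time, so attribute lookup needs a single dictionary
    probe instead of walking the base classes on every __getattr__."""
    merged = {}
    for d in reversed(dicts):   # apply base-class tables first ...
        merged.update(d)        # ... so derived-class entries override them
    return merged

# e.g. the "class foo" body would compute its table once:
base_propdict = {"name": "base.name", "color": "base.color"}
foo_propdict = flatten_dicts({"name": "foo.name"}, base_propdict)
```

The trade-off is the classic one: lookup becomes a single dict probe, at the cost of recomputing the merged tables if a base class's table ever changes after class definition.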
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=538395&group_id=5470 From noreply@sourceforge.net Wed Apr 3 23:04:29 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Wed, 03 Apr 2002 15:04:29 -0800 Subject: [Patches] [ python-Patches-528022 ] PEP 285 - Adding a bool type Message-ID: Patches item #528022, was opened at 2002-03-10 00:45 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=528022&group_id=5470 Category: Core (C code) Group: Python 2.3 >Status: Closed >Resolution: Accepted Priority: 5 Submitted By: Guido van Rossum (gvanrossum) Assigned to: Guido van Rossum (gvanrossum) Summary: PEP 285 - Adding a bool type Initial Comment: Here's a preliminary implementation of the PEP, including unittests checking the promises made in the PEP (test_bool.py) and (some) documentation. With this 12 tests fail for me (on Linux); I'll look into these later. They appear shallow (mostly doctests dying on True or False where 1 or 0 was expected). Note: the presence of this patch does not mean that the PEP is accepted -- it just means that a sample implementation exists in case someone wants to explore the effects of the PEP on their code. ---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-03 18:04 Message: Logged In: YES user_id=6380 Here's a new version of booldiff.txt that includes the new files boolobject.[ch] and test_bool.py. Sorry. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-03 15:43 Message: Logged In: YES user_id=6380 I've attached a new patch, booldiff3.txt, that solves the two remaining problems: - pickle, cPickle and marshal round-trip - the test suite succeeds (a total of 12 tests had to be fixed, all because of True/False vs.
1/0 in printed output) I'm ready to check this in, but I'll first update the PEP. ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-04-01 06:44 Message: Logged In: YES user_id=21627 This patch does not support pickling of bools (the PEP should probably spell out how they are pickled). marshalling of bools does not round-trip (you get back an int). ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-03-31 21:46 Message: Logged In: YES user_id=6380 Here's an updated diff (booldiff2.txt). It fixes a refcount bug in bool_repr(), and works with current CVS. With this patch set, 10 standard tests fail for shallow reasons having to do with str() or repr() returning False or True instead of 0 or 1. Here are the failed tests: test_descr test_descrtut test_difflib test_doctest test_extcall test_generators test_gettext test_richcmp test_richcompare test_unicode ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=528022&group_id=5470 From noreply@sourceforge.net Wed Apr 3 23:48:42 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Wed, 03 Apr 2002 15:48:42 -0800 Subject: [Patches] [ python-Patches-539005 ] error in RawPen-class (line 262) Message-ID: Patches item #539005, was opened at 2002-04-04 01:48 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539005&group_id=5470 Category: Tkinter Group: Python 2.2.x Status: Open Resolution: None Priority: 5 Submitted By: Gregor Lingl (glingl) Assigned to: Nobody/Anonymous (nobody) Summary: error in RawPen-class (line 262) Initial Comment: line 262 uses the global variable _canvas instead of the instance-variable self._canvas created in the RawPen - Constructor. 
This certainly is a *very* old bug and it seems strange that it could remain undetected that long. For the patch, look at lines 262-264. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539005&group_id=5470 From noreply@sourceforge.net Thu Apr 4 00:15:53 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Wed, 03 Apr 2002 16:15:53 -0800 Subject: [Patches] [ python-Patches-539005 ] error in RawPen-class (line 262) Message-ID: Patches item #539005, was opened at 2002-04-03 18:48 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539005&group_id=5470 Category: Tkinter Group: Python 2.2.x >Status: Closed >Resolution: Duplicate Priority: 5 Submitted By: Gregor Lingl (glingl) Assigned to: Nobody/Anonymous (nobody) Summary: error in RawPen-class (line 262) Initial Comment: line 262 uses the global variable _canvas instead of the instance-variable self._canvas created in the RawPen - Constructor. This certainly is a *very* old bug and it seems strange that it could remain undetected that long. For the patch, look at lines 262-264. ---------------------------------------------------------------------- >Comment By: Neal Norwitz (nnorwitz) Date: 2002-04-03 19:15 Message: Logged In: YES user_id=33168 This is a duplicate of #538991, https://sourceforge.net/tracker/?group_id=5470&atid=105470&func=detail&aid=538991 The fix has been committed on the main branch, but not in the 2.2 branch yet. I'm not sure the fix will go in 2.2.1. It will be in 2.2.2.
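The bug itself fits in a one-line sketch; the class below is a stripped-down illustration, not the actual RawPen code from the library:

```python
_canvas = None  # module-level global that the buggy line reached for

class RawPen:
    def __init__(self, canvas):
        self._canvas = canvas  # instance variable set in the constructor

    def canvas(self):
        # Buggy form:  return _canvas      # global: shared, often None
        # Fixed form uses the instance attribute created in __init__:
        return self._canvas
```

With the global, every pen silently shares one canvas (or hits None), which is exactly the kind of bug that stays hidden as long as only one pen exists.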
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539005&group_id=5470 From noreply@sourceforge.net Thu Apr 4 01:28:35 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Wed, 03 Apr 2002 17:28:35 -0800 Subject: [Patches] [ python-Patches-539043 ] Support PyChecker in IDLE Message-ID: Patches item #539043, was opened at 2002-04-03 20:28 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539043&group_id=5470 Category: IDLE Group: None Status: Open Resolution: None Priority: 5 Submitted By: Neal Norwitz (nnorwitz) Assigned to: Nobody/Anonymous (nobody) Summary: Support PyChecker in IDLE Initial Comment: This patch adds SIMPLE support for pychecker in IDLE. It is not complete. It pops up a window, you can enter filenames (not even a file dialog!), and run pychecker. You cannot change examples. If someone wants to really integrate this, they should add the user interface in pychecker (pychecker/options.py), use a file dialog to enter files, and handle file modifications. Since pychecker imports the files, they need to be removed from sys.modules, so modifications will be seen. 
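The sys.modules issue Neal describes can be sketched as a small helper; the function name and the prefix-matching policy are assumptions, not code from the patch:

```python
import sys

def forget_modules(top_level_names):
    """Evict the given top-level modules (and their submodules) from
    sys.modules, so that the next import re-reads the files from disk --
    the behaviour a PyChecker rerun inside a long-lived IDLE process
    would need to pick up on-disk edits."""
    for loaded in list(sys.modules):
        if loaded.split(".", 1)[0] in top_level_names:
            del sys.modules[loaded]
```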
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539043&group_id=5470 From noreply@sourceforge.net Thu Apr 4 17:12:50 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 04 Apr 2002 09:12:50 -0800 Subject: [Patches] [ python-Patches-534862 ] help asyncore recover from repr() probs Message-ID: Patches item #534862, was opened at 2002-03-25 16:12 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=534862&group_id=5470 Category: Library (Lib) Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Skip Montanaro (montanaro) >Assigned to: Jeremy Hylton (jhylton) Summary: help asyncore recover from repr() probs Initial Comment: I've had this patch in my copy of asyncore.py for quite a while. It works for me as a way to recover from repr() bogosities, though I'm unfamiliar enough with repr/str issues and asyncore to know if this is the right way to make it more bulletproof (or if it should even be made more bulletproof). Skip ---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-04 12:12 Message: Logged In: YES user_id=6380 Jeremy, what do you think of this? Looks harmless to me...
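The patch file itself isn't shown here, but the usual shape of such hardening is a guarded repr() used wherever asyncore composes log and error strings. A sketch with names of my choosing, not Skip's actual diff:

```python
def safe_repr(obj):
    """Return repr(obj), degrading gracefully when __repr__ itself
    raises, so logging a broken dispatcher can't crash the event loop."""
    try:
        return repr(obj)
    except Exception:
        # Built only from pieces that cannot fail for ordinary objects.
        return "<%s at %#x (repr failed)>" % (type(obj).__name__, id(obj))

class BrokenRepr:
    def __repr__(self):
        raise RuntimeError("bogus __repr__")
```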
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=534862&group_id=5470 From noreply@sourceforge.net Thu Apr 4 17:31:35 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 04 Apr 2002 09:31:35 -0800 Subject: [Patches] [ python-Patches-536883 ] SimpleXMLRPCServer auto-docing subclass Message-ID: Patches item #536883, was opened at 2002-03-29 14:52 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=536883&group_id=5470 Category: Library (Lib) Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Brian Quinlan (bquinlan) Assigned to: Fredrik Lundh (effbot) Summary: SimpleXMLRPCServer auto-docing subclass Initial Comment: This SimpleXMLRPCServer subclass automatically serves HTML documentation, generated using pydoc, in response to an HTTP GET request (XML-RPC always uses POST). Here are some examples: http://www.sweetapp.com/cgi-bin/xmlrpc-test/rpc1.py http://www.sweetapp.com/cgi-bin/xmlrpc-test/rpc2.py ---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-04 12:31 Message: Logged In: YES user_id=6380 Looks cute to me. Fredrik, any problem if I just check this in? 
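In sketch form, the trick is that pydoc can render the service class as an HTML page, and the request handler returns that page for GET while POST still goes to the XML-RPC dispatcher. The class and function names below are illustrative, not the patch's:

```python
import pydoc

class ExampleService:
    """Demo service; its docstrings become the documentation page."""
    def add(self, a, b):
        "Add two numbers and return the sum."
        return a + b

def html_for_get(service):
    # Body a do_GET() override would send back; do_POST() -- the actual
    # XML-RPC entry point -- is left untouched.
    return pydoc.html.document(type(service))
```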
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=536883&group_id=5470 From noreply@sourceforge.net Thu Apr 4 17:51:46 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 04 Apr 2002 09:51:46 -0800 Subject: [Patches] [ python-Patches-536407 ] Comprehensibility patch (typeobject.c) Message-ID: Patches item #536407, was opened at 2002-03-28 13:56 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=536407&group_id=5470 Category: Core (C code) Group: None >Status: Closed >Resolution: Accepted Priority: 5 Submitted By: David Abrahams (david_abrahams) >Assigned to: Guido van Rossum (gvanrossum) Summary: Comprehensibility patch (typeobject.c) Initial Comment: --- typeobject.c Mon Dec 17 12:14:22 2001 +++ typeobject.c.new Thu Mar 28 13:46:03 2002 @@ -1186,8 +1186,8 @@ type_getattro(PyTypeObject *type, PyObject *name) { PyTypeObject *metatype = type->ob_type; - PyObject *descr, *res; - descrgetfunc f; + PyObject *meta_attribute, *attribute; + descrgetfunc meta_get; /* Initialize this type (we'll assume the metatype is initialized) */ if (type->tp_dict == NULL) { @@ -1195,34 +1195,50 @@ return NULL; } - /* Get a descriptor from the metatype */ - descr = _PyType_Lookup(metatype, name); - f = NULL; - if (descr != NULL) { - f = descr->ob_type->tp_descr_get; - if (f != NULL && PyDescr_IsData (descr)) - return f(descr, - (PyObject *)type, (PyObject *)metatype); - } + /* No readable descriptor found yet */ + meta_get = NULL; + + /* Look for the attribute in the metatype */ + meta_attribute = _PyType_Lookup(metatype, name); - /* Look in tp_dict of this type and its bases */ - res = _PyType_Lookup(type, name); - if (res != NULL) { - f = res->ob_type->tp_descr_get; - if (f != NULL) - return f(res, (PyObject *) NULL, (PyObject *)type); - Py_INCREF(res); - return res; + if (meta_attribute != NULL) { + meta_get = 
meta_attribute->ob_type->tp_descr_get; + + if (meta_get != NULL && PyDescr_IsData(meta_attribute)) { + /* Data descriptors implement tp_descr_set to intercept + * writes. Assume the attribute is not overridden in + * type's tp_dict (and bases): call the descriptor now. + */ + return meta_get(meta_attribute, + (PyObject *)type, (PyObject *)metatype); + } } - /* Use the descriptor from the metatype */ - if (f != NULL) { - res = f(descr, (PyObject *)type, (PyObject *)metatype); - return res; + /* No data descriptor found on metatype. Look in tp_dict of this + * type and its bases */ + attribute = _PyType_Lookup(type, name); + if (attribute != NULL) { + /* Implement descriptor functionality, if any */ + descrgetfunc local_get = attribute->ob_type->tp_descr_get; + if (local_get != NULL) { + /* NULL 2nd argument indicates the descriptor was found on + * the target object itself (or a base) */ + return local_get(attribute, (PyObject *)NULL, (PyObject *)type); + } + + Py_INCREF(attribute); + return attribute; } - if (descr != NULL) { - Py_INCREF(descr); - return descr; + + /* No attribute found in local __dict__ (or bases): use the + * descriptor from the metatype, if any */ + if (meta_get != NULL) + return meta_get(meta_attribute, (PyObject *)type, (PyObject *)metatype); + + /* If an ordinary attribute was found on the metatype, return it now. */ + if (meta_attribute != NULL) { + Py_INCREF(meta_attribute); + return meta_attribute; } /* Give up */ ---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-04 12:51 Message: Logged In: YES user_id=6380 Thanks, applied (after folding some long lines). Next time, please don't call the patch "patch". Call it something like "typeobject.patch". ---------------------------------------------------------------------- Comment By: David Abrahams (david_abrahams) Date: 2002-03-29 17:30 Message: Logged In: YES user_id=52572 Thanks, Neil, I think I got the picture already (see Python-Dev). ---------------------------------------------------------------------- Comment By: Neil Schemenauer (nascheme) Date: 2002-03-29 17:27 Message: Logged In: YES user_id=35752 Don't paste the patch in the comment box. ---------------------------------------------------------------------- Comment By: David Abrahams (david_abrahams) Date: 2002-03-29 16:22 Message: Logged In: YES user_id=52572 I have updated the patch so that it is made against the current sources. ------- --- typeobject.c Thu Mar 28 00:33:33 2002 +++ typeobject.c.new Fri Mar 29 16:20:12 2002 @@ -1237,8 +1237,8 @@ type_getattro(PyTypeObject *type, PyObject *name) { PyTypeObject *metatype = type->ob_type; - PyObject *descr, *res; - descrgetfunc f; + PyObject *meta_attribute, *attribute; + descrgetfunc meta_get; /* Initialize this type (we'll assume the metatype is initialized) */ if (type->tp_dict == NULL) { @@ -1246,40 +1246,56 @@ return NULL; } - /* Get a descriptor from the metatype */ - descr = _PyType_Lookup(metatype, name); - f = NULL; - if (descr != NULL) { - f = descr->ob_type->tp_descr_get; - if (f != NULL && PyDescr_IsData(descr)) - return f(descr, - (PyObject *)type, (PyObject *)metatype); - } + /* No readable descriptor found yet */ + meta_get = NULL; + + /* Look for the attribute in the metatype */ + meta_attribute = _PyType_Lookup(metatype, name); - /* Look in tp_dict of this type and its bases */ - res = _PyType_Lookup(type, name); - if (res != NULL) { - f = res->ob_type->tp_descr_get; - if (f != NULL) - return f(res, (PyObject *)NULL, (PyObject *)type); - Py_INCREF(res); - return res; + if (meta_attribute != NULL) { + meta_get = meta_attribute->ob_type->tp_descr_get; + + if (meta_get != NULL && PyDescr_IsData(meta_attribute)) { + /* Data descriptors implement tp_descr_set to intercept + * writes. Assume the attribute is not overridden in + * type's tp_dict (and bases): call the descriptor now. + */ + return meta_get(meta_attribute, + (PyObject *)type, (PyObject *)metatype); + } } - /* Use the descriptor from the metatype */ - if (f != NULL) { - res = f(descr, (PyObject *)type, (PyObject *)metatype); - return res; + /* No data descriptor found on metatype. Look in tp_dict of this + * type and its bases */ + attribute = _PyType_Lookup(type, name); + if (attribute != NULL) { + /* Implement descriptor functionality, if any */ + descrgetfunc local_get = attribute->ob_type->tp_descr_get; + if (local_get != NULL) { + /* NULL 2nd argument indicates the descriptor was found on + * the target object itself (or a base) */ + return local_get(attribute, (PyObject *)NULL, (PyObject *)type); + } + + Py_INCREF(attribute); + return attribute; } - if (descr != NULL) { - Py_INCREF(descr); - return descr; + + /* No attribute found in local __dict__ (or bases): use the + * descriptor from the metatype, if any */ + if (meta_get != NULL) + return meta_get(meta_attribute, (PyObject *)type, (PyObject *)metatype); + + /* If an ordinary attribute was found on the metatype, return it now.
*/ + if (meta_attribute != NULL) { + Py_INCREF(meta_attribute); + return meta_attribute; } /* Give up */ PyErr_Format(PyExc_AttributeError, - "type object '%.50s' has no attribute '%.400s'", - type->tp_name, PyString_AS_STRING (name)); + "type object '%.50s' has no attribute '%.400s'", + type->tp_name, PyString_AS_STRING (name)); return NULL; } ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=536407&group_id=5470 From noreply@sourceforge.net Thu Apr 4 17:52:47 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 04 Apr 2002 09:52:47 -0800 Subject: [Patches] [ python-Patches-539360 ] Webbrowser.py and konqueror Message-ID: Patches item #539360, was opened at 2002-04-04 09:52 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539360&group_id=5470 Category: Library (Lib) Group: None Status: Open Resolution: None Priority: 5 Submitted By: Andy McKay (zopezen) Assigned to: Nobody/Anonymous (nobody) Summary: Webbrowser.py and konqueror Initial Comment: The open function for konqueror would always fail on the assert. The assert would check the action did not contain a single quote. The url passed through in the open function would always contain a single quote. The assert should check the incoming url for a single quote. If its properly quoted then you can pass on to _remote. Secondly since the _remote url is now correctly quoted, there is no need for a second set of quotes on the kfmclient. Tested on Kde 2.2. 
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539360&group_id=5470 From noreply@sourceforge.net Thu Apr 4 17:55:14 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 04 Apr 2002 09:55:14 -0800 Subject: [Patches] [ python-Patches-536883 ] SimpleXMLRPCServer auto-docing subclass Message-ID: Patches item #536883, was opened at 2002-03-29 11:52 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=536883&group_id=5470 Category: Library (Lib) Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Brian Quinlan (bquinlan) Assigned to: Fredrik Lundh (effbot) Summary: SimpleXMLRPCServer auto-docing subclass Initial Comment: This SimpleXMLRPCServer subclass automatically serves HTML documentation, generated using pydoc, in response to an HTTP GET request (XML-RPC always uses POST). Here are some examples: http://www.sweetapp.com/cgi-bin/xmlrpc-test/rpc1.py http://www.sweetapp.com/cgi-bin/xmlrpc-test/rpc2.py ---------------------------------------------------------------------- >Comment By: Brian Quinlan (bquinlan) Date: 2002-04-04 09:55 Message: Logged In: YES user_id=108973 Sorry, I was sloppy about the description: This patch is dependent on patch 473586: [473586] SimpleXMLRPCServer - fixes and CGI So please don't check this in until that patch is accepted. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-04 09:31 Message: Logged In: YES user_id=6380 Looks cute to me. Fredrik, any problem if I just check this in?
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=536883&group_id=5470 From noreply@sourceforge.net Thu Apr 4 18:59:23 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 04 Apr 2002 10:59:23 -0800 Subject: [Patches] [ python-Patches-539392 ] Unicode fix for test in tkFileDialog.py Message-ID: Patches item #539392, was opened at 2002-04-04 20:59 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539392&group_id=5470 Category: Tkinter Group: None Status: Open Resolution: None Priority: 5 Submitted By: Bernhard Reiter (ber) Assigned to: Nobody/Anonymous (nobody) Summary: Unicode fix for test in tkFileDialog.py Initial Comment: Patch is against current CVS from 20020404. It also gives pointers to the problem described in http://mail.python.org/pipermail/python-list/2001-June/048787.html Python's open() uses the Py_FileSystemDefaultEncoding. Py_FileSystemDefaultEncoding is NULL (bltinmodule.c) for most systems. Setlocale will set it. Thus we fixed the example and set the locale to the user defaults. Now "enc" will have a useful encoding thus the example will work with non-ASCII characters in the filename, e.g. with umlauts in it. It bombed on them before. Traceback (most recent call last): File "tkFileDialog.py", line 105, in ? print "open", askopenfilename(filetypes=[("all filez", "*")]) UnicodeError: ASCII encoding error: ordinal not in range(128) open() will work with the string directly now. encode(enc) is only needed for terminal output, thus we enhanced the example to show the two uses of the returned filename string separately. (It might be interesting to drop a note about this in the right part of the user documentation.) If you comment out the setlocale() you can see that open fails, which illustrates what seems to be a design flaw in tk.
Tk should be able to give you a string in exactly the encoding in which the filesystem gave it to tk. 4.4.2002 Bernhard Bernhard ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539392&group_id=5470 From noreply@sourceforge.net Thu Apr 4 19:26:09 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 04 Apr 2002 11:26:09 -0800 Subject: [Patches] [ python-Patches-536883 ] SimpleXMLRPCServer auto-docing subclass Message-ID: Patches item #536883, was opened at 2002-03-29 11:52 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=536883&group_id=5470 Category: Library (Lib) Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Brian Quinlan (bquinlan) Assigned to: Fredrik Lundh (effbot) Summary: SimpleXMLRPCServer auto-docing subclass Initial Comment: This SimpleXMLRPCServer subclass automatically serves HTML documentation, generated using pydoc, in response to an HTTP GET request (XML-RPC always uses POST). Here are some examples: http://www.sweetapp.com/cgi-bin/xmlrpc-test/rpc1.py http://www.sweetapp.com/cgi-bin/xmlrpc-test/rpc2.py ---------------------------------------------------------------------- >Comment By: Brian Quinlan (bquinlan) Date: 2002-04-04 11:26 Message: Logged In: YES user_id=108973 Sorry, I was sloppy about the description: This patch is dependent on patch 473586: [473586] SimpleXMLRPCServer - fixes and CGI So please don't check this in until that patch is accepted. ---------------------------------------------------------------------- Comment By: Brian Quinlan (bquinlan) Date: 2002-04-04 09:55 Message: Logged In: YES user_id=108973 Sorry, I was sloppy about the description: This patch is dependent on patch 473586: [473586] SimpleXMLRPCServer - fixes and CGI So please don't check this in until that patch is accepted.
---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-04 09:31 Message: Logged In: YES user_id=6380 Looks cute to me. Fredrik, any problem if I just check this in? ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=536883&group_id=5470 From noreply@sourceforge.net Thu Apr 4 19:57:40 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 04 Apr 2002 11:57:40 -0800 Subject: [Patches] [ python-Patches-533008 ] specifying headers for extensions Message-ID: Patches item #533008, was opened at 2002-03-21 06:09 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=533008&group_id=5470 Category: Distutils and setup.py Group: Python 2.3 Status: Open Resolution: None Priority: 7 Submitted By: Thomas Heller (theller) Assigned to: Fred L. Drake, Jr. (fdrake) Summary: specifying headers for extensions Initial Comment: This patch makes it possible to specify that C header files are part of source files for dependency checking. The 'sources' list in Extension instances can be simple filenames as before, but they can also be SourceFile instances created by SourceFile("myfile.c", headers=["inc1.h", "inc2.h"]). Unfortunately, not only did changes to command.build_ext and command.build_clib have to be made; all the ccompiler (sub)classes also had to be changed, because the ccompiler does the actual dependency checking. I updated all the ccompiler subclasses except mwerkscompiler.py, but only msvccompiler has actually been tested. The argument list which dep_util.newer_pairwise() now accepts has changed: the first arg must now be a sequence of SourceFile instances. This may be problematic; it would IMO be better to move this function (with a new name?) into ccompiler. ---------------------------------------------------------------------- >Comment By: Fred L. Drake, Jr.
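The dependency rule being added can be stated compactly: recompile when the object file is missing, or when the source or any declared header is newer. A simplified model (the function name is mine; the real patch extends dep_util.newer_pairwise and the ccompiler classes):

```python
import os

def needs_rebuild(source, headers, obj):
    """True when `obj` must be recompiled: it does not exist yet, or
    `source` or any file in `headers` has a newer modification time."""
    if not os.path.exists(obj):
        return True
    obj_mtime = os.path.getmtime(obj)
    return any(os.path.getmtime(dep) > obj_mtime
               for dep in [source, *headers])
```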
(fdrake) Date: 2002-04-04 14:57 Message: Logged In: YES user_id=3066 Wow! That's certainly more patch than I'd expected, but the approach looks about right to me. I'd like to take another look at it in a few days (mail me if I don't take action soon) before we accept, just to make sure I understand it better. Thanks! ---------------------------------------------------------------------- Comment By: Thomas Heller (theller) Date: 2002-03-25 04:03 Message: Logged In: YES user_id=11105 Fred requested it this way: http://mail.python.org/pipermail/distutils-sig/2002-March/002806.html ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-03-24 17:05 Message: Logged In: YES user_id=6380 Why is this priority 7?????? ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=533008&group_id=5470 From noreply@sourceforge.net Thu Apr 4 20:16:38 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 04 Apr 2002 12:16:38 -0800 Subject: [Patches] [ python-Patches-539392 ] Unicode fix for test in tkFileDialog.py Message-ID: Patches item #539392, was opened at 2002-04-04 20:59 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539392&group_id=5470 Category: Tkinter Group: None Status: Open Resolution: None Priority: 5 Submitted By: Bernhard Reiter (ber) Assigned to: Nobody/Anonymous (nobody) Summary: Unicode fix for test in tkFileDialog.py Initial Comment: Patch is against current CVS from 20020404. It also gives pointers to the problem described in http://mail.python.org/pipermail/python-list/2001-June/048787.html Python's open() uses the Py_FileSystemDefaultEncoding. Py_FileSystemDefaultEncoding is NULL (bltinmodule.c) for most systems. Setlocale will set it. Thus we fixed the example and set the locale to the user defaults.
Now "enc" will have a useful encoding thus the example will work with non-ASCII characters in the filename, e.g. with umlauts in it. It bombed on them before. Traceback (most recent call last): File "tkFileDialog.py", line 105, in ? print "open", askopenfilename(filetypes=[("all filez", "*")]) UnicodeError: ASCII encoding error: ordinal not in range(128) open() will work with the string directly now. encode(enc) is only needed for terminal output, thus we enhanced the example to show the two uses of the returned filename string separately. (It might be interesting to drop a note about this in the right part of the user documentation.) If you comment out the setlocale() you can see that open fails, which illustrates what seems to be a design flaw in tk. Tk should be able to give you a string in exactly the encoding in which the filesystem gave it to tk. 4.4.2002 Bernhard Bernhard ---------------------------------------------------------------------- >Comment By: Martin v. Löwis (loewis) Date: 2002-04-04 22:16 Message: Logged In: YES user_id=21627 I think this patch is not acceptable. If the application wants to support non-ASCII file names, it must invoke setlocale(); it is not the library's responsibility to make this decision behind the application's back. People question the validity of using CODESET in the file system, so each developer needs to make a conscious decision. BTW, how does Tcl come up with the names in the first place?
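For reference, the application-level opt-in that Martin argues for looks like this. A sketch, not the patch's code; the fallback exists because setlocale can fail when the environment names an unknown locale:

```python
import locale

def adopt_user_locale():
    """Adopt the user's default locale -- an application decision, not
    the library's -- and return the encoding to use when *printing* file
    names; open() should receive the returned name unencoded."""
    try:
        locale.setlocale(locale.LC_ALL, "")  # opt in to user defaults
    except locale.Error:
        pass                                 # e.g. unknown LANG value
    return locale.getpreferredencoding(False) or "ascii"
```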
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539392&group_id=5470 From noreply@sourceforge.net Thu Apr 4 20:51:56 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 04 Apr 2002 12:51:56 -0800 Subject: [Patches] [ python-Patches-523415 ] Explict proxies for urllib.urlopen() Message-ID: Patches item #523415, was opened at 2002-02-27 09:53 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=523415&group_id=5470 Category: Library (Lib) Group: Python 2.2.x >Status: Closed >Resolution: Accepted Priority: 5 Submitted By: Andy Gimblett (gimbo) Assigned to: Fred L. Drake, Jr. (fdrake) Summary: Explict proxies for urllib.urlopen() Initial Comment: This patch extends urllib.urlopen() so that proxies may be specified explicitly. This is achieved by adding an optional "proxies" parameter. If this parameter is omitted, urlopen() acts exactly as before, ie gets proxy settings from the environment. This is useful if you want to tell urlopen() not to use the proxy: just pass an empty dictionary. Also included is a patch to the urllib documentation explaining the new parameter. Apologies if patch format is not exactly as required: this is my first submission. All feedback appreciated. :-) ---------------------------------------------------------------------- >Comment By: Fred L. Drake, Jr. (fdrake) Date: 2002-04-04 15:51 Message: Logged In: YES user_id=3066 I've checked this in, with some changes to the code for urlopen(). When a proxy configuration is supplied, the version I checked in does not save the opener if there isn't one; it always discards it. If you really want to use a specific proxy configuration with the simple functions, create the opener and assign it to urllib._urlopener. 
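Fred's note boils down to a three-way decision on the `proxies` argument; a simplified model of those semantics (the helper name is mine, not urllib's):

```python
import os

def effective_proxies(proxies=None):
    """None -> read *_proxy environment variables (the old behaviour);
    {}   -> explicitly no proxies, i.e. a direct connection;
    dict -> use exactly the mapping supplied by the caller."""
    if proxies is None:
        return {name[:-len("_proxy")].lower(): url
                for name, url in os.environ.items()
                if name.lower().endswith("_proxy") and url}
    return dict(proxies)
```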
---------------------------------------------------------------------- Comment By: Andy Gimblett (gimbo) Date: 2002-03-21 06:08 Message: Logged In: YES user_id=262849 OK, have updated docs as suggested by aimacintyre, attached as urllib_proxies_docs.cdiff I also added an example for explicit proxy specification, since it illustrates how the proxies dictionary should be structured. ---------------------------------------------------------------------- Comment By: Andrew I MacIntyre (aimacintyre) Date: 2002-03-10 00:31 Message: Logged In: YES user_id=250749 I think expanding the docs is the go here. In looking at the 2.2 docs (11.4 urllib), the bits that I think could usefully be improved include:- - the paragraph describing the proxy environment variables should note that on Windows, browser (at least for InternetExplorer - I don't know about Netscape) registry settings for proxies will be used when available; - a short para noting that proxies can be overridden using URLopener/FancyURLopener class instances, documented further down the page, placed just before the note about not supporting authenticating proxies; - adding a description of the "proxies" parameter to the URLopener class definition; - adding an example of bypassing proxies to the examples subsection (11.4.2). If/when you upload a doc patch, I suggest that you assign it to Fred Drake, who is the chief docs person. ---------------------------------------------------------------------- Comment By: Andy Gimblett (gimbo) Date: 2002-03-04 04:33 Message: Logged In: YES user_id=262849 Thanks for feedback re: diffs. Have now found out about context diffs and attached new version - hope this is better. Regarding the patch itself, this arose out of a newbie question on c.l.py and I was reminded that this was an issue I'd come across in my early days too. Personally I'd never picked up the hint that you should use FancyURLopener directly. If preferred, I could have a go at patching the docs to make that clearer? 
---------------------------------------------------------------------- Comment By: Andrew I MacIntyre (aimacintyre) Date: 2002-03-02 22:34 Message: Logged In: YES user_id=250749 BTW, the patch guidelines indicate a strong preference for context diffs with unified diffs a poor second. ---------------------------------------------------------------------- Comment By: Andrew I MacIntyre (aimacintyre) Date: 2002-03-02 22:32 Message: Logged In: YES user_id=250749 Having just looked at this myself, I can understand where you're coming from, however my reading between the lines of the docs is that if you care about the proxies then you are supposed to use urllib.FancyURLopener (or urllib.URLopener) directly. If this is the intent, the docs could be a little clearer about this. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=523415&group_id=5470 From noreply@sourceforge.net Thu Apr 4 22:25:53 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 04 Apr 2002 14:25:53 -0800 Subject: [Patches] [ python-Patches-539486 ] build info docs from sources Message-ID: Patches item #539486, was opened at 2002-04-04 22:25 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539486&group_id=5470 Category: Documentation Group: Python 2.1.2 Status: Open Resolution: None Priority: 5 Submitted By: Matthias Klose (doko) Assigned to: Fred L. Drake, Jr. (fdrake) Summary: build info docs from sources Initial Comment: This patch adds Milan Zamazals conversion script and modifies the mkinfo script to build the info doc files from the latex sources. Currently, the mac, doc and inst tex files are not handled. 
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539486&group_id=5470 From noreply@sourceforge.net Thu Apr 4 22:26:45 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 04 Apr 2002 14:26:45 -0800 Subject: [Patches] [ python-Patches-539487 ] build info docs from tex sources Message-ID: Patches item #539487, was opened at 2002-04-04 22:26 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539487&group_id=5470 Category: Documentation Group: Python 2.2.x Status: Open Resolution: None Priority: 5 Submitted By: Matthias Klose (doko) Assigned to: Fred L. Drake, Jr. (fdrake) Summary: build info docs from tex sources Initial Comment: This patch adds Milan Zamazals conversion script and modifies the mkinfo script to build the info doc files from the latex sources. Currently, the mac, doc and inst tex files are not handled. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539487&group_id=5470 From noreply@sourceforge.net Thu Apr 4 23:45:59 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 04 Apr 2002 15:45:59 -0800 Subject: [Patches] [ python-Patches-514662 ] On the update_slot() behavior Message-ID: Patches item #514662, was opened at 2002-02-07 23:49 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=514662&group_id=5470 Category: Core (C code) Group: Python 2.3 >Status: Closed >Resolution: Accepted Priority: 6 Submitted By: Naofumi Honda (naofumi-h) Assigned to: Guido van Rossum (gvanrossum) Summary: On the update_slot() behavior Initial Comment: Inherited method __getitem__ of list type in the new subclass is unexpectedly slow. 
For example:

    x = list([1,2,3])
    r = xrange(1, 1000000)
    for i in r: x[1] = 2
    ==> execution time: real 0m2.390s

    class nlist(list): pass
    x = nlist([1,2,3])
    r = xrange(1, 1000000)
    for i in r: x[1] = 2
    ==> execution time: real 0m7.040s

about 3 times slower!!! The reason is: for the __getitem__ attribute, there are two slotdefs in typeobject.c (one for the mapping type, and the other for the sequence type). In the creation of a new subtype of the list type, the fixup_slot_dispatchers() and update_slot() functions in typeobject.c allocate the functions to both the sq_item and mp_subscript slots (the mp_subscript slot originally had no function, because the list type is a sequence type), and it's an unexpected allocation for the mapping slot since the descriptor type of __getitem__ is now WrapperType for the sequence operations. If you trace x[1] using gdb, you will find that in PyObject_GetItem() m->mp_subscript = slot_mp_subscript is called instead of a sequence operation, because the mp_subscript slot was allocated by fixup_slot_dispatchers(). In slot_mp_subscript(), call_method(self, "__getitem__", ...) is invoked, which turns out to call a wrapper descriptor for sq_item. As a result, the method of the list type is finally called, but it needs many unexpected function calls. I will fix the behavior of fixup_slot_dispatchers() and update_slot() as follows: only in the case where *) two or more slotdefs have the same attribute name where at most one corresponding slot has a non-null pointer, and *) the descriptor type of the attribute is WrapperType, will these functions allocate just one function, to the appropriate slot. In the other cases, the behavior is not changed, to keep compatibility! (in particular, considering the case where user-overridden methods exist!) The following patch also includes speed-up routines to find the slotdef duplications, but it's not essential! 
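[Editor's note: the benchmark above can be re-created roughly on modern Python (range replaces xrange); the class name NList is illustrative. Absolute timings vary by machine, and the gap has narrowed considerably since fixes like this patch landed.]

```python
import timeit

class NList(list):
    pass  # trivial subclass, as in the report's "class nlist(list): pass"

# Time repeated item access on a plain list versus the trivial subclass.
plain = timeit.timeit("x[1]", globals={"x": [1, 2, 3]}, number=200_000)
sub = timeit.timeit("x[1]", globals={"x": NList([1, 2, 3])}, number=200_000)

# Behavior is identical; only the dispatch path (and hence speed) differs.
assert NList([1, 2, 3])[1] == [1, 2, 3][1] == 2
print(f"plain list: {plain:.4f}s  subclass: {sub:.4f}s")
```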
---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-04 18:45 Message: Logged In: YES user_id=6380 Thanks! Checked in, with much refactoring. ---------------------------------------------------------------------- Comment By: Naofumi Honda (naofumi-h) Date: 2002-03-23 03:40 Message: Logged In: YES user_id=452575 Yes. slot-1.dif is a new version. At least, I purged ifdef ... as you want. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-03-22 22:47 Message: Logged In: YES user_id=6380 Is slot-1.dif the promised new patch? ---------------------------------------------------------------------- Comment By: Naofumi Honda (naofumi-h) Date: 2002-03-11 21:49 Message: Logged In: YES user_id=452575 I will post a new patch containing an essential part of the previous one (i.e. without ifdef and almost all speed up routines). ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-03-10 17:14 Message: Logged In: YES user_id=6380 Thanks for the analysis! Would you mind submitting a new patch without the #ifdef ORIGINAL_CODE stuff? Just delete/replace old code as needed -- cvs diff will show me the original code. The ORIGINAL_CODE stuff makes it harder for me to get the point of the diff. Also, maybe you could leave the speedup code out, to show the absolutely minimal amount of code needed. 
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=514662&group_id=5470 From noreply@sourceforge.net Fri Apr 5 15:18:20 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Fri, 05 Apr 2002 07:18:20 -0800 Subject: [Patches] [ python-Patches-536578 ] patch for bug 462783 mmap bus error Message-ID: Patches item #536578, was opened at 2002-03-28 22:02 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=536578&group_id=5470 Category: Modules Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Greg Green (gpgreen) >Assigned to: A.M. Kuchling (akuchling) Summary: patch for bug 462783 mmap bus error Initial Comment: This patch fixes SF 462783. The problem was that an mmap'ed file caused a bus error when reading data from the file. The root cause is that the file wasn't flushed following a write. The patched module will throw an OSError exception if the mmap object was created without being flushed, fseek'ed, or closed, following a write. This patch only applies to unix systems. Windows seems to handle the condition ok. The problem with the patch is that existing code can be broken. On some systems, (FreeBSD, irix), as long as the file was flushed before attempting to read from the mmap object, it would work with no bus error. Linux gets a bus error no matter what. So existing code that did flush (or fseek) before a read will now get an OSError exception during mmap creation instead. I tried this on the cvs version of python 2.3, on linux redhat 7.2, FreeBSD 4.5, irix 6.5 n32, and windows 2000. 
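[Editor's note: the platform guard that comes up later in this thread hinges on a classic Python pitfall: ('win32') is a parenthesized string, not a one-element tuple. A small self-contained illustration:]

```python
import sys

# ('win32') is just a string; the trailing comma makes a tuple.
assert ('win32') == 'win32'
assert ('win32',) != 'win32'

# Against a bare string, `in` performs substring matching:
assert 'win' in ('win32')       # True for the wrong reason
assert 'win' not in ('win32',)  # tuple membership, as intended

# The corrected guard — run the Unix-only check everywhere but win32:
if sys.platform not in ('win32',):
    pass  # the mmap flush check discussed above would run here
```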
-- Greg Green ---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-05 10:18 Message: Logged In: YES user_id=6380 One comment from Andrew Dalke (who submitted bug 462783) about the patch: There's a small typo in the patch to test_mmap.py. Line 277 says ... not in ('win32'): when it should say ... not in ('win32', ): (Personally, I'd write ... != 'win32' or ... not in ['win32'] --GvR) Assigning to AMK since it's his module. ---------------------------------------------------------------------- Comment By: Greg Green (gpgreen) Date: 2002-03-29 13:49 Message: Logged In: YES user_id=499627 my email is gregory.p.green@boeing.com ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=536578&group_id=5470 From noreply@sourceforge.net Fri Apr 5 18:59:55 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Fri, 05 Apr 2002 10:59:55 -0800 Subject: [Patches] [ python-Patches-539392 ] Unicode fix for test in tkFileDialog.py Message-ID: Patches item #539392, was opened at 2002-04-04 20:59 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539392&group_id=5470 Category: Tkinter Group: None Status: Open Resolution: None Priority: 5 Submitted By: Bernhard Reiter (ber) >Assigned to: Martin v. Löwis (loewis) Summary: Unicode fix for test in tkFileDialog.py Initial Comment: Patch is against current CVS from 20020404. It also gives pointers to the problem described in http://mail.python.org/pipermail/python-list/2001-June/048787.html Python's open() uses the Py_FileSystemDefaultEncoding. Py_FileSystemDefaultEncoding is NULL (bltinmodule.c) for most systems. setlocale() will set it. Thus we fixed the example and set the locale to the user defaults. Now "enc" will have a useful encoding, so the example will work with non-ASCII characters in the filename, e.g. 
with umlauts in it. It bombed on them before:

    Traceback (most recent call last):
      File "tkFileDialog.py", line 105, in ?
        print "open", askopenfilename(filetypes=[("all filez", "*")])
    UnicodeError: ASCII encoding error: ordinal not in range(128)

open() will work with the string directly now. encode(enc) is only needed for terminal output, so we enhanced the example to show the two uses of the returned filename string separately. (It might be interesting to drop a note about this in the right part of the user documentation.) If you comment out the setlocale() you can see that open fails, which illustrates what seems to be a design flaw in Tk: Tk should be able to give you a string in exactly the encoding in which the filesystem gave it to Tk. 4.4.2002 Bernhard ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-04-04 22:16 Message: Logged In: YES user_id=21627 I think this patch is not acceptable. If the application wants to support non-ASCII file names, it must invoke setlocale(); it is not the library's responsibility to make this decision behind the application's back. People question the validity of using CODESET in the file system, so each developer needs to make a conscious decision. BTW, how does Tcl come up with the names in the first place? 
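[Editor's note: the locale step at the heart of this patch can be sketched in a few lines of modern standard-library Python. The try/except is a portability hedge for minimal environments without the user's locale configured.]

```python
import locale

# As in the patch: switch from the default "C" locale to the user's
# settings, so the encoding Python derives can handle non-ASCII names.
try:
    locale.setlocale(locale.LC_ALL, "")
except locale.Error:
    pass  # minimal environments may lack a configured user locale

enc = locale.getpreferredencoding(False)
print("preferred encoding:", enc)
```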
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539392&group_id=5470 From noreply@sourceforge.net Fri Apr 5 19:08:46 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Fri, 05 Apr 2002 11:08:46 -0800 Subject: [Patches] [ python-Patches-539043 ] Support PyChecker in IDLE Message-ID: Patches item #539043, was opened at 2002-04-04 03:28 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539043&group_id=5470 Category: IDLE Group: None Status: Open Resolution: None Priority: 5 Submitted By: Neal Norwitz (nnorwitz) Assigned to: Nobody/Anonymous (nobody) Summary: Support PyChecker in IDLE Initial Comment: This patch adds SIMPLE support for pychecker in IDLE. It is not complete. It pops up a window, you can enter filenames (not even a file dialog!), and run pychecker. You cannot change examples. If someone wants to really integrate this, they should add the user interface in pychecker (pychecker/options.py), use a file dialog to enter files, and handle file modifications. Since pychecker imports the files, they need to be removed from sys.modules, so modifications will be seen. ---------------------------------------------------------------------- >Comment By: Martin v. Löwis (loewis) Date: 2002-04-05 21:08 Message: Logged In: YES user_id=21627 I'm concerned about the copyright notice. "All rights reserved" means "you cannot copy it". Could you consider licensing this under the PSF license? 
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539043&group_id=5470 From noreply@sourceforge.net Fri Apr 5 19:24:43 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Fri, 05 Apr 2002 11:24:43 -0800 Subject: [Patches] [ python-Patches-539043 ] Support PyChecker in IDLE Message-ID: Patches item #539043, was opened at 2002-04-03 20:28 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539043&group_id=5470 Category: IDLE Group: None Status: Open Resolution: None Priority: 5 Submitted By: Neal Norwitz (nnorwitz) Assigned to: Nobody/Anonymous (nobody) Summary: Support PyChecker in IDLE Initial Comment: This patch adds SIMPLE support for pychecker in IDLE. It is not complete. It pops up a window, you can enter filenames (not even a file dialog!), and run pychecker. You cannot change examples. If someone wants to really integrate this, they should add the user interface in pychecker (pychecker/options.py), use a file dialog to enter files, and handle file modifications. Since pychecker imports the files, they need to be removed from sys.modules, so modifications will be seen. ---------------------------------------------------------------------- >Comment By: Neal Norwitz (nnorwitz) Date: 2002-04-05 14:24 Message: Logged In: YES user_id=33168 No need to worry. Really, just want to have MetaSlash mentioned. But note, this patch is still incomplete and more work needs to be done before this should be accepted. ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-04-05 14:08 Message: Logged In: YES user_id=21627 I'm concerned about the copyright notice. "All rights reserved" means "you cannot copy it". Could you consider licensing this under the PSF license? 
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539043&group_id=5470 From noreply@sourceforge.net Fri Apr 5 19:38:46 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Fri, 05 Apr 2002 11:38:46 -0800 Subject: [Patches] [ python-Patches-539949 ] dict.popitem(key=None) Message-ID: Patches item #539949, was opened at 2002-04-05 19:38 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539949&group_id=5470 Category: Core (C code) Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Raymond Hettinger (rhettinger) Assigned to: Nobody/Anonymous (nobody) Summary: dict.popitem(key=None) Initial Comment: This patch implements the feature request at http://sourceforge.net/tracker/index.php? func=detail&aid=495086&group_id=5470&atid=355470 which asks for an optional argument to popitem so that it returns a key/value pair for a specified key or, if not specified, an arbitrary key. The benefit is in providing a fast, explicit way to retrieve and remove a particular key/value pair from a dictionary. By using only a single lookup, it is faster than the usual Python code:

    value = d[key]
    del d[key]
    return (key, value)

which now becomes:

    return d.popitem(key)

There is no magic or new code in the implementation -- it uses a few lines each from getitem, delitem, and popitem. If an argument is specified, the new code is run; otherwise, the existing code is run. This assures that the patch does not cause a performance penalty. The diff is about -3 lines and +25 lines. There are four sections: 1. Replacement code for dict_popitem in dictobject.c 2. Replacement docstring for popitem in dictobject.c 3. Replacement registration line for popitem in dictobject.c 4. Sample Python test code. 
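[Editor's note: the "usual Python code" quoted in the initial comment can be sketched as a standalone function; the name popitem_with_key is illustrative, not from the patch. Note that modern dicts offer d.pop(key), which serves the same use case but returns only the value.]

```python
def popitem_with_key(d, key):
    """Pure-Python equivalent of the proposed dict.popitem(key):
    fetch and delete the pair in one call (two hash lookups here,
    versus the single lookup of the C implementation)."""
    value = d[key]
    del d[key]
    return (key, value)

d = {"a": 1, "b": 2}
assert popitem_with_key(d, "a") == ("a", 1)
assert d == {"b": 2}

# dict.pop(key) covers the same need, returning just the value:
d2 = {"a": 1, "b": 2}
assert d2.pop("a") == 1
```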
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539949&group_id=5470 From noreply@sourceforge.net Fri Apr 5 20:11:12 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Fri, 05 Apr 2002 12:11:12 -0800 Subject: [Patches] [ python-Patches-539949 ] dict.popitem(key=None) Message-ID: Patches item #539949, was opened at 2002-04-05 14:38 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539949&group_id=5470 Category: Core (C code) Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Raymond Hettinger (rhettinger) Assigned to: Nobody/Anonymous (nobody) Summary: dict.popitem(key=None) Initial Comment: This patch implements the feature request at http://sourceforge.net/tracker/index.php? func=detail&aid=495086&group_id=5470&atid=355470 which asks for an optional argument to popitem so that it returns a key/value pair for a specified key or, if not specified, an arbitrary key. The benefit is in providing a fast, explicit way to retrieve and remove and particular key/value pair from a dictionary. By using only a single lookup, it is faster than the usual Python code: value = d[key] del d[key] return (key, value) which now becomes: return d.popitem(key) There is no magic or new code in the implementation -- it uses a few lines each from getitem, delitem, and popitem. If an argument is specified, the new code is run; otherwise, the existing code is run. This assures that the patch does not cause a performance penalty. The diff is about -3 lines and +25 lines. There are four sections: 1. Replacement code for dict_popitem in dictobject.c 2. Replacement docstring for popitem in dictobject.c 3. Replacement registration line for popitem in dictobject.c 4. Sample Python test code. 
---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-05 15:11 Message: Logged In: YES user_id=6380 Please upload a context or unified diff. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539949&group_id=5470 From noreply@sourceforge.net Fri Apr 5 21:10:35 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Fri, 05 Apr 2002 13:10:35 -0800 Subject: [Patches] [ python-Patches-539949 ] dict.popitem(key=None) Message-ID: Patches item #539949, was opened at 2002-04-05 19:38 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539949&group_id=5470 Category: Core (C code) Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Raymond Hettinger (rhettinger) Assigned to: Nobody/Anonymous (nobody) Summary: dict.popitem(key=None) Initial Comment: This patch implements the feature request at http://sourceforge.net/tracker/index.php? func=detail&aid=495086&group_id=5470&atid=355470 which asks for an optional argument to popitem so that it returns a key/value pair for a specified key or, if not specified, an arbitrary key. The benefit is in providing a fast, explicit way to retrieve and remove and particular key/value pair from a dictionary. By using only a single lookup, it is faster than the usual Python code: value = d[key] del d[key] return (key, value) which now becomes: return d.popitem(key) There is no magic or new code in the implementation -- it uses a few lines each from getitem, delitem, and popitem. If an argument is specified, the new code is run; otherwise, the existing code is run. This assures that the patch does not cause a performance penalty. The diff is about -3 lines and +25 lines. There are four sections: 1. Replacement code for dict_popitem in dictobject.c 2. 
Replacement docstring for popitem in dictobject.c 3. Replacement registration line for popitem in dictobject.c 4. Sample Python test code. ---------------------------------------------------------------------- >Comment By: Raymond Hettinger (rhettinger) Date: 2002-04-05 21:10 Message: Logged In: YES user_id=80475 Context diff uploaded at poppatch.c below. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-05 20:11 Message: Logged In: YES user_id=6380 Please upload a context or unified diff. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539949&group_id=5470 From noreply@sourceforge.net Fri Apr 5 21:26:07 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Fri, 05 Apr 2002 13:26:07 -0800 Subject: [Patches] [ python-Patches-539949 ] dict.popitem(key=None) Message-ID: Patches item #539949, was opened at 2002-04-05 14:38 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539949&group_id=5470 Category: Core (C code) Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Raymond Hettinger (rhettinger) Assigned to: Nobody/Anonymous (nobody) Summary: dict.popitem(key=None) Initial Comment: This patch implements the feature request at http://sourceforge.net/tracker/index.php? func=detail&aid=495086&group_id=5470&atid=355470 which asks for an optional argument to popitem so that it returns a key/value pair for a specified key or, if not specified, an arbitrary key. The benefit is in providing a fast, explicit way to retrieve and remove and particular key/value pair from a dictionary. 
By using only a single lookup, it is faster than the usual Python code: value = d[key] del d[key] return (key, value) which now becomes: return d.popitem(key) There is no magic or new code in the implementation -- it uses a few lines each from getitem, delitem, and popitem. If an argument is specified, the new code is run; otherwise, the existing code is run. This assures that the patch does not cause a performance penalty. The diff is about -3 lines and +25 lines. There are four sections: 1. Replacement code for dict_popitem in dictobject.c 2. Replacement docstring for popitem in dictobject.c 3. Replacement registration line for popitem in dictobject.c 4. Sample Python test code. ---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-05 16:26 Message: Logged In: YES user_id=6380 Now, if you could also upload a unittest and a doc patch, that would be great! ---------------------------------------------------------------------- Comment By: Raymond Hettinger (rhettinger) Date: 2002-04-05 16:10 Message: Logged In: YES user_id=80475 Context diff uploaded at poppatch.c below. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-05 15:11 Message: Logged In: YES user_id=6380 Please upload a context or unified diff. 
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539949&group_id=5470 From noreply@sourceforge.net Fri Apr 5 21:38:23 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Fri, 05 Apr 2002 13:38:23 -0800 Subject: [Patches] [ python-Patches-539949 ] dict.popitem(key=None) Message-ID: Patches item #539949, was opened at 2002-04-05 14:38 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539949&group_id=5470 Category: Core (C code) Group: Python 2.3 Status: Open >Resolution: Accepted Priority: 5 Submitted By: Raymond Hettinger (rhettinger) >Assigned to: Guido van Rossum (gvanrossum) Summary: dict.popitem(key=None) Initial Comment: This patch implements the feature request at http://sourceforge.net/tracker/index.php? func=detail&aid=495086&group_id=5470&atid=355470 which asks for an optional argument to popitem so that it returns a key/value pair for a specified key or, if not specified, an arbitrary key. The benefit is in providing a fast, explicit way to retrieve and remove and particular key/value pair from a dictionary. By using only a single lookup, it is faster than the usual Python code: value = d[key] del d[key] return (key, value) which now becomes: return d.popitem(key) There is no magic or new code in the implementation -- it uses a few lines each from getitem, delitem, and popitem. If an argument is specified, the new code is run; otherwise, the existing code is run. This assures that the patch does not cause a performance penalty. The diff is about -3 lines and +25 lines. There are four sections: 1. Replacement code for dict_popitem in dictobject.c 2. Replacement docstring for popitem in dictobject.c 3. Replacement registration line for popitem in dictobject.c 4. Sample Python test code. 
---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-05 16:38 Message: Logged In: YES user_id=6380 I've reviewed the patch and see only cosmetic things that need to be changed. I'll check it in as soon as you submit a unittest and doc patch. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-05 16:26 Message: Logged In: YES user_id=6380 Now, if you could also upload a unittest and a doc patch, that would be great! ---------------------------------------------------------------------- Comment By: Raymond Hettinger (rhettinger) Date: 2002-04-05 16:10 Message: Logged In: YES user_id=80475 Context diff uploaded at poppatch.c below. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-05 15:11 Message: Logged In: YES user_id=6380 Please upload a context or unified diff. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539949&group_id=5470 From noreply@sourceforge.net Fri Apr 5 21:47:18 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Fri, 05 Apr 2002 13:47:18 -0800 Subject: [Patches] [ python-Patches-539949 ] dict.popitem(key=None) Message-ID: Patches item #539949, was opened at 2002-04-05 14:38 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539949&group_id=5470 Category: Core (C code) Group: Python 2.3 Status: Open Resolution: Accepted Priority: 5 Submitted By: Raymond Hettinger (rhettinger) Assigned to: Guido van Rossum (gvanrossum) Summary: dict.popitem(key=None) Initial Comment: This patch implements the feature request at http://sourceforge.net/tracker/index.php? 
func=detail&aid=495086&group_id=5470&atid=355470 which asks for an optional argument to popitem so that it returns a key/value pair for a specified key or, if not specified, an arbitrary key. The benefit is in providing a fast, explicit way to retrieve and remove and particular key/value pair from a dictionary. By using only a single lookup, it is faster than the usual Python code: value = d[key] del d[key] return (key, value) which now becomes: return d.popitem(key) There is no magic or new code in the implementation -- it uses a few lines each from getitem, delitem, and popitem. If an argument is specified, the new code is run; otherwise, the existing code is run. This assures that the patch does not cause a performance penalty. The diff is about -3 lines and +25 lines. There are four sections: 1. Replacement code for dict_popitem in dictobject.c 2. Replacement docstring for popitem in dictobject.c 3. Replacement registration line for popitem in dictobject.c 4. Sample Python test code. ---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-05 16:47 Message: Logged In: YES user_id=6380 FYI, I'm uploading my version of the patch, with code cleanup, as popdict2.txt. I've moved the popitem-with-arg code before the allocation of res, because there were several places where this code returned NULL without DECREF'ing res. Repeating the PyTuple_New(2) call seemed the lesser evil. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-05 16:38 Message: Logged In: YES user_id=6380 I've reviewed the patch and see only cosmetic things that need to be changed. I'll check it in as soon as you submit a unittest and doc patch. 
----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2002-04-05 16:26
Message: Logged In: YES user_id=6380

Now, if you could also upload a unittest and a doc patch, that would be great!

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2002-04-05 16:10
Message: Logged In: YES user_id=80475

Context diff uploaded as poppatch.c below.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2002-04-05 15:11
Message: Logged In: YES user_id=6380

Please upload a context or unified diff.

----------------------------------------------------------------------

You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539949&group_id=5470

From noreply@sourceforge.net Fri Apr 5 21:50:14 2002
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Fri, 05 Apr 2002 13:50:14 -0800
Subject: [Patches] [ python-Patches-539949 ] dict.popitem(key=None)
Message-ID:

Patches item #539949, was opened at 2002-04-05 14:38
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539949&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
Resolution: Accepted
Priority: 5
Submitted By: Raymond Hettinger (rhettinger)
Assigned to: Guido van Rossum (gvanrossum)
Summary: dict.popitem(key=None)

Initial Comment:
This patch implements the feature request at
http://sourceforge.net/tracker/index.php?func=detail&aid=495086&group_id=5470&atid=355470
which asks for an optional argument to popitem so that it returns a key/value pair for a specified key or, if not specified, an arbitrary key.

The benefit is in providing a fast, explicit way to retrieve and remove a particular key/value pair from a dictionary. By using only a single lookup, it is faster than the usual Python code:

    value = d[key]
    del d[key]
    return (key, value)

which now becomes:

    return d.popitem(key)

There is no magic or new code in the implementation -- it uses a few lines each from getitem, delitem, and popitem. If an argument is specified, the new code is run; otherwise, the existing code is run. This assures that the patch does not cause a performance penalty.

The diff is about -3 lines and +25 lines. There are four sections:
1. Replacement code for dict_popitem in dictobject.c
2. Replacement docstring for popitem in dictobject.c
3. Replacement registration line for popitem in dictobject.c
4. Sample Python test code.

----------------------------------------------------------------------

>Comment By: Tim Peters (tim_one)
Date: 2002-04-05 16:50
Message: Logged In: YES user_id=31435

Are there examples of concrete use cases? The idea that dict.popitem(k) returns (k, dict[k]) seems kinda goofy, since you necessarily already have k. So the question is whether this is the function signature that's really desired, or whether it's too much of a hack.

As is, it slows down popitem() without an argument because it requires using a fancier calling sequence, and because it now defers that case to a taken branch; it's also much slower than a function that just returned v could be, due to the need to allocate a 2-tuple to hold a redundant copy of the key.

Perhaps there are use cases of the form k, v = dict.popitem(f(x, y, z)) where the key is known only implicitly?

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2002-04-05 16:47
Message: Logged In: YES user_id=6380

FYI, I'm uploading my version of the patch, with code cleanup, as popdict2.txt. I've moved the popitem-with-arg code before the allocation of res, because there were several places where this code returned NULL without DECREF'ing res. Repeating the PyTuple_New(2) call seemed the lesser evil.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2002-04-05 16:38
Message: Logged In: YES user_id=6380

I've reviewed the patch and see only cosmetic things that need to be changed. I'll check it in as soon as you submit a unittest and doc patch.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2002-04-05 16:26
Message: Logged In: YES user_id=6380

Now, if you could also upload a unittest and a doc patch, that would be great!

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2002-04-05 16:10
Message: Logged In: YES user_id=80475

Context diff uploaded as poppatch.c below.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2002-04-05 15:11
Message: Logged In: YES user_id=6380

Please upload a context or unified diff.

----------------------------------------------------------------------

You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539949&group_id=5470

From noreply@sourceforge.net Sat Apr 6 17:23:23 2002
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Sat, 06 Apr 2002 09:23:23 -0800
Subject: [Patches] [ python-Patches-539949 ] dict.popitem(key=None)
Message-ID:

Patches item #539949, was opened at 2002-04-05 19:38
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539949&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
Resolution: Accepted
Priority: 5
Submitted By: Raymond Hettinger (rhettinger)
Assigned to: Guido van Rossum (gvanrossum)
Summary: dict.popitem(key=None)

Initial Comment:
This patch implements the feature request at
http://sourceforge.net/tracker/index.php?func=detail&aid=495086&group_id=5470&atid=355470
which asks for an optional argument to popitem so that it returns a key/value pair for a specified key or, if not specified, an arbitrary key.

The benefit is in providing a fast, explicit way to retrieve and remove a particular key/value pair from a dictionary. By using only a single lookup, it is faster than the usual Python code:

    value = d[key]
    del d[key]
    return (key, value)

which now becomes:

    return d.popitem(key)

There is no magic or new code in the implementation -- it uses a few lines each from getitem, delitem, and popitem. If an argument is specified, the new code is run; otherwise, the existing code is run. This assures that the patch does not cause a performance penalty.

The diff is about -3 lines and +25 lines. There are four sections:
1. Replacement code for dict_popitem in dictobject.c
2. Replacement docstring for popitem in dictobject.c
3. Replacement registration line for popitem in dictobject.c
4. Sample Python test code.

----------------------------------------------------------------------

>Comment By: Raymond Hettinger (rhettinger)
Date: 2002-04-06 17:23
Message: Logged In: YES user_id=80475

Q: Does the new function signature slow the existing no-argument case?
A: Yes. The function is already so fast that the small overhead of PyArg_ParseTuple is measurable. My timing shows an 8% drop in speed.

Q: Is _, v = d.popitem(k) slower than v = d.popvalue(k)?
A: Yes. Though popvalue is a non-existent strawman, it would be quicker: it would cost two calls to Py_DECREF while saving a call to PyTuple_New and two calls to PyTuple_SET_ITEM. Still, the running time for popvalue would be dominated by the rest of the function and not the single malloc. Also, I think it unlikely that the dictionary interface would ever be expanded for popvalue, so the comparison is moot.

Q: Are there cases where (k, v) is needed?
A: Yes. One common case is where the tuple still needs to be formed to help build another dictionary: dict([d.popitem(k) for k in xferlist]) or [n.__setitem__(d.popitem(k)) for k in xferlist]. Also, it is useful when the key is computed by a function and then needs to be used in an expression. I often do something like that with setdefault: uniqInOrder = [u.setdefault(k, k) for k in alist if k not in u]. Also, when the key is computed by a function, it may need to be saved only when .popitem succeeds but not when the key is missing: "get and remove key if present; trigger exception if absent". This pattern is used in validating user input keys for deletion.

Q: Where is the unittest and doc patch?
A: Coming this weekend.

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2002-04-05 21:50
Message: Logged In: YES user_id=31435

Are there examples of concrete use cases? The idea that dict.popitem(k) returns (k, dict[k]) seems kinda goofy, since you necessarily already have k. So the question is whether this is the function signature that's really desired, or whether it's too much of a hack.

As is, it slows down popitem() without an argument because it requires using a fancier calling sequence, and because it now defers that case to a taken branch; it's also much slower than a function that just returned v could be, due to the need to allocate a 2-tuple to hold a redundant copy of the key.

Perhaps there are use cases of the form k, v = dict.popitem(f(x, y, z)) where the key is known only implicitly?

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2002-04-05 21:47
Message: Logged In: YES user_id=6380

FYI, I'm uploading my version of the patch, with code cleanup, as popdict2.txt. I've moved the popitem-with-arg code before the allocation of res, because there were several places where this code returned NULL without DECREF'ing res.
Repeating the PyTuple_New(2) call seemed the lesser evil.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2002-04-05 21:38
Message: Logged In: YES user_id=6380

I've reviewed the patch and see only cosmetic things that need to be changed. I'll check it in as soon as you submit a unittest and doc patch.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2002-04-05 21:26
Message: Logged In: YES user_id=6380

Now, if you could also upload a unittest and a doc patch, that would be great!

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2002-04-05 21:10
Message: Logged In: YES user_id=80475

Context diff uploaded as poppatch.c below.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2002-04-05 20:11
Message: Logged In: YES user_id=6380

Please upload a context or unified diff.

----------------------------------------------------------------------

You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539949&group_id=5470

From noreply@sourceforge.net Sun Apr 7 01:07:00 2002
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Sat, 06 Apr 2002 17:07:00 -0800
Subject: [Patches] [ python-Patches-540394 ] Remove PyMalloc_* symbols
Message-ID:

Patches item #540394, was opened at 2002-04-07 01:07
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=540394&group_id=5470

Category: Core (C code)
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Neil Schemenauer (nascheme)
Assigned to: Tim Peters (tim_one)
Summary: Remove PyMalloc_* symbols

Initial Comment:
This patch removes all PyMalloc_* symbols from the source. obmalloc now implements PyObject_{Malloc, Realloc, Free}. PyObject_{New,NewVar} allocate using pymalloc. I also changed PyObject_Del and PyObject_GC_Del so that they can be used as function designators. Is changing the signature of PyObject_Del going to cause any problems? I had to add some extra typecasts when assigning to tp_free.

Please review and assign back to me. The next phase would be to clean up the memory API usage. Do we want to replace all PyObject_Del calls with PyObject_Free? PyObject_Del seems to match better with PyObject_GC_Del. Oh yes, we also need to change PyMem_{Free, Del, ...} to use pymalloc's free.

----------------------------------------------------------------------

You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=540394&group_id=5470

From noreply@sourceforge.net Sun Apr 7 01:07:38 2002
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Sat, 06 Apr 2002 17:07:38 -0800
Subject: [Patches] [ python-Patches-536909 ] pymalloc for types and other cleanups
Message-ID:

Patches item #536909, was opened at 2002-03-29 21:11
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=536909&group_id=5470

Category: Core (C code)
Group: Python 2.3
>Status: Deleted
>Resolution: Out of Date
Priority: 5
Submitted By: Neil Schemenauer (nascheme)
Assigned to: Neil Schemenauer (nascheme)
Summary: pymalloc for types and other cleanups

Initial Comment:
This patch changes typeobject to use pymalloc for managing the memory of subclassable types. It also fixes a bug that caused an interpreter built without GC to crash.

Testing this patch was a bitch. There are three knobs related to MM now (with-cycle-gc, with-pymalloc, and PYMALLOC_DEBUG). I think I found different bugs when testing with each possible combination.

There's one bit of ugliness in this patch. Extension module writers have to use _PyMalloc_Del to initialize the tp_free pointer. There should be a "public" function for that.

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2002-03-31 07:11
Message: Logged In: YES user_id=31435

Neil, I appreciate the work! I'm afraid I screwed you at the same time. How do you want to proceed? I think "the plan" now is that we go back to the PyObject_XXX interface, and when pymalloc is enabled map most flavors of "free memory" (Py{Mem, Object}_{Del, DEL, Free, FREE}) to the pymalloc free. You're not required to work on this, but if you've got some spare energy I could sure use the help.

----------------------------------------------------------------------

Comment By: Neil Schemenauer (nascheme)
Date: 2002-03-29 23:09
Message: Logged In: YES user_id=35752

I'm counting on Tim to finish PyMem_NukeIt.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2002-03-29 22:47
Message: Logged In: YES user_id=21627

I see another memory allocation family here: what function should objects allocated through PyType_GenericAlloc be released with? If you change the behaviour of PyType_GenericAlloc, all types in extensions written for 2.2 that use PyType_GenericAlloc will break, since they will still have PyObject_Del in their tp_free slot. I believe "families" should always be complete, so along with PyType_GenericAlloc goes PyType_GenericFree. If you want it fully backwards compatible, you need to introduce PyType_PyMallocAlloc...

----------------------------------------------------------------------

You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=536909&group_id=5470

From noreply@sourceforge.net Sun Apr 7 01:08:38 2002
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Sat, 06 Apr 2002 17:08:38 -0800
Subject: [Patches] [ python-Patches-539949 ] dict.popitem(key=None)
Message-ID:

Patches item #539949, was opened at 2002-04-05 19:38
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539949&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
Resolution: Accepted
Priority: 5
Submitted By: Raymond Hettinger (rhettinger)
Assigned to: Guido van Rossum (gvanrossum)
Summary: dict.popitem(key=None)

Initial Comment:
This patch implements the feature request at
http://sourceforge.net/tracker/index.php?func=detail&aid=495086&group_id=5470&atid=355470
which asks for an optional argument to popitem so that it returns a key/value pair for a specified key or, if not specified, an arbitrary key.

The benefit is in providing a fast, explicit way to retrieve and remove a particular key/value pair from a dictionary. By using only a single lookup, it is faster than the usual Python code:

    value = d[key]
    del d[key]
    return (key, value)

which now becomes:

    return d.popitem(key)

There is no magic or new code in the implementation -- it uses a few lines each from getitem, delitem, and popitem. If an argument is specified, the new code is run; otherwise, the existing code is run. This assures that the patch does not cause a performance penalty.

The diff is about -3 lines and +25 lines. There are four sections:
1. Replacement code for dict_popitem in dictobject.c
2. Replacement docstring for popitem in dictobject.c
3. Replacement registration line for popitem in dictobject.c
4. Sample Python test code.
----------------------------------------------------------------------

>Comment By: Raymond Hettinger (rhettinger)
Date: 2002-04-07 01:08
Message: Logged In: YES user_id=80475

The tests and documentation patches have been added.

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2002-04-06 17:23
Message: Logged In: YES user_id=80475

Q: Does the new function signature slow the existing no-argument case?
A: Yes. The function is already so fast that the small overhead of PyArg_ParseTuple is measurable. My timing shows an 8% drop in speed.

Q: Is _, v = d.popitem(k) slower than v = d.popvalue(k)?
A: Yes. Though popvalue is a non-existent strawman, it would be quicker: it would cost two calls to Py_DECREF while saving a call to PyTuple_New and two calls to PyTuple_SET_ITEM. Still, the running time for popvalue would be dominated by the rest of the function and not the single malloc. Also, I think it unlikely that the dictionary interface would ever be expanded for popvalue, so the comparison is moot.

Q: Are there cases where (k, v) is needed?
A: Yes. One common case is where the tuple still needs to be formed to help build another dictionary: dict([d.popitem(k) for k in xferlist]) or [n.__setitem__(d.popitem(k)) for k in xferlist]. Also, it is useful when the key is computed by a function and then needs to be used in an expression. I often do something like that with setdefault: uniqInOrder = [u.setdefault(k, k) for k in alist if k not in u]. Also, when the key is computed by a function, it may need to be saved only when .popitem succeeds but not when the key is missing: "get and remove key if present; trigger exception if absent". This pattern is used in validating user input keys for deletion.

Q: Where is the unittest and doc patch?
A: Coming this weekend.
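[Editorial sketch: the setdefault idiom Raymond mentions runs unchanged in ordinary Python and illustrates the "key used inside an expression" pattern; `uniq_in_order` is just his one-liner wrapped in a hypothetical function for testing.]

```python
def uniq_in_order(alist):
    # Keep the first occurrence of each item, preserving input order.
    # The dict u records items already seen; setdefault(k, k) returns
    # the stored key (identical to k here) while also inserting it.
    # The `if k not in u` test runs before setdefault for each k.
    u = {}
    return [u.setdefault(k, k) for k in alist if k not in u]

result = uniq_in_order(['b', 'a', 'b', 'c', 'a'])   # ['b', 'a', 'c']
```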
----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2002-04-05 21:50
Message: Logged In: YES user_id=31435

Are there examples of concrete use cases? The idea that dict.popitem(k) returns (k, dict[k]) seems kinda goofy, since you necessarily already have k. So the question is whether this is the function signature that's really desired, or whether it's too much of a hack.

As is, it slows down popitem() without an argument because it requires using a fancier calling sequence, and because it now defers that case to a taken branch; it's also much slower than a function that just returned v could be, due to the need to allocate a 2-tuple to hold a redundant copy of the key.

Perhaps there are use cases of the form k, v = dict.popitem(f(x, y, z)) where the key is known only implicitly?

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2002-04-05 21:47
Message: Logged In: YES user_id=6380

FYI, I'm uploading my version of the patch, with code cleanup, as popdict2.txt. I've moved the popitem-with-arg code before the allocation of res, because there were several places where this code returned NULL without DECREF'ing res. Repeating the PyTuple_New(2) call seemed the lesser evil.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2002-04-05 21:38
Message: Logged In: YES user_id=6380

I've reviewed the patch and see only cosmetic things that need to be changed. I'll check it in as soon as you submit a unittest and doc patch.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2002-04-05 21:26
Message: Logged In: YES user_id=6380

Now, if you could also upload a unittest and a doc patch, that would be great!

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2002-04-05 21:10
Message: Logged In: YES user_id=80475

Context diff uploaded as poppatch.c below.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2002-04-05 20:11
Message: Logged In: YES user_id=6380

Please upload a context or unified diff.

----------------------------------------------------------------------

You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539949&group_id=5470

From noreply@sourceforge.net Sun Apr 7 01:14:55 2002
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Sat, 06 Apr 2002 17:14:55 -0800
Subject: [Patches] [ python-Patches-539949 ] dict.popitem(key=None)
Message-ID:

Patches item #539949, was opened at 2002-04-05 19:38
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539949&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
Resolution: Accepted
Priority: 5
Submitted By: Raymond Hettinger (rhettinger)
Assigned to: Guido van Rossum (gvanrossum)
Summary: dict.popitem(key=None)

Initial Comment:
This patch implements the feature request at
http://sourceforge.net/tracker/index.php?func=detail&aid=495086&group_id=5470&atid=355470
which asks for an optional argument to popitem so that it returns a key/value pair for a specified key or, if not specified, an arbitrary key.

The benefit is in providing a fast, explicit way to retrieve and remove a particular key/value pair from a dictionary. By using only a single lookup, it is faster than the usual Python code:

    value = d[key]
    del d[key]
    return (key, value)

which now becomes:

    return d.popitem(key)

There is no magic or new code in the implementation -- it uses a few lines each from getitem, delitem, and popitem. If an argument is specified, the new code is run; otherwise, the existing code is run. This assures that the patch does not cause a performance penalty.

The diff is about -3 lines and +25 lines. There are four sections:
1. Replacement code for dict_popitem in dictobject.c
2. Replacement docstring for popitem in dictobject.c
3. Replacement registration line for popitem in dictobject.c
4. Sample Python test code.

----------------------------------------------------------------------

>Comment By: Neil Schemenauer (nascheme)
Date: 2002-04-07 01:14
Message: Logged In: YES user_id=35752

I think this should be implemented as pop() instead:

    D.pop([key]) -> value -- remove and return value by key (default a random value)

It makes no sense to return the key when you already have it. pop() also matches well with list pop():

    L.pop([index]) -> item -- remove and return item at index (default last)

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2002-04-07 01:08
Message: Logged In: YES user_id=80475

The tests and documentation patches have been added.

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2002-04-06 17:23
Message: Logged In: YES user_id=80475

Q: Does the new function signature slow the existing no-argument case?
A: Yes. The function is already so fast that the small overhead of PyArg_ParseTuple is measurable. My timing shows an 8% drop in speed.

Q: Is _, v = d.popitem(k) slower than v = d.popvalue(k)?
A: Yes. Though popvalue is a non-existent strawman, it would be quicker: it would cost two calls to Py_DECREF while saving a call to PyTuple_New and two calls to PyTuple_SET_ITEM. Still, the running time for popvalue would be dominated by the rest of the function and not the single malloc. Also, I think it unlikely that the dictionary interface would ever be expanded for popvalue, so the comparison is moot.

Q: Are there cases where (k, v) is needed?
A: Yes. One common case is where the tuple still needs to be formed to help build another dictionary: dict([d.popitem(k) for k in xferlist]) or [n.__setitem__(d.popitem(k)) for k in xferlist]. Also, it is useful when the key is computed by a function and then needs to be used in an expression. I often do something like that with setdefault: uniqInOrder = [u.setdefault(k, k) for k in alist if k not in u]. Also, when the key is computed by a function, it may need to be saved only when .popitem succeeds but not when the key is missing: "get and remove key if present; trigger exception if absent". This pattern is used in validating user input keys for deletion.

Q: Where is the unittest and doc patch?
A: Coming this weekend.

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2002-04-05 21:50
Message: Logged In: YES user_id=31435

Are there examples of concrete use cases? The idea that dict.popitem(k) returns (k, dict[k]) seems kinda goofy, since you necessarily already have k. So the question is whether this is the function signature that's really desired, or whether it's too much of a hack.

As is, it slows down popitem() without an argument because it requires using a fancier calling sequence, and because it now defers that case to a taken branch; it's also much slower than a function that just returned v could be, due to the need to allocate a 2-tuple to hold a redundant copy of the key.

Perhaps there are use cases of the form k, v = dict.popitem(f(x, y, z)) where the key is known only implicitly?

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2002-04-05 21:47
Message: Logged In: YES user_id=6380

FYI, I'm uploading my version of the patch, with code cleanup, as popdict2.txt. I've moved the popitem-with-arg code before the allocation of res, because there were several places where this code returned NULL without DECREF'ing res. Repeating the PyTuple_New(2) call seemed the lesser evil.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2002-04-05 21:38
Message: Logged In: YES user_id=6380

I've reviewed the patch and see only cosmetic things that need to be changed. I'll check it in as soon as you submit a unittest and doc patch.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2002-04-05 21:26
Message: Logged In: YES user_id=6380

Now, if you could also upload a unittest and a doc patch, that would be great!

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2002-04-05 21:10
Message: Logged In: YES user_id=80475

Context diff uploaded as poppatch.c below.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2002-04-05 20:11
Message: Logged In: YES user_id=6380

Please upload a context or unified diff.

----------------------------------------------------------------------

You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539949&group_id=5470

From noreply@sourceforge.net Sun Apr 7 01:16:55 2002
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Sat, 06 Apr 2002 17:16:55 -0800
Subject: [Patches] [ python-Patches-539949 ] dict.popitem(key=None)
Message-ID:

Patches item #539949, was opened at 2002-04-05 14:38
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539949&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
>Resolution: Remind
Priority: 5
Submitted By: Raymond Hettinger (rhettinger)
Assigned to: Guido van Rossum (gvanrossum)
Summary: dict.popitem(key=None)

Initial Comment:
This patch implements the feature request at
http://sourceforge.net/tracker/index.php?func=detail&aid=495086&group_id=5470&atid=355470
which asks for an optional argument to popitem so that it returns a key/value pair for a specified key or, if not specified, an arbitrary key.

The benefit is in providing a fast, explicit way to retrieve and remove a particular key/value pair from a dictionary. By using only a single lookup, it is faster than the usual Python code:

    value = d[key]
    del d[key]
    return (key, value)

which now becomes:

    return d.popitem(key)

There is no magic or new code in the implementation -- it uses a few lines each from getitem, delitem, and popitem. If an argument is specified, the new code is run; otherwise, the existing code is run. This assures that the patch does not cause a performance penalty.

The diff is about -3 lines and +25 lines. There are four sections:
1. Replacement code for dict_popitem in dictobject.c
2. Replacement docstring for popitem in dictobject.c
3. Replacement registration line for popitem in dictobject.c
4. Sample Python test code.

----------------------------------------------------------------------

>Comment By: Guido van Rossum (gvanrossum)
Date: 2002-04-06 20:16
Message: Logged In: YES user_id=6380

Not a bad idea, Neil! Care to work the code around to implement that?

----------------------------------------------------------------------

Comment By: Neil Schemenauer (nascheme)
Date: 2002-04-06 20:14
Message: Logged In: YES user_id=35752

I think this should be implemented as pop() instead:

    D.pop([key]) -> value -- remove and return value by key (default a random value)

It makes no sense to return the key when you already have it. pop() also matches well with list pop():

    L.pop([index]) -> item -- remove and return item at index (default last)

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2002-04-06 20:08
Message: Logged In: YES user_id=80475

The tests and documentation patches have been added.
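[Editorial sketch: Neil's proposed signature, quoted in the comments above, can likewise be modeled as a pure-Python helper. `dict_pop` is hypothetical; at the time of this thread no dict.pop method existed.]

```python
def dict_pop(d, *args):
    """Sketch of the proposed D.pop([key]) -> value.

    With a key: remove it and return its value, raising KeyError if
    the key is absent.  With no argument: remove and return an
    arbitrary value, mirroring popitem() but discarding the key.
    """
    if not args:
        return d.popitem()[1]
    (key,) = args
    value = d[key]      # raises KeyError if the key is missing
    del d[key]
    return value

d = {'a': 1, 'b': 2}
v = dict_pop(d, 'b')   # 2; d is now {'a': 1}
```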
---------------------------------------------------------------------- Comment By: Raymond Hettinger (rhettinger) Date: 2002-04-06 12:23 Message: Logged In: YES user_id=80475 Q: Does the new function signature slow the existing no argument case? A: Yes. The function is already so fast, that the small overhead of PyArg_ParseTuple is measurable. My timing shows a 8% drop in speed. Q: Is _,v=d.popitem(k) slower than v=d.popvalue(k)? A: Yes. Though popvalue is a non-existing strawman, it would be quicker: it would cost two calls to Py_DECREF while saving a call to PyTuple_New and two calls to PyTuple_SET_ITEM. Still, the running time for popvalue would be dominated by the rest of the function and not the single malloc. Also, I think it unlikely that the dictionary interface would ever be expanded for popvalue, so the comparison is moot. Q: Are there cases where (k,v) is needed? A: Yes. One common case is where the tuple still needs to be formed to help build another dictionary: dict([d.popitem(k) for k in xferlist]) or [n.__setitem__(d.popitem(k)) for k in xferlist]. Also, it is useful when the key is computed by a function and then needs to be used in an expression. I often do something like that with setdefault: uniqInOrder= [u.setdefault(k,k) for k in alist if k not in u]. Also, when the key is computed by a function, it may need to be saved only when .popitem succeeds but not when the key is missing: "get and remove key if present; trigger exception if absent" This pattern is used in validating user input keys for deletion. Q: Where is the unittest and doc patch? A: Coming this weekend. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-04-05 16:50 Message: Logged In: YES user_id=31435 Are there examples of concrete use cases? The idea that dict.popitem(k) returns (k, dict[k]) seems kinda goofy, since you necessarily already have k. 
So the question is whether this is the function signature that's really desired, or whether it's too much a hack. As is, it slows down popitem() without an argument because it requires using a fancier calling sequence, and because it now defers that case to a taken branch; it's also much slower than a function that just returned v could be, due to the need to allocate a 2-tuple to hold a redundant copy of the key. Perhaps there are use cases of the form k, v = dict.popitem(f(x, y, z)) where the key is known only implicitly? ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-05 16:47 Message: Logged In: YES user_id=6380 FYI, I'm uploading my version of the patch, with code cleanup, as popdict2.txt. I've moved the popitem-with-arg code before the allocation of res, because there were several places where this code returned NULL without DECREF'ing res. Repeating the PyTuple_New(2) call seemed the lesser evil. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-05 16:38 Message: Logged In: YES user_id=6380 I've reviewed the patch and see only cosmetic things that need to be changed. I'll check it in as soon as you submit a unittest and doc patch. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-05 16:26 Message: Logged In: YES user_id=6380 Now, if you could also upload a unittest and a doc patch, that would be great! ---------------------------------------------------------------------- Comment By: Raymond Hettinger (rhettinger) Date: 2002-04-05 16:10 Message: Logged In: YES user_id=80475 Context diff uploaded at poppatch.c below. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-05 15:11 Message: Logged In: YES user_id=6380 Please upload a context or unified diff. 
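The signature under debate above can be sketched in pure Python. This is illustrative only (the actual patch does this in C inside dict_popitem, with a single hash lookup rather than two); the helper name is hypothetical:

```python
def popitem_with_key(d, key=None):
    """Pure-Python model of the proposed dict.popitem([key]).

    With no argument it behaves like today's popitem(), removing and
    returning an arbitrary (key, value) pair.  With a key argument it
    removes and returns that specific pair, raising KeyError if the
    key is absent.
    """
    if key is None:
        return d.popitem()   # arbitrary pair, unchanged behavior
    value = d[key]           # raises KeyError if the key is missing
    del d[key]
    return (key, value)
```

For example, `popitem_with_key({"a": 1, "b": 2}, "a")` returns `("a", 1)` and leaves only `"b"` in the dict, which is exactly the `value = d[key]; del d[key]; return (key, value)` idiom from the initial comment collapsed into one call.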
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539949&group_id=5470 From noreply@sourceforge.net Sun Apr 7 01:51:45 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sat, 06 Apr 2002 17:51:45 -0800 Subject: [Patches] [ python-Patches-539949 ] dict.popitem(key=None) Message-ID: Patches item #539949, was opened at 2002-04-05 19:38 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539949&group_id=5470 Category: Core (C code) Group: Python 2.3 Status: Open Resolution: Remind Priority: 5 Submitted By: Raymond Hettinger (rhettinger) Assigned to: Guido van Rossum (gvanrossum) Summary: dict.popitem(key=None) Initial Comment: This patch implements the feature request at http://sourceforge.net/tracker/index.php? func=detail&aid=495086&group_id=5470&atid=355470 which asks for an optional argument to popitem so that it returns a key/value pair for a specified key or, if not specified, an arbitrary key. The benefit is in providing a fast, explicit way to retrieve and remove a particular key/value pair from a dictionary. By using only a single lookup, it is faster than the usual Python code: value = d[key] del d[key] return (key, value) which now becomes: return d.popitem(key) There is no magic or new code in the implementation -- it uses a few lines each from getitem, delitem, and popitem. If an argument is specified, the new code is run; otherwise, the existing code is run. This ensures that the patch does not cause a performance penalty. The diff is about -3 lines and +25 lines. There are four sections: 1. Replacement code for dict_popitem in dictobject.c 2. Replacement docstring for popitem in dictobject.c 3. Replacement registration line for popitem in dictobject.c 4. Sample Python test code. 
---------------------------------------------------------------------- >Comment By: Neil Schemenauer (nascheme) Date: 2002-04-07 01:51 Message: Logged In: YES user_id=35752 Here's a quick implementation. D.pop() is not as efficient as it could be (it uses popitem and then promptly deallocates the item tuple). I'm not sure it matters though. Someone should probably check the refcounts. I always screw them up. :-) ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-07 01:16 Message: Logged In: YES user_id=6380 Not a bad idea, Neil! Care to work the code around to implement that? ---------------------------------------------------------------------- Comment By: Neil Schemenauer (nascheme) Date: 2002-04-07 01:14 Message: Logged In: YES user_id=35752 I think this should be implemented as pop() instead: D.pop([key]) -> value -- remove and return value by key (default a random value) It makes no sense to return the key when you already have it. pop() also matches well with list pop(): L.pop([index]) -> item -- remove and return item at index (default last) ---------------------------------------------------------------------- Comment By: Raymond Hettinger (rhettinger) Date: 2002-04-07 01:08 Message: Logged In: YES user_id=80475 The tests and documentation patches have been added. ---------------------------------------------------------------------- Comment By: Raymond Hettinger (rhettinger) Date: 2002-04-06 17:23 Message: Logged In: YES user_id=80475 Q: Does the new function signature slow the existing no-argument case? A: Yes. The function is already so fast that the small overhead of PyArg_ParseTuple is measurable. My timing shows an 8% drop in speed. Q: Is _,v=d.popitem(k) slower than v=d.popvalue(k)? A: Yes. Though popvalue is a non-existent strawman, it would be quicker: it would cost two calls to Py_DECREF while saving a call to PyTuple_New and two calls to PyTuple_SET_ITEM. 
Still, the running time for popvalue would be dominated by the rest of the function and not the single malloc. Also, I think it unlikely that the dictionary interface would ever be expanded for popvalue, so the comparison is moot. Q: Are there cases where (k,v) is needed? A: Yes. One common case is where the tuple still needs to be formed to help build another dictionary: dict([d.popitem(k) for k in xferlist]) or [n.__setitem__(d.popitem(k)) for k in xferlist]. Also, it is useful when the key is computed by a function and then needs to be used in an expression. I often do something like that with setdefault: uniqInOrder= [u.setdefault(k,k) for k in alist if k not in u]. Also, when the key is computed by a function, it may need to be saved only when .popitem succeeds but not when the key is missing: "get and remove key if present; trigger exception if absent" This pattern is used in validating user input keys for deletion. Q: Where is the unittest and doc patch? A: Coming this weekend. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-04-05 21:50 Message: Logged In: YES user_id=31435 Are there examples of concrete use cases? The idea that dict.popitem(k) returns (k, dict[k]) seems kinda goofy, since you necessarily already have k. So the question is whether this is the function signature that's really desired, or whether it's too much a hack. As is, it slows down popitem() without an argument because it requires using a fancier calling sequence, and because it now defers that case to a taken branch; it's also much slower than a function that just returned v could be, due to the need to allocate a 2-tuple to hold a redundant copy of the key. Perhaps there are use cases of the form k, v = dict.popitem(f(x, y, z)) where the key is known only implicitly? 
---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-05 21:47 Message: Logged In: YES user_id=6380 FYI, I'm uploading my version of the patch, with code cleanup, as popdict2.txt. I've moved the popitem-with-arg code before the allocation of res, because there were several places where this code returned NULL without DECREF'ing res. Repeating the PyTuple_New(2) call seemed the lesser evil. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-05 21:38 Message: Logged In: YES user_id=6380 I've reviewed the patch and see only cosmetic things that need to be changed. I'll check it in as soon as you submit a unittest and doc patch. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-05 21:26 Message: Logged In: YES user_id=6380 Now, if you could also upload a unittest and a doc patch, that would be great! ---------------------------------------------------------------------- Comment By: Raymond Hettinger (rhettinger) Date: 2002-04-05 21:10 Message: Logged In: YES user_id=80475 Context diff uploaded at poppatch.c below. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-05 20:11 Message: Logged In: YES user_id=6380 Please upload a context or unified diff. 
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539949&group_id=5470 From noreply@sourceforge.net Sun Apr 7 03:40:56 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sat, 06 Apr 2002 18:40:56 -0800 Subject: [Patches] [ python-Patches-540394 ] Remove PyMalloc_* symbols Message-ID: Patches item #540394, was opened at 2002-04-06 20:07 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=540394&group_id=5470 Category: Core (C code) Group: None Status: Open >Resolution: Accepted Priority: 5 Submitted By: Neil Schemenauer (nascheme) >Assigned to: Neil Schemenauer (nascheme) Summary: Remove PyMalloc_* symbols Initial Comment: This patch removes all PyMalloc_* symbols from the source. obmalloc now implements PyObject_{Malloc, Realloc, Free}. PyObject_{New,NewVar} allocate using pymalloc. I also changed PyObject_Del and PyObject_GC_Del so that they can be used as function designators. Is changing the signature of PyObject_Del going to cause any problems? I had to add some extra typecasts when assigning to tp_free. Please review and assign back to me. The next phase would be to clean up the memory API usage. Do we want to replace all PyObject_Del calls with PyObject_Free? PyObject_Del seems to match better with PyObject_GC_Del. Oh yes, we also need to change PyMem_{Free, Del, ...} to use pymalloc's free. ---------------------------------------------------------------------- >Comment By: Tim Peters (tim_one) Date: 2002-04-06 21:40 Message: Logged In: YES user_id=31435 Looks good to me -- thanks! 
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=540394&group_id=5470 From noreply@sourceforge.net Sun Apr 7 03:41:27 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sat, 06 Apr 2002 18:41:27 -0800 Subject: [Patches] [ python-Patches-540394 ] Remove PyMalloc_* symbols Message-ID: Patches item #540394, was opened at 2002-04-06 20:07 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=540394&group_id=5470 Category: Core (C code) Group: None Status: Open Resolution: Accepted Priority: 5 Submitted By: Neil Schemenauer (nascheme) Assigned to: Neil Schemenauer (nascheme) Summary: Remove PyMalloc_* symbols Initial Comment: This patch removes all PyMalloc_* symbols from the source. obmalloc now implements PyObject_{Malloc, Realloc, Free}. PyObject_{New,NewVar} allocate using pymalloc. I also changed PyObject_Del and PyObject_GC_Del so that they can be used as function designators. Is changing the signature of PyObject_Del going to cause any problems? I had to add some extra typecasts when assigning to tp_free. Please review and assign back to me. The next phase would be to clean up the memory API usage. Do we want to replace all PyObject_Del calls with PyObject_Free? PyObject_Del seems to match better with PyObject_GC_Del. Oh yes, we also need to change PyMem_{Free, Del, ...} to use pymalloc's free. ---------------------------------------------------------------------- >Comment By: Tim Peters (tim_one) Date: 2002-04-06 21:41 Message: Logged In: YES user_id=31435 Oops -- I hit "Submit" prematurely. More to come. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-04-06 21:40 Message: Logged In: YES user_id=31435 Looks good to me -- thanks! 
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=540394&group_id=5470 From noreply@sourceforge.net Sun Apr 7 03:59:13 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sat, 06 Apr 2002 18:59:13 -0800 Subject: [Patches] [ python-Patches-540394 ] Remove PyMalloc_* symbols Message-ID: Patches item #540394, was opened at 2002-04-06 20:07 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=540394&group_id=5470 Category: Core (C code) Group: None Status: Open Resolution: Accepted Priority: 5 Submitted By: Neil Schemenauer (nascheme) >Assigned to: Guido van Rossum (gvanrossum) Summary: Remove PyMalloc_* symbols Initial Comment: This patch removes all PyMalloc_* symbols from the source. obmalloc now implements PyObject_{Malloc, Realloc, Free}. PyObject_{New,NewVar} allocate using pymalloc. I also changed PyObject_Del and PyObject_GC_Del so that they can be used as function designators. Is changing the signature of PyObject_Del going to cause any problems? I had to add some extra typecasts when assigning to tp_free. Please review and assign back to me. The next phase would be to clean up the memory API usage. Do we want to replace all PyObject_Del calls with PyObject_Free? PyObject_Del seems to match better with PyObject_GC_Del. Oh yes, we also need to change PyMem_{Free, Del, ...} to use pymalloc's free. ---------------------------------------------------------------------- >Comment By: Tim Peters (tim_one) Date: 2002-04-06 21:59 Message: Logged In: YES user_id=31435 Extensions that *currently* call PyObject_Del have its old macro expansion ("_PyObject_Del((PyObject *)(op))") buried in them, so getting rid of _PyObject_Del is a binary-API incompatibility (existing extensions will no longer link without recompilation). 
I personally don't mind that, but I run on Windows and "binary compatibility" never works there across minor releases for other reasons, so I don't have any real feel for how much people on other platforms value it. As you pointed out recently too, binary compatibility has, in reality, not been the case since 1.5.2 anyway. So that's one for Python-Dev. If we do break binary compatibility, I'd be sorely tempted to change the "destructor" typedef to say destructors take void*. IMO saying they take PyObject* was a poor idea, as you almost never have a PyObject* when calling one of these guys. That's why PyObject_Del "had to" be a macro, to hide the cast to PyObject* almost everyone needs because of destructor's "correct" but impractical signature. If "destructor" had a practical signature, there would have been no temptation to use a macro. Note that if the typedef of destructor were so changed, you wouldn't have needed new casts in tp_free slots. And I'd rather break binary compatibility than make extension authors add new casts. Hmm. I'm assigning this to Guido for comment: Guido, what are your feelings about binary compatibility here? C didn't define free() as taking a void* by mistake. Back to Neil: I wouldn't bother changing PyObject_Del to PyObject_Free. The former isn't in the "recommended" minimal API, but neither is it discouraged. I expect TMTOWTDI here forever. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-04-06 21:41 Message: Logged In: YES user_id=31435 Oops -- I hit "Submit" prematurely. More to come. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-04-06 21:40 Message: Logged In: YES user_id=31435 Looks good to me -- thanks! 
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=540394&group_id=5470 From noreply@sourceforge.net Sun Apr 7 04:00:08 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sat, 06 Apr 2002 19:00:08 -0800 Subject: [Patches] [ python-Patches-539949 ] dict.popitem(key=None) Message-ID: Patches item #539949, was opened at 2002-04-05 19:38 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539949&group_id=5470 Category: Core (C code) Group: Python 2.3 Status: Open Resolution: Remind Priority: 5 Submitted By: Raymond Hettinger (rhettinger) Assigned to: Guido van Rossum (gvanrossum) Summary: dict.popitem(key=None) Initial Comment: This patch implements the feature request at http://sourceforge.net/tracker/index.php? func=detail&aid=495086&group_id=5470&atid=355470 which asks for an optional argument to popitem so that it returns a key/value pair for a specified key or, if not specified, an arbitrary key. The benefit is in providing a fast, explicit way to retrieve and remove a particular key/value pair from a dictionary. By using only a single lookup, it is faster than the usual Python code: value = d[key] del d[key] return (key, value) which now becomes: return d.popitem(key) There is no magic or new code in the implementation -- it uses a few lines each from getitem, delitem, and popitem. If an argument is specified, the new code is run; otherwise, the existing code is run. This ensures that the patch does not cause a performance penalty. The diff is about -3 lines and +25 lines. There are four sections: 1. Replacement code for dict_popitem in dictobject.c 2. Replacement docstring for popitem in dictobject.c 3. Replacement registration line for popitem in dictobject.c 4. Sample Python test code. 
---------------------------------------------------------------------- >Comment By: Raymond Hettinger (rhettinger) Date: 2002-04-07 03:00 Message: Logged In: YES user_id=80475 Here's a more fleshed-out implementation of D.pop(). It doesn't rely on popitem(), doesn't malloc a tuple, and the refcounts should be correct. One change from Neil's version: since k isn't being returned, an arbitrary pair doesn't make sense, so the key argument to pop is required rather than optional. The diff is off of 2.123. ---------------------------------------------------------------------- Comment By: Neil Schemenauer (nascheme) Date: 2002-04-07 01:51 Message: Logged In: YES user_id=35752 Here's a quick implementation. D.pop() is not as efficient as it could be (it uses popitem and then promptly deallocates the item tuple). I'm not sure it matters though. Someone should probably check the refcounts. I always screw them up. :-) ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-07 01:16 Message: Logged In: YES user_id=6380 Not a bad idea, Neil! Care to work the code around to implement that? ---------------------------------------------------------------------- Comment By: Neil Schemenauer (nascheme) Date: 2002-04-07 01:14 Message: Logged In: YES user_id=35752 I think this should be implemented as pop() instead: D.pop([key]) -> value -- remove and return value by key (default a random value) It makes no sense to return the key when you already have it. pop() also matches well with list pop(): L.pop([index]) -> item -- remove and return item at index (default last) ---------------------------------------------------------------------- Comment By: Raymond Hettinger (rhettinger) Date: 2002-04-07 01:08 Message: Logged In: YES user_id=80475 The tests and documentation patches have been added. 
---------------------------------------------------------------------- Comment By: Raymond Hettinger (rhettinger) Date: 2002-04-06 17:23 Message: Logged In: YES user_id=80475 Q: Does the new function signature slow the existing no-argument case? A: Yes. The function is already so fast that the small overhead of PyArg_ParseTuple is measurable. My timing shows an 8% drop in speed. Q: Is _,v=d.popitem(k) slower than v=d.popvalue(k)? A: Yes. Though popvalue is a non-existent strawman, it would be quicker: it would cost two calls to Py_DECREF while saving a call to PyTuple_New and two calls to PyTuple_SET_ITEM. Still, the running time for popvalue would be dominated by the rest of the function and not the single malloc. Also, I think it unlikely that the dictionary interface would ever be expanded for popvalue, so the comparison is moot. Q: Are there cases where (k,v) is needed? A: Yes. One common case is where the tuple still needs to be formed to help build another dictionary: dict([d.popitem(k) for k in xferlist]) or [n.__setitem__(d.popitem(k)) for k in xferlist]. Also, it is useful when the key is computed by a function and then needs to be used in an expression. I often do something like that with setdefault: uniqInOrder= [u.setdefault(k,k) for k in alist if k not in u]. Also, when the key is computed by a function, it may need to be saved only when .popitem succeeds but not when the key is missing: "get and remove key if present; trigger exception if absent" This pattern is used in validating user input keys for deletion. Q: Where is the unittest and doc patch? A: Coming this weekend. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-04-05 21:50 Message: Logged In: YES user_id=31435 Are there examples of concrete use cases? The idea that dict.popitem(k) returns (k, dict[k]) seems kinda goofy, since you necessarily already have k. 
So the question is whether this is the function signature that's really desired, or whether it's too much a hack. As is, it slows down popitem() without an argument because it requires using a fancier calling sequence, and because it now defers that case to a taken branch; it's also much slower than a function that just returned v could be, due to the need to allocate a 2-tuple to hold a redundant copy of the key. Perhaps there are use cases of the form k, v = dict.popitem(f(x, y, z)) where the key is known only implicitly? ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-05 21:47 Message: Logged In: YES user_id=6380 FYI, I'm uploading my version of the patch, with code cleanup, as popdict2.txt. I've moved the popitem-with-arg code before the allocation of res, because there were several places where this code returned NULL without DECREF'ing res. Repeating the PyTuple_New(2) call seemed the lesser evil. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-05 21:38 Message: Logged In: YES user_id=6380 I've reviewed the patch and see only cosmetic things that need to be changed. I'll check it in as soon as you submit a unittest and doc patch. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-05 21:26 Message: Logged In: YES user_id=6380 Now, if you could also upload a unittest and a doc patch, that would be great! ---------------------------------------------------------------------- Comment By: Raymond Hettinger (rhettinger) Date: 2002-04-05 21:10 Message: Logged In: YES user_id=80475 Context diff uploaded at poppatch.c below. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-05 20:11 Message: Logged In: YES user_id=6380 Please upload a context or unified diff. 
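Neil's counter-proposal, and Raymond's fleshed-out version with a required key, can be modelled in a few lines of pure Python. This is only a sketch of the semantics under discussion (the actual patch is C inside dictobject.c); note that the dict.pop() that modern Python eventually grew also accepts an optional default, a refinement not part of this thread:

```python
def dict_pop(d, key):
    """Model of the D.pop(key) being discussed: remove key from d and
    return its value, raising KeyError if the key is absent.  The key
    argument is required, per Raymond's version: since the key is not
    returned, popping an arbitrary pair makes no sense here.
    """
    value = d[key]   # KeyError propagates if the key is missing
    del d[key]
    return value
```

Compared with popitem(key), this avoids building a 2-tuple that merely hands the caller back a key it already has, which was Tim's main objection to the (k, v) return.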
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539949&group_id=5470 From noreply@sourceforge.net Sun Apr 7 15:36:40 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sun, 07 Apr 2002 07:36:40 -0700 Subject: [Patches] [ python-Patches-540583 ] IDLE calls MS HTML Help Python Docs Message-ID: Patches item #540583, was opened at 2002-04-07 16:36 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=540583&group_id=5470 Category: IDLE Group: None Status: Open Resolution: None Priority: 5 Submitted By: Hernan Martinez Foffani (hfoffani) Assigned to: Nobody/Anonymous (nobody) Summary: IDLE calls MS HTML Help Python Docs Initial Comment: A little patch to enable IDLE to open the Python Docs in HTML Help format if they become part of the standard Windows distribution. A few things: - The patch uses os.startfile() instead of webbrowser.open() because the default browser may not be IExplorer. - The name of the .chm file is hardwired. - I assume that the .chm file resides in the same directory as the Python executable. - I'll try to upload a similar patch on idlefork. 
Regards, -Hernan ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=540583&group_id=5470 From noreply@sourceforge.net Sun Apr 7 17:15:02 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sun, 07 Apr 2002 09:15:02 -0700 Subject: [Patches] [ python-Patches-540583 ] IDLE calls MS HTML Help Python Docs Message-ID: Patches item #540583, was opened at 2002-04-07 10:36 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=540583&group_id=5470 Category: IDLE Group: None Status: Open Resolution: None Priority: 5 Submitted By: Hernan Martinez Foffani (hfoffani) >Assigned to: Guido van Rossum (gvanrossum) Summary: IDLE calls MS HTML Help Python Docs Initial Comment: A little patch to enable IDLE to open the Python Docs in HTML Help format if they become part of the standard Windows distribution. A few things: - The patch uses os.startfile() instead of webbrowser.open() because the default browser may not be IExplorer. - The name of the .chm file is hardwired. - I assume that the .chm file resides in the same directory as the Python executable. - I'll try to upload a similar patch on idlefork. 
Regards, -Hernan ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=540583&group_id=5470 From noreply@sourceforge.net Sun Apr 7 18:06:54 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sun, 07 Apr 2002 10:06:54 -0700 Subject: [Patches] [ python-Patches-540583 ] IDLE calls MS HTML Help Python Docs Message-ID: Patches item #540583, was opened at 2002-04-07 16:36 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=540583&group_id=5470 Category: IDLE Group: None Status: Open Resolution: None Priority: 5 Submitted By: Hernan Martinez Foffani (hfoffani) Assigned to: Guido van Rossum (gvanrossum) Summary: IDLE calls MS HTML Help Python Docs Initial Comment: A little patch to enable IDLE to open the Python Docs in HTML Help format if they become part of the standard Windows distribution. A few things: - The patch uses os.startfile() instead of webbrowser.open() because the default browser may not be IExplorer. - The name of the .chm file is hardwired. - I assume that the .chm file resides in the same directory as the Python executable. - I'll try to upload a similar patch on idlefork. Regards, -Hernan ---------------------------------------------------------------------- >Comment By: Martin v. Löwis (loewis) Date: 2002-04-07 19:06 Message: Logged In: YES user_id=21627 IMO, it would be good if it fell back to HTML help if the chm file is not found. 
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=540583&group_id=5470 From noreply@sourceforge.net Sun Apr 7 18:10:56 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sun, 07 Apr 2002 10:10:56 -0700 Subject: [Patches] [ python-Patches-540583 ] IDLE calls MS HTML Help Python Docs Message-ID: Patches item #540583, was opened at 2002-04-07 16:36 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=540583&group_id=5470 Category: IDLE Group: None Status: Open Resolution: None Priority: 5 Submitted By: Hernan Martinez Foffani (hfoffani) Assigned to: Guido van Rossum (gvanrossum) Summary: IDLE calls MS HTML Help Python Docs Initial Comment: A little patch to enable IDLE to open the Python Docs in HTML Help format if they become part of the standard Windows distribution. A few things: - The patch uses os.startfile() instead of webbrowser.open() because the default browser may not be IExplorer. - The name of the .chm file is hardwired. - I assume that the .chm file resides in the same directory as the Python executable. - I'll try to upload a similar patch on idlefork. Regards, -Hernan ---------------------------------------------------------------------- >Comment By: Hernan Martinez Foffani (hfoffani) Date: 2002-04-07 19:10 Message: Logged In: YES user_id=112690 Ok. I can add the fallback. ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-04-07 19:06 Message: Logged In: YES user_id=21627 IMO, it would be good if it fell back to HTML help if the chm file is not found. 
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=540583&group_id=5470 From noreply@sourceforge.net Sun Apr 7 19:20:20 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sun, 07 Apr 2002 11:20:20 -0700 Subject: [Patches] [ python-Patches-539392 ] Unicode fix for test in tkFileDialog.py Message-ID: Patches item #539392, was opened at 2002-04-04 20:59 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539392&group_id=5470 Category: Tkinter Group: None Status: Open Resolution: None Priority: 5 Submitted By: Bernhard Reiter (ber) Assigned to: Martin v. Löwis (loewis) Summary: Unicode fix for test in tkFileDialog.py Initial Comment: Patch is against current CVS from 20020404. It also gives pointers to the problem described in http://mail.python.org/pipermail/python-list/2001-June/048787.html Python's open() uses the Py_FileSystemDefaultEncoding. Py_FileSystemDefaultEncoding is NULL (bltinmodule.c) for most systems. setlocale() will set it. Thus we fixed the example and set the locale to the user defaults. Now "enc" will have a useful encoding, so the example will work with non-ASCII characters in the filename, e.g. with umlauts in it. It bombed on them before: Traceback (most recent call last): File "tkFileDialog.py", line 105, in ? print "open", askopenfilename(filetypes=[("all filez", "*")]) UnicodeError: ASCII encoding error: ordinal not in range(128) open() will work with the string directly now. encode(enc) is only needed for terminal output, so we enhanced the example to show the two uses of the returned filename string separately. (It might be interesting to drop a note about this in the right part of the user documentation.) If you comment out the setlocale() you can see that open fails, which illustrates what seems to be a design flaw in tk. 
Tk should be able to give you a string in exactly the encoding in which the filesystem gave it to tk. 4.4.2002 Bernhard ---------------------------------------------------------------------- >Comment By: Bernhard Reiter (ber) Date: 2002-04-07 20:20 Message: Logged In: YES user_id=113859 I agree with your analysis that the application has to set the locale if it wants to support non-ASCII filenames. This is why we fixed the _test_ code to demonstrate exactly this. The code of the modules itself is untouched. If you do not fix the _test_ code it will bomb on non-ASCII file names. Our code also demonstrates that there might be a difference between the file system encoding (suitable for open) and the terminal encoding (suitable for printing). ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-04-04 22:16 Message: Logged In: YES user_id=21627 I think this patch is not acceptable. If the application wants to support non-ASCII file names, it must invoke setlocale(); it is not the library's responsibility to make this decision behind the application's back. People question the validity of using CODESET in the file system, so each developer needs to make a conscious decision. BTW, how does Tcl come up with the names in the first place? 
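The pattern both sides agree on, that the application (not the library) calls setlocale() so the encoding comes from the user's environment, looks roughly like this. This is an illustrative sketch in modern Python, not the Python 2.2-era test code from the patch; the filename is a made-up example:

```python
import locale

# The application opts in: honour the user's LANG/LC_* settings so the
# codeset is taken from the environment instead of defaulting to ASCII.
locale.setlocale(locale.LC_ALL, "")

# Codeset suitable for terminal output (e.g. "UTF-8" on most systems).
enc = locale.getpreferredencoding(False)

# A non-ASCII filename of the kind that used to bomb with
# "UnicodeError: ASCII encoding error".
filename = "\u00fcbung.txt"

# open() can take the string directly; encoding is only needed when the
# name must be rendered as bytes, e.g. for display on a legacy terminal.
shown = filename.encode(enc, "replace")
```

The separation mirrors Bernhard's point: the filesystem encoding (used implicitly by open()) and the terminal encoding (used explicitly via encode(enc)) are two distinct things.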
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539392&group_id=5470 From noreply@sourceforge.net Sun Apr 7 22:54:17 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sun, 07 Apr 2002 14:54:17 -0700 Subject: [Patches] [ python-Patches-539949 ] dict.popitem(key=None) Message-ID: Patches item #539949, was opened at 2002-04-05 14:38 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539949&group_id=5470 Category: Core (C code) Group: Python 2.3 Status: Open Resolution: Remind Priority: 5 Submitted By: Raymond Hettinger (rhettinger) Assigned to: Guido van Rossum (gvanrossum) Summary: dict.popitem(key=None) Initial Comment: This patch implements the feature request at http://sourceforge.net/tracker/index.php?func=detail&aid=495086&group_id=5470&atid=355470 which asks for an optional argument to popitem so that it returns a key/value pair for a specified key or, if not specified, an arbitrary key. The benefit is in providing a fast, explicit way to retrieve and remove a particular key/value pair from a dictionary. By using only a single lookup, it is faster than the usual Python code:

    value = d[key]
    del d[key]
    return (key, value)

which now becomes:

    return d.popitem(key)

There is no magic or new code in the implementation -- it uses a few lines each from getitem, delitem, and popitem. If an argument is specified, the new code is run; otherwise, the existing code is run. This assures that the patch does not cause a performance penalty. The diff is about -3 lines and +25 lines. There are four sections:
1. Replacement code for dict_popitem in dictobject.c
2. Replacement docstring for popitem in dictobject.c
3. Replacement registration line for popitem in dictobject.c
4. Sample Python test code.
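[Editor's note] The equivalence claimed in the initial comment can be modeled in pure Python. This is a sketch only -- the C patch does the removal with a single lookup, while this model needs two -- and `popitem_with_key` is an illustrative name, not part of the patch:

```python
def popitem_with_key(d, key):
    """Model of the proposed d.popitem(key): remove key from d and
    return the (key, value) pair, raising KeyError if key is absent."""
    value = d[key]   # first lookup (raises KeyError if missing)
    del d[key]       # second lookup -- the C patch needs only one
    return (key, value)

d = {"a": 1, "b": 2}
assert popitem_with_key(d, "a") == ("a", 1)
assert d == {"b": 2}
```

The pair form matters in idioms such as `dict([d.popitem(k) for k in xferlist])`, which Raymond's later Q&A in this thread gives as a use case.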
----------------------------------------------------------------------
>Comment By: Tim Peters (tim_one) Date: 2002-04-07 17:54 Message: Logged In: YES user_id=31435 I like Raymond's new pop(). Problems: + "speficied" is misspelled in the docstring. + Should be declared METH_O, not METH_VARARGS (mimic how, e.g., dict_update is set up). + The decrefs have to be reworked: a decref can trigger calls back into arbitrary Python code, due to __del__ methods getting invoked. This means you can never leave any live object in an insane or inconsistent state *during* a decref. What you need to do instead is first capture the key and value into local variables, plug dummy and NULL into the dict slot, and decrement the used count. This leaves the dict in a consistent state again. Only then is it safe to decref the key and value.
----------------------------------------------------------------------
Comment By: Raymond Hettinger (rhettinger) Date: 2002-04-06 22:00 Message: Logged In: YES user_id=80475 Here's a more fleshed-out implementation of D.pop(). It doesn't rely on popitem(), doesn't malloc a tuple, and the refcounts should be correct. One change from Neil's version: since k isn't being returned, an arbitrary pair doesn't make sense, so the key argument to pop is required rather than optional. The diff is against 2.123.
----------------------------------------------------------------------
Comment By: Neil Schemenauer (nascheme) Date: 2002-04-06 20:51 Message: Logged In: YES user_id=35752 Here's a quick implementation. D.pop() is not as efficient as it could be (it uses popitem and then promptly deallocates the item tuple). I'm not sure it matters though. Someone should probably check the refcounts. I always screw them up. :-)
----------------------------------------------------------------------
Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-06 20:16 Message: Logged In: YES user_id=6380 Not a bad idea, Neil! Care to work the code around to implement that?
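[Editor's note] Tim's decref-ordering point can be demonstrated from pure Python: a `__del__` hook runs arbitrary code in the middle of a deletion and can re-enter the dict, which is why the C code must restore the dict to a consistent state *before* decref'ing. This sketch illustrates the hazard (it assumes CPython's immediate refcounting semantics), not the patch's C code:

```python
log = []

class Noisy:
    """An object whose __del__ re-enters the dict it lives in."""
    def __init__(self, d):
        self.d = d
    def __del__(self):
        # Arbitrary Python running *during* the delete; the dict
        # must already be in a consistent state at this point.
        log.append(sorted(self.d))

d = {}
d["a"] = Noisy(d)
d["b"] = 1
del d["a"]               # the decref fires __del__, which reads d
assert log == [["b"]]    # "a" was already removed before the decref ran
```

CPython's delitem follows exactly the order Tim describes: capture the value, remove the entry, then decref -- which is why `__del__` here sees a dict that no longer contains "a".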
----------------------------------------------------------------------
Comment By: Neil Schemenauer (nascheme) Date: 2002-04-06 20:14 Message: Logged In: YES user_id=35752 I think this should be implemented as pop() instead: D.pop([key]) -> value -- remove and return value by key (default a random value) It makes no sense to return the key when you already have it. pop() also matches well with list pop(): L.pop([index]) -> item -- remove and return item at index (default last)
----------------------------------------------------------------------
Comment By: Raymond Hettinger (rhettinger) Date: 2002-04-06 20:08 Message: Logged In: YES user_id=80475 The tests and documentation patches have been added.
----------------------------------------------------------------------
Comment By: Raymond Hettinger (rhettinger) Date: 2002-04-06 12:23 Message: Logged In: YES user_id=80475 Q: Does the new function signature slow the existing no-argument case? A: Yes. The function is already so fast that the small overhead of PyArg_ParseTuple is measurable. My timing shows an 8% drop in speed. Q: Is _,v=d.popitem(k) slower than v=d.popvalue(k)? A: Yes. Though popvalue is a non-existent strawman, it would be quicker: it would cost two calls to Py_DECREF while saving a call to PyTuple_New and two calls to PyTuple_SET_ITEM. Still, the running time for popvalue would be dominated by the rest of the function and not the single malloc. Also, I think it unlikely that the dictionary interface would ever be expanded for popvalue, so the comparison is moot. Q: Are there cases where (k,v) is needed? A: Yes. One common case is where the tuple still needs to be formed to help build another dictionary: dict([d.popitem(k) for k in xferlist]) or [n.__setitem__(d.popitem(k)) for k in xferlist]. Also, it is useful when the key is computed by a function and then needs to be used in an expression.
I often do something like that with setdefault: uniqInOrder= [u.setdefault(k,k) for k in alist if k not in u]. Also, when the key is computed by a function, it may need to be saved only when .popitem succeeds but not when the key is missing: "get and remove key if present; trigger exception if absent" This pattern is used in validating user input keys for deletion. Q: Where is the unittest and doc patch? A: Coming this weekend. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-04-05 16:50 Message: Logged In: YES user_id=31435 Are there examples of concrete use cases? The idea that dict.popitem(k) returns (k, dict[k]) seems kinda goofy, since you necessarily already have k. So the question is whether this is the function signature that's really desired, or whether it's too much a hack. As is, it slows down popitem() without an argument because it requires using a fancier calling sequence, and because it now defers that case to a taken branch; it's also much slower than a function that just returned v could be, due to the need to allocate a 2-tuple to hold a redundant copy of the key. Perhaps there are use cases of the form k, v = dict.popitem(f(x, y, z)) where the key is known only implicitly? ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-05 16:47 Message: Logged In: YES user_id=6380 FYI, I'm uploading my version of the patch, with code cleanup, as popdict2.txt. I've moved the popitem-with-arg code before the allocation of res, because there were several places where this code returned NULL without DECREF'ing res. Repeating the PyTuple_New(2) call seemed the lesser evil. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-05 16:38 Message: Logged In: YES user_id=6380 I've reviewed the patch and see only cosmetic things that need to be changed. 
I'll check it in as soon as you submit a unittest and doc patch.
----------------------------------------------------------------------
Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-05 16:26 Message: Logged In: YES user_id=6380 Now, if you could also upload a unittest and a doc patch, that would be great!
----------------------------------------------------------------------
Comment By: Raymond Hettinger (rhettinger) Date: 2002-04-05 16:10 Message: Logged In: YES user_id=80475 Context diff uploaded at poppatch.c below.
----------------------------------------------------------------------
Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-05 15:11 Message: Logged In: YES user_id=6380 Please upload a context or unified diff.
----------------------------------------------------------------------
You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539949&group_id=5470 From noreply@sourceforge.net Mon Apr 8 15:14:46 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 08 Apr 2002 07:14:46 -0700 Subject: [Patches] [ python-Patches-539949 ] dict.popitem(key=None) Message-ID: Patches item #539949, was opened at 2002-04-05 19:38 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539949&group_id=5470 Category: Core (C code) Group: Python 2.3 Status: Open Resolution: Remind Priority: 5 Submitted By: Raymond Hettinger (rhettinger) Assigned to: Guido van Rossum (gvanrossum) Summary: dict.popitem(key=None)
----------------------------------------------------------------------
>Comment By: Raymond Hettinger (rhettinger) Date: 2002-04-08 14:14 Message: Logged In: YES user_id=80475 Here is a revised patch for D.pop() incorporating Tim's ideas: + Docstring spelling fixed + Switched to METH_O instead of METH_VARARGS + Delayed decref until the dict entry is in a consistent state + Removed unused int i=0 variable + Tabs replaced with spaces
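[Editor's note] Neil's proposed `D.pop([key]) -> value` semantics from this thread (essentially the `dict.pop` that landed in Python 2.3) can be modeled with a subclass. `PopDict` and `_missing` are illustrative names only, not from the patch:

```python
class PopDict(dict):
    """Pure-Python model of the proposed D.pop([key]) -> value."""
    _missing = object()  # sentinel so None can be a real key

    def pop(self, key=_missing):
        if key is self._missing:       # no key given: like popitem(),
            k, v = dict.popitem(self)  # but return only the value
            return v
        value = self[key]              # KeyError if absent
        del self[key]
        return value

d = PopDict(a=1, b=2)
assert d.pop("a") == 1       # value only -- the caller already has the key
assert d.pop() == 2          # only "b" is left, so its value comes back
assert d == {}
```

Returning just the value, as Neil argues, mirrors `list.pop([index])` and avoids allocating a 2-tuple that repeats a key the caller supplied.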
----------------------------------------------------------------------
You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539949&group_id=5470 From noreply@sourceforge.net Mon Apr 8 15:25:42 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 08 Apr 2002 07:25:42 -0700 Subject: [Patches] [ python-Patches-541031 ] context sensitive help/keyword search Message-ID: Patches item #541031, was opened at 2002-04-08 16:25 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=541031&group_id=5470 Category: None Group: None Status: Open Resolution: None Priority: 5 Submitted By: Thomas Heller (theller) Assigned to: Nobody/Anonymous (nobody) Summary: context sensitive help/keyword search Initial Comment: This script/module looks up keywords in the Python manuals. It is usable as a CGI script - a version is online at http://starship.python.net/crew/theller/cgi-bin/pyhelp.cgi It can also be run from the command line: python pyhelp.py keyword It can also be used to implement context-sensitive help in IDLE or Xemacs (for example) by simply selecting a word and pressing F1. It can use the online version of the manuals at www.python.org/doc/, or it can use locally installed HTML pages. The script/module scans the index pages of the docs for hyperlinks, and pickles the results to disk.
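[Editor's note] The scan-and-pickle approach the comment describes might look like the sketch below. It uses modern stdlib modules and invented names (`IndexScanner`, the sample page); the real pyhelp.py is not reproduced here:

```python
import pickle
from html.parser import HTMLParser

class IndexScanner(HTMLParser):
    """Collect keyword -> target-URL pairs from an index page's links."""
    def __init__(self):
        super().__init__()
        self.links = {}      # keyword text -> href target
        self._href = None    # href of the <a> tag currently open

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")

    def handle_data(self, data):
        # Text inside an open <a ...> becomes the lookup keyword.
        if self._href and data.strip():
            self.links[data.strip()] = self._href
            self._href = None

page = '<a href="lib/keyword.html">keyword</a> <a href="ref/print.html">print</a>'
scanner = IndexScanner()
scanner.feed(page)
blob = pickle.dumps(scanner.links)   # the real script caches this to disk
assert pickle.loads(blob)["print"] == "ref/print.html"
```

A command-line front end would then be a pickle load plus a single dict lookup, which is what makes the F1-style context help fast.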
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=541031&group_id=5470 From noreply@sourceforge.net Mon Apr 8 16:00:56 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 08 Apr 2002 08:00:56 -0700 Subject: [Patches] [ python-Patches-539392 ] Unicode fix for test in tkFileDialog.py Message-ID: Patches item #539392, was opened at 2002-04-04 20:59 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539392&group_id=5470 Category: Tkinter Group: None Status: Open Resolution: None Priority: 5 Submitted By: Bernhard Reiter (ber) Assigned to: Martin v. Löwis (loewis) Summary: Unicode fix for test in tkFileDialog.py Initial Comment: Patch is against current CVS form 20020404. It also gives pointers to the problem described in http://mail.python.org/pipermail/python-list/2001-June/048787.html Python's open() uses the Py_FileSystemDefaultEncoding. Py_FileSystemDefaultEncoding is NULL (bltinmodule.c) for most systems. Setlocate will set it. Thus we fixed the example and set the locale to the user defaults. Now "enc" will have a useful encoding thus the example will work with a non ascii characters in the filename, e.g. with umlauts in it. It bombed on them before. Traceback (most recent call last): File "tkFileDialog.py", line 105, in ? print "open", askopenfilename(filetypes=[("all filez", "*")]) UnicodeError: ASCII encoding error: ordinal not in range(128) open() will work with the string directly now. encode(enc) is only needed for terminal output, thus we enchanced the example to show the two uses of the returned filename string separatly. (It might be interesting to drop a note about this in the right part of the user documentation.) If you comment out the setlocale() you can see that open fails, which illustrates what seems to be a design flaw in tk. 
Tk should be able to give you a string in exactly the encoding in which the filesystem gave it to tk. 4.4.2002 Bernhard Bernhard ---------------------------------------------------------------------- >Comment By: Martin v. Löwis (loewis) Date: 2002-04-08 17:00 Message: Logged In: YES user_id=21627 Sorry, I misinterpreted your patch first. I agree with your distinction of a file system encoding, and a terminal encoding; I still hope to enhance Python to expose an estimate of both - then leaving it to the application to make use of either as appropriate (the file system encoding would be used implicitly as is done today). As for the flaw in Tk: it turns out that Tcl has a different notion of the default encoding than Python - Tcl always uses a locale-aware default encoding, whereas Python has a system-wide fixed default encoding (usually ASCII). It is a good thing that Tkinter manages to represent file names correctly (i.e. as Unicode strings) in most cases - if you want to get the file name in the encoding in which the file system gave it to you, you need to establish the value of Tcl's "encoding system" command. Committed as tkFileDialog.py 1.7. ---------------------------------------------------------------------- Comment By: Bernhard Reiter (ber) Date: 2002-04-07 20:20 Message: Logged In: YES user_id=113859 I agree with your analysis that the appplication has to set the locale, if it wants to support non-ASCII filenames. This is why we fixed the _test_ code to demonstrate exactly this. The code of the modules itself is untouched. If you do not fix the _test_ code it will bomb on non-ascii file names. Our code also demonstrates that there might be a difference in the file system encoding (suitable for open) and the terminal encoding (suitable for printing). ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-04-04 22:16 Message: Logged In: YES user_id=21627 I think this patch is not acceptable. 
If the application wants to support non-ASCII file names, it must invoke setlocale(); it is not the library's responsibility to make this decision behind the application's back. People question the validity of using CODESET in the file system, so each developer needs to make a concious decision. BTW, how does Tcl come up with the names in the first place? ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539392&group_id=5470 From noreply@sourceforge.net Mon Apr 8 16:01:40 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 08 Apr 2002 08:01:40 -0700 Subject: [Patches] [ python-Patches-539392 ] Unicode fix for test in tkFileDialog.py Message-ID: Patches item #539392, was opened at 2002-04-04 20:59 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539392&group_id=5470 Category: Tkinter Group: None >Status: Closed >Resolution: Accepted Priority: 5 Submitted By: Bernhard Reiter (ber) Assigned to: Martin v. Löwis (loewis) Summary: Unicode fix for test in tkFileDialog.py Initial Comment: Patch is against current CVS form 20020404. It also gives pointers to the problem described in http://mail.python.org/pipermail/python-list/2001-June/048787.html Python's open() uses the Py_FileSystemDefaultEncoding. Py_FileSystemDefaultEncoding is NULL (bltinmodule.c) for most systems. Setlocate will set it. Thus we fixed the example and set the locale to the user defaults. Now "enc" will have a useful encoding thus the example will work with a non ascii characters in the filename, e.g. with umlauts in it. It bombed on them before. Traceback (most recent call last): File "tkFileDialog.py", line 105, in ? print "open", askopenfilename(filetypes=[("all filez", "*")]) UnicodeError: ASCII encoding error: ordinal not in range(128) open() will work with the string directly now. 
encode(enc) is only needed for terminal output, thus we enchanced the example to show the two uses of the returned filename string separatly. (It might be interesting to drop a note about this in the right part of the user documentation.) If you comment out the setlocale() you can see that open fails, which illustrates what seems to be a design flaw in tk. Tk should be able to give you a string in exactly the encoding in which the filesystem gave it to tk. 4.4.2002 Bernhard Bernhard ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-04-08 17:00 Message: Logged In: YES user_id=21627 Sorry, I misinterpreted your patch first. I agree with your distinction of a file system encoding, and a terminal encoding; I still hope to enhance Python to expose an estimate of both - then leaving it to the application to make use of either as appropriate (the file system encoding would be used implicitly as is done today). As for the flaw in Tk: it turns out that Tcl has a different notion of the default encoding than Python - Tcl always uses a locale-aware default encoding, whereas Python has a system-wide fixed default encoding (usually ASCII). It is a good thing that Tkinter manages to represent file names correctly (i.e. as Unicode strings) in most cases - if you want to get the file name in the encoding in which the file system gave it to you, you need to establish the value of Tcl's "encoding system" command. Committed as tkFileDialog.py 1.7. ---------------------------------------------------------------------- Comment By: Bernhard Reiter (ber) Date: 2002-04-07 20:20 Message: Logged In: YES user_id=113859 I agree with your analysis that the appplication has to set the locale, if it wants to support non-ASCII filenames. This is why we fixed the _test_ code to demonstrate exactly this. The code of the modules itself is untouched. If you do not fix the _test_ code it will bomb on non-ascii file names. 
Our code also demonstrates that there might be a difference in the file system encoding (suitable for open) and the terminal encoding (suitable for printing). ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-04-04 22:16 Message: Logged In: YES user_id=21627 I think this patch is not acceptable. If the application wants to support non-ASCII file names, it must invoke setlocale(); it is not the library's responsibility to make this decision behind the application's back. People question the validity of using CODESET in the file system, so each developer needs to make a concious decision. BTW, how does Tcl come up with the names in the first place? ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539392&group_id=5470 From noreply@sourceforge.net Mon Apr 8 17:46:45 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 08 Apr 2002 09:46:45 -0700 Subject: [Patches] [ python-Patches-539949 ] dict.popitem(key=None) Message-ID: Patches item #539949, was opened at 2002-04-05 14:38 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539949&group_id=5470 Category: Core (C code) Group: Python 2.3 Status: Open Resolution: Remind Priority: 5 Submitted By: Raymond Hettinger (rhettinger) Assigned to: Guido van Rossum (gvanrossum) Summary: dict.popitem(key=None) Initial Comment: This patch implements the feature request at http://sourceforge.net/tracker/index.php? func=detail&aid=495086&group_id=5470&atid=355470 which asks for an optional argument to popitem so that it returns a key/value pair for a specified key or, if not specified, an arbitrary key. The benefit is in providing a fast, explicit way to retrieve and remove and particular key/value pair from a dictionary. 
By using only a single lookup, it is faster than the usual Python code: value = d[key] del d[key] return (key, value) which now becomes: return d.popitem(key) There is no magic or new code in the implementation -- it uses a few lines each from getitem, delitem, and popitem. If an argument is specified, the new code is run; otherwise, the existing code is run. This assures that the patch does not cause a performance penalty. The diff is about -3 lines and +25 lines. There are four sections: 1. Replacement code for dict_popitem in dictobject.c 2. Replacement docstring for popitem in dictobject.c 3. Replacement registration line for popitem in dictobject.c 4. Sample Python test code. ---------------------------------------------------------------------- >Comment By: Tim Peters (tim_one) Date: 2002-04-08 12:46 Message: Logged In: YES user_id=31435 Getting closer! Two more questions: + Why switch from tabs to spaces? The rest of this file uses hard tabs, and that's what Guido prefers in C source. + Think hard about whether we really want to decref the value -- I doubt we do, as we're *transferring* ownership of the value from the dict to the caller. ---------------------------------------------------------------------- Comment By: Raymond Hettinger (rhettinger) Date: 2002-04-08 10:14 Message: Logged In: YES user_id=80475 Here is a revised patch for D.pop() incorporating Tim's ideas: + Docstring spelling fixed + Switched to METH_O instead of METH_VARARGS + Delayed decref until dict entry in consistent state + Removed unused int i=0 variable + Tabs replaced with spaces ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-04-07 17:54 Message: Logged In: YES user_id=31435 I like Raymond's new pop(). Problems: + "speficied" is misspelled in the docstring. + Should be declared METH_O, not METH_VARARGS (mimic how, e.g., dict_update is set up). 
+ The decrefs have to be reworked: a decref can trigger calls back into arbitrary Python code, due to __del__ methods getting invoked. This means you can never leave any live object in an insane or inconsistent state *during* a decref. What you need to do instead is first capture the key and value into local vrbls, plug dummy and NULL in to the dict slot, and decrement the used count. This leaves the dict in a consistent state again. Only then is it safe to decref the key and value. ---------------------------------------------------------------------- Comment By: Raymond Hettinger (rhettinger) Date: 2002-04-06 22:00 Message: Logged In: YES user_id=80475 Here's a more fleshed-out implementation of D.pop(). It doesn't rely on popitem(), doesn't malloc a tuple, and the refcounts should be correct. One change from Neil's version, since k isn't being returned, then an arbitrary pair doesn't make sense, so the key argument to pop is required rather than optional. The diff is off of 2.123. ---------------------------------------------------------------------- Comment By: Neil Schemenauer (nascheme) Date: 2002-04-06 20:51 Message: Logged In: YES user_id=35752 Here's a quick implementation. D.pop() is not as efficient as it could be (it uses popitem and then promply deallocates the item tuple). I'm not sure it matters though. Someone should probably check the refcounts. I always screw them up. :-) ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-06 20:16 Message: Logged In: YES user_id=6380 Not a bad idea, Neil! Care to work the code around to implement that? 
---------------------------------------------------------------------- Comment By: Neil Schemenauer (nascheme) Date: 2002-04-06 20:14 Message: Logged In: YES user_id=35752 I think this should be implemented as pop() instead:
D.pop([key]) -> value -- remove and return value by key (default a random value)
It makes no sense to return the key when you already have it. pop() also matches well with list pop():
L.pop([index]) -> item -- remove and return item at index (default last)
---------------------------------------------------------------------- Comment By: Raymond Hettinger (rhettinger) Date: 2002-04-06 20:08 Message: Logged In: YES user_id=80475 The tests and documentation patches have been added. ---------------------------------------------------------------------- Comment By: Raymond Hettinger (rhettinger) Date: 2002-04-06 12:23 Message: Logged In: YES user_id=80475
Q: Does the new function signature slow the existing no-argument case?
A: Yes. The function is already so fast that the small overhead of PyArg_ParseTuple is measurable. My timing shows an 8% drop in speed.
Q: Is _,v=d.popitem(k) slower than v=d.popvalue(k)?
A: Yes. Though popvalue is a non-existing strawman, it would be quicker: it would cost two calls to Py_DECREF while saving a call to PyTuple_New and two calls to PyTuple_SET_ITEM. Still, the running time for popvalue would be dominated by the rest of the function and not the single malloc. Also, I think it unlikely that the dictionary interface would ever be expanded for popvalue, so the comparison is moot.
Q: Are there cases where (k,v) is needed?
A: Yes. One common case is where the tuple still needs to be formed to help build another dictionary: dict([d.popitem(k) for k in xferlist]) or [n.__setitem__(d.popitem(k)) for k in xferlist]. Also, it is useful when the key is computed by a function and then needs to be used in an expression.
I often do something like that with setdefault: uniqInOrder= [u.setdefault(k,k) for k in alist if k not in u]. Also, when the key is computed by a function, it may need to be saved only when .popitem succeeds but not when the key is missing: "get and remove key if present; trigger exception if absent" This pattern is used in validating user input keys for deletion. Q: Where is the unittest and doc patch? A: Coming this weekend. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-04-05 16:50 Message: Logged In: YES user_id=31435 Are there examples of concrete use cases? The idea that dict.popitem(k) returns (k, dict[k]) seems kinda goofy, since you necessarily already have k. So the question is whether this is the function signature that's really desired, or whether it's too much a hack. As is, it slows down popitem() without an argument because it requires using a fancier calling sequence, and because it now defers that case to a taken branch; it's also much slower than a function that just returned v could be, due to the need to allocate a 2-tuple to hold a redundant copy of the key. Perhaps there are use cases of the form k, v = dict.popitem(f(x, y, z)) where the key is known only implicitly? ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-05 16:47 Message: Logged In: YES user_id=6380 FYI, I'm uploading my version of the patch, with code cleanup, as popdict2.txt. I've moved the popitem-with-arg code before the allocation of res, because there were several places where this code returned NULL without DECREF'ing res. Repeating the PyTuple_New(2) call seemed the lesser evil. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-05 16:38 Message: Logged In: YES user_id=6380 I've reviewed the patch and see only cosmetic things that need to be changed. 
I'll check it in as soon as you submit a unittest and doc patch. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-05 16:26 Message: Logged In: YES user_id=6380 Now, if you could also upload a unittest and a doc patch, that would be great! ---------------------------------------------------------------------- Comment By: Raymond Hettinger (rhettinger) Date: 2002-04-05 16:10 Message: Logged In: YES user_id=80475 Context diff uploaded at poppatch.c below. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-05 15:11 Message: Logged In: YES user_id=6380 Please upload a context or unified diff. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539949&group_id=5470 From noreply@sourceforge.net Mon Apr 8 19:47:11 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 08 Apr 2002 11:47:11 -0700 Subject: [Patches] [ python-Patches-540394 ] Remove PyMalloc_* symbols Message-ID: Patches item #540394, was opened at 2002-04-06 20:07 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=540394&group_id=5470 Category: Core (C code) Group: None Status: Open Resolution: Accepted Priority: 5 Submitted By: Neil Schemenauer (nascheme) Assigned to: Guido van Rossum (gvanrossum) Summary: Remove PyMalloc_* symbols Initial Comment: This patch removes all PyMalloc_* symbols from the source. obmalloc now implements PyObject_{Malloc, Realloc, Free}. PyObject_{New,NewVar} allocate using pymalloc. I also changed PyObject_Del and PyObject_GC_Del so that they can be used as function designators. Is changing the signature of PyObject_Del going to cause any problems? I had to add some extra typecasts when assigning to tp_free. Please review and assign back to me.
The next phase would be to clean up the memory API usage. Do we want to replace all PyObject_Del calls with PyObject_Free? PyObject_Del seems to match better with PyObject_GC_Del. Oh yes, we also need to change PyMem_{Free, Del, ...} to use pymalloc's free. ---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-08 14:47 Message: Logged In: YES user_id=6380 I'm looking at this now... ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-04-06 21:59 Message: Logged In: YES user_id=31435 Extensions that *currently* call PyObject_Del have its old macro expansion ("_PyObject_Del((PyObject *)(op))") buried in them, so getting rid of _PyObject_Del is a binary-API incompatibility (existing extensions will no longer link without recompilation). I personally don't mind that, but I run on Windows and "binary compatibility" never works there across minor releases for other reasons, so I don't have any real feel for how much people on other platforms value it. As you pointed out recently too, binary compatibility has, in reality, not been the case since 1.5.2 anyway. So that's one for Python-Dev. If we do break binary compatibility, I'd be sorely tempted to change the "destructor" typedef to say destructors take void*. IMO saying they take PyObject* was a poor idea, as you almost never have a PyObject* when calling one of these guys. That's why PyObject_Del "had to" be a macro, to hide the cast to PyObject* almost everyone needs because of destructor's "correct" but impractical signature. If "destructor" had a practical signature, there would have been no temptation to use a macro. Note that if the typedef of destructor were so changed, you wouldn't have needed new casts in tp_free slots. And I'd rather break binary compatibility than make extension authors add new casts. Hmm.
I'm assigning this to Guido for comment: Guido, what are your feelings about binary compatibility here? C didn't define free() as taking a void* by mistake . Back to Neil: I wouldn't bother changing PyObject_Del to PyObject_Free. The former isn't in the "recommended" minimal API, but neither is it discouraged. I expect TMTOWTDI here forever. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-04-06 21:41 Message: Logged In: YES user_id=31435 Oops -- I hit "Submit" prematurely. More to come. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-04-06 21:40 Message: Logged In: YES user_id=31435 Looks good to me -- thanks! ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=540394&group_id=5470 From noreply@sourceforge.net Mon Apr 8 20:18:07 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 08 Apr 2002 12:18:07 -0700 Subject: [Patches] [ python-Patches-540394 ] Remove PyMalloc_* symbols Message-ID: Patches item #540394, was opened at 2002-04-06 20:07 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=540394&group_id=5470 Category: Core (C code) Group: None Status: Open Resolution: Accepted Priority: 5 Submitted By: Neil Schemenauer (nascheme) Assigned to: Guido van Rossum (gvanrossum) Summary: Remove PyMalloc_* symbols Initial Comment: This patch removes all PyMalloc_* symbols from the source. obmalloc now implements PyObject_{Malloc, Realloc, Free}. PyObject_{New,NewVar} allocate using pymalloc. I also changed PyObject_Del and PyObject_GC_Del so that they be used as function designators. Is changing the signature of PyObject_Del going to cause any problems? I had to add some extra typecasts when assigning to tp_free. Please review and assign back to me. 
The next phase would be to cleanup the memory API usage. Do we want to replace all PyObject_Del calls with PyObject_Free? PyObject_Del seems to match better with PyObject_GC_Del. Oh yes, we also need to change PyMem_{Free, Del, ...} to use pymalloc's free. ---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-08 15:18 Message: Logged In: YES user_id=6380 (Wouldn't it be more efficient to take this to email between the three of us?) > Extensions that *currently* call PyObject_Del have > its old macro expansion ("_PyObject_Del((PyObject > *)(op))") buried in them, so getting rid of > _PyObject_Del is a binary-API incompatibility > (existing extensions will no longer link without > recompilation). I personally don't mind that, but > I run on Windows and "binary compatability" never > works there across minor releases for other > reasons, so I don't have any real feel for how > much people on other platforms value it. As you > pointed out recently too, binary compatability > has, in reality, not been the case since 1.5.2 > anyway. Still, tradition has it that we keep such entry points around for a long time. I propose that we do so now, too. > So that's one for Python-Dev. If we do break > binary compatibility, I'd be sorely tempted to > change the "destructor" typedef to say destructors > take void*. IMO saying they take PyObject* was a > poor idea, as you almost never have a PyObject* > when calling one of these guys. Huh? "destructor" is used to declare tp_dealloc, which definitely needs a PyObject * (or some "subclass" of it, like PyIntObject *). It's also used to declare tp_free, which arguably shouldn't take a PyObject * (since by the time tp_free is called, most of the object's contents have been destroyed by tp_dealloc). So maybe tp_free (a newcomer in 2.2) should be declared to take something else, but then the risk is breaking code that defines a tp_free with the correct signature. 
> That's why PyObject_Del "had to" be a macro, to > hide the cast to PyObject* almost everyone needs > because of destructor's "correct" but impractical > signature. If "destructor" had a practical > signature, there would have been no temptation to > use a macro. I don't understand this at all. > Note that if the typedef of destructor were so > changed, you wouldn't have needed new casts in > tp_free slots. And I'd rather break binary > compatability than make extension authors add new > casts. Nor this. > Hmm. I'm assigning this to Guido for comment: > Guido, what are your feelings about binary > compatibility here? C didn't define free() as > taking a void* by mistake . I want binary compatibility, but I don't understand your comments very well. > Back to Neil: I wouldn't bother changing PyObject_Del > to PyObject_Free. The former isn't in the > "recommended" minimal API, but neither is it > discouraged. I expect TMTOWTDI here forever. I prefer PyObject_Del -- like PyObject_GC_Del, and like we did in the past. Plus, I like New to match Del and Malloc to match Free. Since it's PyObject_New, it should be _Del. I'm not sure what to say of Neil's patch, except that I'm glad to be rid of the PyMalloc_XXX family. I wish we didn't have to change all the places that used to say _PyObject_Del. Maybe it's best to keep that name around? The patch would (psychologically) become a lot smaller. I almost wish that this would work: #define PyObject_Del ((destructor)PyObject_Free) Or maybe it *does* work??? ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-08 14:47 Message: Logged In: YES user_id=6380 I'm looking at this now... 
---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-04-06 21:59 Message: Logged In: YES user_id=31435 Extensions that *currently* call PyObject_Del have its old macro expansion ("_PyObject_Del((PyObject *)(op))") buried in them, so getting rid of _PyObject_Del is a binary-API incompatibility (existing extensions will no longer link without recompilation). I personally don't mind that, but I run on Windows and "binary compatability" never works there across minor releases for other reasons, so I don't have any real feel for how much people on other platforms value it. As you pointed out recently too, binary compatability has, in reality, not been the case since 1.5.2 anyway. So that's one for Python-Dev. If we do break binary compatibility, I'd be sorely tempted to change the "destructor" typedef to say destructors take void*. IMO saying they take PyObject* was a poor idea, as you almost never have a PyObject* when calling one of these guys. That's why PyObject_Del "had to" be a macro, to hide the cast to PyObject* almost everyone needs because of destructor's "correct" but impractical signature. If "destructor" had a practical signature, there would have been no temptation to use a macro. Note that if the typedef of destructor were so changed, you wouldn't have needed new casts in tp_free slots. And I'd rather break binary compatability than make extension authors add new casts. Hmm. I'm assigning this to Guido for comment: Guido, what are your feelings about binary compatibility here? C didn't define free() as taking a void* by mistake . Back to Neil: I wouldn't bother changing PyObject_Del to PyObject_Free. The former isn't in the "recommended" minimal API, but neither is it discouraged. I expect TMTOWTDI here forever. 
---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-04-06 21:41 Message: Logged In: YES user_id=31435 Oops -- I hit "Submit" prematurely. More to come. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-04-06 21:40 Message: Logged In: YES user_id=31435 Looks good to me -- thanks! ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=540394&group_id=5470 From noreply@sourceforge.net Mon Apr 8 21:35:24 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 08 Apr 2002 13:35:24 -0700 Subject: [Patches] [ python-Patches-541210 ] build info docs from tex sources Message-ID: Patches item #541210, was opened at 2002-04-08 20:35 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=541210&group_id=5470 Category: Documentation Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Matthias Klose (doko) Assigned to: Fred L. Drake, Jr. (fdrake) Summary: build info docs from tex sources Initial Comment: This patch (same as for 2.2) adds Milan Zamazal's conversion script and modifies the mkinfo script to build the info doc files from the latex sources. Currently, the mac, doc and inst tex files are not handled.
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=541210&group_id=5470 From noreply@sourceforge.net Mon Apr 8 22:08:38 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 08 Apr 2002 14:08:38 -0700 Subject: [Patches] [ python-Patches-523424 ] Finding "home" in "user.py" for Windows Message-ID: Patches item #523424, was opened at 2002-02-27 16:03 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=523424&group_id=5470 Category: Modules Group: None >Status: Closed >Resolution: Rejected Priority: 5 Submitted By: Gilles Lenfant (glenfant) Assigned to: Nobody/Anonymous (nobody) >Summary: Finding "home" in "user.py" for Windows Initial Comment: On my win2k French box + python 2.1.2:
>>> import user
>>> user.home
'C:\'
This isn't a great issue, but it means that all users of this win2k box will share the same ".pythonrc.py". The code provided by Jeff Bauer can be changed easily because the standard Python distro now has a "_winreg" module. This patch gives the real user's $HOME-like folder for any user on any Windows localization:
>>> import user
>>> user.home
u'C:\Documents and Settings\MyWindowsUsername\Mes documents'
This has been successfully tested with Win98 and Win2000. This should be tested on XP, NT4, and 95 but I can't. Sorry for the "context or unified diffs" (dunno what it means) but the module is short and my patch is clearly emphasized. ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-03-18 09:42 Message: Logged In: YES user_id=21627 If there are no further comments in favour of accepting this patch, it will be rejected. ---------------------------------------------------------------------- Comment By: Martin v.
Löwis (loewis) Date: 2002-02-27 23:13 Message: Logged In: YES user_id=21627 If it returns "My Documents", it is definitely *not* the home directory of the user; \Documents and Settings\username would be the home directory. Furthermore, on many installations, HOME *is* set, and it is the Administrator's choice where that points to; the typical installation (in a domain) indeed is to assign HOMEDRIVE. So I'm not in favour of that change. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=523424&group_id=5470 From noreply@sourceforge.net Mon Apr 8 22:16:57 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 08 Apr 2002 14:16:57 -0700 Subject: [Patches] [ python-Patches-541210 ] build info docs from tex sources Message-ID: Patches item #541210, was opened at 2002-04-08 16:35 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=541210&group_id=5470 Category: Documentation Group: Python 2.3 >Status: Closed >Resolution: Duplicate Priority: 5 Submitted By: Matthias Klose (doko) Assigned to: Fred L. Drake, Jr. (fdrake) Summary: build info docs from tex sources Initial Comment: This patch (same as for 2.2) adds Milan Zamazals conversion script and modifies the mkinfo script to build the info doc files from the latex sources. Currently, the mac, doc and inst tex files are not handled. ---------------------------------------------------------------------- >Comment By: Fred L. Drake, Jr. (fdrake) Date: 2002-04-08 17:16 Message: Logged In: YES user_id=3066 This is a duplicate patch; a duplicate report is not needed. 
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=541210&group_id=5470 From noreply@sourceforge.net Mon Apr 8 22:19:31 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 08 Apr 2002 14:19:31 -0700 Subject: [Patches] [ python-Patches-539487 ] build info docs from tex sources Message-ID: Patches item #539487, was opened at 2002-04-04 17:26 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539487&group_id=5470 Category: Documentation Group: Python 2.2.x Status: Open Resolution: None Priority: 5 Submitted By: Matthias Klose (doko) Assigned to: Fred L. Drake, Jr. (fdrake) Summary: build info docs from tex sources Initial Comment: This patch adds Milan Zamazals conversion script and modifies the mkinfo script to build the info doc files from the latex sources. Currently, the mac, doc and inst tex files are not handled. ---------------------------------------------------------------------- >Comment By: Fred L. Drake, Jr. (fdrake) Date: 2002-04-08 17:19 Message: Logged In: YES user_id=3066 I'll add a note here just in case: This patch applies to the 2.3 development as well as 2.2 maintenance tree. This still seems to suffer the problems that all versions of this conversion have suffered; it isn't portable between FSF Emacs and XEmacs. I'll see about installing FSF Emacs to see if it'll work for me there. 
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539487&group_id=5470 From noreply@sourceforge.net Mon Apr 8 22:24:11 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 08 Apr 2002 14:24:11 -0700 Subject: [Patches] [ python-Patches-539487 ] build info docs from tex sources Message-ID: Patches item #539487, was opened at 2002-04-04 17:26 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539487&group_id=5470 Category: Documentation Group: Python 2.2.x Status: Open Resolution: None Priority: 5 Submitted By: Matthias Klose (doko) Assigned to: Fred L. Drake, Jr. (fdrake) Summary: build info docs from tex sources Initial Comment: This patch adds Milan Zamazals conversion script and modifies the mkinfo script to build the info doc files from the latex sources. Currently, the mac, doc and inst tex files are not handled. ---------------------------------------------------------------------- >Comment By: Fred L. Drake, Jr. (fdrake) Date: 2002-04-08 17:24 Message: Logged In: YES user_id=3066 For the record, here's the specific errors I get when using XEmacs with this patch on the current release22-maint branch (hopefully SF won't munge them too badly): grendel(.../r22-maint/Doc); make EMACS=xemacs info cd info && make make[1]: Entering directory `/home/fdrake/projects/python/r22-maint/Doc/info' ../tools/mkinfo ../api/api.tex python-api.info xemacs -batch -q --no-site-file -l /home/fdrake/projects/python/r22-maint/Doc/tools/py2texi.el --eval (setq py2texi-dirs '("./" "../texinputs/" "/home/fdrake/projects/python/r22-maint/Doc/api")) --eval (py2texi "/home/fdrake/projects/python/r22-maint/Doc/api/api.tex") -f kill-emacs Loading /usr/lib/xemacs/xemacs-packages/lisp/site-start.d/aspell-init.el... Loading /usr/lib/xemacs/xemacs-packages/lisp/site-start.d/mew-init.el... 
Loading /usr/lib/xemacs/xemacs-packages/lisp/site-start.d/psgml-init.el... Loading /usr/lib/xemacs/xemacs-packages/lisp/site-start.d/xemacs-po-mode-init.el... Mark set Args out of range: 72, 132 xemacs exiting. make[1]: *** [python-api.info] Error 255 make[1]: Leaving directory `/home/fdrake/projects/python/r22-maint/Doc/info' make: *** [info] Error 2 ---------------------------------------------------------------------- Comment By: Fred L. Drake, Jr. (fdrake) Date: 2002-04-08 17:19 Message: Logged In: YES user_id=3066 I'll add a note here just in case: This patch applies to the 2.3 development as well as 2.2 maintenance tree. This still seems to suffer the problems that all versions of this conversion have suffered; it isn't portable between FSF Emacs and XEmacs. I'll see about installing FSF Emacs to see if it'll work for me there. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539487&group_id=5470 From noreply@sourceforge.net Mon Apr 8 22:30:06 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 08 Apr 2002 14:30:06 -0700 Subject: [Patches] [ python-Patches-512005 ] getrusage() returns struct-like object. Message-ID: Patches item #512005, was opened at 2002-02-02 03:41 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=512005&group_id=5470 Category: Modules Group: Python 2.3 >Status: Closed >Resolution: Accepted Priority: 5 Submitted By: Kirill Simonov (kirill_simonov) Assigned to: Nobody/Anonymous (nobody) Summary: getrusage() returns struct-like object. Initial Comment: The function resource.getrusage() now returns struct-like object (cf. os.stat() and time.gmtime()). This is my first patch for Python so please don't scorch me if something is wrong ;). ---------------------------------------------------------------------- >Comment By: Martin v. 
Löwis (loewis) Date: 2002-04-08 23:30 Message: Logged In: YES user_id=21627 Thanks for the patch, applied as libresource.tex 1.17 ACKS 1.165 NEWS 1.382 resource.c 2.24 ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=512005&group_id=5470 From noreply@sourceforge.net Tue Apr 9 09:05:28 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Tue, 09 Apr 2002 01:05:28 -0700 Subject: [Patches] [ python-Patches-539487 ] build info docs from tex sources Message-ID: Patches item #539487, was opened at 2002-04-04 22:26 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539487&group_id=5470 Category: Documentation Group: Python 2.2.x Status: Open Resolution: None Priority: 5 Submitted By: Matthias Klose (doko) Assigned to: Fred L. Drake, Jr. (fdrake) Summary: build info docs from tex sources Initial Comment: This patch adds Milan Zamazal's conversion script and modifies the mkinfo script to build the info doc files from the latex sources. Currently, the mac, doc and inst tex files are not handled. ---------------------------------------------------------------------- >Comment By: Matthias Klose (doko) Date: 2002-04-09 08:05 Message: Logged In: YES user_id=60903 Yes, I forgot to mention that Milan said it only works with Emacs. I built the info docs using emacs-21.2. ---------------------------------------------------------------------- Comment By: Fred L. Drake, Jr.
(fdrake) Date: 2002-04-08 21:24 Message: Logged In: YES user_id=3066 For the record, here's the specific errors I get when using XEmacs with this patch on the current release22-maint branch (hopefully SF won't munge them too badly): grendel(.../r22-maint/Doc); make EMACS=xemacs info cd info && make make[1]: Entering directory `/home/fdrake/projects/python/r22-maint/Doc/info' ../tools/mkinfo ../api/api.tex python-api.info xemacs -batch -q --no-site-file -l /home/fdrake/projects/python/r22-maint/Doc/tools/py2texi.el --eval (setq py2texi-dirs '("./" "../texinputs/" "/home/fdrake/projects/python/r22-maint/Doc/api")) --eval (py2texi "/home/fdrake/projects/python/r22-maint/Doc/api/api.tex") -f kill-emacs Loading /usr/lib/xemacs/xemacs-packages/lisp/site-start.d/aspell-init.el... Loading /usr/lib/xemacs/xemacs-packages/lisp/site-start.d/mew-init.el... Loading /usr/lib/xemacs/xemacs-packages/lisp/site-start.d/psgml-init.el... Loading /usr/lib/xemacs/xemacs-packages/lisp/site-start.d/xemacs-po-mode-init.el... Mark set Args out of range: 72, 132 xemacs exiting. make[1]: *** [python-api.info] Error 255 make[1]: Leaving directory `/home/fdrake/projects/python/r22-maint/Doc/info' make: *** [info] Error 2 ---------------------------------------------------------------------- Comment By: Fred L. Drake, Jr. (fdrake) Date: 2002-04-08 21:19 Message: Logged In: YES user_id=3066 I'll add a note here just in case: This patch applies to the 2.3 development as well as 2.2 maintenance tree. This still seems to suffer the problems that all versions of this conversion have suffered; it isn't portable between FSF Emacs and XEmacs. I'll see about installing FSF Emacs to see if it'll work for me there. 
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539487&group_id=5470 From noreply@sourceforge.net Tue Apr 9 14:39:08 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Tue, 09 Apr 2002 06:39:08 -0700 Subject: [Patches] [ python-Patches-539949 ] dict.popitem(key=None) Message-ID: Patches item #539949, was opened at 2002-04-05 19:38 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539949&group_id=5470 Category: Core (C code) Group: Python 2.3 Status: Open Resolution: Remind Priority: 5 Submitted By: Raymond Hettinger (rhettinger) Assigned to: Guido van Rossum (gvanrossum) Summary: dict.popitem(key=None) Initial Comment: This patch implements the feature request at http://sourceforge.net/tracker/index.php?func=detail&aid=495086&group_id=5470&atid=355470 which asks for an optional argument to popitem so that it returns a key/value pair for a specified key or, if not specified, an arbitrary key. The benefit is in providing a fast, explicit way to retrieve and remove a particular key/value pair from a dictionary. By using only a single lookup, it is faster than the usual Python code:

    value = d[key]
    del d[key]
    return (key, value)

which now becomes:

    return d.popitem(key)

There is no magic or new code in the implementation -- it uses a few lines each from getitem, delitem, and popitem. If an argument is specified, the new code is run; otherwise, the existing code is run. This ensures that the patch does not cause a performance penalty. The diff is about -3 lines and +25 lines. There are four sections:
1. Replacement code for dict_popitem in dictobject.c
2. Replacement docstring for popitem in dictobject.c
3. Replacement registration line for popitem in dictobject.c
4. Sample Python test code.
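[Editorial note: the two-lookup idiom in the initial comment and the single popitem(key) call it collapses to can be modeled in pure Python. This is an illustrative stand-in for the C patch, not the patch itself -- the real dict_popitem does both lookups as one:

```python
def popitem_with_key(d, key):
    """Model of the proposed d.popitem(key): remove key, return (key, value)."""
    value = d[key]    # lookup #1
    del d[key]        # lookup #2 -- the C patch collapses both into one
    return (key, value)

d = {'x': 10, 'y': 20}
assert popitem_with_key(d, 'x') == ('x', 10)
assert d == {'y': 20}
```

The returned (key, value) pair feeds directly into d.__setitem__ or a dict(itemlist) constructor, per the use cases discussed below.]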
---------------------------------------------------------------------- >Comment By: Raymond Hettinger (rhettinger) Date: 2002-04-09 13:39 Message: Logged In: YES user_id=80475 Here is a revised patch for D.pop() with hard tabs and corrected reference counts. In a DEBUG build, I validated the ref counts against the equivalent steps: vv=d[k]; del d[k]. And, after Tim's suggestions, the code is fast and light. In addition to d.pop(k), GvR's patch for d.popitem(k) should also go in. The (k,v) return value feeds directly into d.__setitem__ or a dict(itemlist) constructor (see the code fragments in the 4/6/02 post). The only downside is the time to process METH_VARARGS. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-04-08 16:46 Message: Logged In: YES user_id=31435 Getting closer! Two more questions:
+ Why switch from tabs to spaces? The rest of this file uses hard tabs, and that's what Guido prefers in C source.
+ Think hard about whether we really want to decref the value -- I doubt we do, as we're *transferring* ownership of the value from the dict to the caller.
---------------------------------------------------------------------- Comment By: Raymond Hettinger (rhettinger) Date: 2002-04-08 14:14 Message: Logged In: YES user_id=80475 Here is a revised patch for D.pop() incorporating Tim's ideas:
+ Docstring spelling fixed
+ Switched to METH_O instead of METH_VARARGS
+ Delayed decref until the dict entry is in a consistent state
+ Removed unused int i=0 variable
+ Tabs replaced with spaces
---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-04-07 21:54 Message: Logged In: YES user_id=31435 I like Raymond's new pop(). Problems:
+ "speficied" is misspelled in the docstring.
+ Should be declared METH_O, not METH_VARARGS (mimic how, e.g., dict_update is set up).
+ The decrefs have to be reworked: a decref can trigger calls back into arbitrary Python code, due to __del__ methods getting invoked. This means you can never leave any live object in an insane or inconsistent state *during* a decref. What you need to do instead is first capture the key and value into local variables, plug dummy and NULL into the dict slot, and decrement the used count. This leaves the dict in a consistent state again. Only then is it safe to decref the key and value. ---------------------------------------------------------------------- Comment By: Raymond Hettinger (rhettinger) Date: 2002-04-07 03:00 Message: Logged In: YES user_id=80475 Here's a more fleshed-out implementation of D.pop(). It doesn't rely on popitem(), doesn't malloc a tuple, and the refcounts should be correct. One change from Neil's version: since k isn't being returned, an arbitrary pair doesn't make sense, so the key argument to pop is required rather than optional. The diff is off of 2.123. ---------------------------------------------------------------------- Comment By: Neil Schemenauer (nascheme) Date: 2002-04-07 01:51 Message: Logged In: YES user_id=35752 Here's a quick implementation. D.pop() is not as efficient as it could be (it uses popitem and then promptly deallocates the item tuple). I'm not sure it matters though. Someone should probably check the refcounts. I always screw them up. :-) ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-07 01:16 Message: Logged In: YES user_id=6380 Not a bad idea, Neil! Care to work the code around to implement that?
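Tim's warning about decref ordering can be felt even from pure Python: in CPython, a __del__ hook runs as soon as the last reference disappears, so it can observe the dict while the deletion is in progress. A small demonstration (CPython-specific, since it relies on immediate refcount-based finalization):

```python
observed = []

class Peeker:
    def __init__(self, d):
        self.d = d
    def __del__(self):
        # Arbitrary Python code runs here, during the decref of the
        # value being removed; it can inspect the dict being modified.
        observed.append(len(self.d))

d = {}
d['k'] = Peeker(d)
del d['k']   # the entry is removed *before* the value is decref'ed,
             # so __del__ sees a consistent, already-empty dict
print(observed)   # [0]
```

This is exactly the invariant Tim describes: the C code must put the dict back into a sane state (dummy key, NULL value, decremented used count) before the decrefs, because those decrefs can re-enter arbitrary Python.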
---------------------------------------------------------------------- Comment By: Neil Schemenauer (nascheme) Date: 2002-04-07 01:14 Message: Logged In: YES user_id=35752 I think this should be implemented as pop() instead: D.pop([key]) -> value -- remove and return value by key (default a random value) It makes no sense to return the key when you already have it. pop() also matches well with list pop(): L.pop([index]) -> item -- remove and return item at index (default last) ---------------------------------------------------------------------- Comment By: Raymond Hettinger (rhettinger) Date: 2002-04-07 01:08 Message: Logged In: YES user_id=80475 The tests and documentation patches have been added. ---------------------------------------------------------------------- Comment By: Raymond Hettinger (rhettinger) Date: 2002-04-06 17:23 Message: Logged In: YES user_id=80475 Q: Does the new function signature slow the existing no argument case? A: Yes. The function is already so fast, that the small overhead of PyArg_ParseTuple is measurable. My timing shows a 8% drop in speed. Q: Is _,v=d.popitem(k) slower than v=d.popvalue(k)? A: Yes. Though popvalue is a non-existing strawman, it would be quicker: it would cost two calls to Py_DECREF while saving a call to PyTuple_New and two calls to PyTuple_SET_ITEM. Still, the running time for popvalue would be dominated by the rest of the function and not the single malloc. Also, I think it unlikely that the dictionary interface would ever be expanded for popvalue, so the comparison is moot. Q: Are there cases where (k,v) is needed? A: Yes. One common case is where the tuple still needs to be formed to help build another dictionary: dict([d.popitem(k) for k in xferlist]) or [n.__setitem__(d.popitem(k)) for k in xferlist]. Also, it is useful when the key is computed by a function and then needs to be used in an expression. 
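Neil's D.pop([key]) proposal above is essentially what later shipped as dict.pop, except that the adopted API takes an optional *default* rather than falling back to an arbitrary value; the list.pop parallel he draws holds as well:

```python
L = [10, 20, 30]
assert L.pop() == 30          # default: last item
assert L.pop(0) == 10         # or by index

d = {'a': 1, 'b': 2}
assert d.pop('a') == 1        # remove and return the value for 'a'
assert 'a' not in d
assert d.pop('z', 99) == 99   # as adopted: a default, not a random value
```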
I often do something like that with setdefault: uniqInOrder= [u.setdefault(k,k) for k in alist if k not in u]. Also, when the key is computed by a function, it may need to be saved only when .popitem succeeds but not when the key is missing: "get and remove key if present; trigger exception if absent" This pattern is used in validating user input keys for deletion. Q: Where is the unittest and doc patch? A: Coming this weekend. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-04-05 21:50 Message: Logged In: YES user_id=31435 Are there examples of concrete use cases? The idea that dict.popitem(k) returns (k, dict[k]) seems kinda goofy, since you necessarily already have k. So the question is whether this is the function signature that's really desired, or whether it's too much a hack. As is, it slows down popitem() without an argument because it requires using a fancier calling sequence, and because it now defers that case to a taken branch; it's also much slower than a function that just returned v could be, due to the need to allocate a 2-tuple to hold a redundant copy of the key. Perhaps there are use cases of the form k, v = dict.popitem(f(x, y, z)) where the key is known only implicitly? ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-05 21:47 Message: Logged In: YES user_id=6380 FYI, I'm uploading my version of the patch, with code cleanup, as popdict2.txt. I've moved the popitem-with-arg code before the allocation of res, because there were several places where this code returned NULL without DECREF'ing res. Repeating the PyTuple_New(2) call seemed the lesser evil. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-05 21:38 Message: Logged In: YES user_id=6380 I've reviewed the patch and see only cosmetic things that need to be changed. 
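Raymond's uniqInOrder idiom above, written out: the setdefault call both records the key in u and supplies the list element, so each key survives only on its first appearance:

```python
alist = ['b', 'a', 'b', 'c', 'a']
u = {}
uniq_in_order = [u.setdefault(k, k) for k in alist if k not in u]
print(uniq_in_order)   # ['b', 'a', 'c']
```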
I'll check it in as soon as you submit a unittest and doc patch. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-05 21:26 Message: Logged In: YES user_id=6380 Now, if you could also upload a unittest and a doc patch, that would be great! ---------------------------------------------------------------------- Comment By: Raymond Hettinger (rhettinger) Date: 2002-04-05 21:10 Message: Logged In: YES user_id=80475 Context diff uploaded at poppatch.c below. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-05 20:11 Message: Logged In: YES user_id=6380 Please upload a context or unified diff. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539949&group_id=5470 From noreply@sourceforge.net Tue Apr 9 15:17:47 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Tue, 09 Apr 2002 07:17:47 -0700 Subject: [Patches] [ python-Patches-539487 ] build info docs from tex sources Message-ID: Patches item #539487, was opened at 2002-04-04 17:26 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539487&group_id=5470 Category: Documentation Group: Python 2.2.x >Status: Closed >Resolution: Rejected Priority: 5 Submitted By: Matthias Klose (doko) Assigned to: Fred L. Drake, Jr. (fdrake) Summary: build info docs from tex sources Initial Comment: This patch adds Milan Zamazals conversion script and modifies the mkinfo script to build the info doc files from the latex sources. Currently, the mac, doc and inst tex files are not handled. ---------------------------------------------------------------------- >Comment By: Fred L. Drake, Jr. 
(fdrake) Date: 2002-04-09 10:17 Message: Logged In: YES user_id=3066 I just installed emacs 20.7 'cause those are the RPMs that came with the distro I have on this box (RedHat 7.2), and that produced a similar error. I'll have to ask that a more robust patch be available before I can spend more time on it; this one will be marked as rejected. Until then, I'm glad to publish contributed GNU info versions provided by community members. For the record, here's the specific error output I got and the FSF Emacs version info: grendel(.../r22-maint/Doc); make info cd info && make make[1]: Entering directory `/home/fdrake/projects/python/r22-maint/Doc/info' ../tools/mkinfo ../api/api.tex python-api.info emacs -batch -q --no-site-file -l /home/fdrake/projects/python/r22-maint/Doc/tools/py2texi.el --eval (setq py2texi-dirs '("./" "../texinputs/" "/home/fdrake/projects/python/r22-maint/Doc/api")) --eval (py2texi "/home/fdrake/projects/python/r22-maint/Doc/api/api.tex") -f kill-emacs Mark set Args out of range: 27914, 27916 make[1]: *** [python-api.info] Error 255 make[1]: Leaving directory `/home/fdrake/projects/python/r22-maint/Doc/info' make: *** [info] Error 2 [2] grendel(.../r22-maint/Doc); emacs --version GNU Emacs 20.7.1 Copyright (C) 1999 Free Software Foundation, Inc. GNU Emacs comes with ABSOLUTELY NO WARRANTY. You may redistribute copies of Emacs under the terms of the GNU General Public License. For more information about these matters, see the file named COPYING. ---------------------------------------------------------------------- Comment By: Matthias Klose (doko) Date: 2002-04-09 04:05 Message: Logged In: YES user_id=60903 Yes, I forgot to mention that Milan said it only works with FSF Emacs. I built the info docs using emacs-21.2. ---------------------------------------------------------------------- Comment By: Fred L. Drake, Jr.
(fdrake) Date: 2002-04-08 17:24 Message: Logged In: YES user_id=3066 For the record, here's the specific errors I get when using XEmacs with this patch on the current release22-maint branch (hopefully SF won't munge them too badly): grendel(.../r22-maint/Doc); make EMACS=xemacs info cd info && make make[1]: Entering directory `/home/fdrake/projects/python/r22-maint/Doc/info' ../tools/mkinfo ../api/api.tex python-api.info xemacs -batch -q --no-site-file -l /home/fdrake/projects/python/r22-maint/Doc/tools/py2texi.el --eval (setq py2texi-dirs '("./" "../texinputs/" "/home/fdrake/projects/python/r22-maint/Doc/api")) --eval (py2texi "/home/fdrake/projects/python/r22-maint/Doc/api/api.tex") -f kill-emacs Loading /usr/lib/xemacs/xemacs-packages/lisp/site-start.d/aspell-init.el... Loading /usr/lib/xemacs/xemacs-packages/lisp/site-start.d/mew-init.el... Loading /usr/lib/xemacs/xemacs-packages/lisp/site-start.d/psgml-init.el... Loading /usr/lib/xemacs/xemacs-packages/lisp/site-start.d/xemacs-po-mode-init.el... Mark set Args out of range: 72, 132 xemacs exiting. make[1]: *** [python-api.info] Error 255 make[1]: Leaving directory `/home/fdrake/projects/python/r22-maint/Doc/info' make: *** [info] Error 2 ---------------------------------------------------------------------- Comment By: Fred L. Drake, Jr. (fdrake) Date: 2002-04-08 17:19 Message: Logged In: YES user_id=3066 I'll add a note here just in case: This patch applies to the 2.3 development as well as 2.2 maintenance tree. This still seems to suffer the problems that all versions of this conversion have suffered; it isn't portable between FSF Emacs and XEmacs. I'll see about installing FSF Emacs to see if it'll work for me there. 
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539487&group_id=5470 From noreply@sourceforge.net Tue Apr 9 15:20:24 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Tue, 09 Apr 2002 07:20:24 -0700 Subject: [Patches] [ python-Patches-539486 ] build info docs from sources Message-ID: Patches item #539486, was opened at 2002-04-04 17:25 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539486&group_id=5470 Category: Documentation Group: Python 2.1.2 >Status: Pending Resolution: None Priority: 5 Submitted By: Matthias Klose (doko) Assigned to: Fred L. Drake, Jr. (fdrake) Summary: build info docs from sources Initial Comment: This patch adds Milan Zamazals conversion script and modifies the mkinfo script to build the info doc files from the latex sources. Currently, the mac, doc and inst tex files are not handled. ---------------------------------------------------------------------- >Comment By: Fred L. Drake, Jr. (fdrake) Date: 2002-04-09 10:20 Message: Logged In: YES user_id=3066 Is this essentially the same patch as the 2.2.x version, with the differences being in the comprehension of the generated HTML? If so, I'll reject this on the same grounds (insufficiently robust to spend time on). If this is less fragile, I'll consider it for the 2.1.x tree independently. 
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539486&group_id=5470 From noreply@sourceforge.net Tue Apr 9 17:27:21 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Tue, 09 Apr 2002 09:27:21 -0700 Subject: [Patches] [ python-Patches-540394 ] Remove PyMalloc_* symbols Message-ID: Patches item #540394, was opened at 2002-04-06 20:07 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=540394&group_id=5470 Category: Core (C code) Group: None Status: Open Resolution: Accepted Priority: 5 Submitted By: Neil Schemenauer (nascheme) >Assigned to: Neil Schemenauer (nascheme) Summary: Remove PyMalloc_* symbols Initial Comment: This patch removes all PyMalloc_* symbols from the source. obmalloc now implements PyObject_{Malloc, Realloc, Free}. PyObject_{New,NewVar} allocate using pymalloc. I also changed PyObject_Del and PyObject_GC_Del so that they can be used as function designators. Is changing the signature of PyObject_Del going to cause any problems? I had to add some extra typecasts when assigning to tp_free. Please review and assign back to me. The next phase would be to clean up the memory API usage. Do we want to replace all PyObject_Del calls with PyObject_Free? PyObject_Del seems to match better with PyObject_GC_Del. Oh yes, we also need to change PyMem_{Free, Del, ...} to use pymalloc's free. ---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-09 12:27 Message: Logged In: YES user_id=6380 I've not fully read Tim's response in email, but instead I've reviewed and discussed the patch with Tim. I think the only thing to which I object at this point is the removal of the entry point _PyObject_Del.
I believe that for source and binary compatibility with 2.2, that entry point should remain, with the same meaning, but it should not be used at all by the core. (Motivation to keep it: it's the only thing you can reasonably stick in tp_free that works for 2.2 as well as for 2.3.) One minor question: there are a bunch of #undefs in gcmodule.c (e.g. PyObject_GC_Track) that don't seem to make sense -- at least I cannot find where these would be #defined any more. Ditto for #undef PyObject_Malloc in obmalloc.c. I suggest that you check this thing in, but keeping _PyObject_Del alive, and we'll take it from there. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-08 15:18 Message: Logged In: YES user_id=6380 (Wouldn't it be more efficient to take this to email between the three of us?) > Extensions that *currently* call PyObject_Del have > its old macro expansion ("_PyObject_Del((PyObject > *)(op))") buried in them, so getting rid of > _PyObject_Del is a binary-API incompatibility > (existing extensions will no longer link without > recompilation). I personally don't mind that, but > I run on Windows and "binary compatability" never > works there across minor releases for other > reasons, so I don't have any real feel for how > much people on other platforms value it. As you > pointed out recently too, binary compatability > has, in reality, not been the case since 1.5.2 > anyway. Still, tradition has it that we keep such entry points around for a long time. I propose that we do so now, too. > So that's one for Python-Dev. If we do break > binary compatibility, I'd be sorely tempted to > change the "destructor" typedef to say destructors > take void*. IMO saying they take PyObject* was a > poor idea, as you almost never have a PyObject* > when calling one of these guys. Huh?
"destructor" is used to declare tp_dealloc, which definitely needs a PyObject * (or some "subclass" of it, like PyIntObject *). It's also used to declare tp_free, which arguably shouldn't take a PyObject * (since by the time tp_free is called, most of the object's contents have been destroyed by tp_dealloc). So maybe tp_free (a newcomer in 2.2) should be declared to take something else, but then the risk is breaking code that defines a tp_free with the correct signature. > That's why PyObject_Del "had to" be a macro, to > hide the cast to PyObject* almost everyone needs > because of destructor's "correct" but impractical > signature. If "destructor" had a practical > signature, there would have been no temptation to > use a macro. I don't understand this at all. > Note that if the typedef of destructor were so > changed, you wouldn't have needed new casts in > tp_free slots. And I'd rather break binary > compatability than make extension authors add new > casts. Nor this. > Hmm. I'm assigning this to Guido for comment: > Guido, what are your feelings about binary > compatibility here? C didn't define free() as > taking a void* by mistake . I want binary compatibility, but I don't understand your comments very well. > Back to Neil: I wouldn't bother changing PyObject_Del > to PyObject_Free. The former isn't in the > "recommended" minimal API, but neither is it > discouraged. I expect TMTOWTDI here forever. I prefer PyObject_Del -- like PyObject_GC_Del, and like we did in the past. Plus, I like New to match Del and Malloc to match Free. Since it's PyObject_New, it should be _Del. I'm not sure what to say of Neil's patch, except that I'm glad to be rid of the PyMalloc_XXX family. I wish we didn't have to change all the places that used to say _PyObject_Del. Maybe it's best to keep that name around? The patch would (psychologically) become a lot smaller. I almost wish that this would work: #define PyObject_Del ((destructor)PyObject_Free) Or maybe it *does* work??? 
---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-08 14:47 Message: Logged In: YES user_id=6380 I'm looking at this now... ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-04-06 21:59 Message: Logged In: YES user_id=31435 Extensions that *currently* call PyObject_Del have its old macro expansion ("_PyObject_Del((PyObject *)(op))") buried in them, so getting rid of _PyObject_Del is a binary-API incompatibility (existing extensions will no longer link without recompilation). I personally don't mind that, but I run on Windows and "binary compatibility" never works there across minor releases for other reasons, so I don't have any real feel for how much people on other platforms value it. As you pointed out recently too, binary compatibility has, in reality, not been the case since 1.5.2 anyway. So that's one for Python-Dev. If we do break binary compatibility, I'd be sorely tempted to change the "destructor" typedef to say destructors take void*. IMO saying they take PyObject* was a poor idea, as you almost never have a PyObject* when calling one of these guys. That's why PyObject_Del "had to" be a macro, to hide the cast to PyObject* almost everyone needs because of destructor's "correct" but impractical signature. If "destructor" had a practical signature, there would have been no temptation to use a macro. Note that if the typedef of destructor were so changed, you wouldn't have needed new casts in tp_free slots. And I'd rather break binary compatibility than make extension authors add new casts. Hmm. I'm assigning this to Guido for comment: Guido, what are your feelings about binary compatibility here? C didn't define free() as taking a void* by mistake. Back to Neil: I wouldn't bother changing PyObject_Del to PyObject_Free. The former isn't in the "recommended" minimal API, but neither is it discouraged.
I expect TMTOWTDI here forever. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-04-06 21:41 Message: Logged In: YES user_id=31435 Oops -- I hit "Submit" prematurely. More to come. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-04-06 21:40 Message: Logged In: YES user_id=31435 Looks good to me -- thanks! ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=540394&group_id=5470 From noreply@sourceforge.net Tue Apr 9 17:30:06 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Tue, 09 Apr 2002 09:30:06 -0700 Subject: [Patches] [ python-Patches-536278 ] force gzip to open files with 'b' Message-ID: Patches item #536278, was opened at 2002-03-28 08:53 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=536278&group_id=5470 Category: Library (Lib) Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Skip Montanaro (montanaro) Assigned to: Nobody/Anonymous (nobody) Summary: force gzip to open files with 'b' Initial Comment: It doesn't make sense that the gzip module should try to open a file in text mode. The attached patch forces a 'b' into the file open mode if it wasn't given. I also modified the test slightly to try and tickle this code, but I can't test it very effectively, because I don't do Windows... :-) ---------------------------------------------------------------------- >Comment By: Skip Montanaro (montanaro) Date: 2002-04-09 11:30 Message: Logged In: YES user_id=44345 good point. updated patch. 
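The fix being discussed is tiny; one way to express the guard (a sketch of the idea, matching the refinement Tim suggests below, not necessarily the literal patch) is:

```python
def force_binary(mode):
    """Force 'b' into a non-empty gzip open mode.

    Empty or None modes are left alone so the caller's defaults
    still apply -- the "and mode" guard handles Neal's '' case.
    """
    if mode and 'b' not in mode:
        mode += 'b'
    return mode

assert force_binary('r') == 'rb'
assert force_binary('wb') == 'wb'
assert force_binary('') == ''       # left for the default handling
assert force_binary(None) is None
```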
---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-03-30 22:12 Message: Logged In: YES user_id=31435 I suggest fixing this via changing the test to if mode and 'b' not in mode: Then mode=None and mode='' will be left alone (as Neal says, the code already does the right thing for those). ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-03-28 09:04 Message: Logged In: YES user_id=33168 There is a problem (sorry, I have an evil mind). :-) If '' is passed as the mode, before the patch, this would have been converted to 'rb'. After the patch, mode will become 'b' and that will raise an exception: >>> open('/dev/null', 'b') IOError: [Errno 22] Invalid argument: b If you add an (and mode) condition, that should be fine. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=536278&group_id=5470 From noreply@sourceforge.net Tue Apr 9 19:47:39 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Tue, 09 Apr 2002 11:47:39 -0700 Subject: [Patches] [ python-Patches-540394 ] Remove PyMalloc_* symbols Message-ID: Patches item #540394, was opened at 2002-04-06 20:07 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=540394&group_id=5470 Category: Core (C code) Group: None Status: Open Resolution: Accepted Priority: 5 Submitted By: Neil Schemenauer (nascheme) Assigned to: Neil Schemenauer (nascheme) Summary: Remove PyMalloc_* symbols Initial Comment: This patch removes all PyMalloc_* symbols from the source. obmalloc now implements PyObject_{Malloc, Realloc, Free}. PyObject_{New,NewVar} allocate using pymalloc. I also changed PyObject_Del and PyObject_GC_Del so that they can be used as function designators. Is changing the signature of PyObject_Del going to cause any problems?
I had to add some extra typecasts when assigning to tp_free. Please review and assign back to me. The next phase would be to clean up the memory API usage. Do we want to replace all PyObject_Del calls with PyObject_Free? PyObject_Del seems to match better with PyObject_GC_Del. Oh yes, we also need to change PyMem_{Free, Del, ...} to use pymalloc's free. ---------------------------------------------------------------------- >Comment By: Tim Peters (tim_one) Date: 2002-04-09 14:47 Message: Logged In: YES user_id=31435 Clarifying or just repeating Guido here: + Binary compatibility is important. It's better on Unix than it appears -- while you'll get a warning if you run an old 1.5.2 extension with 2.2 today and without recompiling, it will almost certainly work anyway. So in the case of macros that expanded to a private API function before, that private API function must still exist, but the macro needn't expand to that anymore (nor even *be* a macro anymore). _PyObject_Del is a particular problem cuz it's even documented in the C API manual -- there simply wasn't a public API function before that did the same thing and could be used as a function designator. You're making life better for future generations. + Casts on tp_free slots are par for the course, because "destructor" has an impractical signature. I'm afraid that can't change either, so the casts stay. + Fred and I agreed to add PyObject_Del to the "minimal recommended API", so, for the next round of this, feel wholly righteous in leaving existing PyObject_Del calls alone. If anything's unclear, hit me.
I believe that for source and binary compatibility with 2.2, that entry point should remain, with the same meaning, but it should not be used at all by the core. (Motivation to keep it: it's the only thing you can reasonably stick in tp_free that works for 2.2 as well as for 2.3.) One minor question: there are a bunch of #undefs in gcmodule.c (e.g. PyObject_GC_Track) that don't seem to make sense -- at least I cannot find where these would be #defined any more. Ditto for #undef PyObject_Malloc in obmalloc.c. I suggest that you check this thing in, but keeping _PyObject_Del alive, and we'll take it from there. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-08 15:18 Message: Logged In: YES user_id=6380 (Wouldn't it be more efficient to take this to email between the three of us?) > Extensions that *currently* call PyObject_Del have > its old macro expansion ("_PyObject_Del((PyObject > *)(op))") buried in them, so getting rid of > _PyObject_Del is a binary-API incompatibility > (existing extensions will no longer link without > recompilation). I personally don't mind that, but > I run on Windows and "binary compatability" never > works there across minor releases for other > reasons, so I don't have any real feel for how > much people on other platforms value it. As you > pointed out recently too, binary compatability > has, in reality, not been the case since 1.5.2 > anyway. Still, tradition has it that we keep such entry points around for a long time. I propose that we do so now, too. > So that's one for Python-Dev. If we do break > binary compatibility, I'd be sorely tempted to > change the "destructor" typedef to say destructors > take void*. IMO saying they take PyObject* was a > poor idea, as you almost never have a PyObject* > when calling one of these guys. Huh?
"destructor" is used to declare tp_dealloc, which definitely needs a PyObject * (or some "subclass" of it, like PyIntObject *). It's also used to declare tp_free, which arguably shouldn't take a PyObject * (since by the time tp_free is called, most of the object's contents have been destroyed by tp_dealloc). So maybe tp_free (a newcomer in 2.2) should be declared to take something else, but then the risk is breaking code that defines a tp_free with the correct signature. > That's why PyObject_Del "had to" be a macro, to > hide the cast to PyObject* almost everyone needs > because of destructor's "correct" but impractical > signature. If "destructor" had a practical > signature, there would have been no temptation to > use a macro. I don't understand this at all. > Note that if the typedef of destructor were so > changed, you wouldn't have needed new casts in > tp_free slots. And I'd rather break binary > compatability than make extension authors add new > casts. Nor this. > Hmm. I'm assigning this to Guido for comment: > Guido, what are your feelings about binary > compatibility here? C didn't define free() as > taking a void* by mistake . I want binary compatibility, but I don't understand your comments very well. > Back to Neil: I wouldn't bother changing PyObject_Del > to PyObject_Free. The former isn't in the > "recommended" minimal API, but neither is it > discouraged. I expect TMTOWTDI here forever. I prefer PyObject_Del -- like PyObject_GC_Del, and like we did in the past. Plus, I like New to match Del and Malloc to match Free. Since it's PyObject_New, it should be _Del. I'm not sure what to say of Neil's patch, except that I'm glad to be rid of the PyMalloc_XXX family. I wish we didn't have to change all the places that used to say _PyObject_Del. Maybe it's best to keep that name around? The patch would (psychologically) become a lot smaller. I almost wish that this would work: #define PyObject_Del ((destructor)PyObject_Free) Or maybe it *does* work??? 
---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-08 14:47 Message: Logged In: YES user_id=6380 I'm looking at this now... ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-04-06 21:59 Message: Logged In: YES user_id=31435 Extensions that *currently* call PyObject_Del have its old macro expansion ("_PyObject_Del((PyObject *)(op))") buried in them, so getting rid of _PyObject_Del is a binary-API incompatibility (existing extensions will no longer link without recompilation). I personally don't mind that, but I run on Windows and "binary compatibility" never works there across minor releases for other reasons, so I don't have any real feel for how much people on other platforms value it. As you pointed out recently too, binary compatibility has, in reality, not been the case since 1.5.2 anyway. So that's one for Python-Dev. If we do break binary compatibility, I'd be sorely tempted to change the "destructor" typedef to say destructors take void*. IMO saying they take PyObject* was a poor idea, as you almost never have a PyObject* when calling one of these guys. That's why PyObject_Del "had to" be a macro, to hide the cast to PyObject* almost everyone needs because of destructor's "correct" but impractical signature. If "destructor" had a practical signature, there would have been no temptation to use a macro. Note that if the typedef of destructor were so changed, you wouldn't have needed new casts in tp_free slots. And I'd rather break binary compatibility than make extension authors add new casts. Hmm. I'm assigning this to Guido for comment: Guido, what are your feelings about binary compatibility here? C didn't define free() as taking a void* by mistake. Back to Neil: I wouldn't bother changing PyObject_Del to PyObject_Free. The former isn't in the "recommended" minimal API, but neither is it discouraged.
I expect TMTOWTDI here forever. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-04-06 21:41 Message: Logged In: YES user_id=31435 Oops -- I hit "Submit" prematurely. More to come. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-04-06 21:40 Message: Logged In: YES user_id=31435 Looks good to me -- thanks! ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=540394&group_id=5470 From noreply@sourceforge.net Tue Apr 9 20:15:40 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Tue, 09 Apr 2002 12:15:40 -0700 Subject: [Patches] [ python-Patches-541694 ] whichdb unittest Message-ID: Patches item #541694, was opened at 2002-04-09 15:15 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=541694&group_id=5470 Category: Tests Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Gregory H. Ball (greg_ball) Assigned to: Nobody/Anonymous (nobody) Summary: whichdb unittest Initial Comment: Attached patch is a first crack at a unit test for whichdb. I think that all functionality required for use by the anydbm module is tested, but only for the database modules found in a given installation. The test case is built up at runtime to cover all the available modules, so it is a bit introspective, but I think it is obvious that it should run correctly. Unfortunately it crashes on my box (Redhat 6.2) and this seems to be a real problem with whichdb: it assumes things about the dbm format which turn out to be wrong sometimes. I only discovered this because test_anydbm was crashing, when whichdb failed to work on dbm files. It would not have crashed if dbhash was available... and dbhash was not available because bsddb was not built correctly. 
So I think there is a build bug there, but I have little idea how to solve that one at this point. Would I be correct in thinking that if this test really uncovers bugs in whichdb, it can't be checked in until they are fixed? Unfortunately I don't know much about the various databases, but I'll try to work with someone on it. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=541694&group_id=5470 From noreply@sourceforge.net Tue Apr 9 20:16:46 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Tue, 09 Apr 2002 12:16:46 -0700 Subject: [Patches] [ python-Patches-541694 ] whichdb unittest Message-ID: Patches item #541694, was opened at 2002-04-09 15:15 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=541694&group_id=5470 Category: Tests Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Gregory H. Ball (greg_ball) >Assigned to: Skip Montanaro (montanaro) Summary: whichdb unittest ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=541694&group_id=5470 From noreply@sourceforge.net Tue Apr 9 21:29:15 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Tue, 09 Apr 2002 13:29:15 -0700 Subject: [Patches] [ python-Patches-540394 ] Remove PyMalloc_* symbols Message-ID: Patches item #540394, was opened at 2002-04-07 01:07 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=540394&group_id=5470 Category: Core (C code) Group: None Status: Open Resolution: Accepted Priority: 5 Submitted By: Neil Schemenauer (nascheme) Assigned to: Neil Schemenauer (nascheme) Summary: Remove PyMalloc_* symbols Initial Comment: This patch removes all PyMalloc_* symbols from the source. obmalloc now implements PyObject_{Malloc, Realloc, Free}. PyObject_{New,NewVar} allocate using pymalloc. I also changed PyObject_Del and PyObject_GC_Del so that they can be used as function designators. Is changing the signature of PyObject_Del going to cause any problems? I had to add some extra typecasts when assigning to tp_free. Please review and assign back to me. The next phase would be to clean up the memory API usage. Do we want to replace all PyObject_Del calls with PyObject_Free? PyObject_Del seems to match better with PyObject_GC_Del. Oh yes, we also need to change PyMem_{Free, Del, ...} to use pymalloc's free. ---------------------------------------------------------------------- >Comment By: Neil Schemenauer (nascheme) Date: 2002-04-09 20:29 Message: Logged In: YES user_id=35752 It might be a day or two before I get to this.
Regarding the type of tp_free, could we change it to be something like:

    typedef void (*freefunc)(void *);
    ...
    freefunc tp_free;

and leave the type of tp_dealloc alone. Maybe it's too late now that 2.2 is out and uses 'destructor'. I don't see how this relates to binary compatibility though. Why does it matter if the function takes a PyObject pointer or a void pointer? The worst I see happening is that people could get warnings when they compile their extension modules. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-04-09 18:47 Message: Logged In: YES user_id=31435 Clarifying or just repeating Guido here: + Binary compatibility is important. It's better on Unix than it appears -- while you'll get a warning if you run an old 1.5.2 extension with 2.2 today and without recompiling, it will almost certainly work anyway. So in the case of macros that expanded to a private API function before, that private API function must still exist, but the macro needn't expand to that anymore (nor even *be* a macro anymore). _PyObject_Del is a particular problem cuz it's even documented in the C API manual -- there simply wasn't a public API function before that did the same thing and could be used as a function designator. You're making life better for future generations. + Casts on tp_free slots are par for the course, because "destructor" has an impractical signature. I'm afraid that can't change either, so the casts stay. + Fred and I agreed to add PyObject_Del to the "minimal recommended API", so, for the next round of this, feel wholly righteous in leaving existing PyObject_Del calls alone. If anything's unclear, hit me. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-09 16:27 Message: Logged In: YES user_id=6380 I've not fully read Tim's response in email, but instead I've reviewed and discussed the patch with Tim.
I think the only thing to which I object at this point is the removal of the entry point _PyObject_Del. I believe that for source and binary compatibility with 2.2, that entry point should remain, with the same meaning, but it should not be used at all by the core. (Motivation to keep it: it's the only thing you can reasonably stick in tp_free that works for 2.2 as well as for 2.3.) One minor question: there are a bunch of #undefs in gcmodule.c (e.g. PyObject_GC_Track) that don't seem to make sense -- at least I cannot find where these would be #defined any more. Ditto for #undef PyObject_Malloc in obmalloc.c. I suggest that you check this thing in, but keeping _PyObject_Del alive, and we'll take it from there. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-08 19:18 Message: Logged In: YES user_id=6380 (Wouldn't it be more efficient to take this to email between the three of us?) > Extensions that *currently* call PyObject_Del have > its old macro expansion ("_PyObject_Del((PyObject > *)(op))") buried in them, so getting rid of > _PyObject_Del is a binary-API incompatibility > (existing extensions will no longer link without > recompilation). I personally don't mind that, but > I run on Windows and "binary compatability" never > works there across minor releases for other > reasons, so I don't have any real feel for how > much people on other platforms value it. As you > pointed out recently too, binary compatability > has, in reality, not been the case since 1.5.2 > anyway. Still, tradition has it that we keep such entry points around for a long time. I propose that we do so now, too. > So that's one for Python-Dev. If we do break > binary compatibility, I'd be sorely tempted to > change the "destructor" typedef to say destructors > take void*. IMO saying they take PyObject* was a > poor idea, as you almost never have a PyObject* > when calling one of these guys. Huh?
"destructor" is used to declare tp_dealloc, which definitely needs a PyObject * (or some "subclass" of it, like PyIntObject *). It's also used to declare tp_free, which arguably shouldn't take a PyObject * (since by the time tp_free is called, most of the object's contents have been destroyed by tp_dealloc). So maybe tp_free (a newcomer in 2.2) should be declared to take something else, but then the risk is breaking code that defines a tp_free with the correct signature. > That's why PyObject_Del "had to" be a macro, to > hide the cast to PyObject* almost everyone needs > because of destructor's "correct" but impractical > signature. If "destructor" had a practical > signature, there would have been no temptation to > use a macro. I don't understand this at all. > Note that if the typedef of destructor were so > changed, you wouldn't have needed new casts in > tp_free slots. And I'd rather break binary > compatability than make extension authors add new > casts. Nor this. > Hmm. I'm assigning this to Guido for comment: > Guido, what are your feelings about binary > compatibility here? C didn't define free() as > taking a void* by mistake . I want binary compatibility, but I don't understand your comments very well. > Back to Neil: I wouldn't bother changing PyObject_Del > to PyObject_Free. The former isn't in the > "recommended" minimal API, but neither is it > discouraged. I expect TMTOWTDI here forever. I prefer PyObject_Del -- like PyObject_GC_Del, and like we did in the past. Plus, I like New to match Del and Malloc to match Free. Since it's PyObject_New, it should be _Del. I'm not sure what to say of Neil's patch, except that I'm glad to be rid of the PyMalloc_XXX family. I wish we didn't have to change all the places that used to say _PyObject_Del. Maybe it's best to keep that name around? The patch would (psychologically) become a lot smaller. I almost wish that this would work:

    #define PyObject_Del ((destructor)PyObject_Free)

Or maybe it *does* work???
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=540394&group_id=5470 From noreply@sourceforge.net Tue Apr 9 21:43:44 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Tue, 09 Apr 2002 13:43:44 -0700 Subject: [Patches] [ python-Patches-540394 ] Remove PyMalloc_* symbols Message-ID: Patches item #540394, was opened at 2002-04-06 20:07 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=540394&group_id=5470 Category: Core (C code) Group: None Status: Open Resolution: Accepted Priority: 5 Submitted By: Neil Schemenauer (nascheme) >Assigned to: Guido van Rossum (gvanrossum) Summary: Remove PyMalloc_* symbols ---------------------------------------------------------------------- >Comment By: Tim Peters (tim_one) Date: 2002-04-09 16:43 Message: Logged In: YES user_id=31435 It'll be a day or two before PLabs can get back to Python work anyway. Reassigning to Guido -- I'm not even going to try to channel him on backwards compatibility, or the feasibility of introducing possible warnings. If I were you I'd check in the patch with the casts in; they can be taken out again later if Guido is agreeable.
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=540394&group_id=5470 From noreply@sourceforge.net Wed Apr 10 01:53:05 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Tue, 09 Apr 2002 17:53:05 -0700 Subject: [Patches] [ python-Patches-540394 ] Remove PyMalloc_* symbols Message-ID: Patches item #540394, was opened at 2002-04-06 20:07 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=540394&group_id=5470 Category: Core (C code) Group: None Status: Open Resolution: Accepted Priority: 5 Submitted By: Neil Schemenauer (nascheme) >Assigned to: Neil Schemenauer (nascheme) Summary: Remove PyMalloc_* symbols ---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-09 20:53 Message: Logged In: YES user_id=6380 The binary compatibility issue is extensions compiled for 2.2 that have references to _PyObject_Del compiled into them and aren't recompiled for 2.3. I think that should work (even if they get a warning). To make it work, the _PyObject_Del entry point must continue to exist. Back to Neil, I think my instructions are clear enough.
---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-04-09 16:43 Message: Logged In: YES user_id=31435 It'll be a day or two before PLabs can get back to Python work anyway. Reassigning to Guido -- I'm not even going to try to channel him on backwards compatibility, or the feasibility of introducing possible warnings. If I were you I'd check in the patch with the casts in; they can be taken out again later if Guido is agreeable. ---------------------------------------------------------------------- Comment By: Neil Schemenauer (nascheme) Date: 2002-04-09 16:29 Message: Logged In: YES user_id=35752 It might be a day or two before I get to this. Regarding the type of tp_free, could we change it to be something like: typedef void (*freefunc)(void *); ... freefunc tp_free; and leave the type of tp_dealloc alone. Maybe it's too late now that 2.2 is out and uses 'destructor'. I don't see how this relates to binary compatibility though. Why does it matter if the function takes a PyObject pointer or a void pointer? The worse I see happening is that people could get warnings when they compile their extension modules. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-04-09 14:47 Message: Logged In: YES user_id=31435 Clarifying or just repeating Guido here: + Binary compatibility is important. It's better on Unix than it appears -- while you'll get a warning if you run an old 1.5.2 extension with 2.2 today and without recompiling, it will almost certainly work anyway. So in the case of macros that expanded to a private API function before, that private API function must still exist, but the macro needn't expand to that anymore (nor even *be* a macro anymore). 
_PyObject_Del is a particular problem cuz it's even documented in the C API manual -- there simply wasn't a public API function before that did the same thing and could be used as a function designator. You're making life better for future generations. + Casts on tp_free slots are par for the course, because "destructor" has an impractical signature. I'm afraid that can't change either, so the casts stay. + Fred and I agreed to add PyObject_Del to the "minimal recommended API", so, for the next round of this, feel wholly righteous in leaving existing PyObject_Del calls alone. If anything's unclear, hit me. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-09 12:27 Message: Logged In: YES user_id=6380 I've not fully read Tim's response in email, but instead I've reviewed and discussed the patch with Tim. I think the only thing to which I object at this point is the removal of the entry point _PyObject_Del. I believe that for source and binary compatibility with 2.2, that entry point should remain, with the same meaning, but it should not be used at all by the core. (Motivation to keep it: it's the only thing you can reasonably stick in tp_free that works for 2.2 as well as for 2.3.) One minor question: there are a bunch of #undefs in gcmodule.c (e.g. PyObject_GC_Track) that don't seem to make sense -- at least I cannot find where these would be #defined any more. Ditto for #indef PyObject_Malloc in obmalloc.c. I suggest that you check this thing in, but keeping _PyObject_Del alive, and we'll take it from there. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-08 15:18 Message: Logged In: YES user_id=6380 (Wouldn't it be more efficient to take this to email between the three of us?) 
> Extensions that *currently* call PyObject_Del have > its old macro expansion ("_PyObject_Del((PyObject > *)(op))") buried in them, so getting rid of > _PyObject_Del is a binary-API incompatibility > (existing extensions will no longer link without > recompilation). I personally don't mind that, but > I run on Windows and "binary compatability" never > works there across minor releases for other > reasons, so I don't have any real feel for how > much people on other platforms value it. As you > pointed out recently too, binary compatability > has, in reality, not been the case since 1.5.2 > anyway. Still, tradition has it that we keep such entry points around for a long time. I propose that we do so now, too. > So that's one for Python-Dev. If we do break > binary compatibility, I'd be sorely tempted to > change the "destructor" typedef to say destructors > take void*. IMO saying they take PyObject* was a > poor idea, as you almost never have a PyObject* > when calling one of these guys. Huh? "destructor" is used to declare tp_dealloc, which definitely needs a PyObject * (or some "subclass" of it, like PyIntObject *). It's also used to declare tp_free, which arguably shouldn't take a PyObject * (since by the time tp_free is called, most of the object's contents have been destroyed by tp_dealloc). So maybe tp_free (a newcomer in 2.2) should be declared to take something else, but then the risk is breaking code that defines a tp_free with the correct signature. > That's why PyObject_Del "had to" be a macro, to > hide the cast to PyObject* almost everyone needs > because of destructor's "correct" but impractical > signature. If "destructor" had a practical > signature, there would have been no temptation to > use a macro. I don't understand this at all. > Note that if the typedef of destructor were so > changed, you wouldn't have needed new casts in > tp_free slots. And I'd rather break binary > compatability than make extension authors add new > casts. Nor this. 
> Hmm. I'm assigning this to Guido for comment: > Guido, what are your feelings about binary > compatibility here? C didn't define free() as > taking a void* by mistake . I want binary compatibility, but I don't understand your comments very well. > Back to Neil: I wouldn't bother changing PyObject_Del > to PyObject_Free. The former isn't in the > "recommended" minimal API, but neither is it > discouraged. I expect TMTOWTDI here forever. I prefer PyObject_Del -- like PyObject_GC_Del, and like we did in the past. Plus, I like New to match Del and Malloc to match Free. Since it's PyObject_New, it should be _Del. I'm not sure what to say of Neil's patch, except that I'm glad to be rid of the PyMalloc_XXX family. I wish we didn't have to change all the places that used to say _PyObject_Del. Maybe it's best to keep that name around? The patch would (psychologically) become a lot smaller. I almost wish that this would work: #define PyObject_Del ((destructor)PyObject_Free) Or maybe it *does* work??? ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-08 14:47 Message: Logged In: YES user_id=6380 I'm looking at this now... ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-04-06 21:59 Message: Logged In: YES user_id=31435 Extensions that *currently* call PyObject_Del have its old macro expansion ("_PyObject_Del((PyObject *)(op))") buried in them, so getting rid of _PyObject_Del is a binary-API incompatibility (existing extensions will no longer link without recompilation). I personally don't mind that, but I run on Windows and "binary compatability" never works there across minor releases for other reasons, so I don't have any real feel for how much people on other platforms value it. As you pointed out recently too, binary compatability has, in reality, not been the case since 1.5.2 anyway. So that's one for Python-Dev. 
If we do break binary compatibility, I'd be sorely tempted to change the "destructor" typedef to say destructors take void*. IMO saying they take PyObject* was a poor idea, as you almost never have a PyObject* when calling one of these guys. That's why PyObject_Del "had to" be a macro, to hide the cast to PyObject* almost everyone needs because of destructor's "correct" but impractical signature. If "destructor" had a practical signature, there would have been no temptation to use a macro. Note that if the typedef of destructor were so changed, you wouldn't have needed new casts in tp_free slots. And I'd rather break binary compatibility than make extension authors add new casts. Hmm. I'm assigning this to Guido for comment: Guido, what are your feelings about binary compatibility here? C didn't define free() as taking a void* by mistake. Back to Neil: I wouldn't bother changing PyObject_Del to PyObject_Free. The former isn't in the "recommended" minimal API, but neither is it discouraged. I expect TMTOWTDI here forever. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-04-06 21:41 Message: Logged In: YES user_id=31435 Oops -- I hit "Submit" prematurely. More to come. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-04-06 21:40 Message: Logged In: YES user_id=31435 Looks good to me -- thanks!
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=540394&group_id=5470 From noreply@sourceforge.net Wed Apr 10 10:09:33 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Wed, 10 Apr 2002 02:09:33 -0700 Subject: [Patches] [ python-Patches-541924 ] this.py too verbose Message-ID: Patches item #541924, was opened at 2002-04-10 09:09 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=541924&group_id=5470 Category: None Group: None Status: Open Resolution: None Priority: 5 Submitted By: Duncan Booth (duncanb) Assigned to: Nobody/Anonymous (nobody) Summary: this.py too verbose Initial Comment: The 'Easter Egg' file this.py might be regarded as something of an advert for Python, but its implementation is excessively verbose as it rolls its own rot13 decoding code when Python already has perfectly usable rot13 coding built in. The attached context diff replaces the 5 lines currently used to decode the Zen string with a single line: print s.decode('rot13') ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=541924&group_id=5470 From noreply@sourceforge.net Wed Apr 10 19:11:15 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Wed, 10 Apr 2002 11:11:15 -0700 Subject: [Patches] [ python-Patches-541924 ] this.py too verbose Message-ID: Patches item #541924, was opened at 2002-04-10 11:09 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=541924&group_id=5470 Category: None Group: None Status: Open Resolution: None Priority: 5 Submitted By: Duncan Booth (duncanb) Assigned to: Nobody/Anonymous (nobody) Summary: this.py too verbose Initial Comment: The 'Easter Egg' file this.py might be regarded as something of an advert for Python, but its implementation is 
excessively verbose as it rolls its own rot13 decoding code when Python already has perfectly usable rot13 coding built in. The attached context diff replaces the 5 lines currently used to decode the Zen string with a single line: print s.decode('rot13') ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-04-10 20:11 Message: Logged In: YES user_id=21627 There's no uploaded file! You have to check the checkbox labeled "Check to Upload & Attach File" when you upload a file. Please try again. (This is a SourceForge annoyance that we can do nothing about. :-( ) ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=541924&group_id=5470 From noreply@sourceforge.net Wed Apr 10 22:31:29 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Wed, 10 Apr 2002 14:31:29 -0700 Subject: [Patches] [ python-Patches-541924 ] this.py too verbose Message-ID: Patches item #541924, was opened at 2002-04-10 05:09 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=541924&group_id=5470 Category: None Group: None >Status: Closed >Resolution: Rejected Priority: 5 Submitted By: Duncan Booth (duncanb) >Assigned to: Tim Peters (tim_one) Summary: this.py too verbose Initial Comment: The 'Easter Egg' file this.py might be regarded as something of an advert for Python, but its implementation is excessively verbose as it rolls its own rot13 decoding code when Python already has perfectly usable rot13 coding built in. The attached context diff replaces the 5 lines currently used to decode the Zen string with a single line: print s.decode('rot13') ---------------------------------------------------------------------- >Comment By: Tim Peters (tim_one) Date: 2002-04-10 17:31 Message: Logged In: YES user_id=31435 Sorry, Guido deliberately refused to use rot13. 
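(For reference, the round-trip property behind the proposed one-liner is easy to demonstrate. This sketch uses the modern codecs-module spelling, since in Python 3 rot13 is a str-to-str codec invoked via codecs.encode/decode rather than s.decode('rot13'):)

```python
import codecs

# rot13 shifts each letter 13 places along the alphabet, so applying it
# twice round-trips. In Python 3 it must go through codecs.encode and
# codecs.decode instead of the old s.decode('rot13') spelling.
s = "The Zen of Python"
obscured = codecs.encode(s, "rot13")
print(obscured)                          # Gur Mra bs Clguba
assert codecs.decode(obscured, "rot13") == s
```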
He wants it to be obscure. I told him that I instantly recognized the by-hand implementation of rot13, but had no idea what .decode('rot13') might do. He wasn't swayed . ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-04-10 14:11 Message: Logged In: YES user_id=21627 There's no uploaded file! You have to check the checkbox labeled "Check to Upload & Attach File" when you upload a file. Please try again. (This is a SourceForge annoyance that we can do nothing about. :-( ) ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=541924&group_id=5470 From noreply@sourceforge.net Thu Apr 11 17:23:29 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 11 Apr 2002 09:23:29 -0700 Subject: [Patches] [ python-Patches-526840 ] PEP 263 Implementation Message-ID: Patches item #526840, was opened at 2002-03-07 08:55 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=526840&group_id=5470 Category: Parser/Compiler Group: Python 2.3 Status: Open Resolution: None Priority: 7 Submitted By: Martin v. Löwis (loewis) Assigned to: M.-A. Lemburg (lemburg) Summary: PEP 263 Implementation Initial Comment: The attached patch implements PEP 263. The following differences to the PEP (rev. 1.8) are known: - The implementation interprets "ASCII compatible" as meaning "bytes below 128 always denote ASCII characters", although this property is only used for ",', and \. There have been other readings of "ASCII compatible", so this should probably be elaborated in the PEP. - The check whether all bytes follow the declared or system encoding (including comments and string literals) is only performed if the encoding is "ascii". ---------------------------------------------------------------------- >Comment By: M.-A. 
Lemburg (lemburg) Date: 2002-04-11 16:23 Message: Logged In: YES user_id=38388 Apart from the codec changes, the patch looks ok. I would still like two APIs for the two different codec tasks, though. I don't expect anything much to change in the codecs, so maintenance is not an issue. ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-03-21 10:25 Message: Logged In: YES user_id=21627 Version 2 of this patch implements revision 1.11 of the PEP (phase 1). The check of the complete source file for compliance with the declared encoding is implemented by decoding the input line-by-line; I believe that for all supported encodings, this is not different compared to decoding the entire source file at once. ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-03-07 18:24 Message: Logged In: YES user_id=21627 Changing the decoding functions will not result in one additional function, but in two of them: you'll also get PyUnicode_DecodeRawUnicodeEscapeFromUnicode. That seems quite unmaintainable to me: any change now needs to propagate into four functions. OTOH, I don't think that the code that allows parsing variable-sized strings is overly complicated. ---------------------------------------------------------------------- Comment By: M.-A. Lemburg (lemburg) Date: 2002-03-07 18:01 Message: Logged In: YES user_id=38388 Ok, I've had a look at the patch. It looks good except for the overly complicated implementation of the unicode-escape codec. Even though there's a bit of code duplication, I'd prefer to have two separate functions here: one for the standard char* pointer type and another one for Py_UNICODE*, ie. PyUnicode_DecodeUnicodeEscape(char*...) and PyUnicode_DecodeUnicodeEscapeFromUnicode(Py_UNICODE*...) This is easier to support and gives better performance since the compiler can optimize the two functions making different assumptions.
You'll also need to include a name mangling at the top of the header for the new API. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-03-07 14:06 Message: Logged In: YES user_id=6380 I've set the group to Python 2.3 so the priority has some context (I'd rather you move the priority down to 5 but I understand this is your personal priority). I haven't accepted the PEP yet (although I expect I will), so please don't check this in yet (if you feel it needs to be saved in CVS, use a branch). ---------------------------------------------------------------------- Comment By: M.-A. Lemburg (lemburg) Date: 2002-03-07 11:06 Message: Logged In: YES user_id=38388 Thank you! I'll add a note to the PEP about the way the first two lines are processed (removing the ASCII mention...). ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-03-07 09:11 Message: Logged In: YES user_id=21627 A note on the implementation strategy: it turned out that communicating the encoding into the abstract syntax was the biggest challenge. To solve this, I introduced an encoding_decl pseudo node: it is an unused non-terminal whose STR() is the encoding, and whose only child is the true root of the syntax tree. As such, it is the only non-terminal which has a STR value.
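(The encoding_decl machinery carries the declaration inward; the declaration itself is recognized textually. PEP 263 specifies that the first or second line must match the regular expression "coding[:=]\s*([-\w.]+)", which a short sketch can reproduce directly:)

```python
import re

# PEP 263: a comment on line 1 or 2 matching this pattern declares the
# source encoding (the regex is the one given in the PEP).
coding_re = re.compile(r"coding[:=]\s*([-\w.]+)")

source = "#!/usr/bin/env python\n# -*- coding: iso-8859-1 -*-\nprint('hi')\n"
for line in source.splitlines()[:2]:
    m = coding_re.search(line)
    if m:
        print("declared encoding:", m.group(1))   # iso-8859-1
```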
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=526840&group_id=5470 From noreply@sourceforge.net Thu Apr 11 17:34:41 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 11 Apr 2002 09:34:41 -0700 Subject: [Patches] [ python-Patches-542562 ] clean up trace.py Message-ID: Patches item #542562, was opened at 2002-04-11 16:34 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=542562&group_id=5470 Category: Demos and tools Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Zooko O'Whielacronx (zooko) Assigned to: Nobody/Anonymous (nobody) Summary: clean up trace.py Initial Comment: moderately interesting changes: * bugfix: remove "feature" of ignoring files in the tmpdir, as I was trying to run it on a file in the tmpdir and couldn't figure out why it gave no answer! I think the original motivation for that feature (spurious "/tmp/" filenames for builtin functions??) has gone away, but I'm not sure.
* add more usage docs and warning about common mistake pretty mundane changes: * remove unnecessary checks for backwards compatibility with a version that never escaped from my (Zooko's) laptop * add a future-compatible check: if the interpreter offers an attribute called `sys.optimized', and it is "true", and the user is trying to do something that can't be done with an optimizing interpreter, then error out ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=542562&group_id=5470 From noreply@sourceforge.net Thu Apr 11 17:59:31 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 11 Apr 2002 09:59:31 -0700 Subject: [Patches] [ python-Patches-531901 ] binary packagers Message-ID: Patches item #531901, was opened at 2002-03-19 15:53 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=531901&group_id=5470 Category: Distutils and setup.py Group: None Status: Open Resolution: None Priority: 5 Submitted By: Mark Alexander (mwa) Assigned to: M.-A. Lemburg (lemburg) Summary: binary packagers Initial Comment: zip file with updated Solaris and HP-UX packagers. Replaces 415226, 415227, 415228. Changes made to take advantage of new PEP241 changes in the Distribution class. ---------------------------------------------------------------------- >Comment By: M.-A. Lemburg (lemburg) Date: 2002-04-11 16:59 Message: Logged In: YES user_id=38388 Mark, could you reupload the ZIP file ? I cannot download it from the SF page (the file is mostly empty). Also, is the documentation already included in the ZIP file ? If not, it would be nice if you could add them as well. I don't require a special PEP for these changes, BTW, but I do require you to maintain them. Thanks. 
---------------------------------------------------------------------- Comment By: Mark Alexander (mwa) Date: 2002-03-20 19:55 Message: Logged In: YES user_id=12810 OK, the PEP seems to me to mean most of this is done. These additions are not library modules, they are Distutils "commands". So the way I read it, the Distutils-SIG (where I've been hanging around for some time) are the Maintainers. The documentation will be 2 new chapters for the Distutils manual "Creating Solaris packages" and "Creating HP-UX packages" each looking a whole lot like "Creating RPM packages". Does that clarify anything, or am I still missing a clue? p.s. Thanks for cleaning up the extra uploads! ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-03-20 15:35 Message: Logged In: YES user_id=21627 You volunteering as the maintainer is part of the prerequisites of accepting new modules, when following PEP 2, see http://python.sourceforge.net/peps/pep-0002.html It says: "developers ... will first form a group of maintainers. Then, this group shall produce a PEP called a library PEP." So existence of a PEP describing these library extensions would be a prerequisite for accepting them. If MAL wants to waive this requirement, it would be fine with me. However, such a PEP could also share text with the documentation, so it might not be wasted effort. ---------------------------------------------------------------------- Comment By: Mark Alexander (mwa) Date: 2002-03-20 14:49 Message: Logged In: YES user_id=12810 Any of the three (they're all the same). SourceForge hiccuped during the upload, and I don't have permission to delete the duplicates. I don't exactly understand what you mean by applying PEP 2. I uploaded this per Marc Lemburg's request for the latest versions of patches 41522[6-8]. He's acting as the integrator in this case (see http://mail.python.org/pipermail/distutils-sig/2001-December/002659.html).
I let him know about the duplicate uploads, so hopefully he'll correct it. If you can and want, feel free to delete the 2 of your choice. I agree they need to be documented. As soon as I can, I'll submit changes to the Distutils documentation. Finally, yes, I'll act as maintainer. I'm on the Distutils-sig and as soon as some other poor soul who has to deal with Solaris or HP-UX tries them, I'm there to work out issues. ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-03-20 07:35 Message: Logged In: YES user_id=21627 Which of the three attached files is the right one (19633, 19634, or 19635)? Unless they are all needed, we should delete the extra copies. I recommend applying PEP 2 to this patch: A library PEP is needed (which could be quite short), documentation, perhaps test cases. Most importantly, there must be an identified maintainer of these modules. Are you willing to act as the maintainer? ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=531901&group_id=5470 From noreply@sourceforge.net Thu Apr 11 18:01:50 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 11 Apr 2002 10:01:50 -0700 Subject: [Patches] [ python-Patches-542569 ] tp_print tp_repr tp_str in test_bool.py Message-ID: Patches item #542569, was opened at 2002-04-11 19:01 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=542569&group_id=5470 Category: Tests Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Hernan Martinez Foffani (hfoffani) Assigned to: Nobody/Anonymous (nobody) Summary: tp_print tp_repr tp_str in test_bool.py Initial Comment: Those slots are not being tested by test_bool.py when it is run standalone. bool_print() does not run during the complete regression test suite.
I was using Neal's tools and chose boolobject.c (because it's an easy module :-) to get in touch with the internals. I don't know if this patch would be useful to you because I didn't see similar checks done for other types. Ie: the eval(repr(x))==x property, or the tp_print slot (I found only one for dicts.) Hope it helps, -Hernan ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=542569&group_id=5470 From noreply@sourceforge.net Thu Apr 11 21:31:11 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 11 Apr 2002 13:31:11 -0700 Subject: [Patches] [ python-Patches-476814 ] foreign-platform newline support Message-ID: Patches item #476814, was opened at 2001-10-31 17:41 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=476814&group_id=5470 Category: Core (C code) Group: None Status: Open Resolution: None Priority: 5 Submitted By: Jack Jansen (jackjansen) Assigned to: Jack Jansen (jackjansen) Summary: foreign-platform newline support Initial Comment: This patch enables Python to interpret all known newline conventions, CR, LF or CRLF, on all platforms. This support is enabled by configuring with --with-universal-newlines (so by default it is off, and everything should behave as usual). With universal newline support enabled two things happen: - When importing or otherwise parsing .py files any newline convention is accepted. - Python code can pass a new "t" mode parameter to open() which reads files with any newline convention. "t" cannot be combined with any other mode flags like "w" or "+", for obvious reasons. File objects have a new attribute "newlines" which contains the type of newlines encountered in the file (or None when no newline has been seen, or "mixed" if there were various types of newlines). Also included is a test script which tests both file I/O and parsing.
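(The behavior the patch proposes -- every newline convention translated on input, with the file object remembering which conventions it saw -- later became standard Python behavior. A sketch of the effect, using modern text-mode defaults rather than the patch's "t" flag:)

```python
import os
import tempfile

# Write a file that mixes all three newline conventions, then read it
# back in text mode, where universal-newline translation is the default.
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(b"mac line\rwindows line\r\nunix line\n")

with open(path) as f:
    lines = f.readlines()
    seen = f.newlines   # the conventions actually seen while reading

os.remove(path)
print(lines)   # every convention comes back as '\n'
print(seen)
```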
---------------------------------------------------------------------- >Comment By: Martin v. Löwis (loewis) Date: 2002-04-11 22:31 Message: Logged In: YES user_id=21627 What is the rationale for making this a compile-time option? It seems to complicate things, with no apparent advantage. If this is for backwards compatibility, don't make it an option: nobody will rebuild Python just to work around a compatibility problem. Apart from that, the patch looks good. ---------------------------------------------------------------------- Comment By: Jack Jansen (jackjansen) Date: 2002-03-29 00:17 Message: Logged In: YES user_id=45365 New doc patch, and new version of the patch that mainly allows the U to be specified (no-op) in non-univ-newline-builds. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-03-25 22:07 Message: Logged In: YES user_id=6380 Thanks! But there's no documentation. Could I twist your arm for a separate doc patch? I'm tempted to give this a +1, but I'd like to hear from MvL and MAL to see if they foresee any interaction with their PEP 262 implementation. ---------------------------------------------------------------------- Comment By: Jack Jansen (jackjansen) Date: 2002-03-13 23:44 Message: Logged In: YES user_id=45365 A new version of the patch. Main differences are that U is now the mode character to trigger universal newline input and --with-universal-newlines is default on. ---------------------------------------------------------------------- Comment By: Jack Jansen (jackjansen) Date: 2002-01-16 23:47 Message: Logged In: YES user_id=45365 This version of the patch addresses the bug in Py_UniversalNewlineFread and fixes up some minor details. Tim's other issues are addressed (at least: I think they are:-) in a forthcoming PEP.
---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2001-12-14 00:57 Message: Logged In: YES user_id=31435 Back to Jack -- and sorry for sitting on it so long. Clearly this isn't making it into 2.2 in the core. As I said on Python-Dev, I believe this needs a PEP: the design decisions are debatable, so *should* be debated outside the Mac community too. Note, though, that I can't stop you from adding it to the 2.2 Mac distribution (if you want it badly enough there). If a PEP won't be written, I suggest finding someone else to review it again; maybe Guido. Note that the patch needs doc changes too. The patch to regrtest.py doesn't belong here (I assume it just slipped in). There seems to be a lot of code in support of the f_newlinetypes member, and the value of that member isn't clear -- I can't imagine a good use for it (maybe it's a Mac thing?). The implementation of Py_UniversalNewlineFread appears incorrect to me: it reads n bytes *every* time around the outer loop, no matter how few characters are still required, and n doesn't change inside the loop. The business about the GIL may be due to the lack of docs: are, or are not, people supposed to release the GIL themselves around calls to these guys? It's not documented, and it appears your intent differed from my guess. Finally, it would be better to call ferror() after calling fread() instead of before it. ---------------------------------------------------------------------- Comment By: Jack Jansen (jackjansen) Date: 2001-11-14 16:13 Message: Logged In: YES user_id=45365 Here's a new version of the patch. To address your issues one by one: - get_line and Py_UniversalNewlineFgets are too difficult to integrate, at least, I don't see how I could do it. The storage management of get_line gets in the way. - The global lock comment I don't understand. The Universal...
routines are replacements for fgets() and fread(), so have nothing to do with the interpreter lock. - The logic of all three routines (get_line too) has changed and I've put comments in. I hope this addresses some of the points. - If universal_newline is false for a certain PyFileObject we now immediately take a quick exit via fgets() or fread(). There's also a new test script that tests some more border cases (like lines longer than 100 characters, and a lone CR just before end of file). ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2001-11-05 09:16 Message: Logged In: YES user_id=31435 It would be better if get_line just called Py_UniversalNewlineFgets (when appropriate) instead of duplicating its logic inline. Py_UniversalNewlineFgets and Py_UniversalNewlineFread should deal with releasing the global lock themselves -- the correct granularity for lock release/reacquire is around the C-level input routines (esp. for fread). The new routines never check for I/O errors! Why not? It seems essential. The new Fgets checks for EOF at the end of the loop instead of the top. This is surprising, and I stared a long time in vain trying to guess why. Setting newlinetypes |= NEWLINE_CR; immediately after seeing an '\r' would be as fast (instead of waiting to see EOF and then inferring the prior existence of '\r' indirectly from the state of the skipnextlf flag). Speaking of which, the fobj tests in the inner loop waste cycles. Set the local flag vrbls whether or not fobj is NULL. When you're *out* of the inner loop you can simply decline to store the new masks when fobj is NULL (and you're already doing the latter anyway). A test and branch inside the loop is much more expensive than or'ing in a flag bit inside the loop, ditto harder to understand.
Floating the univ_newline test out of the loop (and duplicating the loop body, one way for univ_newline true and the other for it false) would also save a test and branch on every character. Doing fread one character at a time is very inefficient. Since you know you need to obtain n characters in the end, and that these transformations require reading at least n characters, you could very profitably read n characters in one gulp at the start, then switch to k at a time where k is the number of \r\n pairs seen since the last fread call. This is easier to code than it sounds . It would be fine by me if you included (and initialized) the new file-object fields all the time, whether or not universal newlines are configured. I'd rather waste a few bytes in a file object than see #ifdefs spread thru the code. I'll be damned if I can think of a quick way to do this stuff on Windows -- native Windows fgets() is still the only Windows handle we have on avoiding crushing thread overhead inside MS's C library. I'll think some more about it (the thrust still being to eliminate the 't' mode flag, as whined about on Python-Dev). ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2001-10-31 18:38 Message: Logged In: YES user_id=6380 Tim, can you review this or pass it on to someone else who has time? Jack developed this patch after a discussion in which I was involved in some of the design, but I won't have time to look at it until December. 
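(Tim's last point -- read the full n characters in one gulp and then only top up by what the CR/LF translation collapsed -- can be sketched in Python. The helper name and byte-string approach here are hypothetical; the real patch does this work in C inside Py_UniversalNewlineFread:)

```python
import io

def read_universal(f, n):
    # Hypothetical sketch of the chunked strategy: one big read up front,
    # then small top-ups for however many bytes the translation collapsed.
    buf = b""
    carry = b""   # a trailing '\r' that may be half of a split '\r\n'
    while len(buf) < n:
        data = f.read(n - len(buf))
        chunk = carry + data
        carry = b""
        if not data:
            # EOF: a held-back '\r' still counts as one newline
            buf += chunk.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
            break
        if chunk.endswith(b"\r"):
            chunk, carry = chunk[:-1], b"\r"
        buf += chunk.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
    return buf[:n]

print(read_universal(io.BytesIO(b"a\r\nb\rc\n"), 6))   # b'a\nb\nc\n'
```

The carry byte handles the boundary case Tim alludes to: a CRLF pair split across two reads must still collapse to a single newline.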
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=476814&group_id=5470 From noreply@sourceforge.net Thu Apr 11 21:38:20 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 11 Apr 2002 13:38:20 -0700 Subject: [Patches] [ python-Patches-542659 ] PyCode_New NULL parameters cleanup Message-ID: Patches item #542659, was opened at 2002-04-11 22:38 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=542659&group_id=5470 Category: Core (C code) Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Olivier Dormond (odormond) Assigned to: Nobody/Anonymous (nobody) Summary: PyCode_New NULL parameters cleanup Initial Comment: This patch removes the creation of an empty tuple for freevars or cellvars if they are equal to NULL, because this case is handled earlier (at the same time all the other parameters are checked) by raising a PyErr_BadInternalCall. It's almost a one-liner.
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=542659&group_id=5470 From noreply@sourceforge.net Thu Apr 11 21:46:26 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 11 Apr 2002 13:46:26 -0700 Subject: [Patches] [ python-Patches-542659 ] PyCode_New NULL parameters cleanup Message-ID: Patches item #542659, was opened at 2002-04-11 22:38 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=542659&group_id=5470 Category: Core (C code) Group: Python 2.3 Status: Open Resolution: None >Priority: 1 Submitted By: Olivier Dormond (odormond) Assigned to: Nobody/Anonymous (nobody) Summary: PyCode_New NULL parameters cleanup Initial Comment: This patch removes the creation of an empty tuple for freevars or cellvars if they are equal to NULL, because this case is handled earlier (at the same time all the other parameters are checked) by raising a PyErr_BadInternalCall. It's almost a one-liner. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=542659&group_id=5470 From noreply@sourceforge.net Fri Apr 12 15:51:06 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Fri, 12 Apr 2002 07:51:06 -0700 Subject: [Patches] [ python-Patches-536241 ] string.zfill and unicode Message-ID: Patches item #536241, was opened at 2002-03-28 14:26 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=536241&group_id=5470 Category: Library (Lib) Group: None Status: Open Resolution: Fixed Priority: 5 Submitted By: Walter Dörwald (doerwalter) Assigned to: A.M. Kuchling (akuchling) Summary: string.zfill and unicode Initial Comment: This patch makes the function string.zfill work with unicode instances (and instances of str and unicode subclasses).
Currently string.zfill(u"123", 10) results in "0000u'123'". With this patch the result is u'0000000123'. Should zfill be made a real str and unicode method? I noticed that a zfill implementation is available in unicodeobject.c, but commented out. ---------------------------------------------------------------------- >Comment By: Martin v. Löwis (loewis) Date: 2002-04-12 16:51 Message: Logged In: YES user_id=21627 Re: optional Unicode: Walter is correct; configuring with --disable-unicode currently breaks the string module. One might consider using types.StringTypes; OTOH, pulling in types might not be desirable. As for str vs. repr: Python was always using repr in zfill, so changing it may break things. So I recommend that Walter reverts Andrew's check-in and applies his change. ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-03-30 12:25 Message: Logged In: YES user_id=6656 Hah, I was going to say that but was distracted by IE wiping out the machine I'm sitting at. Re-opening. ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-03-30 12:16 Message: Logged In: YES user_id=89016 But Python could be compiled without unicode support (by undefining PY_USING_UNICODE), and string.zfill should work even in this case. What about making zfill a real str and unicode method? ---------------------------------------------------------------------- Comment By: A.M. Kuchling (akuchling) Date: 2002-03-29 17:24 Message: Logged In: YES user_id=11375 Thanks for your patch! I've checked it into CVS, with two modifications. First, I removed the code to handle the case where Python doesn't have a unicode() built-in; there's no expectation that you can take the standard library for Python version N and use it with version N-1, so this code isn't needed.
Second, I changed string.zfill() to take the str() and not the repr() when it gets a non-string object because that seems to make more sense. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=536241&group_id=5470 From noreply@sourceforge.net Fri Apr 12 16:12:15 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Fri, 12 Apr 2002 08:12:15 -0700 Subject: [Patches] [ python-Patches-539949 ] dict.popitem(key=None) Message-ID: Patches item #539949, was opened at 2002-04-05 14:38 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539949&group_id=5470 Category: Core (C code) Group: Python 2.3 >Status: Closed >Resolution: Accepted Priority: 5 Submitted By: Raymond Hettinger (rhettinger) Assigned to: Guido van Rossum (gvanrossum) Summary: dict.popitem(key=None) Initial Comment: This patch implements the feature request at http://sourceforge.net/tracker/index.php?func=detail&aid=495086&group_id=5470&atid=355470 which asks for an optional argument to popitem so that it returns a key/value pair for a specified key or, if not specified, an arbitrary key. The benefit is in providing a fast, explicit way to retrieve and remove a particular key/value pair from a dictionary. By using only a single lookup, it is faster than the usual Python code: value = d[key] del d[key] return (key, value) which now becomes: return d.popitem(key) There is no magic or new code in the implementation -- it uses a few lines each from getitem, delitem, and popitem. If an argument is specified, the new code is run; otherwise, the existing code is run. This assures that the patch does not cause a performance penalty. The diff is about -3 lines and +25 lines. There are four sections: 1. Replacement code for dict_popitem in dictobject.c 2. Replacement docstring for popitem in dictobject.c 3.
Replacement registration line for popitem in dictobject.c 4. Sample Python test code. ---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-12 11:12 Message: Logged In: YES user_id=6380 Thanks! Accepted and checked in. ---------------------------------------------------------------------- Comment By: Raymond Hettinger (rhettinger) Date: 2002-04-09 09:39 Message: Logged In: YES user_id=80475 Here is a revised patch for D.pop() with hard tabs and corrected reference counts. In a DEBUG build, I validated the ref counts against equivalent steps: vv=d[k]; del d[k]. And, after Tim's suggestions, the code is fast and light. In addition to d.pop(k), GvR's patch for d.popitem(k) should also go in. The (k,v) return value feeds directly into d.__setitem__ or a dict(itemlist) constructor (see the code fragments in the 4/6/02 post). The only downside is the time to process METH_VARARGS. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-04-08 12:46 Message: Logged In: YES user_id=31435 Getting closer! Two more questions: + Why switch from tabs to spaces? The rest of this file uses hard tabs, and that's what Guido prefers in C source. + Think hard about whether we really want to decref the value -- I doubt we do, as we're *transferring* ownership of the value from the dict to the caller. 
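For readers following the thread, the behavior under discussion can be sketched in pure Python. This is an illustration only (the actual patch is C code in dictobject.c, and the helper names below are made up for the sketch); it shows both the popitem(key) form Raymond proposes and the pop(key) variant Neil suggests:

```python
# Illustrative sketch only: the semantics of the proposed
# d.popitem(key) and Neil's d.pop(key) variant, in pure Python.
# The real patch implements this in C with a single hash lookup.

def popitem_with_key(d, key):
    """Remove key from d and return the (key, value) pair."""
    value = d[key]      # raises KeyError if the key is absent
    del d[key]
    return (key, value)

def pop_value(d, key):
    """Neil's variant: return just the value, without the redundant key."""
    value = d[key]
    del d[key]
    return value

d = {"a": 1, "b": 2}
assert popitem_with_key(d, "a") == ("a", 1)
assert pop_value(d, "b") == 2
assert d == {}
```

The pure-Python version needs two lookups (one for `d[key]`, one for `del d[key]`); doing it in C with one lookup is the performance point made in the initial comment.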
---------------------------------------------------------------------- Comment By: Raymond Hettinger (rhettinger) Date: 2002-04-08 10:14 Message: Logged In: YES user_id=80475 Here is a revised patch for D.pop() incorporating Tim's ideas: + Docstring spelling fixed + Switched to METH_O instead of METH_VARARGS + Delayed decref until dict entry in consistent state + Removed unused int i=0 variable + Tabs replaced with spaces ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-04-07 17:54 Message: Logged In: YES user_id=31435 I like Raymond's new pop(). Problems: + "speficied" is misspelled in the docstring. + Should be declared METH_O, not METH_VARARGS (mimic how, e.g., dict_update is set up). + The decrefs have to be reworked: a decref can trigger calls back into arbitrary Python code, due to __del__ methods getting invoked. This means you can never leave any live object in an insane or inconsistent state *during* a decref. What you need to do instead is first capture the key and value into local vrbls, plug dummy and NULL in to the dict slot, and decrement the used count. This leaves the dict in a consistent state again. Only then is it safe to decref the key and value. ---------------------------------------------------------------------- Comment By: Raymond Hettinger (rhettinger) Date: 2002-04-06 22:00 Message: Logged In: YES user_id=80475 Here's a more fleshed-out implementation of D.pop(). It doesn't rely on popitem(), doesn't malloc a tuple, and the refcounts should be correct. One change from Neil's version, since k isn't being returned, then an arbitrary pair doesn't make sense, so the key argument to pop is required rather than optional. The diff is off of 2.123. ---------------------------------------------------------------------- Comment By: Neil Schemenauer (nascheme) Date: 2002-04-06 20:51 Message: Logged In: YES user_id=35752 Here's a quick implementation. 
D.pop() is not as efficient as it could be (it uses popitem and then promptly deallocates the item tuple). I'm not sure it matters though. Someone should probably check the refcounts. I always screw them up. :-) ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-06 20:16 Message: Logged In: YES user_id=6380 Not a bad idea, Neil! Care to work the code around to implement that? ---------------------------------------------------------------------- Comment By: Neil Schemenauer (nascheme) Date: 2002-04-06 20:14 Message: Logged In: YES user_id=35752 I think this should be implemented as pop() instead: D.pop([key]) -> value -- remove and return value by key (default a random value) It makes no sense to return the key when you already have it. pop() also matches well with list pop(): L.pop([index]) -> item -- remove and return item at index (default last) ---------------------------------------------------------------------- Comment By: Raymond Hettinger (rhettinger) Date: 2002-04-06 20:08 Message: Logged In: YES user_id=80475 The tests and documentation patches have been added. ---------------------------------------------------------------------- Comment By: Raymond Hettinger (rhettinger) Date: 2002-04-06 12:23 Message: Logged In: YES user_id=80475 Q: Does the new function signature slow the existing no argument case? A: Yes. The function is already so fast that the small overhead of PyArg_ParseTuple is measurable. My timing shows an 8% drop in speed. Q: Is _,v=d.popitem(k) slower than v=d.popvalue(k)? A: Yes. Though popvalue is a non-existing strawman, it would be quicker: it would cost two calls to Py_DECREF while saving a call to PyTuple_New and two calls to PyTuple_SET_ITEM. Still, the running time for popvalue would be dominated by the rest of the function and not the single malloc.
Also, I think it unlikely that the dictionary interface would ever be expanded for popvalue, so the comparison is moot. Q: Are there cases where (k,v) is needed? A: Yes. One common case is where the tuple still needs to be formed to help build another dictionary: dict([d.popitem(k) for k in xferlist]) or [n.__setitem__(d.popitem(k)) for k in xferlist]. Also, it is useful when the key is computed by a function and then needs to be used in an expression. I often do something like that with setdefault: uniqInOrder= [u.setdefault(k,k) for k in alist if k not in u]. Also, when the key is computed by a function, it may need to be saved only when .popitem succeeds but not when the key is missing: "get and remove key if present; trigger exception if absent" This pattern is used in validating user input keys for deletion. Q: Where is the unittest and doc patch? A: Coming this weekend. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-04-05 16:50 Message: Logged In: YES user_id=31435 Are there examples of concrete use cases? The idea that dict.popitem(k) returns (k, dict[k]) seems kinda goofy, since you necessarily already have k. So the question is whether this is the function signature that's really desired, or whether it's too much a hack. As is, it slows down popitem() without an argument because it requires using a fancier calling sequence, and because it now defers that case to a taken branch; it's also much slower than a function that just returned v could be, due to the need to allocate a 2-tuple to hold a redundant copy of the key. Perhaps there are use cases of the form k, v = dict.popitem(f(x, y, z)) where the key is known only implicitly? ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-05 16:47 Message: Logged In: YES user_id=6380 FYI, I'm uploading my version of the patch, with code cleanup, as popdict2.txt. 
I've moved the popitem-with-arg code before the allocation of res, because there were several places where this code returned NULL without DECREF'ing res. Repeating the PyTuple_New(2) call seemed the lesser evil. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-05 16:38 Message: Logged In: YES user_id=6380 I've reviewed the patch and see only cosmetic things that need to be changed. I'll check it in as soon as you submit a unittest and doc patch. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-05 16:26 Message: Logged In: YES user_id=6380 Now, if you could also upload a unittest and a doc patch, that would be great! ---------------------------------------------------------------------- Comment By: Raymond Hettinger (rhettinger) Date: 2002-04-05 16:10 Message: Logged In: YES user_id=80475 Context diff uploaded at poppatch.c below. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-05 15:11 Message: Logged In: YES user_id=6380 Please upload a context or unified diff. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539949&group_id=5470 From noreply@sourceforge.net Fri Apr 12 18:08:19 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Fri, 12 Apr 2002 10:08:19 -0700 Subject: [Patches] [ python-Patches-543098 ] start docs for PyEval_ function Message-ID: Patches item #543098, was opened at 2002-04-12 19:08 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=543098&group_id=5470 Category: Documentation Group: None Status: Open Resolution: None Priority: 5 Submitted By: Thomas Heller (theller) Assigned to: Fred L. Drake, Jr. 
(fdrake) Summary: start docs for PyEval_ function Initial Comment: The start of a new (sub)section for the api manual. Should this go into api/utilities? ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=543098&group_id=5470 From noreply@sourceforge.net Fri Apr 12 18:13:49 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Fri, 12 Apr 2002 10:13:49 -0700 Subject: [Patches] [ python-Patches-462936 ] Improved modulefinder Message-ID: Patches item #462936, was opened at 2001-09-19 19:43 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=462936&group_id=5470 Category: Modules Group: None >Status: Closed >Resolution: Rejected Priority: 5 Submitted By: Thomas Heller (theller) Assigned to: Nobody/Anonymous (nobody) Summary: Improved modulefinder Initial Comment: This patch adds two improvements to freeze/modulefinder. 1. ModuleFinder now keeps track of which module is imported by whom. 2. ModuleFinder, when instantiated with the new scan_extdeps=1 argument, tries to track dependencies of builtin and extension modules. ---------------------------------------------------------------------- >Comment By: Thomas Heller (theller) Date: 2002-04-12 19:13 Message: Logged In: YES user_id=11105 Closed as rejected. ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2001-09-23 23:50 Message: Logged In: YES user_id=21627 I would still prefer the inelegant approach. Your approach appears to be quite dangerous: the packager would essentially run arbitrary C code... If you absolutely have to use such a feature, I think you can do better than analysing the python -v output: watch sys.modules before and after the import. 
As for extending the hard-coded knowledge: I was suggesting that the packaging tool that uses modulefinder has a mechanism to extend the hard-coded knowledge by other hard-coded knowledge (which lives in the packaging tool, instead of living in modulefinder). If the packaging tool absolutely wants to, it also could run the Python interpreter through a pipe and put the gathered output into modulefinder :-) ---------------------------------------------------------------------- Comment By: Thomas Heller (theller) Date: 2001-09-20 11:42 Message: Logged In: YES user_id=11105 The use case is to find as many dependencies as possible. Sure, you cannot assume that importing an extension module finds all dependencies - only those which are executed inside the initmodule function. OTOH, this covers a *lot* of problematic cases, pygame and numpy for example. The situation is (somewhat) similar to finding dependencies of python modules - only those done with normal import statements are found; __import__, eval, or exec is not handled. A possible solution would be to run the script in 'profiling mode', where the script is actually run, and all imports are monitored. This is however far beyond ModuleFinder's scope. Hardcoding the knowledge about dependencies into ModuleFinder for the core modules would be possible although inelegant IMO. An API for non-standard modules would be possible, but how should this be used without executing any code? ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2001-09-19 22:28 Message: Logged In: YES user_id=21627 I dislike the chunk on finding external dependencies. What is the typical use case (i.e. what module has what external dependencies)? It seems easier to hard-code knowledge about external dependencies into ModuleFinder; this hard-coded knowledge should cover all core modules. In addition, there should be an API to extend this knowledge for non-standard modules.
Furthermore, by executing an import, you cannot be sure that you really find all dependencies - some may only show up when a certain function is used. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=462936&group_id=5470 From noreply@sourceforge.net Fri Apr 12 18:14:49 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Fri, 12 Apr 2002 10:14:49 -0700 Subject: [Patches] [ python-Patches-542569 ] tp_print tp_repr tp_str in test_bool.py Message-ID: Patches item #542569, was opened at 2002-04-11 19:01 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=542569&group_id=5470 Category: Tests Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Hernan Martinez Foffani (hfoffani) Assigned to: Nobody/Anonymous (nobody) Summary: tp_print tp_repr tp_str in test_bool.py Initial Comment: Those slots are not being tested by test_bool.py if it is run standalone. bool_print() does not run during the complete regression test suite. I was using Neal's tools and chose boolobject.c (because it's an easy module :-) to get in touch with the internals. I don't know if this patch would be useful to you because I didn't see similar checks done for other types. Ie: the eval(repr(x))==x property, or the tp_print slot (I found only one for dicts.) Hope it helps, -Hernan ---------------------------------------------------------------------- >Comment By: Hernan Martinez Foffani (hfoffani) Date: 2002-04-12 19:14 Message: Logged In: YES user_id=112690 The patch file "21067: test_bool.diff" is the good one.
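The kind of checks Hernan describes can be sketched as follows. This is only an illustration of the properties being tested (repr/str of the bool singletons and the eval(repr(x)) == x round-trip), not the attached test_bool.diff itself:

```python
# Illustrative sketch of the checks described above; not the
# attached test_bool.diff itself.
for x in (True, False):
    assert repr(x) in ("True", "False")
    assert str(x) == repr(x)            # bool defines no tp_str, so str falls back to repr
    assert eval(repr(x)) == x           # the eval(repr(x)) == x property
    assert isinstance(eval(repr(x)), bool)
```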
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=542569&group_id=5470 From noreply@sourceforge.net Fri Apr 12 18:21:44 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Fri, 12 Apr 2002 10:21:44 -0700 Subject: [Patches] [ python-Patches-543098 ] start docs for PyEval_* functions Message-ID: Patches item #543098, was opened at 2002-04-12 19:08 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=543098&group_id=5470 Category: Documentation Group: None Status: Open Resolution: None Priority: 5 Submitted By: Thomas Heller (theller) Assigned to: Fred L. Drake, Jr. (fdrake) >Summary: start docs for PyEval_* functions Initial Comment: The start of a new (sub)section for the api manual. Should this go into api/utilities? ---------------------------------------------------------------------- >Comment By: Thomas Heller (theller) Date: 2002-04-12 19:21 Message: Logged In: YES user_id=11105 Typo in the summary. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=543098&group_id=5470 From noreply@sourceforge.net Fri Apr 12 19:37:50 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Fri, 12 Apr 2002 11:37:50 -0700 Subject: [Patches] [ python-Patches-536241 ] string.zfill and unicode Message-ID: Patches item #536241, was opened at 2002-03-28 14:26 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=536241&group_id=5470 Category: Library (Lib) Group: None Status: Open Resolution: Fixed Priority: 5 Submitted By: Walter Dörwald (doerwalter) Assigned to: A.M. Kuchling (akuchling) Summary: string.zfill and unicode Initial Comment: This patch makes the function string.zfill work with unicode instances (and instances of str and unicode subclasses). 
Currently string.zfill(u"123", 10) results in "0000u'123'". With this patch the result is u'0000000123'. Should zfill be made a real str and unicode method? I noticed that a zfill implementation is available in unicodeobject.c, but commented out. ---------------------------------------------------------------------- >Comment By: Walter Dörwald (doerwalter) Date: 2002-04-12 20:37 Message: Logged In: YES user_id=89016 Now that test_userstring.py works and fails (rev 1.6) should we add zfill as str and unicode methods or change UserString.zfill to use string.zfill? I've made a patch (attached) that implements zfill as methods (i.e. activates the version in unicodeobject.c that was commented out and implements the same in stringobject.c) (And it adds the test for unicode support back in.) ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-04-12 16:51 Message: Logged In: YES user_id=21627 Re: optional Unicode: Walter is correct; configuring with --disable-unicode currently breaks the string module. One might consider using types.StringTypes; OTOH, pulling in types might not be desirable. As for str vs. repr: Python was always using repr in zfill, so changing it may break things. So I recommend that Walter reverts Andrew's check-in and applies his change. ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-03-30 12:25 Message: Logged In: YES user_id=6656 Hah, I was going to say that but was distracted by IE wiping out the machine I'm sitting at. Re-opening. ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-03-30 12:16 Message: Logged In: YES user_id=89016 But Python could be compiled without unicode support (by undefining PY_USING_UNICODE), and string.zfill should work even in this case. What about making zfill a real str and unicode method?
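As a rough pure-Python illustration of the fixed behavior discussed in this thread (pad the string itself, keeping any leading sign in front of the zeros, rather than padding repr(x)) — this standalone helper is only a sketch of the logic, not the C implementation in stringobject.c/unicodeobject.c:

```python
# Rough sketch of the intended zfill logic (not the patch itself):
# pad the string, not its repr(), and keep a leading sign in front
# of the zero padding.
def zfill(x, width):
    if not isinstance(x, str):
        x = str(x)                 # the thread settled on str(), not repr()
    sign = ""
    if x and x[0] in "+-":
        sign, x = x[0], x[1:]
    return sign + x.rjust(width - len(sign), "0")

assert zfill("123", 10) == "0000000123"
assert zfill(42, 5) == "00042"
assert zfill("-12", 5) == "-0012"
```

Modern Python exposes this directly as the built-in str.zfill method, which is effectively what the patch proposes.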
---------------------------------------------------------------------- Comment By: A.M. Kuchling (akuchling) Date: 2002-03-29 17:24 Message: Logged In: YES user_id=11375 Thanks for your patch! I've checked it into CVS, with two modifications. First, I removed the code to handle the case where Python doesn't have a unicode() built-in; there's no expectation that you can take the standard library for Python version N and use it with version N-1, so this code isn't needed. Second, I changed string.zfill() to take the str() and not the repr() when it gets a non-string object because that seems to make more sense. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=536241&group_id=5470 From noreply@sourceforge.net Fri Apr 12 20:26:37 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Fri, 12 Apr 2002 12:26:37 -0700 Subject: [Patches] [ python-Patches-543098 ] start docs for PyEval_* functions Message-ID: Patches item #543098, was opened at 2002-04-12 13:08 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=543098&group_id=5470 Category: Documentation Group: None Status: Open Resolution: None Priority: 5 Submitted By: Thomas Heller (theller) >Assigned to: Thomas Heller (theller) Summary: start docs for PyEval_* functions Initial Comment: The start of a new (sub)section for the api manual. Should this go into api/utilities? ---------------------------------------------------------------------- >Comment By: Fred L. Drake, Jr. (fdrake) Date: 2002-04-12 15:26 Message: Logged In: YES user_id=3066 The section needs a better heading. ;) (The Utilities chapter is fine; it can go at the end.) I'd also like to see more content in the section before it gets added (though that's just as easily fixed once the boilerplate is checked in).
It would be good to review the material in "Documenting Python"; this is part of the standard documentation. PyEval_SetProfile() and PyEval_SetTrace() are already documented. Please continue with this! ---------------------------------------------------------------------- Comment By: Thomas Heller (theller) Date: 2002-04-12 13:21 Message: Logged In: YES user_id=11105 Typo in the summary. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=543098&group_id=5470 From noreply@sourceforge.net Fri Apr 12 20:32:47 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Fri, 12 Apr 2002 12:32:47 -0700 Subject: [Patches] [ python-Patches-543098 ] start docs for PyEval_* functions Message-ID: Patches item #543098, was opened at 2002-04-12 19:08 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=543098&group_id=5470 Category: Documentation Group: None Status: Open Resolution: None Priority: 5 Submitted By: Thomas Heller (theller) Assigned to: Thomas Heller (theller) Summary: start docs for PyEval_* functions Initial Comment: The start of a new (sub)section for the api manual. Should this go into api/utilities? ---------------------------------------------------------------------- >Comment By: Thomas Heller (theller) Date: 2002-04-12 21:32 Message: Logged In: YES user_id=11105 > The section needs a better heading. ;) This is where I need your help. These functions are in ceval.c, and I don't even know why. The reason would probably make a good header. Suggestions? Since I currently have no internet access except email and http, I will work on a local version. I'm also relying on you checking and fixing the markup ;-) ---------------------------------------------------------------------- Comment By: Fred L. Drake, Jr. (fdrake) Date: 2002-04-12 21:26 Message: Logged In: YES user_id=3066 The section needs a better heading.
;) (The Utilities chapter is fine; it can go at the end.) I'd also like to see more content in the section before it gets added (though that's just as easily fixed once the boilerplate is checked in). It would be good to review the material in "Documenting Python"; this is part of the standard documentation. PyEval_SetProfile() and PyEval_SetTrace() are already documented. Please continue with this! ---------------------------------------------------------------------- Comment By: Thomas Heller (theller) Date: 2002-04-12 19:21 Message: Logged In: YES user_id=11105 Typo in the summary. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=543098&group_id=5470 From noreply@sourceforge.net Fri Apr 12 21:43:36 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Fri, 12 Apr 2002 13:43:36 -0700 Subject: [Patches] [ python-Patches-541694 ] whichdb unittest Message-ID: Patches item #541694, was opened at 2002-04-09 15:15 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=541694&group_id=5470 Category: Tests Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Gregory H. Ball (greg_ball) >Assigned to: Neal Norwitz (nnorwitz) Summary: whichdb unittest Initial Comment: Attached patch is a first crack at a unit test for whichdb. I think that all functionality required for use by the anydbm module is tested, but only for the database modules found in a given installation. The test case is built up at runtime to cover all the available modules, so it is a bit introspective, but I think it is obvious that it should run correctly. Unfortunately it crashes on my box (Redhat 6.2) and this seems to be a real problem with whichdb: it assumes things about the dbm format which turn out to be wrong sometimes. I only discovered this because test_anydbm was crashing, when whichdb failed to work on dbm files. 
It would not have crashed if dbhash was available... and dbhash was not available because bsddb was not built correctly. So I think there is a build bug there, but I have little idea how to solve that one at this point. Would I be correct in thinking that if this test really uncovers bugs in whichdb, it can't be checked in until they are fixed? Unfortunately I don't know much about the various databases, but I'll try to work with someone on it. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=541694&group_id=5470 From noreply@sourceforge.net Sat Apr 13 02:00:17 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Fri, 12 Apr 2002 18:00:17 -0700 Subject: [Patches] [ python-Patches-536241 ] string.zfill and unicode Message-ID: Patches item #536241, was opened at 2002-03-28 08:26 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=536241&group_id=5470 Category: Library (Lib) Group: None Status: Open >Resolution: Accepted Priority: 5 Submitted By: Walter Dörwald (doerwalter) >Assigned to: Walter Dörwald (doerwalter) Summary: string.zfill and unicode Initial Comment: This patch makes the function string.zfill work with unicode instances (and instances of str and unicode subclasses). Currently string.zfill(u"123", 10) results in "0000u'123'". With this patch the result is u'0000000123'. Should zfill be made a real str and unicode method? I noticed that a zfill implementation is available in unicodeobject.c, but commented out. ---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-12 21:00 Message: Logged In: YES user_id=6380 I'm for making them methods. Walter, just check it in!
---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-04-12 14:37 Message: Logged In: YES user_id=89016 Now that test_userstring.py works and fails (rev 1.6) should we add zfill as str and unicode methods or change UserString.zfill to use string.zfill? I've made a patch (attached) that implements zfill as methods (i.e. activates the version in unicodeobject.c that was commented out and implements the same in stringobject.c) (And it adds the test for unicode support back in.) ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-04-12 10:51 Message: Logged In: YES user_id=21627 Re: optional Unicode: Walter is correct; configuring with --disable-unicode currently breaks the string module. One might consider using types.StringTypes; OTOH, pulling in types might not be desirable. As for str vs. repr: Python was always using repr in zfill, so changing it may break things. So I recommend that Walter reverts Andrew's check-in and applies his change. ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-03-30 06:25 Message: Logged In: YES user_id=6656 Hah, I was going to say that but was distracted by IE wiping out the machine I'm sitting at. Re-opening. ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-03-30 06:16 Message: Logged In: YES user_id=89016 But Python could be compiled without unicode support (by undefining PY_USING_UNICODE), and string.zfill should work even in this case. What about making zfill a real str and unicode method? ---------------------------------------------------------------------- Comment By: A.M. Kuchling (akuchling) Date: 2002-03-29 11:24 Message: Logged In: YES user_id=11375 Thanks for your patch! I've checked it into CVS, with two modifications. 
First, I removed the code to handle the case where Python doesn't have a unicode() built-in; there's no expectation that you can take the standard library for Python version N and use it with version N-1, so this code isn't needed. Second, I changed string.zfill() to take the str() and not the repr() when it gets a non-string object because that seems to make more sense. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=536241&group_id=5470
CQkJCSAgPEEgSFJFRj0ibWFpbHRvOnRyYWZmaWNAYWxsaWVkbWFya2V0aW5nLm5ldCI+dHJhZmZp Y0BhbGxpZWRtYXJrZXRpbmcubmV0PC9BPjwvQj48L1NQQU4+PC9URD4NCgkJCQkJPC9UUj4NCgkJ CQkJPFRSPg0KCQkJCQkgPFREPg0KCQkJCQkgICA8SFI+DQoJCQkJCSA8L1REPg0KCQkJCQk8L1RS Pg0KCQkJCQk8VFI+DQoJCQkJCSA8VEQ+Jm5ic3A7PC9URD4NCgkJCQkJPC9UUj4NCgkJCQkJPFRS Pg0KCQkJCQkgPFREPjxCPjxTUEFOIGNsYXNzPSJnZW5lcmFsIj48QklHPjxCSUc+PEJJRz5GdXR1 cmVIaXRzIFRyYWZmaWMgLSA0DQoJCQkJCSAgRlJFRSE8L0JJRz48L0JJRz48L0JJRz48L1NQQU4+ PC9CPjwvVEQ+DQoJCQkJCTwvVFI+DQoJCQkJCTxUUj4NCgkJCQkJIDxURD48U1BBTiBjbGFzcz0i Z2VuZXJhbCI+RnV0dXJlSGl0cyBpcyBhIHNvZnR3YXJlIHByb2dyYW0gaG9zdGVkIG9uIG91cg0K CQkJCQkgIHNlcnZlciB0aGF0Jm5ic3A7cGxhY2VzIGEgJm5ic3A7SUNPTiBvbiB5b3VyIHVzZXJz IGRlc2t0b3AsIGZhdm9yaXRlcywNCgkJCQkJICBsaW5rcywmbmJzcDtzdGFydCBidXR0b24gYW5k IGNoYW5nZXMgdGhlaXIgaG9tZXBhZ2UuIFRoaXMgaXMgZG9uZSBlaXRoZXINCgkJCQkJICBhdXRv bWF0aWNhbGx5IG9yIHdpdGggYWxlcnQuIFRoaXMgcHJvZHVjdCBpcyBmcmVlIGFuZCBpcyBsb2Nh dGVkIG9uIG91cg0KCQkJCQkgIDxBIEhSRUY9Imh0dHA6Ly93d3cuYWxsaWVkbWFya2V0aW5nLm5l dCI+Y29ycG9yYXRlIHdlYiBzaXRlPC9BPiBvciBvbg0KCQkJCQkgIDxBIEhSRUY9Imh0dHA6Ly9m dXR1cmVoaXRzLmFsbGllZG1hcmtldGluZy5uZXQiPmh0dHA6Ly9mdXR1cmVoaXRzLmFsbGllZG1h cmtldGluZy5uZXQ8L0E+PC9TUEFOPjwvVEQ+DQoJCQkJCTwvVFI+DQoJCQkJCTxUUj4NCgkJCQkJ IDxURD4mbmJzcDs8L1REPg0KCQkJCQk8L1RSPg0KCQkJCSAgICAgIDwvVEFCTEU+DQoJCQkJICAg IDwvVEQ+DQoJCQkJICA8L1RSPg0KCQkJCTwvVEFCTEU+DQoJCQkgICAgICA8L1REPg0KCQkJICAg IDwvVFI+DQoJCQkgICAgPFRSPg0KCQkJICAgICAgPFREPg0KCQkJCSAgPEhSPg0KCQkJICAgICAg PC9URD4NCgkJCSAgICA8L1RSPg0KCQkJICAgIDxUUj4NCgkJCSAgICAgIDxURD48U1BBTiBjbGFz cz0iZ2VuZXJhbCI+VG8gcmVtb3ZlIHlvdXIgZW1haWwgYWRkcmVzcyBmcm9tIHRoaXMgbGlzdCBh bmQNCgkJCQlhbnkgb3RoZXIgbGlzdHMgYXNzb2NpYXRlZCB0byBUaGUtRW1haWwtSW5mb3JtYXRv cnkNCgkJCQk8QSBIUkVGPSJodHRwOi8vYWRzZXJ2ZXIuY3liZXJzdWJzY3JpYmVyLmNvbS9yZW1v dmUuaHRtbCI+Q0xJQ0sNCgkJCQlIRVJFPC9BPjwvU1BBTj48L1REPg0KCQkJICAgIDwvVFI+DQoJ CQkgIDwvVEFCTEU+DQoJCQk8L1REPg0KCQkgICAgICA8L1RSPg0KCQkgICAgPC9UQUJMRT4NCgkJ 
ICAgIDxQPg0KCQkgIDwvVEQ+DQoJCTwvVFI+DQoJICAgICAgPC9UQUJMRT4NCgkgICAgPC9URD4N CgkgIDwvVFI+DQoJPC9UQUJMRT4NCiAgICAgIDwvVEQ+DQogICAgPC9UUj4NCiAgPC9UQUJMRT4N CjwvQ0VOVEVSPg0KPFA+DQo8L0JPRFk+PC9IVE1MPg0K From noreply@sourceforge.net Sat Apr 13 05:31:24 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Fri, 12 Apr 2002 21:31:24 -0700 Subject: [Patches] [ python-Patches-543316 ] UserDict.pop(key) Message-ID: Patches item #543316, was opened at 2002-04-13 04:31 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=543316&group_id=5470 Category: Library (Lib) Group: None Status: Open Resolution: None Priority: 5 Submitted By: Raymond Hettinger (rhettinger) Assigned to: Nobody/Anonymous (nobody) Summary: UserDict.pop(key) Initial Comment: This two line patch modifies UserDict.py to match the new dictionary behavior, d.pop(k). I originally omitted this patch on the theory that UserDict is headed toward deprecation, but I checked the docs and they promise that UserDict implements all of the methods for dictionaries. This patch makes that statement true once again. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=543316&group_id=5470 From noreply@sourceforge.net Sat Apr 13 05:35:56 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Fri, 12 Apr 2002 21:35:56 -0700 Subject: [Patches] [ python-Patches-543316 ] UserDict.pop(key) Message-ID: Patches item #543316, was opened at 2002-04-13 04:31 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=543316&group_id=5470 Category: Library (Lib) Group: None Status: Open Resolution: None Priority: 5 Submitted By: Raymond Hettinger (rhettinger) >Assigned to: Guido van Rossum (gvanrossum) Summary: UserDict.pop(key) Initial Comment: This two line patch modifies UserDict.py to match the new dictionary behavior, d.pop(k). 
I originally omitted this patch on the theory that UserDict is headed toward deprecation, but I checked the docs and they promise that UserDict implements all of the methods for dictionaries. This patch makes that statement true once again. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=543316&group_id=5470 From noreply@sourceforge.net Sat Apr 13 15:03:54 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sat, 13 Apr 2002 07:03:54 -0700 Subject: [Patches] [ python-Patches-543316 ] UserDict.pop(key) Message-ID: Patches item #543316, was opened at 2002-04-13 00:31 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=543316&group_id=5470 Category: Library (Lib) Group: None >Status: Closed >Resolution: Accepted Priority: 5 Submitted By: Raymond Hettinger (rhettinger) Assigned to: Guido van Rossum (gvanrossum) Summary: UserDict.pop(key) Initial Comment: This two line patch modifies UserDict.py to match the new dictionary behavior, d.pop(k). I originally omitted this patch on the theory that UserDict is headed toward deprecation, but I checked the docs and they promise that UserDict implements all of the methods for dictionaries. This patch makes that statement true once again. ---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-13 10:03 Message: Logged In: YES user_id=6380 Thanks! Added. 
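The two-line patch itself is not included in the archive; as a rough illustration, a dict-compatible pop() on a UserDict-style class (which stores its contents in a .data dict) might look like the following. This is a hypothetical sketch, not the actual patch:

```python
class SimpleUserDict:
    """Minimal stand-in for a UserDict-style class: wraps a real dict in .data."""

    def __init__(self, initial=None):
        self.data = dict(initial or {})

    def pop(self, key):
        # Delegate to the wrapped dict, matching the new d.pop(k)
        # behavior: return the value and remove the key, or raise
        # KeyError if the key is absent.
        return self.data.pop(key)


d = SimpleUserDict({"a": 1, "b": 2})
print(d.pop("a"))  # 1
```

Delegating to the wrapped dict is what keeps the promise quoted above: UserDict then implements every method the built-in dictionary does.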
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=543316&group_id=5470 From noreply@sourceforge.net Sat Apr 13 18:41:48 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sat, 13 Apr 2002 10:41:48 -0700 Subject: [Patches] [ python-Patches-543447 ] Inclusion of mknod() in posixmodule Message-ID: Patches item #543447, was opened at 2002-04-13 17:41 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=543447&group_id=5470 Category: Modules Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Gustavo Niemeyer (niemeyer) Assigned to: Nobody/Anonymous (nobody) Summary: Inclusion of mknod() in posixmodule Initial Comment: As discussed, here is a patch implementing mknod() in posixmodule.c. As a side note, this patch also renames the "file" parameter of mkfifo() to "filename", to better reflect its meaning. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=543447&group_id=5470 From noreply@sourceforge.net Sat Apr 13 23:07:10 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sat, 13 Apr 2002 15:07:10 -0700 Subject: [Patches] [ python-Patches-543498 ] s/Copyright/License/ in bdist_rpm.py Message-ID: Patches item #543498, was opened at 2002-04-13 22:07 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=543498&group_id=5470 Category: Distutils and setup.py Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Gustavo Niemeyer (niemeyer) Assigned to: Nobody/Anonymous (nobody) Summary: s/Copyright/License/ in bdist_rpm.py Initial Comment: The "Copyright" field in RPM spec files is obsolete. "License" should be used instead. 
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=543498&group_id=5470 From noreply@sourceforge.net Sun Apr 14 00:31:44 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sat, 13 Apr 2002 16:31:44 -0700 Subject: [Patches] [ python-Patches-476814 ] foreign-platform newline support Message-ID: Patches item #476814, was opened at 2001-10-31 17:41 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=476814&group_id=5470 Category: Core (C code) Group: None Status: Open Resolution: None Priority: 5 Submitted By: Jack Jansen (jackjansen) Assigned to: Jack Jansen (jackjansen) Summary: foreign-platform newline support Initial Comment: This patch enables Python to interpret all known newline conventions, CR, LF or CRLF, on all platforms. This support is enabled by configuring with --with-universal-newlines (so by default it is off, and everything should behave as usual). With universal newline support enabled two things happen: - When importing or otherwise parsing .py files any newline convention is accepted. - Python code can pass a new "t" mode parameter to open() which reads files with any newline convention. "t" cannot be combined with any other mode flags like "w" or "+", for obvious reasons. File objects have a new attribute "newlines" which contains the type of newlines encountered in the file (or None when no newline has been seen, or "mixed" if there were various types of newlines). Also included is a test script which tests both file I/O and parsing. ---------------------------------------------------------------------- >Comment By: Jack Jansen (jackjansen) Date: 2002-04-14 01:31 Message: Logged In: YES user_id=45365 A final tweak: return a tuple of newline values instead of 'mixed'. ---------------------------------------------------------------------- Comment By: Martin v.
Löwis (loewis) Date: 2002-04-11 22:31 Message: Logged In: YES user_id=21627 What is the rationale for making this a compile-time option? It seems to complicate things, with no apparent advantage. If this is for backwards compatibility, don't make it an option: nobody will rebuild Python just to work around a compatibility problem. Apart from that, the patch looks good. ---------------------------------------------------------------------- Comment By: Jack Jansen (jackjansen) Date: 2002-03-29 00:17 Message: Logged In: YES user_id=45365 New doc patch, and new version of the patch that mainly allows the U to be specified (no-op) in non-univ-newline-builds. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-03-25 22:07 Message: Logged In: YES user_id=6380 Thanks! But there's no documentation. Could I twist your arm for a separate doc patch? I'm tempted to give this a +1, but I'd like to hear from MvL and MAL to see if they foresee any interaction with their PEP 262 implementation. ---------------------------------------------------------------------- Comment By: Jack Jansen (jackjansen) Date: 2002-03-13 23:44 Message: Logged In: YES user_id=45365 A new version of the patch. Main differences are that U is now the mode character to trigger universal newline input and --with-universal-newlines is default on. ---------------------------------------------------------------------- Comment By: Jack Jansen (jackjansen) Date: 2002-01-16 23:47 Message: Logged In: YES user_id=45365 This version of the patch addresses the bug in Py_UniversalNewlineFread and fixes up some minor details. Tim's other issues are addressed (at least: I think they are:-) in a forthcoming PEP. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2001-12-14 00:57 Message: Logged In: YES user_id=31435 Back to Jack -- and sorry for sitting on it so long.
Clearly this isn't making it into 2.2 in the core. As I said on Python-Dev, I believe this needs a PEP: the design decisions are debatable, so *should* be debated outside the Mac community too. Note, though, that I can't stop you from adding it to the 2.2 Mac distribution (if you want it badly enough there). If a PEP won't be written, I suggest finding someone else to review it again; maybe Guido. Note that the patch needs doc changes too. The patch to regrtest.py doesn't belong here (I assume it just slipped in). There seems to be a lot of code in support of the f_newlinetypes member, and the value of that member isn't clear -- I can't imagine a good use for it (maybe it's a Mac thing?). The implementation of Py_UniversalNewlineFread appears incorrect to me: it reads n bytes *every* time around the outer loop, no matter how few characters are still required, and n doesn't change inside the loop. The business about the GIL may be due to the lack of docs: are, or are not, people supposed to release the GIL themselves around calls to these guys? It's not documented, and it appears your intent differed from my guess. Finally, it would be better to call ferror() after calling fread() instead of before it. ---------------------------------------------------------------------- Comment By: Jack Jansen (jackjansen) Date: 2001-11-14 16:13 Message: Logged In: YES user_id=45365 Here's a new version of the patch. To address your issues one by one: - get_line and Py_UniversalNewlineFgets are too difficult to integrate, at least, I don't see how I could do it. The storage management of get_line gets in the way. - The global lock comment I don't understand. The Universal... routines are replacements for fgets() and fread(), so have nothing to do with the interpreter lock. - The logic of all three routines (get_line too) has changed and I've put comments in. I hope this addresses some of the points.
- If universal_newline is false for a certain PyFileObject we now immediately take a quick exit via fgets() or fread(). There's also a new test script that tests some more border cases (like lines longer than 100 characters, and a lone CR just before end of file). ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2001-11-05 09:16 Message: Logged In: YES user_id=31435 It would be better if get_line just called Py_UniversalNewlineFgets (when appropriate) instead of duplicating its logic inline. Py_UniversalNewlineFgets and Py_UniversalNewlineFread should deal with releasing the global lock themselves -- the correct granularity for lock release/reacquire is around the C-level input routines (esp. for fread). The new routines never check for I/O errors! Why not? It seems essential. The new Fgets checks for EOF at the end of the loop instead of the top. This is surprising, and I stared a long time in vain trying to guess why. Setting newlinetypes |= NEWLINE_CR; immediately after seeing an '\r' would be as fast (instead of waiting to see EOF and then inferring the prior existence of '\r' indirectly from the state of the skipnextlf flag). Speaking of which, the fobj tests in the inner loop waste cycles. Set the local flag vrbls whether or not fobj is NULL. When you're *out* of the inner loop you can simply decline to store the new masks when fobj is NULL (and you're already doing the latter anyway). A test and branch inside the loop is much more expensive than or'ing in a flag bit inside the loop, ditto harder to understand. Floating the univ_newline test out of the loop (and duplicating the loop body, one way for univ_newline true and the other for it false) would also save a test and branch on every character. Doing fread one character at a time is very inefficient.
Since you know you need to obtain n characters in the end, and that these transformations require reading at least n characters, you could very profitably read n characters in one gulp at the start, then switch to k at a time where k is the number of \r\n pairs seen since the last fread call. This is easier to code than it sounds. It would be fine by me if you included (and initialized) the new file-object fields all the time, whether or not universal newlines are configured. I'd rather waste a few bytes in a file object than see #ifdefs spread thru the code. I'll be damned if I can think of a quick way to do this stuff on Windows -- native Windows fgets() is still the only Windows handle we have on avoiding crushing thread overhead inside MS's C library. I'll think some more about it (the thrust still being to eliminate the 't' mode flag, as whined about on Python-Dev). ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2001-10-31 18:38 Message: Logged In: YES user_id=6380 Tim, can you review this or pass it on to someone else who has time? Jack developed this patch after a discussion in which I was involved in some of the design, but I won't have time to look at it until December.
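The translation this thread describes can be modelled in a few lines of Python. The following is an illustrative sketch of the universal-newline semantics discussed above (map any of CR, LF, or CRLF to '\n' and record which conventions were seen, returning a tuple instead of 'mixed' per Jack's final tweak); it is not the C implementation itself:

```python
def universal_newlines(data):
    """Translate CR, LF and CRLF line endings in a string to '\n'.

    Returns (text, newlines), where newlines mimics the file attribute
    described in the patch: None if no newline was seen, a single
    convention string, or a tuple of all conventions encountered.
    """
    seen = set()
    out = []
    i = 0
    while i < len(data):
        c = data[i]
        if c == "\r":
            if i + 1 < len(data) and data[i + 1] == "\n":
                seen.add("\r\n")
                i += 1  # consume the LF half of the CRLF pair
            else:
                seen.add("\r")
            out.append("\n")
        else:
            if c == "\n":
                seen.add("\n")
            out.append(c)
        i += 1
    if not seen:
        newlines = None
    elif len(seen) == 1:
        newlines = seen.pop()
    else:
        newlines = tuple(sorted(seen))
    return "".join(out), newlines


text, kinds = universal_newlines("a\r\nb\rc\n")
# text is "a\nb\nc\n"; kinds is a tuple because the endings were mixed
```

Note this one-pass structure is also why Tim's lone-CR-before-EOF border case matters: a trailing '\r' must still produce a '\n' even though no following character ever arrives.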
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=476814&group_id=5470 From noreply@sourceforge.net Sun Apr 14 10:54:16 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sun, 14 Apr 2002 02:54:16 -0700 Subject: [Patches] [ python-Patches-542659 ] PyCode_New NULL parameters cleanup Message-ID: Patches item #542659, was opened at 2002-04-11 22:38 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=542659&group_id=5470 Category: Core (C code) Group: Python 2.3 >Status: Closed >Resolution: Accepted Priority: 1 Submitted By: Olivier Dormond (odormond) Assigned to: Nobody/Anonymous (nobody) Summary: PyCode_New NULL parameters cleanup Initial Comment: This patch removes the creation of an empty tuple for freevars or cellvars if they are equal to NULL, because this case is handled earlier (at the same time all the other parameters are checked) by raising a PyErr_BadInternalCall. It's almost a one-liner. ---------------------------------------------------------------------- >Comment By: Martin v.
Löwis (loewis) Date: 2002-04-14 11:54 Message: Logged In: YES user_id=21627 Thanks for the patch, applied as compile.c 2.240 ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=542659&group_id=5470 From noreply@sourceforge.net Sun Apr 14 10:58:17 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sun, 14 Apr 2002 02:58:17 -0700 Subject: [Patches] [ python-Patches-543498 ] s/Copyright/License/ in bdist_rpm.py Message-ID: Patches item #543498, was opened at 2002-04-14 00:07 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=543498&group_id=5470 Category: Distutils and setup.py Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Gustavo Niemeyer (niemeyer) Assigned to: Nobody/Anonymous (nobody) Summary: s/Copyright/License/ in bdist_rpm.py Initial Comment: The "Copyright" field in RPM spec files is obsolete. "License" should be used instead. ---------------------------------------------------------------------- >Comment By: Martin v. Löwis (loewis) Date: 2002-04-14 11:58 Message: Logged In: YES user_id=21627 Can you provide a pointer that shows this obsoletion? http://www.rpm.org/RPM-HOWTO/build.html#SPEC-FILE still says Copyright.
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=543498&group_id=5470 From noreply@sourceforge.net Sun Apr 14 11:21:10 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sun, 14 Apr 2002 03:21:10 -0700 Subject: [Patches] [ python-Patches-543447 ] Inclusion of mknod() in posixmodule Message-ID: Patches item #543447, was opened at 2002-04-13 19:41 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=543447&group_id=5470 Category: Modules Group: Python 2.3 >Status: Closed >Resolution: Accepted Priority: 5 Submitted By: Gustavo Niemeyer (niemeyer) Assigned to: Nobody/Anonymous (nobody) Summary: Inclusion of mknod() in posixmodule Initial Comment: As discussed, here is a patch implementing mknod() in posixmodule.c. As a side note, this patch also renames the "file" parameter of mkfifo() to "filename", to better reflect its meaning. ---------------------------------------------------------------------- >Comment By: Martin v. Löwis (loewis) Date: 2002-04-14 12:21 Message: Logged In: YES user_id=21627 Thanks for the patch. 
Committed as configure 1.298 configure.in 1.308 pyconfig.h.in 1.29 libos.tex 1.79 NEWS 1.386 posixmodule.c 2.228 ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=543447&group_id=5470 From noreply@sourceforge.net Sun Apr 14 11:24:34 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sun, 14 Apr 2002 03:24:34 -0700 Subject: [Patches] [ python-Patches-542569 ] tp_print tp_repr tp_str in test_bool.py Message-ID: Patches item #542569, was opened at 2002-04-11 19:01 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=542569&group_id=5470 Category: Tests Group: Python 2.3 >Status: Closed >Resolution: Accepted Priority: 5 Submitted By: Hernan Martinez Foffani (hfoffani) Assigned to: Nobody/Anonymous (nobody) Summary: tp_print tp_repr tp_str in test_bool.py Initial Comment: Those slots are not being tested by test_bool.py if it was run standalone. bool_print() does not run during the complete regression test suite. I was using Neal's tools and chose boolobject.c (because it's an easy module :-) to get in touch with the internals. I don't know if this patch would be useful to you because I didn't see similar checks done for other types. Ie: the eval(repr(x))==x property, or the tp_print slot (I found only one for dicts.) Hope it helps, -Hernan ---------------------------------------------------------------------- >Comment By: Martin v. Löwis (loewis) Date: 2002-04-14 12:24 Message: Logged In: YES user_id=21627 As a principle, there should be a test for each line of code, so yes, this patch is useful; I've applied it as test_bool.py 1.4. Feel free to contribute more of those. I'm not so sure tp_print is useful in the first place: the fall-back would have worked just as fine for bool.
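The properties mentioned above (the str()/repr() behaviour of the bool singletons and the eval(repr(x)) == x round-trip) can be checked in a few lines. A sketch of the kind of assertions such a test might contain, not the actual test_bool.py patch:

```python
# Check str()/repr() of the bool singletons and the round-trip
# property mentioned above: eval(repr(x)) == x, with bool preserved.
for value, name in [(True, "True"), (False, "False")]:
    assert repr(value) == name
    assert str(value) == name
    assert eval(repr(value)) == value
    assert type(eval(repr(value))) is bool

print("bool repr/str checks passed")
```

The last assertion is the interesting one: it catches an implementation that round-trips to the equal integer 1 or 0 instead of the bool singleton.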
---------------------------------------------------------------------- Comment By: Hernan Martinez Foffani (hfoffani) Date: 2002-04-12 19:14 Message: Logged In: YES user_id=112690 The patch file "21067: test_bool.diff" is the good one. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=542569&group_id=5470 From noreply@sourceforge.net Sun Apr 14 11:27:16 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sun, 14 Apr 2002 03:27:16 -0700 Subject: [Patches] [ python-Patches-542562 ] clean up trace.py Message-ID: Patches item #542562, was opened at 2002-04-11 18:34 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=542562&group_id=5470 Category: Demos and tools Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Zooko O'Whielacronx (zooko) Assigned to: Nobody/Anonymous (nobody) Summary: clean up trace.py Initial Comment: moderately interesting changes: * bugfix: remove "feature" of ignoring files in the tmpdir, as I was trying to run it on a file in the tmpdir and couldn't figure out why it gave no answer! I think the original motivation for that feature (spurious "/tmp/" filenames for builtin functions??) has gone away, but I'm not sure. * add more usage docs and warning about common mistake pretty mundane changes: * remove unnecessary checks for backwards compatibility with a version that never escaped from my (Zooko's) laptop * add a future-compatible check: if the interpreter offers an attribute called `sys.optimized', and it is "true", and the user is trying to do something that can't be done with an optimizing interpreter, then error out ---------------------------------------------------------------------- >Comment By: Martin v.
Löwis (loewis) Date: 2002-04-14 12:27 Message: Logged In: YES user_id=21627 Can you also provide the other cleanup that Guido requested (change of license, removal of change logs, etc)? ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=542562&group_id=5470 From noreply@sourceforge.net Sun Apr 14 11:31:22 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sun, 14 Apr 2002 03:31:22 -0700 Subject: [Patches] [ python-Patches-540583 ] IDLE calls MS HTML Help Python Docs Message-ID: Patches item #540583, was opened at 2002-04-07 16:36 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=540583&group_id=5470 Category: IDLE Group: None Status: Open Resolution: None Priority: 5 Submitted By: Hernan Martinez Foffani (hfoffani) Assigned to: Guido van Rossum (gvanrossum) Summary: IDLE calls MS HTML Help Python Docs Initial Comment: A little patch to enable IDLE to call the Python Docs in HTML Help format if it becomes part of the standard Windows distribution. A few things: - The patch uses os.startfile() instead of webbrowser.open() because the default browser may not be IExplorer. - The name of the .chm file is hardwired - I assume that the .chm file resides in the same directory as the python exec. - I'll try to upload a similar patch on idlefork. Regards, -Hernan ---------------------------------------------------------------------- >Comment By: Martin v. Löwis (loewis) Date: 2002-04-14 12:31 Message: Logged In: YES user_id=21627 Thanks for the patch. Applied as EditorWindow.py 1.41. ---------------------------------------------------------------------- Comment By: Hernan Martinez Foffani (hfoffani) Date: 2002-04-07 19:10 Message: Logged In: YES user_id=112690 Ok. I can add the fallback. ---------------------------------------------------------------------- Comment By: Martin v.
Löwis (loewis) Date: 2002-04-07 19:06 Message: Logged In: YES user_id=21627 IMO, it would be good if it would fall back to HTML help if the chm file is not found. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=540583&group_id=5470 From noreply@sourceforge.net Sun Apr 14 11:31:35 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sun, 14 Apr 2002 03:31:35 -0700 Subject: [Patches] [ python-Patches-540583 ] IDLE calls MS HTML Help Python Docs Message-ID: Patches item #540583, was opened at 2002-04-07 16:36 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=540583&group_id=5470 Category: IDLE Group: None >Status: Closed >Resolution: Accepted Priority: 5 Submitted By: Hernan Martinez Foffani (hfoffani) Assigned to: Guido van Rossum (gvanrossum) Summary: IDLE calls MS HTML Help Python Docs Initial Comment: A little patch to enable IDLE to call the Python Docs in HTML Help format if it becomes part of the standard Windows distribution. A few things: - The patch uses os.startfile() instead of webbrowser.open() because the default browser may not be IExplorer. - The name of the .chm file is hardwired - I assume that the .chm file resides in the same directory as the python exec. - I'll try to upload a similar patch on idlefork. Regards, -Hernan ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-04-14 12:31 Message: Logged In: YES user_id=21627 Thanks for the patch. Applied as EditorWindow.py 1.41. ---------------------------------------------------------------------- Comment By: Hernan Martinez Foffani (hfoffani) Date: 2002-04-07 19:10 Message: Logged In: YES user_id=112690 Ok. I can add the fallback. ---------------------------------------------------------------------- Comment By: Martin v.
Löwis (loewis) Date: 2002-04-07 19:06 Message: Logged In: YES user_id=21627 IMO, it would be good if it would fall back to HTML help if the chm file is not found. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=540583&group_id=5470 From noreply@sourceforge.net Sun Apr 14 11:35:02 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sun, 14 Apr 2002 03:35:02 -0700 Subject: [Patches] [ python-Patches-403972 ] threaded profiler. Message-ID: Patches item #403972, was opened at 2001-02-23 16:21 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=403972&group_id=5470 Category: Demos and tools Group: None >Status: Closed >Resolution: Rejected Priority: 5 Submitted By: Amila Fernando (amila) Assigned to: Fred L. Drake, Jr. (fdrake) Summary: threaded profiler. Initial Comment: Basically a profiler that can handle threaded programs and generate profiling snapshots. It does however have some situations it cannot handle well (see included README for details). ---------------------------------------------------------------------- >Comment By: Martin v. Löwis (loewis) Date: 2002-04-14 12:35 Message: Logged In: YES user_id=21627 Since there have been no further comments on this issue, I reject this patch. ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-03-09 11:47 Message: Logged In: YES user_id=21627 I recommend to reject this patch. Since it is pure-Python, it is probably more suited as a stand-alone package. For inclusion into Python, trying to hook into thread creation is a hack, IMO, there are certainly ways to cheat that technique. ---------------------------------------------------------------------- Comment By: Fred L. Drake, Jr.
(fdrake) Date: 2001-07-04 06:27 Message: Logged In: YES user_id=3066 Assigned to me since I've been digging into the profiling support lately.
----------------------------------------------------------------------
Comment By: Jeremy Hylton (jhylton) Date: 2001-05-09 18:11 Message: Logged In: YES user_id=31392 Perhaps you could share this on comp.lang.python and see if people can help you fix the situations it doesn't handle well.
----------------------------------------------------------------------
You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=403972&group_id=5470
From noreply@sourceforge.net Sun Apr 14 11:38:26 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sun, 14 Apr 2002 03:38:26 -0700 Subject: [Patches] [ python-Patches-418465 ] patches for python-mode.el V4.1 Message-ID:
Patches item #418465, was opened at 2001-04-24 07:11 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=418465&group_id=5470 Category: Demos and tools Group: None >Status: Closed >Resolution: Rejected Priority: 5 Submitted By: Bob Weiner (bwcto) Assigned to: Barry Warsaw (bwarsaw) Summary: patches for python-mode.el V4.1
Initial Comment: This patch fixes a number of issues with python-mode.el (the fixes are documented within the patch) and also extends python-mode.el so that it works with the Emacs interface to pydoc, pydoc.el, which was just released. Please let me know if you decide to apply all of these changes and put this into a production release of Python, at which point I will stop distributing the modified python-mode.el with the pydoc.el package. Thanks, Bob
----------------------------------------------------------------------
>Comment By: Martin v. Löwis (loewis) Date: 2002-04-14 12:38 Message: Logged In: YES user_id=21627 Since there is apparently no interest in this patch anymore, I reject it.
----------------------------------------------------------------------
Comment By: Martin v. Löwis (loewis) Date: 2002-03-09 11:55 Message: Logged In: YES user_id=21627 The patch fails completely when applied to python-mode.el. This is certainly not the fault of the submitter, but due to the fact that it has been sitting around for such a long time. Bob, are you still interested in this patch, and willing to provide an updated version?
----------------------------------------------------------------------
You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=418465&group_id=5470
From noreply@sourceforge.net Sun Apr 14 17:46:27 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sun, 14 Apr 2002 09:46:27 -0700 Subject: [Patches] [ python-Patches-543498 ] s/Copyright/License/ in bdist_rpm.py Message-ID:
Patches item #543498, was opened at 2002-04-13 22:07 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=543498&group_id=5470 Category: Distutils and setup.py Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Gustavo Niemeyer (niemeyer) Assigned to: Nobody/Anonymous (nobody) Summary: s/Copyright/License/ in bdist_rpm.py
Initial Comment: The "Copyright" field in RPM spec files is obsolete. "License" should be used instead.
----------------------------------------------------------------------
>Comment By: Gustavo Niemeyer (niemeyer) Date: 2002-04-14 16:46 Message: Logged In: YES user_id=7887 The rpm.org site is much more obsolete than this tag. Here is an excerpt from a message by Jeff Johnson on rpm-list (subject is "Re: three questions about building rpms"): ---- [...] This is historical legacy. Originally rpm had Copyright: GPL but everyone said GPL is not a copyright. So, rpm changed the tag name to License:, and, for backward compatibility, used the same numeric value as RPMTAG_COPYRIGHT.
Now, everyone gets to ask the next question "Which is it, Copyright: or License:?" and the answer is :-) ---- Every distribution working with rpms, including Red Hat, has changed (or is changing) the tag to License. Copyright, as Jeff himself said, is a misnomer for that field.
----------------------------------------------------------------------
Comment By: Martin v. Löwis (loewis) Date: 2002-04-14 09:58 Message: Logged In: YES user_id=21627 Can you provide a pointer that shows this obsoletion? http://www.rpm.org/RPM-HOWTO/build.html#SPEC-FILE still says Copyright.
----------------------------------------------------------------------
You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=543498&group_id=5470
From noreply@sourceforge.net Sun Apr 14 21:15:59 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sun, 14 Apr 2002 13:15:59 -0700 Subject: [Patches] [ python-Patches-476814 ] foreign-platform newline support Message-ID:
Patches item #476814, was opened at 2001-10-31 17:41 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=476814&group_id=5470 Category: Core (C code) Group: None Status: Open Resolution: None Priority: 5 Submitted By: Jack Jansen (jackjansen) >Assigned to: Barry Warsaw (bwarsaw) Summary: foreign-platform newline support
Initial Comment: This patch enables Python to interpret all known newline conventions, CR, LF or CRLF, on all platforms. This support is enabled by configuring with --with-universal-newlines (so by default it is off, and everything should behave as usual). With universal newline support enabled two things happen: - When importing or otherwise parsing .py files any newline convention is accepted. - Python code can pass a new "t" mode parameter to open() which reads files with any newline convention. "t" cannot be combined with any other mode flags like "w" or "+", for obvious reasons.
File objects have a new attribute "newlines" which contains the type of newlines encountered in the file (or None when no newline has been seen, or "mixed" if there were various types of newlines). Also included is a test script which tests both file I/O and parsing.
----------------------------------------------------------------------
>Comment By: Jack Jansen (jackjansen) Date: 2002-04-14 22:15 Message: Logged In: YES user_id=45365 Barry, I've checked in the code fixes, but I don't dare check in the documentation fixes myself, as I have no TeX and hence no way to test them. Could you do this, please, and also check that I'm following all the relevant guidelines? Thanks!
----------------------------------------------------------------------
Comment By: Jack Jansen (jackjansen) Date: 2002-04-14 01:31 Message: Logged In: YES user_id=45365 A final tweak: return a tuple of newline values instead of 'mixed'.
----------------------------------------------------------------------
Comment By: Martin v. Löwis (loewis) Date: 2002-04-11 22:31 Message: Logged In: YES user_id=21627 What is the rationale for making this a compile-time option? It seems to complicate things, with no apparent advantage. If this is for backwards compatibility, don't make it an option: nobody will rebuild Python just to work around a compatibility problem. Apart from that, the patch looks good.
----------------------------------------------------------------------
Comment By: Jack Jansen (jackjansen) Date: 2002-03-29 00:17 Message: Logged In: YES user_id=45365 New doc patch, and new version of the patch that mainly allows the U to be specified (no-op) in non-univ-newline builds.
----------------------------------------------------------------------
Comment By: Guido van Rossum (gvanrossum) Date: 2002-03-25 22:07 Message: Logged In: YES user_id=6380 Thanks! But there's no documentation. Could I twist your arm for a separate doc patch?
I'm tempted to give this a +1, but I'd like to hear from MvL and MAL to see if they foresee any interaction with their PEP 262 implementation.
----------------------------------------------------------------------
Comment By: Jack Jansen (jackjansen) Date: 2002-03-13 23:44 Message: Logged In: YES user_id=45365 A new version of the patch. Main differences are that U is now the mode character to trigger universal newline input and --with-universal-newlines is on by default.
----------------------------------------------------------------------
Comment By: Jack Jansen (jackjansen) Date: 2002-01-16 23:47 Message: Logged In: YES user_id=45365 This version of the patch addresses the bug in Py_UniversalNewlineFread and fixes up some minor details. Tim's other issues are addressed (at least: I think they are:-) in a forthcoming PEP.
----------------------------------------------------------------------
Comment By: Tim Peters (tim_one) Date: 2001-12-14 00:57 Message: Logged In: YES user_id=31435 Back to Jack -- and sorry for sitting on it so long. Clearly this isn't making it into 2.2 in the core. As I said on Python-Dev, I believe this needs a PEP: the design decisions are debatable, so *should* be debated outside the Mac community too. Note, though, that I can't stop you from adding it to the 2.2 Mac distribution (if you want it badly enough there). If a PEP won't be written, I suggest finding someone else to review it again; maybe Guido. Note that the patch needs doc changes too. The patch to regrtest.py doesn't belong here (I assume it just slipped in). There seems to be a lot of code in support of the f_newlinetypes member, and the value of that member isn't clear -- I can't imagine a good use for it (maybe it's a Mac thing?). The implementation of Py_UniversalNewlineFread appears incorrect to me: it reads n bytes *every* time around the outer loop, no matter how few characters are still required, and n doesn't change inside the loop.
The business about the GIL may be due to the lack of docs: are, or are not, people supposed to release the GIL themselves around calls to these guys? It's not documented, and it appears your intent differed from my guess. Finally, it would be better to call ferror() after calling fread() instead of before it.
----------------------------------------------------------------------
Comment By: Jack Jansen (jackjansen) Date: 2001-11-14 16:13 Message: Logged In: YES user_id=45365 Here's a new version of the patch. To address your issues one by one: - get_line and Py_UniversalNewlineFgets are too difficult to integrate; at least, I don't see how I could do it. The storage management of get_line gets in the way. - The global lock comment I don't understand. The Universal... routines are replacements for fgets() and fread(), so have nothing to do with the interpreter lock. - The logic of all three routines (get_line too) has changed and I've put comments in. I hope this addresses some of the points. - If universal_newline is false for a certain PyFileObject we now immediately take a quick exit via fgets() or fread(). There's also a new test script that tests some more border cases (like lines longer than 100 characters, and a lone CR just before end of file).
----------------------------------------------------------------------
Comment By: Tim Peters (tim_one) Date: 2001-11-05 09:16 Message: Logged In: YES user_id=31435 It would be better if get_line just called Py_UniversalNewlineFgets (when appropriate) instead of duplicating its logic inline. Py_UniversalNewlineFgets and Py_UniversalNewlineFread should deal with releasing the global lock themselves -- the correct granularity for lock release/reacquire is around the C-level input routines (esp. for fread). The new routines never check for I/O errors! Why not? It seems essential. The new Fgets checks for EOF at the end of the loop instead of the top.
This is surprising, and I stared a long time in vain trying to guess why. Setting newlinetypes |= NEWLINE_CR; immediately after seeing an '\r' would be as fast (instead of waiting to see EOF and then inferring the prior existence of '\r' indirectly from the state of the skipnextlf flag). Speaking of which, the fobj tests in the inner loop waste cycles. Set the local flag vrbls whether or not fobj is NULL. When you're *out* of the inner loop you can simply decline to store the new masks when fobj is NULL (and you're already doing the latter anyway). A test and branch inside the loop is much more expensive than or'ing in a flag bit inside the loop, ditto harder to understand. Floating the univ_newline test out of the loop (and duplicating the loop body, one way for univ_newline true and the other for it false) would also save a test and branch on every character. Doing fread one character at a time is very inefficient. Since you know you need to obtain n characters in the end, and that these transformations require reading at least n characters, you could very profitably read n characters in one gulp at the start, then switch to k at a time where k is the number of \r\n pairs seen since the last fread call. This is easier to code than it sounds. It would be fine by me if you included (and initialized) the new file-object fields all the time, whether or not universal newlines are configured. I'd rather waste a few bytes in a file object than see #ifdefs spread thru the code. I'll be damned if I can think of a quick way to do this stuff on Windows -- native Windows fgets() is still the only Windows handle we have on avoiding crushing thread overhead inside MS's C library. I'll think some more about it (the thrust still being to eliminate the 't' mode flag, as whined about on Python-Dev).
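[Editor's note: the input behavior this thread proposes -- CR, LF, or CRLF all accepted, with the file object recording which conventions it saw -- eventually became default Python behavior. A minimal sketch in modern Python, with a made-up file name, illustrating the "newlines" attribute discussed above:]

```python
import os
import tempfile

# Write a file containing all three newline conventions.
path = os.path.join(tempfile.mkdtemp(), "mixed.txt")
with open(path, "wb") as f:
    f.write(b"mac line\rdos line\r\nunix line\n")

# In universal-newline mode (newline=None, the default) every
# convention is translated to '\n' on input, and the file object
# records which conventions were actually encountered.
with open(path, "r", newline=None) as f:
    text = f.read()
    seen = f.newlines   # a str, or a tuple when several were seen

print(text.splitlines())  # ['mac line', 'dos line', 'unix line']
print(sorted(seen))       # all three conventions were encountered
```

Note that modern Python returns a tuple of the newline strings seen rather than the string "mixed", exactly the "final tweak" Jack describes above.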
----------------------------------------------------------------------
Comment By: Guido van Rossum (gvanrossum) Date: 2001-10-31 18:38 Message: Logged In: YES user_id=6380 Tim, can you review this or pass it on to someone else who has time? Jack developed this patch after a discussion in which I was involved in some of the design, but I won't have time to look at it until December.
----------------------------------------------------------------------
You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=476814&group_id=5470
From noreply@sourceforge.net Sun Apr 14 23:03:04 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sun, 14 Apr 2002 15:03:04 -0700 Subject: [Patches] [ python-Patches-543865 ] bugfixes on complexobject.c Message-ID:
Patches item #543865, was opened at 2002-04-15 00:03 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=543865&group_id=5470 Category: Core (C code) Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Hernan Martinez Foffani (hfoffani) Assigned to: Nobody/Anonymous (nobody) Summary: bugfixes on complexobject.c
Initial Comment: A patch that fixes bugs #543840 (complex() constructor doesn't fail in certain cases) and #543387 (floor division doesn't raise an exception as PEP 238 indicates) is included here. For the first bug, I moved a block of C code that checks for the presence of '\0' outside the loop. For the other one, I just cleared the nb_floor_divide entry in the table. Also deleted the complex_int_div() function. I'm also uploading tests for this patch, but they go in a different submission.
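[Editor's note: the floor-division half of this fix -- clearing the nb_floor_divide slot so the operation raises instead of computing a bogus result -- is the behavior Python has kept ever since. A quick sketch of the post-patch behavior:]

```python
# PEP 238: floor division is not meaningful for complex numbers,
# so the patch removes the slot rather than returning a fake result.
try:
    (1 + 2j) // (1 + 0j)
except TypeError as exc:
    print("complex // complex raises:", exc)

# True division remains defined for complex operands.
print((1 + 2j) / (1 + 0j))  # (1+2j)
```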
----------------------------------------------------------------------
You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=543865&group_id=5470
From noreply@sourceforge.net Sun Apr 14 23:16:38 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sun, 14 Apr 2002 15:16:38 -0700 Subject: [Patches] [ python-Patches-543865 ] bugfixes on complexobject.c Message-ID:
Patches item #543865, was opened at 2002-04-14 18:03 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=543865&group_id=5470 Category: Core (C code) >Group: None Status: Open Resolution: None Priority: 5 Submitted By: Hernan Martinez Foffani (hfoffani) Assigned to: Nobody/Anonymous (nobody) Summary: bugfixes on complexobject.c
Initial Comment: A patch that fixes bugs #543840 (complex() constructor doesn't fail in certain cases) and #543387 (floor division doesn't raise an exception as PEP 238 indicates) is included here. For the first bug, I moved a block of C code that checks for the presence of '\0' outside the loop. For the other one, I just cleared the nb_floor_divide entry in the table. Also deleted the complex_int_div() function. I'm also uploading tests for this patch, but they go in a different submission.
----------------------------------------------------------------------
>Comment By: Tim Peters (tim_one) Date: 2002-04-14 18:16 Message: Logged In: YES user_id=31435 Note that 543840 got fixed before you uploaded this patch, so please take that out of this patch (one bug == one patch is an excellent idea, and you can attach a patch to the bug report instead). Please combine the remaining patch with the test-suite change too -- opening lots of distinct tracker items makes more work for everyone (including you!).
----------------------------------------------------------------------
You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=543865&group_id=5470
From noreply@sourceforge.net Sun Apr 14 23:18:13 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sun, 14 Apr 2002 15:18:13 -0700 Subject: [Patches] [ python-Patches-543867 ] test for patch #543865 & others Message-ID:
Patches item #543867, was opened at 2002-04-15 00:18 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=543867&group_id=5470 Category: Tests Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Hernan Martinez Foffani (hfoffani) Assigned to: Nobody/Anonymous (nobody) Summary: test for patch #543865 & others
Initial Comment: Here are 3 patches for: - test_complex.py: . add several checks to force execution of unvisited parts of complexobject.c code. . add a test for complex floor division corresponding to bug #543387 and fix #543865 - test_complex_future.py . add test for "future" true division. (actually this is not a patch but the whole file) - test_b1.py .
add test for bug #543840 and its fix in patch #543865 Regards, -Hernan
----------------------------------------------------------------------
You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=543867&group_id=5470
From noreply@sourceforge.net Mon Apr 15 00:47:08 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sun, 14 Apr 2002 16:47:08 -0700 Subject: [Patches] [ python-Patches-543867 ] test for patch #543865 & others Message-ID:
Patches item #543867, was opened at 2002-04-15 00:18 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=543867&group_id=5470 Category: Tests Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Hernan Martinez Foffani (hfoffani) Assigned to: Nobody/Anonymous (nobody) Summary: test for patch #543865 & others
Initial Comment: Here are 3 patches for: - test_complex.py: . add several checks to force execution of unvisited parts of complexobject.c code. . add a test for complex floor division corresponding to bug #543387 and fix #543865 - test_complex_future.py . add test for "future" true division. (actually this is not a patch but the whole file) - test_b1.py .
add test for bug #543840 and its fix in patch #543865 Regards, -Hernan
----------------------------------------------------------------------
>Comment By: Hernan Martinez Foffani (hfoffani) Date: 2002-04-15 01:47 Message: Logged In: YES user_id=112690
----------------------------------------------------------------------
You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=543867&group_id=5470
From noreply@sourceforge.net Mon Apr 15 00:48:33 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sun, 14 Apr 2002 16:48:33 -0700 Subject: [Patches] [ python-Patches-543867 ] test for patch #543865 & others Message-ID:
Patches item #543867, was opened at 2002-04-15 00:18 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=543867&group_id=5470 Category: Tests Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Hernan Martinez Foffani (hfoffani) Assigned to: Nobody/Anonymous (nobody) Summary: test for patch #543865 & others
Initial Comment: Here are 3 patches for: - test_complex.py: . add several checks to force execution of unvisited parts of complexobject.c code. . add a test for complex floor division corresponding to bug #543387 and fix #543865 - test_complex_future.py . add test for "future" true division. (actually this is not a patch but the whole file) - test_b1.py . add test for bug #543840 and its fix in patch #543865 Regards, -Hernan
----------------------------------------------------------------------
>Comment By: Hernan Martinez Foffani (hfoffani) Date: 2002-04-15 01:48 Message: Logged In: YES user_id=112690 Following Tim's advice to group together bug/fix/test, I'll leave this patch entry for improvements in the tests of complex numbers.
Then the valid files are: 21173: test_complex_future.py and 21180: test_complex.diff3
----------------------------------------------------------------------
Comment By: Hernan Martinez Foffani (hfoffani) Date: 2002-04-15 01:47 Message: Logged In: YES user_id=112690
----------------------------------------------------------------------
You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=543867&group_id=5470
From noreply@sourceforge.net Mon Apr 15 01:03:17 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sun, 14 Apr 2002 17:03:17 -0700 Subject: [Patches] [ python-Patches-543865 ] bugfixes on complexobject.c Message-ID:
Patches item #543865, was opened at 2002-04-15 00:03 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=543865&group_id=5470 Category: Core (C code) Group: None Status: Open Resolution: None Priority: 5 Submitted By: Hernan Martinez Foffani (hfoffani) Assigned to: Nobody/Anonymous (nobody) Summary: bugfixes on complexobject.c
Initial Comment: A patch that fixes bugs #543840 (complex() constructor doesn't fail in certain cases) and #543387 (floor division doesn't raise an exception as PEP 238 indicates) is included here. For the first bug, I moved a block of C code that checks for the presence of '\0' outside the loop. For the other one, I just cleared the nb_floor_divide entry in the table. Also deleted the complex_int_div() function. I'm also uploading tests for this patch, but they go in a different submission.
----------------------------------------------------------------------
>Comment By: Hernan Martinez Foffani (hfoffani) Date: 2002-04-15 02:03 Message: Logged In: YES user_id=112690 Ok, done! Bug report #543387 has patch and test suite. Pure enhancements to the complex numbers tests stay at patch submission #543847, but now they don't include code related to the reported bugs. Sorry for the mess.
----------------------------------------------------------------------
Comment By: Tim Peters (tim_one) Date: 2002-04-15 00:16 Message: Logged In: YES user_id=31435 Note that 543840 got fixed before you uploaded this patch, so please take that out of this patch (one bug == one patch is an excellent idea, and you can attach a patch to the bug report instead). Please combine the remaining patch with the test-suite change too -- opening lots of distinct tracker items makes more work for everyone (including you!).
----------------------------------------------------------------------
You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=543865&group_id=5470
From noreply@sourceforge.net Mon Apr 15 01:34:19 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sun, 14 Apr 2002 17:34:19 -0700 Subject: [Patches] [ python-Patches-543865 ] bugfixes on complexobject.c Message-ID:
Patches item #543865, was opened at 2002-04-14 18:03 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=543865&group_id=5470 Category: Core (C code) Group: None Status: Open Resolution: None Priority: 5 Submitted By: Hernan Martinez Foffani (hfoffani) Assigned to: Nobody/Anonymous (nobody) Summary: bugfixes on complexobject.c
Initial Comment: A patch that fixes bugs #543840 (complex() constructor doesn't fail in certain cases) and #543387 (floor division doesn't raise an exception as PEP 238 indicates) is included here. For the first bug, I moved a block of C code that checks for the presence of '\0' outside the loop. For the other one, I just cleared the nb_floor_divide entry in the table. Also deleted the complex_int_div() function. I'm also uploading tests for this patch, but they go in a different submission.
----------------------------------------------------------------------
>Comment By: Neal Norwitz (nnorwitz) Date: 2002-04-14 20:34 Message: Logged In: YES user_id=33168 Should this patch be closed since #543867 was entered?
----------------------------------------------------------------------
Comment By: Hernan Martinez Foffani (hfoffani) Date: 2002-04-14 20:03 Message: Logged In: YES user_id=112690 Ok, done! Bug report #543387 has patch and test suite. Pure enhancements to the complex numbers tests stay at patch submission #543847, but now they don't include code related to the reported bugs. Sorry for the mess.
----------------------------------------------------------------------
Comment By: Tim Peters (tim_one) Date: 2002-04-14 18:16 Message: Logged In: YES user_id=31435 Note that 543840 got fixed before you uploaded this patch, so please take that out of this patch (one bug == one patch is an excellent idea, and you can attach a patch to the bug report instead). Please combine the remaining patch with the test-suite change too -- opening lots of distinct tracker items makes more work for everyone (including you!).
----------------------------------------------------------------------
You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=543865&group_id=5470
From noreply@sourceforge.net Mon Apr 15 02:06:53 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sun, 14 Apr 2002 18:06:53 -0700 Subject: [Patches] [ python-Patches-541031 ] context sensitive help/keyword search Message-ID:
Patches item #541031, was opened at 2002-04-08 10:25 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=541031&group_id=5470 >Category: Documentation Group: None Status: Open Resolution: None Priority: 5 Submitted By: Thomas Heller (theller) >Assigned to: Fred L. Drake, Jr.
(fdrake) Summary: context sensitive help/keyword search
Initial Comment: This script/module looks up keywords in the Python manuals. It is usable as a CGI script - a version is online at http://starship.python.net/crew/theller/cgi-bin/pyhelp.cgi It can also be used from the command line: python pyhelp.py keyword It can also be used to implement context-sensitive help in IDLE or XEmacs (for example) by simply selecting a word and pressing F1. It can use the online version of the manuals at www.python.org/doc/, or it can use locally installed HTML pages. The script/module scans the index pages of the docs for hyperlinks, and pickles the results to disk.
----------------------------------------------------------------------
>Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-14 21:06 Message: Logged In: YES user_id=6380 Maybe Fred finds this interesting?
----------------------------------------------------------------------
You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=541031&group_id=5470
From noreply@sourceforge.net Mon Apr 15 02:10:12 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sun, 14 Apr 2002 18:10:12 -0700 Subject: [Patches] [ python-Patches-541694 ] whichdb unittest Message-ID:
Patches item #541694, was opened at 2002-04-09 15:15 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=541694&group_id=5470 Category: Tests Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Gregory H. Ball (greg_ball) Assigned to: Neal Norwitz (nnorwitz) Summary: whichdb unittest
Initial Comment: The attached patch is a first crack at a unit test for whichdb. I think that all functionality required for use by the anydbm module is tested, but only for the database modules found in a given installation. The test case is built up at runtime to cover all the available modules, so it is a bit introspective, but I think it is obvious that it should run correctly.
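[Editor's note: the kind of check such a unit test has to make can be sketched directly. A minimal illustration in modern spelling -- where whichdb lives in the dbm package and dbm.dumb is the one always-available backend; the file path is made up for the example:]

```python
import os
import tempfile
import dbm
import dbm.dumb

# Create a database with a known backend, then check that whichdb
# can identify that backend purely from the files left on disk.
path = os.path.join(tempfile.mkdtemp(), "testdb")
db = dbm.dumb.open(path, "c")
db[b"key"] = b"value"
db.close()

print(dbm.whichdb(path))  # 'dbm.dumb'
```

A full test, as described above, would repeat this round trip for every backend module present in the installation.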
Unfortunately it crashes on my box (Redhat 6.2), and this seems to be a real problem with whichdb: it assumes things about the dbm format which turn out to be wrong sometimes. I only discovered this because test_anydbm was crashing when whichdb failed to work on dbm files. It would not have crashed if dbhash was available... and dbhash was not available because bsddb was not built correctly. So I think there is a build bug there, but I have little idea how to solve that one at this point. Would I be correct in thinking that if this test really uncovers bugs in whichdb, it can't be checked in until they are fixed? Unfortunately I don't know much about the various databases, but I'll try to work with someone on it.
----------------------------------------------------------------------
>Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-14 21:10 Message: Logged In: YES user_id=6380 What kind of crash do you experience? Do you have a patch that fixes whichdb?
----------------------------------------------------------------------
You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=541694&group_id=5470
From noreply@sourceforge.net Mon Apr 15 10:43:27 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 15 Apr 2002 02:43:27 -0700 Subject: [Patches] [ python-Patches-543865 ] bugfixes on complexobject.c Message-ID:
Patches item #543865, was opened at 2002-04-15 00:03 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=543865&group_id=5470 Category: Core (C code) Group: None Status: Open Resolution: None Priority: 5 Submitted By: Hernan Martinez Foffani (hfoffani) Assigned to: Nobody/Anonymous (nobody) Summary: bugfixes on complexobject.c
Initial Comment: A patch that fixes bugs #543840 (complex() constructor doesn't fail in certain cases) and #543387 (floor division doesn't raise an exception as PEP 238 indicates) is included here.
For the first bug, I moved a block of C code that checks for the presence of '\0' outside the loop. For the other one, I just cleared the nb_floor_divide entry in the table. Also deleted the complex_int_div() function. I'm also uploading tests for this patch, but they go in a different submission.
----------------------------------------------------------------------
Comment By: Hernan Martinez Foffani (hfoffani) Date: 2002-04-15 11:43 Message: Logged In: YES user_id=112690 Yes. I think this entry should be closed, as its targets are/were taken care of in bug entries #543387 / #543840.
----------------------------------------------------------------------
Comment By: Neal Norwitz (nnorwitz) Date: 2002-04-15 02:34 Message: Logged In: YES user_id=33168 Should this patch be closed since #543867 was entered?
----------------------------------------------------------------------
Comment By: Hernan Martinez Foffani (hfoffani) Date: 2002-04-15 02:03 Message: Logged In: YES user_id=112690 Ok, done! Bug report #543387 has patch and test suite. Pure enhancements to the complex numbers tests stay at patch submission #543847, but now they don't include code related to the reported bugs. Sorry for the mess.
----------------------------------------------------------------------
Comment By: Tim Peters (tim_one) Date: 2002-04-15 00:16 Message: Logged In: YES user_id=31435 Note that 543840 got fixed before you uploaded this patch, so please take that out of this patch (one bug == one patch is an excellent idea, and you can attach a patch to the bug report instead). Please combine the remaining patch with the test-suite change too -- opening lots of distinct tracker items makes more work for everyone (including you!).
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=543865&group_id=5470 From noreply@sourceforge.net Mon Apr 15 12:42:34 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 15 Apr 2002 04:42:34 -0700 Subject: [Patches] [ python-Patches-544113 ] merging sorted sequences Message-ID: Patches item #544113, was opened at 2002-04-15 13:42 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=544113&group_id=5470 Category: Library (Lib) Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Sebastien Keim (s_keim) Assigned to: Nobody/Anonymous (nobody) Summary: merging sorted sequences Initial Comment: This patch is intended to add to the bisect module a function which permits merging several sorted sequences into an ordered list. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=544113&group_id=5470 From noreply@sourceforge.net Mon Apr 15 12:49:44 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 15 Apr 2002 04:49:44 -0700 Subject: [Patches] [ python-Patches-543865 ] bugfixes on complexobject.c Message-ID: Patches item #543865, was opened at 2002-04-14 18:03 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=543865&group_id=5470 Category: Core (C code) Group: None >Status: Closed Resolution: None Priority: 5 Submitted By: Hernan Martinez Foffani (hfoffani) Assigned to: Nobody/Anonymous (nobody) Summary: bugfixes on complexobject.c Initial Comment: A patch that fixes bugs #543840 (complex() constructor doesn't fail in certain cases) and #543387 (floor division doesn't raise an exception as PEP 238 requires) is included here. For the first bug, I moved a block of C code that checks for the presence of '\0' outside the loop.
For the other one, I just cleared the nb_floor_divide entry in the table. Also deleted the complex_int_div() function. I'm also uploading tests for this patch, but they go in a separate submission. ---------------------------------------------------------------------- Comment By: Hernan Martinez Foffani (hfoffani) Date: 2002-04-15 05:43 Message: Logged In: YES user_id=112690 Yes. I think this entry should be closed as its targets are/were taken care of in bug entries #543387 / #543840. ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-04-14 20:34 Message: Logged In: YES user_id=33168 Should this patch be closed since #543867 was entered? ---------------------------------------------------------------------- Comment By: Hernan Martinez Foffani (hfoffani) Date: 2002-04-14 20:03 Message: Logged In: YES user_id=112690 Ok, done! Bug report #543387 has the patch and test suite. Pure enhancements to the complex number tests stay in patch submission #543847, but they no longer include code related to the reported bugs. Sorry for the mess. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-04-14 18:16 Message: Logged In: YES user_id=31435 Note that 543840 got fixed before you uploaded this patch, so please take that out of this patch (one bug == one patch is an excellent idea, and you can attach a patch to the bug report instead). Please combine the remaining patch with the test-suite change too -- opening lots of distinct tracker items makes more work for everyone (including you!).
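The bisect addition proposed in patch #544113 above (merging several sorted sequences into one ordered list) can be sketched with a heap-based k-way merge. The name merge_sorted is hypothetical, and this leans on heapq.merge, which the stdlib later grew for exactly this job; it is not the patch's actual code:

```python
import heapq


def merge_sorted(*sequences):
    """Merge several already-sorted sequences into one ordered list.

    A sketch of the behavior patch #544113 proposes for the bisect
    module; heapq.merge performs the k-way merge lazily under the hood.
    """
    return list(heapq.merge(*sequences))


print(merge_sorted([1, 4, 7], [2, 5], [3, 6, 8]))
# [1, 2, 3, 4, 5, 6, 7, 8]
```

Because the inputs are each already sorted, the heap only ever holds one candidate per sequence, so the merge runs in O(n log k) for n total items across k sequences.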
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=543865&group_id=5470 From noreply@sourceforge.net Mon Apr 15 14:41:47 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 15 Apr 2002 06:41:47 -0700 Subject: [Patches] [ python-Patches-536241 ] string.zfill and unicode Message-ID: Patches item #536241, was opened at 2002-03-28 14:26 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=536241&group_id=5470 Category: Library (Lib) Group: None >Status: Closed Resolution: Accepted Priority: 5 Submitted By: Walter Dörwald (doerwalter) Assigned to: Walter Dörwald (doerwalter) Summary: string.zfill and unicode Initial Comment: This patch makes the function string.zfill work with unicode instances (and instances of str and unicode subclasses). Currently string.zfill(u"123", 10) results in "0000u'123'". With this patch the result is u'0000000123'. Should zfill be made a real str and unicode method? I noticed that a zfill implementation is available in unicodeobject.c, but commented out. ---------------------------------------------------------------------- >Comment By: Walter Dörwald (doerwalter) Date: 2002-04-15 15:41 Message: Logged In: YES user_id=89016 Checked in as: Doc/lib/libstdtypes.tex 1.88 Lib/UserString.py 1.12 Lib/string.py 1.63 test/string_tests.py 1.13 test/test_unicode.py 1.54 Misc/NEWS 1.388 Objects/stringobject.c 2.157 Objects/unicodeobject.c 2.138 ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-13 03:00 Message: Logged In: YES user_id=6380 I'm for making them methods. Walter, just check it in!
---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-04-12 20:37 Message: Logged In: YES user_id=89016 Now that test_userstring.py works and fails (rev 1.6) should we add zfill as str and unicode methods or change UserString.zfill to use string.zfill? I've made a patch (attached) that implements zfill as methods (i.e. activates the version in unicodeobject.c that was commented out and implements the same in stringobject.c) (And it adds the test for unicode support back in.) ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-04-12 16:51 Message: Logged In: YES user_id=21627 Re: optional Unicode: Walter is correct; configuring with --disable-unicode currently breaks the string module. One might consider using types.StringTypes; OTOH, pulling in types might not be desirable. As for str vs. repr: Python was always using repr in zfill, so changing it may break things. So I recommend that Walter reverts Andrew's check-in and applies his change. ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-03-30 12:25 Message: Logged In: YES user_id=6656 Hah, I was going to say that but was distracted by IE wiping out the machine I'm sitting at. Re-opening. ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-03-30 12:16 Message: Logged In: YES user_id=89016 But Python could be compiled without unicode support (by undefining PY_USING_UNICODE), and string.zfill should work even in this case. What about making zfill a real str and unicode method? ---------------------------------------------------------------------- Comment By: A.M. Kuchling (akuchling) Date: 2002-03-29 17:24 Message: Logged In: YES user_id=11375 Thanks for your patch! I've checked it into CVS, with two modifications. 
First, I removed the code to handle the case where Python doesn't have a unicode() built-in; there's no expectation that you can take the standard library for Python version N and use it with version N-1, so this code isn't needed. Second, I changed string.zfill() to take the str() and not the repr() when it gets a non-string object because that seems to make more sense. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=536241&group_id=5470 From noreply@sourceforge.net Mon Apr 15 14:53:21 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 15 Apr 2002 06:53:21 -0700 Subject: [Patches] [ python-Patches-536241 ] string.zfill and unicode Message-ID: Patches item #536241, was opened at 2002-03-28 08:26 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=536241&group_id=5470 Category: Library (Lib) Group: None Status: Closed Resolution: Accepted Priority: 5 Submitted By: Walter Dörwald (doerwalter) Assigned to: Walter Dörwald (doerwalter) Summary: string.zfill and unicode Initial Comment: This patch makes the function string.zfill work with unicode instances (and instances of str and unicode subclasses). Currently string.zfill(u"123", 10) results in "0000u'123'". With this patch the result is u'0000000123'. Should zfill be made a real str and unicode method? I noticed that a zfill implementation is available in unicodeobject.c, but commented out. ---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-15 09:53 Message: Logged In: YES user_id=6380 Thanks, Walter! Some nits: The string_zfill() code you checked in caused two warnings about modifying data pointed to by a const pointer. I've removed the const, but I'd like to understand how come you didn't catch this. Does your compiler not warn you? Or did you ignore warnings?
(The latter's a sin in Python-land :-). I've also folded some long lines that weren't your fault -- but I noticed that elsewhere you checked in some long lines; please try to limit line length to 78. ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-04-15 09:41 Message: Logged In: YES user_id=89016 Checked in as: Doc/lib/libstdtypes.tex 1.88 Lib/UserString.py 1.12 Lib/string.py 1.63 test/string_tests.py 1.13 test/test_unicode.py 1.54 Misc/NEWS 1.388 Objects/stringobject.c 2.157 Objects/unicodeobject.c 2.138 ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-12 21:00 Message: Logged In: YES user_id=6380 I'm for making them methods. Walter, just check it in! ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-04-12 14:37 Message: Logged In: YES user_id=89016 Now that test_userstring.py works and fails (rev 1.6) should we add zfill as str and unicode methods or change UserString.zfill to use string.zfill? I've made a patch (attached) that implements zfill as methods (i.e. activates the version in unicodeobject.c that was commented out and implements the same in stringobject.c) (And it adds the test for unicode support back in.) ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-04-12 10:51 Message: Logged In: YES user_id=21627 Re: optional Unicode: Walter is correct; configuring with --disable-unicode currently breaks the string module. One might consider using types.StringTypes; OTOH, pulling in types might not be desirable. As for str vs. repr: Python was always using repr in zfill, so changing it may break things. So I recommend that Walter reverts Andrew's check-in and applies his change. 
---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-03-30 06:25 Message: Logged In: YES user_id=6656 Hah, I was going to say that but was distracted by IE wiping out the machine I'm sitting at. Re-opening. ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-03-30 06:16 Message: Logged In: YES user_id=89016 But Python could be compiled without unicode support (by undefining PY_USING_UNICODE), and string.zfill should work even in this case. What about making zfill a real str and unicode method? ---------------------------------------------------------------------- Comment By: A.M. Kuchling (akuchling) Date: 2002-03-29 11:24 Message: Logged In: YES user_id=11375 Thanks for your patch! I've checked it into CVS, with two modifications. First, I removed the code to handle the case where Python doesn't have a unicode() built-in; there's no expectation that you can take the standard library for Python version N and use it with version N-1, so this code isn't needed. Second, I changed string.zfill() to take the str() and not the repr() when it gets a non-string object because that seems to make more sense.
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=536241&group_id=5470 From noreply@sourceforge.net Mon Apr 15 15:43:07 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 15 Apr 2002 07:43:07 -0700 Subject: [Patches] [ python-Patches-536241 ] string.zfill and unicode Message-ID: Patches item #536241, was opened at 2002-03-28 14:26 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=536241&group_id=5470 Category: Library (Lib) Group: None Status: Closed Resolution: Accepted Priority: 5 Submitted By: Walter Dörwald (doerwalter) Assigned to: Walter Dörwald (doerwalter) Summary: string.zfill and unicode Initial Comment: This patch makes the function string.zfill work with unicode instances (and instances of str and unicode subclasses). Currently string.zfill(u"123", 10) results in "0000u'123'". With this patch the result is u'0000000123'. Should zfill be made a real str and unicode method? I noticed that a zfill implementation is available in unicodeobject.c, but commented out. ---------------------------------------------------------------------- >Comment By: Walter Dörwald (doerwalter) Date: 2002-04-15 16:43 Message: Logged In: YES user_id=89016 > Does your compiler not warn you? Or did > you ignore warnings? > (The latter's a sin in Python-land :-). The warning was just lost in the long list of outputs.
Now that you mention it, there are still a few warnings (gcc 2.96 on Linux): Objects/unicodeobject.c: In function `PyUnicodeUCS4_Format': Objects/unicodeobject.c:5574: warning: int format, long int arg (arg 3) Objects/unicodeobject.c:5574: warning: unsigned int format, long unsigned int arg (arg 4) libpython2.3.a(posixmodule.o): In function `posix_tmpnam': Modules/posixmodule.c:5150: the use of `tmpnam_r' is dangerous, better use `mkstemp' libpython2.3.a(posixmodule.o): In function `posix_tempnam': Modules/posixmodule.c:5100: the use of `tempnam' is dangerous, better use `mkstemp' Modules/pwdmodule.c: In function `initpwd': Modules/pwdmodule.c:161: warning: unused variable `d' Modules/readline.c: In function `set_completer_delims': Modules/readline.c:273: warning: passing arg 1 of `free' discards qualifiers from pointer target type Modules/expat/xmlrole.c:7: warning: `RCSId' defined but not used Should I open a separate bug report for that? > I've also folded some long lines that weren't > your fault -- but I noticed that elsewhere you > checked in some long lines; > please try to limit line length to 78. I noticed your descrobject.c checkin message. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-15 15:53 Message: Logged In: YES user_id=6380 Thanks, Walter! Some nits: The string_zfill() code you checked in caused two warnings about modifying data pointed to by a const pointer. I've removed the const, but I'd like to understand how come you didn't catch this. Does your compiler not warn you? Or did you ignore warnings? (The latter's a sin in Python-land :-). I've also folded some long lines that weren't your fault -- but I noticed that elsewhere you checked in some long lines; please try to limit line length to 78. 
---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-04-15 15:41 Message: Logged In: YES user_id=89016 Checked in as: Doc/lib/libstdtypes.tex 1.88 Lib/UserString.py 1.12 Lib/string.py 1.63 test/string_tests.py 1.13 test/test_unicode.py 1.54 Misc/NEWS 1.388 Objects/stringobject.c 2.157 Objects/unicodeobject.c 2.138 ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-13 03:00 Message: Logged In: YES user_id=6380 I'm for making them methods. Walter, just check it in! ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-04-12 20:37 Message: Logged In: YES user_id=89016 Now that test_userstring.py works and fails (rev 1.6) should we add zfill as str and unicode methods or change UserString.zfill to use string.zfill? I've made a patch (attached) that implements zfill as methods (i.e. activates the version in unicodeobject.c that was commented out and implements the same in stringobject.c) (And it adds the test for unicode support back in.) ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-04-12 16:51 Message: Logged In: YES user_id=21627 Re: optional Unicode: Walter is correct; configuring with --disable-unicode currently breaks the string module. One might consider using types.StringTypes; OTOH, pulling in types might not be desirable. As for str vs. repr: Python was always using repr in zfill, so changing it may break things. So I recommend that Walter reverts Andrew's check-in and applies his change. ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-03-30 12:25 Message: Logged In: YES user_id=6656 Hah, I was going to say that but was distracted by IE wiping out the machine I'm sitting at. Re-opening. 
---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-03-30 12:16 Message: Logged In: YES user_id=89016 But Python could be compiled without unicode support (by undefining PY_USING_UNICODE), and string.zfill should work even in this case. What about making zfill a real str and unicode method? ---------------------------------------------------------------------- Comment By: A.M. Kuchling (akuchling) Date: 2002-03-29 17:24 Message: Logged In: YES user_id=11375 Thanks for your patch! I've checked it into CVS, with two modifications. First, I removed the code to handle the case where Python doesn't have a unicode() built-in; there's no expectation that you can take the standard library for Python version N and use it with version N-1, so this code isn't needed. Second, I changed string.zfill() to take the str() and not the repr() when it gets a non-string object because that seems to make more sense. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=536241&group_id=5470 From noreply@sourceforge.net Mon Apr 15 15:47:13 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 15 Apr 2002 07:47:13 -0700 Subject: [Patches] [ python-Patches-536241 ] string.zfill and unicode Message-ID: Patches item #536241, was opened at 2002-03-28 08:26 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=536241&group_id=5470 Category: Library (Lib) Group: None Status: Closed Resolution: Accepted Priority: 5 Submitted By: Walter Dörwald (doerwalter) Assigned to: Walter Dörwald (doerwalter) Summary: string.zfill and unicode Initial Comment: This patch makes the function string.zfill work with unicode instances (and instances of str and unicode subclasses). Currently string.zfill(u"123", 10) results in "0000u'123'". With this patch the result is u'0000000123'.
Should zfill be made a real str and unicode method? I noticed that a zfill implementation is available in unicodeobject.c, but commented out. ---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-15 10:47 Message: Logged In: YES user_id=6380 Yes, please open a separate bug report for those (I'd open a separate report for each file with warnings, unless you have an obvious fix). ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-04-15 10:43 Message: Logged In: YES user_id=89016 > Does your compiler not warn you? Or did > you ignore warnings? > (The latter's a sin in Python-land :-). The warning was just lost in the long list of outputs. Now that you mention it, there are still a few warnings (gcc 2.96 on Linux): Objects/unicodeobject.c: In function `PyUnicodeUCS4_Format': Objects/unicodeobject.c:5574: warning: int format, long int arg (arg 3) Objects/unicodeobject.c:5574: warning: unsigned int format, long unsigned int arg (arg 4) libpython2.3.a(posixmodule.o): In function `posix_tmpnam': Modules/posixmodule.c:5150: the use of `tmpnam_r' is dangerous, better use `mkstemp' libpython2.3.a(posixmodule.o): In function `posix_tempnam': Modules/posixmodule.c:5100: the use of `tempnam' is dangerous, better use `mkstemp' Modules/pwdmodule.c: In function `initpwd': Modules/pwdmodule.c:161: warning: unused variable `d' Modules/readline.c: In function `set_completer_delims': Modules/readline.c:273: warning: passing arg 1 of `free' discards qualifiers from pointer target type Modules/expat/xmlrole.c:7: warning: `RCSId' defined but not used Should I open a separate bug report for that? > I've also folded some long lines that weren't > your fault -- but I noticed that elsewhere you > checked in some long lines; > please try to limit line length to 78. I noticed your descrobject.c checkin message.
---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-15 09:53 Message: Logged In: YES user_id=6380 Thanks, Walter! Some nits: The string_zfill() code you checked in caused two warnings about modifying data pointed to by a const pointer. I've removed the const, but I'd like to understand how come you didn't catch this. Does your compiler not warn you? Or did you ignore warnings? (The latter's a sin in Python-land :-). I've also folded some long lines that weren't your fault -- but I noticed that elsewhere you checked in some long lines; please try to limit line length to 78. ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-04-15 09:41 Message: Logged In: YES user_id=89016 Checked in as: Doc/lib/libstdtypes.tex 1.88 Lib/UserString.py 1.12 Lib/string.py 1.63 test/string_tests.py 1.13 test/test_unicode.py 1.54 Misc/NEWS 1.388 Objects/stringobject.c 2.157 Objects/unicodeobject.c 2.138 ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-12 21:00 Message: Logged In: YES user_id=6380 I'm for making them methods. Walter, just check it in! ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-04-12 14:37 Message: Logged In: YES user_id=89016 Now that test_userstring.py works and fails (rev 1.6) should we add zfill as str and unicode methods or change UserString.zfill to use string.zfill? I've made a patch (attached) that implements zfill as methods (i.e. activates the version in unicodeobject.c that was commented out and implements the same in stringobject.c) (And it adds the test for unicode support back in.) ---------------------------------------------------------------------- Comment By: Martin v. 
Löwis (loewis) Date: 2002-04-12 10:51 Message: Logged In: YES user_id=21627 Re: optional Unicode: Walter is correct; configuring with --disable-unicode currently breaks the string module. One might consider using types.StringTypes; OTOH, pulling in types might not be desirable. As for str vs. repr: Python was always using repr in zfill, so changing it may break things. So I recommend that Walter reverts Andrew's check-in and applies his change. ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-03-30 06:25 Message: Logged In: YES user_id=6656 Hah, I was going to say that but was distracted by IE wiping out the machine I'm sitting at. Re-opening. ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-03-30 06:16 Message: Logged In: YES user_id=89016 But Python could be compiled without unicode support (by undefining PY_USING_UNICODE), and string.zfill should work even in this case. What about making zfill a real str and unicode method? ---------------------------------------------------------------------- Comment By: A.M. Kuchling (akuchling) Date: 2002-03-29 11:24 Message: Logged In: YES user_id=11375 Thanks for your patch! I've checked it into CVS, with two modifications. First, I removed the code to handle the case where Python doesn't have a unicode() built-in; there's no expectation that you can take the standard library for Python version N and use it with version N-1, so this code isn't needed. Second, I changed string.zfill() to take the str() and not the repr() when it gets a non-string object because that seems to make more sense.
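The zfill semantics this thread settles on (convert non-strings with str() rather than repr(), then pad on the left with zeros, keeping any leading sign in front of the padding) can be sketched in pure Python. zfill_compat is a hypothetical helper for illustration, not the code that was checked in:

```python
def zfill_compat(x, width):
    """Sketch of the agreed string.zfill behavior: non-strings go
    through str() (not repr()), and the result is zero-padded on the
    left, with a leading '+' or '-' kept ahead of the padding."""
    if not isinstance(x, str):
        x = str(x)  # the thread's fix: str(), not repr()
    sign = ""
    if x[:1] in ("+", "-"):
        sign, x = x[0], x[1:]
    return sign + x.rjust(width - len(sign), "0")

print(zfill_compat("123", 10))  # '0000000123', matching "123".zfill(10)
print(zfill_compat(-7, 5))      # '-0007'
```

If the input is already at least `width` characters long, rjust leaves it unchanged, mirroring the real method's behavior.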
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=536241&group_id=5470 From novaluz@sp.mailbr.com.br Mon Apr 15 16:51:06 2002 From: novaluz@sp.mailbr.com.br (NovaLuz) Date: Mon, 15 Apr 2002 12:51:06 -0300 Subject: [Patches] Concorra a uma Luminaria - Novaluz - GRÁTIS!! Message-ID: <1100074-22002411515516810@sp.mailbr.com.br>

NovaLuz is raffling off an emergency-lighting unit!

NovaLuz is raffling off a Normal and Emergency Light Fixture!!!

To enter, just visit the website http://www.novaluz.com.br/, fill out the contest form, and recommend this site to your friends.

Drawing: April 30, 2002

Normal and Emergency Light Fixture -- Model NL 3x9 NE

A 2-in-1 unit. It has three 9-watt PL lamps: two provide normal room lighting, and the third switches on automatically when the electricity fails.

Emergency-light battery life: 4 hours

Good luck!!!

Elétrica NovaLuz Ltda
http://www.anovaluz.com.br/
Tel: (11) 222 2699 * Fax: (11) 3331 3033
Emergency Lights & Occupancy Sensors
f4SLjHQeNYCNPzw6kpY/E0QeS4GUOjYyMjFYl6VDKpxlOTIuWKSmsEUqNTk7PGw6rCZMKT+7sZIX FxQUHh6zWzw2rUUovcCBFD/SRRHEL6llOjFGKNCBHEUG2N/l5ufo6err7O3u7/Dx8vN0OS70VBMr kj1a+FX6GNn7x2VCQDo98BAsY3CfqjIt5OX4QqOiRYtkqCQkMmIEk0GxwqXrke2OnUhVSnoUwmJe jxcmc6Csd2jIIyE5EAnROaSkKUOSRV5g4zFT4AsLMIMOsfBhyC1gL38IzSHT3FMlOi30mRSrB0mZ RdtF3foj7CKvC3uuLJu2XFS2bcvlYGE27iW0poIAACH5BAEPAD8ALAkABABhACEABQb/wJ9wSCwa j0hhziRLOp/QaC9KdfJMQleuyuX2ejga7dst/3iuotbMnoZXKqEKxzbHktt60v1bWYwrdHpcOk2D UDw/ND9xToGHkD8oJZAeVS95hzw6OpGeRBQURS9dmzo5NjIyMS4xWJ+Hok4UpFQ2Lia5sLtcFGI1 NTk5PIlJPDm4r0UoKCkpvNBHoaEWFjTFRTq3ykIo0Z4cP7JIoZhUhWnf6kMUFubr8E4c40MqmfH4 +fr7/P3+/wADChy4CxvBKtYOdToYZcG4CY/M9IAxhNuQCQyF0BMyoYKgKsiEkMjIZoKFj04mkoxS QcwKOCpUVJg5oeZJJ/ecHPgxDNYIYUr9eixKAsNGjilsRgxROsQSkj8/nH7j0cJI0aOebDAdUsug EaTQetgQ0gLTUbC7XvyxUKPIUCEeoKKF1SPY2XzXjvRgUWTEXE9kSLYdwneluh613hr+lmOE4sXR enj9FAQAIfkEAQ8APwAsBwADAGYAIgAFBv/An3BILBqPSCJPlmw6n9CodOr6uXTTrHbLPTKFJlh3 bOztemSuzlQc2dJTM462WllWaHh2LUShiG56Rjs/ND92ZYJaMUYUJYpoOT8vHk55iphdLD+bUpd6 PDxYPzZvYyiPmUefUKI6OTk2MjIxMS4muKq6RaxOsbdsu8JOKi8vNMg0NcuwoTxHOja3SCgpKcPY RxYWFC+S0DLTRtfZYxwcT9vG31A64cHlwxTd7FtV8aoD9PX4/f7/AAMKHEiwoMGDCLH1+JIQyoQK hAT1uNewCbohE1TggMOvIhIKRiZYMMSlhykhqTxKmTBhhRYe90aM+EFCZREVKrZVYMkTY8txjU9M PmHwY8aPXphAptmxA8ecZHUsvTjiAoaNHEijnEQ5U0inHxYqDexRz2oOHlm5UCriocbRqUIsfAD7 Q2y8HjxonO2RlmPXus+G5LA75Gs2vn1V0bBgmAjcISRtYuPxt6PkXTlmJr4MicZmzpCwBQEAIfkE AQ8APwAsBgADAGoAIQAFBv/An3BILBqPyKLMpUs6n9CodEoV6kw/k6zK7Xq/RuzQlQObib3zOXaE qam9Hg5Ho6XfXxsevpv/VipDKjh7YGKFRjw/NT8vFkgehIiTlJBlkzw8Ok0/eluVahNCjFU5pjk2 NjKrMS4uJrCgshOkXDauh1UoslwURR7AHirDL8V1dTXJNaeKQjw5uEi7vNROFNcvl0Y5S7nVhRxR FuPZzVE63d/fFB7lYNrqZuFG7Nnx912++Pv8/f7/AAMKHEgQFLSCUiYEogQPYRSFkt70cONkREEO 84pMsLBCDQ8XQkg49ALRSw89QywmaSEwArAJMJ1spFHlZBIEDIbc2eNhT59jYzQAqRgHc8KKiE56 sBzSAoaNHDx2dnlBpASKEilV3tux40mcF09zxJF6JodWIjSLsOvZqN9YspNesP1hIa0zFkMeLRqJ iEfPukdyzP0Bl68aGrWM9Eg7orBhXj2KPebnmFIQACH5BAEPAD8ALAQAAgBvACEABQb/wJ9wSCwa 
j8ijTZdsOp/QqHSK5JlMMqp2y+0+Y0NXzksum5tMIgl2bru9JiGKOOK9ob0erncvZ39zQnFCL31C eT84NCsrFj8rfIaSbXw5NI+ORis7k4aDZC8eUIWTPDw6qEI2NjJ/nWYvY1w1NDQwMDIuui5XV69m okkqsl20IyMkXIG/XRMUzxQe0tIq1SwsL9m12zU1RDw2Lj+fRSgpKczpSc8qsVU5ueRDy+rMFhYv NMRROuHy9X0yUbDgIda+Ljb+ATzzLNvBhRCjUIhIsaLFixgzatzIsSNAOx6hTFjxi4e4kEg4FJmg AockkyiTbDgywcImNz1sDEkWc8oEfQ+XyuTc2YTNmWBeKEyIMmGkSy49jBKxwICBmQaSCu3Yse1H tQoVmjYd8jPolJy3bOTIkSfSmxFHLHz4MdfQDkW1VlRb8fRJW7dudA4pMSQYDaSZfjzsaYZFEaQ/ epAiAphxGxqZPICkPHmy5Tc5pG02ksPx4s9nODlhuzAIACH+c0NyZWF0ZWQgYnkgRWNsaXBzZSBE aWdpdGFsIEltYWdpbmcgqTE5OTggIEFsbCBSaWdodHMgUmVzZXJ2ZWQuIA0KQ29tbWVyY2lhbCBs aWNlbnNlIGF2YWlsYWJsZSBhdCB3d3cuZWNsaXBzZWQuY29tDQoAOw== ------=_NextPart_94915C5ABAF209EF376268C8 Content-Type: application/octet-stream; name="image003.gif" Content-Description: image003.gif Content-Id: <188792-2200241151539523806@sp.mailbr.com.br> R0lGODlhtQA+APMHABwXFVhVVISCe7W0tMa9vcbGvcfHx/39/cfHxwAAAAAAAAAAAAAAAAAAAAAA AAAAACH5BAEAAAgAIf4IR2lmIEx1YmUALAAAAAC1AD4AwxwXFVhVVISCe7W0tMa9vcbGvcfHx/39 /cfHxwAAAAAAAAAAAAAAAAAAAAAAAAAAAAT+EMlJq7046827/2AocoFhEMRpFOtZpCzxtjLs2jU9 x/ut470cD0j8GYdHH3KpbAqdwVmOEJiUVASAdsvtmqS0sCHQTbViYZwKrW6H0YbB9ooypdzstD7P x+79cGt+d1USdCddiQAlZHNfLnYmjVtmk3MDApiallcFiSVgKmNbApGWXGZdAytqbImrdWiWq3EB mgO3uJiJZr2FCHRaB8PExcOni8UGiQLKlMbLigHFcloGxsjE0VvXxMLG3MXIit3E4wDN5opc08VZ n3a/wQDG9QfnXOX37cPbA9Dr6BGrBkDfvWTDCCI8lqheOHUBH0L8JE7Rv4oR6ciz883exIj+7ABq ueixJEGDDCUeeCfNobWS9RqhhPmRAE17WTROAADnziBAgvrYQEOUlQlBrZAalaXqTa83fqIq7Rlp hqg1d3b4ZDH1atIUA+SBynqnhtkXZ1GgXau2bdq3bOG6jUt3rl25eOvmvau3L986ywhYicQyICyl Q0WdMpMHD9CtjxH/gUxZslDLkR1DvhwDwGAsIF8tVToKlQldqFOrxtVLtRrCkhzNWA0jNQyfuE+4 BnQHtU/dq4MPaOPZEOHQV+I0vVGpTGznKHIoTM6M1YvOWw4XVgSjaY3r3xE1BT9j1gmzZE93CdCL zUbxN/HZHEjKmZb52gJSU3lQC7/t7nD+4VJBGIFk0Dn89NfSfusE89k78YW2hUf+iKQIgwR6NIsx AnTooS0D6oNPPtg0WOB9JSLH0WdWseDiGUZVZkZmWP2U241R4bhdCU+Fh9ttsOmY43lx3UWakEgi MsEwOQWQyZO3QClllLhMaSWVV2apyZS5bFmll1qGCeaYV35J5pdiZhnAmt0suQx7R8Up55x01mnn 
nXjmqeeefPbp559htfMgZWsxN4QMhiZ6lqKFLupoo5Cqxaikj1IaKaKVYnrpoZxOiqkZYT0InHA9 TgYJAbZVReNlmmEmo6mtxgrrrI1BRcVgmEp4SFGvLfbIZkEFm9iwMGIaz5qHvUisssX+Nsvss1AV xautPW2Uq4T3HTXjts9RIgpVojwVbiTjHtltJ9yKqy6565bb7rvkknZqPLhyBNIpPGorhq+lYeuI CfBMy8IkPA7MXbcAHJZZdZqmIAAXh424TnsnWGtvRggDACQP/Wbbscd13MHOUfjOm0p2igXUHMqR YcdFKbkZ8DDLP4J3sbfVPthRSaeIqEg63qBo34X0vXSiRGWIE+LR6xy4DtAKRmThv7caBx9N56So BdQVDt1Q0Rl+JNAB2+y8zdJiN631ekyHhOF69FoNIdYZ2cd1dlN3gaHTCxWWoEJjBx121AHxTVHa IxkDONwV6xy4hhJCbSFJ/egH9oH+/Fgi+U0HqCSx0YgvRLjGazcYNzD2Rugvfvklbkx0sKewN+TP tH54MfzdeRBKda7NOtl5yqGT1X/6mZWcU9goBaJxfmfDUYzbGfuMg4BRFvLTZx/dVc37cGMoNGzE rLvjfmUrUnmYzLEbxoK63G+lNtYTG+uLoSmRQFavg8l20H9qcQjIwrbSVxlwtcooXTFf/eA3vf5h QWDoa5lkcDSu6+yvVhBMjHIOsCSykWFNIAyhCEdIwhKa8IQoTKEKV8jCFrrQhSSZwDJoUbwa2vCG OMyhDvu0Jg5KQGP6+o2+ukfEIRoxXkjknhKpt0TkObGISWSiFJ94xCZC0YpDdJj++H4lFSF60Stg TFcQxwibMnYxjGQ8oxjN+MU1qjGNbTxiFj7TgjxZz0ZSkdMdk8RHPPbxj34MJCAH+ZPGGYcr24EH F9nHFV/lhlW0WpUkXwXJSk7SVZlxDx3FAxLG6AhIjpygIEeZJD0S8pSkJCWQYlY11GkLW7Q4HqVs EEpZ2vJUt/wULgtJMG3lMga/PM8udUlMYA4TmMgkEiQM6cqb3esL10OLUXwlzWhaU5jLi9Eqb4MD iGkrCNvUpjizSc5rxsKc3GTFUxD5Hmc+E5fGNJhpbsi8OdmyJ/NkYhX3d7966pF/TvQnn+DXTk6u zg7pLI9zPoateYwsob2Yp0H+vVW2hEFTDxO1qFTM487QePE9LjMM4HpB0hbw63PSgF4XSjFM5VA0 ox4zj45COhIyzqymLvVXtpjXyuTAFG7bgRlCp7lQZNTJbyp92RhZoJDjTAxhtNijO2MpCJnqCXD5 kgIAfZpIRXbMp3I6KS/ohFSArbROvTzK4ubJ0TslkoZh9aaeTsFS5L2HCuypVTwW8cHm8TWEi2je mjT6yhEugivKWRMTPwhYGjJ2EbAbTuw+SD3GnpB6VPgrCCG7VxQSVhKNPd08VDeH0qFjaJQjm+US wh9kkIQT+0FbSgxk2q21LWlMG54r5waTcbAOq6jNm4AuV7rXvgwjsh0dOWr+e9rQtTalzBwt3RZU ufrYLrUVLS1xT4SfDU0kuSg1HNucO7jOmS66qZuu2iByN9fZbrysBd1su4uKE4FXQuJ122wTYRCM oXcFO6OdYfbT3oQJ1z+zw0g5vIA7vYFDvuFlboKy5jXo9jS9vUWOMijXtffq9wAnKdGCSaSNOd03 NPnV7n4nVGELgzTA9kAp6ezR4ereDsT8WYWNYcw5z+FXwkdLkGpV9F/e8sxfQrYxdleLY/l6TcgU bvDgInxbqME2QLoSLYaPrFOD1Jhz8S0vkz6sXNENgz8wicZMSDtjMI9Zt3QYzk0KMJw6o+rOdq6z jt2BizXTJA5yrgegUTDDwjzjOdB83jPnUDUcGL3o0WnQs59LEgdfiEp7sROkVNtnvU4/Un84cqQQ 
gcKG9JjaRWVB7J5u9D1TP9DScsPC9UI2GlbXyA0JPF9lSMOvVd4agRvL9Vbaxa4zPFKBXfmvMvtZ q+8JEzfJhPYg9bojB8ryJwvcNDxPVb/ypQFJ2o4uLMCgFT38aJzWLKmrAWk9oPJGnTHCdvJwVG6s +Prdw1aekMYwgCXJAa47DLjAB07wHZLhGiNIuMIXzvCGO/zhEogAADs= ------=_NextPart_94915C5ABAF209EF376268C8-- From noreply@sourceforge.net Mon Apr 15 19:23:39 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 15 Apr 2002 11:23:39 -0700 Subject: [Patches] [ python-Patches-536241 ] string.zfill and unicode Message-ID: Patches item #536241, was opened at 2002-03-28 14:26 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=536241&group_id=5470 Category: Library (Lib) Group: None Status: Closed Resolution: Accepted Priority: 5 Submitted By: Walter Dörwald (doerwalter) Assigned to: Walter Dörwald (doerwalter) Summary: string.zfill and unicode Initial Comment: This patch makes the function string.zfill work with unicode instances (and instances of str and unicode subclasses). Currently string.zfill(u"123", 10) results in "0000u'123'". With this patch the result is u'0000000123'. Should zfill be made a real str und unicode method? I noticed that a zfill implementation is available in unicodeobject.c, but commented out. ---------------------------------------------------------------------- >Comment By: Walter Dörwald (doerwalter) Date: 2002-04-15 20:23 Message: Logged In: YES user_id=89016 Currently zfill returns the original if nothing has to be done. Should I change this to only do it, if it's a real str or unicode instance? (as it was done lots of methods for bug http://www.python.org/sf/460020) ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-15 16:47 Message: Logged In: YES user_id=6380 Yes, please open a separate bug report for those (I'd open a separate report for each file with warnings, unless you have an obvious fix). 
---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-04-15 16:43 Message: Logged In: YES user_id=89016 > Does your compiler not warn you? Or did > you ignore warnings? > (The latter's a sin in Python-land :-). The warning was just lost in the long list of outputs. Now that you mention it, there are still a few warnings (gcc 2.96 on Linux):
Objects/unicodeobject.c: In function `PyUnicodeUCS4_Format':
Objects/unicodeobject.c:5574: warning: int format, long int arg (arg 3)
Objects/unicodeobject.c:5574: warning: unsigned int format, long unsigned int arg (arg 4)
libpython2.3.a(posixmodule.o): In function `posix_tmpnam':
Modules/posixmodule.c:5150: the use of `tmpnam_r' is dangerous, better use `mkstemp'
libpython2.3.a(posixmodule.o): In function `posix_tempnam':
Modules/posixmodule.c:5100: the use of `tempnam' is dangerous, better use `mkstemp'
Modules/pwdmodule.c: In function `initpwd':
Modules/pwdmodule.c:161: warning: unused variable `d'
Modules/readline.c: In function `set_completer_delims':
Modules/readline.c:273: warning: passing arg 1 of `free' discards qualifiers from pointer target type
Modules/expat/xmlrole.c:7: warning: `RCSId' defined but not used
Should I open a separate bug report for that? > I've also folded some long lines that weren't > your fault -- but I noticed that elsewhere you > checked in some long lines; > please try to limit line length to 78. I noticed your descrobject.c checkin message.
I've also folded some long lines that weren't your fault -- but I noticed that elsewhere you checked in some long lines; please try to limit line length to 78. ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-04-15 15:41 Message: Logged In: YES user_id=89016 Checked in as:
Doc/lib/libstdtypes.tex 1.88
Lib/UserString.py 1.12
Lib/string.py 1.63
test/string_tests.py 1.13
test/test_unicode.py 1.54
Misc/NEWS 1.388
Objects/stringobject.c 2.157
Objects/unicodeobject.c 2.138
---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-13 03:00 Message: Logged In: YES user_id=6380 I'm for making them methods. Walter, just check it in! ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-04-12 20:37 Message: Logged In: YES user_id=89016 Now that test_userstring.py works and fails (rev 1.6) should we add zfill as str and unicode methods or change UserString.zfill to use string.zfill? I've made a patch (attached) that implements zfill as methods (i.e. activates the version in unicodeobject.c that was commented out and implements the same in stringobject.c) (And it adds the test for unicode support back in.) ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-04-12 16:51 Message: Logged In: YES user_id=21627 Re: optional Unicode: Walter is correct; configuring with --disable-unicode currently breaks the string module. One might consider using types.StringTypes; OTOH, pulling in types might not be desirable. As for str vs. repr: Python was always using repr in zfill, so changing it may break things. So I recommend that Walter reverts Andrew's check-in and applies his change.
---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-03-30 12:25 Message: Logged In: YES user_id=6656 Hah, I was going to say that but was distracted by IE wiping out the machine I'm sitting at. Re-opening. ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-03-30 12:16 Message: Logged In: YES user_id=89016 But Python could be compiled without unicode support (by undefining PY_USING_UNICODE), and string.zfill should work even in this case. What about making zfill a real str and unicode method? ---------------------------------------------------------------------- Comment By: A.M. Kuchling (akuchling) Date: 2002-03-29 17:24 Message: Logged In: YES user_id=11375 Thanks for your patch! I've checked it into CVS, with two modifications. First, I removed the code to handle the case where Python doesn't have a unicode() built-in; there's no expectation that you can take the standard library for Python version N and use it with version N-1, so this code isn't needed. Second, I changed string.zfill() to take the str() and not the repr() when it gets a non-string object because that seems to make more sense.
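The behaviour the thread is debating is easy to see in a small sketch. This is an illustrative pure-Python model of the fixed padding logic (modern Python 3 spelling; the function name and structure are mine, the actual patch is C code in stringobject.c/unicodeobject.c): the pre-patch string.zfill ran non-str inputs through repr(), which is exactly what turned u"123" into "0000u'123'".

```python
def zfill(s, width):
    # Pad `s` on the left with zeros to reach `width`, keeping any sign
    # character in front of the zeros. The broken version effectively
    # padded repr(s) for unicode inputs instead of s itself.
    if len(s) >= width:
        return s
    sign = s[0] if s[:1] in ('+', '-') else ''
    return sign + '0' * (width - len(s)) + s[len(sign):]

print(zfill("123", 10))   # 0000000123, matching str.zfill
print(zfill("-3", 5))     # -0003
```

The sign handling mirrors the built-in method: the fill count is computed from the full string (sign included), so the result has exactly `width` characters.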
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=536241&group_id=5470 From noreply@sourceforge.net Mon Apr 15 19:29:11 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 15 Apr 2002 11:29:11 -0700 Subject: [Patches] [ python-Patches-536241 ] string.zfill and unicode Message-ID: Patches item #536241, was opened at 2002-03-28 08:26 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=536241&group_id=5470 Category: Library (Lib) Group: None >Status: Open Resolution: Accepted Priority: 5 Submitted By: Walter Dörwald (doerwalter) Assigned to: Walter Dörwald (doerwalter) Summary: string.zfill and unicode ---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-15 14:29 Message: Logged In: YES user_id=6380 Yes, that's the right thing. Reopened this for now.
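The rule Guido confirms here (return the original only for a real str/unicode instance) survives into today's CPython: string methods never hand a subclass instance back to the caller, even on the no-op path where an exact str may be returned unchanged. A quick illustration in Python 3 spelling (the subclass name is mine):

```python
class MyStr(str):
    """Trivial str subclass, used only to observe method return types."""

s = MyStr("123")

# Exact str instances may come back unchanged when no fill is needed,
# but a subclass instance is copied into a plain str:
assert type("123".zfill(2)) is str
assert type(s.zfill(2)) is str       # copy, not the MyStr original
assert type(s.zfill(10)) is str
assert s.zfill(10) == "0000000123"
```

The design rationale is that a subclass may override methods or carry extra state, so handing the original object to code that asked for a plain string result would leak subclass behaviour.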
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=536241&group_id=5470 From noreply@sourceforge.net Mon Apr 15 19:47:05 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 15 Apr 2002 11:47:05 -0700 Subject: [Patches] [ python-Patches-536241 ] string.zfill and unicode Message-ID: Patches item #536241, was opened at 2002-03-28 14:26 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=536241&group_id=5470 Category: Library (Lib) Group: None Status: Open Resolution: Accepted Priority: 5 Submitted By: Walter Dörwald (doerwalter) Assigned to: Walter Dörwald (doerwalter) Summary: string.zfill and unicode ---------------------------------------------------------------------- >Comment By: Walter Dörwald (doerwalter) Date: 2002-04-15 20:47 Message: Logged In: YES user_id=89016 Checked in as:
Objects/stringobject.c 2.159
Objects/unicodeobject.c 2.139
Maybe we could add a test to Lib/test/test_unicode.py and Lib/test/test_string.py that makes sure that no method returns a str/unicode subinstance even when called for a str/unicode subinstance?
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=536241&group_id=5470 From noreply@sourceforge.net Mon Apr 15 19:48:27 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 15 Apr 2002 11:48:27 -0700 Subject: [Patches] [ python-Patches-536241 ] string.zfill and unicode Message-ID: Patches item #536241, was opened at 2002-03-28 08:26 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=536241&group_id=5470 Category: Library (Lib) Group: None Status: Open Resolution: Accepted Priority: 5 Submitted By: Walter Dörwald (doerwalter) Assigned to: Walter Dörwald (doerwalter) Summary: string.zfill and unicode ---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-15 14:48 Message: Logged In: YES user_id=6380 If you want to be thorough, yes, that's a good test to add!
---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-15 14:29 Message: Logged In: YES user_id=6380 Yes, that's the right thing. Reopened this for now. ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-04-15 14:23 Message: Logged In: YES user_id=89016 Currently zfill returns the original if nothing has to be done. Should I change this to only do it if it's a real str or unicode instance? (as was done for lots of methods for bug http://www.python.org/sf/460020) ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-15 10:47 Message: Logged In: YES user_id=6380 Yes, please open a separate bug report for those (I'd open a separate report for each file with warnings, unless you have an obvious fix). ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-04-15 10:43 Message: Logged In: YES user_id=89016 > Does your compiler not warn you? Or did > you ignore warnings? > (The latter's a sin in Python-land :-). The warning was just lost in the long list of outputs. 
Now that you mention it, there are still a few warnings (gcc 2.96 on Linux): Objects/unicodeobject.c: In function `PyUnicodeUCS4_Format': Objects/unicodeobject.c:5574: warning: int format, long int arg (arg 3) Objects/unicodeobject.c:5574: warning: unsigned int format, long unsigned int arg (arg 4) libpython2.3.a(posixmodule.o): In function `posix_tmpnam': Modules/posixmodule.c:5150: the use of `tmpnam_r' is dangerous, better use `mkstemp' libpython2.3.a(posixmodule.o): In function `posix_tempnam': Modules/posixmodule.c:5100: the use of `tempnam' is dangerous, better use `mkstemp' Modules/pwdmodule.c: In function `initpwd': Modules/pwdmodule.c:161: warning: unused variable `d' Modules/readline.c: In function `set_completer_delims': Modules/readline.c:273: warning: passing arg 1 of `free' discards qualifiers from pointer target type Modules/expat/xmlrole.c:7: warning: `RCSId' defined but not used Should I open a separate bug report for that? > I've also folded some long lines that weren't > your fault -- but I noticed that elsewhere you > checked in some long lines; > please try to limit line length to 78. I noticed your descrobject.c checkin message. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-15 09:53 Message: Logged In: YES user_id=6380 Thanks, Walter! Some nits: The string_zfill() code you checked in caused two warnings about modifying data pointed to by a const pointer. I've removed the const, but I'd like to understand how come you didn't catch this. Does your compiler not warn you? Or did you ignore warnings? (The latter's a sin in Python-land :-). I've also folded some long lines that weren't your fault -- but I noticed that elsewhere you checked in some long lines; please try to limit line length to 78. 
---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-04-15 09:41 Message: Logged In: YES user_id=89016 Checked in as: Doc/lib/libstdtypes.tex 1.88 Lib/UserString.py 1.12 Lib/string.py 1.63 test/string_tests.py 1.13 test/test_unicode.py 1.54 Misc/NEWS 1.388 Objects/stringobject.c 2.157 Objects/unicodeobject.c 2.138 ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-12 21:00 Message: Logged In: YES user_id=6380 I'm for making them methods. Walter, just check it in! ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-04-12 14:37 Message: Logged In: YES user_id=89016 Now that test_userstring.py works and fails (rev 1.6) should we add zfill as str and unicode methods or change UserString.zfill to use string.zfill? I've made a patch (attached) that implements zfill as methods (i.e. activates the version in unicodeobject.c that was commented out and implements the same in stringobject.c) (And it adds the test for unicode support back in.) ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-04-12 10:51 Message: Logged In: YES user_id=21627 Re: optional Unicode: Walter is correct; configuring with --disable-unicode currently breaks the string module. One might consider using types.StringTypes; OTOH, pulling in types might not be desirable. As for str vs. repr: Python was always using repr in zfill, so changing it may break things. So I recommend that Walter reverts Andrew's check-in and applies his change. ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-03-30 06:25 Message: Logged In: YES user_id=6656 Hah, I was going to say that but was distracted by IE wiping out the machine I'm sitting at. Re-opening. 
---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-03-30 06:16 Message: Logged In: YES user_id=89016 But Python could be compiled without unicode support (by undefining PY_USING_UNICODE), and string.zfill should work even in this case. What about making zfill a real str and unicode method? ---------------------------------------------------------------------- Comment By: A.M. Kuchling (akuchling) Date: 2002-03-29 11:24 Message: Logged In: YES user_id=11375 Thanks for your patch! I've checked it into CVS, with two modifications. First, I removed the code to handle the case where Python doesn't have a unicode() built-in; there's no expectation that you can take the standard library for Python version N and use it with version N-1, so this code isn't needed. Second, I changed string.zfill() to take the str() and not the repr() when it gets a non-string object because that seems to make more sense. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=536241&group_id=5470 From noreply@sourceforge.net Mon Apr 15 19:54:10 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 15 Apr 2002 11:54:10 -0700 Subject: [Patches] [ python-Patches-531901 ] binary packagers Message-ID: Patches item #531901, was opened at 2002-03-19 15:53 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=531901&group_id=5470 Category: Distutils and setup.py Group: None Status: Open Resolution: None Priority: 5 Submitted By: Mark Alexander (mwa) Assigned to: M.-A. Lemburg (lemburg) Summary: binary packagers Initial Comment: zip file with updated Solaris and HP-UX packagers. Replaces 415226, 415227, 415228. Changes made to take advantage of new PEP241 changes in the Distribution class. 
---------------------------------------------------------------------- >Comment By: Mark Alexander (mwa) Date: 2002-04-15 18:54 Message: Logged In: YES user_id=12810 New file submitted. No documentation yet, but I am committed to maintaining them. ---------------------------------------------------------------------- Comment By: M.-A. Lemburg (lemburg) Date: 2002-04-11 16:59 Message: Logged In: YES user_id=38388 Mark, could you reupload the ZIP file ? I cannot download it from the SF page (the file is mostly empty). Also, is the documentation already included in the ZIP file ? If not, it would be nice if you could add them as well. I don't require a special PEP for these changes, BTW, but I do require you to maintain them. Thanks. ---------------------------------------------------------------------- Comment By: Mark Alexander (mwa) Date: 2002-03-20 19:55 Message: Logged In: YES user_id=12810 OK, the PEP seems to me to mean most of this is done. These additions are not library modules, they are Distutils "commands". So the way I read it, the Distutils-SIG (where I've been hanging around for some time) are the Maintainers. The documentation will be 2 new chapters for the Distutils manual "Creating Solaris packages" and "Creating HP-UX packages" each looking a whole lot like "Creating RPM packages". Does that clarify anything, or am I still missing a clue? p.s. Thanks for cleaning up the extra uploads! ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-03-20 15:35 Message: Logged In: YES user_id=21627 You volunteering as the maintainer is part of the prerequisites of accepting new modules, when following PEP 2, see http://python.sourceforge.net/peps/pep-0002.html It says: "developers ... will first form a group of maintainers. Then, this group shall produce a PEP called a library PEP." So existence of a PEP describing these library extensions would be a prerequisite for accepting them. 
If MAL wants to waive this requirement, it would be fine with me. However, such a PEP could also share text with the documentation, so it might not be wasted effort. ---------------------------------------------------------------------- Comment By: Mark Alexander (mwa) Date: 2002-03-20 14:49 Message: Logged In: YES user_id=12810 Any of the three (they're all the same). SourceForge hiccuped during the upload, and I don't have permission to delete the duplicates. I don't exactly understand what you mean by applying PEP 2. I uploaded this per Marc Lemburg's request for the latest versions of patches 41522[6-8]. He's acting as the integrator in this case (see http://mail.python.org/pipermail/distutils-sig/2001-December/002659.html). I let him know about the duplicate uploads, so hopefully he'll correct it. If you can and want, feel free to delete the 2 of your choice. I agree they need to be documented. As soon as I can, I'll submit changes to the Distutils documentation. Finally, yes, I'll act as maintainer. I'm on the Distutils-sig and as soon as some other poor soul who has to deal with Solaris or HP-UX tries them, I'm there to work out issues. ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-03-20 07:35 Message: Logged In: YES user_id=21627 Which of the three attached files is the right one (19633, 19634, or 19635)? Unless they are all needed, we should delete the extra copies. I recommend to apply PEP 2 to this patch: A library PEP is needed (which could be quite short), documentation, perhaps test cases. Most importantly, there must be an identified maintainer of these modules. Are you willing to act as the maintainer? 
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=531901&group_id=5470 From noreply@sourceforge.net Mon Apr 15 20:11:31 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 15 Apr 2002 12:11:31 -0700 Subject: [Patches] [ python-Patches-544330 ] docs for PyObject_Call + PyObject_Length Message-ID: Patches item #544330, was opened at 2002-04-15 21:11 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=544330&group_id=5470 Category: Documentation Group: None Status: Open Resolution: None Priority: 5 Submitted By: Thomas Heller (theller) Assigned to: Fred L. Drake, Jr. (fdrake) Summary: docs for PyObject_Call + PyObject_Length Initial Comment: Summary says all.. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=544330&group_id=5470 From noreply@sourceforge.net Mon Apr 15 20:55:33 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 15 Apr 2002 12:55:33 -0700 Subject: [Patches] [ python-Patches-496705 ] Additions & corrections to libmacui.tex Message-ID: Patches item #496705, was opened at 2001-12-25 21:19 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=496705&group_id=5470 Category: Documentation Group: None >Status: Closed >Resolution: Accepted Priority: 5 Submitted By: Dean Draayer (draayer) Assigned to: Fred L. Drake, Jr. (fdrake) Summary: Additions & corrections to libmacui.tex Initial Comment: Includes a thorough description of the relatively new GetArgv function. Greatly expanded the description of the ProgressBar class, as well as updating the description to reflect recent changes to this class. Numerous minor changes - mostly grammatical - made throughout the document. 
---------------------------------------------------------------------- >Comment By: Fred L. Drake, Jr. (fdrake) Date: 2002-04-15 15:55 Message: Logged In: YES user_id=3066 Checked in as Doc/mac/libmacui.tex 1.17 and 1.16.24.1 for the 2.2 maintenance branch (sorry for missing 2.2.1!) and the trunk. ---------------------------------------------------------------------- Comment By: Fred L. Drake, Jr. (fdrake) Date: 2002-01-10 18:33 Message: Logged In: YES user_id=3066 Checked in the GetArgv() docs for Python 2.1.2 (Doc/mac/libmacui.tex revision 1.16.6.1); there just wasn't time to worry about making sure I had the ProgressBar docs right (due to the 2.2 addition of the indeterminate version), so I punted on that for 2.1.2. I still need to do the "right thing" for 2.2.* and the trunk. It shouldn't take long, but I can't right now. ---------------------------------------------------------------------- Comment By: Dean Draayer (draayer) Date: 2002-01-08 11:42 Message: Logged In: YES user_id=307112 I don't know which version introduced GetArgv(). I think it was 2.0, but since it wasn't documented I never saw it until 2.1. As for the barber-pole style progress bars, those will be new in 2.2. So you may want to annotate the appropriate material with a version number there as well. ---------------------------------------------------------------------- Comment By: Fred L. Drake, Jr. (fdrake) Date: 2002-01-04 22:53 Message: Logged In: YES user_id=3066 Attached a revised version of the patch (minor changes only, plus fix one markup error). ---------------------------------------------------------------------- Comment By: Fred L. Drake, Jr. (fdrake) Date: 2002-01-04 22:41 Message: Logged In: YES user_id=3066 Excellent! What version of MacPython introduced the GetArgv() function? I'd like to add a version annotation and back-port the portions of the patch that belong in the Python 2.1.2 and 2.2.1 documentation. Thanks for the contribution! 
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=496705&group_id=5470 From noreply@sourceforge.net Mon Apr 15 21:52:06 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 15 Apr 2002 13:52:06 -0700 Subject: [Patches] [ python-Patches-544330 ] docs for PyObject_Call + PyObject_Length Message-ID: Patches item #544330, was opened at 2002-04-15 15:11 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=544330&group_id=5470 Category: Documentation Group: None >Status: Closed >Resolution: Accepted Priority: 5 Submitted By: Thomas Heller (theller) Assigned to: Fred L. Drake, Jr. (fdrake) Summary: docs for PyObject_Call + PyObject_Length Initial Comment: Summary says all.. ---------------------------------------------------------------------- >Comment By: Fred L. Drake, Jr. (fdrake) Date: 2002-04-15 16:52 Message: Logged In: YES user_id=3066 Checked in a variant that uses new markup in Doc/api/abstract.tex revision 1.12. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=544330&group_id=5470 From noreply@sourceforge.net Tue Apr 16 04:41:50 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 15 Apr 2002 20:41:50 -0700 Subject: [Patches] [ python-Patches-541694 ] whichdb unittest Message-ID: Patches item #541694, was opened at 2002-04-09 15:15 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=541694&group_id=5470 Category: Tests Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Gregory H. Ball (greg_ball) Assigned to: Neal Norwitz (nnorwitz) Summary: whichdb unittest Initial Comment: Attached patch is a first crack at a unit test for whichdb. 
I think that all functionality required for use by the anydbm module is tested, but only for the database modules found in a given installation. The test case is built up at runtime to cover all the available modules, so it is a bit introspective, but I think it is obvious that it should run correctly. Unfortunately it crashes on my box (Redhat 6.2) and this seems to be a real problem with whichdb: it assumes things about the dbm format which turn out to be wrong sometimes. I only discovered this because test_anydbm was crashing, when whichdb failed to work on dbm files. It would not have crashed if dbhash was available... and dbhash was not available because bsddb was not built correctly. So I think there is a build bug there, but I have little idea how to solve that one at this point. Would I be correct in thinking that if this test really uncovers bugs in whichdb, it can't be checked in until they are fixed? Unfortunately I don't know much about the various databases, but I'll try to work with someone on it. ---------------------------------------------------------------------- >Comment By: Gregory H. Ball (greg_ball) Date: 2002-04-15 23:41 Message: Logged In: YES user_id=11365 I get two failures... First, using the dbm module as the engine, whichdb fails to identify the type. This is apparently a platform problem... whichdb.py has the comment # Check for dbm first -- this has a .pag and a .dir file but on my system the dbm module creates a .db file. The 'file' utility says duh.db: Berkeley DB 2.X Hash/Little Endian (Version 5, Logical sequence number: file - 0, offset - 0, Bucket Size 4096, Overflow Point 1, Last Freed 0, Max Bucket 1, High Mask 0x1, Low Mask 0x0, Fill Factor 40, Number of Keys 0) Now, a very simple patch would be to look for .db files and call them 'dbm'. I have no idea though whether there might be other database formats which use this extension. So the thing to do might be to look for .db files and test their magic. 
Actually, the .db files are identified as "dbhash" databases if named explicitly to whichdb... But the dbhash module isn't available due to missing bsddb! I'm not sure what to make of all this. I could just assume .db files with dbhash magic are always of kind dbm... sound reasonable? Secondly, dumbdbm doesn't work either, if the database is empty... f.read(1) in ["'", '"'] doesn't turn out to be true, since the .dir file is empty. Ok, I've attached a naive patch. Note I'm not even looking at testing dbhash or gdbm since they're not built on my system. On the other hand since anydbm tries these first, maybe they are effectively tested by test_anydbm. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-14 21:10 Message: Logged In: YES user_id=6380 What kind of crash do you experience? Do you have a patch that fixes whichdb? ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=541694&group_id=5470 From noreply@sourceforge.net Tue Apr 16 04:49:06 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 15 Apr 2002 20:49:06 -0700 Subject: [Patches] [ python-Patches-541694 ] whichdb unittest Message-ID: Patches item #541694, was opened at 2002-04-09 15:15 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=541694&group_id=5470 Category: Tests Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Gregory H. Ball (greg_ball) Assigned to: Neal Norwitz (nnorwitz) Summary: whichdb unittest Initial Comment: Attached patch is a first crack at a unit test for whichdb. I think that all functionality required for use by the anydbm module is tested, but only for the database modules found in a given installation. 
The test case is built up at runtime to cover all the available modules, so it is a bit introspective, but I think it is obvious that it should run correctly. Unfortunately it crashes on my box (Redhat 6.2) and this seems to be a real problem with whichdb: it assumes things about the dbm format which turn out to be wrong sometimes. I only discovered this because test_anydbm was crashing, when whichdb failed to work on dbm files. It would not have crashed if dbhash was available... and dbhash was not available because bsddb was not built correctly. So I think there is a build bug there, but I have little idea how to solve that one at this point. Would I be correct in thinking that if this test really uncovers bugs in whichdb, it can't be checked in until they are fixed? Unfortunately I don't know much about the various databases, but I'll try to work with someone on it. ---------------------------------------------------------------------- >Comment By: Gregory H. Ball (greg_ball) Date: 2002-04-15 23:49 Message: Logged In: YES user_id=11365 More detail... the failure mode of test_anydbm is that a database freshly created with anydbm.open() can't be reopened using the 'r' mode. Since whichdb returns None we wind up at raise error, "need 'c' or 'n' flag to open new db" Of course, whichdb is to blame for this. ---------------------------------------------------------------------- Comment By: Gregory H. Ball (greg_ball) Date: 2002-04-15 23:41 Message: Logged In: YES user_id=11365 I get two failures... First, using the dbm module as the engine, whichdb fails to identify the type. This is apparently a platform problem... whichdb.py has the comment # Check for dbm first -- this has a .pag and a .dir file but on my system the dbm module creates a .db file. 
The 'file' utility says duh.db: Berkeley DB 2.X Hash/Little Endian (Version 5, Logical sequence number: file - 0, offset - 0, Bucket Size 4096, Overflow Point 1, Last Freed 0, Max Bucket 1, High Mask 0x1, Low Mask 0x0, Fill Factor 40, Number of Keys 0) Now, a very simple patch would be to look for .db files and call them 'dbm'. I have no idea though whether there might be other database formats which use this extension. So the thing to do might be to look for .db files and test their magic. Actually, the .db files are identified as "dbhash" databases if named explicitly to whichdb... But the dbhash module isn't available due to missing bsddb! I'm not sure what to make of all this. I could just assume .db files with dbhash magic are always of kind dbm... sound reasonable? Secondly, dumbdbm doesn't work either, if the database is empty... f.read(1) in ["'", '"'] doesn't turn out to be true, since the .dir file is empty. Ok, I've attached a naive patch. Note I'm not even looking at testing dbhash or gdbm since they're not built on my system. On the other hand since anydbm tries these first, maybe they are effectively tested by test_anydbm. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-14 21:10 Message: Logged In: YES user_id=6380 What kind of crash do you experience? Do you have a patch that fixes whichdb? 
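The "look for .db files and test their magic" route floated in the thread above can be sketched as follows. This is a hypothetical illustration, not the patch that was attached: the constant is the well-known Berkeley DB hash magic number (0x00061561), read at byte offset 12 of the metadata page in either byte order; treat the constant, the offset, and `db_file_magic` itself as assumptions for the sketch.

```python
# Hypothetical sketch: classify a .db file by its Berkeley DB hash magic
# instead of trusting the extension alone.
import struct
import tempfile

BDB_HASH_MAGIC = 0x00061561  # assumed Berkeley DB hash magic

def db_file_magic(path):
    """Return 'dbhash' if the file carries a Berkeley DB hash magic, else None."""
    with open(path, "rb") as f:
        header = f.read(16)
    if len(header) < 16:
        return None
    magic_le = struct.unpack("<I", header[12:16])[0]
    magic_be = struct.unpack(">I", header[12:16])[0]
    return "dbhash" if BDB_HASH_MAGIC in (magic_le, magic_be) else None

# Synthetic 16-byte header standing in for greg_ball's duh.db file.
with tempfile.NamedTemporaryFile(suffix=".db", delete=False) as f:
    f.write(b"\x00" * 12 + struct.pack("<I", BDB_HASH_MAGIC))
    fake_db = f.name

print(db_file_magic(fake_db))   # dbhash
```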
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=541694&group_id=5470 From noreply@sourceforge.net Tue Apr 16 13:46:21 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Tue, 16 Apr 2002 05:46:21 -0700 Subject: [Patches] [ python-Patches-541694 ] whichdb unittest Message-ID: Patches item #541694, was opened at 2002-04-09 15:15 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=541694&group_id=5470 Category: Tests Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Gregory H. Ball (greg_ball) Assigned to: Neal Norwitz (nnorwitz) Summary: whichdb unittest Initial Comment: Attached patch is a first crack at a unit test for whichdb. I think that all functionality required for use by the anydbm module is tested, but only for the database modules found in a given installation. The test case is built up at runtime to cover all the available modules, so it is a bit introspective, but I think it is obvious that it should run correctly. Unfortunately it crashes on my box (Redhat 6.2) and this seems to be a real problem with whichdb: it assumes things about the dbm format which turn out to be wrong sometimes. I only discovered this because test_anydbm was crashing, when whichdb failed to work on dbm files. It would not have crashed if dbhash was available... and dbhash was not available because bsddb was not built correctly. So I think there is a build bug there, but I have little idea how to solve that one at this point. Would I be correct in thinking that if this test really uncovers bugs in whichdb, it can't be checked in until they are fixed? Unfortunately I don't know much about the various databases, but I'll try to work with someone on it. 
---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-16 08:46 Message: Logged In: YES user_id=6380 Greg, you assigned this to Neal Norwitz. Why? Usually bug reports stay unassigned until a developer shows interest. I doubt that this is Neal's kind of bug: he's not commented on the bug report, nor does this match the other bugs that he's interested in. ---------------------------------------------------------------------- Comment By: Gregory H. Ball (greg_ball) Date: 2002-04-15 23:49 Message: Logged In: YES user_id=11365 More detail... the failure mode of test_anydbm is that a database freshly created with anydbm.open() can't be reopened using the 'r' mode. Since whichdb returns None we wind up at raise error, "need 'c' or 'n' flag to open new db" Of course, whichdb is to blame for this. ---------------------------------------------------------------------- Comment By: Gregory H. Ball (greg_ball) Date: 2002-04-15 23:41 Message: Logged In: YES user_id=11365 I get two failures... First, using the dbm module as the engine, whichdb fails to identify the type. This is apparently a platform problem... whichdb.py has the comment # Check for dbm first -- this has a .pag and a .dir file but on my system the dbm module creates a .db file. The 'file' utility says duh.db: Berkeley DB 2.X Hash/Little Endian (Version 5, Logical sequence number: file - 0, offset - 0, Bucket Size 4096, Overflow Point 1, Last Freed 0, Max Bucket 1, High Mask 0x1, Low Mask 0x0, Fill Factor 40, Number of Keys 0) Now, a very simple patch would be to look for .db files and call them 'dbm'. I have no idea though whether there might be other database formats which use this extension. So the thing to do might be to look for .db files and test their magic. Actually, the .db files are identified as "dbhash" databases if named explicitly to whichdb... But the dbhash module isn't available due to missing bsddb! 
I'm not sure what to make of all this. I could just assume .db files with dbhash magic are always of kind dbm... sound reasonable? Secondly, dumbdbm doesn't work either, if the database is empty... f.read(1) in ["'", '"'] doesn't turn out to be true, since the .dir file is empty. Ok, I've attached a naive patch. Note I'm not even looking at testing dbhash or gdbm since they're not built on my system. On the other hand since anydbm tries these first, maybe they are effectively tested by test_anydbm. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-14 21:10 Message: Logged In: YES user_id=6380 What kind of crash do you experience? Do you have a patch that fixes whichdb? ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=541694&group_id=5470 From noreply@sourceforge.net Tue Apr 16 13:52:39 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Tue, 16 Apr 2002 05:52:39 -0700 Subject: [Patches] [ python-Patches-541694 ] whichdb unittest Message-ID: Patches item #541694, was opened at 2002-04-09 15:15 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=541694&group_id=5470 Category: Tests Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Gregory H. Ball (greg_ball) Assigned to: Neal Norwitz (nnorwitz) Summary: whichdb unittest Initial Comment: Attached patch is a first crack at a unit test for whichdb. I think that all functionality required for use by the anydbm module is tested, but only for the database modules found in a given installation. The test case is built up at runtime to cover all the available modules, so it is a bit introspective, but I think it is obvious that it should run correctly. 
Unfortunately it crashes on my box (Redhat 6.2) and this seems to be a real problem with whichdb: it assumes things about the dbm format which turn out to be wrong sometimes. I only discovered this because test_anydbm was crashing, when whichdb failed to work on dbm files. It would not have crashed if dbhash was available... and dbhash was not available because bsddb was not built correctly. So I think there is a build bug there, but I have little idea how to solve that one at this point. Would I be correct in thinking that if this test really uncovers bugs in whichdb, it can't be checked in until they are fixed? Unfortunately I don't know much about the various databases, but I'll try to work with someone on it. ---------------------------------------------------------------------- >Comment By: Neal Norwitz (nnorwitz) Date: 2002-04-16 08:52 Message: Logged In: YES user_id=33168 I have reviewed it, but others have stayed on top of this and I didn't have anything to contribute. I will be glad to check the patch in when it is in the proper state. But I don't know much about anydbm. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-16 08:46 Message: Logged In: YES user_id=6380 Greg, you assigned this to Neal Norwitz. Why? Usually bug reports stay unassigned until a developer shows interest. I doubt that this is Neal's kind of bug: he's not commented on the bug report, nor does this match the other bugs that he's interested in. ---------------------------------------------------------------------- Comment By: Gregory H. Ball (greg_ball) Date: 2002-04-15 23:49 Message: Logged In: YES user_id=11365 More detail... the failure mode of test_anydbm is that a database freshly created with anydbm.open() can't be reopened using the 'r' mode. Since whichdb returns None we wind up at raise error, "need 'c' or 'n' flag to open new db" Of course, whichdb is to blame for this. 
---------------------------------------------------------------------- Comment By: Gregory H. Ball (greg_ball) Date: 2002-04-15 23:41 Message: Logged In: YES user_id=11365 I get two failures... First, using the dbm module as the engine, whichdb fails to identify the type. This is apparently a platform problem... whichdb.py has the comment # Check for dbm first -- this has a .pag and a .dir file but on my system the dbm module creates a .db file. The 'file' utility says duh.db: Berkeley DB 2.X Hash/Little Endian (Version 5, Logical sequence number: file - 0, offset - 0, Bucket Size 4096, Overflow Point 1, Last Freed 0, Max Bucket 1, High Mask 0x1, Low Mask 0x0, Fill Factor 40, Number of Keys 0) Now, a very simple patch would be to look for .db files and call them 'dbm'. I have no idea though whether there might be other database formats which use this extension. So the thing to do might be to look for .db files and test their magic. Actually, the .db files are identified as "dbhash" databases if named explicitly to whichdb... But the dbhash module isn't available due to missing bsddb! I'm not sure what to make of all this. I could just assume .db files with dbhash magic are always of kind dbm... sound reasonable? Secondly, dumbdbm doesn't work either, if the database is empty... f.read(1) in ["'", '"'] doesn't turn out to be true, since the .dir file is empty. Ok, I've attached a naive patch. Note I'm not even looking at testing dbhash or gdbm since they're not built on my system. On the other hand since anydbm tries these first, maybe they are effectively tested by test_anydbm. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-14 21:10 Message: Logged In: YES user_id=6380 What kind of crash do you experience? Do you have a patch that fixes whichdb? 
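The dumbdbm corner case described in the thread above (an empty .dir file defeating the first-byte check) can be sketched like this. It is a minimal stand-in for the detection logic, not the actual whichdb code or the attached patch; `looks_like_dumbdbm` is a hypothetical helper:

```python
# dumbdbm's .dir index holds entries that start with a quoted key, so a
# check that reads only the first byte misclassifies a freshly created,
# still-empty database whose .dir file contains no bytes at all.
import os
import tempfile

def looks_like_dumbdbm(dir_path):
    with open(dir_path, "rb") as f:
        first = f.read(1)
    if first in (b"'", b'"'):    # the original first-byte check
        return True
    # naive fix in the spirit of the attached patch: an empty .dir qualifies too
    return os.path.getsize(dir_path) == 0

empty_dir = os.path.join(tempfile.mkdtemp(), "duh.dir")
open(empty_dir, "wb").close()          # freshly created, empty index
print(looks_like_dumbdbm(empty_dir))   # True with the fix, False without
```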
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=541694&group_id=5470 From noreply@sourceforge.net Tue Apr 16 16:48:52 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Tue, 16 Apr 2002 08:48:52 -0700 Subject: [Patches] [ python-Patches-544733 ] Cygwin test_mmap fix for Python 2.2.1 Message-ID: Patches item #544733, was opened at 2002-04-16 07:48 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=544733&group_id=5470 Category: Tests Group: Python 2.2.x Status: Open Resolution: None Priority: 5 Submitted By: Jason Tishler (jlt63) Assigned to: Nobody/Anonymous (nobody) Summary: Cygwin test_mmap fix for Python 2.2.1 Initial Comment: Due to the changes in test_mmap for Python 2.2.1, this test now fails under Cygwin for the following two reasons: o since the test file is left open in the second to last test it causes the last test to fail due to the standard way that Windows "deals" with open files o the last test fails because Windows appears to need the backing file to be flushed before the mmap operation will succeed This patch corrects the above problems. I have also tried this patch under Red Hat Linux 7.1 without any ill effects. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=544733&group_id=5470 From noreply@sourceforge.net Tue Apr 16 17:24:36 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Tue, 16 Apr 2002 09:24:36 -0700 Subject: [Patches] [ python-Patches-541694 ] whichdb unittest Message-ID: Patches item #541694, was opened at 2002-04-09 15:15 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=541694&group_id=5470 Category: Tests Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Gregory H. 
Ball (greg_ball) Assigned to: Neal Norwitz (nnorwitz) Summary: whichdb unittest Initial Comment: Attached patch is a first crack at a unit test for whichdb. I think that all functionality required for use by the anydbm module is tested, but only for the database modules found in a given installation. The test case is built up at runtime to cover all the available modules, so it is a bit introspective, but I think it is obvious that it should run correctly. Unfortunately it crashes on my box (Redhat 6.2) and this seems to be a real problem with whichdb: it assumes things about the dbm format which turn out to be wrong sometimes. I only discovered this because test_anydbm was crashing, when whichdb failed to work on dbm files. It would not have crashed if dbhash was available... and dbhash was not available because bsddb was not built correctly. So I think there is a build bug there, but I have little idea how to solve that one at this point. Would I be correct in thinking that if this test really uncovers bugs in whichdb, it can't be checked in until they are fixed? Unfortunately I don't know much about the various databases, but I'll try to work with someone on it. ---------------------------------------------------------------------- >Comment By: Gregory H. Ball (greg_ball) Date: 2002-04-16 12:24 Message: Logged In: YES user_id=11365 Neal posted a list to python-dev of standard library modules without unit tests. (<3CB3093C.B7A22727@metaslash.com>, 1 week ago, subject Re: Stability and change) That prompted me to address the breakage that I was seeing in test_anydbm due to whichdb. So I thought he might be interested. If this interrupts the workflow I'll refrain from making assignments with so little justification... 
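The runtime-built test case described in the initial comment can be sketched as follows. The module names here are the modern dbm package names (the 2002 patch would have probed dbm, gdbm, dbhash and dumbdbm instead), so treat them as illustrative:

```python
import unittest

# One test method is generated per database engine that actually
# imported on this machine, mirroring the introspective approach
# described above. Engine names are the modern dbm.* spellings.
CANDIDATE_ENGINES = ("dbm.ndbm", "dbm.gnu", "dbm.dumb")

def _make_test(mod_name):
    def test(self):
        mod = __import__(mod_name, fromlist=["open"])
        self.assertTrue(callable(mod.open))
    return test

class WhichDBTest(unittest.TestCase):
    pass

for _name in CANDIDATE_ENGINES:
    try:
        __import__(_name, fromlist=["open"])
    except ImportError:
        continue  # engine not built on this system -- skip it
    setattr(WhichDBTest, "test_" + _name.replace(".", "_"),
            _make_test(_name))
```

The test suite thus adapts itself to whatever was built, which is exactly why its coverage varies from box to box, as the discussion notes.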
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=541694&group_id=5470 From noreply@sourceforge.net Tue Apr 16 18:00:22 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Tue, 16 Apr 2002 10:00:22 -0700 Subject: [Patches] [ python-Patches-500311 ] Work around for buggy https servers Message-ID: Patches item #500311, was opened at 2002-01-07 08:49 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=500311&group_id=5470 Category: Modules Group: None Status: Open Resolution: None Priority: 5 Submitted By: Michel Van den Bergh (vdbergh) Assigned to: Martin v. Löwis (loewis) Summary: Work around for buggy https servers Initial Comment: Python 2.2. Tested on RH 7.1. This is a workaround for http://sourceforge.net/tracker/?group_id=5470&atid=105470&func=detail&aid=494762 The problem is that some https servers close an ssl connection without properly resetting it first. In the above bug description it is suggested that this only occurs for IIS but apparently some (modified) Apache servers also suffer from it (see telemeter.telenet.be). One of the suggested workarounds is to modify httplib.py so as to ignore the combination of err[0]==SSL_ERROR_SYSCALL and err[1]=="EOF occurred in violation of protocol". However I think one should never compare error strings since in principle they may depend on language etc... So I decided to modify _socket.c slightly so that it becomes possible to return error codes which are not in ssl.h. When an ssl-connection is closed without reset I now return the error code SSL_ERROR_EOF. Then I ignore this (apparently benign) error in httplib.py. In addition I fixed what I think was an error in PySSL_SetError(SSL *ssl, int ret) in socketmodule.c.
Originally there was: case SSL_ERROR_SSL: { unsigned long e = ERR_get_error(); if (e == 0) { /* an EOF was observed that violates the protocol */ errstr = "EOF occurred in violation of protocol"; etc... but if I understand the documentation for SSL_get_error then the test should be: e==0 && ret==0. A similar error occurs a few lines later. ---------------------------------------------------------------------- Comment By: Jon Ribbens (jribbens) Date: 2002-04-16 18:00 Message: Logged In: YES user_id=76089 py23ssl.txt works fine for me when applied to latest CVS, and fixes the problem. ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-03-11 06:34 Message: Logged In: YES user_id=21627 Unfortunately, your patch appears to be incorrect. Performing the script in #494762, I get an empty string as the result, whereas the content of the resource is 'HTTPS Test' In case you want to experiment with the CVS version I'll attach a patch for that. ---------------------------------------------------------------------- Comment By: Michel Van den Bergh (vdbergh) Date: 2002-01-09 10:25 Message: Logged In: YES user_id=10252 Due to some problems with sourceforge and incompetence on my part I submitted this several times. Please see patch 500311. 
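The point about never matching on error strings is, incidentally, how this case ended up being handled in the modern ssl module: the condition got its own exception type. A minimal illustration using today's ssl module (not the 2002 patch):

```python
import ssl

# Dispatch on the error type, never on the message text, which may
# be translated or reworded between library versions.
def is_benign_eof(exc):
    """True for the 'EOF occurred in violation of protocol' case."""
    return isinstance(exc, ssl.SSLEOFError)

try:
    # Simulate the condition; a real client would hit this while
    # reading from a server that drops the connection without a
    # proper SSL shutdown.
    raise ssl.SSLEOFError("EOF occurred in violation of protocol")
except ssl.SSLError as exc:
    benign = is_benign_eof(exc)
```

This is the same design the patch proposes: surface a distinct code (SSL_ERROR_EOF) instead of comparing strings.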
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=500311&group_id=5470 From noreply@sourceforge.net Wed Apr 17 00:11:32 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Tue, 16 Apr 2002 16:11:32 -0700 Subject: [Patches] [ python-Patches-544909 ] addition of cmath.arg function Message-ID: Patches item #544909, was opened at 2002-04-16 18:11 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=544909&group_id=5470 Category: Modules Group: None Status: Open Resolution: None Priority: 5 Submitted By: John Williams (johnw42) Assigned to: Nobody/Anonymous (nobody) Summary: addition of cmath.arg function Initial Comment: This patch adds the familiar "Arg" function from complex analysis to the cmath module, though it's called "arg" here for consistency with the other names. Along with the built-in abs function this makes polar/rectangular coordinate conversions trivial: z = complex(x,y) r, theta = abs(z), arg(z) z = r * exp(1j * theta) x, y = z.real, z.imag ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=544909&group_id=5470 From noreply@sourceforge.net Wed Apr 17 06:07:34 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Tue, 16 Apr 2002 22:07:34 -0700 Subject: [Patches] [ python-Patches-462754 ] no '_d' ending for mingw32 Message-ID: Patches item #462754, was opened at 2001-09-19 05:29 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=462754&group_id=5470 Category: Distutils and setup.py Group: None Status: Open Resolution: None Priority: 5 Submitted By: Gerhard Häring (ghaering) Assigned to: Nobody/Anonymous (nobody) Summary: no '_d' ending for mingw32 Initial Comment: This patch prevents distutils from naming the extension modules _d.pyd when 
compiled with mingw32 on Windows in debug mode. Instead, the extension modules will get the normal name .pyd. Technically, the patch doesn't prevent the behaviour for mingw32, but only adds the _d for MS Visual C++ and Borland compilers (though I don't know about the Borland case). The reason for this? Adding "_d" doesn't make any sense for GNU compilers. I think it's just a MS Visual C++ madness. If you want to debug an extension module that was compiled with gcc, you have to use gdb anyway, because the debugging symbols of MSVC++ and gcc are incompatible. So you normally use a release Python version (from the python.org binary download) and compile your extensions with mingw32. To put it shortly: The current state is that you do a "setup.py build --compiler=mingw32 --debug" and then rename the extension modules, removing the _d. Then fire up gdb to debug your module. With this patch, the renaming isn't necessary anymore. ---------------------------------------------------------------------- >Comment By: Gerhard Häring (ghaering) Date: 2002-04-17 07:07 Message: Logged In: YES user_id=163326 If python.exe is compiled --with-pydebug, then this is true. But the point is that I want to compile debug versions of my extension modules and use them with the standard python.exe (*not* python_d.exe). So yes, the patch does work, at least it did when I submitted it . ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-03-09 12:44 Message: Logged In: YES user_id=21627 Does the patch actually work? It seems to me that, if compiled with-pydebug, import will automatically search for the _d version, and complain if it is not found. ---------------------------------------------------------------------- Comment By: Martin v. 
Löwis (loewis) Date: 2002-01-04 12:52 Message: Logged In: YES user_id=21627 The rationale for using the debugging version of MSVCRT is not the debugging information alone, but also the additional functionality, like heap consistency checks and other assertions. So it is not obvious that you do not want to use the debugging version of this library in a debug build. ---------------------------------------------------------------------- Comment By: Gerhard Häring (ghaering) Date: 2002-01-04 03:50 Message: Logged In: YES user_id=163326 mingw links with msvcrt.dll. I have plans to add mingw32 support to the autoconf build process (hopefully soon enough for 2.3). The GNU and MS debugger symbols are incompatible, though, so I think that mingw32 shouldn't link to the debug version of msvcrt (gdb doesn't understand the Microsoft debugger symbols; and the Visual Studio debugger has no idea what the debugging symbols of gcc are all about; isn't cross-platform and cross-compiler programming fun?). ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2001-12-30 14:13 Message: Logged In: YES user_id=21627 How does the mingw port interact with the debugging libraries? With MSVC, the debug build will link to the debug versions of the CRT. What C library will mingw link with (I hope it won't use crtdll.dll)? ---------------------------------------------------------------------- Comment By: Gerhard Häring (ghaering) Date: 2001-09-28 23:28 Message: Logged In: YES user_id=163326 Yes. But mingw32 isn't emulating Unix under Windows (that would be Cygwin). It's just a version of gcc and friends that targets native win32. It links against msvcrt (not a Posix emulation library like Cygwin does). This is a bit hypothetical because I didn't yet hack the autoconf build process for native win32 with mingw32.
Currently, you cannot build a complete Python with mingw32, but you *can* build extension modules against an existing Python (compiled with M$ VC++). ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2001-09-28 22:43 Message: Logged In: YES user_id=31435 All else being equal, a system emulating Unix under Windows should strive to make life comfortable for Unix folks. The question is thus whether all else is in fact equal . ---------------------------------------------------------------------- Comment By: Gerhard Häring (ghaering) Date: 2001-09-28 20:37 Message: Logged In: YES user_id=163326 Hmm. I don't like the _d endings at all. But if the policy on win32 is that debug executables and libraries get a "_d" ending, then I'm unsure whether this patch should be applied. I have plans to hack the autoconf madness to build a native win32 Python with mingw32. But that won't be ready by tomorrow. And I don't think that I'll add "_d" endings there for debugging, because that would be inconsistent with the normal autoconf builds on Unix. I'm glad that *I* don't have to decide whether this patch is a Good Thing. Being consistent with the Python win32 build or with GNU (gcc/autoconf). Take your pick :-) ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2001-09-19 05:46 Message: Logged In: YES user_id=31435 FYI, MSVC never adds _d on its own -- Mark Hammond and/or Guido forced it to do that. I don't remember why, but one of them explained it to me long ago and it made good sense at the time . MSVC normally compiles debug and release builds into distinct subdirectories, and uses the same names in both. But *our* MSVC setup forces it to compile both flavors of build directly into the PCbuild directory, so it has to give the resulting DLLs and executables different names (else the second build would overwrite the results of the first build).
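The rule the patch implements can be stated in a few lines. The helper below is hypothetical, purely an illustration of the policy and not the actual distutils code: the '_d' suffix is a convention of the MSVC-style toolchains, so gcc/mingw32 debug builds keep the plain name.

```python
def debug_ext_filename(modname, debug, compiler):
    """Illustrative only: pick an extension-module filename.

    Mirrors the patch's policy: only MSVC-style compilers ('msvc',
    'bcpp') get the '_d' debug suffix; mingw32/gcc builds do not,
    since gdb neither needs nor understands the MSVC convention.
    """
    suffix = "_d" if debug and compiler in ("msvc", "bcpp") else ""
    return modname + suffix + ".pyd"
```

With this rule, "setup.py build --compiler=mingw32 --debug" would produce spam.pyd directly, making the manual rename step described above unnecessary.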
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=462754&group_id=5470 From noreply@sourceforge.net Wed Apr 17 11:21:35 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Wed, 17 Apr 2002 03:21:35 -0700 Subject: [Patches] [ python-Patches-432401 ] unicode encoding error callbacks Message-ID: Patches item #432401, was opened at 2001-06-12 15:43 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=432401&group_id=5470 Category: Core (C code) Group: None Status: Open Resolution: Postponed Priority: 6 Submitted By: Walter Dörwald (doerwalter) Assigned to: M.-A. Lemburg (lemburg) Summary: unicode encoding error callbacks Initial Comment: This patch adds unicode error handling callbacks to the encode functionality. With this patch it's possible to not only pass 'strict', 'ignore' or 'replace' as the errors argument to encode, but also a callable function, that will be called with the encoding name, the original unicode object and the position of the unencodable character. The callback must return a replacement unicode object that will be encoded instead of the original character. For example replacing unencodable characters with XML character references can be done in the following way. u"aäoöuüß".encode( "ascii", lambda enc, uni, pos: u"&#x%x;" % ord(uni[pos]) ) ---------------------------------------------------------------------- >Comment By: Walter Dörwald (doerwalter) Date: 2002-04-17 12:21 Message: Logged In: YES user_id=89016 Another note: the patch will change the meaning of charmap encoding slightly: currently "replace" will put a ? into the output, even if ? is not in the mapping, i.e. codecs.charmap_encode(u"c", "replace", {ord("a"): ord("b")}) will return ('?', 1). With the patch the above example will raise an exception.
Of course with the patch many more replace characters can appear, so it is vital that the mapping is applied to the replacement string, too. Is this semantic change OK? (I guess all of the existing codecs have a mapping ord("?")->ord("?")) ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-03-15 18:19 Message: Logged In: YES user_id=89016 So this means that the encoder can collect illegal characters and pass them to the callback. "replace" will replace this with (end-start)*u"?". Decoders don't collect all illegal byte sequences, but call the callback once for every byte sequence that has been found illegal, and "replace" will replace it with u"?". Does this make sense? ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-03-15 18:06 Message: Logged In: YES user_id=89016 For encoding it's always (end-start)*u"?": >>> u"ää".encode("ascii", "replace") '??' But for decoding, it is neither of these: >>> "\Ux\U".decode("unicode-escape", "replace") u'\ufffd\ufffd' i.e. a sequence of 5 illegal characters was replaced by two replacement characters. This might mean that decoders can't collect all the illegal characters and call the callback once. They might have to call the callback for every single illegal byte sequence to get the old behaviour. (It seems that this patch would be much, much simpler, if we only change the encoders) ---------------------------------------------------------------------- Comment By: M.-A. Lemburg (lemburg) Date: 2002-03-08 19:36 Message: Logged In: YES user_id=38388 Hmm, whatever it takes to maintain backwards compatibility. Do you have an example ? ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-03-08 18:31 Message: Logged In: YES user_id=89016 What should replace do: Return u"?" or (end-start)*u"?"
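The encoding half of that question is easy to check against the behaviour that was eventually shipped (and is still CPython's today): on encoding, "replace" substitutes one '?' per unencodable character, i.e. (end-start)*u"?":

```python
# One replacement character per unencodable input character.
s = u"\xe4\xe4"                      # two unencodable characters ("ää")
out = s.encode("ascii", "replace")
assert out == b"??"
assert len(out) == len(s)
```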
---------------------------------------------------------------------- Comment By: M.-A. Lemburg (lemburg) Date: 2002-03-08 16:15 Message: Logged In: YES user_id=38388 Sounds like a good idea. Please keep the encoder and decoder APIs symmetric, though, ie. add the slice information to both APIs. The slice should use the same format as Python's standard slices, that is left inclusive, right exclusive. I like the highlighting feature ! ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-03-08 00:09 Message: Logged In: YES user_id=89016 I'm thinking about extending the API a little bit: Consider the following example: >>> "\u1".decode("unicode-escape") Traceback (most recent call last): File "", line 1, in ? UnicodeError: encoding 'unicodeescape' can't decode byte 0x31 in position 2: truncated \uXXXX escape The error message is a lie: the problem is not the '1' in position 2, but the complete truncated sequence '\u1'. For this the decoder should pass a start and an end position to the handler. For encoding this would be useful too: Suppose I want to have an encoder that colors the unencodable character via ANSI escape sequences. Then I could do the following: >>> import codecs >>> def color(enc, uni, pos, why, sta): ... return (u"\033[1m<%d>\033[0m" % ord(uni[pos]), pos+1) ... >>> codecs.register_unicodeencodeerrorhandler("color", color) >>> u"aäüöo".encode("ascii", "color") 'a\x1b[1m<228>\x1b[0m\x1b[1m<252>\x1b[0m\x1b[1m<246>\x1b[0mo' But here the sequences "\x1b[0m\x1b[1m" are not needed. To fix this problem the encoder could collect as many unencodable characters as possible and pass those to the error callback in one go (passing a start and end+1 position). This fixes the above problem and reduces the number of calls to the callback, so it should speed up the algorithms in case of custom encoding names. (And it makes the implementation very interesting ;)) What do you think?
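The start/end extension proposed here is essentially what the final API (PEP 293, today's codecs.register_error) provides: the handler receives the exception object, whose .start and .end delimit the whole run of unencodable characters, and returns a replacement plus the position to resume at. A present-day version of the XML character reference handler, with a hypothetical registry name:

```python
import codecs

def xmlreplace(exc):
    # exc.object[exc.start:exc.end] spans the entire contiguous run
    # of unencodable characters, not just a single position.
    if not isinstance(exc, UnicodeEncodeError):
        raise exc
    refs = u"".join(u"&#%d;" % ord(ch)
                    for ch in exc.object[exc.start:exc.end])
    return (refs, exc.end)

codecs.register_error("demo-xmlreplace", xmlreplace)

encoded = u"a\xe4\xfco".encode("ascii", "demo-xmlreplace")
# encoded == b"a&#228;&#252;o"
```

Because the handler sees the whole run, adjacent unencodable characters need only one callback invocation, which is exactly the speed argument made above.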
---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-03-07 02:29 Message: Logged In: YES user_id=89016 I started from scratch, and the current state is this: Encoding mostly works (except that I haven't changed TranslateCharmap and EncodeDecimal yet) and most of the decoding stuff works (DecodeASCII and DecodeCharmap are still unchanged) and the decoding callback helper isn't optimized for the "builtin" names yet (i.e. it still calls the handler). For encoding the callback helper knows how to handle "strict", "replace", "ignore" and "xmlcharrefreplace" itself and won't call the callback. This should make the encoder fast enough. As callback name string comparison results are cached it might even be faster than the original. The patch so far didn't require any changes to unicodeobject.h, stringobject.h or stringobject.c ---------------------------------------------------------------------- Comment By: M.-A. Lemburg (lemburg) Date: 2002-03-05 17:49 Message: Logged In: YES user_id=38388 Walter, are you making any progress on the new scheme we discussed on the mailing list (adding an error handler registry much like the codec registry itself instead of trying to redo the complete codec API) ? ---------------------------------------------------------------------- Comment By: M.-A. Lemburg (lemburg) Date: 2001-09-20 12:38 Message: Logged In: YES user_id=38388 I am postponing this patch until the PEP process has started. This feature won't make it into Python 2.2. Walter, you may want to reference this patch in the PEP. ---------------------------------------------------------------------- Comment By: M.-A. Lemburg (lemburg) Date: 2001-08-16 12:53 Message: Logged In: YES user_id=38388 I think we ought to summarize these changes in a PEP to get some more feedback and testing from others as well. I'll look into this after I'm back from vacation on the 10.09. 
Given the release schedule I am not sure whether this feature will make it into 2.2. The size of the patch is huge and probably needs a lot of testing first. ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2001-07-27 05:55 Message: Logged In: YES user_id=89016 Changing the decoding API is done now. There are new functions codec.register_unicodedecodeerrorhandler and codec.lookup_unicodedecodeerrorhandler. Only the standard handlers for 'strict', 'ignore' and 'replace' are preregistered. There may be many reasons for decoding errors in the byte string, so I added an additional argument to the decoding API: reason, which gives the reason for the failure, e.g.: >>> "\U1111111".decode("unicode_escape") Traceback (most recent call last): File "", line 1, in ? UnicodeError: encoding 'unicodeescape' can't decode byte 0x31 in position 8: truncated \UXXXXXXXX escape >>> "\U11111111".decode("unicode_escape") Traceback (most recent call last): File "", line 1, in ? UnicodeError: encoding 'unicodeescape' can't decode byte 0x31 in position 9: illegal Unicode character For symmetry I added this to the encoding API too: >>> u"\xff".encode("ascii") Traceback (most recent call last): File "", line 1, in ? UnicodeError: encoding 'ascii' can't decode byte 0xff in position 0: ordinal not in range(128) The parameters passed to the callbacks now are: encoding, unicode, position, reason, state. The encoding and decoding API for strings has been adapted too, so now the new API should be usable everywhere: >>> unicode("a\xffb\xffc", "ascii", ... lambda enc, uni, pos, rea, sta: (u"", pos+1)) u'abc' >>> "a\xffb\xffc".decode("ascii", ... lambda enc, uni, pos, rea, sta: (u"", pos+1)) u'abc' I had a problem with the decoding API: all the functions in _codecsmodule.c used the t# format specifier. I changed that to O! 
with &PyString_Type, because otherwise we would have the problem that the decoding API would have to pass buffer objects around instead of strings, and the callback would have to call str() on the buffer anyway to access a specific character, so this wouldn't be any faster than calling str() on the buffer before decoding. It seems that buffers aren't used anyway. I changed all the old functions to call the new ones so bugfixes don't have to be done in two places. There are two exceptions: I didn't change PyString_AsEncodedString and PyString_AsDecodedString because they are documented as deprecated anyway (although they are called in a few spots). This means that I duplicated part of their functionality in PyString_AsEncodedObjectEx and PyString_AsDecodedObjectEx. There are still a few spots that call the old API: e.g. PyString_Format still calls PyUnicode_Decode (but with strict decoding) because it passes the rest of the format string to PyUnicode_Format when it encounters a Unicode object. Should we switch to the new API everywhere even if strict encoding/decoding is used? The size of this patch begins to scare me. I guess we need an extensive test script for all the new features and documentation. I hope you have time to do that, as I'll be busy with other projects in the next weeks. (BTW, I haven't touched PyUnicode_TranslateCharmap yet.) ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2001-07-23 19:03 Message: Logged In: YES user_id=89016 New version of the patch with the error handling callback registry. > > OK, done, now there's a > > PyCodec_EscapeReplaceUnicodeEncodeErrors/ > > codecs.escapereplace_unicodeencode_errors > > that uses \u (or \U if x>0xffff (with a wide build > > of Python)). > > Great! Now PyCodec_EscapeReplaceUnicodeEncodeErrors uses \x in addition to \u and \U where appropriate. > > [...]
> > But for special one-shot error handlers, it might still be > > useful to pass the error handler directly, so maybe we > > should leave error as PyObject *, but implement the > > registry anyway? > > Good idea ! > > One minor nit: codecs.registerError() should be named > codecs.register_errorhandler() to be more in line with > the Python coding style guide. OK, but these functions are specific to unicode encoding, so now the functions are called: codecs.register_unicodeencodeerrorhandler codecs.lookup_unicodeencodeerrorhandler Now all callbacks (including the new ones: "xmlcharrefreplace" and "escapereplace") are registered in the codecs.c/_PyCodecRegistry_Init so using them is really simple: u"gürk".encode("ascii", "xmlcharrefreplace") ---------------------------------------------------------------------- Comment By: M.-A. Lemburg (lemburg) Date: 2001-07-13 13:26 Message: Logged In: YES user_id=38388 > > > > > BTW, I guess PyUnicode_EncodeUnicodeEscape > > > > > could be reimplemented as PyUnicode_EncodeASCII > > > > > with \uxxxx replacement callback. > > > > > > > > Hmm, wouldn't that result in a slowdown ? If so, > > > > I'd rather leave the special encoder in place, > > > > since it is being used a lot in Python and > > > > probably some applications too. > > > > > > It would be a slowdown. But callbacks open many > > > possibilities. > > > > True, but in this case I believe that we should stick with > > the native implementation for "unicode-escape". Having > > a standard callback error handler which does the \uXXXX > > replacement would be nice to have though, since this would > > also be usable with lots of other codecs (e.g. all the > > code page ones). > > OK, done, now there's a > PyCodec_EscapeReplaceUnicodeEncodeErrors/ > codecs.escapereplace_unicodeencode_errors > that uses \u (or \U if x>0xffff (with a wide build > of Python)). Great ! > > [...]
> > > Should the old TranslateCharmap map to the new > > > TranslateCharmapEx and inherit the > > > "multicharacter replacement" feature, > > > or should I leave it as it is? > > > > If possible, please also add the multichar replacement > > to the old API. I think it is very useful and since the > > old APIs work on raw buffers it would be a benefit to have > > the functionality in the old implementation too. > > OK! I will try to find the time to implement that in the > next days. Good. > > [Decoding error callbacks] > > > > About the return value: > > > > I'd suggest to always use the same tuple interface, e.g. > > > > callback(encoding, input_data, input_position, > state) -> > > (output_to_be_appended, new_input_position) > > > > (I think it's better to use absolute values for the > > position rather than offsets.) > > > > Perhaps the encoding callbacks should use the same > > interface... what do you think ? > > This would make the callback feature hypergeneric and a > little slower, because tuples have to be created, but it > (almost) unifies the encoding and decoding API. ("almost" > because, for the encoder output_to_be_appended will be > reencoded, for the decoder it will simply be appended.), > so I'm for it. That's the point. Note that I don't think the tuple creation will hurt much (see the make_tuple() API in codecs.c) since small tuples are cached by Python internally. > I implemented this and changed the encoders to only > lookup the error handler on the first error. The UCS1 > encoder now no longer uses the two-item stack strategy. > (This strategy only makes sense for those encoder where > the encoding itself is much more complicated than the > looping/callback etc.) So now memory overflow tests are > only done, when an unencodable error occurs, so now the > UCS1 encoder should be as fast as it was without > error callbacks. > > Do we want to enforce new_input_position>input_position, > or should jumping back be allowed? 
No; moving backwards should be allowed (this may be useful in order to resynchronize with the input data).

> Here is the current todo list:
> 1. implement a new TranslateCharmap and fix the old.
> 2. New encoding API for string objects too.
> 3. Decoding
> 4. Documentation
> 5. Test cases
>
> I'm thinking about a different strategy for implementing callbacks (see http://mail.python.org/pipermail/i18n-sig/2001-July/001262.html)
>
> We could have an error handler registry, which maps names to error handlers; then it would be possible to keep the errors argument as "const char *" instead of "PyObject *". Currently PyCodec_UnicodeEncodeHandlerForObject is a backwards compatibility hack that will never go away, because it's always more convenient to type
>     u"...".encode("...", "strict")
> instead of
>     import codecs
>     u"...".encode("...", codecs.raise_encode_errors)
>
> But with an error handler registry this function would become the official lookup method for error handlers. (PyCodec_LookupUnicodeEncodeErrorHandler?) Python code would look like this:
> ---
> def xmlreplace(encoding, uni, pos, state):
>     return (u"&#%d;" % ord(uni[pos]), pos+1)
>
> import codecs
>
> codecs.registerError("xmlreplace", xmlreplace)
> ---
> and then the following call can be made:
>     u"äöü".encode("ascii", "xmlreplace")
> As soon as the first error is encountered, the encoder uses its builtin error handling method if it recognizes the name ("strict", "replace" or "ignore") or looks up the error handling function in the registry if it doesn't. In this way the speed for the backwards compatible features is the same as before and "const char *error" can be kept as the parameter to all encoding functions. For speed, common error handling names could even be implemented in the encoder itself.
> > But for special one-shot error handlers, it might still be > useful to pass the error handler directly, so maybe we > should leave error as PyObject *, but implement the > registry anyway? Good idea ! One minor nit: codecs.registerError() should be named codecs.register_errorhandler() to be more inline with the Python coding style guide. ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2001-07-12 13:03 Message: Logged In: YES user_id=89016 > > [...] > > so I guess we could change the replace handler > > to always return u'?'. This would make the > > implementation a little bit simpler, but the > > explanation of the callback feature *a lot* > > simpler. > > Go for it. OK, done! > [...] > > > Could you add these docs to the Misc/unicode.txt > > > file ? I will eventually take that file and turn > > > it into a PEP which will then serve as general > > > documentation for these things. > > > > I could, but first we should work out how the > > decoding callback API will work. > > Ok. BTW, Barry Warsaw already did the work of converting > the unicode.txt to PEP 100, so the docs should eventually > go there. OK. I guess it would be best to do this when everything is finished. > > > > BTW, I guess PyUnicode_EncodeUnicodeEscape > > > > could be reimplemented as PyUnicode_EncodeASCII > > > > with \uxxxx replacement callback. > > > > > > Hmm, wouldn't that result in a slowdown ? If so, > > > I'd rather leave the special encoder in place, > > > since it is being used a lot in Python and > > > probably some applications too. > > > > It would be a slowdown. But callbacks open many > > possiblities. > > True, but in this case I believe that we should stick with > the native implementation for "unicode-escape". Having > a standard callback error handler which does the \uXXXX > replacement would be nice to have though, since this would > also be usable with lots of other codecs (e.g. all the > code page ones). 
OK, done, now there's a PyCodec_EscapeReplaceUnicodeEncodeErrors/ codecs.escapereplace_unicodeencode_errors that uses \u (or \U if x>0xffff (with a wide build of Python)). > > For example: > > > > Why can't I print u"gürk"? > > > > is probably one of the most frequently asked > > questions in comp.lang.python. For printing > > Unicode stuff, print could be extended the use an > > error handling callback for Unicode strings (or > > objects where __str__ or tp_str returns a Unicode > > object) instead of using str() which always > > returns an 8bit string and uses strict encoding. > > There might even be a > > sys.setprintencodehandler()/sys.getprintencodehandler () > > There already is a print callback in Python (forgot the > name of the hook though), so this should be possible by > providing the encoding logic in the hook. True: sys.displayhook > [...] > > Should the old TranslateCharmap map to the new > > TranslateCharmapEx and inherit the > > "multicharacter replacement" feature, > > or should I leave it as it is? > > If possible, please also add the multichar replacement > to the old API. I think it is very useful and since the > old APIs work on raw buffers it would be a benefit to have > the functionality in the old implementation too. OK! I will try to find the time to implement that in the next days. > [Decoding error callbacks] > > About the return value: > > I'd suggest to always use the same tuple interface, e.g. > > callback(encoding, input_data, input_position, state) -> > (output_to_be_appended, new_input_position) > > (I think it's better to use absolute values for the > position rather than offsets.) > > Perhaps the encoding callbacks should use the same > interface... what do you think ? This would make the callback feature hypergeneric and a little slower, because tuples have to be created, but it (almost) unifies the encoding and decoding API. 
("almost" because, for the encoder, output_to_be_appended will be reencoded, while for the decoder it will simply be appended.), so I'm for it.

I implemented this and changed the encoders to look up the error handler only on the first error. The UCS1 encoder now no longer uses the two-item stack strategy. (This strategy only makes sense for those encoders where the encoding itself is much more complicated than the looping/callback etc.) So memory overflow tests are now only done when an unencodable error occurs, and the UCS1 encoder should be as fast as it was without error callbacks.

Do we want to enforce new_input_position>input_position, or should jumping back be allowed?

> > > > One additional note: It is vital that errors is an assignable attribute of the StreamWriter.
> > >
> > > It is already !
> >
> > I know, but IMHO it should be documented that an assignable errors attribute must be supported as part of the official codec API.
> >
> > Misc/unicode.txt is not clear on that:
> > """
> > It is not required by the Unicode implementation to use these base classes, only the interfaces must match; this allows writing Codecs as extension types.
> > """
>
> Good point. I'll add that to the PEP 100.

OK.

Here is the current todo list:
1. implement a new TranslateCharmap and fix the old.
2. New encoding API for string objects too.
3. Decoding
4. Documentation
5. Test cases

I'm thinking about a different strategy for implementing callbacks (see http://mail.python.org/pipermail/i18n-sig/2001-July/001262.html)

We could have an error handler registry, which maps names to error handlers; then it would be possible to keep the errors argument as "const char *" instead of "PyObject *".
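For reference, a registry mapping names to error handlers is exactly what later shipped (PEP 293): codecs.register_error() stores a handler and codecs.lookup_error() is the official lookup function, with the builtin names going through the same mechanism. A minimal sketch of the lookup side on a modern Python:

```python
import codecs

# Builtin handlers are reachable by name through the registry, so callers
# need no special-casing for "strict", "ignore" and "replace".
replace = codecs.lookup_error("replace")

# Handlers take the exception object and return (replacement, resume_pos).
exc = UnicodeEncodeError("ascii", "gürk", 1, 2, "ordinal not in range(128)")
print(replace(exc))  # ('?', 2): replacement text and resume position
```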
Currently PyCodec_UnicodeEncodeHandlerForObject is a backwards compatibility hack that will never go away, because it's always more convenient to type

    u"...".encode("...", "strict")

instead of

    import codecs
    u"...".encode("...", codecs.raise_encode_errors)

But with an error handler registry this function would become the official lookup method for error handlers. (PyCodec_LookupUnicodeEncodeErrorHandler?) Python code would look like this:

---
def xmlreplace(encoding, uni, pos, state):
    return (u"&#%d;" % ord(uni[pos]), pos+1)

import codecs

codecs.registerError("xmlreplace", xmlreplace)
---

and then the following call can be made:

    u"äöü".encode("ascii", "xmlreplace")

As soon as the first error is encountered, the encoder uses its builtin error handling method if it recognizes the name ("strict", "replace" or "ignore") or looks up the error handling function in the registry if it doesn't. In this way the speed for the backwards compatible features is the same as before and "const char *error" can be kept as the parameter to all encoding functions. For speed, common error handling names could even be implemented in the encoder itself.

But for special one-shot error handlers, it might still be useful to pass the error handler directly, so maybe we should leave error as PyObject *, but implement the registry anyway?

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2001-07-10 14:29

Message:
Logged In: YES
user_id=38388

Ok, here we go...

> > > raise an exception). U+FFFD characters in the replacement string will be replaced with a character that the encoder chooses ('?' in all cases).
> >
> > Nice.
>
> But the special casing of U+FFFD makes the interface somewhat less clean than it could be. It was only done to be 100% backwards compatible. With the original "replace" error handling the codec chose the replacement character.
But as > far as I can tell none of the codecs uses anything other > than '?', True. > so I guess we could change the replace handler > to always return u'?'. This would make the implementation a > little bit simpler, but the explanation of the callback > feature *a lot* simpler. Go for it. > And if you still want to handle > an unencodable U+FFFD, you can write a special callback for > that, e.g. > > def FFFDreplace(enc, uni, pos): > if uni[pos] == "\ufffd": > return u"?" > else: > raise UnicodeError(...) > > > ...docs... > > > > Could you add these docs to the Misc/unicode.txt file ? I > > will eventually take that file and turn it into a PEP > which > > will then serve as general documentation for these things. > > I could, but first we should work out how the decoding > callback API will work. Ok. BTW, Barry Warsaw already did the work of converting the unicode.txt to PEP 100, so the docs should eventually go there. > > > BTW, I guess PyUnicode_EncodeUnicodeEscape could be > > > reimplemented as PyUnicode_EncodeASCII with a \uxxxx > > > replacement callback. > > > > Hmm, wouldn't that result in a slowdown ? If so, I'd > rather > > leave the special encoder in place, since it is being > used a > > lot in Python and probably some applications too. > > It would be a slowdown. But callbacks open many > possiblities. True, but in this case I believe that we should stick with the native implementation for "unicode-escape". Having a standard callback error handler which does the \uXXXX replacement would be nice to have though, since this would also be usable with lots of other codecs (e.g. all the code page ones). > For example: > > Why can't I print u"gürk"? > > is probably one of the most frequently asked questions in > comp.lang.python. 
For printing Unicode stuff, print could be > extended the use an error handling callback for Unicode > strings (or objects where __str__ or tp_str returns a > Unicode object) instead of using str() which always returns > an 8bit string and uses strict encoding. There might even > be a > sys.setprintencodehandler()/sys.getprintencodehandler() There already is a print callback in Python (forgot the name of the hook though), so this should be possible by providing the encoding logic in the hook. > > > I have not touched PyUnicode_TranslateCharmap yet, > > > should this function also support error callbacks? Why > > > would one want the insert None into the mapping to > call > > > the callback? > > > > 1. Yes. > > 2. The user may want to e.g. restrict usage of certain > > character ranges. In this case the codec would be used to > > verify the input and an exception would indeed be useful > > (e.g. say you want to restrict input to Hangul + ASCII). > > OK, do we want TranslateCharmap to work exactly like > encoding, > i.e. in case of an error should the returned replacement > string again be mapped through the translation mapping or > should it be copied to the output directly? The former would > be more in line with encoding, but IMHO the latter would > be much more useful. It's better to take the second approach (copy the callback output directly to the output string) to avoid endless recursion and other pitfalls. I suppose this will also simplify the implementation somewhat. > BTW, when I implement it I can implement patch #403100 > ("Multicharacter replacements in > PyUnicode_TranslateCharmap") > along the way. I've seen it; will comment on it later. > Should the old TranslateCharmap map to the new > TranslateCharmapEx > and inherit the "multicharacter replacement" feature, > or > should I leave it as it is? If possible, please also add the multichar replacement to the old API. 
I think it is very useful and since the old APIs work on raw buffers it would be a benefit to have the functionality in the old implementation too. [Decoding error callbacks] > > > A remaining problem is how to implement decoding error > > > callbacks. In Python 2.1 encoding and decoding errors > are > > > handled in the same way with a string value. But with > > > callbacks it doesn't make sense to use the same > callback > > > for encoding and decoding (like > codecs.StreamReaderWriter > > > and codecs.StreamRecoder do). Decoding callbacks have > a > > > different API. Which arguments should be passed to the > > > decoding callback, and what is the decoding callback > > > supposed to do? > > > > I'd suggest adding another set of PyCodec_UnicodeDecode... > () > > APIs for this. We'd then have to augment the base classes > of > > the StreamCodecs to provide two attributes for .errors > with > > a fallback solution for the string case (i.s. "strict" > can > > still be used for both directions). > > Sounds good. Now what is the decoding callback supposed to > do? > I guess it will be called in the same way as the encoding > callback, i.e. with encoding name, original string and > position of the error. It might returns a Unicode string > (i.e. an object of the decoding target type), that will be > emitted from the codec instead of the one offending byte. Or > it might return a tuple with replacement Unicode object and > a resynchronisation offset, i.e. returning (u"?", 1) > means > emit a '?' and skip the offending character. But to make > the offset really useful the callback has to know something > about the encoding, perhaps the codec should be allowed to > pass an additional state object to the callback? > > Maybe the same should be added to the encoding callbacks to? 
> Maybe the encoding callback should be able to tell the > encoder if the replacement returned should be reencoded > (in which case it's a Unicode object), or directly emitted > (in which case it's an 8bit string)? I like the idea of having an optional state object (basically this should be a codec-defined arbitrary Python object) which then allow the callback to apply additional tricks. The object should be documented to be modifyable in place (simplifies the interface). About the return value: I'd suggest to always use the same tuple interface, e.g. callback(encoding, input_data, input_position, state) -> (output_to_be_appended, new_input_position) (I think it's better to use absolute values for the position rather than offsets.) Perhaps the encoding callbacks should use the same interface... what do you think ? > > > One additional note: It is vital that errors is an > > > assignable attribute of the StreamWriter. > > > > It is already ! > > I know, but IMHO it should be documented that an assignable > errors attribute must be supported as part of the official > codec API. > > Misc/unicode.txt is not clear on that: > """ > It is not required by the Unicode implementation to use > these base classes, only the interfaces must match; this > allows writing Codecs as extension types. > """ Good point. I'll add that to the PEP 100. ---------------------------------------------------------------------- Comment By: M.-A. Lemburg (lemburg) Date: 2001-06-22 22:51 Message: Logged In: YES user_id=38388 Sorry to keep you waiting, Walter. I will look into this again next week -- this week was way too busy... ---------------------------------------------------------------------- Comment By: M.-A. Lemburg (lemburg) Date: 2001-06-13 19:00 Message: Logged In: YES user_id=38388 On your comment about the non-Unicode codecs: let's keep this separated from the current patch. Don't have much time today. I'll comment on the other things tomorrow. 
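The tuple interface agreed on above -- callback(encoding, input_data, input_position, state) -> (output_to_be_appended, new_input_position) -- can be illustrated with a pure-Python toy ASCII encoder. This is a sketch of the proposed semantics only, not the C implementation; the function names are illustrative:

```python
def xmlreplace(encoding, text, pos, state):
    # Absolute input position in, (replacement, new absolute position) out.
    return "&#%d;" % ord(text[pos]), pos + 1

def toy_ascii_encode(text, callback, state=None):
    out, pos = [], 0
    while pos < len(text):
        if ord(text[pos]) < 128:
            out.append(text[pos])
            pos += 1
        else:
            # Delegate the unencodable position to the callback, which also
            # decides where to resume (moving backwards is permitted).
            replacement, pos = callback("ascii", text, pos, state)
            out.append(replacement)  # the real encoder re-encodes this
    return "".join(out).encode("ascii")

print(toy_ascii_encode("gürk", xmlreplace))  # b'g&#252;rk'
```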
---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2001-06-13 17:49 Message: Logged In: YES user_id=89016 Guido van Rossum wrote in python-dev: > True, the "codec" pattern can be used for other > encodings than Unicode. But it seems to me that the > entire codecs architecture is rather strongly geared > towards en/decoding Unicode, and it's not clear > how well other codecs fit in this pattern (e.g. I > noticed that all the non-Unicode codecs ignore the > error handling parameter or assert that > it is set to 'strict'). I noticed that too. asserting that errors=='strict' would mean that the encoder is not able to deal in any other way with unencodable stuff than by raising an error. But that is not the problem here, because for zlib, base64, quopri, hex and uu encoding there can be no unencodable characters. The encoders can simply ignore the errors parameter. Should I remove the asserts from those codecs and change the docstrings accordingly, or will this be done separately? ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2001-06-13 15:57 Message: Logged In: YES user_id=89016 > > [...] > > raise an exception). U+FFFD characters in the replacement > > string will be replaced with a character that the encoder > > chooses ('?' in all cases). > > Nice. But the special casing of U+FFFD makes the interface somewhat less clean than it could be. It was only done to be 100% backwards compatible. With the original "replace" error handling the codec chose the replacement character. But as far as I can tell none of the codecs uses anything other than '?', so I guess we could change the replace handler to always return u'?'. This would make the implementation a little bit simpler, but the explanation of the callback feature *a lot* simpler. And if you still want to handle an unencodable U+FFFD, you can write a special callback for that, e.g. 
def FFFDreplace(enc, uni, pos):
    if uni[pos] == u"\ufffd":
        return u"?"
    else:
        raise UnicodeError(...)

> > The implementation of the loop through the string is done in the following way. A stack with two strings is kept and the loop always encodes a character from the string at the stacktop. If an error is encountered and the stack has only one entry (during encoding of the original string) the callback is called and the unicode object returned is pushed on the stack, so the encoding continues with the replacement string. If the stack has two entries when an error is encountered, the replacement string itself has an unencodable character and a normal exception is raised. When the encoder has reached the end of its current string there are two possibilities: when the stack contains two entries, this was the replacement string, so the replacement string will be popped from the stack and encoding continues with the next character from the original string. If the stack had only one entry, encoding is finished.
>
> Very elegant solution !

I'll put it as a comment in the source.

> > (I hope that's enough explanation of the API and implementation)
>
> Could you add these docs to the Misc/unicode.txt file ? I will eventually take that file and turn it into a PEP which will then serve as general documentation for these things.

I could, but first we should work out how the decoding callback API will work.

> > I have renamed the static ...121 function to all lowercase names.
>
> Ok.
>
> > BTW, I guess PyUnicode_EncodeUnicodeEscape could be reimplemented as PyUnicode_EncodeASCII with a \uxxxx replacement callback.
>
> Hmm, wouldn't that result in a slowdown ? If so, I'd rather leave the special encoder in place, since it is being used a lot in Python and probably some applications too.

It would be a slowdown. But callbacks open many possibilities.

For example:

Why can't I print u"gürk"?
is probably one of the most frequently asked questions in comp.lang.python. For printing Unicode stuff, print could be extended the use an error handling callback for Unicode strings (or objects where __str__ or tp_str returns a Unicode object) instead of using str() which always returns an 8bit string and uses strict encoding. There might even be a sys.setprintencodehandler()/sys.getprintencodehandler() > [...] > I think it would be worthwhile to rename the callbacks to > include "Unicode" somewhere, e.g. > PyCodec_UnicodeReplaceEncodeErrors(). It's a long name, but > then it points out the application field of the callback > rather well. Same for the callbacks exposed through the > _codecsmodule. OK, done (and PyCodec_XMLCharRefReplaceUnicodeEncodeErrors really is a long name ;)) > > I have not touched PyUnicode_TranslateCharmap yet, > > should this function also support error callbacks? Why > > would one want the insert None into the mapping to call > > the callback? > > 1. Yes. > 2. The user may want to e.g. restrict usage of certain > character ranges. In this case the codec would be used to > verify the input and an exception would indeed be useful > (e.g. say you want to restrict input to Hangul + ASCII). OK, do we want TranslateCharmap to work exactly like encoding, i.e. in case of an error should the returned replacement string again be mapped through the translation mapping or should it be copied to the output directly? The former would be more in line with encoding, but IMHO the latter would be much more useful. BTW, when I implement it I can implement patch #403100 ("Multicharacter replacements in PyUnicode_TranslateCharmap") along the way. Should the old TranslateCharmap map to the new TranslateCharmapEx and inherit the "multicharacter replacement" feature, or should I leave it as it is? > > A remaining problem is how to implement decoding error > > callbacks. 
In Python 2.1 encoding and decoding errors are > > handled in the same way with a string value. But with > > callbacks it doesn't make sense to use the same callback > > for encoding and decoding (like codecs.StreamReaderWriter > > and codecs.StreamRecoder do). Decoding callbacks have a > > different API. Which arguments should be passed to the > > decoding callback, and what is the decoding callback > > supposed to do? > > I'd suggest adding another set of PyCodec_UnicodeDecode... () > APIs for this. We'd then have to augment the base classes of > the StreamCodecs to provide two attributes for .errors with > a fallback solution for the string case (i.s. "strict" can > still be used for both directions). Sounds good. Now what is the decoding callback supposed to do? I guess it will be called in the same way as the encoding callback, i.e. with encoding name, original string and position of the error. It might returns a Unicode string (i.e. an object of the decoding target type), that will be emitted from the codec instead of the one offending byte. Or it might return a tuple with replacement Unicode object and a resynchronisation offset, i.e. returning (u"?", 1) means emit a '?' and skip the offending character. But to make the offset really useful the callback has to know something about the encoding, perhaps the codec should be allowed to pass an additional state object to the callback? Maybe the same should be added to the encoding callbacks to? Maybe the encoding callback should be able to tell the encoder if the replacement returned should be reencoded (in which case it's a Unicode object), or directly emitted (in which case it's an 8bit string)? > > One additional note: It is vital that errors is an > > assignable attribute of the StreamWriter. > > It is already ! I know, but IMHO it should be documented that an assignable errors attribute must be supported as part of the official codec API. 
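The assignable errors attribute argued for here did become part of the stream codec API. The XML-flavored sketch below switches handlers mid-stream, as in the DOM-writing example from this thread; the buffer contents and handler choice are illustrative:

```python
import codecs
import io

buf = io.BytesIO()
writer = codecs.getwriter("ascii")(buf, errors="xmlcharrefreplace")

writer.write("text: gürk")   # unencodable characters become charrefs
writer.errors = "strict"     # switch handling, e.g. before a comment
try:
    writer.write("<!-- gürk -->")
except UnicodeEncodeError:
    # nothing is written: encoding fails before the stream is touched
    print("unencodable character inside a comment")

print(buf.getvalue())  # b'text: g&#252;rk'
```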
Misc/unicode.txt is not clear on that:
"""
It is not required by the Unicode implementation to use these base classes, only the interfaces must match; this allows writing Codecs as extension types.
"""

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2001-06-13 10:05

Message:
Logged In: YES
user_id=38388

> How the callbacks work:
>
> A PyObject * named errors is passed in. This may be NULL, Py_None, 'strict', u'strict', 'ignore', u'ignore', 'replace', u'replace' or a callable object. PyCodec_EncodeHandlerForObject maps all of these objects to one of the three builtin error callbacks: PyCodec_RaiseEncodeErrors (raises an exception), PyCodec_IgnoreEncodeErrors (returns an empty replacement string, in effect ignoring the error), PyCodec_ReplaceEncodeErrors (returns U+FFFD, the Unicode replacement character, to signify to the encoder that it should choose a suitable replacement character) or directly returns errors if it is a callable object. When an unencodable character is encountered the error handling callback will be called with the encoding name, the original unicode object and the error position, and must return a unicode object that will be encoded instead of the offending character (or the callback may of course raise an exception). U+FFFD characters in the replacement string will be replaced with a character that the encoder chooses ('?' in all cases).

Nice.

> The implementation of the loop through the string is done in the following way. A stack with two strings is kept and the loop always encodes a character from the string at the stacktop. If an error is encountered and the stack has only one entry (during encoding of the original string) the callback is called and the unicode object returned is pushed on the stack, so the encoding continues with the replacement string.
> If the stack has two entries when an error is encountered, the replacement string itself has an unencodable character and a normal exception is raised. When the encoder has reached the end of its current string there are two possibilities: when the stack contains two entries, this was the replacement string, so the replacement string will be popped from the stack and encoding continues with the next character from the original string. If the stack had only one entry, encoding is finished.

Very elegant solution !

> (I hope that's enough explanation of the API and implementation)

Could you add these docs to the Misc/unicode.txt file ? I will eventually take that file and turn it into a PEP which will then serve as general documentation for these things.

> I have renamed the static ...121 function to all lowercase names.

Ok.

> BTW, I guess PyUnicode_EncodeUnicodeEscape could be reimplemented as PyUnicode_EncodeASCII with a \uxxxx replacement callback.

Hmm, wouldn't that result in a slowdown ? If so, I'd rather leave the special encoder in place, since it is being used a lot in Python and probably some applications too.

> PyCodec_RaiseEncodeErrors, PyCodec_IgnoreEncodeErrors, PyCodec_ReplaceEncodeErrors are globally visible because they have to be available in _codecsmodule.c to wrap them as Python function objects, but they can't be implemented in _codecsmodule, because they need to be available to the encoders in unicodeobject.c (through PyCodec_EncodeHandlerForObject), but importing the codecs module might result in an endless recursion, because importing a module requires unpickling of the bytecode, which might require decoding utf8, which ... (but this will only happen if we implement the same mechanism for the decoding API)

I think that codecs.c is the right place for these APIs. _codecsmodule.c is only meant as Python access wrapper for the internal codecs and nothing more.
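The two-item stack strategy praised above can be sketched in pure Python. This is a toy ASCII encoder illustrating only the control flow (encode the replacement through the same loop; fail hard if the replacement itself is unencodable), not the unicodeobject.c code; names are illustrative:

```python
def stack_encode(text, handler):
    # At most two entries ever: the original string and, while one is
    # being consumed, the replacement string returned by the handler.
    stack = [(text, 0)]
    out = []
    while stack:
        s, pos = stack.pop()
        while pos < len(s):
            ch = s[pos]
            if ord(ch) < 128:
                out.append(ch)
                pos += 1
                continue
            if stack:
                # Second error while the original is still pending: the
                # replacement string itself is unencodable -> hard failure.
                raise UnicodeError("unencodable replacement %r" % ch)
            stack.append((s, pos + 1))   # resume the original string later
            s, pos = handler(s, pos), 0  # encode the replacement first
        # Falling off the end pops back to the original string (or finishes).
    return "".join(out).encode("ascii")

print(stack_encode("gürk", lambda s, pos: "&#%d;" % ord(s[pos])))  # b'g&#252;rk'
```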
One thing I noted about the callbacks: they assume that they will always get Unicode objects as input. This is certainly not true in the general case (it is for the codecs you touch in the patch). I think it would be worthwhile to rename the callbacks to include "Unicode" somewhere, e.g. PyCodec_UnicodeReplaceEncodeErrors(). It's a long name, but then it points out the application field of the callback rather well. Same for the callbacks exposed through the _codecsmodule. > I have not touched PyUnicode_TranslateCharmap yet, > should this function also support error callbacks? Why would > one want the insert None into the mapping to call the callback? 1. Yes. 2. The user may want to e.g. restrict usage of certain character ranges. In this case the codec would be used to verify the input and an exception would indeed be useful (e.g. say you want to restrict input to Hangul + ASCII). > A remaining problem is how to implement decoding error > callbacks. In Python 2.1 encoding and decoding errors are > handled in the same way with a string value. But with > callbacks it doesn't make sense to use the same callback for > encoding and decoding (like codecs.StreamReaderWriter and > codecs.StreamRecoder do). Decoding callbacks have a > different API. Which arguments should be passed to the > decoding callback, and what is the decoding callback > supposed to do? I'd suggest adding another set of PyCodec_UnicodeDecode...() APIs for this. We'd then have to augment the base classes of the StreamCodecs to provide two attributes for .errors with a fallback solution for the string case (i.s. "strict" can still be used for both directions). > One additional note: It is vital that errors is an > assignable attribute of the StreamWriter. It is already ! > Consider the XML example: For writing an XML DOM tree one > StreamWriter object is used. 
When a text node is written, > the error handling has to be set to > codecs.xmlreplace_encode_errors, but inside a comment or > processing instruction replacing unencodable characters with > charrefs is not possible, so here codecs.raise_encode_errors > should be used (or better a custom error handler that raises > an error that says "sorry, you can't have unencodable > characters inside a comment") Sure. > BTW, should we continue the discussion in the i18n SIG > mailing list? An email program is much more comfortable than > a HTML textarea! ;) I'd rather keep the discussions on this patch here -- forking it off to the i18n sig will make it very hard to follow up on it. (This HTML area is indeed damn small ;-) ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2001-06-12 21:18 Message: Logged In: YES user_id=89016 One additional note: It is vital that errors is an assignable attribute of the StreamWriter. Consider the XML example: For writing an XML DOM tree one StreamWriter object is used. When a text node is written, the error handling has to be set to codecs.xmlreplace_encode_errors, but inside a comment or processing instruction replacing unencodable characters with charrefs is not possible, so here codecs.raise_encode_errors should be used (or better a custom error handler that raises an error that says "sorry, you can't have unencodable characters inside a comment") BTW, should we continue the discussion in the i18n SIG mailing list? An email program is much more comfortable than a HTML textarea! ;) ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2001-06-12 20:59 Message: Logged In: YES user_id=89016 How the callbacks work: A PyObject * named errors is passed in. This may by NULL, Py_None, 'strict', u'strict', 'ignore', u'ignore', 'replace', u'replace' or a callable object. 
PyCodec_EncodeHandlerForObject maps all of these objects to one of the three builtin error callbacks PyCodec_RaiseEncodeErrors (raises an exception), PyCodec_IgnoreEncodeErrors (returns an empty replacement string, in effect ignoring the error), PyCodec_ReplaceEncodeErrors (returns U+FFFD, the Unicode replacement character to signify to the encoder that it should choose a suitable replacement character) or directly returns errors if it is a callable object. When an unencodable character is encountered the error handling callback will be called with the encoding name, the original unicode object and the error position and must return a unicode object that will be encoded instead of the offending character (or the callback may of course raise an exception). U+FFFD characters in the replacement string will be replaced with a character that the encoder chooses ('?' in all cases). The implementation of the loop through the string is done in the following way. A stack with two strings is kept and the loop always encodes a character from the string at the stacktop. If an error is encountered and the stack has only one entry (during encoding of the original string) the callback is called and the unicode object returned is pushed on the stack, so the encoding continues with the replacement string. If the stack has two entries when an error is encountered, the replacement string itself has an unencodable character and a normal exception is raised. When the encoder has reached the end of its current string there are two possibilities: when the stack contains two entries, this was the replacement string, so the replacement string will be popped from the stack and encoding continues with the next character from the original string. If the stack had only one entry, encoding is finished. (I hope that's enough explanation of the API and implementation) I have renamed the static ...121 function to all lowercase names.
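[Editor's note: the two-entry stack loop described in the comment above can be sketched in pure Python. This is an illustration of the algorithm, not the patch itself; `can_encode`, `encode_char`, the hard-coded "ascii" encoding name, and the assumption that the callback replaces a single character are all simplifications introduced here.]

```python
def encode_with_callback(text, can_encode, callback, encode_char):
    # Sketch of the stack-based loop: at most two (string, position)
    # entries -- the original string and, while one is being consumed,
    # the replacement string returned by the error callback.
    out = []
    stack = [(text, 0)]
    while stack:
        s, pos = stack[-1]
        if pos >= len(s):
            stack.pop()                 # current string exhausted
            if stack:                   # it was the replacement string:
                orig, opos = stack.pop()
                stack.append((orig, opos + 1))  # skip the bad character
            continue
        ch = s[pos]
        if can_encode(ch):
            out.append(encode_char(ch))
            stack[-1] = (s, pos + 1)
        elif len(stack) == 1:
            # Error in the original string: ask the callback for a
            # replacement and continue encoding from that string.
            stack.append((callback("ascii", s, pos), 0))
        else:
            # Error inside the replacement string itself.
            raise UnicodeError("unencodable character in replacement string")
    return "".join(out)
```

For example, with an XML-charref callback, `encode_with_callback("aä", lambda c: ord(c) < 128, lambda enc, s, pos: "&#x%x;" % ord(s[pos]), lambda c: c)` yields "a&#xe4;", while a callback that returns another unencodable character triggers the two-entry error branch.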
BTW, I guess PyUnicode_EncodeUnicodeEscape could be reimplemented as PyUnicode_EncodeASCII with a \uxxxx replacement callback. PyCodec_RaiseEncodeErrors, PyCodec_IgnoreEncodeErrors, PyCodec_ReplaceEncodeErrors are globally visible because they have to be available in _codecsmodule.c to wrap them as Python function objects, but they can't be implemented in _codecsmodule, because they need to be available to the encoders in unicodeobject.c (through PyCodec_EncodeHandlerForObject), but importing the codecs module might result in an endless recursion, because importing a module requires unpickling of the bytecode, which might require decoding utf8, which ... (but this will only happen if we implement the same mechanism for the decoding API) I have not touched PyUnicode_TranslateCharmap yet, should this function also support error callbacks? Why would one want to insert None into the mapping to call the callback? A remaining problem is how to implement decoding error callbacks. In Python 2.1 encoding and decoding errors are handled in the same way with a string value. But with callbacks it doesn't make sense to use the same callback for encoding and decoding (like codecs.StreamReaderWriter and codecs.StreamRecoder do). Decoding callbacks have a different API. Which arguments should be passed to the decoding callback, and what is the decoding callback supposed to do? ---------------------------------------------------------------------- Comment By: M.-A. Lemburg (lemburg) Date: 2001-06-12 20:00 Message: Logged In: YES user_id=38388 About the Py_UNICODE*data, int size APIs: Ok, point taken. In general, I think we ought to keep the callback feature as open as possible, so passing in pointers and sizes would not be very useful. BTW, could you summarize how the callback works in a few lines ? About _Encode121: I'd name this _EncodeUCS1 since that's what it is ;-) About the new functions: I was referring to the new static functions which you gave PyUnicode_... names.
If these are not supposed to turn into non-static functions, I'd rather have them use lower case names (since that's how the Python internals work too -- most of the time). ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2001-06-12 18:56 Message: Logged In: YES user_id=89016 > One thing which I don't like about your API change is that > you removed the Py_UNICODE*data, int size style arguments > -- > this makes it impossible to use the new APIs on non-Python > data or data which is not available as Unicode object. Another problem is that the callback requires a Python object, so in the PyObject *version, the refcount is incref'd and the object is passed to the callback. The Py_UNICODE*/int version would have to create a new Unicode object from the data. ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2001-06-12 18:32 Message: Logged In: YES user_id=89016 > * please don't place more than one C statement on one line > like in: > """ > + unicode = unicode2; unicodepos = > unicode2pos; > + unicode2 = NULL; unicode2pos = 0; > """ OK, done! > * Comments should start with a capital letter and be > prepended > to the section they apply to Fixed! > * There should be spaces between arguments in compares > (a == b) not (a==b) Fixed! > * Where does the name "...Encode121" originate ? Encode one-to-one; it implements both ASCII and latin-1 encoding. > * module internal APIs should use lower case names (you > converted some of these to PyUnicode_...() -- this is > normally reserved for APIs which are either marked as > potential candidates for the public API or are very > prominent in the code) Which ones?
I introduced a new function for every old one that had a "const char *errors" argument, and a few new ones in codecs.h; of those PyCodec_EncodeHandlerForObject is vital, because it is used to map old string arguments to the new function objects. PyCodec_RaiseEncodeErrors can be used in the encoder implementation to raise an encode error, but it could be made static in unicodeobject.h so only those encoders implemented there have access to it. > One thing which I don't like about your API change is that > you removed the Py_UNICODE*data, int size style arguments > -- > this makes it impossible to use the new APIs on non-Python > data or data which is not available as Unicode object. I looked through the code and found no situation where the Py_UNICODE*/int version is really used, and having two (PyObject *)s (the original and the replacement string), instead of UNICODE*/int and PyObject *, made the implementation a little easier, but I can fix that. > Please separate the errors.c patch from this patch -- it > seems totally unrelated to Unicode. PyCodec_RaiseEncodeErrors uses this to have a \Uxxxx with four hex digits. I removed it. I'll upload a revised patch as soon as it's done. ---------------------------------------------------------------------- Comment By: M.-A. Lemburg (lemburg) Date: 2001-06-12 16:29 Message: Logged In: YES user_id=38388 Thanks for the patch -- it looks very impressive! I'll give it a try later this week. Some first cosmetic tidbits: * please don't place more than one C statement on one line like in: """ + unicode = unicode2; unicodepos = unicode2pos; + unicode2 = NULL; unicode2pos = 0; """ * Comments should start with a capital letter and be prepended to the section they apply to * There should be spaces between arguments in compares (a == b) not (a==b) * Where does the name "...Encode121" originate ?
* module internal APIs should use lower case names (you converted some of these to PyUnicode_...() -- this is normally reserved for APIs which are either marked as potential candidates for the public API or are very prominent in the code) One thing which I don't like about your API change is that you removed the Py_UNICODE*data, int size style arguments -- this makes it impossible to use the new APIs on non-Python data or data which is not available as a Unicode object. Please separate the errors.c patch from this patch -- it seems totally unrelated to Unicode. Thanks. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=432401&group_id=5470 From noreply@sourceforge.net Wed Apr 17 11:24:37 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Wed, 17 Apr 2002 03:24:37 -0700 Subject: [Patches] [ python-Patches-545096 ] Janitoring in ConfigParser Message-ID: Patches item #545096, was opened at 2002-04-17 10:24 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=545096&group_id=5470 Category: Library (Lib) Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Gustavo Niemeyer (niemeyer) Assigned to: Nobody/Anonymous (nobody) Summary: Janitoring in ConfigParser Initial Comment: The first patch fixes a bug, implements some speed improvements, some memory consumption improvements, enforces the usage of the already available global variables, and extends the allowed chars in option names to be very permissive. The second one, if used, is supposed to be applied over the first one, and implements a walk() generator method for walking through the options of a section.
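[Editor's note: the walk() generator itself is not shown in this archive. As a hypothetical sketch of the shape such a method might take -- written here against the modern configparser module, whereas the patch targeted the 2.x ConfigParser -- it could look like this:]

```python
from configparser import ConfigParser

class WalkingParser(ConfigParser):
    # Hypothetical walk() in the spirit of the patch description:
    # lazily yield (option, value) pairs for one section instead of
    # building a full list up front.
    def walk(self, section):
        for option in self.options(section):
            yield option, self.get(section, option)
```

Usage would then be `for option, value in parser.walk("mysection"): ...`, with the generator pulling options one at a time.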
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=545096&group_id=5470 From noreply@sourceforge.net Wed Apr 17 13:40:04 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Wed, 17 Apr 2002 05:40:04 -0700 Subject: [Patches] [ python-Patches-545150 ] {a,b} in fnmatch.translate Message-ID: Patches item #545150, was opened at 2002-04-17 12:40 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=545150&group_id=5470 Category: Library (Lib) Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Gustavo Niemeyer (niemeyer) Assigned to: Nobody/Anonymous (nobody) Summary: {a,b} in fnmatch.translate Initial Comment: This patch adds support for {a,b} expansion constructs in fnmatch.translate. That is, file{a,b}.txt will match both filea.txt and fileb.txt, like usual shell expansions. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=545150&group_id=5470 From noreply@sourceforge.net Wed Apr 17 15:16:20 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Wed, 17 Apr 2002 07:16:20 -0700 Subject: [Patches] [ python-Patches-500311 ] Work around for buggy https servers Message-ID: Patches item #500311, was opened at 2002-01-07 09:49 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=500311&group_id=5470 Category: Modules Group: None Status: Open Resolution: None Priority: 5 Submitted By: Michel Van den Bergh (vdbergh) Assigned to: Martin v. Löwis (loewis) Summary: Work around for buggy https servers Initial Comment: Python 2.2. Tested on RH 7.1. This is a workaround for http://sourceforge.net/tracker/?group_id=5470&atid=105470&func=detail&aid=494762 The problem is that some https servers close an ssl connection without properly resetting it first.
In the above bug description it is suggested that this only occurs for IIS but apparently some (modified) Apache servers also suffer from it (see telemeter.telenet.be). One of the suggested workarounds is to modify httplib.py so as to ignore the combination of err[0]==SSL_ERROR_SYSCALL and err[1]=="EOF occurred in violation of protocol". However I think one should never compare error strings since in principle they may depend on language etc... So I decided to modify _socket.c slightly so that it becomes possible to return error codes which are not in ssl.h. When an ssl-connection is closed without reset I now return the error code SSL_ERROR_EOF. Then I ignore this (apparently benign) error in httplib.py. In addition I fixed what I think was an error in PySSL_SetError(SSL *ssl, int ret) in socketmodule.c. Originally there was: case SSL_ERROR_SSL: { unsigned long e = ERR_get_error(); if (e == 0) { /* an EOF was observed that violates the protocol */ errstr = "EOF occurred in violation of protocol"; etc... but if I understand the documentation for SSL_get_error then the test should be: e==0 && ret==0. A similar error occurs a few lines later. ---------------------------------------------------------------------- >Comment By: Martin v. Löwis (loewis) Date: 2002-04-17 16:16 Message: Logged In: YES user_id=21627 jribbens: Even when running the test from 494762, i.e. import os,urllib2 os.environ["http_proxy"]='' f = urllib2.urlopen("https://wwws.task.com.br/i.htm") print f.read() This gives an empty response for me... ---------------------------------------------------------------------- Comment By: Jon Ribbens (jribbens) Date: 2002-04-16 19:00 Message: Logged In: YES user_id=76089 py23ssl.txt works fine for me when applied to latest CVS, and fixes the problem. ---------------------------------------------------------------------- Comment By: Martin v.
Löwis (loewis) Date: 2002-03-11 07:34 Message: Logged In: YES user_id=21627 Unfortunately, your patch appears to be incorrect. Performing the script in #494762, I get an empty string as the result, whereas the content of the resource is 'HTTPS Test' In case you want to experiment with the CVS version I'll attach a patch for that. ---------------------------------------------------------------------- Comment By: Michel Van den Bergh (vdbergh) Date: 2002-01-09 11:25 Message: Logged In: YES user_id=10252 Due to some problems with sourceforge and incompetence on my part I submitted this several times. Please see patch 500311. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=500311&group_id=5470 From noreply@sourceforge.net Wed Apr 17 15:44:07 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Wed, 17 Apr 2002 07:44:07 -0700 Subject: [Patches] [ python-Patches-500311 ] Work around for buggy https servers Message-ID: Patches item #500311, was opened at 2002-01-07 08:49 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=500311&group_id=5470 Category: Modules Group: None Status: Open Resolution: None Priority: 5 Submitted By: Michel Van den Bergh (vdbergh) Assigned to: Martin v. Löwis (loewis) Summary: Work around for buggy https servers Initial Comment: Python 2.2. Tested on RH 7.1. This is a workaround for http://sourceforge.net/tracker/?group_id=5470&atid=105470&func=detail&aid=494762 The problem is that some https servers close an ssl connection without properly resetting it first. In the above bug description it is suggested that this only occurs for IIS but apparently some (modified) Apache servers also suffer from it (see telemeter.telenet.be).
One of the suggested workarounds is to modify httplib.py so as to ignore the combination of err[0]==SSL_ERROR_SYSCALL and err[1]=="EOF occurred in violation of protocol". However I think one should never compare error strings since in principle they may depend on language etc... So I decided to modify _socket.c slightly so that it becomes possible to return error codes which are not in ssl.h. When an ssl-connection is closed without reset I now return the error code SSL_ERROR_EOF. Then I ignore this (apparently benign) error in httplib.py. In addition I fixed what I think was an error in PySSL_SetError(SSL *ssl, int ret) in socketmodule.c. Originally there was: case SSL_ERROR_SSL: { unsigned long e = ERR_get_error(); if (e == 0) { /* an EOF was observed that violates the protocol */ errstr = "EOF occurred in violation of protocol"; etc... but if I understand the documentation for SSL_get_error then the test should be: e==0 && ret==0. A similar error occurs a few lines later. ---------------------------------------------------------------------- Comment By: Jon Ribbens (jribbens) Date: 2002-04-17 15:44 Message: Logged In: YES user_id=76089 Yes, that test works fine. The patch looks correct to me by inspection also. Michel's comments about SSL_get_error are correct according to the OpenSSL documentation, i.e. the existing code is incorrect (this being a separate issue to whether or not "EOF occurred" should be ignored, which is a work-around for other peoples' bugs). ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-04-17 15:16 Message: Logged In: YES user_id=21627 jribbens: Even when running the test from 494762, i.e. import os,urllib2 os.environ["http_proxy"]='' f = urllib2.urlopen("https://wwws.task.com.br/i.htm") print f.read() This gives an empty response for me...
---------------------------------------------------------------------- Comment By: Jon Ribbens (jribbens) Date: 2002-04-16 18:00 Message: Logged In: YES user_id=76089 py23ssl.txt works fine for me when applied to latest CVS, and fixes the problem. ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-03-11 06:34 Message: Logged In: YES user_id=21627 Unfortunately, your patch appears to be incorrect. Performing the script in #494762, I get an empty string as the result, whereas the content of the resource is 'HTTPS Test' In case you want to experiment with the CVS version I'll attach a patch for that. ---------------------------------------------------------------------- Comment By: Michel Van den Bergh (vdbergh) Date: 2002-01-09 10:25 Message: Logged In: YES user_id=10252 Due to some problems with sourceforge and incompetence on my part I submitted this several times. Please see patch 500311. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=500311&group_id=5470 From noreply@sourceforge.net Wed Apr 17 19:16:12 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Wed, 17 Apr 2002 11:16:12 -0700 Subject: [Patches] [ python-Patches-545300 ] sgmllib support for additional tag forms Message-ID: Patches item #545300, was opened at 2002-04-17 14:16 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=545300&group_id=5470 Category: Library (Lib) Group: Python 2.1.2 Status: Open Resolution: None Priority: 5 Submitted By: Steven F. Lott (slott56) Assigned to: Nobody/Anonymous (nobody) Summary: sgmllib support for additional tag forms Initial Comment: MS-word generated HTML includes declaration tags of the form:   scattered throughout the body of an HTML document. 
The current sgmllib parse_declaration routine rejects these as invalid syntax, whereas browsers tolerate these embedded declarations. This patch accepts these declaration forms. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=545300&group_id=5470 From noreply@sourceforge.net Wed Apr 17 19:55:04 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Wed, 17 Apr 2002 11:55:04 -0700 Subject: [Patches] [ python-Patches-536241 ] string.zfill and unicode Message-ID: Patches item #536241, was opened at 2002-03-28 14:26 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=536241&group_id=5470 Category: Library (Lib) Group: None Status: Open Resolution: Accepted Priority: 5 Submitted By: Walter Dörwald (doerwalter) Assigned to: Walter Dörwald (doerwalter) Summary: string.zfill and unicode Initial Comment: This patch makes the function string.zfill work with unicode instances (and instances of str and unicode subclasses). Currently string.zfill(u"123", 10) results in "0000u'123'". With this patch the result is u'0000000123'. Should zfill be made a real str and unicode method? I noticed that a zfill implementation is available in unicodeobject.c, but commented out. ---------------------------------------------------------------------- >Comment By: Walter Dörwald (doerwalter) Date: 2002-04-17 20:55 Message: Logged In: YES user_id=89016 Diff3.txt adds these tests to Lib/test/test_unicode.py and Lib/test/test_string.py. All tests pass (except that currently test_unicode.py fails the unicode_internal roundtripping test with --enable-unicode=ucs4) and when I change zfill back to always return self they properly fail.
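[Editor's note: for reference, the behaviour the zfill patch is after can be written as a short pure-Python sketch. This mirrors what string.zfill does, using Python 3's single str type; it is an illustration, not the checked-in C code.]

```python
def zfill(x, width):
    # Pad a numeric string on the left with zeros, keeping any sign
    # in front of the padding. Non-strings are converted with str()
    # (the patched library used str() rather than repr(), which is
    # what produced the "0000u'123'" bug being fixed here).
    if not isinstance(x, str):
        x = str(x)
    has_sign = x[:1] in ("+", "-")
    body = x[1:] if has_sign else x
    pad = "0" * max(0, width - len(x))
    return (x[:1] if has_sign else "") + pad + body
```

So `zfill("123", 10)` gives "0000000123", matching the u'0000000123' result the patch promises for the unicode case.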
I don't know whether the fail message should be made better, and how this would interact with "make test" and whether the "Prefer string methods over string module functions" part in test_string.py might pose problems. And maybe the code could be simplified to always use the subclasses without first trying str and unicode? ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-15 20:48 Message: Logged In: YES user_id=6380 If you want to be thorough, yes, that's a good test to add! ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-04-15 20:47 Message: Logged In: YES user_id=89016 Checked in as: Objects/stringobject.c 2.159 Objects/unicodeobject.c 2.139 Maybe we could add a test to Lib/test/test_unicode.py and Lib/test/test_string.py that makes sure that no method returns a str/unicode subinstance even when called for a str/unicode subinstance? ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-15 20:29 Message: Logged In: YES user_id=6380 Yes, that's the right thing. Reopened this for now. ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-04-15 20:23 Message: Logged In: YES user_id=89016 Currently zfill returns the original if nothing has to be done. Should I change this to only do it if it's a real str or unicode instance? (as was done for lots of methods for bug http://www.python.org/sf/460020) ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-15 16:47 Message: Logged In: YES user_id=6380 Yes, please open a separate bug report for those (I'd open a separate report for each file with warnings, unless you have an obvious fix).
---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-04-15 16:43 Message: Logged In: YES user_id=89016 > Does your compiler not warn you? Or did > you ignore warnings? > (The latter's a sin in Python-land :-). The warning was just lost in the long list of outputs. Now that you mention it, there are still a few warnings (gcc 2.96 on Linux): Objects/unicodeobject.c: In function `PyUnicodeUCS4_Format': Objects/unicodeobject.c:5574: warning: int format, long int arg (arg 3) Objects/unicodeobject.c:5574: warning: unsigned int format, long unsigned int arg (arg 4) libpython2.3.a(posixmodule.o): In function `posix_tmpnam': Modules/posixmodule.c:5150: the use of `tmpnam_r' is dangerous, better use `mkstemp' libpython2.3.a(posixmodule.o): In function `posix_tempnam': Modules/posixmodule.c:5100: the use of `tempnam' is dangerous, better use `mkstemp' Modules/pwdmodule.c: In function `initpwd': Modules/pwdmodule.c:161: warning: unused variable `d' Modules/readline.c: In function `set_completer_delims': Modules/readline.c:273: warning: passing arg 1 of `free' discards qualifiers from pointer target type Modules/expat/xmlrole.c:7: warning: `RCSId' defined but not used Should I open a separate bug report for that? > I've also folded some long lines that weren't > your fault -- but I noticed that elsewhere you > checked in some long lines; > please try to limit line length to 78. I noticed your descrobject.c checkin message. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-15 15:53 Message: Logged In: YES user_id=6380 Thanks, Walter! Some nits: The string_zfill() code you checked in caused two warnings about modifying data pointed to by a const pointer. I've removed the const, but I'd like to understand how come you didn't catch this. Does your compiler not warn you? Or did you ignore warnings? (The latter's a sin in Python-land :-). 
I've also folded some long lines that weren't your fault -- but I noticed that elsewhere you checked in some long lines; please try to limit line length to 78. ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-04-15 15:41 Message: Logged In: YES user_id=89016 Checked in as: Doc/lib/libstdtypes.tex 1.88 Lib/UserString.py 1.12 Lib/string.py 1.63 test/string_tests.py 1.13 test/test_unicode.py 1.54 Misc/NEWS 1.388 Objects/stringobject.c 2.157 Objects/unicodeobject.c 2.138 ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-13 03:00 Message: Logged In: YES user_id=6380 I'm for making them methods. Walter, just check it in! ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-04-12 20:37 Message: Logged In: YES user_id=89016 Now that test_userstring.py works and fails (rev 1.6) should we add zfill as str and unicode methods or change UserString.zfill to use string.zfill? I've made a patch (attached) that implements zfill as methods (i.e. activates the version in unicodeobject.c that was commented out and implements the same in stringobject.c) (And it adds the test for unicode support back in.) ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-04-12 16:51 Message: Logged In: YES user_id=21627 Re: optional Unicode: Walter is correct; configuring with --disable-unicode currently breaks the string module. One might consider using types.StringTypes; OTOH, pulling in types might not be desirable. As for str vs. repr: Python was always using repr in zfill, so changing it may break things. So I recommend that Walter reverts Andrew's check-in and applies his change. 
---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-03-30 12:25 Message: Logged In: YES user_id=6656 Hah, I was going to say that but was distracted by IE wiping out the machine I'm sitting at. Re-opening. ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-03-30 12:16 Message: Logged In: YES user_id=89016 But Python could be compiled without unicode support (by undefining PY_USING_UNICODE), and string.zfill should work even in this case. What about making zfill a real str and unicode method? ---------------------------------------------------------------------- Comment By: A.M. Kuchling (akuchling) Date: 2002-03-29 17:24 Message: Logged In: YES user_id=11375 Thanks for your patch! I've checked it into CVS, with two modifications. First, I removed the code to handle the case where Python doesn't have a unicode() built-in; there's no expectation that you can take the standard library for Python version N and use it with version N-1, so this code isn't needed. Second, I changed string.zfill() to take the str() and not the repr() when it gets a non-string object because that seems to make more sense. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=536241&group_id=5470 From noreply@sourceforge.net Wed Apr 17 20:40:44 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Wed, 17 Apr 2002 12:40:44 -0700 Subject: [Patches] [ python-Patches-432401 ] unicode encoding error callbacks Message-ID: Patches item #432401, was opened at 2001-06-12 13:43 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=432401&group_id=5470 Category: Core (C code) Group: None Status: Open Resolution: Postponed Priority: 6 Submitted By: Walter Dörwald (doerwalter) Assigned to: M.-A.
Lemburg (lemburg) Summary: unicode encoding error callbacks Initial Comment: This patch adds unicode error handling callbacks to the encode functionality. With this patch it's possible to not only pass 'strict', 'ignore' or 'replace' as the errors argument to encode, but also a callable function that will be called with the encoding name, the original unicode object and the position of the unencodable character. The callback must return a replacement unicode object that will be encoded instead of the original character. For example replacing unencodable characters with XML character references can be done in the following way. u"aäoöuüß".encode( "ascii", lambda enc, uni, pos: u"&#x%x;" % ord(uni[pos]) ) ---------------------------------------------------------------------- >Comment By: M.-A. Lemburg (lemburg) Date: 2002-04-17 19:40 Message: Logged In: YES user_id=38388 Sorry for the late response. About the difference between encoding and decoding: you shouldn't just look at the case where you work with Unicode and strings, e.g. take the rot-13 codec which works on strings only or other codecs which translate objects into strings and vice-versa. Error handling has to be flexible enough to handle all these situations. Since the codecs know best how to handle the situations, I'd make this an implementation detail of the codec and leave the behaviour undefined in the general case. For the existing codecs, backward compatibility should be maintained, if at all possible. If the patch gets overly complicated because of this, we may have to provide a downgrade solution for this particular problem (I don't think replace is used in any computational context, though, since you can never be sure how many replacement characters do get inserted, so the case may not be that realistic). Raising an exception for the charmap codec is the right way to go, IMHO. I would consider the current behaviour a bug.
For new codecs, I think we should suggest that replace tries to collect as much illegal data as possible before invoking the error handler. The handler should be aware of the fact that it won't necessarily get all the broken data in one call. About the codec error handling registry: You seem to be using a Unicode specific approach here. I'd rather like to see a generic approach which uses the API we discussed earlier. Would that be possible ? In that case, the codec API should probably be called codecs.register_error('myhandler', myhandler). Does that make sense ? BTW, the patch which uses the callback registry does not seem to be available on this SF page (the last patch still converts the errors argument to a PyObject, which shouldn't be needed anymore with the new approach). Can you please upload your latest version ? Note that the highlighting codec would make a nice example for the new feature. Thanks. ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-04-17 10:21 Message: Logged In: YES user_id=89016 Another note: the patch will change the meaning of charmap encoding slightly: currently "replace" will put a ? into the output, even if ? is not in the mapping, i.e. codecs.charmap_encode(u"c", "replace", {ord("a"): ord("b")}) will return ('?', 1). With the patch the above example will raise an exception. Of course with the patch many more replace characters can appear, so it is vital that the mapping is applied to the replacement string. Is this semantic change OK? (I guess all of the existing codecs have a mapping ord("?")->ord("?")) ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-03-15 17:19 Message: Logged In: YES user_id=89016 So this means that the encoder can collect illegal characters and pass them to the callback. "replace" will replace this with (end-start)*u"?".
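[Editor's note: the codecs.register_error() registry discussed above is essentially the interface that later Python versions shipped: the handler receives the UnicodeEncodeError, whose .start and .end attributes carry exactly the slice information debated here, and returns a (replacement, resume-position) pair. The XML character reference example from the initial comment looks like this in that style:]

```python
import codecs

def charref(exc):
    # exc.object[exc.start:exc.end] is the unencodable run (the encoder
    # may collect several adjacent bad characters into one call);
    # exc.end tells the encoder where to resume.
    if not isinstance(exc, UnicodeEncodeError):
        raise exc
    refs = "".join("&#x%x;" % ord(c) for c in exc.object[exc.start:exc.end])
    return refs, exc.end

codecs.register_error("charref", charref)
```

With this registered, `"a\xe4o\xf6u".encode("ascii", "charref")` yields `b'a&#xe4;o&#xf6;u'`; later Pythons also ship a built-in "xmlcharrefreplace" handler that does the same with decimal references.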
Decoders don't collect all illegal byte sequences, but call the callback once for every byte sequence that has been found illegal, and "replace" will replace it with u"?". Does this make sense? ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-03-15 17:06 Message: Logged In: YES user_id=89016 For encoding it's always (end-start)*u"?": >>> u"ää".encode("ascii", "replace") '??' But for decoding it is neither of the two: >>> "\Ux\U".decode("unicode-escape", "replace") u'\ufffd\ufffd' i.e. a sequence of 5 illegal characters was replaced by two replacement characters. This might mean that decoders can't collect all the illegal characters and call the callback once. They might have to call the callback for every single illegal byte sequence to get the old behaviour. (It seems that this patch would be much, much simpler, if we only changed the encoders) ---------------------------------------------------------------------- Comment By: M.-A. Lemburg (lemburg) Date: 2002-03-08 18:36 Message: Logged In: YES user_id=38388 Hmm, whatever it takes to maintain backwards compatibility. Do you have an example ? ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-03-08 17:31 Message: Logged In: YES user_id=89016 What should replace do: Return u"?" or (end-start)*u"?" ---------------------------------------------------------------------- Comment By: M.-A. Lemburg (lemburg) Date: 2002-03-08 15:15 Message: Logged In: YES user_id=38388 Sounds like a good idea. Please keep the encoder and decoder APIs symmetric, though, i.e. add the slice information to both APIs. The slice should use the same format as Python's standard slices, that is left inclusive, right exclusive. I like the highlighting feature ! 
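The replace asymmetry debated above is the behaviour Python ultimately settled on, and it can be checked directly (a Python 3 sketch; the u-prefixed 2.x literals above become plain str/bytes):

```python
# Encoding: "replace" substitutes one '?' per unencodable character,
# i.e. the (end-start)*u"?" option discussed above.
assert "ää".encode("ascii", "replace") == b"??"

# Decoding: the ascii codec reports each illegal byte separately, and
# "replace" substitutes one U+FFFD per reported sequence.
assert b"a\xff\xfeb".decode("ascii", "replace") == "a\ufffd\ufffdb"

print("replace semantics confirmed")
```
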
---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-03-07 23:09 Message: Logged In: YES user_id=89016 I'm thinking about extending the API a little bit: Consider the following example: >>> "\u1".decode("unicode-escape") Traceback (most recent call last): File "", line 1, in ? UnicodeError: encoding 'unicodeescape' can't decode byte 0x31 in position 2: truncated \uXXXX escape The error message is a lie: the problem is not the '1' in position 2, but the complete truncated sequence '\u1'. For this the decoder should pass a start and an end position to the handler. For encoding this would be useful too: Suppose I want to have an encoder that colors the unencodable character via ANSI escape sequences. Then I could do the following: >>> import codecs >>> def color(enc, uni, pos, why, sta): ... return (u"\033[1m<%d>\033[0m" % ord(uni[pos]), pos+1) ... >>> codecs.register_unicodeencodeerrorhandler("color", color) >>> u"aäüöo".encode("ascii", "color") 'a\x1b[1m<228>\x1b[0m\x1b[1m<252>\x1b[0m\x1b[1m<246>\x1b [0mo' But here the sequences "\x1b[0m\x1b[1m" are not needed. To fix this problem the encoder could collect as many unencodable characters as possible and pass those to the error callback in one go (passing a start and end+1 position). This fixes the above problem and reduces the number of calls to the callback, so it should speed up the algorithms in the case of custom error handlers. (And it makes the implementation very interesting ;)) What do you think? 
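The run-collection idea proposed above is exactly what the final interface enables: the handler receives an exception object carrying exc.start and exc.end, which for the ascii encoder span the whole run of unencodable characters, so one escape pair can bracket the entire run with no redundant "\x1b[0m\x1b[1m" in the middle (handler name and markup are illustrative):

```python
import codecs

def color(exc):
    # Wrap the whole unencodable run exc.object[exc.start:exc.end]
    # in a single ANSI bold on/off pair, then resume at exc.end.
    if not isinstance(exc, UnicodeEncodeError):
        raise exc
    body = "".join("<%d>" % ord(c) for c in exc.object[exc.start:exc.end])
    return ("\033[1m%s\033[0m" % body, exc.end)

codecs.register_error("color", color)
print("aäüöo".encode("ascii", "color"))
```
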
---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-03-07 01:29 Message: Logged In: YES user_id=89016 I started from scratch, and the current state is this: Encoding mostly works (except that I haven't changed TranslateCharmap and EncodeDecimal yet) and most of the decoding stuff works (DecodeASCII and DecodeCharmap are still unchanged) and the decoding callback helper isn't optimized for the "builtin" names yet (i.e. it still calls the handler). For encoding the callback helper knows how to handle "strict", "replace", "ignore" and "xmlcharrefreplace" itself and won't call the callback. This should make the encoder fast enough. As callback name string comparison results are cached it might even be faster than the original. The patch so far didn't require any changes to unicodeobject.h, stringobject.h or stringobject.c ---------------------------------------------------------------------- Comment By: M.-A. Lemburg (lemburg) Date: 2002-03-05 16:49 Message: Logged In: YES user_id=38388 Walter, are you making any progress on the new scheme we discussed on the mailing list (adding an error handler registry much like the codec registry itself instead of trying to redo the complete codec API) ? ---------------------------------------------------------------------- Comment By: M.-A. Lemburg (lemburg) Date: 2001-09-20 10:38 Message: Logged In: YES user_id=38388 I am postponing this patch until the PEP process has started. This feature won't make it into Python 2.2. Walter, you may want to reference this patch in the PEP. ---------------------------------------------------------------------- Comment By: M.-A. Lemburg (lemburg) Date: 2001-08-16 10:53 Message: Logged In: YES user_id=38388 I think we ought to summarize these changes in a PEP to get some more feedback and testing from others as well. I'll look into this after I'm back from vacation on the 10.09. 
Given the release schedule I am not sure whether this feature will make it into 2.2. The size of the patch is huge and probably needs a lot of testing first. ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2001-07-27 03:55 Message: Logged In: YES user_id=89016 Changing the decoding API is done now. There are new functions codec.register_unicodedecodeerrorhandler and codec.lookup_unicodedecodeerrorhandler. Only the standard handlers for 'strict', 'ignore' and 'replace' are preregistered. There may be many reasons for decoding errors in the byte string, so I added an additional argument to the decoding API: reason, which gives the reason for the failure, e.g.: >>> "\U1111111".decode("unicode_escape") Traceback (most recent call last): File "", line 1, in ? UnicodeError: encoding 'unicodeescape' can't decode byte 0x31 in position 8: truncated \UXXXXXXXX escape >>> "\U11111111".decode("unicode_escape") Traceback (most recent call last): File "", line 1, in ? UnicodeError: encoding 'unicodeescape' can't decode byte 0x31 in position 9: illegal Unicode character For symmetry I added this to the encoding API too: >>> u"\xff".encode("ascii") Traceback (most recent call last): File "", line 1, in ? UnicodeError: encoding 'ascii' can't decode byte 0xff in position 0: ordinal not in range(128) The parameters passed to the callbacks now are: encoding, unicode, position, reason, state. The encoding and decoding API for strings has been adapted too, so now the new API should be usable everywhere: >>> unicode("a\xffb\xffc", "ascii", ... lambda enc, uni, pos, rea, sta: (u"", pos+1)) u'abc' >>> "a\xffb\xffc".decode("ascii", ... lambda enc, uni, pos, rea, sta: (u"", pos+1)) u'abc' I had a problem with the decoding API: all the functions in _codecsmodule.c used the t# format specifier. I changed that to O! 
with &PyString_Type, because otherwise we would have the problem that the decoding API would have to pass buffer objects around instead of strings, and the callback would have to call str() on the buffer anyway to access a specific character, so this wouldn't be any faster than calling str() on the buffer before decoding. It seems that buffers aren't used anyway. I changed all the old functions to call the new ones so bugfixes don't have to be done in two places. There are two exceptions: I didn't change PyString_AsEncodedString and PyString_AsDecodedString because they are documented as deprecated anyway (although they are called in a few spots). This means that I duplicated part of their functionality in PyString_AsEncodedObjectEx and PyString_AsDecodedObjectEx. There are still a few spots that call the old API: e.g. PyString_Format still calls PyUnicode_Decode (but with strict decoding) because it passes the rest of the format string to PyUnicode_Format when it encounters a Unicode object. Should we switch to the new API everywhere even if strict encoding/decoding is used? The size of this patch begins to scare me. I guess we need an extensive test script for all the new features and documentation. I hope you have time to do that, as I'll be busy with other projects in the next weeks. (BTW, I haven't touched PyUnicode_TranslateCharmap yet.) ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2001-07-23 17:03 Message: Logged In: YES user_id=89016 New version of the patch with the error handling callback registry. > > OK, done, now there's a > > PyCodec_EscapeReplaceUnicodeEncodeErrors/ > > codecs.escapereplace_unicodeencode_errors > > that uses \u (or \U if x>0xffff (with a wide build > > of Python)). > > Great! Now PyCodec_EscapeReplaceUnicodeEncodeErrors uses \x in addition to \u and \U where appropriate. > > [...] 
> > But for special one-shot error handlers, it might still be > > useful to pass the error handler directly, so maybe we > > should leave error as PyObject *, but implement the > > registry anyway? > > Good idea ! > > One minor nit: codecs.registerError() should be named > codecs.register_errorhandler() to be more in line with > the Python coding style guide. OK, but these functions are specific to unicode encoding, so now the functions are called: codecs.register_unicodeencodeerrorhandler codecs.lookup_unicodeencodeerrorhandler Now all callbacks (including the new ones: "xmlcharrefreplace" and "escapereplace") are registered in codecs.c/_PyCodecRegistry_Init so using them is really simple: u"gürk".encode("ascii", "xmlcharrefreplace") ---------------------------------------------------------------------- Comment By: M.-A. Lemburg (lemburg) Date: 2001-07-13 11:26 Message: Logged In: YES user_id=38388 > > > > > BTW, I guess PyUnicode_EncodeUnicodeEscape > > > > > could be reimplemented as PyUnicode_EncodeASCII > > > > > with \uxxxx replacement callback. > > > > > > > > Hmm, wouldn't that result in a slowdown ? If so, > > > > I'd rather leave the special encoder in place, > > > > since it is being used a lot in Python and > > > > probably some applications too. > > > > > > It would be a slowdown. But callbacks open many > > > possibilities. > > > > True, but in this case I believe that we should stick with > > the native implementation for "unicode-escape". Having > > a standard callback error handler which does the \uXXXX > > replacement would be nice to have though, since this would > > also be usable with lots of other codecs (e.g. all the > > code page ones). > > OK, done, now there's a > PyCodec_EscapeReplaceUnicodeEncodeErrors/ > codecs.escapereplace_unicodeencode_errors > that uses \u (or \U if x>0xffff (with a wide build > of Python)). Great ! > > [...] 
> > > Should the old TranslateCharmap map to the new > > > TranslateCharmapEx and inherit the > > > "multicharacter replacement" feature, > > > or should I leave it as it is? > > > > If possible, please also add the multichar replacement > > to the old API. I think it is very useful and since the > > old APIs work on raw buffers it would be a benefit to have > > the functionality in the old implementation too. > > OK! I will try to find the time to implement that in the > next days. Good. > > [Decoding error callbacks] > > > > About the return value: > > > > I'd suggest to always use the same tuple interface, e.g. > > > > callback(encoding, input_data, input_position, > state) -> > > (output_to_be_appended, new_input_position) > > > > (I think it's better to use absolute values for the > > position rather than offsets.) > > > > Perhaps the encoding callbacks should use the same > > interface... what do you think ? > > This would make the callback feature hypergeneric and a > little slower, because tuples have to be created, but it > (almost) unifies the encoding and decoding API. ("almost" > because, for the encoder output_to_be_appended will be > reencoded, for the decoder it will simply be appended.), > so I'm for it. That's the point. Note that I don't think the tuple creation will hurt much (see the make_tuple() API in codecs.c) since small tuples are cached by Python internally. > I implemented this and changed the encoders to only > lookup the error handler on the first error. The UCS1 > encoder now no longer uses the two-item stack strategy. > (This strategy only makes sense for those encoder where > the encoding itself is much more complicated than the > looping/callback etc.) So now memory overflow tests are > only done, when an unencodable error occurs, so now the > UCS1 encoder should be as fast as it was without > error callbacks. > > Do we want to enforce new_input_position>input_position, > or should jumping back be allowed? 
No; moving backwards should be allowed (this may be useful in order to resynchronize with the input data). > Here's is the current todo list: > 1. implement a new TranslateCharmap and fix the old. > 2. New encoding API for string objects too. > 3. Decoding > 4. Documentation > 5. Test cases > > I'm thinking about a different strategy for implementing > callbacks > (see http://mail.python.org/pipermail/i18n-sig/2001- > July/001262.html) > > We coould have a error handler registry, which maps names > to error handlers, then it would be possible to keep the > errors argument as "const char *" instead of "PyObject *". > Currently PyCodec_UnicodeEncodeHandlerForObject is a > backwards compatibility hack that will never go away, > because > it's always more convenient to type > u"...".encode("...", "strict") > instead of > import codecs > u"...".encode("...", codecs.raise_encode_errors) > > But with an error handler registry this function would > become the official lookup method for error handlers. > (PyCodec_LookupUnicodeEncodeErrorHandler?) > Python code would look like this: > --- > def xmlreplace(encoding, unicode, pos, state): > return (u"&#%d;" % ord(uni[pos]), pos+1) > > import codec > > codec.registerError("xmlreplace",xmlreplace) > --- > and then the following call can be made: > u"äöü".encode("ascii", "xmlreplace") > As soon as the first error is encountered, the encoder uses > its builtin error handling method if it recognizes the name > ("strict", "replace" or "ignore") or looks up the error > handling function in the registry if it doesn't. In this way > the speed for the backwards compatible features is the same > as before and "const char *error" can be kept as the > parameter to all encoding functions. For speed common error > handling names could even be implemented in the encoder > itself. 
> > But for special one-shot error handlers, it might still be > useful to pass the error handler directly, so maybe we > should leave error as PyObject *, but implement the > registry anyway? Good idea ! One minor nit: codecs.registerError() should be named codecs.register_errorhandler() to be more in line with the Python coding style guide. ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2001-07-12 11:03 Message: Logged In: YES user_id=89016 > > [...] > > so I guess we could change the replace handler > > to always return u'?'. This would make the > > implementation a little bit simpler, but the > > explanation of the callback feature *a lot* > > simpler. > > Go for it. OK, done! > [...] > > > Could you add these docs to the Misc/unicode.txt > > > file ? I will eventually take that file and turn > > > it into a PEP which will then serve as general > > > documentation for these things. > > > > I could, but first we should work out how the > > decoding callback API will work. > > Ok. BTW, Barry Warsaw already did the work of converting > the unicode.txt to PEP 100, so the docs should eventually > go there. OK. I guess it would be best to do this when everything is finished. > > > > BTW, I guess PyUnicode_EncodeUnicodeEscape > > > > could be reimplemented as PyUnicode_EncodeASCII > > > > with \uxxxx replacement callback. > > > > > > Hmm, wouldn't that result in a slowdown ? If so, > > > I'd rather leave the special encoder in place, > > > since it is being used a lot in Python and > > > probably some applications too. > > > > It would be a slowdown. But callbacks open many > > possibilities. > > True, but in this case I believe that we should stick with > the native implementation for "unicode-escape". Having > a standard callback error handler which does the \uXXXX > replacement would be nice to have though, since this would > also be usable with lots of other codecs (e.g. all the > code page ones). 
OK, done, now there's a PyCodec_EscapeReplaceUnicodeEncodeErrors/ codecs.escapereplace_unicodeencode_errors that uses \u (or \U if x>0xffff (with a wide build of Python)). > > For example: > > > > Why can't I print u"gürk"? > > > > is probably one of the most frequently asked > > questions in comp.lang.python. For printing > > Unicode stuff, print could be extended the use an > > error handling callback for Unicode strings (or > > objects where __str__ or tp_str returns a Unicode > > object) instead of using str() which always > > returns an 8bit string and uses strict encoding. > > There might even be a > > sys.setprintencodehandler()/sys.getprintencodehandler () > > There already is a print callback in Python (forgot the > name of the hook though), so this should be possible by > providing the encoding logic in the hook. True: sys.displayhook > [...] > > Should the old TranslateCharmap map to the new > > TranslateCharmapEx and inherit the > > "multicharacter replacement" feature, > > or should I leave it as it is? > > If possible, please also add the multichar replacement > to the old API. I think it is very useful and since the > old APIs work on raw buffers it would be a benefit to have > the functionality in the old implementation too. OK! I will try to find the time to implement that in the next days. > [Decoding error callbacks] > > About the return value: > > I'd suggest to always use the same tuple interface, e.g. > > callback(encoding, input_data, input_position, state) -> > (output_to_be_appended, new_input_position) > > (I think it's better to use absolute values for the > position rather than offsets.) > > Perhaps the encoding callbacks should use the same > interface... what do you think ? This would make the callback feature hypergeneric and a little slower, because tuples have to be created, but it (almost) unifies the encoding and decoding API. 
("almost" because, for the encoder output_to_be_appended will be reencoded, for the decoder it will simply be appended.), so I'm for it. I implemented this and changed the encoders to only lookup the error handler on the first error. The UCS1 encoder now no longer uses the two-item stack strategy. (This strategy only makes sense for those encoder where the encoding itself is much more complicated than the looping/callback etc.) So now memory overflow tests are only done, when an unencodable error occurs, so now the UCS1 encoder should be as fast as it was without error callbacks. Do we want to enforce new_input_position>input_position, or should jumping back be allowed? > > > > One additional note: It is vital that errors > > > > is an assignable attribute of the StreamWriter. > > > > > > It is already ! > > > > I know, but IMHO it should be documented that an > > assignable errors attribute must be supported > > as part of the official codec API. > > > > Misc/unicode.txt is not clear on that: > > """ > > It is not required by the Unicode implementation > > to use these base classes, only the interfaces must > > match; this allows writing Codecs as extension types. > > """ > > Good point. I'll add that to the PEP 100. OK. Here's is the current todo list: 1. implement a new TranslateCharmap and fix the old. 2. New encoding API for string objects too. 3. Decoding 4. Documentation 5. Test cases I'm thinking about a different strategy for implementing callbacks (see http://mail.python.org/pipermail/i18n-sig/2001- July/001262.html) We coould have a error handler registry, which maps names to error handlers, then it would be possible to keep the errors argument as "const char *" instead of "PyObject *". 
Currently PyCodec_UnicodeEncodeHandlerForObject is a backwards compatibility hack that will never go away, because it's always more convenient to type u"...".encode("...", "strict") instead of import codecs u"...".encode("...", codecs.raise_encode_errors) But with an error handler registry this function would become the official lookup method for error handlers. (PyCodec_LookupUnicodeEncodeErrorHandler?) Python code would look like this: --- def xmlreplace(encoding, unicode, pos, state): return (u"&#%d;" % ord(unicode[pos]), pos+1) import codecs codecs.registerError("xmlreplace",xmlreplace) --- and then the following call can be made: u"äöü".encode("ascii", "xmlreplace") As soon as the first error is encountered, the encoder uses its builtin error handling method if it recognizes the name ("strict", "replace" or "ignore") or looks up the error handling function in the registry if it doesn't. In this way the speed for the backwards compatible features is the same as before and "const char *error" can be kept as the parameter to all encoding functions. For speed, common error handling names could even be implemented in the encoder itself. But for special one-shot error handlers, it might still be useful to pass the error handler directly, so maybe we should leave error as PyObject *, but implement the registry anyway? 
But as > far as I can tell none of the codecs uses anything other > than '?', True. > so I guess we could change the replace handler > to always return u'?'. This would make the implementation a > little bit simpler, but the explanation of the callback > feature *a lot* simpler. Go for it. > And if you still want to handle > an unencodable U+FFFD, you can write a special callback for > that, e.g. > > def FFFDreplace(enc, uni, pos): > if uni[pos] == "\ufffd": > return u"?" > else: > raise UnicodeError(...) > > > ...docs... > > > > Could you add these docs to the Misc/unicode.txt file ? I > > will eventually take that file and turn it into a PEP > which > > will then serve as general documentation for these things. > > I could, but first we should work out how the decoding > callback API will work. Ok. BTW, Barry Warsaw already did the work of converting the unicode.txt to PEP 100, so the docs should eventually go there. > > > BTW, I guess PyUnicode_EncodeUnicodeEscape could be > > > reimplemented as PyUnicode_EncodeASCII with a \uxxxx > > > replacement callback. > > > > Hmm, wouldn't that result in a slowdown ? If so, I'd > rather > > leave the special encoder in place, since it is being > used a > > lot in Python and probably some applications too. > > It would be a slowdown. But callbacks open many > possiblities. True, but in this case I believe that we should stick with the native implementation for "unicode-escape". Having a standard callback error handler which does the \uXXXX replacement would be nice to have though, since this would also be usable with lots of other codecs (e.g. all the code page ones). > For example: > > Why can't I print u"gürk"? > > is probably one of the most frequently asked questions in > comp.lang.python. 
For printing Unicode stuff, print could be > extended the use an error handling callback for Unicode > strings (or objects where __str__ or tp_str returns a > Unicode object) instead of using str() which always returns > an 8bit string and uses strict encoding. There might even > be a > sys.setprintencodehandler()/sys.getprintencodehandler() There already is a print callback in Python (forgot the name of the hook though), so this should be possible by providing the encoding logic in the hook. > > > I have not touched PyUnicode_TranslateCharmap yet, > > > should this function also support error callbacks? Why > > > would one want the insert None into the mapping to > call > > > the callback? > > > > 1. Yes. > > 2. The user may want to e.g. restrict usage of certain > > character ranges. In this case the codec would be used to > > verify the input and an exception would indeed be useful > > (e.g. say you want to restrict input to Hangul + ASCII). > > OK, do we want TranslateCharmap to work exactly like > encoding, > i.e. in case of an error should the returned replacement > string again be mapped through the translation mapping or > should it be copied to the output directly? The former would > be more in line with encoding, but IMHO the latter would > be much more useful. It's better to take the second approach (copy the callback output directly to the output string) to avoid endless recursion and other pitfalls. I suppose this will also simplify the implementation somewhat. > BTW, when I implement it I can implement patch #403100 > ("Multicharacter replacements in > PyUnicode_TranslateCharmap") > along the way. I've seen it; will comment on it later. > Should the old TranslateCharmap map to the new > TranslateCharmapEx > and inherit the "multicharacter replacement" feature, > or > should I leave it as it is? If possible, please also add the multichar replacement to the old API. 
I think it is very useful and since the old APIs work on raw buffers it would be a benefit to have the functionality in the old implementation too. [Decoding error callbacks] > > > A remaining problem is how to implement decoding error > > > callbacks. In Python 2.1 encoding and decoding errors > are > > > handled in the same way with a string value. But with > > > callbacks it doesn't make sense to use the same > callback > > > for encoding and decoding (like > codecs.StreamReaderWriter > > > and codecs.StreamRecoder do). Decoding callbacks have > a > > > different API. Which arguments should be passed to the > > > decoding callback, and what is the decoding callback > > > supposed to do? > > > > I'd suggest adding another set of PyCodec_UnicodeDecode... > () > > APIs for this. We'd then have to augment the base classes > of > > the StreamCodecs to provide two attributes for .errors > with > > a fallback solution for the string case (i.s. "strict" > can > > still be used for both directions). > > Sounds good. Now what is the decoding callback supposed to > do? > I guess it will be called in the same way as the encoding > callback, i.e. with encoding name, original string and > position of the error. It might returns a Unicode string > (i.e. an object of the decoding target type), that will be > emitted from the codec instead of the one offending byte. Or > it might return a tuple with replacement Unicode object and > a resynchronisation offset, i.e. returning (u"?", 1) > means > emit a '?' and skip the offending character. But to make > the offset really useful the callback has to know something > about the encoding, perhaps the codec should be allowed to > pass an additional state object to the callback? > > Maybe the same should be added to the encoding callbacks to? 
> Maybe the encoding callback should be able to tell the > encoder if the replacement returned should be reencoded > (in which case it's a Unicode object), or directly emitted > (in which case it's an 8bit string)? I like the idea of having an optional state object (basically this should be a codec-defined arbitrary Python object) which then allows the callback to apply additional tricks. The object should be documented to be modifiable in place (simplifies the interface). About the return value: I'd suggest to always use the same tuple interface, e.g. callback(encoding, input_data, input_position, state) -> (output_to_be_appended, new_input_position) (I think it's better to use absolute values for the position rather than offsets.) Perhaps the encoding callbacks should use the same interface... what do you think ? > > > One additional note: It is vital that errors is an > > > assignable attribute of the StreamWriter. > > > > It is already ! > > I know, but IMHO it should be documented that an assignable > errors attribute must be supported as part of the official > codec API. > > Misc/unicode.txt is not clear on that: > """ > It is not required by the Unicode implementation to use > these base classes, only the interfaces must match; this > allows writing Codecs as extension types. > """ Good point. I'll add that to the PEP 100. ---------------------------------------------------------------------- Comment By: M.-A. Lemburg (lemburg) Date: 2001-06-22 20:51 Message: Logged In: YES user_id=38388 Sorry to keep you waiting, Walter. I will look into this again next week -- this week was way too busy... ---------------------------------------------------------------------- Comment By: M.-A. Lemburg (lemburg) Date: 2001-06-13 17:00 Message: Logged In: YES user_id=38388 On your comment about the non-Unicode codecs: let's keep this separated from the current patch. Don't have much time today. I'll comment on the other things tomorrow. 
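The tuple interface proposed in this exchange — callback(...) -> (output_to_be_appended, new_input_position) with absolute positions — is essentially what later shipped, except the per-call arguments and the codec-defined state object were folded into the exception passed to the handler. A hedged sketch in that final form; the explicit state object never made it into the API, so a function attribute (or a closure) stands in for it here, and the handler name is illustrative:

```python
import codecs

def counting(exc):
    # Count unencodable characters across calls -- a stand-in for the
    # codec-defined, modified-in-place state object discussed above --
    # and return the (replacement, new_position) tuple.
    if not isinstance(exc, UnicodeEncodeError):
        raise exc
    counting.count += exc.end - exc.start
    return ("?" * (exc.end - exc.start), exc.end)

counting.count = 0
codecs.register_error("counting", counting)
print("aäüb".encode("ascii", "counting"))
print(counting.count)
```
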
---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2001-06-13 15:49 Message: Logged In: YES user_id=89016 Guido van Rossum wrote in python-dev: > True, the "codec" pattern can be used for other > encodings than Unicode. But it seems to me that the > entire codecs architecture is rather strongly geared > towards en/decoding Unicode, and it's not clear > how well other codecs fit in this pattern (e.g. I > noticed that all the non-Unicode codecs ignore the > error handling parameter or assert that > it is set to 'strict'). I noticed that too. asserting that errors=='strict' would mean that the encoder is not able to deal in any other way with unencodable stuff than by raising an error. But that is not the problem here, because for zlib, base64, quopri, hex and uu encoding there can be no unencodable characters. The encoders can simply ignore the errors parameter. Should I remove the asserts from those codecs and change the docstrings accordingly, or will this be done separately? ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2001-06-13 13:57 Message: Logged In: YES user_id=89016 > > [...] > > raise an exception). U+FFFD characters in the replacement > > string will be replaced with a character that the encoder > > chooses ('?' in all cases). > > Nice. But the special casing of U+FFFD makes the interface somewhat less clean than it could be. It was only done to be 100% backwards compatible. With the original "replace" error handling the codec chose the replacement character. But as far as I can tell none of the codecs uses anything other than '?', so I guess we could change the replace handler to always return u'?'. This would make the implementation a little bit simpler, but the explanation of the callback feature *a lot* simpler. And if you still want to handle an unencodable U+FFFD, you can write a special callback for that, e.g. 
def FFFDreplace(enc, uni, pos): if uni[pos] == "\ufffd": return u"?" else: raise UnicodeError(...) > > The implementation of the loop through the string is done > > in the following way. A stack with two strings is kept > > and the loop always encodes a character from the string > > at the stacktop. If an error is encountered and the stack > > has only one entry (during encoding of the original string) > > the callback is called and the unicode object returned is > > pushed on the stack, so the encoding continues with the > > replacement string. If the stack has two entries when an > > error is encountered, the replacement string itself has > > an unencodable character and a normal exception is raised. > > When the encoder has reached the end of its current string > > there are two possibilities: when the stack contains two > > entries, this was the replacement string, so the replacement > > string will be popped from the stack and encoding continues > > with the next character from the original string. If the > > stack had only one entry, encoding is finished. > > Very elegant solution ! I'll put it as a comment in the source. > > (I hope that's enough explanation of the API and > implementation) > > Could you add these docs to the Misc/unicode.txt file ? I > will eventually take that file and turn it into a PEP which > will then serve as general documentation for these things. I could, but first we should work out how the decoding callback API will work. > > I have renamed the static ...121 function to all lowercase > > names. > > Ok. > > > BTW, I guess PyUnicode_EncodeUnicodeEscape could be > > reimplemented as PyUnicode_EncodeASCII with a \uxxxx > > replacement callback. > > Hmm, wouldn't that result in a slowdown ? If so, I'd rather > leave the special encoder in place, since it is being used a > lot in Python and probably some applications too. It would be a slowdown. But callbacks open many possibilities. For example: Why can't I print u"gürk"? 
is probably one of the most frequently asked questions in comp.lang.python. For printing Unicode stuff, print could be extended to use an error handling callback for Unicode strings (or objects where __str__ or tp_str returns a Unicode object) instead of using str() which always returns an 8bit string and uses strict encoding. There might even be a sys.setprintencodehandler()/sys.getprintencodehandler(). > [...] > I think it would be worthwhile to rename the callbacks to > include "Unicode" somewhere, e.g. > PyCodec_UnicodeReplaceEncodeErrors(). It's a long name, but > then it points out the application field of the callback > rather well. Same for the callbacks exposed through the > _codecsmodule. OK, done (and PyCodec_XMLCharRefReplaceUnicodeEncodeErrors really is a long name ;)) > > I have not touched PyUnicode_TranslateCharmap yet, > > should this function also support error callbacks? Why > > would one want to insert None into the mapping to call > > the callback? > > 1. Yes. > 2. The user may want to e.g. restrict usage of certain > character ranges. In this case the codec would be used to > verify the input and an exception would indeed be useful > (e.g. say you want to restrict input to Hangul + ASCII). OK, do we want TranslateCharmap to work exactly like encoding, i.e. in case of an error should the returned replacement string again be mapped through the translation mapping or should it be copied to the output directly? The former would be more in line with encoding, but IMHO the latter would be much more useful. BTW, when I implement it I can implement patch #403100 ("Multicharacter replacements in PyUnicode_TranslateCharmap") along the way. Should the old TranslateCharmap map to the new TranslateCharmapEx and inherit the "multicharacter replacement" feature, or should I leave it as it is? > > A remaining problem is how to implement decoding error > > callbacks.
In Python 2.1 encoding and decoding errors are > > handled in the same way with a string value. But with > > callbacks it doesn't make sense to use the same callback > > for encoding and decoding (like codecs.StreamReaderWriter > > and codecs.StreamRecoder do). Decoding callbacks have a > > different API. Which arguments should be passed to the > > decoding callback, and what is the decoding callback > > supposed to do? > > I'd suggest adding another set of PyCodec_UnicodeDecode... () > APIs for this. We'd then have to augment the base classes of > the StreamCodecs to provide two attributes for .errors with > a fallback solution for the string case (i.e. "strict" can > still be used for both directions). Sounds good. Now what is the decoding callback supposed to do? I guess it will be called in the same way as the encoding callback, i.e. with encoding name, original string and position of the error. It might return a Unicode string (i.e. an object of the decoding target type) that will be emitted from the codec instead of the one offending byte. Or it might return a tuple with replacement Unicode object and a resynchronisation offset, i.e. returning (u"?", 1) means emit a '?' and skip the offending character. But to make the offset really useful the callback has to know something about the encoding, perhaps the codec should be allowed to pass an additional state object to the callback? Maybe the same should be added to the encoding callbacks too? Maybe the encoding callback should be able to tell the encoder if the replacement returned should be reencoded (in which case it's a Unicode object), or directly emitted (in which case it's an 8bit string)? > > One additional note: It is vital that errors is an > > assignable attribute of the StreamWriter. > > It is already ! I know, but IMHO it should be documented that an assignable errors attribute must be supported as part of the official codec API.
Misc/unicode.txt is not clear on that: """ It is not required by the Unicode implementation to use these base classes, only the interfaces must match; this allows writing Codecs as extension types. """ ---------------------------------------------------------------------- Comment By: M.-A. Lemburg (lemburg) Date: 2001-06-13 08:05 Message: Logged In: YES user_id=38388 > How the callbacks work: > > A PyObject * named errors is passed in. This may be NULL, > Py_None, 'strict', u'strict', 'ignore', u'ignore', > 'replace', u'replace' or a callable object. > PyCodec_EncodeHandlerForObject maps all of these objects to > one of the three builtin error callbacks > PyCodec_RaiseEncodeErrors (raises an exception), > PyCodec_IgnoreEncodeErrors (returns an empty replacement > string, in effect ignoring the error), > PyCodec_ReplaceEncodeErrors (returns U+FFFD, the Unicode > replacement character to signify to the encoder that it > should choose a suitable replacement character) or directly > returns errors if it is a callable object. When an > unencodable character is encountered the error handling > callback will be called with the encoding name, the original > unicode object and the error position and must return a > unicode object that will be encoded instead of the offending > character (or the callback may of course raise an > exception). U+FFFD characters in the replacement string will > be replaced with a character that the encoder chooses ('?' > in all cases). Nice. > The implementation of the loop through the string is done in > the following way. A stack with two strings is kept and the > loop always encodes a character from the string at the > stacktop. If an error is encountered and the stack has only > one entry (during encoding of the original string) the > callback is called and the unicode object returned is pushed > on the stack, so the encoding continues with the replacement > string.
If the stack has two entries when an error is > encountered, the replacement string itself has an > unencodable character and a normal exception is raised. When > the encoder has reached the end of its current string there > are two possibilities: when the stack contains two entries, > this was the replacement string, so the replacement string > will be popped from the stack and encoding continues with > the next character from the original string. If the stack > had only one entry, encoding is finished. Very elegant solution ! > (I hope that's enough explanation of the API and implementation) Could you add these docs to the Misc/unicode.txt file ? I will eventually take that file and turn it into a PEP which will then serve as general documentation for these things. > I have renamed the static ...121 function to all lowercase > names. Ok. > BTW, I guess PyUnicode_EncodeUnicodeEscape could be > reimplemented as PyUnicode_EncodeASCII with a \uxxxx > replacement callback. Hmm, wouldn't that result in a slowdown ? If so, I'd rather leave the special encoder in place, since it is being used a lot in Python and probably some applications too. > PyCodec_RaiseEncodeErrors, PyCodec_IgnoreEncodeErrors, > PyCodec_ReplaceEncodeErrors are globally visible because > they have to be available in _codecsmodule.c to wrap them as > Python function objects, but they can't be implemented in > _codecsmodule, because they need to be available to the > encoders in unicodeobject.c (through > PyCodec_EncodeHandlerForObject), but importing the codecs > module might result in an endless recursion, because > importing a module requires unpickling of the bytecode, > which might require decoding utf8, which ... (but this will > only happen, if we implement the same mechanism for the > decoding API) I think that codecs.c is the right place for these APIs. _codecsmodule.c is only meant as Python access wrapper for the internal codecs and nothing more.
One thing I noted about the callbacks: they assume that they will always get Unicode objects as input. This is certainly not true in the general case (it is for the codecs you touch in the patch). I think it would be worthwhile to rename the callbacks to include "Unicode" somewhere, e.g. PyCodec_UnicodeReplaceEncodeErrors(). It's a long name, but then it points out the application field of the callback rather well. Same for the callbacks exposed through the _codecsmodule. > I have not touched PyUnicode_TranslateCharmap yet, > should this function also support error callbacks? Why would > one want to insert None into the mapping to call the callback? 1. Yes. 2. The user may want to e.g. restrict usage of certain character ranges. In this case the codec would be used to verify the input and an exception would indeed be useful (e.g. say you want to restrict input to Hangul + ASCII). > A remaining problem is how to implement decoding error > callbacks. In Python 2.1 encoding and decoding errors are > handled in the same way with a string value. But with > callbacks it doesn't make sense to use the same callback for > encoding and decoding (like codecs.StreamReaderWriter and > codecs.StreamRecoder do). Decoding callbacks have a > different API. Which arguments should be passed to the > decoding callback, and what is the decoding callback > supposed to do? I'd suggest adding another set of PyCodec_UnicodeDecode...() APIs for this. We'd then have to augment the base classes of the StreamCodecs to provide two attributes for .errors with a fallback solution for the string case (i.e. "strict" can still be used for both directions). > One additional note: It is vital that errors is an > assignable attribute of the StreamWriter. It is already ! > Consider the XML example: For writing an XML DOM tree one > StreamWriter object is used.
When a text node is written, > the error handling has to be set to > codecs.xmlreplace_encode_errors, but inside a comment or > processing instruction replacing unencodable characters with > charrefs is not possible, so here codecs.raise_encode_errors > should be used (or better a custom error handler that raises > an error that says "sorry, you can't have unencodable > characters inside a comment") Sure. > BTW, should we continue the discussion in the i18n SIG > mailing list? An email program is much more comfortable than > an HTML textarea! ;) I'd rather keep the discussions on this patch here -- forking it off to the i18n sig will make it very hard to follow up on it. (This HTML area is indeed damn small ;-) ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2001-06-12 19:18 Message: Logged In: YES user_id=89016 One additional note: It is vital that errors is an assignable attribute of the StreamWriter. Consider the XML example: For writing an XML DOM tree one StreamWriter object is used. When a text node is written, the error handling has to be set to codecs.xmlreplace_encode_errors, but inside a comment or processing instruction replacing unencodable characters with charrefs is not possible, so here codecs.raise_encode_errors should be used (or better a custom error handler that raises an error that says "sorry, you can't have unencodable characters inside a comment") BTW, should we continue the discussion in the i18n SIG mailing list? An email program is much more comfortable than an HTML textarea! ;) ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2001-06-12 18:59 Message: Logged In: YES user_id=89016 How the callbacks work: A PyObject * named errors is passed in. This may be NULL, Py_None, 'strict', u'strict', 'ignore', u'ignore', 'replace', u'replace' or a callable object.
PyCodec_EncodeHandlerForObject maps all of these objects to one of the three builtin error callbacks PyCodec_RaiseEncodeErrors (raises an exception), PyCodec_IgnoreEncodeErrors (returns an empty replacement string, in effect ignoring the error), PyCodec_ReplaceEncodeErrors (returns U+FFFD, the Unicode replacement character to signify to the encoder that it should choose a suitable replacement character) or directly returns errors if it is a callable object. When an unencodable character is encountered the error handling callback will be called with the encoding name, the original unicode object and the error position and must return a unicode object that will be encoded instead of the offending character (or the callback may of course raise an exception). U+FFFD characters in the replacement string will be replaced with a character that the encoder chooses ('?' in all cases). The implementation of the loop through the string is done in the following way. A stack with two strings is kept and the loop always encodes a character from the string at the stacktop. If an error is encountered and the stack has only one entry (during encoding of the original string) the callback is called and the unicode object returned is pushed on the stack, so the encoding continues with the replacement string. If the stack has two entries when an error is encountered, the replacement string itself has an unencodable character and a normal exception is raised. When the encoder has reached the end of its current string there are two possibilities: when the stack contains two entries, this was the replacement string, so the replacement string will be popped from the stack and encoding continues with the next character from the original string. If the stack had only one entry, encoding is finished. (I hope that's enough explanation of the API and implementation) I have renamed the static ...121 function to all lowercase names.
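The two-string stack loop described in this comment can be sketched in pure Python (written in modern syntax for clarity; the names and the per-character encoder interface here are illustrative, not the patch's actual C-level API):

```python
def encode_with_callback(text, can_encode, callback):
    """Sketch of the patch's stack-based encoding loop.

    can_encode(ch) returns the encoded form of ch, or None if ch is
    unencodable; callback(text, pos) returns the replacement string for
    the character at pos.  Both are hypothetical helpers for this sketch.
    """
    out = []
    stack = [(text, 0)]                 # bottom entry: the original string
    while stack:
        cur, pos = stack[-1]
        if pos >= len(cur):             # reached the end of the current string
            stack.pop()                 # replacement finished (or all done)
            continue
        encoded = can_encode(cur[pos])
        if encoded is not None:
            out.append(encoded)
            stack[-1] = (cur, pos + 1)
        elif len(stack) == 1:           # error in the original: ask the callback
            replacement = callback(cur, pos)
            stack[-1] = (cur, pos + 1)  # skip the offending character
            stack.append((replacement, 0))
        else:                           # error inside the replacement itself
            raise UnicodeError("unencodable character in replacement")
    return "".join(out)
```

With an ASCII-only can_encode and a callback that returns "?", the loop replaces each unencodable character, and a replacement that is itself unencodable raises an exception, exactly as the prose describes.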
BTW, I guess PyUnicode_EncodeUnicodeEscape could be reimplemented as PyUnicode_EncodeASCII with a \uxxxx replacement callback. PyCodec_RaiseEncodeErrors, PyCodec_IgnoreEncodeErrors, PyCodec_ReplaceEncodeErrors are globally visible because they have to be available in _codecsmodule.c to wrap them as Python function objects, but they can't be implemented in _codecsmodule, because they need to be available to the encoders in unicodeobject.c (through PyCodec_EncodeHandlerForObject), but importing the codecs module might result in an endless recursion, because importing a module requires unpickling of the bytecode, which might require decoding utf8, which ... (but this will only happen, if we implement the same mechanism for the decoding API) I have not touched PyUnicode_TranslateCharmap yet, should this function also support error callbacks? Why would one want to insert None into the mapping to call the callback? A remaining problem is how to implement decoding error callbacks. In Python 2.1 encoding and decoding errors are handled in the same way with a string value. But with callbacks it doesn't make sense to use the same callback for encoding and decoding (like codecs.StreamReaderWriter and codecs.StreamRecoder do). Decoding callbacks have a different API. Which arguments should be passed to the decoding callback, and what is the decoding callback supposed to do? ---------------------------------------------------------------------- Comment By: M.-A. Lemburg (lemburg) Date: 2001-06-12 18:00 Message: Logged In: YES user_id=38388 About the Py_UNICODE*data, int size APIs: Ok, point taken. In general, I think we ought to keep the callback feature as open as possible, so passing in pointers and sizes would not be very useful. BTW, could you summarize how the callback works in a few lines ? About _Encode121: I'd name this _EncodeUCS1 since that's what it is ;-) About the new functions: I was referring to the new static functions which you gave PyUnicode_... names.
If these are not supposed to turn into non-static functions, I'd rather have them use lower case names (since that's how the Python internals work too -- most of the time). ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2001-06-12 16:56 Message: Logged In: YES user_id=89016 > One thing which I don't like about your API change is that > you removed the Py_UNICODE*data, int size style arguments > -- > this makes it impossible to use the new APIs on non-Python > data or data which is not available as Unicode object. Another problem is that the callback requires a Python object, so in the PyObject *version, the refcount is incref'd and the object is passed to the callback. The Py_UNICODE*/int version would have to create a new Unicode object from the data. ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2001-06-12 16:32 Message: Logged In: YES user_id=89016 > * please don't place more than one C statement on one line > like in: > """ > + unicode = unicode2; unicodepos = > unicode2pos; > + unicode2 = NULL; unicode2pos = 0; > """ OK, done! > * Comments should start with a capital letter and be > prepended > to the section they apply to Fixed! > * There should be spaces between arguments in compares > (a == b) not (a==b) Fixed! > * Where does the name "...Encode121" originate ? Encode one-to-one; it implements both ASCII and latin-1 encoding. > * module internal APIs should use lower case names (you > converted some of these to PyUnicode_...() -- this is > normally reserved for APIs which are either marked as > potential candidates for the public API or are very > prominent in the code) Which ones?
I introduced a new function for every old one that had a "const char *errors" argument, and a few new ones in codecs.h; of those PyCodec_EncodeHandlerForObject is vital, because it is used to map the old string arguments to the new function objects. PyCodec_RaiseEncodeErrors can be used in the encoder implementation to raise an encode error, but it could be made static in unicodeobject.h so only those encoders implemented there have access to it. > One thing which I don't like about your API change is that > you removed the Py_UNICODE*data, int size style arguments > -- > this makes it impossible to use the new APIs on non-Python > data or data which is not available as Unicode object. I looked through the code and found no situation where the Py_UNICODE*/int version is really used, and having two (PyObject *)s (the original and the replacement string) instead of UNICODE*/int and PyObject * made the implementation a little easier, but I can fix that. > Please separate the errors.c patch from this patch -- it > seems totally unrelated to Unicode. PyCodec_RaiseEncodeErrors uses this to have a \Uxxxx with four hex digits. I removed it. I'll upload a revised patch as soon as it's done. ---------------------------------------------------------------------- Comment By: M.-A. Lemburg (lemburg) Date: 2001-06-12 14:29 Message: Logged In: YES user_id=38388 Thanks for the patch -- it looks very impressive! I'll give it a try later this week. Some first cosmetic tidbits: * please don't place more than one C statement on one line like in: """ + unicode = unicode2; unicodepos = unicode2pos; + unicode2 = NULL; unicode2pos = 0; """ * Comments should start with a capital letter and be prepended to the section they apply to * There should be spaces between arguments in compares (a == b) not (a==b) * Where does the name "...Encode121" originate ?
* module internal APIs should use lower case names (you converted some of these to PyUnicode_...() -- this is normally reserved for APIs which are either marked as potential candidates for the public API or are very prominent in the code) One thing which I don't like about your API change is that you removed the Py_UNICODE*data, int size style arguments -- this makes it impossible to use the new APIs on non-Python data or data which is not available as Unicode object. Please separate the errors.c patch from this patch -- it seems totally unrelated to Unicode. Thanks. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=432401&group_id=5470 From noreply@sourceforge.net Wed Apr 17 20:42:48 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Wed, 17 Apr 2002 12:42:48 -0700 Subject: [Patches] [ python-Patches-415227 ] Solaris pkgtool bdist command Message-ID: Patches item #415227, was opened at 2001-04-10 19:54 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=415227&group_id=5470 Category: Distutils and setup.py Group: None >Status: Closed >Resolution: Duplicate Priority: 5 Submitted By: Mark Alexander (mwa) Assigned to: M.-A. Lemburg (lemburg) Summary: Solaris pkgtool bdist command Initial Comment: The bdist_pkgtool command is based on bdist_packager and provides support for the Solaris pkgadd and pkgrm commands. In most cases, no additional options beyond the PEP 241 options are required. An exception is if the package name is >9 characters: a --pkg-abrev option is then required, because that's all pkgtool will handle. It makes listing the packages on the system a pain, but the actual package files produced do match name-version-revision-pyvers.pkg format.
By default, bdist_pkgtool provides request, postinstall, preremove, and postremove scripts that will properly relocate modules to the site-packages directory and recompile all .py modules on the target machine. An author can provide a custom request script and either have it auto-relocate by merging the scripts, or inhibit auto-relocation with --no-autorelocate. ---------------------------------------------------------------------- >Comment By: M.-A. Lemburg (lemburg) Date: 2002-04-17 19:42 Message: Logged In: YES user_id=38388 Replaced by 531901. ---------------------------------------------------------------------- Comment By: M.-A. Lemburg (lemburg) Date: 2002-02-24 18:25 Message: Logged In: YES user_id=38388 The code looks OK, but I can't test it... I'm sure the user base will, though, once it's in CVS. Please also write up some documentation which we can add to the distutils TeX docs and add them to the patch. I will then add it to CVS. Thanks. ---------------------------------------------------------------------- Comment By: M.-A. Lemburg (lemburg) Date: 2001-11-20 14:00 Message: Logged In: YES user_id=38388 Hijacking this patch to take load off of Andrew. This patch should be reviewed after the Python 2.2 feature freeze. ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2001-06-07 05:39 Message: Logged In: YES user_id=21627 Should there also be some Makefile machinery to create a Solaris package for python itself? There is a 1.6a2 package on sunfreeware; it would surely help if building Solaris packages was supported by the Python core itself. 
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=415227&group_id=5470 From noreply@sourceforge.net Wed Apr 17 20:43:11 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Wed, 17 Apr 2002 12:43:11 -0700 Subject: [Patches] [ python-Patches-415228 ] HP-UX packaging command Message-ID: Patches item #415228, was opened at 2001-04-10 19:56 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=415228&group_id=5470 Category: Distutils and setup.py Group: None >Status: Closed >Resolution: Duplicate Priority: 5 Submitted By: Mark Alexander (mwa) Assigned to: M.-A. Lemburg (lemburg) Summary: HP-UX packaging command Initial Comment: The bdist_sdux (SD-UX is HP's packager) command is based on bdist_packager and provides the same functionality as the bdist_pkgtool command, except the resulting packages cannot auto-relocate. Instead, a checkinstall script is included by default that determines if the target machine's Python installation matches that of the creating machine. If not, it bails out and provides the installer with the correct version of the swinstall command to place it in the proper directory. ---------------------------------------------------------------------- >Comment By: M.-A. Lemburg (lemburg) Date: 2002-04-17 19:43 Message: Logged In: YES user_id=38388 Replaced by 531901. ---------------------------------------------------------------------- Comment By: M.-A. Lemburg (lemburg) Date: 2002-02-24 18:27 Message: Logged In: YES user_id=38388 The code looks OK, but I can't test it... I'm sure the user base will, though, once it's in CVS. Please also write up some documentation which we can add to the distutils TeX docs and add them to the patch. I will then add it to CVS. Thanks. ---------------------------------------------------------------------- Comment By: M.-A.
Lemburg (lemburg) Date: 2001-11-20 14:01 Message: Logged In: YES user_id=38388 Hijacking this patch to take load off of Andrew. This patch should be reviewed after the Python 2.2 feature freeze. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2001-04-10 21:18 Message: Logged In: YES user_id=6380 Please select the proper category when submitting patches! This is clearly a distutils thing. Assigned to Andrew. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=415228&group_id=5470 From noreply@sourceforge.net Wed Apr 17 20:43:52 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Wed, 17 Apr 2002 12:43:52 -0700 Subject: [Patches] [ python-Patches-415226 ] new base class for binary packaging Message-ID: Patches item #415226, was opened at 2001-04-10 19:51 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=415226&group_id=5470 Category: Distutils and setup.py Group: None >Status: Closed >Resolution: Duplicate Priority: 5 Submitted By: Mark Alexander (mwa) Assigned to: M.-A. Lemburg (lemburg) Summary: new base class for binary packaging Initial Comment: bdist_packager.py provides an abstract base class for bdist commands. It provides easy access to all the PEP 241 metadata fields, plus "revision" for the package revision and installation scripts for preinstall, postinstall, preremove, and postremove. That covers the base characteristics of all the package managers that I'm familiar with. If anyone can think of any others, let me know, otherwise additional extensions would be implemented in the specific packager's commands. I would, however, discourage _requiring_ any additional fields. It would be nice if by simply supplying the PEP241 metadata under the [bdist_packager] section all subclassed packagers worked with no further effort.
It also has rudimentary relocation support by including a --no-autorelocate option. The bdist_packager is also where I see creating separate binary packages for sub-packages supported. My need for that is much less than my desire for it right now, so I didn't give it much thought as I wrote it. I'd be delighted to hear any comments and suggestions on how to approach sub-packaging, though. ---------------------------------------------------------------------- >Comment By: M.-A. Lemburg (lemburg) Date: 2002-04-17 19:43 Message: Logged In: YES user_id=38388 Replaced by 531901. ---------------------------------------------------------------------- Comment By: Mark Alexander (mwa) Date: 2001-10-02 21:10 Message: Logged In: YES user_id=12810 Regarding script code: The preinstall, postinstall, etc. scripts are hooked into the package manager specific subclasses. It's the responsibility of the specific class to "do the right thing". For *NIX package managers, this is usually script code, although changing the help text to be more informative isn't a problem. More specifically, using python scripts under pkgtool and sdux would fail. Install scripts are not executed, they're sourced (in some weird fashion I've yet to identify). Theoretically, using a shell script to find the python interpreter by querying the package manager and calling it with either -i or a runtime created script should work fine. This is intended as a class for instantiating new bdist commands with full support for pep 241. Current bdist commands do their own thing, and they do it very differently. I'd rather see this put in as a migration path than shut down bdist commands that function just fine on their own. Eventual adoption of a standard abstract base would mean that module authors could provide all metadata in a standard format, and distutils would be able to create binary packages for systems the author doesn't have access to. This works for Solaris pkgtool and HP-UX SDUX.
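The idea sketched in these comments -- supply the metadata and script hooks once under [bdist_packager] and let every subclassed packager command pick them up -- might have looked something like the following setup.cfg fragment. All option names and paths here are illustrative; the patch was never merged in this form, so treat this purely as a sketch of the proposal.

```ini
; Hypothetical setup.cfg fragment for the proposed bdist_packager base class.
[bdist_packager]
revision = 1
preinstall = scripts/preinstall.sh
postinstall = scripts/postinstall.sh
preremove = scripts/preremove.sh
postremove = scripts/postremove.sh

; Per-packager extensions live in the subclass sections, e.g. the
; Solaris command's nine-character package-name abbreviation:
[bdist_pkgtool]
pkg-abrev = PYmypkg
```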
All three patches can be included with ZERO side effects on any other aspect of Distutils. I'm really kind of curious why they're not integrated yet so others can try them out. ---------------------------------------------------------------------- Comment By: david arnold (dja) Date: 2001-09-20 09:08 Message: Logged In: YES user_id=78574 I recently struck a case where I wanted the ability to run a post-install script on Windows (from a bdist_wininst-produced package). While I agree with what seems to be the basic intention of this patch, wouldn't it be more useful to have the various scripts run by the Python interpreter, rather than Bourne shell (which is extremely seldom available on Windows, MacOS, etc) ? I went looking for the source of the .exe file embedded in the wininst command, but couldn't find it. Does anyone know where it lives? ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2001-06-07 05:33 Message: Logged In: YES user_id=21627 Shouldn't the patch also modify the existing bdist commands to use this as a base class? ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=415226&group_id=5470 From noreply@sourceforge.net Wed Apr 17 20:54:35 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Wed, 17 Apr 2002 12:54:35 -0700 Subject: [Patches] [ python-Patches-531901 ] binary packagers Message-ID: Patches item #531901, was opened at 2002-03-19 15:53 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=531901&group_id=5470 Category: Distutils and setup.py Group: None Status: Open Resolution: None Priority: 5 Submitted By: Mark Alexander (mwa) Assigned to: M.-A. Lemburg (lemburg) Summary: binary packagers Initial Comment: zip file with updated Solaris and HP-UX packagers. Replaces 415226, 415227, 415228.
Changes made to take advantage of new PEP241 changes in the Distribution class. ---------------------------------------------------------------------- >Comment By: M.-A. Lemburg (lemburg) Date: 2002-04-17 19:54 Message: Logged In: YES user_id=38388 I will try to check in your latest version into CVS today. The PSF will still require you to sign a contributor agreement for these additions, though, after these have been through the legal review phase. http://www.python.org/psf/psf-contributor-agreement.html Is that acceptable ? Note: I'm still awaiting the documentation for these files. Thanks. ---------------------------------------------------------------------- Comment By: Mark Alexander (mwa) Date: 2002-04-15 18:54 Message: Logged In: YES user_id=12810 New file submitted. No documentation yet, but I am committed to maintaining them. ---------------------------------------------------------------------- Comment By: M.-A. Lemburg (lemburg) Date: 2002-04-11 16:59 Message: Logged In: YES user_id=38388 Mark, could you reupload the ZIP file ? I cannot download it from the SF page (the file is mostly empty). Also, is the documentation already included in the ZIP file ? If not, it would be nice if you could add them as well. I don't require a special PEP for these changes, BTW, but I do require you to maintain them. Thanks. ---------------------------------------------------------------------- Comment By: Mark Alexander (mwa) Date: 2002-03-20 19:55 Message: Logged In: YES user_id=12810 OK, the PEP seems to me to mean most of this is done. These additions are not library modules, they are Distutils "commands". So the way I read it, the Distutils-SIG (where I've been hanging around for some time) are the Maintainers. The documentation will be 2 new chapters for the Distutils manual "Creating Solaris packages" and "Creating HP-UX packages" each looking a whole lot like "Creating RPM packages". Does that clarify anything, or am I still missing a clue? p.s.
Thanks for cleaning up the extra uploads! ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-03-20 15:35 Message: Logged In: YES user_id=21627 You volunteering as the maintainer is part of the prerequisites of accepting new modules, when following PEP 2, see http://python.sourceforge.net/peps/pep-0002.html It says: "developers ... will first form a group of maintainers. Then, this group shall produce a PEP called a library PEP." So existence of a PEP describing these library extensions would be a prerequisite for accepting them. If MAL wants to waive this requirement, it would be fine with me. However, such a PEP could also share text with the documentation, so it might not be wasted effort. ---------------------------------------------------------------------- Comment By: Mark Alexander (mwa) Date: 2002-03-20 14:49 Message: Logged In: YES user_id=12810 Any of the three (they're all the same). SourceForge hiccuped during the upload, and I don't have permission to delete the duplicates. I don't exactly understand what you mean by applying PEP 2. I uploaded this per Marc Lemburg's request for the latest versions of patches 41522[6-8]. He's acting as the integrator in this case (see http://mail.python.org/pipermail/distutils-sig/2001-December/002659.html). I let him know about the duplicate uploads, so hopefully he'll correct it. If you can and want, feel free to delete the 2 of your choice. I agree they need to be documented. As soon as I can, I'll submit changes to the Distutils documentation. Finally, yes, I'll act as maintainer. I'm on the Distutils-sig and as soon as some other poor soul who has to deal with Solaris or HP-UX tries them, I'm there to work out issues. ---------------------------------------------------------------------- Comment By: Martin v.
Löwis (loewis) Date: 2002-03-20 07:35 Message: Logged In: YES user_id=21627 Which of the three attached files is the right one (19633, 19634, or 19635)? Unless they are all needed, we should delete the extra copies. I recommend to apply PEP 2 to this patch: A library PEP is needed (which could be quite short), documentation, perhaps test cases. Most importantly, there must be an identified maintainer of these modules. Are you willing to act as the maintainer? ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=531901&group_id=5470 From noreply@sourceforge.net Wed Apr 17 21:50:07 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Wed, 17 Apr 2002 13:50:07 -0700 Subject: [Patches] [ python-Patches-432401 ] unicode encoding error callbacks Message-ID: Patches item #432401, was opened at 2001-06-12 15:43 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=432401&group_id=5470 Category: Core (C code) Group: None Status: Open Resolution: Postponed Priority: 6 Submitted By: Walter Dörwald (doerwalter) Assigned to: M.-A. Lemburg (lemburg) Summary: unicode encoding error callbacks Initial Comment: This patch adds unicode error handling callbacks to the encode functionality. With this patch it's possible to not only pass 'strict', 'ignore' or 'replace' as the errors argument to encode, but also a callable function, that will be called with the encoding name, the original unicode object and the position of the unencodable character. The callback must return a replacement unicode object that will be encoded instead of the original character. For example replacing unencodable characters with XML character references can be done in the following way. 
u"aäoöuüß".encode( "ascii", lambda enc, uni, pos: u"&#x%x;" % ord(uni[pos]) ) ---------------------------------------------------------------------- >Comment By: Walter Dörwald (doerwalter) Date: 2002-04-17 22:50 Message: Logged In: YES user_id=89016 > About the difference between encoding > and decoding: you shouldn't just look > at the case where you work with Unicode > and strings, e.g. take the rot-13 codec > which works on strings only or other > codecs which translate objects into > strings and vice-versa. unicode.encode encodes to str and str.decode decodes to unicode, even for rot-13: >>> u"gürk".encode("rot13") 't\xfcex' >>> "gürk".decode("rot13") u't\xfcex' >>> u"gürk".decode("rot13") Traceback (most recent call last): File "", line 1, in ? AttributeError: 'unicode' object has no attribute 'decode' >>> "gürk".encode("rot13") Traceback (most recent call last): File "", line 1, in ? File "/home/walter/Python-current-readonly/dist/src/Lib/encodings/rot_13.py", line 18, in encode return codecs.charmap_encode(input,errors,encoding_map) UnicodeError: ASCII decoding error: ordinal not in range(128) Here the str is converted to unicode first, before encode is called, but the conversion to unicode fails. Is there an example where something else happens? > Error handling has to be flexible enough > to handle all these situations. Since > the codecs know best how to handle the > situations, I'd make this an implementation > detail of the codec and leave the > behaviour undefined in the general case. OK, but we should suggest that for encoding unencodable characters are collected and for decoding separate byte sequences that are considered broken by the codec are passed to the callback: i.e. for decoding the handler will never get all broken data in one call, e.g. for "\u30\Uffffffff".decode("unicode-escape") the handler will be called twice (once for "\u30" and "truncated \u escape" as the reason and once for "\Uffffffff" and "illegal character" as the reason.)
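For readers following this thread today: the XML character reference idea from the initial comment is essentially what later shipped as PEP 293, where handlers are registered by name via codecs.register_error and receive the exception object instead of separate arguments. A sketch against that later API (the handler name "xml-charref-demo" is made up for the demo):

```python
import codecs

# A sketch of the XML character-reference callback from the initial
# comment, rewritten against the error-handler registry API that
# eventually shipped (PEP 293): the handler receives the exception
# object and returns (replacement, position to resume at). Note that
# exc.start:exc.end may cover a whole run of unencodable characters.
def xml_charref(exc):
    if not isinstance(exc, UnicodeEncodeError):
        raise exc
    refs = "".join("&#x%x;" % ord(ch) for ch in exc.object[exc.start:exc.end])
    return (refs, exc.end)

codecs.register_error("xml-charref-demo", xml_charref)  # demo name, not standard

print("aäoöuüß".encode("ascii", "xml-charref-demo"))
# prints b'a&#xe4;o&#xf6;u&#xfc;&#xdf;'
```

The built-in "xmlcharrefreplace" handler that grew out of this discussion does the same thing with decimal character references.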
> For the existing codecs, backward > compatibility should be maintained, > if at all possible. If the patch gets > overly complicated because of this, > we may have to provide a downgrade solution > for this particular problem (I don't think > replace is used in any computational context, > though, since you can never be sure how > many replacement character do get > inserted, so the case may not be > that realistic). > > Raising an exception for the charmap codec > is the right way to go, IMHO. I would > consider the current behaviour a bug. OK, this is implemented in PyUnicode_EncodeCharmap now, and collecting unencodable characters works too. I completely changed the implementation, because the stack approach would have gotten much more complicated when unencodable characters are collected. > For new codecs, I think we should > suggest that replace tries to collect > as much illegal data as possible before > invoking the error handler. The handler > should be aware of the fact that it > won't necessarily get all the broken > data in one call. OK for encoders, for decoders see above. > About the codec error handling > registry: You seem to be using a > Unicode specific approach here. > I'd rather like to see a generic > approach which uses the API > we discussed earlier. Would that be possible? The handlers in the registry are all Unicode specific. and they are different for encoding and for decoding. I renamed the function because of your comment from 2001-06-13 10:05 (which becomes exceedingly difficult to find on this long page! ;)). > In that case, the codec API should > probably be called > codecs.register_error('myhandler', myhandler). > > Does that make sense ? We could require that unique names are used for custom handlers, but for the standard handlers we do have name collisions. 
To prevent them, we could either remove them from the registry and require that the codec implements the error handling for those itself, or we could do some fiddling, so that u"üöä".encode("ascii", "replace") becomes u"üöä".encode("ascii", "unicodeencodereplace") behind the scenes. But I think two Unicode-specific registries are much simpler to handle. > BTW, the patch which uses the callback > registry does not seem to be available > on this SF page (the last patch still > converts the errors argument to a > PyObject, which shouldn't be needed > anymore with the new approach). > Can you please upload your > latest version? OK, I'll upload a preliminary version tomorrow. PyUnicode_EncodeDecimal and PyUnicode_TranslateCharmap are still missing, but otherwise the patch seems to be finished. All decoders work and the encoders collect unencodable characters and implement the handling of known callback handler names themselves. As PyUnicode_EncodeDecimal is only used by the int, long, float, and complex constructors, I'd love to get rid of the errors argument, but for completeness' sake, I'll implement the callback functionality. > Note that the highlighting codec > would make a nice example > for the new feature. This could be part of the codec callback test script, which I've started to write. We could kill two birds with one stone here: 1. Test the implementation. 2. Document and advocate what is possible with the patch. Another idea: we could have as an example a decoding handler that relaxes the UTF-8 minimal encoding restriction, e.g. def relaxedutf8(enc, uni, startpos, endpos, reason, data): if uni[startpos:startpos+2] == u"\xc0\x80": return (u"\x00", startpos+2) else: raise UnicodeError(...) ---------------------------------------------------------------------- Comment By: M.-A. Lemburg (lemburg) Date: 2002-04-17 21:40 Message: Logged In: YES user_id=38388 Sorry for the late response.
About the difference between encoding and decoding: you shouldn't just look at the case where you work with Unicode and strings, e.g. take the rot-13 codec which works on strings only or other codecs which translate objects into strings and vice-versa. Error handling has to be flexible enough to handle all these situations. Since the codecs know best how to handle the situations, I'd make this an implementation detail of the codec and leave the behaviour undefined in the general case. For the existing codecs, backward compatibility should be maintained, if at all possible. If the patch gets overly complicated because of this, we may have to provide a downgrade solution for this particular problem (I don't think replace is used in any computational context, though, since you can never be sure how many replacement characters get inserted, so the case may not be that realistic). Raising an exception for the charmap codec is the right way to go, IMHO. I would consider the current behaviour a bug. For new codecs, I think we should suggest that replace tries to collect as much illegal data as possible before invoking the error handler. The handler should be aware of the fact that it won't necessarily get all the broken data in one call. About the codec error handling registry: You seem to be using a Unicode-specific approach here. I'd rather like to see a generic approach which uses the API we discussed earlier. Would that be possible ? In that case, the codec API should probably be called codecs.register_error('myhandler', myhandler). Does that make sense ? BTW, the patch which uses the callback registry does not seem to be available on this SF page (the last patch still converts the errors argument to a PyObject, which shouldn't be needed anymore with the new approach). Can you please upload your latest version ? Note that the highlighting codec would make a nice example for the new feature. Thanks.
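The generic codecs.register_error('myhandler', myhandler) spelling suggested here is the form that ultimately stuck. A minimal sketch of how one registry can serve both directions, with the handler dispatching on the exception type (the name "myhandler-demo" is invented for this example):

```python
import codecs

# One registry shared by encoders and decoders, as Lemburg suggests:
# the handler tells the two cases apart by the exception type and
# returns (replacement, position to resume at).
def myhandler(exc):
    if isinstance(exc, UnicodeEncodeError):
        return ("?" * (exc.end - exc.start), exc.end)  # encoding side
    if isinstance(exc, UnicodeDecodeError):
        return ("\ufffd", exc.end)                     # decoding side
    raise exc

codecs.register_error("myhandler-demo", myhandler)     # demo name, not standard
assert codecs.lookup_error("myhandler-demo") is myhandler

print("ab\u20acc".encode("ascii", "myhandler-demo"))   # prints b'ab?c'
print(b"ab\xffc".decode("ascii", "myhandler-demo"))    # prints 'ab\ufffdc'
```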
---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-04-17 12:21 Message: Logged In: YES user_id=89016 Another note: the patch will change the meaning of charmap encoding slightly: currently "replace" will put a ? into the output, even if ? is not in the mapping, i.e. codecs.charmap_encode(u"c", "replace", {ord("a"): ord("b")}) will return ('?', 1). With the patch the above example will raise an exception. Of course with the patch many more replace characters can appear, so it is vital that the mapping is done for the replacement string too. Is this semantic change OK? (I guess all of the existing codecs have a mapping ord("?")->ord("?")) ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-03-15 18:19 Message: Logged In: YES user_id=89016 So this means that the encoder can collect illegal characters and pass them to the callback. "replace" will replace this with (end-start)*u"?". Decoders don't collect all illegal byte sequences, but call the callback once for every byte sequence that has been found illegal and "replace" will replace it with u"?". Does this make sense? ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-03-15 18:06 Message: Logged In: YES user_id=89016 For encoding it's always (end-start)*u"?": >>> u"ää".encode("ascii", "replace") '??' But for decoding, it is neither of the two: >>> "\Ux\U".decode("unicode-escape", "replace") u'\ufffd\ufffd' i.e. a sequence of 5 illegal characters was replaced by two replacement characters. This might mean that decoders can't collect all the illegal characters and call the callback once. They might have to call the callback for every single illegal byte sequence to get the old behaviour.
(It seems that this patch would be much, much simpler, if we only change the encoders) ---------------------------------------------------------------------- Comment By: M.-A. Lemburg (lemburg) Date: 2002-03-08 19:36 Message: Logged In: YES user_id=38388 Hmm, whatever it takes to maintain backwards compatibility. Do you have an example ? ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-03-08 18:31 Message: Logged In: YES user_id=89016 What should replace do: Return u"?" or (end-start)*u"?" ---------------------------------------------------------------------- Comment By: M.-A. Lemburg (lemburg) Date: 2002-03-08 16:15 Message: Logged In: YES user_id=38388 Sounds like a good idea. Please keep the encoder and decoder APIs symmetric, though, i.e. add the slice information to both APIs. The slice should use the same format as Python's standard slices, that is left inclusive, right exclusive. I like the highlighting feature ! ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-03-08 00:09 Message: Logged In: YES user_id=89016 I'm thinking about extending the API a little bit: Consider the following example: >>> "\u1".decode("unicode-escape") Traceback (most recent call last): File "", line 1, in ? UnicodeError: encoding 'unicodeescape' can't decode byte 0x31 in position 2: truncated \uXXXX escape The error message is a lie: the problem is not the '1' in position 2, but the complete truncated sequence '\u1'. For this the decoder should pass a start and an end position to the handler. For encoding this would be useful too: Suppose I want to have an encoder that colors the unencodable character via ANSI escape sequences. Then I could do the following: >>> import codecs >>> def color(enc, uni, pos, why, sta): ... return (u"\033[1m<%d>\033[0m" % ord(uni[pos]), pos+1) ...
>>> codecs.register_unicodeencodeerrorhandler("color", color) >>> u"aäüöo".encode("ascii", "color") 'a\x1b[1m<228>\x1b[0m\x1b[1m<252>\x1b[0m\x1b[1m<246>\x1b [0mo' But here the sequences "\x1b[0m\x1b[1m" are not needed. To fix this problem the encoder could collect as many unencodable characters as possible and pass those to the error callback in one go (passing a start and end+1 position). This fixes the above problem and reduces the number of calls to the callback, so it should speed up the algorithms in case of custom encoding names. (And it makes the implementation very interesting ;)) What do you think? ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-03-07 02:29 Message: Logged In: YES user_id=89016 I started from scratch, and the current state is this: Encoding mostly works (except that I haven't changed TranslateCharmap and EncodeDecimal yet) and most of the decoding stuff works (DecodeASCII and DecodeCharmap are still unchanged) and the decoding callback helper isn't optimized for the "builtin" names yet (i.e. it still calls the handler). For encoding the callback helper knows how to handle "strict", "replace", "ignore" and "xmlcharrefreplace" itself and won't call the callback. This should make the encoder fast enough. As callback name string comparison results are cached it might even be faster than the original. The patch so far didn't require any changes to unicodeobject.h, stringobject.h or stringobject.c ---------------------------------------------------------------------- Comment By: M.-A. Lemburg (lemburg) Date: 2002-03-05 17:49 Message: Logged In: YES user_id=38388 Walter, are you making any progress on the new scheme we discussed on the mailing list (adding an error handler registry much like the codec registry itself instead of trying to redo the complete codec API) ? ---------------------------------------------------------------------- Comment By: M.-A. 
Lemburg (lemburg) Date: 2001-09-20 12:38 Message: Logged In: YES user_id=38388 I am postponing this patch until the PEP process has started. This feature won't make it into Python 2.2. Walter, you may want to reference this patch in the PEP. ---------------------------------------------------------------------- Comment By: M.-A. Lemburg (lemburg) Date: 2001-08-16 12:53 Message: Logged In: YES user_id=38388 I think we ought to summarize these changes in a PEP to get some more feedback and testing from others as well. I'll look into this after I'm back from vacation on the 10.09. Given the release schedule I am not sure whether this feature will make it into 2.2. The size of the patch is huge and probably needs a lot of testing first. ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2001-07-27 05:55 Message: Logged In: YES user_id=89016 Changing the decoding API is done now. There are new functions codec.register_unicodedecodeerrorhandler and codec.lookup_unicodedecodeerrorhandler. Only the standard handlers for 'strict', 'ignore' and 'replace' are preregistered. There may be many reasons for decoding errors in the byte string, so I added an additional argument to the decoding API: reason, which gives the reason for the failure, e.g.: >>> "\U1111111".decode("unicode_escape") Traceback (most recent call last): File "", line 1, in ? UnicodeError: encoding 'unicodeescape' can't decode byte 0x31 in position 8: truncated \UXXXXXXXX escape >>> "\U11111111".decode("unicode_escape") Traceback (most recent call last): File "", line 1, in ? UnicodeError: encoding 'unicodeescape' can't decode byte 0x31 in position 9: illegal Unicode character For symmetry I added this to the encoding API too: >>> u"\xff".encode("ascii") Traceback (most recent call last): File "", line 1, in ? 
UnicodeError: encoding 'ascii' can't decode byte 0xff in position 0: ordinal not in range(128) The parameters passed to the callbacks now are: encoding, unicode, position, reason, state. The encoding and decoding API for strings has been adapted too, so now the new API should be usable everywhere: >>> unicode("a\xffb\xffc", "ascii", ... lambda enc, uni, pos, rea, sta: (u"", pos+1)) u'abc' >>> "a\xffb\xffc".decode("ascii", ... lambda enc, uni, pos, rea, sta: (u"", pos+1)) u'abc' I had a problem with the decoding API: all the functions in _codecsmodule.c used the t# format specifier. I changed that to O! with &PyString_Type, because otherwise we would have the problem that the decoding API would have to pass buffer objects around instead of strings, and the callback would have to call str() on the buffer anyway to access a specific character, so this wouldn't be any faster than calling str() on the buffer before decoding. It seems that buffers aren't used anyway. I changed all the old functions to call the new ones so bugfixes don't have to be done in two places. There are two exceptions: I didn't change PyString_AsEncodedString and PyString_AsDecodedString because they are documented as deprecated anyway (although they are called in a few spots). This means that I duplicated part of their functionality in PyString_AsEncodedObjectEx and PyString_AsDecodedObjectEx. There are still a few spots that call the old API: E.g. PyString_Format still calls PyUnicode_Decode (but with strict decoding) because it passes the rest of the format string to PyUnicode_Format when it encounters a Unicode object. Should we switch to the new API everywhere even if strict encoding/decoding is used? The size of this patch begins to scare me. I guess we need an extensive test script for all the new features and documentation. I hope you have time to do that, as I'll be busy with other projects in the next weeks. (BTW, I haven't touched PyUnicode_TranslateCharmap yet.)
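The decode examples above map directly onto the registry API that was eventually adopted: the separate (encoding, unicode, position, reason, state) callback parameters became attributes of the exception object handed to the handler. A sketch reproducing the u'abc' result (the handler name "skip-demo" is invented for the example):

```python
import codecs

# The "a\xffb\xffc" -> u'abc' example above, redone with the registry
# API that eventually shipped: the handler gets a UnicodeDecodeError
# carrying exc.encoding, exc.object, exc.start, exc.end and exc.reason.
def skip_bad(exc):
    if isinstance(exc, UnicodeDecodeError):
        return ("", exc.end)  # insert nothing, resume after the bad bytes
    raise exc

codecs.register_error("skip-demo", skip_bad)  # demo name, not standard

print(b"a\xffb\xffc".decode("ascii", "skip-demo"))  # prints 'abc'
```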
---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2001-07-23 19:03 Message: Logged In: YES user_id=89016 New version of the patch with the error handling callback registry. > > OK, done, now there's a > > PyCodec_EscapeReplaceUnicodeEncodeErrors/ > > codecs.escapereplace_unicodeencode_errors > > that uses \u (or \U if x>0xffff (with a wide build > > of Python)). > > Great! Now PyCodec_EscapeReplaceUnicodeEncodeErrors uses \x in addition to \u and \U where appropriate. > > [...] > > But for special one-shot error handlers, it might still be > > useful to pass the error handler directly, so maybe we > > should leave error as PyObject *, but implement the > > registry anyway? > > Good idea ! > > One minor nit: codecs.registerError() should be named > codecs.register_errorhandler() to be more inline with > the Python coding style guide. OK, but these function are specific to unicode encoding, so now the functions are called: codecs.register_unicodeencodeerrorhandler codecs.lookup_unicodeencodeerrorhandler Now all callbacks (including the new ones: "xmlcharrefreplace" and "escapereplace") are registered in the codecs.c/_PyCodecRegistry_Init so using them is really simple: u"gürk".encode("ascii", "xmlcharrefreplace") ---------------------------------------------------------------------- Comment By: M.-A. Lemburg (lemburg) Date: 2001-07-13 13:26 Message: Logged In: YES user_id=38388 > > > > > BTW, I guess PyUnicode_EncodeUnicodeEscape > > > > > could be reimplemented as PyUnicode_EncodeASCII > > > > > with \uxxxx replacement callback. > > > > > > > > Hmm, wouldn't that result in a slowdown ? If so, > > > > I'd rather leave the special encoder in place, > > > > since it is being used a lot in Python and > > > > probably some applications too. > > > > > > It would be a slowdown. But callbacks open many > > > possiblities. 
> > > > True, but in this case I believe that we should stick with > > the native implementation for "unicode-escape". Having > > a standard callback error handler which does the \uXXXX > > replacement would be nice to have though, since this would > > also be usable with lots of other codecs (e.g. all the > > code page ones). > > OK, done, now there's a > PyCodec_EscapeReplaceUnicodeEncodeErrors/ > codecs.escapereplace_unicodeencode_errors > that uses \u (or \U if x>0xffff (with a wide build > of Python)). Great ! > > [...] > > > Should the old TranslateCharmap map to the new > > > TranslateCharmapEx and inherit the > > > "multicharacter replacement" feature, > > > or should I leave it as it is? > > > > If possible, please also add the multichar replacement > > to the old API. I think it is very useful and since the > > old APIs work on raw buffers it would be a benefit to have > > the functionality in the old implementation too. > > OK! I will try to find the time to implement that in the > next days. Good. > > [Decoding error callbacks] > > > > About the return value: > > > > I'd suggest to always use the same tuple interface, e.g. > > > > callback(encoding, input_data, input_position, > state) -> > > (output_to_be_appended, new_input_position) > > > > (I think it's better to use absolute values for the > > position rather than offsets.) > > > > Perhaps the encoding callbacks should use the same > > interface... what do you think ? > > This would make the callback feature hypergeneric and a > little slower, because tuples have to be created, but it > (almost) unifies the encoding and decoding API. ("almost" > because, for the encoder output_to_be_appended will be > reencoded, for the decoder it will simply be appended.), > so I'm for it. That's the point. Note that I don't think the tuple creation will hurt much (see the make_tuple() API in codecs.c) since small tuples are cached by Python internally. 
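The (output_to_be_appended, new_input_position) tuple contract with absolute positions discussed in this exchange is what survived into the final design, including the freedom to resume at an arbitrary absolute position. Walter's relaxed-UTF-8 idea from earlier in the thread, restated against that contract (the handler name is made up for the demo):

```python
import codecs

# Walter's relaxed-UTF-8 sketch (accept the overlong C0 80 encoding of
# NUL), written against the (replacement, new_input_position) contract:
# the returned position is absolute, and the decoder resumes there.
def relaxed_utf8(exc):
    if (isinstance(exc, UnicodeDecodeError)
            and exc.object[exc.start:exc.start + 2] == b"\xc0\x80"):
        return ("\x00", exc.start + 2)  # emit NUL, resume two bytes later
    raise exc

codecs.register_error("relaxed-utf8-demo", relaxed_utf8)  # demo name

print(b"a\xc0\x80b".decode("utf-8", "relaxed-utf8-demo"))  # prints 'a\x00b'
```

Since the position is absolute, a careless handler can also jump backwards and loop forever, which is why the thread debates whether new_input_position > input_position should be enforced.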
> I implemented this and changed the encoders to only > lookup the error handler on the first error. The UCS1 > encoder now no longer uses the two-item stack strategy. > (This strategy only makes sense for those encoder where > the encoding itself is much more complicated than the > looping/callback etc.) So now memory overflow tests are > only done, when an unencodable error occurs, so now the > UCS1 encoder should be as fast as it was without > error callbacks. > > Do we want to enforce new_input_position>input_position, > or should jumping back be allowed? No; moving backwards should be allowed (this may be useful in order to resynchronize with the input data). > Here's is the current todo list: > 1. implement a new TranslateCharmap and fix the old. > 2. New encoding API for string objects too. > 3. Decoding > 4. Documentation > 5. Test cases > > I'm thinking about a different strategy for implementing > callbacks > (see http://mail.python.org/pipermail/i18n-sig/2001- > July/001262.html) > > We coould have a error handler registry, which maps names > to error handlers, then it would be possible to keep the > errors argument as "const char *" instead of "PyObject *". > Currently PyCodec_UnicodeEncodeHandlerForObject is a > backwards compatibility hack that will never go away, > because > it's always more convenient to type > u"...".encode("...", "strict") > instead of > import codecs > u"...".encode("...", codecs.raise_encode_errors) > > But with an error handler registry this function would > become the official lookup method for error handlers. > (PyCodec_LookupUnicodeEncodeErrorHandler?) 
> Python code would look like this: > --- > def xmlreplace(encoding, unicode, pos, state): > return (u"&#%d;" % ord(uni[pos]), pos+1) > > import codec > > codec.registerError("xmlreplace",xmlreplace) > --- > and then the following call can be made: > u"äöü".encode("ascii", "xmlreplace") > As soon as the first error is encountered, the encoder uses > its builtin error handling method if it recognizes the name > ("strict", "replace" or "ignore") or looks up the error > handling function in the registry if it doesn't. In this way > the speed for the backwards compatible features is the same > as before and "const char *error" can be kept as the > parameter to all encoding functions. For speed common error > handling names could even be implemented in the encoder > itself. > > But for special one-shot error handlers, it might still be > useful to pass the error handler directly, so maybe we > should leave error as PyObject *, but implement the > registry anyway? Good idea ! One minor nit: codecs.registerError() should be named codecs.register_errorhandler() to be more inline with the Python coding style guide. ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2001-07-12 13:03 Message: Logged In: YES user_id=89016 > > [...] > > so I guess we could change the replace handler > > to always return u'?'. This would make the > > implementation a little bit simpler, but the > > explanation of the callback feature *a lot* > > simpler. > > Go for it. OK, done! > [...] > > > Could you add these docs to the Misc/unicode.txt > > > file ? I will eventually take that file and turn > > > it into a PEP which will then serve as general > > > documentation for these things. > > > > I could, but first we should work out how the > > decoding callback API will work. > > Ok. BTW, Barry Warsaw already did the work of converting > the unicode.txt to PEP 100, so the docs should eventually > go there. OK. 
I guess it would be best to do this when everything is finished. > > > > BTW, I guess PyUnicode_EncodeUnicodeEscape > > > > could be reimplemented as PyUnicode_EncodeASCII > > > > with \uxxxx replacement callback. > > > > > > Hmm, wouldn't that result in a slowdown ? If so, > > > I'd rather leave the special encoder in place, > > > since it is being used a lot in Python and > > > probably some applications too. > > > > It would be a slowdown. But callbacks open many > > possiblities. > > True, but in this case I believe that we should stick with > the native implementation for "unicode-escape". Having > a standard callback error handler which does the \uXXXX > replacement would be nice to have though, since this would > also be usable with lots of other codecs (e.g. all the > code page ones). OK, done, now there's a PyCodec_EscapeReplaceUnicodeEncodeErrors/ codecs.escapereplace_unicodeencode_errors that uses \u (or \U if x>0xffff (with a wide build of Python)). > > For example: > > > > Why can't I print u"gürk"? > > > > is probably one of the most frequently asked > > questions in comp.lang.python. For printing > > Unicode stuff, print could be extended the use an > > error handling callback for Unicode strings (or > > objects where __str__ or tp_str returns a Unicode > > object) instead of using str() which always > > returns an 8bit string and uses strict encoding. > > There might even be a > > sys.setprintencodehandler()/sys.getprintencodehandler () > > There already is a print callback in Python (forgot the > name of the hook though), so this should be possible by > providing the encoding logic in the hook. True: sys.displayhook > [...] > > Should the old TranslateCharmap map to the new > > TranslateCharmapEx and inherit the > > "multicharacter replacement" feature, > > or should I leave it as it is? > > If possible, please also add the multichar replacement > to the old API. 
I think it is very useful and since the > old APIs work on raw buffers it would be a benefit to have > the functionality in the old implementation too. OK! I will try to find the time to implement that in the next days. > [Decoding error callbacks] > > About the return value: > > I'd suggest to always use the same tuple interface, e.g. > > callback(encoding, input_data, input_position, state) -> > (output_to_be_appended, new_input_position) > > (I think it's better to use absolute values for the > position rather than offsets.) > > Perhaps the encoding callbacks should use the same > interface... what do you think ? This would make the callback feature hypergeneric and a little slower, because tuples have to be created, but it (almost) unifies the encoding and decoding API. ("almost" because, for the encoder output_to_be_appended will be reencoded, for the decoder it will simply be appended.), so I'm for it. I implemented this and changed the encoders to only look up the error handler on the first error. The UCS1 encoder now no longer uses the two-item stack strategy. (This strategy only makes sense for those encoders where the encoding itself is much more complicated than the looping/callback etc.) So now memory overflow tests are only done when an unencodable error occurs, so now the UCS1 encoder should be as fast as it was without error callbacks. Do we want to enforce new_input_position>input_position, or should jumping back be allowed? > > > > One additional note: It is vital that errors > > > > is an assignable attribute of the StreamWriter. > > > > > > It is already ! > > > > I know, but IMHO it should be documented that an > > assignable errors attribute must be supported > > as part of the official codec API. > > > > Misc/unicode.txt is not clear on that: > > """ > > It is not required by the Unicode implementation > > to use these base classes, only the interfaces must > > match; this allows writing Codecs as extension types. > > """ > > Good point.
I'll add that to the PEP 100. OK. Here is the current todo list: 1. implement a new TranslateCharmap and fix the old. 2. New encoding API for string objects too. 3. Decoding 4. Documentation 5. Test cases I'm thinking about a different strategy for implementing callbacks (see http://mail.python.org/pipermail/i18n-sig/2001-July/001262.html) We could have an error handler registry, which maps names to error handlers; then it would be possible to keep the errors argument as "const char *" instead of "PyObject *". Currently PyCodec_UnicodeEncodeHandlerForObject is a backwards compatibility hack that will never go away, because it's always more convenient to type u"...".encode("...", "strict") instead of import codecs u"...".encode("...", codecs.raise_encode_errors) But with an error handler registry this function would become the official lookup method for error handlers. (PyCodec_LookupUnicodeEncodeErrorHandler?) Python code would look like this: --- def xmlreplace(encoding, uni, pos, state): return (u"&#%d;" % ord(uni[pos]), pos+1) import codecs codecs.registerError("xmlreplace", xmlreplace) --- and then the following call can be made: u"äöü".encode("ascii", "xmlreplace") As soon as the first error is encountered, the encoder uses its builtin error handling method if it recognizes the name ("strict", "replace" or "ignore") or looks up the error handling function in the registry if it doesn't. In this way the speed for the backwards compatible features is the same as before and "const char *errors" can be kept as the parameter to all encoding functions. For speed, common error handling names could even be implemented in the encoder itself. But for special one-shot error handlers, it might still be useful to pass the error handler directly, so maybe we should leave errors as PyObject *, but implement the registry anyway? ---------------------------------------------------------------------- Comment By: M.-A. 
Lemburg (lemburg) Date: 2001-07-10 14:29 Message: Logged In: YES user_id=38388 Ok, here we go... > > > raise an exception). U+FFFD characters in the > replacement > > > string will be replaced with a character that the > encoder > > > chooses ('?' in all cases). > > > > Nice. > > But the special casing of U+FFFD makes the interface > somewhat > less clean than it could be. It was only done to be 100% > backwards compatible. With the original "replace" > error > handling the codec chose the replacement character. But as > far as I can tell none of the codecs uses anything other > than '?', True. > so I guess we could change the replace handler > to always return u'?'. This would make the implementation a > little bit simpler, but the explanation of the callback > feature *a lot* simpler. Go for it. > And if you still want to handle > an unencodable U+FFFD, you can write a special callback for > that, e.g. > > def FFFDreplace(enc, uni, pos): > if uni[pos] == "\ufffd": > return u"?" > else: > raise UnicodeError(...) > > > ...docs... > > > > Could you add these docs to the Misc/unicode.txt file ? I > > will eventually take that file and turn it into a PEP > which > > will then serve as general documentation for these things. > > I could, but first we should work out how the decoding > callback API will work. Ok. BTW, Barry Warsaw already did the work of converting the unicode.txt to PEP 100, so the docs should eventually go there. > > > BTW, I guess PyUnicode_EncodeUnicodeEscape could be > > > reimplemented as PyUnicode_EncodeASCII with a \uxxxx > > > replacement callback. > > > > Hmm, wouldn't that result in a slowdown ? If so, I'd > rather > > leave the special encoder in place, since it is being > used a > > lot in Python and probably some applications too. > > It would be a slowdown. But callbacks open many > possiblities. True, but in this case I believe that we should stick with the native implementation for "unicode-escape". 
Having a standard callback error handler which does the \uXXXX replacement would be nice to have though, since this would also be usable with lots of other codecs (e.g. all the code page ones). > For example: > > Why can't I print u"gürk"? > > is probably one of the most frequently asked questions in > comp.lang.python. For printing Unicode stuff, print could be > extended the use an error handling callback for Unicode > strings (or objects where __str__ or tp_str returns a > Unicode object) instead of using str() which always returns > an 8bit string and uses strict encoding. There might even > be a > sys.setprintencodehandler()/sys.getprintencodehandler() There already is a print callback in Python (forgot the name of the hook though), so this should be possible by providing the encoding logic in the hook. > > > I have not touched PyUnicode_TranslateCharmap yet, > > > should this function also support error callbacks? Why > > > would one want the insert None into the mapping to > call > > > the callback? > > > > 1. Yes. > > 2. The user may want to e.g. restrict usage of certain > > character ranges. In this case the codec would be used to > > verify the input and an exception would indeed be useful > > (e.g. say you want to restrict input to Hangul + ASCII). > > OK, do we want TranslateCharmap to work exactly like > encoding, > i.e. in case of an error should the returned replacement > string again be mapped through the translation mapping or > should it be copied to the output directly? The former would > be more in line with encoding, but IMHO the latter would > be much more useful. It's better to take the second approach (copy the callback output directly to the output string) to avoid endless recursion and other pitfalls. I suppose this will also simplify the implementation somewhat. > BTW, when I implement it I can implement patch #403100 > ("Multicharacter replacements in > PyUnicode_TranslateCharmap") > along the way. 
I've seen it; will comment on it later. > Should the old TranslateCharmap map to the new > TranslateCharmapEx > and inherit the "multicharacter replacement" feature, > or > should I leave it as it is? If possible, please also add the multichar replacement to the old API. I think it is very useful and since the old APIs work on raw buffers it would be a benefit to have the functionality in the old implementation too. [Decoding error callbacks] > > > A remaining problem is how to implement decoding error > > > callbacks. In Python 2.1 encoding and decoding errors > are > > > handled in the same way with a string value. But with > > > callbacks it doesn't make sense to use the same > callback > > > for encoding and decoding (like > codecs.StreamReaderWriter > > > and codecs.StreamRecoder do). Decoding callbacks have > a > > > different API. Which arguments should be passed to the > > > decoding callback, and what is the decoding callback > > > supposed to do? > > > > I'd suggest adding another set of PyCodec_UnicodeDecode... > () > > APIs for this. We'd then have to augment the base classes > of > > the StreamCodecs to provide two attributes for .errors > with > > a fallback solution for the string case (i.s. "strict" > can > > still be used for both directions). > > Sounds good. Now what is the decoding callback supposed to > do? > I guess it will be called in the same way as the encoding > callback, i.e. with encoding name, original string and > position of the error. It might returns a Unicode string > (i.e. an object of the decoding target type), that will be > emitted from the codec instead of the one offending byte. Or > it might return a tuple with replacement Unicode object and > a resynchronisation offset, i.e. returning (u"?", 1) > means > emit a '?' and skip the offending character. 
But to make > the offset really useful the callback has to know something > about the encoding, perhaps the codec should be allowed to > pass an additional state object to the callback? > > Maybe the same should be added to the encoding callbacks too? > Maybe the encoding callback should be able to tell the > encoder if the replacement returned should be reencoded > (in which case it's a Unicode object), or directly emitted > (in which case it's an 8bit string)? I like the idea of having an optional state object (basically this should be a codec-defined arbitrary Python object) which then allows the callback to apply additional tricks. The object should be documented to be modifiable in place (simplifies the interface). About the return value: I'd suggest to always use the same tuple interface, e.g. callback(encoding, input_data, input_position, state) -> (output_to_be_appended, new_input_position) (I think it's better to use absolute values for the position rather than offsets.) Perhaps the encoding callbacks should use the same interface... what do you think ? > > > One additional note: It is vital that errors is an > > > assignable attribute of the StreamWriter. > > > > It is already ! > > I know, but IMHO it should be documented that an assignable errors attribute must be supported as part of the official codec API. > > Misc/unicode.txt is not clear on that: > """ > It is not required by the Unicode implementation to use > these base classes, only the interfaces must match; this > allows writing Codecs as extension types. > """ Good point. I'll add that to the PEP 100. ---------------------------------------------------------------------- Comment By: M.-A. Lemburg (lemburg) Date: 2001-06-22 22:51 Message: Logged In: YES user_id=38388 Sorry to keep you waiting, Walter. I will look into this again next week -- this week was way too busy... ---------------------------------------------------------------------- Comment By: M.-A. 
Lemburg (lemburg) Date: 2001-06-13 19:00 Message: Logged In: YES user_id=38388 On your comment about the non-Unicode codecs: let's keep this separated from the current patch. Don't have much time today. I'll comment on the other things tomorrow. ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2001-06-13 17:49 Message: Logged In: YES user_id=89016 Guido van Rossum wrote in python-dev: > True, the "codec" pattern can be used for other > encodings than Unicode. But it seems to me that the > entire codecs architecture is rather strongly geared > towards en/decoding Unicode, and it's not clear > how well other codecs fit in this pattern (e.g. I > noticed that all the non-Unicode codecs ignore the > error handling parameter or assert that > it is set to 'strict'). I noticed that too. asserting that errors=='strict' would mean that the encoder is not able to deal in any other way with unencodable stuff than by raising an error. But that is not the problem here, because for zlib, base64, quopri, hex and uu encoding there can be no unencodable characters. The encoders can simply ignore the errors parameter. Should I remove the asserts from those codecs and change the docstrings accordingly, or will this be done separately? ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2001-06-13 15:57 Message: Logged In: YES user_id=89016 > > [...] > > raise an exception). U+FFFD characters in the replacement > > string will be replaced with a character that the encoder > > chooses ('?' in all cases). > > Nice. But the special casing of U+FFFD makes the interface somewhat less clean than it could be. It was only done to be 100% backwards compatible. With the original "replace" error handling the codec chose the replacement character. 
But as far as I can tell none of the codecs uses anything other than '?', so I guess we could change the replace handler to always return u'?'. This would make the implementation a little bit simpler, but the explanation of the callback feature *a lot* simpler. And if you still want to handle an unencodable U+FFFD, you can write a special callback for that, e.g. def FFFDreplace(enc, uni, pos): if uni[pos] == "\ufffd": return u"?" else: raise UnicodeError(...) > > The implementation of the loop through the string is done > > in the following way. A stack with two strings is kept > > and the loop always encodes a character from the string > > at the stacktop. If an error is encountered and the stack > > has only one entry (during encoding of the original string) > > the callback is called and the unicode object returned is > > pushed on the stack, so the encoding continues with the > > replacement string. If the stack has two entries when an > > error is encountered, the replacement string itself has > > an unencodable character and a normal exception raised. > > When the encoder has reached the end of it's current string > > there are two possibilities: when the stack contains two > > entries, this was the replacement string, so the replacement > > string will be poppep from the stack and encoding continues > > with the next character from the original string. If the > > stack had only one entry, encoding is finished. > > Very elegant solution ! I'll put it as a comment in the source. > > (I hope that's enough explanation of the API and > implementation) > > Could you add these docs to the Misc/unicode.txt file ? I > will eventually take that file and turn it into a PEP which > will then serve as general documentation for these things. I could, but first we should work out how the decoding callback API will work. > > I have renamed the static ...121 function to all lowercase > > names. > > Ok. 
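The two-string stack strategy quoted above can be sketched in pure Python. The helper names (encode_with_callback, can_encode, callback) are hypothetical stand-ins chosen for illustration; the real loop is C code in unicodeobject.c:

```python
def encode_with_callback(text, can_encode, callback):
    # Sketch of the two-string stack loop: the stack holds at most two
    # (string, position) entries -- the original string and, while one is
    # being consumed, a replacement string returned by the callback.
    out = []
    stack = [[text, 0]]
    while stack:
        cur, pos = stack[-1]
        if pos >= len(cur):
            stack.pop()            # replacement done, or encoding finished
            continue
        stack[-1][1] = pos + 1
        ch = cur[pos]
        if can_encode(ch):
            out.append(ch)
        elif len(stack) == 2:
            # An error inside the replacement string itself: raise normally.
            raise UnicodeError("unencodable character in replacement")
        else:
            # Consult the callback and continue with its replacement string.
            stack.append([callback(cur, pos), 0])
    return "".join(out)

print(encode_with_callback("a\xe4b",
                           lambda c: ord(c) < 128,
                           lambda s, p: "&#%d;" % ord(s[p])))  # a&#228;b
```

With an ASCII predicate and an XML-charref callback, the offending character is replaced in place and encoding resumes with the next original character, exactly as the explanation above describes.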
> > > BTW, I guess PyUnicode_EncodeUnicodeEscape could be > > reimplemented as PyUnicode_EncodeASCII with a \uxxxx > > replacement callback. > > Hmm, wouldn't that result in a slowdown ? If so, I'd rather > leave the special encoder in place, since it is being used a > lot in Python and probably some applications too. It would be a slowdown. But callbacks open many possibilities. For example: Why can't I print u"gürk"? is probably one of the most frequently asked questions in comp.lang.python. For printing Unicode stuff, print could be extended to use an error handling callback for Unicode strings (or objects where __str__ or tp_str returns a Unicode object) instead of using str() which always returns an 8bit string and uses strict encoding. There might even be a sys.setprintencodehandler()/sys.getprintencodehandler() > [...] > I think it would be worthwhile to rename the callbacks to > include "Unicode" somewhere, e.g. > PyCodec_UnicodeReplaceEncodeErrors(). It's a long name, but > then it points out the application field of the callback > rather well. Same for the callbacks exposed through the > _codecsmodule. OK, done (and PyCodec_XMLCharRefReplaceUnicodeEncodeErrors really is a long name ;)) > > I have not touched PyUnicode_TranslateCharmap yet, > > should this function also support error callbacks? Why > > would one want to insert None into the mapping to call > > the callback? > > 1. Yes. > 2. The user may want to e.g. restrict usage of certain > character ranges. In this case the codec would be used to > verify the input and an exception would indeed be useful > (e.g. say you want to restrict input to Hangul + ASCII). OK, do we want TranslateCharmap to work exactly like encoding, i.e. in case of an error should the returned replacement string again be mapped through the translation mapping or should it be copied to the output directly? The former would be more in line with encoding, but IMHO the latter would be much more useful. 
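For reference, the handler registry and the XML character-reference callback discussed in this thread are what PEP 293 eventually standardized. In modern Python the registered handler receives the exception object (rather than the proposed (encoding, unicode, pos, state) signature) and returns a (replacement, resume position) tuple, and the behavior of the long-named PyCodec_XMLCharRefReplaceUnicodeEncodeErrors survives as the builtin "xmlcharrefreplace" handler. A sketch (the handler name "xmlreplace" is invented here for illustration):

```python
import codecs

def xmlreplace(exc):
    # PEP 293-style handler: receives the UnicodeEncodeError and returns
    # (replacement string, position at which to resume encoding).
    if not isinstance(exc, UnicodeEncodeError):
        raise exc
    refs = "".join("&#%d;" % ord(c) for c in exc.object[exc.start:exc.end])
    return (refs, exc.end)

codecs.register_error("xmlreplace", xmlreplace)

print("äöü".encode("ascii", "xmlreplace"))         # custom handler
print("äöü".encode("ascii", "xmlcharrefreplace"))  # builtin equivalent
```

Both calls produce b'&#228;&#246;&#252;', matching the u"äöü".encode("ascii", "xmlreplace") example from the registry proposal above.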
BTW, when I implement it I can implement patch #403100 ("Multicharacter replacements in PyUnicode_TranslateCharmap") along the way. Should the old TranslateCharmap map to the new TranslateCharmapEx and inherit the "multicharacter replacement" feature, or should I leave it as it is? > > A remaining problem is how to implement decoding error > > callbacks. In Python 2.1 encoding and decoding errors are > > handled in the same way with a string value. But with > > callbacks it doesn't make sense to use the same callback > > for encoding and decoding (like codecs.StreamReaderWriter > > and codecs.StreamRecoder do). Decoding callbacks have a > > different API. Which arguments should be passed to the > > decoding callback, and what is the decoding callback > > supposed to do? > > I'd suggest adding another set of PyCodec_UnicodeDecode...() > APIs for this. We'd then have to augment the base classes of > the StreamCodecs to provide two attributes for .errors with > a fallback solution for the string case (i.e. "strict" can > still be used for both directions). Sounds good. Now what is the decoding callback supposed to do? I guess it will be called in the same way as the encoding callback, i.e. with encoding name, original string and position of the error. It might return a Unicode string (i.e. an object of the decoding target type) that will be emitted from the codec instead of the one offending byte. Or it might return a tuple with replacement Unicode object and a resynchronisation offset, i.e. returning (u"?", 1) means emit a '?' and skip the offending character. But to make the offset really useful the callback has to know something about the encoding, perhaps the codec should be allowed to pass an additional state object to the callback? Maybe the same should be added to the encoding callbacks too? 
Maybe the encoding callback should be able to tell the encoder if the replacement returned should be reencoded (in which case it's a Unicode object), or directly emitted (in which case it's an 8bit string)? > > One additional note: It is vital that errors is an > > assignable attribute of the StreamWriter. > > It is already ! I know, but IMHO it should be documented that an assignable errors attribute must be supported as part of the official codec API. Misc/unicode.txt is not clear on that: """ It is not required by the Unicode implementation to use these base classes, only the interfaces must match; this allows writing Codecs as extension types. """ ---------------------------------------------------------------------- Comment By: M.-A. Lemburg (lemburg) Date: 2001-06-13 10:05 Message: Logged In: YES user_id=38388 > How the callbacks work: > > A PyObject * named errors is passed in. This may by NULL, > Py_None, 'strict', u'strict', 'ignore', u'ignore', > 'replace', u'replace' or a callable object. > PyCodec_EncodeHandlerForObject maps all of these objects to > one of the three builtin error callbacks > PyCodec_RaiseEncodeErrors (raises an exception), > PyCodec_IgnoreEncodeErrors (returns an empty replacement > string, in effect ignoring the error), > PyCodec_ReplaceEncodeErrors (returns U+FFFD, the Unicode > replacement character to signify to the encoder that it > should choose a suitable replacement character) or directly > returns errors if it is a callable object. When an > unencodable character is encounterd the error handling > callback will be called with the encoding name, the original > unicode object and the error position and must return a > unicode object that will be encoded instead of the offending > character (or the callback may of course raise an > exception). U+FFFD characters in the replacement string will > be replaced with a character that the encoder chooses ('?' > in all cases). Nice. 
> The implementation of the loop through the string is done in > the following way. A stack with two strings is kept and the > loop always encodes a character from the string at the > stacktop. If an error is encountered and the stack has only > one entry (during encoding of the original string) the > callback is called and the unicode object returned is pushed > on the stack, so the encoding continues with the replacement > string. If the stack has two entries when an error is > encountered, the replacement string itself has an > unencodable character and a normal exception raised. When > the encoder has reached the end of it's current string there > are two possibilities: when the stack contains two entries, > this was the replacement string, so the replacement string > will be poppep from the stack and encoding continues with > the next character from the original string. If the stack > had only one entry, encoding is finished. Very elegant solution ! > (I hope that's enough explanation of the API and implementation) Could you add these docs to the Misc/unicode.txt file ? I will eventually take that file and turn it into a PEP which will then serve as general documentation for these things. > I have renamed the static ...121 function to all lowercase > names. Ok. > BTW, I guess PyUnicode_EncodeUnicodeEscape could be > reimplemented as PyUnicode_EncodeASCII with a \uxxxx > replacement callback. Hmm, wouldn't that result in a slowdown ? If so, I'd rather leave the special encoder in place, since it is being used a lot in Python and probably some applications too. 
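As a historical footnote, the \uxxxx replacement callback floated in this exchange did become a standard error handler: modern Python spells it "backslashreplace" (it emits \xXX rather than \uXXXX for code points below 256). A quick sketch:

```python
# "backslashreplace" is the standardized descendant of the \uxxxx
# replacement callback discussed here (Python 3 spelling).
print("g\u00fcrk".encode("ascii", "backslashreplace"))  # b'g\\xfcrk'
print("\u20ac".encode("ascii", "backslashreplace"))     # b'\\u20ac'
```

This answers the "Why can't I print u\"gürk\"?" complaint without any per-codec special casing, which is exactly the generality being argued for above.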
> PyCodec_RaiseEncodeErrors, PyCodec_IgnoreEncodeErrors, > PyCodec_ReplaceEncodeErrors are globally visible because > they have to be available in _codecsmodule.c to wrap them as > Python function objects, but they can't be implemented in > _codecsmodule, because they need to be available to the > encoders in unicodeobject.c (through > PyCodec_EncodeHandlerForObject), but importing the codecs > module might result in an endless recursion, because > importing a module requires unpickling of the bytecode, > which might require decoding utf8, which ... (but this will > only happen, if we implement the same mechanism for the > decoding API) I think that codecs.c is the right place for these APIs. _codecsmodule.c is only meant as Python access wrapper for the internal codecs and nothing more. One thing I noted about the callbacks: they assume that they will always get Unicode objects as input. This is certainly not true in the general case (it is for the codecs you touch in the patch). I think it would be worthwhile to rename the callbacks to include "Unicode" somewhere, e.g. PyCodec_UnicodeReplaceEncodeErrors(). It's a long name, but then it points out the application field of the callback rather well. Same for the callbacks exposed through the _codecsmodule. > I have not touched PyUnicode_TranslateCharmap yet, > should this function also support error callbacks? Why would > one want the insert None into the mapping to call the callback? 1. Yes. 2. The user may want to e.g. restrict usage of certain character ranges. In this case the codec would be used to verify the input and an exception would indeed be useful (e.g. say you want to restrict input to Hangul + ASCII). > A remaining problem is how to implement decoding error > callbacks. In Python 2.1 encoding and decoding errors are > handled in the same way with a string value. 
But with > callbacks it doesn't make sense to use the same callback for > encoding and decoding (like codecs.StreamReaderWriter and > codecs.StreamRecoder do). Decoding callbacks have a > different API. Which arguments should be passed to the > decoding callback, and what is the decoding callback > supposed to do? I'd suggest adding another set of PyCodec_UnicodeDecode...() APIs for this. We'd then have to augment the base classes of the StreamCodecs to provide two attributes for .errors with a fallback solution for the string case (i.e. "strict" can still be used for both directions). > One additional note: It is vital that errors is an > assignable attribute of the StreamWriter. It is already ! > Consider the XML example: For writing an XML DOM tree one > StreamWriter object is used. When a text node is written, > the error handling has to be set to > codecs.xmlreplace_encode_errors, but inside a comment or > processing instruction replacing unencodable characters with > charrefs is not possible, so here codecs.raise_encode_errors > should be used (or better a custom error handler that raises > an error that says "sorry, you can't have unencodable > characters inside a comment") Sure. > BTW, should we continue the discussion in the i18n SIG > mailing list? An email program is much more comfortable than > an HTML textarea! ;) I'd rather keep the discussions on this patch here -- forking it off to the i18n sig will make it very hard to follow up on it. (This HTML area is indeed damn small ;-) ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2001-06-12 21:18 Message: Logged In: YES user_id=89016 One additional note: It is vital that errors is an assignable attribute of the StreamWriter. Consider the XML example: For writing an XML DOM tree one StreamWriter object is used. 
When a text node is written, the error handling has to be set to codecs.xmlreplace_encode_errors, but inside a comment or processing instruction replacing unencodable characters with charrefs is not possible, so here codecs.raise_encode_errors should be used (or better a custom error handler that raises an error that says "sorry, you can't have unencodable characters inside a comment") BTW, should we continue the discussion in the i18n SIG mailing list? An email program is much more comfortable than an HTML textarea! ;) ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2001-06-12 20:59 Message: Logged In: YES user_id=89016 How the callbacks work: A PyObject * named errors is passed in. This may be NULL, Py_None, 'strict', u'strict', 'ignore', u'ignore', 'replace', u'replace' or a callable object. PyCodec_EncodeHandlerForObject maps all of these objects to one of the three builtin error callbacks PyCodec_RaiseEncodeErrors (raises an exception), PyCodec_IgnoreEncodeErrors (returns an empty replacement string, in effect ignoring the error), PyCodec_ReplaceEncodeErrors (returns U+FFFD, the Unicode replacement character to signify to the encoder that it should choose a suitable replacement character) or directly returns errors if it is a callable object. When an unencodable character is encountered the error handling callback will be called with the encoding name, the original unicode object and the error position and must return a unicode object that will be encoded instead of the offending character (or the callback may of course raise an exception). U+FFFD characters in the replacement string will be replaced with a character that the encoder chooses ('?' in all cases). The implementation of the loop through the string is done in the following way. A stack with two strings is kept and the loop always encodes a character from the string at the stacktop. 
If an error is encountered and the stack has only one entry (during encoding of the original string) the callback is called and the unicode object returned is pushed on the stack, so the encoding continues with the replacement string. If the stack has two entries when an error is encountered, the replacement string itself has an unencodable character and a normal exception is raised. When the encoder has reached the end of its current string there are two possibilities: when the stack contains two entries, this was the replacement string, so the replacement string will be popped from the stack and encoding continues with the next character from the original string. If the stack had only one entry, encoding is finished. (I hope that's enough explanation of the API and implementation) I have renamed the static ...121 function to all lowercase names. BTW, I guess PyUnicode_EncodeUnicodeEscape could be reimplemented as PyUnicode_EncodeASCII with a \uxxxx replacement callback. PyCodec_RaiseEncodeErrors, PyCodec_IgnoreEncodeErrors, PyCodec_ReplaceEncodeErrors are globally visible because they have to be available in _codecsmodule.c to wrap them as Python function objects, but they can't be implemented in _codecsmodule, because they need to be available to the encoders in unicodeobject.c (through PyCodec_EncodeHandlerForObject), but importing the codecs module might result in an endless recursion, because importing a module requires unpickling of the bytecode, which might require decoding utf8, which ... (but this will only happen if we implement the same mechanism for the decoding API) I have not touched PyUnicode_TranslateCharmap yet, should this function also support error callbacks? Why would one want to insert None into the mapping to call the callback? A remaining problem is how to implement decoding error callbacks. In Python 2.1 encoding and decoding errors are handled in the same way with a string value. 
But with callbacks it doesn't make sense to use the same callback for encoding and decoding (like codecs.StreamReaderWriter and codecs.StreamRecoder do). Decoding callbacks have a different API. Which arguments should be passed to the decoding callback, and what is the decoding callback supposed to do? ---------------------------------------------------------------------- Comment By: M.-A. Lemburg (lemburg) Date: 2001-06-12 20:00 Message: Logged In: YES user_id=38388 About the Py_UNICODE*data, int size APIs: Ok, point taken. In general, I think we ought to keep the callback feature as open as possible, so passing in pointers and sizes would not be very useful. BTW, could you summarize how the callback works in a few lines ? About _Encode121: I'd name this _EncodeUCS1 since that's what it is ;-) About the new functions: I was referring to the new static functions which you gave PyUnicode_... names. If these are not supposed to turn into non-static functions, I'd rather have them use lower case names (since that's how the Python internals work too -- most of the times). ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2001-06-12 18:56 Message: Logged In: YES user_id=89016 > One thing which I don't like about your API change is that > you removed the Py_UNICODE*data, int size style arguments > -- > this makes it impossible to use the new APIs on non-Python > data or data which is not available as Unicode object. Another problem is, that the callback requires a Python object, so in the PyObject *version, the refcount is incref'd and the object is passed to the callback. The Py_UNICODE*/int version would have to create a new Unicode object from the data. 
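The decoding-callback API being debated here is essentially what was later standardized in PEP 293: the decoding handler receives a UnicodeDecodeError and returns a (replacement, resume position) tuple, exactly the (u"?", 1)-style idea from this thread, but with an absolute position instead of an offset. A sketch in modern Python (the handler name "skipone" is invented here):

```python
import codecs

def skip_one(exc):
    # Decoding handler: emit '?' and skip a single offending byte,
    # i.e. resume at exc.start + 1.
    if not isinstance(exc, UnicodeDecodeError):
        raise exc
    return ("?", exc.start + 1)

codecs.register_error("skipone", skip_one)

print(b"a\xffbc".decode("ascii", "skipone"))  # a?bc
```

Because the handler sees the whole input and an absolute error position, it can resynchronize however the encoding requires, which addresses the "the callback has to know something about the encoding" concern raised above.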
---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2001-06-12 18:32 Message: Logged In: YES user_id=89016 > * please don't place more than one C statement on one line > like in: > """ > + unicode = unicode2; unicodepos = > unicode2pos; > + unicode2 = NULL; unicode2pos = 0; > """ OK, done! > * Comments should start with a capital letter and be > prepended > to the section they apply to Fixed! > * There should be spaces between arguments in compares > (a == b) not (a==b) Fixed! > * Where does the name "...Encode121" originate ? "Encode one-to-one"; it implements both ASCII and latin-1 encoding. > * module internal APIs should use lower case names (you > converted some of these to PyUnicode_...() -- this is > normally reserved for APIs which are either marked as > potential candidates for the public API or are very > prominent in the code) Which ones? I introduced a new function for every old one that had a "const char *errors" argument, and a few new ones in codecs.h. Of those, PyCodec_EncodeHandlerForObject is vital, because it is used to map old string arguments to the new function objects. PyCodec_RaiseEncodeErrors can be used in the encoder implementation to raise an encode error, but it could be made static in unicodeobject.h so only those encoders implemented there have access to it. > One thing which I don't like about your API change is that > you removed the Py_UNICODE*data, int size style arguments > -- > this makes it impossible to use the new APIs on non-Python > data or data which is not available as Unicode object. I looked through the code and found no situation where the Py_UNICODE*/int version is really used, and having two (PyObject *)s (the original and the replacement string) instead of Py_UNICODE*/int and PyObject * made the implementation a little easier, but I can fix that. > Please separate the errors.c patch from this patch -- it > seems totally unrelated to Unicode. 
PyCodec_RaiseEncodeErrors uses this to have a \Uxxxx with four hex digits. I removed it. I'll upload a revised patch as soon as it's done. ---------------------------------------------------------------------- Comment By: M.-A. Lemburg (lemburg) Date: 2001-06-12 16:29 Message: Logged In: YES user_id=38388 Thanks for the patch -- it looks very impressive! I'll give it a try later this week. Some first cosmetic tidbits: * please don't place more than one C statement on one line like in: """ + unicode = unicode2; unicodepos = unicode2pos; + unicode2 = NULL; unicode2pos = 0; """ * Comments should start with a capital letter and be prepended to the section they apply to * There should be spaces between arguments in compares (a == b) not (a==b) * Where does the name "...Encode121" originate? * module internal APIs should use lower case names (you converted some of these to PyUnicode_...() -- this is normally reserved for APIs which are either marked as potential candidates for the public API or are very prominent in the code) One thing which I don't like about your API change is that you removed the Py_UNICODE*data, int size style arguments -- this makes it impossible to use the new APIs on non-Python data or data which is not available as a Unicode object. Please separate the errors.c patch from this patch -- it seems totally unrelated to Unicode. Thanks. 
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=432401&group_id=5470 From noreply@sourceforge.net Wed Apr 17 21:50:56 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Wed, 17 Apr 2002 13:50:56 -0700 Subject: [Patches] [ python-Patches-536241 ] string.zfill and unicode Message-ID: Patches item #536241, was opened at 2002-03-28 08:26 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=536241&group_id=5470 Category: Library (Lib) Group: None Status: Open Resolution: Accepted Priority: 5 Submitted By: Walter Dörwald (doerwalter) Assigned to: Walter Dörwald (doerwalter) Summary: string.zfill and unicode Initial Comment: This patch makes the function string.zfill work with unicode instances (and instances of str and unicode subclasses). Currently string.zfill(u"123", 10) results in "0000u'123'". With this patch the result is u'0000000123'. Should zfill be made a real str and unicode method? I noticed that a zfill implementation is available in unicodeobject.c, but commented out. ---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-17 16:50 Message: Logged In: YES user_id=6380 The test seems fine, and a good addition. Don't worry too much about how to report the failure (though perhaps including the key word "subtype" in the error output might help). I noticed that when I change the Unicode function fixup() to not do a check for subclasses, I only get very few failures: one for capitalize, two for lower, one for upper. I think this is because the test suite doesn't have enough sample cases where the output is the same as the input. Maybe some could be added. But go ahead and check in diff3.txt. 
---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-04-17 14:55 Message: Logged In: YES user_id=89016 Diff3.txt adds these tests to Lib/test/test_unicode.py and Lib/test/test_string.py. All tests pass (except that currently test_unicode.py fails the unicode_internal roundtripping test with --enable-unicode=ucs4) and when I change zfill back to always return self they properly fail. I don't know whether the fail message should be made better, and how this would interact with "make test" and whether the "Prefer string methods over string module functions" part in test_string.py might pose problems. And maybe the code could be simplified to always use the subclasses without first trying str and unicode? ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-15 14:48 Message: Logged In: YES user_id=6380 If you want to be thorough, yes, that's a good test to add! ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-04-15 14:47 Message: Logged In: YES user_id=89016 Checked in as: Objects/stringobject.c 2.159 Objects/unicodeobject.c 2.139 Maybe we could add a test to Lib/test/test_unicode.py and Lib/test/test_string.py that makes sure that no method returns a str/unicode subinstance even when called for a str/unicode subinstance? ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-15 14:29 Message: Logged In: YES user_id=6380 Yes, that's the right thing. Reopened this for now. ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-04-15 14:23 Message: Logged In: YES user_id=89016 Currently zfill returns the original if nothing has to be done. Should I change this to only do it if it's a real str or unicode instance? 
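The subtype guarantee being tested here — that a string method called on a str subclass returns a plain str, never the subclass — can be exercised in a few lines (modern Python shown; the 2.x tests did the same with unicode subclasses, and the subclass name below is hypothetical):

```python
class MyStr(str):
    """A hypothetical str subclass, standing in for the test suite's subtypes."""
    pass

s = MyStr("123")
padded = s.zfill(10)

# zfill pads on the left with zeros to the requested width...
print(padded)               # 0000000123
# ...and returns a plain str even when called on a subclass instance,
# which is exactly what the subtype check verifies.
print(type(padded) is str)  # True
```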
(as was done for lots of methods for bug http://www.python.org/sf/460020) ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-15 10:47 Message: Logged In: YES user_id=6380 Yes, please open a separate bug report for those (I'd open a separate report for each file with warnings, unless you have an obvious fix). ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-04-15 10:43 Message: Logged In: YES user_id=89016 > Does your compiler not warn you? Or did > you ignore warnings? > (The latter's a sin in Python-land :-). The warning was just lost in the long list of outputs. Now that you mention it, there are still a few warnings (gcc 2.96 on Linux): Objects/unicodeobject.c: In function `PyUnicodeUCS4_Format': Objects/unicodeobject.c:5574: warning: int format, long int arg (arg 3) Objects/unicodeobject.c:5574: warning: unsigned int format, long unsigned int arg (arg 4) libpython2.3.a(posixmodule.o): In function `posix_tmpnam': Modules/posixmodule.c:5150: the use of `tmpnam_r' is dangerous, better use `mkstemp' libpython2.3.a(posixmodule.o): In function `posix_tempnam': Modules/posixmodule.c:5100: the use of `tempnam' is dangerous, better use `mkstemp' Modules/pwdmodule.c: In function `initpwd': Modules/pwdmodule.c:161: warning: unused variable `d' Modules/readline.c: In function `set_completer_delims': Modules/readline.c:273: warning: passing arg 1 of `free' discards qualifiers from pointer target type Modules/expat/xmlrole.c:7: warning: `RCSId' defined but not used Should I open a separate bug report for that? > I've also folded some long lines that weren't > your fault -- but I noticed that elsewhere you > checked in some long lines; > please try to limit line length to 78. I noticed your descrobject.c checkin message. 
---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-15 09:53 Message: Logged In: YES user_id=6380 Thanks, Walter! Some nits: The string_zfill() code you checked in caused two warnings about modifying data pointed to by a const pointer. I've removed the const, but I'd like to understand how come you didn't catch this. Does your compiler not warn you? Or did you ignore warnings? (The latter's a sin in Python-land :-). I've also folded some long lines that weren't your fault -- but I noticed that elsewhere you checked in some long lines; please try to limit line length to 78. ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-04-15 09:41 Message: Logged In: YES user_id=89016 Checked in as: Doc/lib/libstdtypes.tex 1.88 Lib/UserString.py 1.12 Lib/string.py 1.63 test/string_tests.py 1.13 test/test_unicode.py 1.54 Misc/NEWS 1.388 Objects/stringobject.c 2.157 Objects/unicodeobject.c 2.138 ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-12 21:00 Message: Logged In: YES user_id=6380 I'm for making them methods. Walter, just check it in! ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-04-12 14:37 Message: Logged In: YES user_id=89016 Now that test_userstring.py works and fails (rev 1.6) should we add zfill as str and unicode methods or change UserString.zfill to use string.zfill? I've made a patch (attached) that implements zfill as methods (i.e. activates the version in unicodeobject.c that was commented out and implements the same in stringobject.c) (And it adds the test for unicode support back in.) ---------------------------------------------------------------------- Comment By: Martin v. 
Löwis (loewis) Date: 2002-04-12 10:51 Message: Logged In: YES user_id=21627 Re: optional Unicode: Walter is correct; configuring with --disable-unicode currently breaks the string module. One might consider using types.StringTypes; OTOH, pulling in types might not be desirable. As for str vs. repr: Python was always using repr in zfill, so changing it may break things. So I recommend that Walter reverts Andrew's check-in and applies his change. ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-03-30 06:25 Message: Logged In: YES user_id=6656 Hah, I was going to say that but was distracted by IE wiping out the machine I'm sitting at. Re-opening. ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-03-30 06:16 Message: Logged In: YES user_id=89016 But Python could be compiled without unicode support (by undefining PY_USING_UNICODE), and string.zfill should work even in this case. What about making zfill a real str and unicode method? ---------------------------------------------------------------------- Comment By: A.M. Kuchling (akuchling) Date: 2002-03-29 11:24 Message: Logged In: YES user_id=11375 Thanks for your patch! I've checked it into CVS, with two modifications. First, I removed the code to handle the case where Python doesn't have a unicode() built-in; there's no expectation that you can take the standard library for Python version N and use it with version N-1, so this code isn't needed. Second, I changed string.zfill() to take the str() and not the repr() when it gets a non-string object because that seems to make more sense. 
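The str()-vs-repr() point is easy to see in a pure-Python sketch of what string.zfill does (hypothetical code, not the patch itself): repr() of a string object adds quote characters, which is where "0000u'123'" came from, while str() yields the text itself.

```python
def zfill(x, width):
    # Hypothetical pure-Python equivalent of string.zfill: coerce
    # non-string inputs with str() -- not repr(), which would wrap a
    # string object in quotes -- then pad on the left with zeros,
    # keeping any leading sign in front of the padding.
    if not isinstance(x, str):
        x = str(x)
    sign = ""
    if x[:1] in ("+", "-"):
        sign, x = x[:1], x[1:]
    return sign + x.rjust(width - len(sign), "0")

print(zfill("123", 6))  # 000123
print(zfill(-5, 4))     # -005
```

With repr() in place of str(), a unicode argument in Python 2 was first turned into the nine characters u'123' and then padded, producing the broken "0000u'123'" result the initial comment describes.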
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=536241&group_id=5470 From noreply@sourceforge.net Wed Apr 17 22:35:50 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Wed, 17 Apr 2002 14:35:50 -0700 Subject: [Patches] [ python-Patches-536241 ] string.zfill and unicode Message-ID: Patches item #536241, was opened at 2002-03-28 14:26 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=536241&group_id=5470 Category: Library (Lib) Group: None >Status: Closed Resolution: Accepted Priority: 5 Submitted By: Walter Dörwald (doerwalter) Assigned to: Walter Dörwald (doerwalter) Summary: string.zfill and unicode Initial Comment: This patch makes the function string.zfill work with unicode instances (and instances of str and unicode subclasses). Currently string.zfill(u"123", 10) results in "0000u'123'". With this patch the result is u'0000000123'. Should zfill be made a real str and unicode method? I noticed that a zfill implementation is available in unicodeobject.c, but commented out. ---------------------------------------------------------------------- >Comment By: Walter Dörwald (doerwalter) Date: 2002-04-17 23:35 Message: Logged In: YES user_id=89016 Checked in as: Lib/test/test_string.py 1.16 Lib/test/test_unicode.py 1.56 ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-17 22:50 Message: Logged In: YES user_id=6380 The test seems fine, and a good addition. Don't worry too much about how to report the failure (though perhaps including the key word "subtype" in the error output might help). I noticed that when I change the Unicode function fixup() to not do a check for subclasses, I only get very few failures: one for capitalize, two for lower, one for upper. 
I think this is because the test suite doesn't have enough sample cases where the output is the same as the input. Maybe some could be added. But go ahead and check in diff3.txt. ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-04-17 20:55 Message: Logged In: YES user_id=89016 Diff3.txt adds these tests to Lib/test/test_unicode.py and Lib/test/test_string.py. All tests pass (except that currently test_unicode.py fails the unicode_internal roundtripping test with --enable-unicode=ucs4) and when I change zfill back to always return self they properly fail. I don't know whether the fail message should be made better, and how this would interact with "make test" and whether the "Prefer string methods over string module functions" part in test_string.py might pose problems. And maybe the code could be simplified to always use the subclasses without first trying str and unicode? ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-15 20:48 Message: Logged In: YES user_id=6380 If you want to be thorough, yes, that's a good test to add! ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-04-15 20:47 Message: Logged In: YES user_id=89016 Checked in as: Objects/stringobject.c 2.159 Objects/unicodeobject.c 2.139 Maybe we could add a test to Lib/test/test_unicode.py and Lib/test/test_string.py that makes sure that no method returns a str/unicode subinstance even when called for a str/unicode subinstance? ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-15 20:29 Message: Logged In: YES user_id=6380 Yes, that's the right thing. Reopened this for now. 
---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-04-15 20:23 Message: Logged In: YES user_id=89016 Currently zfill returns the original if nothing has to be done. Should I change this to only do it if it's a real str or unicode instance? (as was done for lots of methods for bug http://www.python.org/sf/460020) ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-15 16:47 Message: Logged In: YES user_id=6380 Yes, please open a separate bug report for those (I'd open a separate report for each file with warnings, unless you have an obvious fix). ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-04-15 16:43 Message: Logged In: YES user_id=89016 > Does your compiler not warn you? Or did > you ignore warnings? > (The latter's a sin in Python-land :-). The warning was just lost in the long list of outputs. Now that you mention it, there are still a few warnings (gcc 2.96 on Linux): Objects/unicodeobject.c: In function `PyUnicodeUCS4_Format': Objects/unicodeobject.c:5574: warning: int format, long int arg (arg 3) Objects/unicodeobject.c:5574: warning: unsigned int format, long unsigned int arg (arg 4) libpython2.3.a(posixmodule.o): In function `posix_tmpnam': Modules/posixmodule.c:5150: the use of `tmpnam_r' is dangerous, better use `mkstemp' libpython2.3.a(posixmodule.o): In function `posix_tempnam': Modules/posixmodule.c:5100: the use of `tempnam' is dangerous, better use `mkstemp' Modules/pwdmodule.c: In function `initpwd': Modules/pwdmodule.c:161: warning: unused variable `d' Modules/readline.c: In function `set_completer_delims': Modules/readline.c:273: warning: passing arg 1 of `free' discards qualifiers from pointer target type Modules/expat/xmlrole.c:7: warning: `RCSId' defined but not used Should I open a separate bug report for that? 
> I've also folded some long lines that weren't > your fault -- but I noticed that elsewhere you > checked in some long lines; > please try to limit line length to 78. I noticed your descrobject.c checkin message. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-15 15:53 Message: Logged In: YES user_id=6380 Thanks, Walter! Some nits: The string_zfill() code you checked in caused two warnings about modifying data pointed to by a const pointer. I've removed the const, but I'd like to understand how come you didn't catch this. Does your compiler not warn you? Or did you ignore warnings? (The latter's a sin in Python-land :-). I've also folded some long lines that weren't your fault -- but I noticed that elsewhere you checked in some long lines; please try to limit line length to 78. ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-04-15 15:41 Message: Logged In: YES user_id=89016 Checked in as: Doc/lib/libstdtypes.tex 1.88 Lib/UserString.py 1.12 Lib/string.py 1.63 test/string_tests.py 1.13 test/test_unicode.py 1.54 Misc/NEWS 1.388 Objects/stringobject.c 2.157 Objects/unicodeobject.c 2.138 ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-13 03:00 Message: Logged In: YES user_id=6380 I'm for making them methods. Walter, just check it in! ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-04-12 20:37 Message: Logged In: YES user_id=89016 Now that test_userstring.py works and fails (rev 1.6) should we add zfill as str and unicode methods or change UserString.zfill to use string.zfill? I've made a patch (attached) that implements zfill as methods (i.e. 
activates the version in unicodeobject.c that was commented out and implements the same in stringobject.c) (And it adds the test for unicode support back in.) ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-04-12 16:51 Message: Logged In: YES user_id=21627 Re: optional Unicode: Walter is correct; configuring with --disable-unicode currently breaks the string module. One might consider using types.StringTypes; OTOH, pulling in types might not be desirable. As for str vs. repr: Python was always using repr in zfill, so changing it may break things. So I recommend that Walter reverts Andrew's check-in and applies his change. ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-03-30 12:25 Message: Logged In: YES user_id=6656 Hah, I was going to say that but was distracted by IE wiping out the machine I'm sitting at. Re-opening. ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-03-30 12:16 Message: Logged In: YES user_id=89016 But Python could be compiled without unicode support (by undefining PY_USING_UNICODE), and string.zfill should work even in this case. What about making zfill a real str and unicode method? ---------------------------------------------------------------------- Comment By: A.M. Kuchling (akuchling) Date: 2002-03-29 17:24 Message: Logged In: YES user_id=11375 Thanks for your patch! I've checked it into CVS, with two modifications. First, I removed the code to handle the case where Python doesn't have a unicode() built-in; there's no expectation that you can take the standard library for Python version N and use it with version N-1, so this code isn't needed. Second, I changed string.zfill() to take the str() and not the repr() when it gets a non-string object because that seems to make more sense. 
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=536241&group_id=5470 From noreply@sourceforge.net Thu Apr 18 01:31:08 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Wed, 17 Apr 2002 17:31:08 -0700 Subject: [Patches] [ python-Patches-545439 ] interactive help in python-mode Message-ID: Patches item #545439, was opened at 2002-04-17 19:31 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=545439&group_id=5470 Category: Demos and tools Group: None Status: Open Resolution: None Priority: 5 Submitted By: Skip Montanaro (montanaro) Assigned to: Barry Warsaw (bwarsaw) Summary: interactive help in python-mode Initial Comment: If you apply the patch from bug 545436 to python-mode.el, the attached code allows programmers to get help from pydoc about the current possibly dotted expression. This is just a quick-n-dirty hack, but seems at least marginally useful. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=545439&group_id=5470 From noreply@sourceforge.net Thu Apr 18 03:19:03 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Wed, 17 Apr 2002 19:19:03 -0700 Subject: [Patches] [ python-Patches-540394 ] Remove PyMalloc_* symbols Message-ID: Patches item #540394, was opened at 2002-04-07 01:07 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=540394&group_id=5470 Category: Core (C code) Group: None >Status: Closed Resolution: Accepted Priority: 5 Submitted By: Neil Schemenauer (nascheme) Assigned to: Neil Schemenauer (nascheme) Summary: Remove PyMalloc_* symbols Initial Comment: This patch removes all PyMalloc_* symbols from the source. obmalloc now implements PyObject_{Malloc, Realloc, Free}. PyObject_{New,NewVar} allocate using pymalloc. 
I also changed PyObject_Del and PyObject_GC_Del so that they can be used as function designators. Is changing the signature of PyObject_Del going to cause any problems? I had to add some extra typecasts when assigning to tp_free. Please review and assign back to me. The next phase would be to clean up the memory API usage. Do we want to replace all PyObject_Del calls with PyObject_Free? PyObject_Del seems to match better with PyObject_GC_Del. Oh yes, we also need to change PyMem_{Free, Del, ...} to use pymalloc's free. ---------------------------------------------------------------------- >Comment By: Neil Schemenauer (nascheme) Date: 2002-04-18 02:19 Message: Logged In: YES user_id=35752 A modified version of the patch has been committed. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-10 00:53 Message: Logged In: YES user_id=6380 The binary compatibility issue is extensions compiled for 2.2 that have references to _PyObject_Del compiled into them and aren't recompiled for 2.3. I think that should work (even if they get a warning). To make it work, the _PyObject_Del entry point must continue to exist. Back to Neil, I think my instructions are clear enough. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-04-09 20:43 Message: Logged In: YES user_id=31435 It'll be a day or two before PLabs can get back to Python work anyway. Reassigning to Guido -- I'm not even going to try to channel him on backwards compatibility, or the feasibility of introducing possible warnings. If I were you I'd check in the patch with the casts in; they can be taken out again later if Guido is agreeable. ---------------------------------------------------------------------- Comment By: Neil Schemenauer (nascheme) Date: 2002-04-09 20:29 Message: Logged In: YES user_id=35752 It might be a day or two before I get to this. 
Regarding the type of tp_free, could we change it to be something like: typedef void (*freefunc)(void *); ... freefunc tp_free; and leave the type of tp_dealloc alone. Maybe it's too late now that 2.2 is out and uses 'destructor'. I don't see how this relates to binary compatibility though. Why does it matter if the function takes a PyObject pointer or a void pointer? The worst I see happening is that people could get warnings when they compile their extension modules. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-04-09 18:47 Message: Logged In: YES user_id=31435 Clarifying or just repeating Guido here: + Binary compatibility is important. It's better on Unix than it appears -- while you'll get a warning if you run an old 1.5.2 extension with 2.2 today and without recompiling, it will almost certainly work anyway. So in the case of macros that expanded to a private API function before, that private API function must still exist, but the macro needn't expand to that anymore (nor even *be* a macro anymore). _PyObject_Del is a particular problem cuz it's even documented in the C API manual -- there simply wasn't a public API function before that did the same thing and could be used as a function designator. You're making life better for future generations. + Casts on tp_free slots are par for the course, because "destructor" has an impractical signature. I'm afraid that can't change either, so the casts stay. + Fred and I agreed to add PyObject_Del to the "minimal recommended API", so, for the next round of this, feel wholly righteous in leaving existing PyObject_Del calls alone. If anything's unclear, hit me. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-09 16:27 Message: Logged In: YES user_id=6380 I've not fully read Tim's response in email, but instead I've reviewed and discussed the patch with Tim. 
I think the only thing to which I object at this point is the removal of the entry point _PyObject_Del. I believe that for source and binary compatibility with 2.2, that entry point should remain, with the same meaning, but it should not be used at all by the core. (Motivation to keep it: it's the only thing you can reasonably stick in tp_free that works for 2.2 as well as for 2.3.) One minor question: there are a bunch of #undefs in gcmodule.c (e.g. PyObject_GC_Track) that don't seem to make sense -- at least I cannot find where these would be #defined any more. Ditto for #indef PyObject_Malloc in obmalloc.c. I suggest that you check this thing in, but keeping _PyObject_Del alive, and we'll take it from there. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-08 19:18 Message: Logged In: YES user_id=6380 (Wouldn't it be more efficient to take this to email between the three of us?) > Extensions that *currently* call PyObject_Del have > its old macro expansion ("_PyObject_Del((PyObject > *)(op))") buried in them, so getting rid of > _PyObject_Del is a binary-API incompatibility > (existing extensions will no longer link without > recompilation). I personally don't mind that, but > I run on Windows and "binary compatability" never > works there across minor releases for other > reasons, so I don't have any real feel for how > much people on other platforms value it. As you > pointed out recently too, binary compatability > has, in reality, not been the case since 1.5.2 > anyway. Still, tradition has it that we keep such entry points around for a long time. I propose that we do so now, too. > So that's one for Python-Dev. If we do break > binary compatibility, I'd be sorely tempted to > change the "destructor" typedef to say destructors > take void*. IMO saying they take PyObject* was a > poor idea, as you almost never have a PyObject* > when calling one of these guys. Huh? 
"destructor" is used to declare tp_dealloc, which definitely needs a PyObject * (or some "subclass" of it, like PyIntObject *). It's also used to declare tp_free, which arguably shouldn't take a PyObject * (since by the time tp_free is called, most of the object's contents have been destroyed by tp_dealloc). So maybe tp_free (a newcomer in 2.2) should be declared to take something else, but then the risk is breaking code that defines a tp_free with the correct signature. > That's why PyObject_Del "had to" be a macro, to > hide the cast to PyObject* almost everyone needs > because of destructor's "correct" but impractical > signature. If "destructor" had a practical > signature, there would have been no temptation to > use a macro. I don't understand this at all. > Note that if the typedef of destructor were so > changed, you wouldn't have needed new casts in > tp_free slots. And I'd rather break binary > compatability than make extension authors add new > casts. Nor this. > Hmm. I'm assigning this to Guido for comment: > Guido, what are your feelings about binary > compatibility here? C didn't define free() as > taking a void* by mistake . I want binary compatibility, but I don't understand your comments very well. > Back to Neil: I wouldn't bother changing PyObject_Del > to PyObject_Free. The former isn't in the > "recommended" minimal API, but neither is it > discouraged. I expect TMTOWTDI here forever. I prefer PyObject_Del -- like PyObject_GC_Del, and like we did in the past. Plus, I like New to match Del and Malloc to match Free. Since it's PyObject_New, it should be _Del. I'm not sure what to say of Neil's patch, except that I'm glad to be rid of the PyMalloc_XXX family. I wish we didn't have to change all the places that used to say _PyObject_Del. Maybe it's best to keep that name around? The patch would (psychologically) become a lot smaller. I almost wish that this would work: #define PyObject_Del ((destructor)PyObject_Free) Or maybe it *does* work??? 
---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-08 18:47 Message: Logged In: YES user_id=6380 I'm looking at this now... ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-04-07 02:59 Message: Logged In: YES user_id=31435 Extensions that *currently* call PyObject_Del have its old macro expansion ("_PyObject_Del((PyObject *)(op))") buried in them, so getting rid of _PyObject_Del is a binary-API incompatibility (existing extensions will no longer link without recompilation). I personally don't mind that, but I run on Windows and "binary compatibility" never works there across minor releases for other reasons, so I don't have any real feel for how much people on other platforms value it. As you pointed out recently too, binary compatibility has, in reality, not been the case since 1.5.2 anyway. So that's one for Python-Dev. If we do break binary compatibility, I'd be sorely tempted to change the "destructor" typedef to say destructors take void*. IMO saying they take PyObject* was a poor idea, as you almost never have a PyObject* when calling one of these guys. That's why PyObject_Del "had to" be a macro, to hide the cast to PyObject* almost everyone needs because of destructor's "correct" but impractical signature. If "destructor" had a practical signature, there would have been no temptation to use a macro. Note that if the typedef of destructor were so changed, you wouldn't have needed new casts in tp_free slots. And I'd rather break binary compatibility than make extension authors add new casts. Hmm. I'm assigning this to Guido for comment: Guido, what are your feelings about binary compatibility here? C didn't define free() as taking a void* by mistake. Back to Neil: I wouldn't bother changing PyObject_Del to PyObject_Free. The former isn't in the "recommended" minimal API, but neither is it discouraged. 
I expect TMTOWTDI here forever. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-04-07 02:41 Message: Logged In: YES user_id=31435 Oops -- I hit "Submit" prematurely. More to come. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-04-07 02:40 Message: Logged In: YES user_id=31435 Looks good to me -- thanks! ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=540394&group_id=5470 From noreply@sourceforge.net Thu Apr 18 05:13:32 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Wed, 17 Apr 2002 21:13:32 -0700 Subject: [Patches] [ python-Patches-545480 ] Examples for urllib2 Message-ID: Patches item #545480, was opened at 2002-04-18 04:13 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=545480&group_id=5470 Category: Documentation Group: None Status: Open Resolution: None Priority: 5 Submitted By: Sean Reifschneider (jafo) Assigned to: Fred L. Drake, Jr. (fdrake) Summary: Examples for urllib2 Initial Comment: An associate who's learning Python recently complained about a lack of examples for urllib2. As a starting point, I'd like to submit the following: This example gets the python.org main page and displays the first 100 bytes of it: >>> import urllib2 >>> url = urllib2.urlopen('http://www.python.org/') >>> print url.read()[:100]