From noreply@sourceforge.net Fri Jun 1 15:34:28 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Fri, 01 Jun 2001 07:34:28 -0700
Subject: [Patches] [ python-Patches-427190 ] Speed-up "O" calls
Message-ID:

Patches item #427190, was updated on 2001-05-24 22:30
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=427190&group_id=5470

>Category: core (C code)
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Martin v. Löwis (loewis)
>Assigned to: Jeremy Hylton (jhylton)
>Summary: Speed-up "O" calls

Initial Comment:
This patch improves the performance of a few functions which have an "O" signature (ord, len, and list_append). On selected test cases, this patch gives a speed-up of 40%. If accepted, the approach can be extended to more signatures. E.g. "l" is already provided in the patch, but currently not used.

----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2001-06-01 07:34

Message:
Logged In: YES
user_id=21627

I rewrote the patch to only support METH_NOARGS and METH_O, and to not use bit masks for them. I also changed calling conventions for all Object operations and bltin and sys functions. In the course of these changes, two functions got a changed meaning:
- file.writelines accepts only exactly one argument
- iter.next does not accept any arguments anymore

As you can see in the patch, there are still a lot of places that continue to use OLDARGS (plus all the Modules functions that have not been changed in this patch), so OLDARGS will be needed for quite some time.

----------------------------------------------------------------------

Comment By: Jeremy Hylton (jhylton)
Date: 2001-05-29 13:59

Message:
Logged In: YES
user_id=31392

I like METH_O, but I'm not sure about METH_L. I'd rather see the call handling in ceval be type-neutral. It's easy enough for the callee to cast from an object to an int (or any other type).
There should be no effect on performance and it reduces the amount of code in the core. I think the implementation could be simplified a lot if it defined METH_O -- or perhaps METH_NOARGS, METH_ONEARG, and maybe even METH_TWOARGS (but Tim has a pretty good argument against that one). I don't think there's any need to define METH_O via METH_SPECIAL and reserve all of 0xFFF0 for flags on METH_SPECIAL. Instead, I'd just use the next N bits to implement the next N flags. The SPECIALSIZE and extra stack used in the implementation seem like unneeded generality, too. If the implementation is only going to support 0 and 1 (and possibly 2) arguments, there's no need for anything more general. Finally, I suggest appropriating fast_cfunction() for this purpose, rather than calling the new function do_call_special(), where "special" isn't a very specific meaning. If METH_NOARGS and METH_ONEARG are implemented, there is basically no reason to use METH_OLDARGS. So we can get rid of it in the code base and stop attempting to optimize it. Do you want to have a go at a smaller patch that just did METH_ONEARG and METH_NOARGS?

----------------------------------------------------------------------

You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=427190&group_id=5470

From noreply@sourceforge.net Fri Jun 1 16:14:27 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Fri, 01 Jun 2001 08:14:27 -0700
Subject: [Patches] [ python-Patches-427190 ] Speed-up "O" calls
Message-ID:

Patches item #427190, was updated on 2001-05-24 22:30
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=427190&group_id=5470

Category: core (C code)
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Martin v. Löwis (loewis)
Assigned to: Jeremy Hylton (jhylton)
>Summary: Speed-up "O" calls

----------------------------------------------------------------------

>Comment By: Jeremy Hylton (jhylton)
Date: 2001-06-01 08:14

Message:
Logged In: YES
user_id=31392

Just took a quick look -- looks good. One question: Why does METH_NOARGS call the method with two arguments where the second is always NULL? Wouldn't it be clearer to have these functions take one argument?

----------------------------------------------------------------------

You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=427190&group_id=5470

From noreply@sourceforge.net Fri Jun 1 16:19:54 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Fri, 01 Jun 2001 08:19:54 -0700
Subject: [Patches] [ python-Patches-420565 ] makes setup.py search sys.prefix
Message-ID:

Patches item #420565, was updated on 2001-05-01 14:24
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=420565&group_id=5470

Category: Build
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Nobody/Anonymous (nobody)
Assigned to: A.M. Kuchling (akuchling)
Summary: makes setup.py search sys.prefix

Initial Comment:
It's useful to have setup.py search the lib and include directories in sys.prefix before it checks /usr/local. That way, if you are building Python into a custom location and want it to use the libraries installed there rather than the system defaults, you can give the --prefix option to configure and setup.py will search that path first.
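
The search order described in this comment can be sketched roughly as follows. This is only an illustration and not the submitted patch: the helper name ordered_search_dirs and the default directory lists are assumptions made for the example, and the real lists in setup.py are not shown in this thread.

```python
import os
import sys


def ordered_search_dirs(prefix, subdir, defaults):
    """Return a search list with prefix/subdir ahead of the default directories.

    Hypothetical helper for illustration only; it is not taken from setup.py.
    """
    preferred = os.path.join(prefix, subdir)
    # Drop any repeat of the preferred directory so it is searched only once.
    return [preferred] + [d for d in defaults if d != preferred]


# Assumed defaults, chosen for the example.
lib_dirs = ordered_search_dirs(sys.prefix, "lib", ["/usr/local/lib", "/usr/lib"])
include_dirs = ordered_search_dirs(
    sys.prefix, "include", ["/usr/local/include", "/usr/include"]
)
```

With a custom --prefix, the directories under that prefix then come first in both lists, so libraries installed there would be found before the system copies.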
----------------------------------------------------------------------

Comment By: Nobody/Anonymous (nobody)
Date: 2001-06-01 08:19

Message:
Logged In: NO

I totally agree. I'm building for hard hat linux on a debian host, and the implicit search in /usr/lib is totally the wrong thing to do in this case.

----------------------------------------------------------------------

You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=420565&group_id=5470

From noreply@sourceforge.net Fri Jun 1 21:07:21 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Fri, 01 Jun 2001 13:07:21 -0700
Subject: [Patches] [ python-Patches-429442 ] Cygwin sys.platform/get_platform() patch
Message-ID:

Patches item #429442, was updated on 2001-06-01 13:07
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=429442&group_id=5470

Category: distutils
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Jason Tishler (jlt63)
Assigned to: Nobody/Anonymous (nobody)
Summary: Cygwin sys.platform/get_platform() patch

Initial Comment:
This patch corrects sys.platform and distutils.util.get_platform() problems caused by the cruft contained in Cygwin's uname -s. Please see the following for the gory details:

http://www.cygwin.com/ml/cygwin-apps/2001-05/msg00106.html

Note that the above also solicited input from the community in an attempt to prevent any potential heartache. Since no one responded it would appear that either the changes are acceptable or that no one really cares...
:,)

----------------------------------------------------------------------

You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=429442&group_id=5470

From noreply@sourceforge.net Sat Jun 2 01:34:43 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Fri, 01 Jun 2001 17:34:43 -0700
Subject: [Patches] [ python-Patches-414991 ] Separate CFLAGS and CPPFLAGS
Message-ID:

Patches item #414991, was updated on 2001-04-09 13:33
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=414991&group_id=5470

Category: Build
Group: None
Status: Open
Resolution: Postponed
Priority: 5
Submitted By: Wilfredo Sanchez (wsanchez)
Assigned to: Neil Schemenauer (nascheme)
Summary: Separate CFLAGS and CPPFLAGS

Initial Comment:
CFLAGS should not contain preprocessor directives, which is the role of CPPFLAGS. By combining the two, it is not possible to override CFLAGS (e.g. make CFLAGS="-arch i386 -arch ppc -O3 -pipe") without breaking the build.

This patch is against Python 2.1b2a.

----------------------------------------------------------------------

Comment By: Nobody/Anonymous (nobody)
Date: 2001-06-01 17:34

Message:
Logged In: NO

Neil, Can we get this checked in for 2.2?

----------------------------------------------------------------------

Comment By: Neil Schemenauer (nascheme)
Date: 2001-04-10 14:44

Message:
Logged In: YES
user_id=35752

I agree with the change but I'm not comfortable checking it in for 2.1 (even though the patch is quite simple). It will have to wait.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2001-04-10 14:26

Message:
Logged In: YES
user_id=6380

Neil, can you review this and maybe check this in?
----------------------------------------------------------------------

You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=414991&group_id=5470

From noreply@sourceforge.net Sat Jun 2 07:20:03 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Fri, 01 Jun 2001 23:20:03 -0700
Subject: [Patches] [ python-Patches-414991 ] Separate CFLAGS and CPPFLAGS
Message-ID:

Patches item #414991, was updated on 2001-04-09 13:33
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=414991&group_id=5470

Category: Build
Group: None
>Status: Closed
>Resolution: Accepted
Priority: 5
Submitted By: Wilfredo Sanchez (wsanchez)
Assigned to: Neil Schemenauer (nascheme)
Summary: Separate CFLAGS and CPPFLAGS

----------------------------------------------------------------------

You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=414991&group_id=5470

From noreply@sourceforge.net Sat Jun 2 10:27:12 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Sat, 02 Jun 2001 02:27:12 -0700
Subject: [Patches] [ python-Patches-429542 ] Bugfix for libsmtp example
Message-ID:

Patches item #429542, was updated on 2001-06-02 02:27
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=429542&group_id=5470

Category: documentation
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Sean Reifschneider (jafo)
Assigned to: Nobody/Anonymous (nobody)
Summary: Bugfix for libsmtp example

Initial Comment:
libsmtp includes an example which does:

    while 1:
        line = raw_input()
        if not line:
            break

which fails raising an EOFError exception. This patch changes the code to:

    while 1:
        try:
            line = raw_input()
        except EOFError:
            break

Sean

----------------------------------------------------------------------

You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=429542&group_id=5470

From noreply@sourceforge.net Sat Jun 2 11:12:45 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Sat, 02 Jun 2001 03:12:45 -0700
Subject: [Patches] [ python-Patches-427190 ] Speed-up "O" calls
Message-ID:

Patches item #427190, was updated on 2001-05-24 22:30
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=427190&group_id=5470

Category: core (C code)
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Martin v. Löwis (loewis)
Assigned to: Jeremy Hylton (jhylton)
>Summary: Speed-up "O" calls

----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2001-06-02 03:12

Message:
Logged In: YES
user_id=21627

New version uploaded. This uses functions with only the self argument for METH_NOARGS, and introduces PyNoArgsFunction for them. It also adds a section for api.tex documenting the METH_ flags, and an entry in NEWS mentioning the new METH_ flags.

----------------------------------------------------------------------

You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=427190&group_id=5470

From noreply@sourceforge.net Sat Jun 2 11:15:47 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Sat, 02 Jun 2001 03:15:47 -0700
Subject: [Patches] [ python-Patches-424335 ] richcompare for strings
Message-ID:

Patches item #424335, was updated on 2001-05-15 12:54
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=424335&group_id=5470

Category: core (C code)
Group: None
>Status: Closed
Resolution: Accepted
Priority: 5
Submitted By: Martin v. Löwis (loewis)
Assigned to: Martin v. Löwis (loewis)
Summary: richcompare for strings

Initial Comment:
This patch implements the tp_richcompare slot for string objects. It shows an 8% speed-up on selected test cases.

----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2001-06-02 03:15

Message:
Logged In: YES
user_id=21627

Committed as 2.117 of stringobject.c, 2.95 of dictobject.c, and 2.27 of stringobject.h.

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2001-05-23 21:48

Message:
Logged In: YES
user_id=31435

Oops! Looks like I forgot to assign this back to Martin.

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2001-05-22 16:04

Message:
Logged In: YES
user_id=31435

Marked accepted. Looks good! Suggest

    return a->ob_size == b->ob_size &&
           *a->ob_sval == *b->ob_sval &&
           memcmp(a->ob_sval, b->ob_sval, a->ob_size) == 0;

for the tail of the _PyString_Eq body as compilers should have an easier time of turning that into the best code for the platform (especially the weaker compilers do better optimizing expressions than across branches). Plus it improves clarity, at least for me.

Unsure why the

    case Py_EQ: c = c == 0; break; /* not needed here */

case is there: if it's truly unreachable (and I agree it isn't), better to assert-fail if it gets there.

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2001-05-22 14:31

Message:
Logged In: YES
user_id=31435

Assigned to me.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2001-05-21 08:33

Message:
Logged In: YES
user_id=21627

The new revision of the patch entirely removes tp_compare for string, following discussions on python-dev. The only direct user of string_compare has been changed to use the new function _PyString_Eq instead.
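
As a rough model of the equality test Tim suggests for the tail of _PyString_Eq, the same three checks can be written in Python. This is only a sketch for readers who do not want to parse the C expression: the function name strings_equal is hypothetical, and the real implementation is C code operating on ob_size and ob_sval.

```python
def strings_equal(a: bytes, b: bytes) -> bool:
    """Model of the suggested tail: sizes first, then first bytes, then all bytes."""
    return (
        len(a) == len(b)        # a->ob_size == b->ob_size
        and a[:1] == b[:1]      # *a->ob_sval == *b->ob_sval
        and a == b              # memcmp(a->ob_sval, b->ob_sval, a->ob_size) == 0
    )


same = strings_equal(b"spam", b"spam")
different_length = strings_equal(b"spam", b"spams")
different_tail = strings_equal(b"spam", b"spat")
```

The cheap checks come first, so strings that differ in length or in their first byte are rejected before the full comparison runs.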
----------------------------------------------------------------------

You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=424335&group_id=5470

From noreply@sourceforge.net Sat Jun 2 16:40:23 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Sat, 02 Jun 2001 08:40:23 -0700
Subject: [Patches] [ python-Patches-429611 ] doc build on win32 with MiKTeX et al.
Message-ID:

Patches item #429611, was updated on 2001-06-02 08:40
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=429611&group_id=5470

Category: documentation
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Frederic Giacometti (giacometti)
Assigned to: Nobody/Anonymous (nobody)
Summary: doc build on win32 with MiKTeX et al.

Initial Comment:
With this patch, everything builds fine on win32 but for the following problems:
- html/api/labels.pl not generated -> html/api/*.html incorrect
- lib/modindex.html not generated -> html/modindex.html incorrect

Problems worked out:
- fancyhdr.sty is not in the Miktex distribution ...
- Makefile content made compatible with the Windows command line (now runs fine with VC++'s nmake, or cygnus's make --win32)
- misc. problems regarding the path formats
- miktex 2.0's pdflatex would block on a mismatching macro level in python.sty -> fixed

Hints on installing latex2html:
- I had to work out some fixes in the config.pl script (2,000 lines of perl...)
- make sure the paths to the ghostscript and miktex installations have no spaces!!!!!! latex2html will silently screw up its configuration process
- looking at perl scripts gave me a serious trauma

----------------------------------------------------------------------

You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=429611&group_id=5470

From noreply@sourceforge.net Sat Jun 2 16:56:53 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Sat, 02 Jun 2001 08:56:53 -0700
Subject: [Patches] [ python-Patches-429614 ] pythonpath and optimize def. before init
Message-ID:

Patches item #429614, was updated on 2001-06-02 08:56
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=429614&group_id=5470

Category: core (C code)
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Frederic Giacometti (giacometti)
Assigned to: Nobody/Anonymous (nobody)
Summary: pythonpath and optimize def. before init

Initial Comment:
A) Addition of four functions
=====================

Py_{Set, Get}{PythonPath, OptimizeLevel}()

with the same semantics as Py_{Set, Get}ProgramName()
(Note: the C ANSI type 'char const*' is used to describe non-modifiable strings)

These four functions are needed in the next JPE runtime (Python 2.1 patch included in the distribution); this allows setting the PYTHONPATH and optimize level from Java property values.

B) Option '-P pythonpath' on the Python command line:
========================================

This option defines 'pythonpath' from the command line (and overrides the PYTHONPATH environment variable if necessary).

Usefulness: Sometimes, one does not want to rely on the environment variables, or modify them.

Sample application: Running build and test scripts in full control of the environment, and with different PYTHONPATH values.

This option is needed by the build and test scripts of the next JPE source distribution (Python 2.1 patch included in the distribution).
Frederic Giacometti
fred@arakne.com

----------------------------------------------------------------------

You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=429614&group_id=5470

From noreply@sourceforge.net Sun Jun 3 14:55:14 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Sun, 03 Jun 2001 06:55:14 -0700
Subject: [Patches] [ python-Patches-423394 ] Fix pulldom to preserve ns attributes
Message-ID:

Patches item #423394, was updated on 2001-05-11 11:04
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=423394&group_id=5470

Category: XML
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Nobody/Anonymous (nobody)
Assigned to: Martin v. Löwis (loewis)
Summary: Fix pulldom to preserve ns attributes

Initial Comment:
Here is a fix for pulldom.py that preserves xmlns attributes that declare namespaces. The current pulldom / minidom captures xml namespace information in elements and attributes, but the actual namespace declaration attributes (xmlns:foo="...") are not preserved on the element where they appear.

This makes it impossible for certain applications that do more complex name dereferencing (XMLSchema is an example) that requires not only namespace uris but also the prefixes used and the original scope information.

The current patch preserves xmlns="" and xmlns:foo="" as *non-namespace qualified* attributes, which appears to be the norm in other DOM implementations.

Pls let me know if you have any questions.

-Brian (brian@digicool.com)

----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2001-06-03 06:55

Message:
Logged In: YES
user_id=21627

The patch is a good idea, but I think it does not conform to the DOM recommendation. In the DOM, the namespace URI "http://www.w3.org/2000/xmlns/" is used for attributes whose namespace prefix or qualified name is xmlns.
In addition, the patch contains a typo, it should not say atetr_items.

----------------------------------------------------------------------

You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=423394&group_id=5470

From noreply@sourceforge.net Mon Jun 4 03:31:52 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Sun, 03 Jun 2001 19:31:52 -0700
Subject: [Patches] [ python-Patches-423221 ] Add a few Windows encoding aliases
Message-ID:

Patches item #423221, was updated on 2001-05-10 21:14
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=423221&group_id=5470

Category: library
Group: None
>Status: Closed
>Resolution: Accepted
Priority: 5
Submitted By: Brian Quinlan (bquinlan)
>Assigned to: Mark Hammond (mhammond)
Summary: Add a few Windows encoding aliases

Initial Comment:
This patch adds aliases for some of the common Windows encodings. Windows-1252 is particularly useful because it is the default encoding used by Visual Studio .NET projects.

Microsoft's complete encoding list can be found at:
http://msdn.microsoft.com/workshop/author/dhtml/reference/charsets/charset4.asp

----------------------------------------------------------------------

>Comment By: Mark Hammond (mhammond)
Date: 2001-06-03 19:31

Message:
Logged In: YES
user_id=14198

Checked in:
/cvsroot/python/python/dist/src/Lib/encodings/aliases.py,v <-- aliases.py
new revision: 1.7; previous revision: 1.6

----------------------------------------------------------------------

Comment By: Brian Quinlan (bquinlan)
Date: 2001-05-30 14:04

Message:
Logged In: YES
user_id=108973

Yeah, there are tonnes. But after my initial mistake, I don't want to add any without carefully checking that the encodings that I am aliasing are exactly identical. I'll probably add some more later but Windows-1252 is probably the most immediately useful.

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2001-05-30 12:59

Message:
Logged In: YES
user_id=38388

The link you gave doesn't work for me, but the aliases look reasonable... aren't there more (Windows does have far more code pages than the few you added) ?

In any case, go ahead and check them in :-)

----------------------------------------------------------------------

Comment By: Mark Hammond (mhammond)
Date: 2001-05-29 23:30

Message:
Logged In: YES
user_id=14198

This looks good to me! Assigning to MAL simply for comment. Marc - if you have no objections and this sounds reasonable, assign back to me and I will check it in.

----------------------------------------------------------------------

Comment By: Brian Quinlan (bquinlan)
Date: 2001-05-29 23:23

Message:
Logged In: YES
user_id=108973

The first patch (alias.patch) is incorrect and I'm not sure if the replacement (aliases.patch) is visible to anyone but me, so here is aliases.patch inline:

*** d:\Dev\python\dist\src\Lib\encodings\aliases.py Wed Jun 7 02:12:30 2000
--- d:\Dev\python-dev\dist\src\Lib\encodings\aliases.py Tue May 29 19:16:58 2001
***************
*** 59,64 ****
--- 59,69 ----
  'macroman': 'mac_roman',
  'macturkish': 'mac_turkish',

+ # Windows
+ 'windows_1252': 'cp1252',
+ 'windows_1254': 'cp1254',
+ 'windows_1255': 'cp1255',
+
  # MBCS
  'dbcs': 'mbcs',

----------------------------------------------------------------------

You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=423221&group_id=5470

From noreply@sourceforge.net Mon Jun 4 04:53:16 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Sun, 03 Jun 2001 20:53:16 -0700
Subject: [Patches] [ python-Patches-429957 ] Add some more EBCDIC encodings
Message-ID:

Patches item #429957, was updated on 2001-06-03 20:53
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=429957&group_id=5470

Category: library
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Brian Quinlan (bquinlan)
Assigned to: Nobody/Anonymous (nobody)
Summary: Add some more EBCDIC encodings

Initial Comment:
Add support for cp1140, which is identical to cp037, with the addition of the euro character. Also added a few EBCDIC aliases.

----------------------------------------------------------------------

You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=429957&group_id=5470

From noreply@sourceforge.net Mon Jun 4 05:24:46 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Sun, 03 Jun 2001 21:24:46 -0700
Subject: [Patches] [ python-Patches-429024 ] Deal with some unary ops at compile time
Message-ID:

Patches item #429024, was updated on 2001-05-31 07:27
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=429024&group_id=5470

Category: Parser/Compiler
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Fred L. Drake, Jr. (fdrake)
>Assigned to: Tim Peters (tim_one)
Summary: Deal with some unary ops at compile time

Initial Comment:
This patch makes unary + and - operations with numeric literals compile to a constant reference instead of a constant reference and UNARY_POSITIVE or UNARY_NEGATIVE opcode. This could be extended to support UNARY_INVERT as well, but that would be a little more complicated.

Folding unary + only affects one case in the regression test, but folding the - affects 817 places (on a Linux system with pretty much everything enabled). I don't know that this makes much difference at runtime, but certainly reduces the number of opcodes evaluated.

----------------------------------------------------------------------

>Comment By: Fred L. Drake, Jr. (fdrake)
Date: 2001-06-03 21:24

Message:
Logged In: YES
user_id=3066

Re-assigned to Tim since Jeremy's on a new assignment.
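
The effect described in the initial comment can be observed with the standard dis module. The sketch below inspects what a current CPython 3 interpreter emits for a negated literal; it assumes such an interpreter and does not reproduce the patch itself, which targeted the 2001 compiler. The helper name opnames is hypothetical.

```python
import dis


def opnames(source):
    """Return the opcode names the compiler emits for a source string."""
    code = compile(source, "<example>", "exec")
    return [ins.opname for ins in dis.get_instructions(code)]


# A negated numeric literal: with folding, -1 is stored as a constant
# and no UNARY_NEGATIVE opcode is needed at run time.
names = opnames("x = -1")
consts = compile("x = -1", "<example>", "exec").co_consts
folded = -1 in consts and "UNARY_NEGATIVE" not in names
```

Running dis.dis("x = -1") interactively prints the same instructions in readable form.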
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=429024&group_id=5470 From noreply@sourceforge.net Mon Jun 4 08:02:33 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 04 Jun 2001 00:02:33 -0700 Subject: [Patches] [ python-Patches-401229 ] Optional memory profiler Message-ID: Patches item #401229, was updated on 2000-08-18 23:49 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=401229&group_id=5470 Category: core (C code) Group: None Status: Open Resolution: None Priority: 5 Submitted By: Vladimir Marangozov (marangoz) Assigned to: Jeremy Hylton (jhylton) Summary: Optional memory profiler Initial Comment: ---------------------------------------------------------------------- >Comment By: Martin v. Löwis (loewis) Date: 2001-06-04 00:02 Message: Logged In: YES user_id=21627 The patch, in its current form, fails to apply (4 hunks fail). Also, the URL of the discussion of the patch changed to http://mail.python.org/pipermail/python-dev/2000-August/008527.html I recommend to reject this patch, since I cannot see what use the information it produces has to a Python developer. If there is a desire to have the feature in Python, I'd volunteer to provide an updated patch. ---------------------------------------------------------------------- Comment By: Vladimir Marangozov (marangoz) Date: 2000-08-19 00:18 Message: An optional memory profiler, which goes in tandem with the optional object memory allocator (SourceForge patch #101104). 
The profiler was introduced briefly on python-dev: http://www.python.org/pipermail/python-dev/2000-August/015239.html Applying both patches gives for me (screen dump): ~> patch -p1 < ../obmalloc-patch patching file `Include/objimpl.h' patching file `Objects/object.c' patching file `Objects/obmalloc.c' patching file `acconfig.h' patching file `configure.in' ~> patch -p1 < ../memprof-patch patching file `Include/pydebug.h' patching file `Modules/Setup.config.in' patching file `Modules/main.c' patching file `Modules/memprof.c' patching file `Python/pythonrun.c' patching file `acconfig.h' patching file `configure.in' - Don't forget that you need to autoheader; autoconf; This patch: 1) introduces a new --with-memprof configure option. Off by default. 2) introduced a Py_ProfileFlag and a "-p" Python option which starts the profiler in Py_Initialize() before any initializations, and stops it in Py_Finalize() after all finalizations. 3) contains a new Modules/memprof.c module. The inclusion of this file in the core is similar to the thread and GC modules (Setup.config.in) The patch *can* be applied without the object allocator and it *does* compile on request. However, it issues a warning that it won't profile anything, because it can't be called (the profiler can't install its hooks). Besides, it will refuse to start(). The point is that both the profiler and the allocator are really optional. Needs docs & tests :( The interface can be improved (just like everything else) but the core functionality is there. It *is* useful for getting snapshots of the minimum allocated (object) memory, at least. Some worthy points to condifer, IMO, are listed in the TODO of memprof.c. I am submitting this for testing, reviewing, comments and more ideas. Overall, I think it is a BIG plus regarding Python's typical introspection. Comments welcome. As usual, flames to /dev/null . Status set straight to Postponed. 
Assigned to marangoz who's in charge of opening it in due time, together with #101104. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=401229&group_id=5470 From noreply@sourceforge.net Mon Jun 4 08:37:31 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 04 Jun 2001 00:37:31 -0700 Subject: [Patches] [ python-Patches-401335 ] Adds login to auth-type servers (smtplib.py) Message-ID: Patches item #401335, was updated on 2000-08-29 01:05 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=401335&group_id=5470 Category: library Group: None Status: Open Resolution: None Priority: 5 Submitted By: Ulf Engstrøm (alexisjuh) Assigned to: Jeremy Hylton (jhylton) Summary: Adds login to auth-type servers (smtplib.py) Initial Comment: ---------------------------------------------------------------------- >Comment By: Martin v. Löwis (loewis) Date: 2001-06-04 00:37 Message: Logged In: YES user_id=21627 I've transformed this into a patch into a diff-style patch. I've left out the self.authenticated attribute, since it appears never to be set, and since its purpose is unclear. Applying the patch seems harmless, since it just adds another method to the SMTP class. I still recommend rejecting it, since it has no apparent relationship to RFC 2554. In that RFC, the AUTH line of the EHLO response will contain a list of SASL authentication mechanisms, as defined in RFC 2222, and listed in http://www.iana.org/assignments/sasl-mechanisms So a valid AUTH request would be "AUTH CRAM-MD5", as the example in RFC 2554 shows. "AUTH login", as implemented in this patch, does not conform to the RFC. Therefore, I recommend to reject this patch. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2001-01-04 07:44 Message: Jeremy, can you look at this again? 
---------------------------------------------------------------------- Comment By: Moshe Zadka (moshez) Date: 2000-08-29 02:22 Message: Postponed -- we're in feature freeze. Assigned to Jeremy in case he disagrees. Note also that it's preferable to submit patches in diff format, not as human-readable summaries. Try "diff -c". ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=401335&group_id=5470 From noreply@sourceforge.net Mon Jun 4 08:47:12 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 04 Jun 2001 00:47:12 -0700 Subject: [Patches] [ python-Patches-401606 ] threads and __del__ Message-ID: Patches item #401606, was updated on 2000-09-22 07:47 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=401606&group_id=5470 Category: core (C code) Group: None Status: Open Resolution: None Priority: 5 Submitted By: Toby Dickenson (htrd) Assigned to: Tim Peters (tim_one) Summary: threads and __del__ Initial Comment: ---------------------------------------------------------------------- >Comment By: Martin v. Löwis (loewis) Date: 2001-06-04 00:47 Message: Logged In: YES user_id=21627 I recommend to approve this patch. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2000-09-22 08:06 Message: Works fine for me. Assigned to Tim for review since he's the race condition czar. 
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=401606&group_id=5470 From noreply@sourceforge.net Mon Jun 4 08:56:18 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 04 Jun 2001 00:56:18 -0700 Subject: [Patches] [ python-Patches-401713 ] Free extension DLLs' handles during the Py_Finalize() Message-ID: Patches item #401713, was updated on 2000-09-29 12:02 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=401713&group_id=5470 Category: Windows Group: None Status: Open Resolution: None Priority: 5 Submitted By: Yakov Markovitch (markovitch) Assigned to: Tim Peters (tim_one) Summary: Free extension DLLs' handles during the Py_Finalize() Initial Comment: ---------------------------------------------------------------------- >Comment By: Martin v. Löwis (loewis) Date: 2001-06-04 00:56 Message: Logged In: YES user_id=21627 I recommend to reject this patch. If such a feature is implemented, it should be implemented uniformly across platforms - i.e. on Unix, appropriate dlclose calls should be issued. Furthermore, I don't see the problem with the DLLs being loaded. AFAIK, each DLL will be loaded only once, so even if the interpreter is stopped and started again, you get only one copy of the DLLs state per process, right? So what is the problem? Finally, it seems reasonable that people embedding the interpreter might need to customize its code. It is possible that the finalization procedure of user A won't work for user B, e.g. because they require state to survive different activations and deactivations. ---------------------------------------------------------------------- Comment By: Yakov Markovitch (markovitch) Date: 2000-10-06 02:54 Message: Yes, I agree with Mark, but there is the other side of the problem. 
Let's suppose that we have an application that uses the interpreter through dynamic loading (I mean through the LoadLibrary). It isn't likely to be directly, but the application can load/unload some other DLL which, in turn, uses an embedded interpreter. Now after freeing this DLL the application has ALL extensions which was used by this DLL loaded! (Though it hasn't the interpreter embedded at all!) ---------------------------------------------------------------------- Comment By: Mark Hammond (mhammond) Date: 2000-10-05 18:19 Message: I agree we should close handles that we can't use as extension modules. I am quite skeptical of the unloading of modules, tho. Python simply doesn't provide enough cleanup semantics to guarantee we are finished with the module at Py_Finalize() time. Indeed, extension modules are one main reason why Python often can not handle multiple Py_Initialize()/Py_Finalize() calls in the same process. I think that Python needs to grow module termination semantics. Something like, at Py_Finalize time: Try and find function "term_{module}" If function exists: call function free handle else: pass Thus - only modules that have gone to the trouble of providing a finalize function can be trusted to be unloaded. On one hand, the addition of the map means we _are_ in a better position for better finalization semantics on Windows. On the larger hand, module finalization semantics must be cross-platform anyway. So - while I acknowledge the problem, I don't believe this alone is a reasonable solution. Marking as postponed, and assigning back to Tim, so he can rule on the next step.... This came up a number of years ago, and Guido agreed "better" semantics were needed. Sounds like PEP material. 
I guess I _do_ care enough about this issue to own a PEP on it, as long as no-one needs the PEP finalized this year ;-) ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2000-10-05 18:02 Message: Mark, you got anything to say about this? Can't say I've ever noticed a problem here. Note that "the patch" is actually a .zip archive, and it takes a little effort to sort out what's what. ---------------------------------------------------------------------- Comment By: Fred L. Drake, Jr. (fdrake) Date: 2000-09-30 17:01 Message: Assigned to one of our Windows guys for review. ---------------------------------------------------------------------- Comment By: Yakov Markovitch (markovitch) Date: 2000-09-29 12:09 Message: This patch is intended to fix the following problem: Python on Windows never frees DLLs loaded as extensions. While it's not a big problem when the interpreter is being used in a standard way, it becomes THE problem (or even a disaster) when the interpreter DLL is dynamically initialized/finalized from one process many times during a single run. Moreover, even in case of single initialization there is a trap - DLLs loaded by mistake are unloaded only when the process finishes (e.g. suppose there is a foo.dll in the current directory and foo.dll is NOT a Python extension; "import foo" ends with an error, but foo.dll will be hanging in the process' address space!)
This patch 1) frees a DLL handle if it has no proper initialization function 2) registers in an internal array all handles of successfully loaded dynamic extensions 3) frees all registered handles during Py_Finalize() Yakov Markovitch, markovitch@iso.ru ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=401713&group_id=5470 From noreply@sourceforge.net Mon Jun 4 08:59:07 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 04 Jun 2001 00:59:07 -0700 Subject: [Patches] [ python-Patches-402780 ] SET_LINENO for augassign Message-ID: Patches item #402780, was updated on 2000-12-11 08:03 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=402780&group_id=5470 Category: demos and tools Group: None Status: Open Resolution: None Priority: 5 Submitted By: Neil Schemenauer (nascheme) Assigned to: Jeremy Hylton (jhylton) Summary: SET_LINENO for augassign Initial Comment: ---------------------------------------------------------------------- >Comment By: Martin v. Löwis (loewis) Date: 2001-06-04 00:59 Message: Logged In: YES user_id=21627 I recommend to approve this patch. ---------------------------------------------------------------------- Comment By: Neil Schemenauer (nascheme) Date: 2000-12-11 08:05 Message: Line numbers are currently not set for augmented assignment statements for code compiled by Tools/compiler. Here is a one line fix.
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=402780&group_id=5470 From noreply@sourceforge.net Mon Jun 4 09:00:19 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 04 Jun 2001 01:00:19 -0700 Subject: [Patches] [ python-Patches-402891 ] Alternative readline module Message-ID: Patches item #402891, was updated on 2000-12-17 14:22 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=402891&group_id=5470 Category: Modules Group: None Status: Open Resolution: None Priority: 5 Submitted By: Neil Schemenauer (nascheme) Assigned to: Neil Schemenauer (nascheme) Summary: Alternative readline module Initial Comment: ---------------------------------------------------------------------- >Comment By: Martin v. Löwis (loewis) Date: 2001-06-04 01:00 Message: Logged In: YES user_id=21627 I see this patch is still not committed. Any reason why not? ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2001-01-18 16:04 Message: Neil, I'm assigning this back to you and reopening it (from Accepted). It seems patches with status Accepted frequently get lost -- probably because "My Patches" doesn't show them. In any case, I think you should just check this in ASAP and close the patch! ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2001-01-16 06:59 Message: Ah, of course. I saw that, even played with it a bit. Looks cool, but I don't know about using it to replace readline. But you might want to change the name given that pyrl is already taken. 
;-) ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2001-01-16 06:34 Message: pyrl is my line reader written in Python that I've been intermittently blathering about on python-dev: http://www-jcsu.jesus.cam.ac.uk/~mwh21/hacks/pyrl-0.2.0.tar.gz it's still very experimental, though. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2001-01-16 05:41 Message: What's pyrl in this context? A Google search turns up a bunch of references to a Perl preprocessor that takes Pythonic syntax and translates it into Perl. :-) [ESR replied Neil via email: "I'm on it. Gotta ship my PC9 paper first, though."] ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2001-01-16 02:23 Message: You could defer the decision between readline and edline until runtime, as in: (will sf mangle this? we'll see) Index: Modules/main.c =================================================================== RCS file: /cvsroot/python/python/dist/src/Modules/main.c,v retrieving revision 1.47 diff -c -r1.47 main.c *** Modules/main.c 2000/12/15 22:00:54 1.47 --- Modules/main.c 2001/01/16 10:19:45 *************** *** 267,274 **** isatty(fileno(stdin))) { PyObject *v; v = PyImport_ImportModule("readline"); ! if (v == NULL) PyErr_Clear(); else Py_DECREF(v); } --- 267,280 ---- isatty(fileno(stdin))) { PyObject *v; v = PyImport_ImportModule("readline"); ! if (v == NULL) { PyErr_Clear(); + v = PyImport_ImportModule("edline"); + if (v == NULL) + PyErr_Clear(); + else + Py_DECREF(v); + } else Py_DECREF(v); } (and pyrl's not going to be ready for 2.1, by a country mile...) ---------------------------------------------------------------------- Comment By: Neil Schemenauer (nascheme) Date: 2001-01-15 21:30 Message: Hmm, I still like pyrl better. What to do about GNU readline now that its in Setup.conf? 
You can't enable them both and I don't feel comfortable enough with autoconf to fix things. ESR, if you could add the magic to test for termios that would be cool. configure should use readline if its there and fall back to edline if termios is available. Feel free to bounce it back to me if you don't have time. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2001-01-03 06:10 Message: Neil, this has now status Accepted. Go ahead and check it in! ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2000-12-18 14:17 Message: Neil, I propose that you check this in. The edline.c file would need a little work to compile without warnings, and you should add #HAVE_STRDUP to edline.h (Python makes sure strdup() is always present). The comment for Setup.dist "Neil Schemenauer's edline library" sounds a little strange given that most of the code is by others. Maybe "Neil Schemenauer's edline wrapper module"? ---------------------------------------------------------------------- Comment By: Neil Schemenauer (nascheme) Date: 2000-12-17 14:28 Message: I like Michael Hudson's idea of writing a readline replacement in Python using modules like _curses and termios better but I had this patch 90% complete before I recieved his email. I stripped the editline library down and updated it for modern Unix systems. I have no idea if it compiles on anything other than Linux however. 
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=402891&group_id=5470 From noreply@sourceforge.net Mon Jun 4 17:59:20 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 04 Jun 2001 09:59:20 -0700 Subject: [Patches] [ python-Patches-430030 ] Avoid multiple BOMs in UTF-16 streams Message-ID: Patches item #430030, was updated on 2001-06-04 09:59 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=430030&group_id=5470 Category: library Group: None Status: Open Resolution: None Priority: 5 Submitted By: Martin v. Löwis (loewis) Assigned to: M.-A. Lemburg (lemburg) Summary: Avoid multiple BOMs in UTF-16 streams Initial Comment: This patch fixes the UTF-16 reader and writer to emit and expect the BOM only at the beginning of the stream. It is implemented by changing the encode/decode function of the stream object after the byte order is detected. In addition, it adds a new test case test_codecs. When committing the patch, the corresponding output file must be generated. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=430030&group_id=5470 From noreply@sourceforge.net Mon Jun 4 18:13:25 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 04 Jun 2001 10:13:25 -0700 Subject: [Patches] [ python-Patches-421893 ] Cleanup GC API Message-ID: Patches item #421893, was updated on 2001-05-06 14:42 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=421893&group_id=5470 Category: core (C code) Group: None Status: Open Resolution: None Priority: 5 Submitted By: Neil Schemenauer (nascheme) Assigned to: Fred L. Drake, Jr. 
(fdrake) Summary: Cleanup GC API Initial Comment: This patch adds three new APIs: PyObject_GC_New PyObject_GC_NewVar PyObject_GC_Resize PyObject_GC_Del and renames PyObject_GC_Init and PyObject_GC_Fini to: PyObject_GC_Track PyObject_GC_Ignore respectively. Objects that wish to be tracked by the collector must use these new APIs. Many more details about the GC implementation are hidden inside gcmodule.c. There seems to be no change in performance. Note that PyObject_GC_{New,NewVar} automatically adds the object to the GC lists. There is no need to call PyObject_GC_Track. PyObject_GC_Del automatically removes the object from the GC list but usually you want to call PyObject_GC_Ignore yourself (DECREFs can end up running arbitrary code). It should be more difficult to corrupt the GC linked lists now. Also, you can now call PyObject_GC_Ignore on objects that you know will not create RCs. The _weakref module does this. Previously, every object that had the GC type flag set and could be found by using tp_traverse had to be in a GC linked list. ---------------------------------------------------------------------- >Comment By: Martin v. Löwis (loewis) Date: 2001-06-04 10:13 Message: Logged In: YES user_id=21627 I have two problems with this patch: 1. It comes with no documentation. 2. It breaks existing third-party modules which use the GC API as defined in Python 2. Consequently, I recommend rejection of the patch in its current form. 
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=421893&group_id=5470 From noreply@sourceforge.net Mon Jun 4 18:28:59 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 04 Jun 2001 10:28:59 -0700 Subject: [Patches] [ python-Patches-421709 ] Access { thread id : frame } dict Message-ID: Patches item #421709, was updated on 2001-05-05 13:30 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=421709&group_id=5470 Category: core (C code) Group: None Status: Open Resolution: None Priority: 5 Submitted By: John D. Heintz (jheintz) Assigned to: Barry Warsaw (bwarsaw) Summary: Access { thread id : frame } dict Initial Comment: This patch adds a new function sys._getframes() that returns a dictionary mapping from thread id to current frame object. This is very useful when diagnosing deadlock issues in Python code. The new C code function is purely additive except for modifying the PyThreadState struct (adding a long thread_ident) and modifying PyThreadState_New() function to set this new long. ---------------------------------------------------------------------- >Comment By: Martin v. Löwis (loewis) Date: 2001-06-04 10:28 Message: Logged In: YES user_id=21627 I think the patch could use some more documentation, e.g. as a patch to Doc/lib/libsys.tex. E.g. what are the tuples that are put into the dictionaries? Also, isn't there a problem with the tuple size? The patch allocates tuples of size 0, but then puts things into index 0. Is there any kind of test case for this code? Finally, I don't think the docstring should say that the function is for internal and specialized purposes only (what specialized purposes, anyway), if you think its primary use is in diagnosing deadlocks. It should only document what the function does, not what you intend it to use for. 
For these reasons, I also think its name should not start with an underscore. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=421709&group_id=5470 From noreply@sourceforge.net Mon Jun 4 18:52:19 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 04 Jun 2001 10:52:19 -0700 Subject: [Patches] [ python-Patches-421709 ] Access { thread id : frame } dict Message-ID: Patches item #421709, was updated on 2001-05-05 13:30 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=421709&group_id=5470 Category: core (C code) Group: None Status: Open Resolution: None Priority: 5 Submitted By: John D. Heintz (jheintz) Assigned to: Barry Warsaw (bwarsaw) Summary: Access { thread id : frame } dict Initial Comment: This patch adds a new function sys._getframes() that returns a dictionary mapping from thread id to current frame object. This is very useful when diagnosing deadlock issues in Python code. The new C code function is purely additive except for modifying the PyThreadState struct (adding a long thread_ident) and modifying PyThreadState_New() function to set this new long. ---------------------------------------------------------------------- >Comment By: John D. Heintz (jheintz) Date: 2001-06-04 10:52 Message: Logged In: YES user_id=20438 Martin: I agree with you on the documentation issue and will look into the tuple size issue you raised. The docstring is modeled on the sys._getframe() function so I figured it would be sufficient to follow the leader. (I think that both sys._getframe() and sys._getframes() should be part of the public api for the sys module by the way.) ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2001-06-04 10:28 Message: Logged In: YES user_id=21627 I think the patch could use some more documentation, e.g. 
as a patch to Doc/lib/libsys.tex. E.g. what are the tuples that are put into the dictionaries? Also, isn't there a problem with the tuple size? The patch allocates tuples of size 0, but then puts things into index 0. Is there any kind of test case for this code? Finally, I don't think the docstring should say that the function is for internal and specialized purposes only (what specialized purposes, anyway), if you think its primary use is in diagnosing deadlocks. It should only document what the function does, not what you intend it to use for. For these reasons, I also think its name should not start with an underscore. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=421709&group_id=5470 From noreply@sourceforge.net Mon Jun 4 19:32:29 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 04 Jun 2001 11:32:29 -0700 Subject: [Patches] [ python-Patches-407764 ] allow whitespace lines for doctest tests Message-ID: Patches item #407764, was updated on 2001-03-11 13:37 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=407764&group_id=5470 Category: library Group: None Status: Open Resolution: None Priority: 5 Submitted By: Trent Mick (tmick) >Assigned to: Trent Mick (tmick) Summary: allow whitespace lines for doctest tests Initial Comment: Currently doctest.py does not allow individual tests to have all-whitespace output lines. This patch proposes a fix for this. With this patch a leading '.' on a doctest output line, if and only if the tests are indented, will signal that following whitespace *is* the expected output. For example, currently this cannot be doctest'ed """ >>> print "\nhello\n" hello >>> """ But with this patch *this* can be: # file test_doctest.py """ >>> print "\nhello\n" . hello . 
>>> """ def _test(): import doctest, test_doctest return doctest.testmod(test_doctest) if __name__ == "__main__": _test() ---------------------------------------------------------------------- >Comment By: Tim Peters (tim_one) Date: 2001-06-04 11:32 Message: Logged In: YES user_id=31435 Should have assigned this back to Trent months ago. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2001-03-16 21:04 Message: Logged In: YES user_id=31435 Trent, yuck. doctests are primarily documentation, and there's nothing about "." that suggests-- let alone screams --"ah, this line is really a blank line, not the period that it sure looks like". Too confusing. I'd be happy with this if it *screamed* "blank line", though! For example, accept as meaning it's really a blank line. In that case, though, note that: 1. The restriction about blank lines is documented in both doctest's docstrings and in the Library Manual, so this would also need doc changes in both places. and 2. doctest is self-testing, i.e. the standard test for doctest simply runs doctest on doctest. So in the very same place you document your blank line convention in the doctest docstring, you should also include an executable doctest example in the docstring. Then the standard test_doctest.py will verify that it works exactly as advertised forever more. ---------------------------------------------------------------------- Comment By: Trent Mick (tmick) Date: 2001-03-11 13:40 Message: Logged In: YES user_id=34892 Grrr, the code I put in the comment is supposed to be indented of course. I will attach the test_doctest.py to clarify. 
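
[Editorial illustration, not part of the tracker item.] For comparison with the '.' marker proposed in patch #407764: present-day doctest spells an expected blank line as <BLANKLINE>. The sketch below assumes a current Python 3 and uses only documented doctest classes; the name run_example and the EXAMPLE text are made up for this illustration.

```python
import doctest

# Doctest text with blank expected-output lines. The raw string keeps the
# "\n" escapes literal so that doctest, not this module, interprets them.
EXAMPLE = r'''
>>> print("\nhello\n")
<BLANKLINE>
hello
<BLANKLINE>
'''


def run_example(text):
    """Parse and run one doctest string; return (failed, attempted)."""
    parser = doctest.DocTestParser()
    test = parser.get_doctest(text, globs={}, name="blankline_demo",
                              filename=None, lineno=0)
    runner = doctest.DocTestRunner(verbose=False)
    result = runner.run(test)
    return result.failed, result.attempted


if __name__ == "__main__":
    failed, attempted = run_example(EXAMPLE)
    print("failed:", failed, "attempted:", attempted)
```

A literal empty line in the expected output would still end the example, which is the restriction the patch set out to relax.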
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=407764&group_id=5470 From noreply@sourceforge.net Mon Jun 4 21:04:53 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 04 Jun 2001 13:04:53 -0700 Subject: [Patches] [ python-Patches-403514 ] small speedup in Tkinter.Misc._bind Message-ID: Patches item #403514, was updated on 2001-01-30 12:21 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=403514&group_id=5470 Category: Tkinter Group: None Status: Open Resolution: None Priority: 5 Submitted By: Markus F.X.J. Oberhumer (mfx) Assigned to: Fredrik Lundh (effbot) Summary: small speedup in Tkinter.Misc._bind Initial Comment: This patch precomputes _subst_format_str to avoid a call to _string.join() on each invocation of _bind. It gives a small but noticeable speed improvement when creating a lot of bindings, such as in the upcoming PySol Mahjongg games. ---------------------------------------------------------------------- >Comment By: Martin v. Löwis (loewis) Date: 2001-06-04 13:04 Message: Logged In: YES user_id=21627 I recommend to approve this patch.
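
[Editorial illustration, not part of the tracker item.] The precomputation idea in patch #403514 can be sketched without Tkinter. Everything below is a hypothetical stand-in: SUBST_FORMAT, the two classes, and bind_command are invented for this example and are not Tkinter's actual code. One class joins the format tuple on every call; the other joins it once when the class body runs.

```python
# Hypothetical stand-in for a tuple of Tk event substitution fields.
SUBST_FORMAT = ("%#", "%b", "%f", "%h", "%k")


class JoinEveryCall:
    """Rebuilds the joined format string on each invocation."""

    def bind_command(self, name):
        fmt = " ".join(SUBST_FORMAT)  # repeated work on every call
        return "%s %s" % (name, fmt)


class JoinOnce:
    """Computes the joined format string a single time, at class creation."""

    _subst_format_str = " ".join(SUBST_FORMAT)

    def bind_command(self, name):
        return "%s %s" % (name, self._subst_format_str)


if __name__ == "__main__":
    # Both variants build the same command text; only the cost differs.
    print(JoinEveryCall().bind_command("callback"))
    print(JoinOnce().bind_command("callback"))
```

The saving per call is small, so it only matters when many bindings are created, which matches the use case given in the initial comment.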
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=403514&group_id=5470 From noreply@sourceforge.net Mon Jun 4 21:45:15 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 04 Jun 2001 13:45:15 -0700 Subject: [Patches] [ python-Patches-403743 ] [windows] Correction to bug #131273 Message-ID: Patches item #403743, was updated on 2001-02-12 01:03 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=403743&group_id=5470 Category: Windows Group: None Status: Open Resolution: None Priority: 5 Submitted By: Christophe Gouiran (cgouiran) Assigned to: Mark Hammond (mhammond) Summary: [windows] Correction to bug #131273 Initial Comment: I found a bug in the posixmodule.c file: child processes are not killed when Python exits. In posixmodule I wrote a win32_atexit() function that does the trick. It is then registered in the INITFUNC function with the atexit() function. Now at exit, any child processes are automatically killed. The patch must be applied in the module directory, not in the Python root directory. ---------------------------------------------------------------------- >Comment By: Martin v. Löwis (loewis) Date: 2001-06-04 13:45 Message: Logged In: YES user_id=21627 It appears that the patch is a nearly empty file containing only garbage. Christophe, you probably should try uploading it again. ---------------------------------------------------------------------- Comment By: Fred L. Drake, Jr. (fdrake) Date: 2001-02-12 08:07 Message: Assigned to Mark Hammond since the original bug is already assigned to him.
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=403743&group_id=5470 From noreply@sourceforge.net Mon Jun 4 21:55:33 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 04 Jun 2001 13:55:33 -0700 Subject: [Patches] [ python-Patches-403753 ] zlib decompress; uncontrollable memory usage Message-ID: Patches item #403753, was updated on 2001-02-12 08:10 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=403753&group_id=5470 Category: Modules Group: None Status: Open Resolution: None Priority: 5 Submitted By: Toby Dickenson (htrd) Assigned to: Jeremy Hylton (jhylton) Summary: zlib decompress; uncontrollable memory usage Initial Comment: zlib's decompress method will allocate as much memory as is needed to hold the decompressed output. The length of the output buffer may be very much larger than the length of the input buffer, and the Python code calling the decompress method has no other way to control how much memory is allocated. In experiments, I have seen decompress generate output that is 1000 times larger than its input. These characteristics may make the decompress method unsuitable for handling data obtained from untrusted sources (for example, in an HTTP proxy which implements gzip encoding) since it may be vulnerable to a denial of service attack. A malicious user could construct a moderately sized input which forces 'decompress' to try to allocate too much memory. This patch adds a new method, decompress_incremental, which allows the caller to specify the maximum size of the output. This method returns the excess input, in addition to the decompressed output. It is possible to solve this problem without a patch: If input is fed to the decompressor a few tens of bytes at a time, memory usage will surge by (at most) a few tens of kilobytes.
Such a process is a kludge, and much less efficient than the approach used in this patch. (I've not been able to test the documentation patch; I hope it's OK.) (This patch also includes the change from Patch #103748) ---------------------------------------------------------------------- >Comment By: Martin v. Löwis (loewis) Date: 2001-06-04 13:55 Message: Logged In: YES user_id=21627 The patch looks good to me; I recommend approving it. ---------------------------------------------------------------------- Comment By: A.M. Kuchling (akuchling) Date: 2001-04-10 10:52 Message: Logged In: YES user_id=11375 I've let this patch gather dust long enough. Unassigning so that someone else can review it. ---------------------------------------------------------------------- Comment By: Gregory P. Smith (greg) Date: 2001-04-07 00:20 Message: Logged In: YES user_id=413 As a side note, I believe I implemented a Python workaround for this problem by just decompressing data in small chunks (4k at a time) using a decompressor object. See the mojonation project on sourceforge if you're curious (specifically, in the mojonation evil module, look at common/mojoutil.py for the function named safe_zlib_decompress). Regardless, I like the idea of this patch. It would be good to have that in the main API and documentation for simplicity. (and because there are too many programmers out there who don't realize potential denial of service issues on their own...)
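For readers following this thread, here is a minimal sketch of bounded decompression with a decompressor object, in the spirit of the chunked workaround mentioned above. It assumes the interface found in later releases of Python's zlib module (an optional max_length argument to decompress() and an unconsumed_tail attribute on the decompressor object). The helper name bounded_decompress and its limits are hypothetical and are not taken from the patch or from mojoutil.py.

```python
import zlib


def bounded_decompress(data, max_output=1 << 20, chunk=4096):
    """Decompress `data`, refusing to produce more than `max_output` bytes.

    Hypothetical helper for illustration; not the code from the patch.
    """
    d = zlib.decompressobj()
    pieces = []
    produced = 0
    pending = data
    while pending:
        # Ask for at most `chunk` bytes of output per call.
        out = d.decompress(pending, chunk)
        produced += len(out)
        if produced > max_output:
            raise ValueError("decompressed output exceeds %d bytes" % max_output)
        pieces.append(out)
        # Input that was not processed because the output cap was reached.
        pending = d.unconsumed_tail
    tail = d.flush()
    if produced + len(tail) > max_output:
        raise ValueError("decompressed output exceeds %d bytes" % max_output)
    return b"".join(pieces) + tail
```

Because each call yields at most `chunk` bytes, memory use grows in small steps and the caller can stop as soon as the limit is crossed, instead of discovering the size only after everything has been allocated.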
---------------------------------------------------------------------- Comment By: Toby Dickenson (htrd) Date: 2001-02-22 04:50 Message: New patch implementing a new optional parameter to .decompress, and a new attribute .unconsumed_tail ---------------------------------------------------------------------- Comment By: Toby Dickenson (htrd) Date: 2001-02-22 03:42 Message: Waaah - that last comment should be 'can't' not 'can' ---------------------------------------------------------------------- Comment By: Toby Dickenson (htrd) Date: 2001-02-22 03:40 Message: We can reuse .unused_data without introducing an ambiguity. I will prepare a patch that uses a new attribute .unconsumed_tail ---------------------------------------------------------------------- Comment By: A.M. Kuchling (akuchling) Date: 2001-02-21 11:32 Message: Doesn't .unused_data serve much the same purpose, though? So that even with a maximum size, .decompress() always returns a string, and .unused_data would contain the unprocessed data. ---------------------------------------------------------------------- Comment By: Toby Dickenson (htrd) Date: 2001-02-21 06:00 Message: I did consider that.... An extra change that you didn't mention is the need for a different return value. Currently .decompress() always returns a string. The new method in my patch returns a tuple containing the same string, and an integer specifying how many bytes were consumed from the input. Overloading return values based on an optional parameter seems a little hairy to me, but I would be happy to change the patch if that is your preferred option. I also considered (and rejected) the possibility of adding an optional max-size argument to .decompress() as you suggest, but raising an exception if this limit is exceeded. This avoids the need for an extra return value, but loses out on flexibility. ---------------------------------------------------------------------- Comment By: A.M.
Kuchling (akuchling) Date: 2001-02-20 18:48 Message: Rather than introducing a new method, why not just add an optional maxlen argument to .decompress(). I think the changes would be: * add 'int maxlen=-1;' * add "...|i" ... ,&maxlen to the argument parsing * if maxlen != -1, length = maxlen else length = DEFAULTALLOC; * Add '&& maxlen==-1' to the while loop. (Use the current CVS; I just checked in a patch rearranging the zlib module a bit.) Do you want to make those changes and resubmit the patch? ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=403753&group_id=5470 From noreply@sourceforge.net Mon Jun 4 21:59:12 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 04 Jun 2001 13:59:12 -0700 Subject: [Patches] [ python-Patches-403977 ] Rename config.h to pyac_config.h, per SF bug #131774 Message-ID: Patches item #403977, was updated on 2001-02-23 13:28 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=403977&group_id=5470 Category: Build Group: None Status: Open Resolution: Postponed Priority: 1 Submitted By: Thomas Wouters (twouters) Assigned to: Thomas Wouters (twouters) Summary: Rename config.h to pyac_config.h, per SF bug #131774 Initial Comment: This patch fixes the UNIX and Windows builds to use 'pyac_config.h' instead of 'config.h', to avoid the problems summarized in SF bug #131774. It doesn't address the placing issue, however, because I believe it's intended to be like this. Most changes were done using a fairly intelligent shell+sed oneliner, but they should be correct. The Windows build *seems* correct, though I can't be sure. Someone will have to check ;) It is probably a good idea to remove 'config.h' before testing, to be sure I got all references. The UNIX build requires that autoconf is installed, and requires a 'autoheader ; autoconf' is done before running 'configure'. 
Removing config.h(.in) is also a good idea. I excluded the OS2 build files, and will be uploading those as a separate patch to avoid making this one unreadable. Though only two files are involved, they both list all dependencies for *all* files in their entirety, so the patch is quite large. If those files are auto-generated, someone please tell me so :-) I also didn't fix distutils, though it looks like it does need fixing. And I didn't do anything wrt. backwards compatibility. We should probably provide a config.h that just does #warning Warning: Use of Python-specific config.h is deprecated. Use pyac_config.h instead. #include The name is just my suggestion, changing it into something less acronymic would be no problem at all. I think 'pythonconfig.h' gives the wrong message though: the file isn't used to configure Python itself, after all ;) ---------------------------------------------------------------------- >Comment By: Martin v. Löwis (loewis) Date: 2001-06-04 13:59 Message: Logged In: YES user_id=21627 I think we should come to a conclusion for these patches, and apply one of them. I still like pyconfig.h better than pyac_config.h, but apart from that, *something* should get installed. ---------------------------------------------------------------------- Comment By: Thomas Wouters (twouters) Date: 2001-04-11 05:43 Message: Logged In: YES user_id=34209 I'm not sure about the supersedence here. See my comment in #411138. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2001-04-10 15:15 Message: Logged In: YES user_id=6380 Is this superseded by patch #411138? ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2001-03-18 16:09 Message: Logged In: YES user_id=6380 Let's do this after 2.1 is released. Status set to postponed and priority lowered.
---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2001-03-01 17:29 Message: Logged In: YES user_id=31435 Na, I don't mind the pyac name. I had forgotten (or perhaps never knew) that this thing is a generated file (on Windows it's done by hand). It's an internal implementation detail anyway, so it doesn't matter if the name "makes sense" to Windows geeks; at least pyac_config will make some sense to Linux dweebs. ---------------------------------------------------------------------- Comment By: Trent Mick (tmick) Date: 2001-03-01 17:08 Message: Logged In: YES user_id=34892 Tim said: > BTW, I have no idea what "pyac" is supposed to bring > to mind. Is that some Unixism? In answer to that: how about just calling it "pyconfig.h"? The reference to autoconf is not very accurate for Windows. ---------------------------------------------------------------------- Comment By: Thomas Wouters (twouters) Date: 2001-02-28 01:07 Message: Logged In: YES user_id=34209 I forgot to mention that I think this should be postponed until 2.2 or 2.1.1 anyway. It's not that big a change, but it's big enough to have weird and unexpected side effects. The bug is now numbered #231774, by the way. The problem is that 'config.h' is an oft-used name, and if you include it but have another directory with another project's config.h earlier in your include path, you get the wrong one. Similarly, if you intend to use the other one, you may get this one.
Leaving a fake config.h would only cause this patch to fix half of those problems, but only the first problem was reported in the bug report :) The 'pyac_config' name comes from 'python', 'autoconf', 'config', and is IMHO sufficiently vague that it implies it is autogenerated :-) ---------------------------------------------------------------------- Comment By: Jeremy Hylton (jhylton) Date: 2001-02-27 23:14 Message: Logged In: YES user_id=31392 No time ---------------------------------------------------------------------- Comment By: Neil Schemenauer (nascheme) Date: 2001-02-27 13:53 Message: Logged In: YES user_id=35752 SF seems to have changed the bug ids! I can't find bug #131774. Unless there is a very good reason for the change I'm against it for 2.1. ---------------------------------------------------------------------- Comment By: A.M. Kuchling (akuchling) Date: 2001-02-27 13:05 Message: Logged In: YES user_id=11375 Regarding Distutils: I think the only actual *code* that would change is in distutils/sysconfig.py, in the get_config_h_filename() method. For backward compat., this method would probably have to check the Python version and use pyac_config.h if the version is 2.1 or greater. There are also lots of references to config.h in comments; we can change those or not, as desired. (I probably *would* change most of them.) ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2001-02-27 12:48 Message: Logged In: YES user_id=31435 Pushed onto Jeremy. Jeremy, do we want to do this much fiddling so late in the cycle? Thomas, don't worry about Windows. I only need a warning about that, and I'm aware of this now (thanks!). Check in the new MS project files or don't, it's easy for me to fix 'em up regardless (indeed, it's not worth extra time to check it in advance). Note that "#warning" is not std C. I'm afraid you'll have to make it an #error.
OTOH, if you leave a file *named* "config.h" in the distribution, it doesn't really address the bug report, right? BTW, I have no idea what "pyac" is supposed to bring to mind. Is that some Unixism? ---------------------------------------------------------------------- Comment By: Thomas Wouters (twouters) Date: 2001-02-23 13:32 Message: Apologies for the large blurb in the 'details' section. I keep forgetting SF strips *all* whitespace from that block :( Assigning to Tim "The Windows Bot" Peters to test (and fix) the Windows build changes. Let me know if your patch still doesn't work and you want me to send you patched files instead, Tim. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=403977&group_id=5470 From noreply@sourceforge.net Tue Jun 5 03:40:45 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 04 Jun 2001 19:40:45 -0700 Subject: [Patches] [ python-Patches-430181 ] Make httplib work with picky servers Message-ID: Patches item #430181, was updated on 2001-06-04 19:40 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=430181&group_id=5470 Category: library Group: 2.0.1 bugfix Status: Open Resolution: None Priority: 5 Submitted By: Leonard Samuelson (lenski) Assigned to: Nobody/Anonymous (nobody) Summary: Make httplib work with picky servers Initial Comment: Python2.0: httplib.py: httplib: HTTPconnection Header processing: (putheader, putrequest, and endheaders) methods transmit each HTTP header line using a separate socket send invocation. Before this change, my Linksys Etherfast Cable/DSL router (Linksys BEFSR41, firmware v 1.22, March 31 2000) rejected the request because the entire HTTP header block is not contained in a single TCP packet. Clearly, the router is engaging in a noncompliant optimization!
This patch is not required to allow httplib to work with real servers, making it completely optional. The patch I am submitting with this note causes httplib to work with the router. It is intended mostly as a model; a developer with greater familiarity with the library might have a better approach. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=430181&group_id=5470 From noreply@sourceforge.net Tue Jun 5 09:00:33 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Tue, 05 Jun 2001 01:00:33 -0700 Subject: [Patches] [ python-Patches-421709 ] Access { thread id : frame } dict Message-ID: Patches item #421709, was updated on 2001-05-05 13:30 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=421709&group_id=5470 Category: core (C code) Group: None Status: Open Resolution: None Priority: 5 Submitted By: John D. Heintz (jheintz) Assigned to: Barry Warsaw (bwarsaw) Summary: Access { thread id : frame } dict Initial Comment: This patch adds a new function sys._getframes() that returns a dictionary mapping from thread id to current frame object. This is very useful when diagnosing deadlock issues in Python code. The new C code function is purely additive except for modifying the PyThreadState struct (adding a long thread_ident) and modifying PyThreadState_New() function to set this new long. ---------------------------------------------------------------------- >Comment By: Martin v. Löwis (loewis) Date: 2001-06-05 01:00 Message: Logged In: YES user_id=21627 There is a difference between these two functions. _getframe is not an official API; inspect.currentframe is the official API. It seems that your function is meant to be used via sys, so it would be public there. 
In any case, I also think that the sys._getframe doc string should not talk about intended uses - if anything, it should mention what function to call instead. ---------------------------------------------------------------------- Comment By: John D. Heintz (jheintz) Date: 2001-06-04 10:52 Message: Logged In: YES user_id=20438 Martin: I agree with you on the documentation issue and will look into the tuple size issue you raised. The docstring is modeled on the sys._getframe() function so I figured it would be sufficient to follow the leader. (I think that both sys._getframe() and sys._getframes() should be part of the public api for the sys module by the way.) ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2001-06-04 10:28 Message: Logged In: YES user_id=21627 I think the patch could use some more documentation, e.g. as a patch to Doc/lib/libsys.tex. E.g. what are the tuples that are put into the dictionaries? Also, isn't there a problem with the tuple size? The patch allocates tuples of size 0, but then puts things into index 0. Is there any kind of test case for this code? Finally, I don't think the docstring should say that the function is for internal and specialized purposes only (what specialized purposes, anyway), if you think its primary use is in diagnosing deadlocks. It should only document what the function does, not what you intend it to be used for. For these reasons, I also think its name should not start with an underscore.
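As a rough illustration of the use case discussed in this thread (inspecting every thread's current frame when hunting a deadlock), the sketch below relies on sys._current_frames(), which later CPython releases provide and which returns a dictionary mapping thread id to frame. It is a stand-in for the proposed sys._getframes(), whose code is not shown here; the helper name dump_thread_stacks is hypothetical.

```python
import sys
import threading
import traceback


def dump_thread_stacks():
    """Return {thread id: (thread name, formatted stack)} for all threads.

    Uses sys._current_frames() as a stand-in for the proposed function.
    """
    names = {t.ident: t.name for t in threading.enumerate()}
    report = {}
    for ident, frame in sys._current_frames().items():
        label = names.get(ident, "unknown")
        report[ident] = (label, "".join(traceback.format_stack(frame)))
    return report


if __name__ == "__main__":
    for ident, (label, stack) in dump_thread_stacks().items():
        print("Thread %s (%s):\n%s" % (ident, label, stack))
```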
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=421709&group_id=5470 From noreply@sourceforge.net Wed Jun 6 07:27:07 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Tue, 05 Jun 2001 23:27:07 -0700 Subject: [Patches] [ python-Patches-409973 ] glob.glob speedups Message-ID: Patches item #409973, was updated on 2001-03-20 01:57 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=409973&group_id=5470 Category: library Group: None >Status: Closed >Resolution: Accepted Priority: 5 Submitted By: Rob W.W. Hooft (hooft) Assigned to: Nobody/Anonymous (nobody) Summary: glob.glob speedups Initial Comment: A lot of the time spent by glob.glob on large directories is spent doing os.path.normcase(). Half of this can be saved by normcasing the pattern only once, and on unix the whole normcase call can be left out. This patch attempts to optimize globbing even a bit more by delegating the fnmatching of a list of file names to a new function fnmatch.filter, which allows us to move a few more lookups outside of the file name loop. Furthermore, an optimization is added to glob.glob calls that do not contain any directory specifications, saving a round of os.path.join calls. Speedups of the pattern '*.py?' in the python lib directory range from a factor of 2 with directory specification to a factor of 5 without directory specifications. Unfortunately there is no test_glob regression test, but I did my best to verify that nothing changed in my calls. ---------------------------------------------------------------------- >Comment By: Martin v. Löwis (loewis) Date: 2001-06-05 23:27 Message: Logged In: YES user_id=21627 Committed as glob.py 1.10 and fnmatch.py 1.12. 
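The idea behind the speedup can be sketched with fnmatch.filter(), which exists in the standard library today: the pattern is prepared once and applied to a whole list of names, and the os.path.join round is skipped when no directory was given. The helper name match_names is hypothetical and this is not the committed glob.py code; it also ignores glob's special handling of names that start with a dot.

```python
import fnmatch
import os


def match_names(names, pattern, dirname=""):
    """Filter a directory listing against `pattern` in one call.

    Hypothetical helper for illustration only.
    """
    # One call handles the whole list, so per-name setup work is avoided.
    matches = fnmatch.filter(names, pattern)
    if not dirname:
        # No directory specification: no os.path.join calls are needed.
        return matches
    return [os.path.join(dirname, name) for name in matches]


if __name__ == "__main__":
    print(match_names(os.listdir(os.curdir), "*.py?"))
```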
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=409973&group_id=5470 From noreply@sourceforge.net Wed Jun 6 07:39:59 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Tue, 05 Jun 2001 23:39:59 -0700 Subject: [Patches] [ python-Patches-412229 ] runtime RTLD_NOW control via sys Message-ID: Patches item #412229, was updated on 2001-03-29 08:55 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=412229&group_id=5470 Category: core (C code) Group: None Status: Open Resolution: None Priority: 5 Submitted By: Bram Stolk (bram) Assigned to: Nobody/Anonymous (nobody) Summary: runtime RTLD_NOW control via sys Initial Comment: This patch enables runtime control over the RTLD_NOW flag, which can be used to do lazy symbol resolving when loading a shared lib. It's an extension to the sys module: sys.setlazysymresolve(0|1) The patch is against the latest CVS code, and was generated by 'cvs diff'. ---------------------------------------------------------------------- >Comment By: Martin v. Löwis (loewis) Date: 2001-06-05 23:39 Message: Logged In: YES user_id=21627 The patch needs further work: The code currently compiles on systems which don't define RTLD_NOW (although I'm not sure what these systems are); your code doesn't. Also, the code allows setting the flags, but has no interface to query them. Finally, users often complain that Python should use RTLD_GLOBAL, so that they can share symbols across extension modules. Therefore, I propose that you allow setting arbitrary dlopen flags; users would have to write sys.setdlopenflags(0) to turn off RTLD_NOW, and use sys.setdlopenflags(dl.RTLD_NOW|dl.RTLD_GLOBAL) to add RTLD_GLOBAL. When you revise this patch, please submit unified (-u) or context (-c) diffs.
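A small sketch of how the interface proposed in the comment above might be used, including the query function that was asked for. It assumes the functions as they exist in later Python releases on platforms with dlopen(): sys.setdlopenflags(), sys.getdlopenflags(), and the os.RTLD_* constants (used here in place of the dl module). The context manager name dlopen_flags is hypothetical.

```python
import os
import sys
from contextlib import contextmanager


@contextmanager
def dlopen_flags(flags):
    """Temporarily change the flags used when loading extension modules."""
    if not hasattr(sys, "setdlopenflags"):
        # Platforms without dlopen() (e.g. Windows): nothing to change.
        yield None
        return
    saved = sys.getdlopenflags()
    sys.setdlopenflags(flags)
    try:
        yield saved
    finally:
        # Always restore the previous flags, even if an import fails.
        sys.setdlopenflags(saved)


if __name__ == "__main__" and hasattr(os, "RTLD_NOW"):
    with dlopen_flags(os.RTLD_NOW | os.RTLD_GLOBAL):
        pass  # import the extension modules that must share symbols here
```

Restoring the saved value keeps the change local to the imports that need it, which is one reason a query function matters alongside the setter.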
---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2001-04-10 14:48 Message: Logged In: YES user_id=6380 Sorry, no new features in 2.1. I'll look at this after 2.1 is released though. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=412229&group_id=5470 From noreply@sourceforge.net Wed Jun 6 07:45:02 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Tue, 05 Jun 2001 23:45:02 -0700 Subject: [Patches] [ python-Patches-414492 ] adds a gc.get_generation function Message-ID: Patches item #414492, was updated on 2001-04-07 00:33 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=414492&group_id=5470 Category: core (C code) Group: None Status: Open Resolution: None Priority: 5 Submitted By: Gregory P. Smith (greg) Assigned to: Neil Schemenauer (nascheme) Summary: adds a gc.get_generation function Initial Comment: gc.get_generation(num) added by this patch allows you to get a list of all objects in a given garbage collector generation. I wrote this while trying to debug a memory leak so that I could peek at what types of objects were remaining allocated but never freed. Looking through the patches I see another similarish patch that allows for searching the collection lists for references to a particular thing or set of things. Interesting. Is it useful? Yes and no. I still haven't found the memory leak. But I know what objects are consuming it so I can narrow my search through the code to find how they are remaining referenced. As a side note, there's not much point in the generation number parameter to this method; 2 is the only generation really worth examining. This or something like it would be nice to see in a future python gc module as a debugging aid.
---------------------------------------------------------------------- >Comment By: Martin v. Löwis (loewis) Date: 2001-06-05 23:45 Message: Logged In: YES user_id=21627 I still would like to see my gc.getreferents patch applied, which offers a similar debugging aid. However, since this offers a somewhat orthogonal functionality, and is a quite short patch, I recommend to approve it. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=414492&group_id=5470 From noreply@sourceforge.net Wed Jun 6 16:33:54 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Wed, 06 Jun 2001 08:33:54 -0700 Subject: [Patches] [ python-Patches-430706 ] Persistent connections in BaseHTTPServer Message-ID: Patches item #430706, was updated on 2001-06-06 08:33 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=430706&group_id=5470 Category: library Group: None Status: Open Resolution: None Priority: 5 Submitted By: Chris Lawrence (lordsutch) Assigned to: Nobody/Anonymous (nobody) Summary: Persistent connections in BaseHTTPServer Initial Comment: This patch provides HTTP/1.1 persistent connection support in BaseHTTPServer.py. It is not enabled by default (for backwards compatibility) because Content-Length headers must be supplied for persistent connections to work correctly. 
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=430706&group_id=5470 From noreply@sourceforge.net Wed Jun 6 18:21:19 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Wed, 06 Jun 2001 10:21:19 -0700 Subject: [Patches] [ python-Patches-430754 ] Makes ftpmirror.py .netrc aware Message-ID: Patches item #430754, was updated on 2001-06-06 10:21 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=430754&group_id=5470 Category: demos and tools Group: None Status: Open Resolution: None Priority: 5 Submitted By: Mike Romberg (romberg) Assigned to: Nobody/Anonymous (nobody) Summary: Makes ftpmirror.py .netrc aware Initial Comment: The following patch modifies the ftpmirror.py script found in Tools/scripts to use the netrc module. This allows the ftpmirror script to act more like a standard ftp client and take the login, password, and account from a users $HOME/.netrc file if it exists. This patch is against the ftpmirror.py found in python 2.1 ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=430754&group_id=5470 From noreply@sourceforge.net Wed Jun 6 22:14:08 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Wed, 06 Jun 2001 14:14:08 -0700 Subject: [Patches] [ python-Patches-430846 ] faster string-decoding in base64.py Message-ID: Patches item #430846, was updated on 2001-06-06 14:14 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=430846&group_id=5470 Category: library Group: None Status: Open Resolution: None Priority: 3 Submitted By: Peter Schneider-Kamp (nowonder) Assigned to: Nobody/Anonymous (nobody) Summary: faster string-decoding in base64.py Initial Comment: This addresses bug #419390 by anthonybaxter. 
Instead of wrapping a string-to-be-decoded into a StringIO class and using base64.decode use binascii.a2b_base64 directly. Speedup for big files is over 10 times (on Linux x86 anyway). If uncontroversial I'll check it in. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=430846&group_id=5470 From noreply@sourceforge.net Wed Jun 6 22:25:56 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Wed, 06 Jun 2001 14:25:56 -0700 Subject: [Patches] [ python-Patches-430846 ] faster string-decoding in base64.py Message-ID: Patches item #430846, was updated on 2001-06-06 14:14 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=430846&group_id=5470 Category: library Group: None Status: Open Resolution: None Priority: 3 Submitted By: Peter Schneider-Kamp (nowonder) Assigned to: Nobody/Anonymous (nobody) Summary: faster string-decoding in base64.py Initial Comment: This addresses bug #419390 by anthonybaxter. Instead of wrapping a string-to-be-decoded into a StringIO class and using base64.decode use binascii.a2b_base64 directly. Speedup for big files is over 10 times (on Linux x86 anyway). If uncontroversial I'll check it in. ---------------------------------------------------------------------- >Comment By: Tim Peters (tim_one) Date: 2001-06-06 14:25 Message: Logged In: YES user_id=31435 Umm -- there's no patch here. If there were, I bet I would have changed this to Accepted, though . 
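Since no patch was attached, here is only a sketch of the two approaches the initial comment contrasts, written against the current standard library (bytes and io.BytesIO rather than the StringIO of the time). The function names decodestring_slow and decodestring_fast are hypothetical labels, not the names used in base64.py.

```python
import base64
import binascii
import io


def decodestring_slow(s):
    """Decode by wrapping the data in file-like objects."""
    out = io.BytesIO()
    # base64.decode reads the input line by line and decodes each line.
    base64.decode(io.BytesIO(s), out)
    return out.getvalue()


def decodestring_fast(s):
    """Decode the whole buffer with a single binascii call."""
    return binascii.a2b_base64(s)


if __name__ == "__main__":
    encoded = base64.encodebytes(b"hello world" * 1000)
    assert decodestring_fast(encoded) == decodestring_slow(encoded)
```

The single call avoids the per-line loop and the intermediate file objects, which is where the reported speedup for large inputs would come from.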
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=430846&group_id=5470 From noreply@sourceforge.net Thu Jun 7 06:25:48 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Wed, 06 Jun 2001 22:25:48 -0700 Subject: [Patches] [ python-Patches-413171 ] fix UserDict.get, setdefault, update Message-ID: Patches item #413171, was updated on 2001-04-02 10:18 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=413171&group_id=5470 Category: library Group: None Status: Open Resolution: Postponed Priority: 4 Submitted By: Ka-Ping Yee (ping) Assigned to: Ka-Ping Yee (ping) Summary: fix UserDict.get, setdefault, update Initial Comment: The methods 'get', 'setdefault', and 'update' on a dictionary are usually implemented (and thought of) in terms of the lower-level methods has_key, __getitem__, and __setitem__. The current implementation of UserDict relays a call to e.g. x.get() to x.data.get(), which behaves inconsistently if __getitem__ has been implemented on x. One particular big place where this turns up is cgi. If you get a dict = cgi.SvFormContentDict(), then dict.get('key') will return a *list* even though dict['key'] returns a single item! To make UserDict behave consistently, this patch fixes get(), update(), and setdefault() to re-use the other methods. Then the only occurrence of self.data[k] = v is in __setitem__, the only occurrence of self.data[k] without assignment is in __getitem__, etc. ---------------------------------------------------------------------- >Comment By: Martin v. Löwis (loewis) Date: 2001-06-06 22:25 Message: Logged In: YES user_id=21627 I recommend to approve this patch. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2001-04-10 14:17 Message: Logged In: YES user_id=6380 Let's not fix this in 2.1. 
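To make the inconsistency concrete, here is a minimal sketch of a UserDict-like class in which get(), setdefault(), and update() are written in terms of __getitem__ and __setitem__, as the initial comment describes. The class names ConsistentDict and FirstValueDict are hypothetical, and the sketch uses try/except KeyError rather than has_key; it is not the code from the patch.

```python
class ConsistentDict:
    """Tiny UserDict-like wrapper; only the item methods touch self.data."""

    def __init__(self, initial=None):
        self.data = {}
        if initial is not None:
            self.update(initial)

    def __getitem__(self, key):
        return self.data[key]

    def __setitem__(self, key, value):
        self.data[key] = value

    def get(self, key, default=None):
        # Route through __getitem__ so subclass overrides are honored.
        try:
            return self[key]
        except KeyError:
            return default

    def setdefault(self, key, default=None):
        try:
            return self[key]
        except KeyError:
            self[key] = default
            return default

    def update(self, other):
        for key in other.keys():
            self[key] = other[key]


class FirstValueDict(ConsistentDict):
    """Stores lists but returns the first element, loosely like the cgi case."""

    def __getitem__(self, key):
        return self.data[key][0]
```

With get() relayed straight to self.data.get(), a FirstValueDict would hand back the stored list from get() while indexing returned a single item; routing through self[key] makes both agree.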
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=413171&group_id=5470 From noreply@sourceforge.net Thu Jun 7 06:28:36 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Wed, 06 Jun 2001 22:28:36 -0700 Subject: [Patches] [ python-Patches-414775 ] Add --skip-build option to bdist command Message-ID: Patches item #414775, was updated on 2001-04-08 18:20 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=414775&group_id=5470 Category: distutils Group: None Status: Open Resolution: None Priority: 5 Submitted By: Robert Kern (kern) Assigned to: A.M. Kuchling (akuchling) Summary: Add --skip-build option to bdist command Initial Comment: Whenever one uses a non-default compiler to build an extension, the bdist command will try to rebuild the package with the default compiler and fail. The install command has a --skip-build option to manually skip the re-building part of the install. I adapted that code to add a similar --skip-build option to the bdist, bdist_dumb, and bdist_wininst commands. I'm not familiar enough with the bdist_rpm command's code to see where it would work in there. ---------------------------------------------------------------------- >Comment By: Martin v. Löwis (loewis) Date: 2001-06-06 22:28 Message: Logged In: YES user_id=21627 I recommend to approve this patch. 
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=414775&group_id=5470 From noreply@sourceforge.net Thu Jun 7 06:29:01 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Wed, 06 Jun 2001 22:29:01 -0700 Subject: [Patches] [ python-Patches-430948 ] Performance improvement for profiler Message-ID: Patches item #430948, was updated on 2001-06-06 22:29 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=430948&group_id=5470 Category: library Group: None Status: Open Resolution: None Priority: 5 Submitted By: Fred L. Drake, Jr. (fdrake) Assigned to: Tim Peters (tim_one) Summary: Performance improvement for profiler Initial Comment: This patch adds a bit of complexity to Profile.__init__() in an effort to reduce the overhead of the profiler. The essential piece of the puzzle is that the general Profile.get_time() method is replaced with a function which does only as much as is needed for the underlying timer. For example, if time.clock() is available, it can become a PyCFunction instead of a bound method, requires only 1 dict lookup to execute instead of the 11 it takes to execute get_time() without this patch. Also removes a couple of duplicate imports from the "if __name__ == ..." section. 
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=430948&group_id=5470 From noreply@sourceforge.net Thu Jun 7 06:30:01 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Wed, 06 Jun 2001 22:30:01 -0700 Subject: [Patches] [ python-Patches-430948 ] Performance improvement for profiler Message-ID: Patches item #430948, was updated on 2001-06-06 22:29 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=430948&group_id=5470 Category: library Group: None Status: Open Resolution: None Priority: 5 Submitted By: Fred L. Drake, Jr. (fdrake) Assigned to: Tim Peters (tim_one) Summary: Performance improvement for profiler Initial Comment: This patch adds a bit of complexity to Profile.__init__() in an effort to reduce the overhead of the profiler. The essential piece of the puzzle is that the general Profile.get_time() method is replaced with a function which does only as much as is needed for the underlying timer. For example, if time.clock() is available, it can become a PyCFunction instead of a bound method, requires only 1 dict lookup to execute instead of the 11 it takes to execute get_time() without this patch. Also removes a couple of duplicate imports from the "if __name__ == ..." section. ---------------------------------------------------------------------- >Comment By: Fred L. Drake, Jr. (fdrake) Date: 2001-06-06 22:30 Message: Logged In: YES user_id=3066 I should note that this works with both 2.1.1 and 2.2, though this is not a bugfix. 
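A rough sketch of the idea described in the initial comment, assuming present-day Python: the timer is chosen once in __init__ and stored as a plain callable, so the hot path does not go through a general-purpose method. The class name TinyProfile is hypothetical and time.perf_counter stands in for the timers discussed in the patch.

```python
import time


class TinyProfile:
    """Hypothetical profiler skeleton that binds its timer once."""

    def __init__(self, timer=None):
        if timer is None:
            # Assumption: perf_counter plays the role of the default timer.
            timer = time.perf_counter
        # Stored on the instance as a plain callable, so calling
        # self.get_time() invokes the timer directly instead of a
        # wrapper method that re-examines the timer on every call.
        self.get_time = timer

    def measure(self, func, *args):
        start = self.get_time()
        result = func(*args)
        return result, self.get_time() - start


profile = TinyProfile()
result, elapsed = profile.measure(sum, range(1000))
print(result, elapsed >= 0)
```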
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=430948&group_id=5470 From noreply@sourceforge.net Thu Jun 7 06:33:24 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Wed, 06 Jun 2001 22:33:24 -0700 Subject: [Patches] [ python-Patches-415226 ] new base class for binary packaging Message-ID: Patches item #415226, was updated on 2001-04-10 12:51 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=415226&group_id=5470 Category: distutils Group: None Status: Open Resolution: None Priority: 5 Submitted By: Mark Alexander (mwa) Assigned to: A.M. Kuchling (akuchling) Summary: new base class for binary packaging Initial Comment: bdist_packager.py provides an abstract base class for bdist commands. It provides easy access to all the PEP 241 metadata fields, plus "revision" for the package revision and installation scripts for preinstall, postinstall, preremove, and postremove. That covers the base characteristics of all the package managers that I'm familiar with. If anyone can think of any others, let me know, otherwise additional extensions would be implemented in the specific packager's commands. I would, however, discourage _requiring_ any additional fields. It would be nice if by simply supplying the PEP 241 metadata under the [bdist_packager] section all subclassed packagers worked with no further effort. It also has rudimentary relocation support by including a --no-autorelocate option. The bdist_packager is also where I see creating separate binary packages for sub-packages supported. My need for that is much less than my desire for it right now, so I didn't give it much thought as I wrote it. I'd be delighted to hear any comments and suggestions on how to approach sub-packaging, though. ---------------------------------------------------------------------- >Comment By: Martin v.
Löwis (loewis) Date: 2001-06-06 22:33 Message: Logged In: YES user_id=21627 Shouldn't the patch also modify the existing bdist commands to use this as a base class? ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=415226&group_id=5470 From noreply@sourceforge.net Thu Jun 7 06:39:09 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Wed, 06 Jun 2001 22:39:09 -0700 Subject: [Patches] [ python-Patches-415227 ] Solaris pkgtool bdist command Message-ID: Patches item #415227, was updated on 2001-04-10 12:54 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=415227&group_id=5470 Category: distutils Group: None Status: Open Resolution: None Priority: 5 Submitted By: Mark Alexander (mwa) Assigned to: A.M. Kuchling (akuchling) Summary: Solaris pkgtool bdist command Initial Comment: The bdist_pktool command is based on bdist_packager and provides support for the Solaris pkgadd and pkgrm commands. In most cases, no additional options beyond the PEP 241 options are required. An exception is if the package name is >9 characters, a --pkg-abrev option is required because that's all pkgtool will handle. It makes listing the packages on the system a pain, but the actual package files produced do match name-version-revision-pyvers.pkg format. By default, bdist_pkgtool provides request, postinstall, preremove, and postremove scripts that will properly relocate modules to the site-packages directory and recompile all .py modules on the target machine. An author can provide a custom request script and either have it auto-relocate by merging the scripts, or inhibit auto-relocation with --no-autorelocate. ---------------------------------------------------------------------- >Comment By: Martin v. 
Löwis (loewis) Date: 2001-06-06 22:39 Message: Logged In: YES user_id=21627 Should there also be some Makefile machinery to create a Solaris package for python itself? There is a 1.6a2 package on sunfreeware; it would surely help if building Solaris packages was supported by the Python core itself. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=415227&group_id=5470 From noreply@sourceforge.net Thu Jun 7 06:45:43 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Wed, 06 Jun 2001 22:45:43 -0700 Subject: [Patches] [ python-Patches-415629 ] setup.py: readline req. ncurses (SuSE) Message-ID: Patches item #415629, was updated on 2001-04-12 02:29 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=415629&group_id=5470 Category: Build Group: None >Status: Closed >Resolution: Rejected Priority: 5 Submitted By: Nobody/Anonymous (nobody) Assigned to: Nobody/Anonymous (nobody) Summary: setup.py: readline req. ncurses (SuSE) Initial Comment: Python 2.1b2 on SuSE Linux 7.0: The readline extension module must be linked with libncurses, else 'import readline' fails because of unresolved symbols. (libtermcap is only installed for libc5 compatibility in SuSE 7.0) ---------------------------------------------------------------------- >Comment By: Martin v. Löwis (loewis) Date: 2001-06-06 22:45 Message: Logged In: YES user_id=21627 This patch is not necessary. If readline.so is a shared library that relies on libncurses, it should itself be linked with libncurses; this is indeed the case on SuSE 7.2: martin@mira:~ > ldd /usr/lib/libreadline.so libncurses.so.5 => /lib/libncurses.so.5 libc.so.6 => /lib/libc.so.6 /lib/ld-linux.so.2 => /lib/ld-linux.so.2 Now, if the libreadline.so on SuSE 7.0 does not link itself with libncurses, that's a bug in the readline package. 
OTOH, linking libncurses might be the *wrong* thing, since on some systems, libcurses might be needed even if libncurses is present (e.g. some Solaris installations). If some system requires a special build procedure, the administrator must build the module using Modules/Setup, so that setup.py will not attempt to build it. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=415629&group_id=5470 From noreply@sourceforge.net Thu Jun 7 06:53:16 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Wed, 06 Jun 2001 22:53:16 -0700 Subject: [Patches] [ python-Patches-416220 ] pstats.py interactive read function fix Message-ID: Patches item #416220, was updated on 2001-04-14 19:25 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=416220&group_id=5470 Category: library Group: None >Status: Closed >Resolution: Accepted Priority: 5 Submitted By: Matthew Mueller (donut) Assigned to: Eric S. Raymond (esr) Summary: pstats.py interactive read function fix Initial Comment: In pstats.py new interactive mode, read with no arguments dies because of a misplaced paren. Simple one liner fix. ---------------------------------------------------------------------- >Comment By: Martin v. Löwis (loewis) Date: 2001-06-06 22:53 Message: Logged In: YES user_id=21627 Committed as 1.18 of pstats.py. 
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=416220&group_id=5470 From noreply@sourceforge.net Thu Jun 7 07:07:56 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Wed, 06 Jun 2001 23:07:56 -0700 Subject: [Patches] [ python-Patches-430846 ] faster string-decoding in base64.py Message-ID: Patches item #430846, was updated on 2001-06-06 14:14 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=430846&group_id=5470 Category: library Group: None Status: Open Resolution: None Priority: 3 Submitted By: Peter Schneider-Kamp (nowonder) Assigned to: Nobody/Anonymous (nobody) Summary: faster string-decoding in base64.py Initial Comment: This addresses bug #419390 by anthonybaxter. Instead of wrapping a string-to-be-decoded into a StringIO class and using base64.decode use binascii.a2b_base64 directly. Speedup for big files is over 10 times (on Linux x86 anyway). If uncontroversial I'll check it in. ---------------------------------------------------------------------- >Comment By: Peter Schneider-Kamp (nowonder) Date: 2001-06-06 23:07 Message: Logged In: YES user_id=14463 Mhh, I did click that "Check to Upload & Attach File" thing. No matter what, here is the new version (including your speedup for encodestring). ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2001-06-06 14:25 Message: Logged In: YES user_id=31435 Umm -- there's no patch here. If there were, I bet I would have changed this to Accepted, though . 
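For readers without the attachment, the two decoding routes contrasted in the initial comment look roughly like this in present-day Python. This is a sketch, not the attached patch; the function names decode_via_file and decode_direct are made up for illustration.

```python
import base64
import binascii
import io


def decode_via_file(data: bytes) -> bytes:
    """Older route: wrap the data in file objects and stream it line by line."""
    source = io.BytesIO(data)
    target = io.BytesIO()
    base64.decode(source, target)
    return target.getvalue()


def decode_direct(data: bytes) -> bytes:
    """Direct route: hand the whole buffer to binascii in one call."""
    return binascii.a2b_base64(data)


payload = b"hello world" * 100
encoded = base64.encodebytes(payload)
print(decode_via_file(encoded) == decode_direct(encoded) == payload)
```

No timing claim is made here; the speedup quoted in the thread is the submitter's measurement.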
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=430846&group_id=5470 From noreply@sourceforge.net Thu Jun 7 07:08:47 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Wed, 06 Jun 2001 23:08:47 -0700 Subject: [Patches] [ python-Patches-430754 ] Makes ftpmirror.py .netrc aware Message-ID: Patches item #430754, was updated on 2001-06-06 10:21 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=430754&group_id=5470 Category: demos and tools Group: None Status: Open Resolution: None Priority: 5 Submitted By: Mike Romberg (romberg) >Assigned to: Martin v. Löwis (loewis) Summary: Makes ftpmirror.py .netrc aware Initial Comment: The following patch modifies the ftpmirror.py script found in Tools/scripts to use the netrc module. This allows the ftpmirror script to act more like a standard ftp client and take the login, password, and account from a user's $HOME/.netrc file if it exists. This patch is against the ftpmirror.py found in Python 2.1. ---------------------------------------------------------------------- >Comment By: Martin v. Löwis (loewis) Date: 2001-06-06 23:08 Message: Logged In: YES user_id=21627 I recommend a number of improvements to the patch: - When unpacking the tuple, it is more intuitive to put the variables in the order in which they are documented for the function, i.e. if auth: login, account, passwd = auth - If the user does not have a .netrc, IOError will be raised and should be expected - If a user is specified on the command line, it should probably take precedence over the .netrc setting - The debug message (Logging in as) should probably display the user which is used for login. Please indicate whether you can produce a revised patch.
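The review points above can be sketched in present-day Python as follows. This is not the revised patch; pick_credentials and its parameters are hypothetical names, and in current Python the IOError mentioned above is spelled OSError.

```python
import netrc


def pick_credentials(host, cli_login="", cli_passwd="", cli_account="",
                     netrc_file=None):
    """Return (login, passwd, account); the command line wins over .netrc."""
    if cli_login:
        # A user given on the command line takes precedence.
        return cli_login, cli_passwd, cli_account
    try:
        # netrc_file=None makes the netrc module look at ~/.netrc.
        auth = netrc.netrc(netrc_file).authenticators(host)
    except (OSError, netrc.NetrcParseError):
        # A missing or unreadable .netrc is expected, not an error.
        auth = None
    if auth:
        # Same order as documented for authenticators().
        login, account, passwd = auth
        return login, passwd or "", account or ""
    return "", "", ""


login, passwd, account = pick_credentials("ftp.example.org", cli_login="guest")
print("Logging in as %r" % login)
```

The host name ftp.example.org is a placeholder.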
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=430754&group_id=5470 From noreply@sourceforge.net Thu Jun 7 07:10:11 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Wed, 06 Jun 2001 23:10:11 -0700 Subject: [Patches] [ python-Patches-430846 ] faster string-decoding in base64.py Message-ID: Patches item #430846, was updated on 2001-06-06 14:14 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=430846&group_id=5470 Category: library Group: None Status: Open Resolution: None Priority: 3 Submitted By: Peter Schneider-Kamp (nowonder) >Assigned to: Tim Peters (tim_one) Summary: faster string-decoding in base64.py Initial Comment: This addresses bug #419390 by anthonybaxter. Instead of wrapping a string-to-be-decoded into a StringIO class and using base64.decode use binascii.a2b_base64 directly. Speedup for big files is over 10 times (on Linux x86 anyway). If uncontroversial I'll check it in. ---------------------------------------------------------------------- Comment By: Peter Schneider-Kamp (nowonder) Date: 2001-06-06 23:07 Message: Logged In: YES user_id=14463 Mhh, I did click that "Check to Upload & Attach File" thing. No matter what, here is the new version (including your speedup for encodestring). ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2001-06-06 14:25 Message: Logged In: YES user_id=31435 Umm -- there's no patch here. If there were, I bet I would have changed this to Accepted, though . 
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=430846&group_id=5470 From noreply@sourceforge.net Thu Jun 7 11:09:13 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 07 Jun 2001 03:09:13 -0700 Subject: [Patches] [ python-Patches-403100 ] Multicharacter replacements in PyUnicode_TranslateCharmap Message-ID: Patches item #403100, was updated on 2001-01-04 09:50 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=403100&group_id=5470 Category: core (C code) Group: None Status: Closed Resolution: None Priority: 5 Submitted By: Walter Dörwald (doerwalter) Assigned to: M.-A. Lemburg (lemburg) Summary: Multicharacter replacements in PyUnicode_TranslateCharmap Initial Comment: This patch modifies Objects/unicodeobject.c/PyUnicode_TranslateCharmap, so that the error PyErr_SetString(PyExc_NotImplementedError, "1-n mappings are currently not implemented"); no longer occurs. I.e. u"ab".translate({ord(u"a"): u"bbb", ord(u"b"): u"aaa"}) now works. It does this by exponentially reallocating the string when there is no more available space. ---------------------------------------------------------------------- >Comment By: Walter Dörwald (doerwalter) Date: 2001-06-07 03:09 Message: Logged In: YES user_id=89016 The patch that was checked in changes PyUnicode_DecodeCharmap and PyUnicode_EncodeCharmap, but not PyUnicode_TranslateCharmap, where this functionality is also useful (e.g. for u"".translate({ord("<"): u"<", ord(">"): u">"})). ---------------------------------------------------------------------- Comment By: M.-A. Lemburg (lemburg) Date: 2001-01-06 07:03 Message: Checked in a different patch providing the same functionality. Please see the CVS checkin message for details.
---------------------------------------------------------------------- Comment By: Nobody/Anonymous (nobody) Date: 2001-01-05 10:45 Message: I'll check in a patch for this tomorrow which implements what I had in mind. The patch doesn't change the performance of the charmap codec. Thanks, -- Marc-Andre ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2001-01-05 09:07 Message: The problem is that you can't know beforehand how long the result string will be, i.e. whether any 1-n replacements will really happen. It would be possible to do a loop through the replacement strings and see if there are any that are longer than one character, but even if there are, you don't know if they will really be used. So you have three choices: (1) you guess how much space you need and reallocate when the space is not enough, (2) you do a dry run of the algorithm once and count how much space you need and do the algorithm a second time and this time use the strings, or (3) you keep the strings in a list and join the list into one string in the end. For the case of 1-1 mapping the following will happen: (1) The first allocation has exactly the right amount of space, there won't be any reallocations, but a size check for every character will be done (which should be only a few assembler instructions). The mapping will have to be accessed for every character in the source string once. (2) There will only be one allocation, but for every character in the source string, the mapping has to be accessed twice, which means calls to Python functions, exception handling etc. (3) You have to make as many memory allocations as there are parts of the final string that you create, including error handling etc. I think (1) is clearly the fastest method.
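To make the three choices above concrete, here is a small Python sketch of choice (3), collecting the pieces in a list and joining them at the end. It is only an illustration of the semantics under discussion, not the C code in unicodeobject.c; translate_1_to_n is a hypothetical name, and the comparison relies on str.translate in present-day Python accepting 1-n mappings.

```python
def translate_1_to_n(text, mapping):
    """Translate text with a mapping from code points to replacement strings."""
    parts = []
    for ch in text:
        # Characters without an entry are copied unchanged.
        replacement = mapping.get(ord(ch), ch)
        if replacement is None:
            # As in str.translate, None deletes the character.
            continue
        if isinstance(replacement, int):
            replacement = chr(replacement)
        # The replacement may be longer than one character (the 1-n case).
        parts.append(replacement)
    return "".join(parts)


table = {ord("a"): "bbb", ord("b"): "aaa"}
print(translate_1_to_n("ab", table))  # bbbaaa
print("ab".translate(table))          # bbbaaa
```

Choice (1) cannot be shown faithfully in Python, since buffer growth is hidden from the programmer; the list-and-join form is simply the easiest one to read.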
---------------------------------------------------------------------- Comment By: Nobody/Anonymous (nobody) Date: 2001-01-04 10:33 Message: I like the idea, but the implementation needs some reworking: the common case is 1-1 mapping so this should be as fast as possible; extra size checks slow things down too much. You can take a different approach, though: leave things as they are and only add a special case for the 1-n which does resizing depending on how many extra chars are inserted. Then, as a final step, if resizing occurred, call _PyUnicode_Resize() to cut down the allocated buffer to its true size. -- Marc-Andre ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=403100&group_id=5470 From noreply@sourceforge.net Thu Jun 7 11:20:18 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 07 Jun 2001 03:20:18 -0700 Subject: [Patches] [ python-Patches-430986 ] Buglet in PyUnicode_FromUnicode Message-ID: Patches item #430986, was updated on 2001-06-07 03:20 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=430986&group_id=5470 Category: core (C code) Group: None Status: Open Resolution: None Priority: 5 Submitted By: Walter Dörwald (doerwalter) Assigned to: Nobody/Anonymous (nobody) Summary: Buglet in PyUnicode_FromUnicode Initial Comment: PyUnicode_FromUnicode contains the following code, which is clearly wrong: unicode = _PyUnicode_New(1); unicode->str[0] = *u; if (!unicode) return NULL; ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=430986&group_id=5470 From noreply@sourceforge.net Thu Jun 7 11:52:39 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 07 Jun 2001 03:52:39 -0700 Subject: [Patches] [ python-Patches-412229 ] runtime RTLD_NOW control via sys Message-ID:
Patches item #412229, was updated on 2001-03-29 08:55 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=412229&group_id=5470 Category: core (C code) Group: None Status: Open Resolution: None Priority: 5 Submitted By: Bram Stolk (bram) >Assigned to: Martin v. Löwis (loewis) Summary: runtime RTLD_NOW control via sys Initial Comment: This patch enables runtime control over the RTLD_NOW flag, which can be used to do lazy symbol resolving when loading a shared lib. It's an extension to the sys module: sys.setlazysymresolve(0|1) The patch is against the latest CVS code, and was generated by 'cvs diff'. ---------------------------------------------------------------------- >Comment By: Bram Stolk (bram) Date: 2001-06-07 03:52 Message: Logged In: YES user_id=14028 Ok, I've revised the patch as you suggested. Currently, you can get and set the flags just as you specified. It should also build on platforms without RTLD_NOW, and even on platforms without LDOPEN altogether. However, I see one problem with this: After Python 1.5.2, the dl module seems to be removed from the default installation. This means that dl.RTLD_NOW and dl.RTLD_LAZY are not readily available on a standard Python install. This is awkward. The patch was generated with the command: cvs diff -c against the cvs tree of Thu Jun 7 12:44:18 MDT 2001 Bram ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2001-06-05 23:39 Message: Logged In: YES user_id=21627 The patch needs further work: The code currently compiles on systems which don't define RTLD_NOW (although I'm not sure what these systems are); your code doesn't. Also, the code allows setting the flags, but has no interface to query them. Finally, users often complain that Python should use RTLD_GLOBAL, so that they can share symbols across extension modules.
Therefore, I propose that you allow setting arbitrary dlopen flags; users would have to write sys.setdlopenflags(0) to turn off RTLD_NOW, and use sys.setdlopenflags(dl.RTLD_NOW|dl.RTLD_GLOBAL) to add RTLD_GLOBAL. When you revise this patch, please submit unified (-u) or context (-c) diffs. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2001-04-10 14:48 Message: Logged In: YES user_id=6380 Sorry, no new features in 2.1. I'll look at this after 2.1 is released though. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=412229&group_id=5470 From noreply@sourceforge.net Thu Jun 7 13:26:45 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 07 Jun 2001 05:26:45 -0700 Subject: [Patches] [ python-Patches-430986 ] Buglet in PyUnicode_FromUnicode Message-ID: Patches item #430986, was updated on 2001-06-07 03:20 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=430986&group_id=5470 Category: core (C code) Group: None >Status: Closed Resolution: None Priority: 5 Submitted By: Walter Dörwald (doerwalter) >Assigned to: M.-A. Lemburg (lemburg) Summary: Buglet in PyUnicode_FromUnicode Initial Comment: PyUnicode_FromUnicode contains the following code, which is clearly wrong: unicode = _PyUnicode_New(1); unicode->str[0] = *u; if (!unicode) return NULL; ---------------------------------------------------------------------- >Comment By: M.-A. Lemburg (lemburg) Date: 2001-06-07 05:26 Message: Logged In: YES user_id=38388 Thanks. I checked in a fix in CVS. 
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=430986&group_id=5470 From noreply@sourceforge.net Thu Jun 7 13:30:38 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 07 Jun 2001 05:30:38 -0700 Subject: [Patches] [ python-Patches-403100 ] Multicharacter replacements in PyUnicode_TranslateCharmap Message-ID: Patches item #403100, was updated on 2001-01-04 09:50 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=403100&group_id=5470 Category: core (C code) Group: None >Status: Open Resolution: None Priority: 5 Submitted By: Walter Dörwald (doerwalter) Assigned to: M.-A. Lemburg (lemburg) Summary: Multicharacter replacements in PyUnicode_TranslateCharmap Initial Comment: This patch modifies Objects/unicodeobject.c/PyUnicode_TranslateCharmap, so that the error PyErr_SetString(PyExc_NotImplementedError, "1-n mappings are currently not implemented"); no longer occurs. I.e. u"ab".translate({ord(u"a"): u"bbb", ord(u"b"): u"aaa"}) now works. It does this by exponentially reallocating the string when there is no more available space. ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2001-06-07 03:09 Message: Logged In: YES user_id=89016 The patch that was checked in changes PyUnicode_DecodeCharmap and PyUnicode_EncodeCharmap, but not PyUnicode_TranslateCharmap, where this functionality is also useful (e.g. for u"".translate({ord("<"): u"<", ord(">"): u">"})). ---------------------------------------------------------------------- Comment By: M.-A. Lemburg (lemburg) Date: 2001-01-06 07:03 Message: Checked in a different patch providing the same functionality. Please see the CVS checkin message for details.
---------------------------------------------------------------------- Comment By: Nobody/Anonymous (nobody) Date: 2001-01-05 10:45 Message: I'll check in a patch for this tomorrow which implements what I had in mind. The patch doesn't change the performance of the charmap codec. Thanks, -- Marc-Andre ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2001-01-05 09:07 Message: The problem is that you can't know beforehand how long the result string will be, i.e. whether any 1-n replacements will really happen. It would be possible to do a loop through the replacement strings and see if there are any that are longer than one character, but even if there are, you don't know if they will really be used. So you have three choices: (1) you guess how much space you need and reallocate when the space is not enough, (2) you do a dry run of the algorithm once and count how much space you need and do the algorithm a second time and this time use the strings, or (3) you keep the strings in a list and join the list into one string in the end. For the case of 1-1 mapping the following will happen: (1) The first allocation has exactly the right amount of space, there won't be any reallocations, but a size check for every character will be done (which should be only a few assembler instructions). The mapping will have to be accessed for every character in the source string once. (2) There will only be one allocation, but for every character in the source string, the mapping has to be accessed twice, which means calls to Python functions, exception handling etc. (3) You have to make as many memory allocations as there are parts of the final string that you create, including error handling etc. I think (1) is clearly the fastest method.
---------------------------------------------------------------------- Comment By: Nobody/Anonymous (nobody) Date: 2001-01-04 10:33 Message: I like the idea, but the implementation needs some reworking: the common case is 1-1 mapping so this should be as fast as possible; extra size checks slow things down too much. You can take a different approach, though: leave things as they are and only add a special case for the 1-n which does resizing depending on how many extra chars are inserted. Then, as a final step, if resizing occurred, call _PyUnicode_Resize() to cut down the allocated buffer to its true size. -- Marc-Andre ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=403100&group_id=5470 From noreply@sourceforge.net Thu Jun 7 13:32:10 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 07 Jun 2001 05:32:10 -0700 Subject: [Patches] [ python-Patches-403100 ] Multicharacter replacements in PyUnicode_TranslateCharmap Message-ID: Patches item #403100, was updated on 2001-01-04 09:50 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=403100&group_id=5470 Category: core (C code) Group: None Status: Open Resolution: None Priority: 5 Submitted By: Walter Dörwald (doerwalter) Assigned to: M.-A. Lemburg (lemburg) Summary: Multicharacter replacements in PyUnicode_TranslateCharmap Initial Comment: This patch modifies Objects/unicodeobject.c/PyUnicode_TranslateCharmap, so that the error PyErr_SetString(PyExc_NotImplementedError, "1-n mappings are currently not implemented"); no longer occurs. I.e. u"ab".translate({ord(u"a"): u"bbb", ord(u"b"): u"aaa"}) now works. It does this by exponentially reallocating the string when there is no more available space. ---------------------------------------------------------------------- >Comment By: M.-A.
Lemburg (lemburg) Date: 2001-06-07 05:32 Message: Logged In: YES user_id=38388 Reopened. This should really be marked as feature request but for some reason SF won't let me change the Data Type. ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2001-06-07 03:09 Message: Logged In: YES user_id=89016 The patch that was checked in changes PyUnicode_DecodeCharmap and PyUnicode_EncodeCharmap, but not PyUnicode_TranslateCharmap, where this functionality is also useful (e.g. for u"".translate({ord("<"): u"<", ord(">"): u">"})). ---------------------------------------------------------------------- Comment By: M.-A. Lemburg (lemburg) Date: 2001-01-06 07:03 Message: Checked in a different patch providing the same functionality. Please see the CVS checkin message for details. ---------------------------------------------------------------------- Comment By: Nobody/Anonymous (nobody) Date: 2001-01-05 10:45 Message: I'll check in a patch for this tomorrow which implements what I had in mind. The patch doesn't change the performance of the charmap codec. Thanks, -- Marc-Andre ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2001-01-05 09:07 Message: The problem is that you can't know beforehand how long the result string will be, i.e. whether any 1-n replacements will really happen. It would be possible to do a loop through the replacement strings and see if there are any that are longer than one character, but even if there are, you don't know if they will really be used. So you have three choices: (1) you guess how much space you need and reallocate when the space is not enough, (2) you do a dry run of the algorithm once and count how much space you need and do the algorithm a second time and this time use the strings, or (3) you keep the strings in a list and join the list into one string in the end.
For the case of 1-1 mapping the following will happen: (1) The first allocation has exactly the right amount of space, there won't be any reallocations, but a size check for every character will be done (which should be only a few assembler instructions). The mapping will have to be accessed for every character in the source string once. (2) There will only be one allocation, but for every character in the source string, the mapping has to be accessed twice, which involves calls to Python functions, exception handling etc. (3) You have to make as many memory allocations as there are parts of the final string that you create, including error handling etc. I think (1) is clearly the fastest method. ---------------------------------------------------------------------- Comment By: Nobody/Anonymous (nobody) Date: 2001-01-04 10:33 Message: I like the idea, but the implementation needs some reworking: the common case is 1-1 mapping so this should be as fast as possible; extra size checks slow things down too much. You can take a different approach, though: leave things as they are and only add a special case for the 1-n which does resizing depending on how many extra chars are inserted. Then, as a final step, if resizing occurred, call _PyUnicode_Resize() to cut down the allocated buffer to its true size.
-- Marc-Andre ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=403100&group_id=5470 From noreply@sourceforge.net Thu Jun 7 15:34:44 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 07 Jun 2001 07:34:44 -0700 Subject: [Patches] [ python-Patches-412229 ] runtime RTLD_NOW control via sys Message-ID: Patches item #412229, was updated on 2001-03-29 08:55 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=412229&group_id=5470 Category: core (C code) Group: None Status: Open Resolution: None Priority: 5 Submitted By: Bram Stolk (bram) Assigned to: Martin v. Löwis (loewis) Summary: runtime RTLD_NOW control via sys Initial Comment: This patch enables runtime control over the RTLD_NOW flag, which can be used to do lazy symbol resolving when loading a shared lib. It's an extension to the sys module: sys.setlazysymresolve(0|1) The patch is against the latest CVS code, and was generated by 'cvs diff'. ---------------------------------------------------------------------- >Comment By: Martin v. Löwis (loewis) Date: 2001-06-07 07:34 Message: Logged In: YES user_id=21627 The patch looks good to me now, so I recommend accepting it - except for the part that activates dlmodule by default. As for getting at the RTLD flags, I see four options: 1. setup.py could be changed to build dl wherever possible. 2. Administrators should activate dlmodule if they trust it. 3. Application authors somehow need to find out the values of RTLD_ on their system, e.g. by per-system hard-coded values, or by running h2py on dlfcn.h; that could be part of the Python distribution for systems known to support dlfcn.h. 4. The RTLD_ flags are exported from some other module as well; imp comes to mind. Actually, putting setdlopenflags into imp instead of sys might be worth a consideration.
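For readers following the setdlopenflags proposal, here is a minimal sketch of how the get/set pair discussed in this item might be used. It assumes a later Python in which sys.getdlopenflags(), sys.setdlopenflags() and the os.RTLD_* constants are available (the thread itself talks about taking the constants from the dl module); the helper name add_rtld_global is hypothetical and not part of the patch.

```python
import os
import sys


def add_rtld_global():
    """Turn on RTLD_GLOBAL for later extension-module imports.

    Returns (old_flags, new_flags), or None on platforms that do not
    expose dlopen flags at all (for example Windows).
    """
    if not hasattr(sys, "setdlopenflags") or not hasattr(os, "RTLD_GLOBAL"):
        return None
    old_flags = sys.getdlopenflags()
    new_flags = old_flags | os.RTLD_GLOBAL  # keep whatever was set, add GLOBAL
    sys.setdlopenflags(new_flags)
    return old_flags, new_flags
```

A caller would typically keep the returned old flags and pass them back to sys.setdlopenflags() once the extension modules that need shared symbols have been imported.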
---------------------------------------------------------------------- Comment By: Bram Stolk (bram) Date: 2001-06-07 03:52 Message: Logged In: YES user_id=14028 Ok, I've revised the patch as you suggested. Currently, you can get and set the flags just as you specified. Also, it should build on platforms without RTLD_NOW, and even on platforms without LDOPEN altogether. However, I see one problem with this: After Python 1.5.2, the dl module seems to be removed from the default installation. This means that dl.RTLD_NOW and dl.RTLD_LAZY are not readily available on a standard Python install. This is awkward. The patch was generated with the command: cvs diff -c against the cvs tree of Thu Jun 7 12:44:18 MDT 2001 Bram ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2001-06-05 23:39 Message: Logged In: YES user_id=21627 The patch needs further work: The code currently compiles on systems which don't define RTLD_NOW (although I'm not sure what these systems are); your code doesn't. Also, the code allows setting the flags, but has no interface to query them. Finally, users often complain that Python should use RTLD_GLOBAL, so that they can share symbols across extension modules. Therefore, I propose that you allow setting arbitrary dlopen flags; users would have to write sys.setdlopenflags(0) to turn off RTLD_NOW, and use sys.setdlopenflags(dl.RTLD_NOW|dl.RTLD_GLOBAL) to add RTLD_GLOBAL. When you revise this patch, please submit unified (-u) or context (-c) diffs. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2001-04-10 14:48 Message: Logged In: YES user_id=6380 Sorry, no new features in 2.1. I'll look at this after 2.1 is released though.
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=412229&group_id=5470 From noreply@sourceforge.net Thu Jun 7 16:10:39 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 07 Jun 2001 08:10:39 -0700 Subject: [Patches] [ python-Patches-413171 ] fix UserDict.get, setdefault, update Message-ID: Patches item #413171, was updated on 2001-04-02 10:18 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=413171&group_id=5470 Category: library Group: None Status: Open >Resolution: Accepted Priority: 4 Submitted By: Ka-Ping Yee (ping) Assigned to: Ka-Ping Yee (ping) Summary: fix UserDict.get, setdefault, update Initial Comment: The methods 'get', 'setdefault', and 'update' on a dictionary are usually implemented (and thought of) in terms of the lower-level methods has_key, __getitem__, and __setitem__. The current implementation of UserDict relays a call to e.g. x.get() to x.data.get(), which behaves inconsistently if __getitem__ has been implemented on x. One particular big place where this turns up is cgi. If you get a dict = cgi.SvFormContentDict(), then dict.get('key') will return a *list* even though dict['key'] returns a single item! To make UserDict behave consistently, this patch fixes get(), update(), and setdefault() to re-use the other methods. Then the only occurrence of self.data[k] = v is in __setitem__, the only occurrence of self.data[k] without assignment is in __getitem__, etc. ---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2001-06-07 08:10 Message: Logged In: YES user_id=6380 Approved. Check it in already! ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2001-06-06 22:25 Message: Logged In: YES user_id=21627 I recommend to approve this patch. 
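The idea behind the UserDict item above (get, setdefault and update expressed through the lower-level item methods) can be sketched roughly as follows. This is an illustration only, not the patch itself: the class names ConsistentDict and FirstValueDict are made up for the example, and __contains__ stands in for the has_key method mentioned in the report.

```python
class ConsistentDict:
    """Dict wrapper whose convenience methods go through the item methods."""

    def __init__(self, data=None):
        self.data = {}
        if data is not None:
            self.update(data)

    def __getitem__(self, key):
        return self.data[key]

    def __setitem__(self, key, value):
        self.data[key] = value

    def __contains__(self, key):
        return key in self.data

    def get(self, key, default=None):
        # Uses self[key], so a subclass's __getitem__ is honoured.
        if key in self:
            return self[key]
        return default

    def setdefault(self, key, default=None):
        if key not in self:
            self[key] = default
        return self[key]

    def update(self, other):
        for key in other.keys():
            self[key] = other[key]


class FirstValueDict(ConsistentDict):
    """Stores lists but indexes to the first element, loosely modelled on
    the cgi.SvFormContentDict case described in the report."""

    def __getitem__(self, key):
        return self.data[key][0]
```

With this layout, form.get('key') and form['key'] agree for a FirstValueDict, which is the inconsistency the report complains about.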
---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2001-04-10 14:17 Message: Logged In: YES user_id=6380 Let's not fix this in 2.1. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=413171&group_id=5470 From noreply@sourceforge.net Thu Jun 7 17:37:40 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 07 Jun 2001 09:37:40 -0700 Subject: [Patches] [ python-Patches-430754 ] Makes ftpmirror.py .netrc aware Message-ID: Patches item #430754, was updated on 2001-06-06 10:21 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=430754&group_id=5470 Category: demos and tools Group: None Status: Open Resolution: None Priority: 5 Submitted By: Mike Romberg (romberg) Assigned to: Martin v. Löwis (loewis) Summary: Makes ftpmirror.py .netrc aware Initial Comment: The following patch modifies the ftpmirror.py script found in Tools/scripts to use the netrc module. This allows the ftpmirror script to act more like a standard ftp client and take the login, password, and account from a users $HOME/.netrc file if it exists. This patch is against the ftpmirror.py found in python 2.1 ---------------------------------------------------------------------- >Comment By: Mike Romberg (romberg) Date: 2001-06-07 09:37 Message: Logged In: YES user_id=61373 Good ideas. Here is a revised patch. ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2001-06-06 23:08 Message: Logged In: YES user_id=21627 I recommend a number of improvements to the patch: - When unpacking the tuple, it is more intuitive to put the variables in the order in which they are documented for the function, ie. 
if auth: login, account, passwd = auth - If the user does not have a .netrc, IOError will be raised and should be expected - If a user is specified in the command line, it should probably take precedence over the .netrc setting - The debug message (Loggin in as) should probably display the user which is used for login. Please indicate whether you can produce a revised patch. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=430754&group_id=5470 From noreply@sourceforge.net Thu Jun 7 18:17:41 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 07 Jun 2001 10:17:41 -0700 Subject: [Patches] [ python-Patches-430754 ] Makes ftpmirror.py .netrc aware Message-ID: Patches item #430754, was updated on 2001-06-06 10:21 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=430754&group_id=5470 Category: demos and tools Group: None >Status: Closed >Resolution: Accepted Priority: 5 Submitted By: Mike Romberg (romberg) Assigned to: Martin v. Löwis (loewis) Summary: Makes ftpmirror.py .netrc aware Initial Comment: The following patch modifies the ftpmirror.py script found in Tools/scripts to use the netrc module. This allows the ftpmirror script to act more like a standard ftp client and take the login, password, and account from a users $HOME/.netrc file if it exists. This patch is against the ftpmirror.py found in python 2.1 ---------------------------------------------------------------------- >Comment By: Martin v. Löwis (loewis) Date: 2001-06-07 10:17 Message: Logged In: YES user_id=21627 Committed as ftpmirror.py 1.14. ---------------------------------------------------------------------- Comment By: Mike Romberg (romberg) Date: 2001-06-07 09:37 Message: Logged In: YES user_id=61373 Good ideas. Here is a revised patch. ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2001-06-06 23:08 Message: Logged In: YES user_id=21627 I recommend a number of improvements to the patch: - When unpacking the tuple, it is more intuitive to put the variables in the order in which they are documented for the function, ie.
if auth: login, account, passwd = auth - If the user does not have a .netrc, IOError will be raised and should be expected - If a user is specified in the command line, it should probably take precedence over the .netrc setting - The debug message (Loggin in as) should probably display the user which is used for login. Please indicate whether you can produce a revised patch. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=430754&group_id=5470 From noreply@sourceforge.net Thu Jun 7 19:42:44 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 07 Jun 2001 11:42:44 -0700 Subject: [Patches] [ python-Patches-430846 ] faster string-decoding in base64.py Message-ID: Patches item #430846, was updated on 2001-06-06 14:14 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=430846&group_id=5470 Category: library Group: None Status: Open >Resolution: Accepted Priority: 3 Submitted By: Peter Schneider-Kamp (nowonder) >Assigned to: Peter Schneider-Kamp (nowonder) Summary: faster string-decoding in base64.py Initial Comment: This addresses bug #419390 by anthonybaxter. Instead of wrapping a string-to-be-decoded into a StringIO class and using base64.decode use binascii.a2b_base64 directly. Speedup for big files is over 10 times (on Linux x86 anyway). If uncontroversial I'll check it in. ---------------------------------------------------------------------- >Comment By: Tim Peters (tim_one) Date: 2001-06-07 11:42 Message: Logged In: YES user_id=31435 Accepted and assigned back to Peter for checkin. Don't see how this could be controversial -- it's simple and appropriate. ---------------------------------------------------------------------- Comment By: Peter Schneider-Kamp (nowonder) Date: 2001-06-06 23:07 Message: Logged In: YES user_id=14463 Mhh, I did click that "Check to Upload & Attach File" thing. 
No matter what, here is the new version (including your speedup for encodestring). ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2001-06-06 14:25 Message: Logged In: YES user_id=31435 Umm -- there's no patch here. If there were, I bet I would have changed this to Accepted, though . ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=430846&group_id=5470 From noreply@sourceforge.net Thu Jun 7 20:39:31 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 07 Jun 2001 12:39:31 -0700 Subject: [Patches] [ python-Patches-430948 ] Performance improvement for profiler Message-ID: Patches item #430948, was updated on 2001-06-06 22:29 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=430948&group_id=5470 Category: library Group: None Status: Open Resolution: None Priority: 5 Submitted By: Fred L. Drake, Jr. (fdrake) >Assigned to: Fred L. Drake, Jr. (fdrake) Summary: Performance improvement for profiler Initial Comment: This patch adds a bit of complexity to Profile.__init__() in an effort to reduce the overhead of the profiler. The essential piece of the puzzle is that the general Profile.get_time() method is replaced with a function which does only as much as is needed for the underlying timer. For example, if time.clock() is available, it can become a PyCFunction instead of a bound method, requires only 1 dict lookup to execute instead of the 11 it takes to execute get_time() without this patch. Also removes a couple of duplicate imports from the "if __name__ == ..." section. ---------------------------------------------------------------------- >Comment By: Tim Peters (tim_one) Date: 2001-06-07 12:39 Message: Logged In: YES user_id=31435 Fine by me (and good idea!). 
I'd rather see get_time_mac be a module-level function _get_time_mac, get_time_timer a module-level _get_time_timer (or, better, _get_time_list), and get_time_times a module-level function _get_time_times; and in the last case without the needless expense of reduce(): .def _get_time_times(times=os.times): . t = times() . return t[0] + t[1] ---------------------------------------------------------------------- Comment By: Fred L. Drake, Jr. (fdrake) Date: 2001-06-06 22:30 Message: Logged In: YES user_id=3066 I should note that this works with both 2.1.1 and 2.2, though this is not a bugfix. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=430948&group_id=5470 From noreply@sourceforge.net Thu Jun 7 20:40:10 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 07 Jun 2001 12:40:10 -0700 Subject: [Patches] [ python-Patches-429957 ] Add some more EBCDIC encodings Message-ID: Patches item #429957, was updated on 2001-06-03 20:53 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=429957&group_id=5470 Category: library Group: None >Status: Closed >Resolution: Accepted Priority: 5 Submitted By: Brian Quinlan (bquinlan) Assigned to: Nobody/Anonymous (nobody) Summary: Add some more EBCDIC encodings Initial Comment: Add support for cp1140, which is identical to cp037, with the addition of the euro character. Also added a few EBCDIC aliases. ---------------------------------------------------------------------- >Comment By: Martin v. Löwis (loewis) Date: 2001-06-07 12:40 Message: Logged In: YES user_id=21627 Committed as cp1140.py 1.1 and aliases.py 1.8.
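As a small illustration of the cp1140 item above, the claim that cp1140 matches cp037 apart from the euro character can be checked by comparing the two codecs byte by byte. This is only a sketch, assuming a Python whose standard library ships both the cp037 and cp1140 codecs; the helper name differing_bytes is made up for the example.

```python
def differing_bytes(encoding_a, encoding_b):
    """Return the byte values that decode differently in the two codecs."""
    different = []
    for value in range(256):
        raw = bytes([value])
        # errors="replace" keeps the loop going if a codec leaves a byte unmapped.
        if raw.decode(encoding_a, "replace") != raw.decode(encoding_b, "replace"):
            different.append(value)
    return different


ebcdic_differences = differing_bytes("cp037", "cp1140")
```

If the codecs behave as the item describes, ebcdic_differences should hold just the one position at which cp1140 carries the euro sign.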
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=429957&group_id=5470 From noreply@sourceforge.net Thu Jun 7 20:47:00 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 07 Jun 2001 12:47:00 -0700 Subject: [Patches] [ python-Patches-430181 ] Make httplib work with picky servers Message-ID: Patches item #430181, was updated on 2001-06-04 19:40 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=430181&group_id=5470 Category: library Group: 2.0.1 bugfix Status: Open Resolution: None Priority: 5 Submitted By: Leonard Samuelson (lenski) Assigned to: Nobody/Anonymous (nobody) Summary: Make httplib work with picky servers Initial Comment: Python2.0: httplib.py: httplib: HTTPConnection Header processing: (putheader, putrequest, and endheaders) methods transmit each HTTP header line using a separate socket send invocation. Before this change, my Linksys Etherfast Cable/DSL router (Linksys BEFSR41, firmware v 1.22, March 31 2000) rejected the request because the entire HTTP header block was not contained in a single TCP packet. Clearly, the router is engaging in a noncompliant optimization! This patch is not required to allow httplib to work with real servers, making it completely optional. The patch I am submitting with this note causes httplib to work with the router. It is intended mostly as a model; a developer with greater familiarity with the library might have a better approach. ---------------------------------------------------------------------- >Comment By: Martin v. Löwis (loewis) Date: 2001-06-07 12:47 Message: Logged In: YES user_id=21627 I recommend to reject this patch.
Not only is the router broken, but it appears that the operating system is broken also; I think it legally could, and probably should, combine small write requests to a TCP socket that occur shortly after each other into a single IP packet. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=430181&group_id=5470 From noreply@sourceforge.net Thu Jun 7 20:53:57 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 07 Jun 2001 12:53:57 -0700 Subject: [Patches] [ python-Patches-429542 ] Bugfix for libsmtp example Message-ID: Patches item #429542, was updated on 2001-06-02 02:27 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=429542&group_id=5470 Category: documentation Group: None >Status: Closed >Resolution: Works For Me Priority: 5 Submitted By: Sean Reifschneider (jafo) Assigned to: Nobody/Anonymous (nobody) Summary: Bugfix for libsmtp example Initial Comment: libsmtp includes an example which does: while 1: line = raw_input() if not line: break which fails raising an EOFError exception. This patch changes the code to: while 1: try: line = raw_input() except EOFError: break Sean ---------------------------------------------------------------------- >Comment By: Martin v. Löwis (loewis) Date: 2001-06-07 12:53 Message: Logged In: YES user_id=21627 This is already fixed in Doc/lib/libsmtplib.tex revisions 1.17 and 1.16.6.1, as a response to bug report #424776. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=429542&group_id=5470 From noreply@sourceforge.net Thu Jun 7 20:58:34 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 07 Jun 2001 12:58:34 -0700 Subject: [Patches] [ python-Patches-429614 ] pythonpath and optimize def. 
before init Message-ID: Patches item #429614, was updated on 2001-06-02 08:56 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=429614&group_id=5470 Category: core (C code) Group: None Status: Open Resolution: None Priority: 5 Submitted By: Frederic Giacometti (giacometti) Assigned to: Nobody/Anonymous (nobody) Summary: pythonpath and optimize def. before init Initial Comment: A) Addition of four functions ===================== Py_{Set, Get}{PythonPath, OptimizeLevel}() with the same semantics as Py_{Set, Get}ProgramName() (Note: the C ANSI type 'char const*' is used to describe non-modifiable strings) These four functions are needed in the next JPE runtime (Python 2.1 patch included in the distribution); this allows setting the PYTHONPATH and optimize level from Java property values. B) Option '-P pythonpath' on the Python command line: ======================================== This option defines 'pythonpath' from the command line (and overrides the PYTHONPATH environment variable if necessary). Usefulness: Sometimes, one does not want to rely on the environment variables, or modify them. Sample application: Running build and test scripts in full control of the environment, and with different PYTHONPATH values. This option is needed by the build and test scripts of the next JPE source distribution (Python 2.1 patch included in the distribution). Frederic Giacometti fred@arakne.com ---------------------------------------------------------------------- >Comment By: Martin v. Löwis (loewis) Date: 2001-06-07 12:58 Message: Logged In: YES user_id=21627 I think a PEP describing the exact rationale and nature of the change is required here. For example, why is it good that -P overrides PYTHONPATH, instead of combining both somehow? Also, the documentation talks about Py_GetOptimizeLevel, whereas the header declares Py_GetOptimizeFlag.
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=429614&group_id=5470 From noreply@sourceforge.net Thu Jun 7 21:00:54 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 07 Jun 2001 13:00:54 -0700 Subject: [Patches] [ python-Patches-429442 ] Cygwin sys.platform/get_platform() patch Message-ID: Patches item #429442, was updated on 2001-06-01 13:07 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=429442&group_id=5470 Category: distutils Group: None Status: Open Resolution: None Priority: 5 Submitted By: Jason Tishler (jlt63) >Assigned to: Greg Ward (gward) Summary: Cygwin sys.platform/get_platform() patch Initial Comment: This patch corrects sys.platform and distutils.util.get_platform() problems caused by the cruft contained in Cygwin's uname -s. Please see the following for the gory details: http://www.cygwin.com/ml/cygwin-apps/2001-05/msg00106.html Note that the above also solicited input from the community in an attempt to prevent any potential heartache. Since no one responded it would appear that either the changes are acceptable or that no one really cares... :,) ---------------------------------------------------------------------- >Comment By: Tim Peters (tim_one) Date: 2001-06-07 13:00 Message: Logged In: YES user_id=31435 Assigned to GregW. Greg, note that since Cygwin is really a Unix derivative, your primary concern is probably just that this doesn't break other Unixoid systems. 
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=429442&group_id=5470 From noreply@sourceforge.net Thu Jun 7 21:05:50 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 07 Jun 2001 13:05:50 -0700 Subject: [Patches] [ python-Patches-429171 ] sgmllib - leading spaces in declaration Message-ID: Patches item #429171, was updated on 2001-05-31 15:26 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=429171&group_id=5470 Category: library Group: None Status: Open Resolution: None Priority: 5 Submitted By: Doug Fort (dougfort) Assigned to: Nobody/Anonymous (nobody) Summary: sgmllib - leading spaces in declaration Initial Comment: Some sites sloppily leave a space in their doctype declaration: i.e. . The Python 2.1 sgml parser raises an exception for this. This patch modifies sgmllib.py to allow leading whitespace in the declaration. It also adds a little information to the exception message. ---------------------------------------------------------------------- >Comment By: Martin v. Löwis (loewis) Date: 2001-06-07 13:05 Message: Logged In: YES user_id=21627 I don't have an SGML spec, so I can only check the XML spec. In XML, such a DOCTYPE declaration is ill-formed; I expect the same to be true for SGML. Therefore, I recommend to reject this patch. If you have a need to process such ill-formed documents, I recommend to derive from SGMLParser and replace parse_declaration appropriately. E.g. you could advance i until after the space, then call the base method. 
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=429171&group_id=5470 From noreply@sourceforge.net Thu Jun 7 21:07:00 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 07 Jun 2001 13:07:00 -0700 Subject: [Patches] [ python-Patches-427749 ] Patch for bug #419390 (base64.py) Message-ID: Patches item #427749, was updated on 2001-05-27 11:35 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=427749&group_id=5470 Category: library Group: None Status: Open Resolution: None Priority: 5 Submitted By: Kalle Svensson (krftkndl) >Assigned to: Peter Schneider-Kamp (nowonder) Summary: Patch for bug #419390 (base64.py) Initial Comment: Improves performance of base64.encodestring and base64.decodestring by avoiding StringIO and using binascii directly. ---------------------------------------------------------------------- >Comment By: Tim Peters (tim_one) Date: 2001-06-07 13:07 Message: Logged In: YES user_id=31435 Assigned to Peter since it appears to compete with his patch. Peter, I expect your patch is quicker. If you agree and check in your patch, close this as Duplicate (or something). 
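The approach both base64 patches describe (skip the StringIO wrapper and call binascii directly) can be sketched as below. This is not either submitted patch, only an illustration under stated assumptions: it is written for bytes input on a current Python, and the 57-byte chunk size is taken to match the 76-character line length that base64 output conventionally uses.

```python
import binascii

# 57 input bytes encode to one 76-character line of base64 output.
MAXBINSIZE = 57


def encodestring(data):
    """Encode bytes to newline-terminated base64 lines via binascii."""
    pieces = []
    for start in range(0, len(data), MAXBINSIZE):
        chunk = data[start:start + MAXBINSIZE]
        pieces.append(binascii.b2a_base64(chunk))  # appends "\n" per chunk
    return b"".join(pieces)


def decodestring(data):
    """Decode base64 data (possibly spanning several lines) in one call."""
    return binascii.a2b_base64(data)
```

The decode side is where the reported speed-up comes from: one C-level call replaces a Python-level loop over file-like reads.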
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=427749&group_id=5470 From noreply@sourceforge.net Thu Jun 7 21:09:50 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 07 Jun 2001 13:09:50 -0700 Subject: [Patches] [ python-Patches-426746 ] Infrastructure for getting MacPython modules working on OSX Message-ID: Patches item #426746, was updated on 2001-05-23 13:29 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=426746&group_id=5470 Category: Build Group: None Status: Open Resolution: None Priority: 5 Submitted By: Jack Jansen (jackjansen) >Assigned to: Thomas Wouters (twouters) Summary: Infrastructure for getting MacPython modules working on OSX Initial Comment: Here are a couple of patches that optionally (on MacOSX) enable a bit of extra infrastructure in the Python core, to allow various (MacPython-originated) dynamic extension modules to be built. Here's what I patched: - Added a MACHDEP_OBJS variable to Makefile.pre.in and configure.in. This allows platforms to add platform-specific source files to the core build. - Added (using MACHDEP_OBJS) a macglue.c file to the build, which contains glue code that allows Mac extension modules to refer to each other while being in separate dynamically loaded modules, plus a couple of utility routines. There are also a few changes to LDFLAGS to get the object file incorporated (as it is otherwise optimized away because the rest of Python doesn't refer to it). - Added a config.h.in define USE_TOOLBOX_OBJECT_GLUE which enables the glue code mentioned above (which isn't needed in MacPython, only in Mach-O Python). Possibly the latter two should be dependent on a configure switch (--with-mac-toolbox-modules?)
but (a) I think the added memory footprint is minimal and (b) I never understood how to add configure switches:-) A setup.py patch will follow, but I'm still testing it. ---------------------------------------------------------------------- >Comment By: Tim Peters (tim_one) Date: 2001-06-07 13:09 Message: Logged In: YES user_id=31435 Assigned to Thomas because he's shown previous signs of knowing how to spell "configure" <0.9 wink>. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=426746&group_id=5470 From noreply@sourceforge.net Thu Jun 7 21:19:04 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 07 Jun 2001 13:19:04 -0700 Subject: [Patches] [ python-Patches-424475 ] Speed-up tp_compare usage Message-ID: Patches item #424475, was updated on 2001-05-16 01:07 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=424475&group_id=5470 Category: core (C code) Group: None Status: Open >Resolution: Accepted Priority: 5 Submitted By: Martin v. Löwis (loewis) >Assigned to: Martin v. Löwis (loewis) Summary: Speed-up tp_compare usage Initial Comment: This patch tries to optimize PyObject_RichCompare for the common case of objects with equal types which support tp_compare. It gives a speed-up of roughly 7% for comparing strings in a loop. The patch also gives type objects a tp_compare function, so that they can make use of the improvement. ---------------------------------------------------------------------- >Comment By: Tim Peters (tim_one) Date: 2001-06-07 13:19 Message: Logged In: YES user_id=31435 Accepted and assigned back to Martin. This is too valuable to quibble over. Note that when calling a tp_compare slot, this kind of thing: . c = (*f)(v, w); . if (PyErr_Occurred()) is better spelled: . c = (*f)(v, w); . 
if (c < 0 && PyErr_Occurred()) ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2001-05-21 09:57 Message: Logged In: YES user_id=21627 The revised patch prefers tp_compare over tp_richcompare in do_cmp if both are available. It also restores UserList.__cmp__ from deprecation. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=424475&group_id=5470 From noreply@sourceforge.net Thu Jun 7 21:41:58 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 07 Jun 2001 13:41:58 -0700 Subject: [Patches] [ python-Patches-429171 ] sgmllib - leading spaces in declaration Message-ID: Patches item #429171, was updated on 2001-05-31 15:26 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=429171&group_id=5470 Category: library Group: None Status: Open Resolution: None Priority: 5 Submitted By: Doug Fort (dougfort) Assigned to: Nobody/Anonymous (nobody) Summary: sgmllib - leading spaces in declaration Initial Comment: Some sites sloppily leave a space in their doctype declaration: i.e. . The Python 2.1 sgml parser raises an exception for this. This patch modifies sgmllib.py to allow leading whitespace in the declaration. It also adds a little information to the exception message. ---------------------------------------------------------------------- >Comment By: Doug Fort (dougfort) Date: 2001-06-07 13:41 Message: Logged In: YES user_id=6399 I have already overloaded parse_declaration. I will withdraw the patch. However, I would like to make one final comment. A rigid interpretation of the RFCs is correct in servers, but clients should be as flexible as possible, to handle real servers.
Our system (http://www.stressmy.com) uses heavily overloaded versions of sgmllib, httplib, and other Python library modules because while they may adhere to some notion of academic purity, they just don't work very well against real websites. Whew, I feel better now. ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2001-06-07 13:05 Message: Logged In: YES user_id=21627 I don't have an SGML spec, so I can only check the XML spec. In XML, such a DOCTYPE declaration is ill-formed; I expect the same to be true for SGML. Therefore, I recommend rejecting this patch. If you have a need to process such ill-formed documents, I recommend deriving from SGMLParser and replacing parse_declaration appropriately. E.g. you could advance i until after the space, then call the base method. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=429171&group_id=5470 From noreply@sourceforge.net Thu Jun 7 21:43:00 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 07 Jun 2001 13:43:00 -0700 Subject: [Patches] [ python-Patches-429171 ] sgmllib - leading spaces in declaration Message-ID: Patches item #429171, was updated on 2001-05-31 15:26 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=429171&group_id=5470 Category: library Group: None >Status: Deleted Resolution: None Priority: 5 Submitted By: Doug Fort (dougfort) Assigned to: Nobody/Anonymous (nobody) Summary: sgmllib - leading spaces in declaration Initial Comment: Some sites sloppily leave a space in their doctype declaration: i.e. . The Python 2.1 sgml parser raises an exception for this. This patch modifies sgmllib.py to allow leading whitespace in the declaration. It also adds a little information to the exception message. 
---------------------------------------------------------------------- Comment By: Doug Fort (dougfort) Date: 2001-06-07 13:41 Message: Logged In: YES user_id=6399 I have already overloaded parse_declaration. I will withdraw the patch. However, I would like to make one final comment. A rigid interpretation of the RFCs is correct in servers, but clients should be as flexible as possible, to handle real servers. Our system (http://www.stressmy.com) uses heavily overloaded versions of sgmllib, httplib, and other Python library modules because while they may adhere to some notion of academic purity, they just don't work very well against real websites. Whew, I feel better now. ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2001-06-07 13:05 Message: Logged In: YES user_id=21627 I don't have an SGML spec, so I can only check the XML spec. In XML, such a DOCTYPE declaration is ill-formed; I expect the same to be true for SGML. Therefore, I recommend rejecting this patch. If you have a need to process such ill-formed documents, I recommend deriving from SGMLParser and replacing parse_declaration appropriately. E.g. you could advance i until after the space, then call the base method. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=429171&group_id=5470 From noreply@sourceforge.net Thu Jun 7 22:37:52 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 07 Jun 2001 14:37:52 -0700 Subject: [Patches] [ python-Patches-430948 ] Performance improvement for profiler Message-ID: Patches item #430948, was updated on 2001-06-06 22:29 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=430948&group_id=5470 Category: library Group: None Status: Open Resolution: None Priority: 5 Submitted By: Fred L. Drake, Jr. 
(fdrake) >Assigned to: Tim Peters (tim_one) Summary: Performance improvement for profiler Initial Comment: This patch adds a bit of complexity to Profile.__init__() in an effort to reduce the overhead of the profiler. The essential piece of the puzzle is that the general Profile.get_time() method is replaced with a function which does only as much as is needed for the underlying timer. For example, if time.clock() is available, it can become a PyCFunction instead of a bound method, requires only 1 dict lookup to execute instead of the 11 it takes to execute get_time() without this patch. Also removes a couple of duplicate imports from the "if __name__ == ..." section. ---------------------------------------------------------------------- >Comment By: Fred L. Drake, Jr. (fdrake) Date: 2001-06-07 14:37 Message: Logged In: YES user_id=3066 I've attached a revised patch with the suggested changes, plus a few more. This is more aggressive about avoiding dictionary lookups, and the dispatch table no longer contains bound methods -- using plain functions with self passed as an explicit argument is faster as it avoids more of Python's call machinery, and avoids circular references. This patch also attempts not to add any breakage to the OldProfile and HotProfile classes. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2001-06-07 12:39 Message: Logged In: YES user_id=31435 Fine by me (and good idea!). I'd rather see get_time_mac be a module-level function _get_time_mac, get_time_timer a module-level _get_time_timer (or, better, _get_time_list), and get_time_times a module-level function _get_time_times; and in the last case without the needless expense of reduce(): .def _get_time_times(times=os.times): . t = times() . return t[0] + t[1] ---------------------------------------------------------------------- Comment By: Fred L. Drake, Jr. 
(fdrake) Date: 2001-06-06 22:30 Message: Logged In: YES user_id=3066 I should note that this works with both 2.1.1 and 2.2, though this is not a bugfix. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=430948&group_id=5470 From noreply@sourceforge.net Thu Jun 7 22:44:35 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 07 Jun 2001 14:44:35 -0700 Subject: [Patches] [ python-Patches-431257 ] profile/trace dispatch speed-up Message-ID: Patches item #431257, was updated on 2001-06-07 14:44 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=431257&group_id=5470 Category: core (C code) Group: None Status: Open Resolution: None Priority: 5 Submitted By: Fred L. Drake, Jr. (fdrake) Assigned to: Tim Peters (tim_one) Summary: profile/trace dispatch speed-up Initial Comment: The profile and trace functions take a string as one of their parameters, where the value of the string is one of exactly four values. Unfortunately, a new string object is created for each call to the profile/trace functions, and is not interned. This patch modifies ceval.c so the string object for each of these values is created only once and is interned, allowing faster dictionary lookups in the profile/trace functions. This avoids a lot of string creation overhead for calling these functions, and can help the standard profiler work faster by using interned string objects. 
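The effect described above can be illustrated at the Python level. This is only a rough sketch under stated assumptions: the patch itself works in C inside ceval.c, whereas the snippet uses sys.intern from present-day Python 3 (in 2001 the built-in was spelled intern()). The four event names and the helper make_event are assumptions made for the illustration and are not taken from the patch.

```python
import sys

# Assumed event names, for illustration only.
EVENT_NAMES = ("call", "exception", "line", "return")

# Intern each name once, up front, as the patch does on the C side.
INTERNED = {name: sys.intern(name) for name in EVENT_NAMES}


def make_event(name):
    """Build an equal string at run time (hypothetical helper).

    This stands in for the pre-patch behaviour of creating a new
    string object for every profile/trace call.
    """
    return "".join(list(name))


fresh = make_event("call")

# Interning maps every equal string onto one shared object, so a
# dictionary lookup can succeed on identity without comparing characters.
shared = sys.intern(fresh)
```

With interned keys and interned arguments, a lookup such as dispatch[event] can match on object identity, which is the saving the patch is after.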
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=431257&group_id=5470 From noreply@sourceforge.net Thu Jun 7 22:54:13 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 07 Jun 2001 14:54:13 -0700 Subject: [Patches] [ python-Patches-430948 ] Performance improvement for profiler Message-ID: Patches item #430948, was updated on 2001-06-06 22:29 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=430948&group_id=5470 Category: library Group: None Status: Open >Resolution: Accepted Priority: 5 Submitted By: Fred L. Drake, Jr. (fdrake) >Assigned to: Fred L. Drake, Jr. (fdrake) Summary: Performance improvement for profiler Initial Comment: This patch adds a bit of complexity to Profile.__init__() in an effort to reduce the overhead of the profiler. The essential piece of the puzzle is that the general Profile.get_time() method is replaced with a function which does only as much as is needed for the underlying timer. For example, if time.clock() is available, it can become a PyCFunction instead of a bound method, requires only 1 dict lookup to execute instead of the 11 it takes to execute get_time() without this patch. Also removes a couple of duplicate imports from the "if __name__ == ..." section. ---------------------------------------------------------------------- >Comment By: Tim Peters (tim_one) Date: 2001-06-07 14:54 Message: Logged In: YES user_id=31435 Accepted and back to Fred, with the caveat we talked about that __init__ should still do the right thing with a passed-in timer returning an arbitrary sequence-like object of number-like objects. ---------------------------------------------------------------------- Comment By: Fred L. Drake, Jr. (fdrake) Date: 2001-06-07 14:37 Message: Logged In: YES user_id=3066 I've attached a revised patch with the suggested changes, plus a few more. 
This is more aggressive about avoiding dictionary lookups, and the dispatch table no longer contains bound methods -- using plain functions with self passed as an explicit argument is faster as it avoids more of Python's call machinery, and avoids circular references. This patch also attempts not to add any breakage to the OldProfile and HotProfile classes. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2001-06-07 12:39 Message: Logged In: YES user_id=31435 Fine by me (and good idea!). I'd rather see get_time_mac be a module-level function _get_time_mac, get_time_timer a module-level _get_time_timer (or, better, _get_time_list), and get_time_times a module-level function _get_time_times; and in the last case without the needless expense of reduce(): .def _get_time_times(times=os.times): . t = times() . return t[0] + t[1] ---------------------------------------------------------------------- Comment By: Fred L. Drake, Jr. (fdrake) Date: 2001-06-06 22:30 Message: Logged In: YES user_id=3066 I should note that this works with both 2.1.1 and 2.2, though this is not a bugfix. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=430948&group_id=5470 From noreply@sourceforge.net Thu Jun 7 23:01:57 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 07 Jun 2001 15:01:57 -0700 Subject: [Patches] [ python-Patches-431257 ] profile/trace dispatch speed-up Message-ID: Patches item #431257, was updated on 2001-06-07 14:44 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=431257&group_id=5470 Category: core (C code) Group: None Status: Open >Resolution: Accepted Priority: 5 Submitted By: Fred L. Drake, Jr. (fdrake) >Assigned to: Fred L. Drake, Jr. 
(fdrake) Summary: profile/trace dispatch speed-up Initial Comment: The profile and trace functions take a string as one of their parameters, where the value of the string is one of exactly four values. Unfortunately, a new string object is created for each call to the profile/trace functions, and is not interned. This patch modifies ceval.c so the string object for each of these values is created only once and is interned, allowing faster dictionary lookups in the profile/trace functions. This avoids a lot of string creation overhead for calling these functions, and can help the standard profiler work faster by using interned string objects. ---------------------------------------------------------------------- >Comment By: Tim Peters (tim_one) Date: 2001-06-07 15:01 Message: Logged In: YES user_id=31435 Accepted, and back to Fred "The Interner" Drake, Jr. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=431257&group_id=5470 From noreply@sourceforge.net Fri Jun 8 00:28:55 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 07 Jun 2001 16:28:55 -0700 Subject: [Patches] [ python-Patches-427749 ] Patch for bug #419390 (base64.py) Message-ID: Patches item #427749, was updated on 2001-05-27 11:35 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=427749&group_id=5470 Category: library Group: None >Status: Closed >Resolution: Duplicate Priority: 5 Submitted By: Kalle Svensson (krftkndl) Assigned to: Peter Schneider-Kamp (nowonder) Summary: Patch for bug #419390 (base64.py) Initial Comment: Improves performance of base64.encodestring and base64.decodestring by avoiding StringIO and using binascii directly. ---------------------------------------------------------------------- >Comment By: Peter Schneider-Kamp (nowonder) Date: 2001-06-07 16:28 Message: Logged In: YES user_id=14463 Tried something similar. 
Slower than Tim's version, though. Already checked that one in. Closing as Duplicate. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2001-06-07 13:07 Message: Logged In: YES user_id=31435 Assigned to Peter since it appears to compete with his patch. Peter, I expect your patch is quicker. If you agree and check in your patch, close this as Duplicate (or something). ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=427749&group_id=5470 From noreply@sourceforge.net Fri Jun 8 00:30:11 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 07 Jun 2001 16:30:11 -0700 Subject: [Patches] [ python-Patches-430846 ] faster string-decoding in base64.py Message-ID: Patches item #430846, was updated on 2001-06-06 14:14 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=430846&group_id=5470 Category: library Group: None >Status: Closed Resolution: Accepted Priority: 3 Submitted By: Peter Schneider-Kamp (nowonder) Assigned to: Peter Schneider-Kamp (nowonder) Summary: faster string-decoding in base64.py Initial Comment: This addresses bug #419390 by anthonybaxter. Instead of wrapping a string-to-be-decoded into a StringIO class and using base64.decode use binascii.a2b_base64 directly. Speedup for big files is over 10 times (on Linux x86 anyway). If uncontroversial I'll check it in. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2001-06-07 11:42 Message: Logged In: YES user_id=31435 Accepted and assigned back to Peter for checkin. Don't see how this could be controversial -- it's simple and appropriate. 
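The approach described in the initial comment can be sketched as follows. This is a minimal illustration written for present-day Python 3 (bytes in, bytes out), not the checked-in patch. The names encodestring and decodestring come from the discussion; MAXBINSIZE mirrors the constant used by the standard base64 module and is an assumption about how the lines are chunked.

```python
import binascii

# 57 input bytes encode to exactly one 76-character base64 line.
MAXBINSIZE = 57


def encodestring(data: bytes) -> bytes:
    """Encode data to newline-terminated base64 lines, with no file objects."""
    pieces = []
    for i in range(0, len(data), MAXBINSIZE):
        # b2a_base64 appends a newline to each encoded chunk.
        pieces.append(binascii.b2a_base64(data[i:i + MAXBINSIZE]))
    return b"".join(pieces)


def decodestring(data: bytes) -> bytes:
    """Decode base64 data in a single call instead of wrapping it in StringIO."""
    # a2b_base64 skips the embedded newlines in its default (non-strict) mode.
    return binascii.a2b_base64(data)
```

Working on the whole string at once avoids the per-line read/write loop that the file-based base64.decode goes through, which is where the reported speed-up comes from.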
---------------------------------------------------------------------- Comment By: Peter Schneider-Kamp (nowonder) Date: 2001-06-06 23:07 Message: Logged In: YES user_id=14463 Mhh, I did click that "Check to Upload & Attach File" thing. No matter what, here is the new version (including your speedup for encodestring). ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2001-06-06 14:25 Message: Logged In: YES user_id=31435 Umm -- there's no patch here. If there were, I bet I would have changed this to Accepted, though . ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=430846&group_id=5470 From noreply@sourceforge.net Fri Jun 8 05:26:33 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 07 Jun 2001 21:26:33 -0700 Subject: [Patches] [ python-Patches-430948 ] Performance improvement for profiler Message-ID: Patches item #430948, was updated on 2001-06-06 22:29 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=430948&group_id=5470 Category: library Group: None >Status: Closed Resolution: Accepted Priority: 5 Submitted By: Fred L. Drake, Jr. (fdrake) Assigned to: Fred L. Drake, Jr. (fdrake) Summary: Performance improvement for profiler Initial Comment: This patch adds a bit of complexity to Profile.__init__() in an effort to reduce the overhead of the profiler. The essential piece of the puzzle is that the general Profile.get_time() method is replaced with a function which does only as much as is needed for the underlying timer. For example, if time.clock() is available, it can become a PyCFunction instead of a bound method, requires only 1 dict lookup to execute instead of the 11 it takes to execute get_time() without this patch. Also removes a couple of duplicate imports from the "if __name__ == ..." section. 
---------------------------------------------------------------------- >Comment By: Fred L. Drake, Jr. (fdrake) Date: 2001-06-07 21:26 Message: Logged In: YES user_id=3066 Checked in with the suggested modification. This is Lib/profile.py revision 1.28. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2001-06-07 14:54 Message: Logged In: YES user_id=31435 Accepted and back to Fred, with the caveat we talked about that __init__ should still do the right thing with a passed-in timer returning an arbitrary sequence-like object of number-like objects. ---------------------------------------------------------------------- Comment By: Fred L. Drake, Jr. (fdrake) Date: 2001-06-07 14:37 Message: Logged In: YES user_id=3066 I've attached a revised patch with the suggested changes, plus a few more. This is more aggressive about avoiding dictionary lookups, and the dispatch table no longer contains bound methods -- using plain functions with self passed as an explicit argument is faster as it avoids more of Python's call machinery, and avoids circular references. This patch also attempts not to add any breakage to the OldProfile and HotProfile classes. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2001-06-07 12:39 Message: Logged In: YES user_id=31435 Fine by me (and good idea!). I'd rather see get_time_mac be a module-level function _get_time_mac, get_time_timer a module-level _get_time_timer (or, better, _get_time_list), and get_time_times a module-level function _get_time_times; and in the last case without the needless expense of reduce(): .def _get_time_times(times=os.times): . t = times() . return t[0] + t[1] ---------------------------------------------------------------------- Comment By: Fred L. Drake, Jr. 
(fdrake) Date: 2001-06-06 22:30 Message: Logged In: YES user_id=3066 I should note that this works with both 2.1.1 and 2.2, though this is not a bugfix. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=430948&group_id=5470 From noreply@sourceforge.net Fri Jun 8 05:33:38 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 07 Jun 2001 21:33:38 -0700 Subject: [Patches] [ python-Patches-431257 ] profile/trace dispatch speed-up Message-ID: Patches item #431257, was updated on 2001-06-07 14:44 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=431257&group_id=5470 Category: core (C code) Group: None >Status: Closed Resolution: Accepted Priority: 5 Submitted By: Fred L. Drake, Jr. (fdrake) Assigned to: Fred L. Drake, Jr. (fdrake) Summary: profile/trace dispatch speed-up Initial Comment: The profile and trace functions take a string as one of their parameters, where the value of the string is one of exactly four values. Unfortunately, a new string object is created for each call to the profile/trace functions, and is not interned. This patch modifies ceval.c so the string object for each of these values is created only once and is interned, allowing faster dictionary lookups in the profile/trace functions. This avoids a lot of string creation overhead for calling these functions, and can help the standard profiler work faster by using interned string objects. ---------------------------------------------------------------------- >Comment By: Fred L. Drake, Jr. (fdrake) Date: 2001-06-07 21:33 Message: Logged In: YES user_id=3066 Checked in as Python/ceval.c revision 2.246. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2001-06-07 15:01 Message: Logged In: YES user_id=31435 Accepted, and back to Fred "The Interner" Drake, Jr. 
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=431257&group_id=5470 From noreply@sourceforge.net Fri Jun 8 16:54:36 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Fri, 08 Jun 2001 08:54:36 -0700 Subject: [Patches] [ python-Patches-431422 ] "print" not emitting POP_TOP Message-ID: Patches item #431422, was updated on 2001-06-08 08:54 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=431422&group_id=5470 Category: Parser/Compiler Group: None Status: Open Resolution: None Priority: 5 Submitted By: Shane Hathaway (hathawsh) Assigned to: Nobody/Anonymous (nobody) Summary: "print" not emitting POP_TOP Initial Comment: The Python-based compiler module (in Tools) has a bug in the visitPrint() method of pycodegen.CodeGenerator. It does not emit a trailing POP_TOP instruction, which AFAICT it should emit only when outputting to a stream and there is a trailing comma (indicating no newline). I've attached the patch applied to Zope's RestrictedPython module; if there is anything incorrect about it please tell me right away. Otherwise please apply the patch to Tools/compiler/pycodegen.py. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=431422&group_id=5470 From noreply@sourceforge.net Fri Jun 8 17:29:02 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Fri, 08 Jun 2001 09:29:02 -0700 Subject: [Patches] [ python-Patches-426208 ] Fun with Floating Point Message-ID: Patches item #426208, was updated on 2001-05-22 01:06 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=426208&group_id=5470 Category: documentation Group: None >Status: Closed >Resolution: Accepted Priority: 5 Submitted By: Tim Peters (tim_one) Assigned to: Fred L. Drake, Jr. 
(fdrake) Summary: Fun with Floating Point Initial Comment: I suggest this as an Appendix. For Michel Pelletier's benefit, it contains no equation . Alas for you, for my benefit it contains no LaTeX markup either. Season to taste! ---------------------------------------------------------------------- >Comment By: Fred L. Drake, Jr. (fdrake) Date: 2001-06-08 09:29 Message: Logged In: YES user_id=3066 Checked in as Doc/tut/tut.tex revision 1.137. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2001-05-23 15:02 Message: Logged In: YES user_id=31435 New text, with improved wording, and an un-Wikized version of the RepresentationError page at the end as a new section. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2001-05-23 15:00 Message: Logged In: YES user_id=31435 Deleted the attachment in preparation for uploading a new one. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=426208&group_id=5470 From noreply@sourceforge.net Fri Jun 8 22:37:50 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Fri, 08 Jun 2001 14:37:50 -0700 Subject: [Patches] [ python-Patches-429614 ] pythonpath and optimize def. before init Message-ID: Patches item #429614, was updated on 2001-06-02 08:56 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=429614&group_id=5470 Category: core (C code) Group: None Status: Open Resolution: None Priority: 5 Submitted By: Frederic Giacometti (giacometti) Assigned to: Nobody/Anonymous (nobody) Summary: pythonpath and optimize def. 
before init Initial Comment: A) Addition of four functions ===================== Py_{Set, Get}{PythonPath, OptimizeLevel}() with the same semantics as Py_{Set, Get}ProgramName() (Note: the ANSI C type 'char const*' is used to describe non-modifiable strings) These four functions are needed in the next JPE runtime (Python 2.1 patch included in the distribution); this allows setting the PYTHONPATH and optimize level from Java property values. B) Option '-P pythonpath' on the Python command line: ======================================== This option defines 'pythonpath' from the command line (and overrides the PYTHONPATH environment variable if necessary). Usefulness: Sometimes, one does not want to rely on the environment variables, or modify them. Sample application: Running build and test scripts in full control of the environment, and with different PYTHONPATH values. This option is needed by the build and test scripts of the next JPE source distribution (Python 2.1 patch included in the distribution). Frederic Giacometti fred@arakne.com ---------------------------------------------------------------------- >Comment By: Frederic Giacometti (giacometti) Date: 2001-06-08 14:37 Message: Logged In: YES user_id=93657 1) PEP: I am not in python-dev. What is the procedure for opening the PEP? 2) Override: I thought about the question. My response was: If you want concatenation, use: python -P "something:$PYTHONPATH" or python -P "$PYTHONPATH:something" That's for all the better... 3) I renamed Py_{Set,Get}OptimizeFlag to Py_{Set,Get}OptimizeLevel after I wrote the documentation. Glad you caught the typo :)), sorry :(( I changed 'Flag' to 'Level' because 'Flag' normally designates a binary variable (2 states) whereas what we are doing is actually defining a debugging level (3 levels as of now, but who knows whether more levels might be added). 'OptimizeLevel' is more accurate and less ambiguous than 'OptimizeFlag'. 
Frederic Giacometti ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2001-06-07 12:58 Message: Logged In: YES user_id=21627 I think a PEP describing the exact rationale and nature of the change is required here. For example, why is it good that -P overrides PYTHONPATH, instead of combining both somehow? Also, the documentation talks about Py_GetOptimizeLevel, whereas the header declares Py_GetOptimizeFlag. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=429614&group_id=5470 From noreply@sourceforge.net Sat Jun 9 08:40:08 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sat, 09 Jun 2001 00:40:08 -0700 Subject: [Patches] [ python-Patches-424475 ] Speed-up tp_compare usage Message-ID: Patches item #424475, was updated on 2001-05-16 01:07 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=424475&group_id=5470 Category: core (C code) Group: None >Status: Closed Resolution: Accepted Priority: 5 Submitted By: Martin v. Löwis (loewis) >Assigned to: Nobody/Anonymous (nobody) Summary: Speed-up tp_compare usage Initial Comment: This patch tries to optimize PyObject_RichCompare for the common case of objects with equal types which support tp_compare. It gives a speed-up of roughly 7% for comparing strings in a loop. The patch also gives type objects a tp_compare function, so that they can make use of the improvement. ---------------------------------------------------------------------- >Comment By: Martin v. Löwis (loewis) Date: 2001-06-09 00:40 Message: Logged In: YES user_id=21627 Committed as object.c 2.132, typeobject.c 2.17, UserList.py 1.17. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2001-06-07 13:19 Message: Logged In: YES user_id=31435 Accepted and assigned back to Martin. 
This is too valuable to quibble over. Note that when calling a tp_compare slot, this kind of thing: . c = (*f)(v, w); . if (PyErr_Occurred()) is better spelled: . c = (*f)(v, w); . if (c < 0 && PyErr_Occurred()) ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2001-05-21 09:57 Message: Logged In: YES user_id=21627 The revised patch prefers tp_compare over tp_richcompare in do_cmp if both are available. It also restores UserList.__cmp__ from deprecation. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=424475&group_id=5470 From noreply@sourceforge.net Sat Jun 9 20:15:31 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sat, 09 Jun 2001 12:15:31 -0700 Subject: [Patches] [ python-Patches-401196 ] IPv6 patch against 2.0 CVS tree, as of 20001230 Message-ID: Patches item #401196, was updated on 2000-08-16 05:53 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=401196&group_id=5470 Category: core (C code) Group: None Status: Open Resolution: None Priority: 5 Submitted By: Jun-ichiro itojun Hagino (itojun) Assigned to: Nobody/Anonymous (nobody) Summary: IPv6 patch against 2.0 CVS tree, as of 20001230 Initial Comment: ---------------------------------------------------------------------- >Comment By: Martin v. Löwis (loewis) Date: 2001-06-09 12:15 Message: Logged In: YES user_id=21627 On the API, I have the following comments: - Why is it necessary to introduce gethostbyname2? I recommend giving gethostbyname an optional argument for the address family. - getaddrinfo, when raising a socket error, should include the EAI_ error number. Perhaps there should be a way to distinguish EAI_ errnos from other errnos, e.g. by subclassing socket error. Otherwise, the API of the C part looks good to me. I haven't looked at the Lib part, yet. 
On the implementation: - I still have problems building the code. Currently, I get the following rejects: ./Lib/BaseHTTPServer.py.rej ./Lib/ftplib.py.rej ./Lib/poplib.py.rej ./Lib/smtplib.py.rej ./Modules/socketmodule.c.rej ./Objects/fileobject.c.rej - The fileobject.c chunk seems to be unnecessary. - On the test problem: It occurs in + test -d -a -f /lib.a ./configure: test: too many arguments which comes from ipv6libdir and ipv6libdir being empty. - The WIDE files should be included in the Modules directory, as they are only used from socketmodule.c. In particular, addrinfo.h should not be installed. - If you can, please include a patch to Doc/lib/libsocket.tex. If not, I will try to draft one. ---------------------------------------------------------------------- Comment By: Nobody/Anonymous (nobody) Date: 2001-05-30 10:34 Message: Logged In: NO i looked at python-dev email. the proposal (split patches) looks fine, but the exact example given in python-dev email is not reasonable. i cannot just send out configure.in change separately from source code changes, period. i can split patches for *.py files separately though. there's more important issue, which is, API changes for Socket class. i really hoped to get some comment on that part. i really appreciate your comments. i would like to propose that once we nailed down API changes, integrate the patch into the tree. with all #ifdef INET6 in place there should be no impact on IPv4-only builds. i have trouble tracking python development (i'm not a sourceforge expert!), so forgive me for delays in patch submissions. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2001-05-18 08:29 Message: Logged In: YES user_id=6380 See http://mail.python.org/pipermail/python-dev/2001-May/014889.html for comments from MvL. I'm unassigning this from Fred, he has nothing to do with this. 
---------------------------------------------------------------------- Comment By: Jun-ichiro itojun Hagino (itojun) Date: 2001-02-26 02:24 Message: Logged In: YES user_id=63767 about /usr/bin/test argument: does linux /usr/bin/test have -d support? if not, we may need to change configure.in slightly. you are correct that fallback getaddrinfo/getnameinfo.c was missing in the patch. sorry. a question i need to ask is, do we need to supply Python function Socket.getaddrinfo on platforms that do not have getaddrinfo(3)? HAVE_ADDRINFO is used in Include/addrinfo.h, which is also missing in the patch set i have submitted. i've put the missing files into http://www.itojun.org/diary/20001230/missing.shar. ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2001-02-23 23:58 Message: After a shallow review of this patch, I found the following issues: configure.in does not need to list both enable and disable options. When running configure, I got the following error message on Linux checking whether to enable ipv6... yes checking ipv6 stack type... linux-glibc ./configure: test: too many arguments using libc The call to /usr/bin/test should be corrected; I could not find out which specific invocation caused the problem. HAVE_ADDRINFO is not used. Perhaps getaddrinfo.c/getnameinfo.c is missing in the patch? ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2001-01-04 07:51 Message: A new patch is available. I've changed the subject accordingly. Due to upload size restrictions, the patch is now at http://www.itojun.org/diary/20001230/python-2.0-v6-20001230.diff.gz ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2000-12-30 07:25 Message: I got *many* rejects when trying to apply this patch to today's CVS tree. 
I recommend that patches for generated files (config.h.in, configure) are not included in the patch because they outdate too easily. A number of changes in this patch have already been done by somebody else; others just don't fit into the current code anymore (perhaps due to indentation changes?). Anyway, I'll mark the patch as out-of-date. Please let me know when you upload a new version. ---------------------------------------------------------------------- Comment By: Fred L. Drake, Jr. (fdrake) Date: 2000-08-16 07:00 Message: Postponed until Python 2.1 -- there's not enough time to review this and get it sufficiently tested on enough IPv6-connected platforms in time for 2.0, and we're already in feature freeze. This should go into the tree very quickly once Python 2.0 has been released. Assigned to myself to open it back up after Python 2.0. ---------------------------------------------------------------------- Comment By: Moshe Zadka (moshez) Date: 2000-08-16 06:07 Message: Assigned to Tim, since he's in charge of postponing new features. I'm too timid to postpone it myself. ---------------------------------------------------------------------- Comment By: Jun-ichiro itojun Hagino (itojun) Date: 2000-08-16 05:59 Message: this is revised version of patch #101186 (now with my SourceForge account... i'm not familiar with the system here, so forgive my possible mistake). 1.6b1 patch applied mostly clean to 2.0. It is confirmed that: - 1.6b1 + IPv6 patch works fine on NetBSD 1.4.2 + KAME, and NetBSD 1.5 - 1.6b1 + IPv6 patch works fine on NetBSD 1.4.2 (NOT an IPv6 ready machine) - 2.0 CVS tree + IPv6 patch works fine on NetBSD + KAME forgot to attach the following into the diff - so i attach it (README.v6) here as comment. I have submitted the patch for 1.5.1, 1.5.2 and 1.6b1, all hit a bad timing - bad luck.
contact: core@kame.net, or itojun@kame.net --- IPv6-ready python 1.6 KAME Project $KAME: README.v6,v 1.9 2000/08/15 02:40:38 itojun Exp $ This patchkit enables python 1.6 to perform AF_INET6 socket operations. The only affected module is Modules/socketmodule.c. Modules/socketmodule.c In most cases, IPv6 address can be placed where IPv4 address fits. sockaddr sockaddr tuple is formatted as follows: IPv4: (host, port) IPv6: socket class methods always generate (host, port, flowinfo, scopeid). socket class methods will accept 2, 3, or 4 tuple (for backward compatibility). Compatibility warning: Some of the scripts assume that the sockaddr structure is 2 tuple, like: host, port = sock.getpeername() this will fail if you are connected to IPv6 node. socket.getaddrinfo(host, port [, family, socktype, proto, flags]) host: String or None port: String, Int or None family, socktype, proto, flags: Int, can be omitted Perform getaddrinfo(3). Returns List of the following 5 tuple: (family, socktype, proto, canonname, sockaddr) family: Int socktype: Int proto: Int canonname: String sockaddr: sockaddr (see above) See Lib/httplib.py for typical usage on the client side. socket.getnameinfo(sockaddr, flags) sockaddr: sockaddr flags: Int Perform getnameinfo(3). Returns the following 2 tuple: host: String, numeric or hostname depending on flags port: String, numeric or portname depending on flags socket.gethostbyname2(host, af) host: String af: Int Performs gethostbyname2(3). Returns numeric address representation for "host". socket.gethostbyaddr(addr) (behavior change if IPv6 support is compiled in) addr: String Performs gethostbyaddr(3). Returns string address representation for "addr". The function can take IPv6 numeric address as well. This behavior is not problematic, because - if you pass numeric "addr" parameter, we can always identify address family for it - return value is string address representation, where IPv6 and IPv4 are not distinguishable.
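As a rough illustration of the sockaddr compatibility warning and the getaddrinfo() return value described above, here is a minimal sketch. It assumes the socket.getaddrinfo() that ships with present-day Python, whose positional arguments match the signature listed in this README; the helper name first_stream_address is made up for the example and is not part of the patch.

```python
import socket

def first_stream_address(host, port):
    """Return (family, sockaddr) for the first TCP result of getaddrinfo().

    Hypothetical helper, used only for illustration.
    """
    results = socket.getaddrinfo(host, port, socket.AF_UNSPEC, socket.SOCK_STREAM)
    # Each entry is the 5 tuple (family, socktype, proto, canonname, sockaddr).
    family, socktype, proto, canonname, sockaddr = results[0]
    return family, sockaddr

family, sockaddr = first_stream_address("127.0.0.1", 80)

# An IPv4 sockaddr is (host, port) and an IPv6 one is
# (host, port, flowinfo, scopeid), so index the tuple instead of
# unpacking it into exactly two names.
host, port = sockaddr[0], sockaddr[1]
```

Indexing the first two elements works for both address families, which is one way a script could avoid the "host, port = sock.getpeername()" failure mentioned in the compatibility warning.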
socket.bind(sa), socket.connect(sa) and others. (No behavior change, but be careful) See above for sockaddr format change. With Python "addr" portion of sockaddr (first element) can be string hostname. When the string hostname is resolved to numeric address, it will obey address family of the socket (which was specified when socket.socket() was called). If you give some string that does not give matching address family, you will get some error. s = socket.socket(socket.AF_INET, socket.SOCK_STREAM) # this is okay, if 'localhost' resolves to both IPv4/v6 s.connect('localhost', 80) # this is not okay, of course s.connect('::1', 80) # this is not okay, as v6only.kame.net will not resolve to IPv4 s.connect('v6only.kame.net', 80) Lib/httplib.py IPv6 ready. "host" in HTTP(host) will accept the following 3 forms: [host]:port host:port there must be only single colon host This is to allow IPv6 numeric URL (http://[host]:port/) in documents. IMHO "host:port" parsing should be implemented in urllib.py, not here. Lib/ftplib.py IPv6 ready. This uses EPSV/EPRT on IPv6 ftp. See RFC2428 for protocol details. Lib/SocketServer.py IPv6 ready. Wildcard bind on TCPServer, i.e. TCPServer(('', port)), will bind to wildcard socket on TCPServer.address_family. TCPServer.address_family is set to AF_INET by default, so ('', port) will usually bind AF_INET. Lib/smtplib.py, Lib/telnetlib.py, Lib/poplib.py IPv6 ready. Not much to say about protocol details - they just use TCP over IPv6. configure Configure has extra option, --enable-ipv6 and --disable-ipv6. The option controls IPv6 support feature. dynamic link issues in Modules/socketmodule.c Modules/socketmodule.c can be dynamically loaded only in the following situations: - getaddrinfo(3) and getnameinfo(3) are supplied by OS vendor in libc, and libc is dynamic link library. - OS vendor is NOT supplying getaddrinfo(3) nor getnameinfo(3), and You are configuring this package with --disable-ipv6.
In this case, you'll be using missing/get{addr,name}info.c and they will refer to gethostby{name,addr}. gethostnameby{name,addr} can usually be found in dynamic-linking libc. In other situations, such as the following, please link Modules/socketmodule.c into python itself. - getaddrinfo(3) and getnameinfo(3) are supplied by OS vendor, but they are in statically linked library like libinet6.a. (KAME falls into this category) python usually links Modules/socketmodule.c into python itself (due to its popularity) so there should be no problem. restrictions - The patched tree will not use gethostbyname_r and other thread-ready libraries. Instead, it will use getaddrinfo() and getnameinfo() throughout the operation. todo - Patch bunch of library files in Lib/*.py. compatibility issues with existing scripts If you disable IPv6 support (./configure --disable-ipv6), the patched code is mostly compatible with original Python (except files in "Lib" directory modified for dual stack support). User script may choke if: - IPv4/v6 dualstack libc is supplied, python is compiled for dual stack, and script assumes some of IPv4-only behavior (especially sockaddr) - IPv4/v6 dualstack libc is supplied, python is compiled for IPv4 only, and script assumes some of IPv4-only behavior. In this case, Python socket class itself does not support IPv6, however, name resolution functions can return IPv6 names since they use IPv6-ready libc functions! I do not recommend this configuration. - script assumes certain IPv4-only version behavior in Lib/*.py. compilation If you use IPv6 features, it is assumed that you have working getaddrinfo() and getnameinfo() library functions. We have noticed that some of IPv6 stack is shipped with broken getaddrinfo(). In such cases, use missing/get{addr,name}info.c instead (but then, you need to have working getipnodeby{name,addr}). If you compile this on IPv4-only machine without get{addr,name}info, missing/get{addr,name}info.c will be used. 
They are from KAME IPv6 distribution and is #ifdef'ed for IPv4 only support. They are fairly complete implementation and you don't need to bother with bind 8.2 (bind 8.2 get{addr,name}info() has bugs). When compiling this kit on IPv6 node, you may need to specify some additional library paths or cpp defs. (like -linet6 or -DINET6) --enable-ipv6 will give you some warning, if the IPv6 stack is unknown to the "configure" script. Currently, the following IPv6 stacks are officially supported (i.e. we've checked that the package works well): - KAME IPv6 stack, http://www.kame.net/ References RFC2553, for getaddrinfo(3) and getnameinfo(3). Author contacts http://www.kame.net/ mailto:core@kame.net ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=401196&group_id=5470 From noreply@sourceforge.net Sat Jun 9 04:52:01 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Fri, 08 Jun 2001 20:52:01 -0700 Subject: [Patches] [ python-Patches-431848 ] mathmodule.c: doc strings & conversion Message-ID: Patches item #431848, was updated on 2001-06-08 20:52 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=431848&group_id=5470 Category: Modules Group: None Status: Open Resolution: None Priority: 3 Submitted By: Peter Schneider-Kamp (nowonder) Assigned to: Tim Peters (tim_one) Summary: mathmodule.c: doc strings & conversion Initial Comment: * more informative doc strings for mathmodule.c * methods math.radians and math.degrees to convert between radians and degrees This addresses feature request #426539. Suggestions for better names (deg2rad instead of radians?) or better doc strings will be met enthusiastically. 
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=431848&group_id=5470 From noreply@sourceforge.net Mon Jun 11 06:53:03 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sun, 10 Jun 2001 22:53:03 -0700 Subject: [Patches] [ python-Patches-429614 ] pythonpath and optimize def. before init Message-ID: Patches item #429614, was updated on 2001-06-02 08:56 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=429614&group_id=5470 Category: core (C code) Group: None Status: Open Resolution: None Priority: 5 Submitted By: Frederic Giacometti (giacometti) Assigned to: Nobody/Anonymous (nobody) Summary: pythonpath and optimize def.
before init Initial Comment: A) Addition of four functions ===================== Py_{Set, Get}{PythonPath, OptimizeLevel}() with the same semantics as Py_{Set, Get}ProgramName() (Note: the ANSI C type 'char const*' is used to describe non-modifiable strings) These four functions are needed in the next JPE runtime (Python 2.1 patch included in the distribution); this allows setting the PYTHONPATH and optimize level from Java property values. B) Option '-P pythonpath' on the Python command line: ======================================== This option defines 'pythonpath' from the command line (and overrides the PYTHONPATH environment variable if necessary). Usefulness: Sometimes, one does not want to rely on the environment variables, or modify them. Sample application: Running build and test scripts in full control of the environment, and with different PYTHONPATH values. This option is needed by the build and test scripts of the next JPE source distribution (Python 2.1 patch included in the distribution). Frederic Giacometti fred@arakne.com ---------------------------------------------------------------------- >Comment By: Martin v. Löwis (loewis) Date: 2001-06-10 22:53 Message: Logged In: YES user_id=21627 You can find the PEP guidelines in PEP 1: http://python.sourceforge.net/peps/pep-0001.html ---------------------------------------------------------------------- Comment By: Frederic Giacometti (giacometti) Date: 2001-06-08 14:37 Message: Logged In: YES user_id=93657 1) PEP: I am not in python-dev. What is the procedure for opening the PEP? 2) Override: I thought about the question. My response was: If you want concatenation, use: python -P "something:$PYTHONPATH" or python -P "$PYTHONPATH:something" That's for all the better... 3) I renamed Py_{Set,Get}OptimizeFlag to Py_{Set,Get}OptimizeLevel after I wrote the documentation.
Glad you caught the typo :)), sorry :(( I changed 'Flag' to 'Level' because 'Flag' normally designates a binary variable (2 states) whereas what we are doing is actually defining a debugging level (3 levels as of now, but who knows whether more levels might be added). 'OptimizeLevel' is more accurate and less ambiguous than 'OptimizeFlag'. Frederic Giacometti ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2001-06-07 12:58 Message: Logged In: YES user_id=21627 I think a PEP describing the exact rationale and nature of the change is required here. For example, why is it good that -P overrides PYTHONPATH, instead of combining both somehow? Also, the documentation talks about Py_GetOptimizeLevel, whereas the header declares Py_GetOptimizeFlag. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=429614&group_id=5470 From noreply@sourceforge.net Mon Jun 11 16:24:57 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 11 Jun 2001 08:24:57 -0700 Subject: [Patches] [ python-Patches-432117 ] Updated PullDOM patch Message-ID: Patches item #432117, was updated on 2001-06-11 08:24 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=432117&group_id=5470 Category: XML Group: None Status: Open Resolution: None Priority: 5 Submitted By: Nobody/Anonymous (nobody) Assigned to: Nobody/Anonymous (nobody) Summary: Updated PullDOM patch Initial Comment: Martin, here is an updated patch (see # 423394) for pulldom.py that follows the DOM REC regarding namespace declaration attribute handling. In short: namespace declaration attributes are now preserved, and the namespaceURI of a namespace decl attribute is "http://www.w3.org/2000/xmlns/".
The localName is the prefix to be mapped, unless it is a plain "xmlns" (default ns declaration), in which case the localName is just "xmlns". I've tested this with a Python 2.1 (final release) install. Let me know if you need anything else. Thanks! B. Lloyd ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=432117&group_id=5470 From noreply@sourceforge.net Mon Jun 11 21:00:39 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 11 Jun 2001 13:00:39 -0700 Subject: [Patches] [ python-Patches-432183 ] PEP-259: skip printing newline*2 Message-ID: Patches item #432183, was updated on 2001-06-11 13:00 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=432183&group_id=5470 Category: core (C code) Group: None Status: Open Resolution: None Priority: 5 Submitted By: Guido van Rossum (gvanrossum) Assigned to: Nobody/Anonymous (nobody) Summary: PEP-259: skip printing newline*2 Initial Comment: See PEP 259 (to be checked in soon). This suppresses the printing of an extra newline when the last item printed is a string ending in a newline. 
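The rule stated in the PEP 259 comment above can be emulated in a few lines. This is only a sketch under stated assumptions: the function name pep259_print is hypothetical, it writes to an explicit stream rather than changing the print statement, and it ignores the softspace bookkeeping that the real patch to the interpreter would have to deal with.

```python
import io

def pep259_print(out, *items):
    """Write items space-separated, then a newline, except when the last
    item is a string that already ends in a newline.

    Hypothetical emulation of the behaviour described in the comment;
    not the code from the patch.
    """
    out.write(" ".join(str(item) for item in items))
    last = items[-1] if items else None
    if not (isinstance(last, str) and last.endswith("\n")):
        out.write("\n")

buf = io.StringIO()
pep259_print(buf, "line one\n")  # already ends in a newline: none is added
pep259_print(buf, "line two")    # no trailing newline: one is added
```

With this rule, printing lines read from a file would not produce blank lines between them, which appears to be the case the proposal is aimed at.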
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=432183&group_id=5470 From noreply@sourceforge.net Sun Jun 10 10:12:49 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sun, 10 Jun 2001 02:12:49 -0700 Subject: [Patches] [ python-Patches-432183 ] PEP-259: skip printing newline*2 Message-ID: Patches item #432183, was updated on 2001-06-11 13:00 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=432183&group_id=5470 Category: core (C code) Group: None Status: Open Resolution: None Priority: 5 Submitted By: Guido van Rossum (gvanrossum) Assigned to: Nobody/Anonymous (nobody) Summary: PEP-259: skip printing newline*2 Initial Comment: See PEP 259 (to be checked in soon). This suppresses the printing of an extra newline when the last item printed is a string ending in a newline. ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2001-06-10 02:12 Message: Logged In: YES user_id=6656 I think you also want: Index: code.py =================================================================== RCS file: /cvsroot/python/python/dist/src/Lib/code.py,v retrieving revision 1.16 diff -c -1 -r1.16 code.py *** code.py 2001/05/03 04:58:49 1.16 --- code.py 2001/06/11 22:11:29 *************** *** 106,108 **** else: ! if softspace(sys.stdout, 0): print --- 106,108 ---- else: ! 
if softspace(sys.stdout, 0) >= 0: print (not tested) ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=432183&group_id=5470 From noreply@sourceforge.net Tue Jun 12 07:23:26 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 11 Jun 2001 23:23:26 -0700 Subject: [Patches] [ python-Patches-432325 ] \versionadded{2.2} in libstruct.tex Message-ID: Patches item #432325, was updated on 2001-06-11 23:23 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=432325&group_id=5470 Category: documentation Group: None Status: Open Resolution: None Priority: 5 Submitted By: Peter Funk (pefu) Assigned to: Nobody/Anonymous (nobody) Summary: \versionadded{2.2} in libstruct.tex Initial Comment: Tim Peters: > Modified Files: > libstruct.tex > Log Message: > Added q/Q standard (x-platform 8-byte ints) mode in struct module. [...] Hmmmm.... You probably forgot the \versionadded{2.2} note? ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=432325&group_id=5470 From noreply@sourceforge.net Tue Jun 12 14:43:03 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Tue, 12 Jun 2001 06:43:03 -0700 Subject: [Patches] [ python-Patches-432401 ] unicode encoding error callbacks Message-ID: Patches item #432401, was updated on 2001-06-12 06:43 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=432401&group_id=5470 Category: core (C code) Group: None Status: Open Resolution: None Priority: 5 Submitted By: Walter Dörwald (doerwalter) Assigned to: Nobody/Anonymous (nobody) Summary: unicode encoding error callbacks Initial Comment: This patch adds unicode error handling callbacks to the encode functionality. 
With this patch it's possible to not only pass 'strict', 'ignore' or 'replace' as the errors argument to encode, but also a callable function, that will be called with the encoding name, the original unicode object and the position of the unencodable character. The callback must return a replacement unicode object that will be encoded instead of the original character. For example replacing unencodable characters with XML character references can be done in the following way. u"aäoöuüß".encode( "ascii", lambda enc, uni, pos: u"&#x%x;" % ord(uni[pos]) ) ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=432401&group_id=5470 From noreply@sourceforge.net Tue Jun 12 15:29:55 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Tue, 12 Jun 2001 07:29:55 -0700 Subject: [Patches] [ python-Patches-432401 ] unicode encoding error callbacks Message-ID: Patches item #432401, was updated on 2001-06-12 06:43 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=432401&group_id=5470 Category: core (C code) Group: None Status: Open Resolution: None Priority: 5 Submitted By: Walter Dörwald (doerwalter) >Assigned to: M.-A. Lemburg (lemburg) Summary: unicode encoding error callbacks Initial Comment: This patch adds unicode error handling callbacks to the encode functionality. With this patch it's possible to not only pass 'strict', 'ignore' or 'replace' as the errors argument to encode, but also a callable function, that will be called with the encoding name, the original unicode object and the position of the unencodable character. The callback must return a replacement unicode object that will be encoded instead of the original character. For example replacing unencodable characters with XML character references can be done in the following way. 
u"aäoöuüß".encode( "ascii", lambda enc, uni, pos: u"&#x%x;" % ord(uni[pos]) ) ---------------------------------------------------------------------- >Comment By: M.-A. Lemburg (lemburg) Date: 2001-06-12 07:29 Message: Logged In: YES user_id=38388 Thanks for the patch -- it looks very impressive !. I'll give it a try later this week. Some first cosmetic tidbits: * please don't place more than one C statement on one line like in: """ + unicode = unicode2; unicodepos = unicode2pos; + unicode2 = NULL; unicode2pos = 0; """ * Comments should start with a capital letter and be prepended to the section they apply to * There should be spaces between arguments in compares (a == b) not (a==b) * Where does the name "...Encode121" originate ? * module internal APIs should use lower case names (you converted some of these to PyUnicode_...() -- this is normally reserved for APIs which are either marked as potential candidates for the public API or are very prominent in the code) One thing which I don't like about your API change is that you removed the Py_UNICODE*data, int size style arguments -- this makes it impossible to use the new APIs on non-Python data or data which is not available as Unicode object. Please separate the errors.c patch from this patch -- it seems totally unrelated to Unicode. Thanks. 
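For comparison, the XML character reference example from the initial comment can be sketched with codecs.register_error() from the standard library of current Python. Note the assumptions: this is a different interface from the one proposed in this patch (a named handler that receives an exception object, instead of a callable passed directly as the errors argument), and the handler name "example.xmlcharref" is invented for the illustration.

```python
import codecs

def xml_charref(exc):
    """Replace the unencodable characters with hex character references."""
    if not isinstance(exc, UnicodeEncodeError):
        raise exc
    # exc.object is the original string; exc.start:exc.end is the bad run.
    bad = exc.object[exc.start:exc.end]
    replacement = "".join("&#x%x;" % ord(ch) for ch in bad)
    # Return the replacement text and the position at which to resume.
    return replacement, exc.end

# "example.xmlcharref" is a made-up name used only in this sketch.
codecs.register_error("example.xmlcharref", xml_charref)

encoded = "aäoöuüß".encode("ascii", "example.xmlcharref")
```

The handler still sees the original object and the position of the failure, as the callback in the initial comment does, but the encoding name arrives as an attribute of the exception rather than as a separate argument.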
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=432401&group_id=5470 From noreply@sourceforge.net Tue Jun 12 17:32:34 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Tue, 12 Jun 2001 09:32:34 -0700 Subject: [Patches] [ python-Patches-432401 ] unicode encoding error callbacks Message-ID: Patches item #432401, was updated on 2001-06-12 06:43 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=432401&group_id=5470 Category: core (C code) Group: None Status: Open Resolution: None Priority: 5 Submitted By: Walter Dörwald (doerwalter) Assigned to: M.-A. Lemburg (lemburg) Summary: unicode encoding error callbacks Initial Comment: This patch adds unicode error handling callbacks to the encode functionality. With this patch it's possible to not only pass 'strict', 'ignore' or 'replace' as the errors argument to encode, but also a callable function, that will be called with the encoding name, the original unicode object and the position of the unencodable character. The callback must return a replacement unicode object that will be encoded instead of the original character. For example replacing unencodable characters with XML character references can be done in the following way. u"aäoöuüß".encode( "ascii", lambda enc, uni, pos: u"&#x%x;" % ord(uni[pos]) ) ---------------------------------------------------------------------- >Comment By: Walter Dörwald (doerwalter) Date: 2001-06-12 09:32 Message: Logged In: YES user_id=89016 > * please don't place more than one C statement on one line > like in: > """ > + unicode = unicode2; unicodepos = > unicode2pos; > + unicode2 = NULL; unicode2pos = 0; > """ OK, done! > * Comments should start with a capital letter and be > prepended > to the section they apply to Fixed! > * There should be spaces between arguments in compares > (a == b) not (a==b) Fixed! 
> * Where does the name "...Encode121" originate ? encode one-to-one, it implements both ASCII and latin-1 encoding. > * module internal APIs should use lower case names (you > converted some of these to PyUnicode_...() -- this is > normally reserved for APIs which are either marked as > potential candidates for the public API or are very > prominent in the code) Which ones? I introduced a new function for every old one, that had a "const char *errors" argument, and a few new ones in codecs.h, of those PyCodec_EncodeHandlerForObject is vital, because it is used to map for old string arguments to the new function objects. PyCodec_RaiseEncodeErrors can be used in the encoder implementation to raise an encode error, but it could be made static in unicodeobject.h so only those encoders implemented there have access to it. > One thing which I don't like about your API change is that > you removed the Py_UNICODE*data, int size style arguments > -- > this makes it impossible to use the new APIs on non-Python > data or data which is not available as Unicode object. I look through the code and found no situation where the Py_UNICODE*/int version is really used and having two (PyObject *)s (the original and the replacement string), instead of UNICODE*/int and PyObject * made the implementation a little easier, but I can fix that. > Please separate the errors.c patch from this patch -- it > seems totally unrelated to Unicode. PyCodec_RaiseEncodeErrors uses this the have a \Uxxxx with four hex digits. I removed it. I'll upload a revised patch as soon as it's done. ---------------------------------------------------------------------- Comment By: M.-A. Lemburg (lemburg) Date: 2001-06-12 07:29 Message: Logged In: YES user_id=38388 Thanks for the patch -- it looks very impressive !. I'll give it a try later this week. 
Some first cosmetic tidbits: * please don't place more than one C statement on one line like in: """ + unicode = unicode2; unicodepos = unicode2pos; + unicode2 = NULL; unicode2pos = 0; """ * Comments should start with a capital letter and be prepended to the section they apply to * There should be spaces between arguments in compares (a == b) not (a==b) * Where does the name "...Encode121" originate ? * module internal APIs should use lower case names (you converted some of these to PyUnicode_...() -- this is normally reserved for APIs which are either marked as potential candidates for the public API or are very prominent in the code) One thing which I don't like about your API change is that you removed the Py_UNICODE*data, int size style arguments -- this makes it impossible to use the new APIs on non-Python data or data which is not available as Unicode object. Please separate the errors.c patch from this patch -- it seems totally unrelated to Unicode. Thanks. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=432401&group_id=5470 From noreply@sourceforge.net Tue Jun 12 17:39:10 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Tue, 12 Jun 2001 09:39:10 -0700 Subject: [Patches] [ python-Patches-432457 ] Readline 4.2 Patch Message-ID: Patches item #432457, was updated on 2001-06-12 09:39 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=432457&group_id=5470 Category: Build Group: None Status: Open Resolution: None Priority: 5 Submitted By: Jason Tishler (jlt63) Assigned to: Nobody/Anonymous (nobody) Summary: Readline 4.2 Patch Initial Comment: This patch enables the Python readline module to build cleanly against readline 4.2. Specifically, it configures Python to use either completion_matches() or rl_completion_matches() as appropriate. 
This is necessary due to the deprecation of completion_matches() (and the other functions defined in compat.c) in readline 4.2. In this case, deprecated means no longer declared in readline.h but still defined in the readline library (e.g. libreadline.so). Although this patch is currently only necessary for Cygwin, it eventually will be needed by the other platforms when completion_matches() is finally removed from readline (e.g., 4.3). I tested this patch under the following environments: Linux with readline 2.2.1 Linux with readline 4.2 Cygwin with readline 4.1 Cygwin with readline 4.2 and it functioned as expected. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=432457&group_id=5470 From noreply@sourceforge.net Tue Jun 12 17:51:48 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Tue, 12 Jun 2001 09:51:48 -0700 Subject: [Patches] [ python-Patches-400938 ] [Draft] libpython as shared library (.so) on Linux Message-ID: Patches item #400938, was updated on 2000-07-19 13:55 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=400938&group_id=5470 Category: None Group: None >Status: Open Resolution: Out of Date Priority: 5 Submitted By: Gregor Hoffleit (flight) Assigned to: Neil Schemenauer (nascheme) Summary: [Draft] libpython as shared library (.so) on Linux Initial Comment: ---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2001-06-12 09:51 Message: Logged In: YES user_id=6380 Reopening -- this keeps being requested. Now we're just waiting for someone to produce a working patch. Or is there one already? ---------------------------------------------------------------------- Comment By: Neil Schemenauer (nascheme) Date: 2001-03-21 15:59 Message: Logged In: YES user_id=35752 We're going to have to create a new patch to do this. 
This one is way too out of date. Maybe for 2.2. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2001-01-19 14:46 Message: I'm reassigning this to Neil. Neil, can you see if you can integrate this into your flat Makefile? ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2001-01-17 15:09 Message: Andrew, I'm tentatively reassigning this to you, since you're taking charge of the build process at the moment (setup.py). I suspect that the patch no longer works as is -- would it make sense to mark it postponed and get the author to submit a new version before we release 2.1a1? ---------------------------------------------------------------------- Comment By: Nobody/Anonymous (nobody) Date: 2001-01-17 14:46 Message: Getting this patch into the next version of Python would be "A Good Thing"(tm) in my opinion. We use libpython as a .so at ILM and end up having to make changes like this by hand every time we get a new version... ---------------------------------------------------------------------- Comment By: Moshe Zadka (moshez) Date: 2000-11-01 03:32 Message: I've had a look at the patch, and it seems it has two orthogonal parts. One is adding the infrastructure for compiling another version for the Python library, which can be more or less integrated as-is, and one is hard-coding the particular way, in Linux, of building shared objects. Since we discover how to build shared objects in the configure script anyway (otherwise we could not have built modules as shared objects), we should embed that information there, not the Linux flags. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2000-10-26 14:13 Message: Let's give this to Jeremy instead, because he seems to know more about build issues. 
Jeremy, it would be good to look into getting this to work with your RPM suite. Flight's argument (has been used without complaints in Debian Python 1.5.2 since 1999) is good. ---------------------------------------------------------------------- Comment By: Jeremy Hylton (jhylton) Date: 2000-08-23 09:26 Message: In the absence of anyone arguing for inclusion of this patch and a one-week idle period, it is postponed. ---------------------------------------------------------------------- Comment By: Moshe Zadka (moshez) Date: 2000-08-16 00:40 Message: I suggest we postpone it. It isn't really complete (only works on real distributions ), and the complete solution should work on all unices. If Tcl/Perl can do it, there is no reason Python can't -- and a half hearted solution isn't that good. flight, you should use this for the Python in woody in the mean time -- I doubt woody will be stable before Python 2.1 comes out, so 2.1 sounds like a good timeframe to do it. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2000-08-15 10:52 Message: Assigned to Barry because he's a Linux weenie. Barry, if you think there's something here that should go into 2.0, please pursue it now, else change the status to Postponed. ---------------------------------------------------------------------- Comment By: Gregor Hoffleit (flight) Date: 2000-07-19 14:10 Message: This is what it used in product to build libpython as shared library(.so) for Debian. Note: This patch is not ready for inclusion in the upstream Python distribution. Anyway, I think this might be a start. The Python 1.5 executable in Debian GNU/Linux is built against a shared libpython1.5.so since April 1999, and I haven't yet heard about any problems. Using a shared library should have an advantage if you're running multiple instances of Python (be it standalone interpreter or embedded applications). 
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=400938&group_id=5470 From noreply@sourceforge.net Tue Jun 12 17:54:10 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Tue, 12 Jun 2001 09:54:10 -0700 Subject: [Patches] [ python-Patches-400938 ] [Draft] libpython as shared library (.so) on Linux Message-ID: Patches item #400938, was updated on 2000-07-19 13:55 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=400938&group_id=5470 Category: None Group: None Status: Open Resolution: Out of Date Priority: 5 Submitted By: Gregor Hoffleit (flight) >Assigned to: Nobody/Anonymous (nobody) Summary: [Draft] libpython as shared library (.so) on Linux Initial Comment: ---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2001-06-12 09:54 Message: Logged In: YES user_id=6380 Reopening -- this keeps being requested. Now we're just waiting for someone to produce a working patch. Or is there one already? ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2001-06-12 09:51 Message: Logged In: YES user_id=6380 Reopening -- this keeps being requested. Now we're just waiting for someone to produce a working patch. Or is there one already? ---------------------------------------------------------------------- Comment By: Neil Schemenauer (nascheme) Date: 2001-03-21 15:59 Message: Logged In: YES user_id=35752 We're going to have to create a new patch to do this. This one is way too out of date. Maybe for 2.2. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2001-01-19 14:46 Message: I'm reassigning this to Neil. Neil, can you see if you can integrate this into your flat Makefile? 
---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2001-01-17 15:09 Message: Andrew, I'm tentatively reassigning this to you, since you're taking charge of the build process at the moment (setup.py). I suspect that the patch no longer works as is -- would it make sense to mark it postponed and get the author to submit a new version before we release 2.1a1? ---------------------------------------------------------------------- Comment By: Nobody/Anonymous (nobody) Date: 2001-01-17 14:46 Message: Getting this patch into the next version of Python would be "A Good Thing"(tm) in my opinion. We use libpython as a .so at ILM and end up having to make changes like this by hand every time we get a new version... ---------------------------------------------------------------------- Comment By: Moshe Zadka (moshez) Date: 2000-11-01 03:32 Message: I've had a look at the patch, and it seems it has two orthogonal parts. One is adding the infrastructure for compiling another version for the Python library, which can be more or less integrated as-is, and one is hard-coding the particular way, in Linux, of building shared objects. Since we discover how to build shared objects in the configure script anyway (otherwise we could not have built modules as shared objects), we should embed that information there, not the Linux flags. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2000-10-26 14:13 Message: Let's give this to Jeremy instead, because he seems to know more about build issues. Jeremy, it would be good to look into getting this to work with your RPM suite. Flight's argument (has been used without complaints in Debian Python 1.5.2 since 1999) is good. 
---------------------------------------------------------------------- Comment By: Jeremy Hylton (jhylton) Date: 2000-08-23 09:26 Message: In the absence of anyone arguing for inclusion of this patch and a one-week idle period, it is postponed. ---------------------------------------------------------------------- Comment By: Moshe Zadka (moshez) Date: 2000-08-16 00:40 Message: I suggest we postpone it. It isn't really complete (only works on real distributions ), and the complete solution should work on all unices. If Tcl/Perl can do it, there is no reason Python can't -- and a half hearted solution isn't that good. flight, you should use this for the Python in woody in the mean time -- I doubt woody will be stable before Python 2.1 comes out, so 2.1 sounds like a good timeframe to do it. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2000-08-15 10:52 Message: Assigned to Barry because he's a Linux weenie. Barry, if you think there's something here that should go into 2.0, please pursue it now, else change the status to Postponed. ---------------------------------------------------------------------- Comment By: Gregor Hoffleit (flight) Date: 2000-07-19 14:10 Message: This is what it used in product to build libpython as shared library(.so) for Debian. Note: This patch is not ready for inclusion in the upstream Python distribution. Anyway, I think this might be a start. The Python 1.5 executable in Debian GNU/Linux is built against a shared libpython1.5.so since April 1999, and I haven't yet heard about any problems. Using a shared library should have an advantage if you're running multiple instances of Python (be it standalone interpreter or embedded applications). 
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=400938&group_id=5470 From noreply@sourceforge.net Tue Jun 12 17:56:44 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Tue, 12 Jun 2001 09:56:44 -0700 Subject: [Patches] [ python-Patches-432401 ] unicode encoding error callbacks Message-ID: Patches item #432401, was updated on 2001-06-12 06:43 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=432401&group_id=5470 Category: core (C code) Group: None Status: Open Resolution: None Priority: 5 Submitted By: Walter Dörwald (doerwalter) Assigned to: M.-A. Lemburg (lemburg) Summary: unicode encoding error callbacks Initial Comment: This patch adds unicode error handling callbacks to the encode functionality. With this patch it's possible to not only pass 'strict', 'ignore' or 'replace' as the errors argument to encode, but also a callable function, that will be called with the encoding name, the original unicode object and the position of the unencodable character. The callback must return a replacement unicode object that will be encoded instead of the original character. For example replacing unencodable characters with XML character references can be done in the following way. u"aäoöuüß".encode( "ascii", lambda enc, uni, pos: u"&#x%x;" % ord(uni[pos]) ) ---------------------------------------------------------------------- >Comment By: Walter Dörwald (doerwalter) Date: 2001-06-12 09:56 Message: Logged In: YES user_id=89016 > One thing which I don't like about your API change is that > you removed the Py_UNICODE*data, int size style arguments > -- > this makes it impossible to use the new APIs on non-Python > data or data which is not available as Unicode object. 
Another problem is, that the callback requires a Python object, so in the PyObject *version, the refcount is incref'd and the object is passed to the callback. The Py_UNICODE*/int version would have to create a new Unicode object from the data. ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2001-06-12 09:32 Message: Logged In: YES user_id=89016 > * please don't place more than one C statement on one line > like in: > """ > + unicode = unicode2; unicodepos = > unicode2pos; > + unicode2 = NULL; unicode2pos = 0; > """ OK, done! > * Comments should start with a capital letter and be > prepended > to the section they apply to Fixed! > * There should be spaces between arguments in compares > (a == b) not (a==b) Fixed! > * Where does the name "...Encode121" originate ? encode one-to-one, it implements both ASCII and latin-1 encoding. > * module internal APIs should use lower case names (you > converted some of these to PyUnicode_...() -- this is > normally reserved for APIs which are either marked as > potential candidates for the public API or are very > prominent in the code) Which ones? I introduced a new function for every old one, that had a "const char *errors" argument, and a few new ones in codecs.h. Of those, PyCodec_EncodeHandlerForObject is vital, because it is used to map old string arguments to the new function objects. PyCodec_RaiseEncodeErrors can be used in the encoder implementation to raise an encode error, but it could be made static in unicodeobject.h so only those encoders implemented there have access to it. > One thing which I don't like about your API change is that > you removed the Py_UNICODE*data, int size style arguments > -- > this makes it impossible to use the new APIs on non-Python > data or data which is not available as Unicode object.
I looked through the code and found no situation where the Py_UNICODE*/int version is really used and having two (PyObject *)s (the original and the replacement string), instead of UNICODE*/int and PyObject * made the implementation a little easier, but I can fix that. > Please separate the errors.c patch from this patch -- it > seems totally unrelated to Unicode. PyCodec_RaiseEncodeErrors uses this to have a \Uxxxx with four hex digits. I removed it. I'll upload a revised patch as soon as it's done. ---------------------------------------------------------------------- Comment By: M.-A. Lemburg (lemburg) Date: 2001-06-12 07:29 Message: Logged In: YES user_id=38388 Thanks for the patch -- it looks very impressive! I'll give it a try later this week. Some first cosmetic tidbits: * please don't place more than one C statement on one line like in: """ + unicode = unicode2; unicodepos = unicode2pos; + unicode2 = NULL; unicode2pos = 0; """ * Comments should start with a capital letter and be prepended to the section they apply to * There should be spaces between arguments in compares (a == b) not (a==b) * Where does the name "...Encode121" originate ? * module internal APIs should use lower case names (you converted some of these to PyUnicode_...() -- this is normally reserved for APIs which are either marked as potential candidates for the public API or are very prominent in the code) One thing which I don't like about your API change is that you removed the Py_UNICODE*data, int size style arguments -- this makes it impossible to use the new APIs on non-Python data or data which is not available as Unicode object. Please separate the errors.c patch from this patch -- it seems totally unrelated to Unicode. Thanks.
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=432401&group_id=5470 From noreply@sourceforge.net Tue Jun 12 18:08:58 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Tue, 12 Jun 2001 10:08:58 -0700 Subject: [Patches] [ python-Patches-423394 ] Fix pulldom to preserve ns attributes Message-ID: Patches item #423394, was updated on 2001-05-11 11:04 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=423394&group_id=5470 Category: XML Group: None >Status: Closed >Resolution: Out of Date Priority: 5 Submitted By: Nobody/Anonymous (nobody) Assigned to: Martin v. Löwis (loewis) Summary: Fix pulldom to preserve ns attributes Initial Comment: Here is a fix for pulldom.py that preserves xmlns attributes that declare namespaces. The current pulldom / minidom captures xml namespace information in elements and attributes, but the actual namespace declaration attributes (xmlns:foo="...") are not preserved on the element where they appear. This breaks certain applications that do more complex name dereferencing (XMLSchema is an example), which requires not only namespace URIs but also the prefixes used and the original scope information. The current patch preserves xmlns="" and xmlns:foo="" as *non-namespace qualified* attributes, which appears to be the norm in other DOM implementations. Pls let me know if you have any questions. -Brian (brian@digicool.com) ---------------------------------------------------------------------- >Comment By: Martin v. Löwis (loewis) Date: 2001-06-12 10:08 Message: Logged In: YES user_id=21627 Superseded by #432117 ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2001-06-03 06:55 Message: Logged In: YES user_id=21627 The patch is a good idea, but I think it does not conform to the DOM recommendation.
In the DOM, the namespace URI "http://www.w3.org/2000/xmlns/" is used for attributes whose namespace prefix or qualified name is xmlns. In addition, the patch contains a typo: it should not say atetr_items. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=423394&group_id=5470 From noreply@sourceforge.net Tue Jun 12 18:09:16 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Tue, 12 Jun 2001 10:09:16 -0700 Subject: [Patches] [ python-Patches-432117 ] Updated PullDOM patch Message-ID: Patches item #432117, was updated on 2001-06-11 08:24 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=432117&group_id=5470 Category: XML Group: None Status: Open Resolution: None Priority: 5 Submitted By: Nobody/Anonymous (nobody) >Assigned to: Martin v. Löwis (loewis) Summary: Updated PullDOM patch Initial Comment: Martin, here is an updated patch (see # 423394) for pulldom.py that follows the DOM REC regarding namespace declaration attribute handling. In short: namespace declaration attributes are now preserved, and the namespaceURI of a namespace decl attribute is "http://www.w3.org/2000/xmlns/". The localName is the prefix to be mapped, unless it is a plain "xmlns" (default ns declaration), in which case the localName is just "xmlns". I've tested this with a Python 2.1 (final release) install. Let me know if you need anything else. Thanks! B.
Lloyd ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=432117&group_id=5470 From noreply@sourceforge.net Tue Jun 12 19:00:10 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Tue, 12 Jun 2001 11:00:10 -0700 Subject: [Patches] [ python-Patches-432401 ] unicode encoding error callbacks Message-ID: Patches item #432401, was updated on 2001-06-12 06:43 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=432401&group_id=5470 Category: core (C code) Group: None Status: Open Resolution: None Priority: 5 Submitted By: Walter Dörwald (doerwalter) Assigned to: M.-A. Lemburg (lemburg) Summary: unicode encoding error callbacks Initial Comment: This patch adds unicode error handling callbacks to the encode functionality. With this patch it's possible to not only pass 'strict', 'ignore' or 'replace' as the errors argument to encode, but also a callable function, that will be called with the encoding name, the original unicode object and the position of the unencodable character. The callback must return a replacement unicode object that will be encoded instead of the original character. For example replacing unencodable characters with XML character references can be done in the following way. u"aäoöuüß".encode( "ascii", lambda enc, uni, pos: u"&#x%x;" % ord(uni[pos]) ) ---------------------------------------------------------------------- >Comment By: M.-A. Lemburg (lemburg) Date: 2001-06-12 11:00 Message: Logged In: YES user_id=38388 About the Py_UNICODE*data, int size APIs: Ok, point taken. In general, I think we ought to keep the callback feature as open as possible, so passing in pointers and sizes would not be very useful. BTW, could you summarize how the callback works in a few lines ? 
About _Encode121: I'd name this _EncodeUCS1 since that's what it is ;-) About the new functions: I was referring to the new static functions which you gave PyUnicode_... names. If these are not supposed to turn into non-static functions, I'd rather have them use lower case names (since that's how the Python internals work too -- most of the times). ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2001-06-12 09:56 Message: Logged In: YES user_id=89016 > One thing which I don't like about your API change is that > you removed the Py_UNICODE*data, int size style arguments > -- > this makes it impossible to use the new APIs on non-Python > data or data which is not available as Unicode object. Another problem is, that the callback requires a Python object, so in the PyObject *version, the refcount is incref'd and the object is passed to the callback. The Py_UNICODE*/int version would have to create a new Unicode object from the data. ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2001-06-12 09:32 Message: Logged In: YES user_id=89016 > * please don't place more than one C statement on one line > like in: > """ > + unicode = unicode2; unicodepos = > unicode2pos; > + unicode2 = NULL; unicode2pos = 0; > """ OK, done! > * Comments should start with a capital letter and be > prepended > to the section they apply to Fixed! > * There should be spaces between arguments in compares > (a == b) not (a==b) Fixed! > * Where does the name "...Encode121" originate ? encode one-to-one, it implements both ASCII and latin-1 encoding. > * module internal APIs should use lower case names (you > converted some of these to PyUnicode_...() -- this is > normally reserved for APIs which are either marked as > potential candidates for the public API or are very > prominent in the code) Which ones? 
I introduced a new function for every old one, that had a "const char *errors" argument, and a few new ones in codecs.h. Of those, PyCodec_EncodeHandlerForObject is vital, because it is used to map old string arguments to the new function objects. PyCodec_RaiseEncodeErrors can be used in the encoder implementation to raise an encode error, but it could be made static in unicodeobject.h so only those encoders implemented there have access to it. > One thing which I don't like about your API change is that > you removed the Py_UNICODE*data, int size style arguments > -- > this makes it impossible to use the new APIs on non-Python > data or data which is not available as Unicode object. I looked through the code and found no situation where the Py_UNICODE*/int version is really used and having two (PyObject *)s (the original and the replacement string), instead of UNICODE*/int and PyObject * made the implementation a little easier, but I can fix that. > Please separate the errors.c patch from this patch -- it > seems totally unrelated to Unicode. PyCodec_RaiseEncodeErrors uses this to have a \Uxxxx with four hex digits. I removed it. I'll upload a revised patch as soon as it's done. ---------------------------------------------------------------------- Comment By: M.-A. Lemburg (lemburg) Date: 2001-06-12 07:29 Message: Logged In: YES user_id=38388 Thanks for the patch -- it looks very impressive! I'll give it a try later this week. Some first cosmetic tidbits: * please don't place more than one C statement on one line like in: """ + unicode = unicode2; unicodepos = unicode2pos; + unicode2 = NULL; unicode2pos = 0; """ * Comments should start with a capital letter and be prepended to the section they apply to * There should be spaces between arguments in compares (a == b) not (a==b) * Where does the name "...Encode121" originate ?
* module internal APIs should use lower case names (you converted some of these to PyUnicode_...() -- this is normally reserved for APIs which are either marked as potential candidates for the public API or are very prominent in the code) One thing which I don't like about your API change is that you removed the Py_UNICODE*data, int size style arguments -- this makes it impossible to use the new APIs on non-Python data or data which is not available as Unicode object. Please separate the errors.c patch from this patch -- it seems totally unrelated to Unicode. Thanks. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=432401&group_id=5470 From noreply@sourceforge.net Tue Jun 12 19:59:24 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Tue, 12 Jun 2001 11:59:24 -0700 Subject: [Patches] [ python-Patches-432401 ] unicode encoding error callbacks Message-ID: Patches item #432401, was updated on 2001-06-12 06:43 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=432401&group_id=5470 Category: core (C code) Group: None Status: Open Resolution: None Priority: 5 Submitted By: Walter Dörwald (doerwalter) Assigned to: M.-A. Lemburg (lemburg) Summary: unicode encoding error callbacks Initial Comment: This patch adds unicode error handling callbacks to the encode functionality. With this patch it's possible to not only pass 'strict', 'ignore' or 'replace' as the errors argument to encode, but also a callable function, that will be called with the encoding name, the original unicode object and the position of the unencodable character. The callback must return a replacement unicode object that will be encoded instead of the original character. For example replacing unencodable characters with XML character references can be done in the following way. 
u"aäoöuüß".encode( "ascii", lambda enc, uni, pos: u"&#x%x;" % ord(uni[pos]) ) ---------------------------------------------------------------------- >Comment By: Walter Dörwald (doerwalter) Date: 2001-06-12 11:59 Message: Logged In: YES user_id=89016 How the callbacks work: A PyObject * named errors is passed in. This may be NULL, Py_None, 'strict', u'strict', 'ignore', u'ignore', 'replace', u'replace' or a callable object. PyCodec_EncodeHandlerForObject maps all of these objects to one of the three builtin error callbacks PyCodec_RaiseEncodeErrors (raises an exception), PyCodec_IgnoreEncodeErrors (returns an empty replacement string, in effect ignoring the error), PyCodec_ReplaceEncodeErrors (returns U+FFFD, the Unicode replacement character to signify to the encoder that it should choose a suitable replacement character) or directly returns errors if it is a callable object. When an unencodable character is encountered the error handling callback will be called with the encoding name, the original unicode object and the error position and must return a unicode object that will be encoded instead of the offending character (or the callback may of course raise an exception). U+FFFD characters in the replacement string will be replaced with a character that the encoder chooses ('?' in all cases). The implementation of the loop through the string is done in the following way. A stack with two strings is kept and the loop always encodes a character from the string at the top of the stack. If an error is encountered and the stack has only one entry (during encoding of the original string) the callback is called and the unicode object returned is pushed on the stack, so the encoding continues with the replacement string. If the stack has two entries when an error is encountered, the replacement string itself has an unencodable character and a normal exception is raised.
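The two-entry stack loop described here can be sketched in Python as follows. This is only an illustration of the description, not the C code from the patch: encode_with_callback and the three handler functions are hypothetical names, the handlers use the (encoding, unicode object, position) signature given in this comment, and encoding one character at a time through str.encode is an assumption made to keep the sketch short.

```python
REPLACEMENT = "\ufffd"  # U+FFFD: "let the encoder choose a replacement"


def encode_with_callback(text, encoding, handler):
    """Encode text character by character; call handler(encoding, text, pos)
    for each character of the original string that cannot be encoded."""
    out = bytearray()
    # Stack of [string, position] pairs: the original string at the bottom
    # and at most one replacement string on top of it.
    stack = [[text, 0]]
    while stack:
        current, pos = stack[-1]
        if pos == len(current):
            # End of the current string: pop it. If it was a replacement,
            # encoding resumes in the original string; otherwise we are done.
            stack.pop()
            continue
        stack[-1][1] = pos + 1  # Advance before encoding, so a resume skips the bad character
        ch = current[pos]
        if len(stack) == 2 and ch == REPLACEMENT:
            out += b"?"  # Encoder-chosen substitute for U+FFFD
            continue
        try:
            out += ch.encode(encoding)
        except UnicodeEncodeError:
            if len(stack) == 2:
                raise  # The replacement string itself is unencodable
            stack.append([handler(encoding, current, pos), 0])
    return bytes(out)


# Hypothetical handlers mirroring the behaviours described above.
def ignore_errors(enc, uni, pos):
    return ""


def replace_errors(enc, uni, pos):
    return REPLACEMENT


def xml_charref(enc, uni, pos):
    return "&#x%x;" % ord(uni[pos])


print(encode_with_callback("aäoöuüß", "ascii", xml_charref))
```

Keeping the stack at a fixed depth of two means a callback can never trigger another callback, so a handler that returns unencodable text fails immediately instead of recursing.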
When the encoder has reached the end of its current string there are two possibilities: when the stack contains two entries, this was the replacement string, so the replacement string will be popped from the stack and encoding continues with the next character from the original string. If the stack had only one entry, encoding is finished. (I hope that's enough explanation of the API and implementation) I have renamed the static ...121 function to all lowercase names. BTW, I guess PyUnicode_EncodeUnicodeEscape could be reimplemented as PyUnicode_EncodeASCII with a \uxxxx replacement callback. PyCodec_RaiseEncodeErrors, PyCodec_IgnoreEncodeErrors, PyCodec_ReplaceEncodeErrors are globally visible because they have to be available in _codecsmodule.c to wrap them as Python function objects, but they can't be implemented in _codecsmodule, because they need to be available to the encoders in unicodeobject.c (through PyCodec_EncodeHandlerForObject), but importing the codecs module might result in an endless recursion, because importing a module requires unpickling of the bytecode, which might require decoding utf8, which ... (but this will only happen, if we implement the same mechanism for the decoding API) I have not touched PyUnicode_TranslateCharmap yet, should this function also support error callbacks? Why would one want to insert None into the mapping to call the callback? A remaining problem is how to implement decoding error callbacks. In Python 2.1 encoding and decoding errors are handled in the same way with a string value. But with callbacks it doesn't make sense to use the same callback for encoding and decoding (like codecs.StreamReaderWriter and codecs.StreamRecoder do). Decoding callbacks have a different API. Which arguments should be passed to the decoding callback, and what is the decoding callback supposed to do? ---------------------------------------------------------------------- Comment By: M.-A.
Lemburg (lemburg) Date: 2001-06-12 11:00 Message: Logged In: YES user_id=38388 About the Py_UNICODE*data, int size APIs: Ok, point taken. In general, I think we ought to keep the callback feature as open as possible, so passing in pointers and sizes would not be very useful. BTW, could you summarize how the callback works in a few lines ? About _Encode121: I'd name this _EncodeUCS1 since that's what it is ;-) About the new functions: I was referring to the new static functions which you gave PyUnicode_... names. If these are not supposed to turn into non-static functions, I'd rather have them use lower case names (since that's how the Python internals work too -- most of the times). ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2001-06-12 09:56 Message: Logged In: YES user_id=89016 > One thing which I don't like about your API change is that > you removed the Py_UNICODE*data, int size style arguments > -- > this makes it impossible to use the new APIs on non-Python > data or data which is not available as Unicode object. Another problem is, that the callback requires a Python object, so in the PyObject *version, the refcount is incref'd and the object is passed to the callback. The Py_UNICODE*/int version would have to create a new Unicode object from the data. ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2001-06-12 09:32 Message: Logged In: YES user_id=89016 > * please don't place more than one C statement on one line > like in: > """ > + unicode = unicode2; unicodepos = > unicode2pos; > + unicode2 = NULL; unicode2pos = 0; > """ OK, done! > * Comments should start with a capital letter and be > prepended > to the section they apply to Fixed! > * There should be spaces between arguments in compares > (a == b) not (a==b) Fixed! > * Where does the name "...Encode121" originate ? 
encode one-to-one, it implements both ASCII and latin-1 encoding. > * module internal APIs should use lower case names (you > converted some of these to PyUnicode_...() -- this is > normally reserved for APIs which are either marked as > potential candidates for the public API or are very > prominent in the code) Which ones? I introduced a new function for every old one, that had a "const char *errors" argument, and a few new ones in codecs.h, of those PyCodec_EncodeHandlerForObject is vital, because it is used to map old string arguments to the new function objects. PyCodec_RaiseEncodeErrors can be used in the encoder implementation to raise an encode error, but it could be made static in unicodeobject.h so only those encoders implemented there have access to it. > One thing which I don't like about your API change is that > you removed the Py_UNICODE*data, int size style arguments > -- > this makes it impossible to use the new APIs on non-Python > data or data which is not available as Unicode object. I looked through the code and found no situation where the Py_UNICODE*/int version is really used and having two (PyObject *)s (the original and the replacement string), instead of UNICODE*/int and PyObject * made the implementation a little easier, but I can fix that. > Please separate the errors.c patch from this patch -- it > seems totally unrelated to Unicode. PyCodec_RaiseEncodeErrors uses this to have a \Uxxxx with four hex digits. I removed it. I'll upload a revised patch as soon as it's done. ---------------------------------------------------------------------- Comment By: M.-A. Lemburg (lemburg) Date: 2001-06-12 07:29 Message: Logged In: YES user_id=38388 Thanks for the patch -- it looks very impressive !. I'll give it a try later this week.
Some first cosmetic tidbits: * please don't place more than one C statement on one line like in: """ + unicode = unicode2; unicodepos = unicode2pos; + unicode2 = NULL; unicode2pos = 0; """ * Comments should start with a capital letter and be prepended to the section they apply to * There should be spaces between arguments in compares (a == b) not (a==b) * Where does the name "...Encode121" originate ? * module internal APIs should use lower case names (you converted some of these to PyUnicode_...() -- this is normally reserved for APIs which are either marked as potential candidates for the public API or are very prominent in the code) One thing which I don't like about your API change is that you removed the Py_UNICODE*data, int size style arguments -- this makes it impossible to use the new APIs on non-Python data or data which is not available as Unicode object. Please separate the errors.c patch from this patch -- it seems totally unrelated to Unicode. Thanks. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=432401&group_id=5470 From noreply@sourceforge.net Tue Jun 12 20:18:57 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Tue, 12 Jun 2001 12:18:57 -0700 Subject: [Patches] [ python-Patches-432401 ] unicode encoding error callbacks Message-ID: Patches item #432401, was updated on 2001-06-12 06:43 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=432401&group_id=5470 Category: core (C code) Group: None Status: Open Resolution: None Priority: 5 Submitted By: Walter Dörwald (doerwalter) Assigned to: M.-A. Lemburg (lemburg) Summary: unicode encoding error callbacks Initial Comment: This patch adds unicode error handling callbacks to the encode functionality. 
With this patch it's possible to not only pass 'strict', 'ignore' or 'replace' as the errors argument to encode, but also a callable function, that will be called with the encoding name, the original unicode object and the position of the unencodable character. The callback must return a replacement unicode object that will be encoded instead of the original character. For example replacing unencodable characters with XML character references can be done in the following way. u"aäoöuüß".encode( "ascii", lambda enc, uni, pos: u"&#x%x;" % ord(uni[pos]) ) ---------------------------------------------------------------------- >Comment By: Walter Dörwald (doerwalter) Date: 2001-06-12 12:18 Message: Logged In: YES user_id=89016 One additional note: It is vital that errors is an assignable attribute of the StreamWriter. Consider the XML example: For writing an XML DOM tree one StreamWriter object is used. When a text node is written, the error handling has to be set to codecs.xmlreplace_encode_errors, but inside a comment or processing instruction replacing unencodable characters with charrefs is not possible, so here codecs.raise_encode_errors should be used (or better a custom error handler that raises an error that says "sorry, you can't have unencodable characters inside a comment") BTW, should we continue the discussion in the i18n SIG mailing list? An email program is much more comfortable than a HTML textarea! ;) ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2001-06-12 11:59 Message: Logged In: YES user_id=89016 How the callbacks work: A PyObject * named errors is passed in. This may be NULL, Py_None, 'strict', u'strict', 'ignore', u'ignore', 'replace', u'replace' or a callable object.
PyCodec_EncodeHandlerForObject maps all of these objects to one of the three builtin error callbacks PyCodec_RaiseEncodeErrors (raises an exception), PyCodec_IgnoreEncodeErrors (returns an empty replacement string, in effect ignoring the error), PyCodec_ReplaceEncodeErrors (returns U+FFFD, the Unicode replacement character to signify to the encoder that it should choose a suitable replacement character) or directly returns errors if it is a callable object. When an unencodable character is encountered the error handling callback will be called with the encoding name, the original unicode object and the error position and must return a unicode object that will be encoded instead of the offending character (or the callback may of course raise an exception). U+FFFD characters in the replacement string will be replaced with a character that the encoder chooses ('?' in all cases). The implementation of the loop through the string is done in the following way. A stack with two strings is kept and the loop always encodes a character from the string at the top of the stack. If an error is encountered and the stack has only one entry (during encoding of the original string) the callback is called and the unicode object returned is pushed on the stack, so the encoding continues with the replacement string. If the stack has two entries when an error is encountered, the replacement string itself has an unencodable character and a normal exception is raised. When the encoder has reached the end of its current string there are two possibilities: when the stack contains two entries, this was the replacement string, so the replacement string will be popped from the stack and encoding continues with the next character from the original string. If the stack had only one entry, encoding is finished. (I hope that's enough explanation of the API and implementation) I have renamed the static ...121 function to all lowercase names.
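The two-entry stack loop described in this comment can be sketched in Python. This is only an illustration, not the patch's C code: the helper name encode_with_callback is made up here, the callback signature (encoding name, original string, error position) is taken from the initial comment, and encoding one character at a time is an assumption that only suits stateless codecs such as ASCII or latin-1.

```python
def encode_with_callback(text, encoding, callback):
    """Encode text, calling callback(encoding, text, pos) for each
    character that the codec cannot encode (hypothetical helper)."""
    out = []
    # Each entry is [string, position]. The stack holds the original
    # string and, while one is being encoded, a replacement string.
    stack = [[text, 0]]
    while stack:
        current, pos = stack[-1]
        if pos == len(current):
            # End of the current string: pop it. If it was the
            # replacement, encoding resumes in the original string.
            stack.pop()
            continue
        stack[-1][1] = pos + 1
        char = current[pos]
        in_replacement = len(stack) == 2
        if in_replacement and char == "\ufffd":
            # U+FFFD in a replacement means "encoder, pick something".
            out.append(b"?")
            continue
        try:
            out.append(char.encode(encoding))
        except UnicodeEncodeError:
            if in_replacement:
                # The replacement itself is unencodable: plain error.
                raise
            stack.append([callback(encoding, text, pos), 0])
    return b"".join(out)


def xml_charref(enc, uni, pos):
    # Same idea as the lambda in the initial comment.
    return "&#x%x;" % ord(uni[pos])


result = encode_with_callback("aäoöuüß", "ascii", xml_charref)
print(result)
```

An "ignore" callback would return an empty string (pushed and popped at once), and a "replace" callback would return "\ufffd", which the loop turns into '?'.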
BTW, I guess PyUnicode_EncodeUnicodeEscape could be reimplemented as PyUnicode_EncodeASCII with a \uxxxx replacement callback. PyCodec_RaiseEncodeErrors, PyCodec_IgnoreEncodeErrors, PyCodec_ReplaceEncodeErrors are globally visible because they have to be available in _codecsmodule.c to wrap them as Python function objects, but they can't be implemented in _codecsmodule, because they need to be available to the encoders in unicodeobject.c (through PyCodec_EncodeHandlerForObject), but importing the codecs module might result in an endless recursion, because importing a module requires unpickling of the bytecode, which might require decoding utf8, which ... (but this will only happen, if we implement the same mechanism for the decoding API) I have not touched PyUnicode_TranslateCharmap yet, should this function also support error callbacks? Why would one want to insert None into the mapping to call the callback? A remaining problem is how to implement decoding error callbacks. In Python 2.1 encoding and decoding errors are handled in the same way with a string value. But with callbacks it doesn't make sense to use the same callback for encoding and decoding (like codecs.StreamReaderWriter and codecs.StreamRecoder do). Decoding callbacks have a different API. Which arguments should be passed to the decoding callback, and what is the decoding callback supposed to do? ---------------------------------------------------------------------- Comment By: M.-A. Lemburg (lemburg) Date: 2001-06-12 11:00 Message: Logged In: YES user_id=38388 About the Py_UNICODE*data, int size APIs: Ok, point taken. In general, I think we ought to keep the callback feature as open as possible, so passing in pointers and sizes would not be very useful. BTW, could you summarize how the callback works in a few lines ? About _Encode121: I'd name this _EncodeUCS1 since that's what it is ;-) About the new functions: I was referring to the new static functions which you gave PyUnicode_... names.
If these are not supposed to turn into non-static functions, I'd rather have them use lower case names (since that's how the Python internals work too -- most of the times). ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2001-06-12 09:56 Message: Logged In: YES user_id=89016 > One thing which I don't like about your API change is that > you removed the Py_UNICODE*data, int size style arguments > -- > this makes it impossible to use the new APIs on non-Python > data or data which is not available as Unicode object. Another problem is, that the callback requires a Python object, so in the PyObject *version, the refcount is incref'd and the object is passed to the callback. The Py_UNICODE*/int version would have to create a new Unicode object from the data. ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2001-06-12 09:32 Message: Logged In: YES user_id=89016 > * please don't place more than one C statement on one line > like in: > """ > + unicode = unicode2; unicodepos = > unicode2pos; > + unicode2 = NULL; unicode2pos = 0; > """ OK, done! > * Comments should start with a capital letter and be > prepended > to the section they apply to Fixed! > * There should be spaces between arguments in compares > (a == b) not (a==b) Fixed! > * Where does the name "...Encode121" originate ? encode one-to-one, it implements both ASCII and latin-1 encoding. > * module internal APIs should use lower case names (you > converted some of these to PyUnicode_...() -- this is > normally reserved for APIs which are either marked as > potential candidates for the public API or are very > prominent in the code) Which ones? 
I introduced a new function for every old one, that had a "const char *errors" argument, and a few new ones in codecs.h, of those PyCodec_EncodeHandlerForObject is vital, because it is used to map old string arguments to the new function objects. PyCodec_RaiseEncodeErrors can be used in the encoder implementation to raise an encode error, but it could be made static in unicodeobject.h so only those encoders implemented there have access to it. > One thing which I don't like about your API change is that > you removed the Py_UNICODE*data, int size style arguments > -- > this makes it impossible to use the new APIs on non-Python > data or data which is not available as Unicode object. I looked through the code and found no situation where the Py_UNICODE*/int version is really used and having two (PyObject *)s (the original and the replacement string), instead of UNICODE*/int and PyObject * made the implementation a little easier, but I can fix that. > Please separate the errors.c patch from this patch -- it > seems totally unrelated to Unicode. PyCodec_RaiseEncodeErrors uses this to have a \Uxxxx with four hex digits. I removed it. I'll upload a revised patch as soon as it's done. ---------------------------------------------------------------------- Comment By: M.-A. Lemburg (lemburg) Date: 2001-06-12 07:29 Message: Logged In: YES user_id=38388 Thanks for the patch -- it looks very impressive !. I'll give it a try later this week. Some first cosmetic tidbits: * please don't place more than one C statement on one line like in: """ + unicode = unicode2; unicodepos = unicode2pos; + unicode2 = NULL; unicode2pos = 0; """ * Comments should start with a capital letter and be prepended to the section they apply to * There should be spaces between arguments in compares (a == b) not (a==b) * Where does the name "...Encode121" originate ?
* module internal APIs should use lower case names (you converted some of these to PyUnicode_...() -- this is normally reserved for APIs which are either marked as potential candidates for the public API or are very prominent in the code) One thing which I don't like about your API change is that you removed the Py_UNICODE*data, int size style arguments -- this makes it impossible to use the new APIs on non-Python data or data which is not available as Unicode object. Please separate the errors.c patch from this patch -- it seems totally unrelated to Unicode. Thanks. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=432401&group_id=5470 From noreply@sourceforge.net Tue Jun 12 23:19:07 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Tue, 12 Jun 2001 15:19:07 -0700 Subject: [Patches] [ python-Patches-400938 ] [Draft] libpython as shared library (.so) on Linux Message-ID: Patches item #400938, was updated on 2000-07-19 13:55 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=400938&group_id=5470 Category: None Group: None Status: Open Resolution: Out of Date Priority: 5 Submitted By: Gregor Hoffleit (flight) Assigned to: Nobody/Anonymous (nobody) Summary: [Draft] libpython as shared library (.so) on Linux Initial Comment: ---------------------------------------------------------------------- >Comment By: Gregor Hoffleit (flight) Date: 2001-06-12 15:19 Message: Logged In: YES user_id=5293 > Now we're just waiting for someone to produce a working patch. > Or is there one already? I'm currently distributing experimental packages of Python 2.1 for Debian. The packages include a hack to build libpython2.1 as .so for Linux. The shared library patch currently is buried in a big diff file. 
You can get it as http://people.debian.org/~flight/python2/python2_2.1-0.diff.gz This is only a starting point for a real patch! ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2001-06-12 09:54 Message: Logged In: YES user_id=6380 Reopening -- this keeps being requested. Now we're just waiting for someone to produce a working patch. Or is there one already? ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2001-06-12 09:51 Message: Logged In: YES user_id=6380 Reopening -- this keeps being requested. Now we're just waiting for someone to produce a working patch. Or is there one already? ---------------------------------------------------------------------- Comment By: Neil Schemenauer (nascheme) Date: 2001-03-21 15:59 Message: Logged In: YES user_id=35752 We're going to have to create a new patch to do this. This one is way too out of date. Maybe for 2.2. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2001-01-19 14:46 Message: I'm reassigning this to Neil. Neil, can you see if you can integrate this into your flat Makefile? ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2001-01-17 15:09 Message: Andrew, I'm tentatively reassigning this to you, since you're taking charge of the build process at the moment (setup.py). I suspect that the patch no longer works as is -- would it make sense to mark it postponed and get the author to submit a new version before we release 2.1a1? ---------------------------------------------------------------------- Comment By: Nobody/Anonymous (nobody) Date: 2001-01-17 14:46 Message: Getting this patch into the next version of Python would be "A Good Thing"(tm) in my opinion. 
We use libpython as a .so at ILM and end up having to make changes like this by hand every time we get a new version... ---------------------------------------------------------------------- Comment By: Moshe Zadka (moshez) Date: 2000-11-01 03:32 Message: I've had a look at the patch, and it seems it has two orthogonal parts. One is adding the infrastructure for compiling another version for the Python library, which can be more or less integrated as-is, and one is hard-coding the particular way, in Linux, of building shared objects. Since we discover how to build shared objects in the configure script anyway (otherwise we could not have built modules as shared objects), we should embed that information there, not the Linux flags. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2000-10-26 14:13 Message: Let's give this to Jeremy instead, because he seems to know more about build issues. Jeremy, it would be good to look into getting this to work with your RPM suite. Flight's argument (has been used without complaints in Debian Python 1.5.2 since 1999) is good. ---------------------------------------------------------------------- Comment By: Jeremy Hylton (jhylton) Date: 2000-08-23 09:26 Message: In the absence of anyone arguing for inclusion of this patch and a one-week idle period, it is postponed. ---------------------------------------------------------------------- Comment By: Moshe Zadka (moshez) Date: 2000-08-16 00:40 Message: I suggest we postpone it. It isn't really complete (only works on real distributions ), and the complete solution should work on all unices. If Tcl/Perl can do it, there is no reason Python can't -- and a half hearted solution isn't that good. flight, you should use this for the Python in woody in the mean time -- I doubt woody will be stable before Python 2.1 comes out, so 2.1 sounds like a good timeframe to do it. 
---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2000-08-15 10:52 Message: Assigned to Barry because he's a Linux weenie. Barry, if you think there's something here that should go into 2.0, please pursue it now, else change the status to Postponed. ---------------------------------------------------------------------- Comment By: Gregor Hoffleit (flight) Date: 2000-07-19 14:10 Message: This is what it used in product to build libpython as shared library(.so) for Debian. Note: This patch is not ready for inclusion in the upstream Python distribution. Anyway, I think this might be a start. The Python 1.5 executable in Debian GNU/Linux is built against a shared libpython1.5.so since April 1999, and I haven't yet heard about any problems. Using a shared library should have an advantage if you're running multiple instances of Python (be it standalone interpreter or embedded applications). ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=400938&group_id=5470 From noreply@sourceforge.net Wed Jun 13 09:05:58 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Wed, 13 Jun 2001 01:05:58 -0700 Subject: [Patches] [ python-Patches-432401 ] unicode encoding error callbacks Message-ID: Patches item #432401, was updated on 2001-06-12 06:43 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=432401&group_id=5470 Category: core (C code) Group: None Status: Open Resolution: None Priority: 5 Submitted By: Walter Dörwald (doerwalter) Assigned to: M.-A. Lemburg (lemburg) Summary: unicode encoding error callbacks Initial Comment: This patch adds unicode error handling callbacks to the encode functionality. 
With this patch it's possible to not only pass 'strict', 'ignore' or 'replace' as the errors argument to encode, but also a callable function, that will be called with the encoding name, the original unicode object and the position of the unencodable character. The callback must return a replacement unicode object that will be encoded instead of the original character. For example replacing unencodable characters with XML character references can be done in the following way. u"aäoöuüß".encode( "ascii", lambda enc, uni, pos: u"&#x%x;" % ord(uni[pos]) ) ---------------------------------------------------------------------- >Comment By: M.-A. Lemburg (lemburg) Date: 2001-06-13 01:05 Message: Logged In: YES user_id=38388 > How the callbacks work: > > A PyObject * named errors is passed in. This may by NULL, > Py_None, 'strict', u'strict', 'ignore', u'ignore', > 'replace', u'replace' or a callable object. > PyCodec_EncodeHandlerForObject maps all of these objects to > one of the three builtin error callbacks > PyCodec_RaiseEncodeErrors (raises an exception), > PyCodec_IgnoreEncodeErrors (returns an empty replacement > string, in effect ignoring the error), > PyCodec_ReplaceEncodeErrors (returns U+FFFD, the Unicode > replacement character to signify to the encoder that it > should choose a suitable replacement character) or directly > returns errors if it is a callable object. When an > unencodable character is encounterd the error handling > callback will be called with the encoding name, the original > unicode object and the error position and must return a > unicode object that will be encoded instead of the offending > character (or the callback may of course raise an > exception). U+FFFD characters in the replacement string will > be replaced with a character that the encoder chooses ('?' > in all cases). Nice. > The implementation of the loop through the string is done in > the following way. 
A stack with two strings is kept and the > loop always encodes a character from the string at the > stacktop. If an error is encountered and the stack has only > one entry (during encoding of the original string) the > callback is called and the unicode object returned is pushed > on the stack, so the encoding continues with the replacement > string. If the stack has two entries when an error is > encountered, the replacement string itself has an > unencodable character and a normal exception raised. When > the encoder has reached the end of it's current string there > are two possibilities: when the stack contains two entries, > this was the replacement string, so the replacement string > will be poppep from the stack and encoding continues with > the next character from the original string. If the stack > had only one entry, encoding is finished. Very elegant solution ! > (I hope that's enough explanation of the API and implementation) Could you add these docs to the Misc/unicode.txt file ? I will eventually take that file and turn it into a PEP which will then serve as general documentation for these things. > I have renamed the static ...121 function to all lowercase > names. Ok. > BTW, I guess PyUnicode_EncodeUnicodeEscape could be > reimplemented as PyUnicode_EncodeASCII with a \uxxxx > replacement callback. Hmm, wouldn't that result in a slowdown ? If so, I'd rather leave the special encoder in place, since it is being used a lot in Python and probably some applications too. 
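For experimenting with the \uxxxx replacement idea quoted above, here is a minimal sketch. It assumes a Python version that provides codecs.register_error, where a handler is registered under a name and receives the exception object; that differs from the interface proposed in this patch, where a callable is passed directly as the errors argument. The handler name "uxxxx_escape" is made up for this example, and characters outside the BMP are not treated specially.

```python
import codecs


def uxxxx_escape(exc):
    """Replace every unencodable character with a \\uxxxx escape."""
    if not isinstance(exc, UnicodeEncodeError):
        # This sketch handles encoding errors only.
        raise exc
    # The codec may report a run of several unencodable characters.
    bad = exc.object[exc.start:exc.end]
    replacement = "".join("\\u%04x" % ord(ch) for ch in bad)
    # Return the replacement and the position where encoding resumes.
    return replacement, exc.end


codecs.register_error("uxxxx_escape", uxxxx_escape)

escaped = "aäoöuüß".encode("ascii", "uxxxx_escape")
print(escaped)
```

Whether this route is slower than a dedicated escape encoder, as asked above, would have to be measured, for example with the timeit module.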
> PyCodec_RaiseEncodeErrors, PyCodec_IgnoreEncodeErrors, > PyCodec_ReplaceEncodeErrors are globally visible because > they have to be available in _codecsmodule.c to wrap them as > Python function objects, but they can't be implemented in > _codecsmodule, because they need to be available to the > encoders in unicodeobject.c (through > PyCodec_EncodeHandlerForObject), but importing the codecs > module might result in an endless recursion, because > importing a module requires unpickling of the bytecode, > which might require decoding utf8, which ... (but this will > only happen, if we implement the same mechanism for the > decoding API) I think that codecs.c is the right place for these APIs. _codecsmodule.c is only meant as Python access wrapper for the internal codecs and nothing more. One thing I noted about the callbacks: they assume that they will always get Unicode objects as input. This is certainly not true in the general case (it is for the codecs you touch in the patch). I think it would be worthwhile to rename the callbacks to include "Unicode" somewhere, e.g. PyCodec_UnicodeReplaceEncodeErrors(). It's a long name, but then it points out the application field of the callback rather well. Same for the callbacks exposed through the _codecsmodule. > I have not touched PyUnicode_TranslateCharmap yet, > should this function also support error callbacks? Why would > one want the insert None into the mapping to call the callback? 1. Yes. 2. The user may want to e.g. restrict usage of certain character ranges. In this case the codec would be used to verify the input and an exception would indeed be useful (e.g. say you want to restrict input to Hangul + ASCII). > A remaining problem is how to implement decoding error > callbacks. In Python 2.1 encoding and decoding errors are > handled in the same way with a string value. 
But with > callbacks it doesn't make sense to use the same callback for > encoding and decoding (like codecs.StreamReaderWriter and > codecs.StreamRecoder do). Decoding callbacks have a > different API. Which arguments should be passed to the > decoding callback, and what is the decoding callback > supposed to do? I'd suggest adding another set of PyCodec_UnicodeDecode...() APIs for this. We'd then have to augment the base classes of the StreamCodecs to provide two attributes for .errors with a fallback solution for the string case (i.e. "strict" can still be used for both directions). > One additional note: It is vital that errors is an > assignable attribute of the StreamWriter. It is already ! > Consider the XML example: For writing an XML DOM tree one > StreamWriter object is used. When a text node is written, > the error handling has to be set to > codecs.xmlreplace_encode_errors, but inside a comment or > processing instruction replacing unencodable characters with > charrefs is not possible, so here codecs.raise_encode_errors > should be used (or better a custom error handler that raises > an error that says "sorry, you can't have unencodable > characters inside a comment") Sure. > BTW, should we continue the discussion in the i18n SIG > mailing list? An email program is much more comfortable than > a HTML textarea! ;) I'd rather keep the discussions on this patch here -- forking it off to the i18n sig will make it very hard to follow up on it. (This HTML area is indeed damn small ;-) ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2001-06-12 12:18 Message: Logged In: YES user_id=89016 One additional note: It is vital that errors is an assignable attribute of the StreamWriter. Consider the XML example: For writing an XML DOM tree one StreamWriter object is used.
When a text node is written, the error handling has to be set to codecs.xmlreplace_encode_errors, but inside a comment or processing instruction replacing unencodable characters with charrefs is not possible, so here codecs.raise_encode_errors should be used (or better a custom error handler that raises an error that says "sorry, you can't have unencodable characters inside a comment") BTW, should we continue the discussion in the i18n SIG mailing list? An email program is much more comfortable than a HTML textarea! ;) ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2001-06-12 11:59 Message: Logged In: YES user_id=89016 How the callbacks work: A PyObject * named errors is passed in. This may be NULL, Py_None, 'strict', u'strict', 'ignore', u'ignore', 'replace', u'replace' or a callable object. PyCodec_EncodeHandlerForObject maps all of these objects to one of the three builtin error callbacks PyCodec_RaiseEncodeErrors (raises an exception), PyCodec_IgnoreEncodeErrors (returns an empty replacement string, in effect ignoring the error), PyCodec_ReplaceEncodeErrors (returns U+FFFD, the Unicode replacement character to signify to the encoder that it should choose a suitable replacement character) or directly returns errors if it is a callable object. When an unencodable character is encountered the error handling callback will be called with the encoding name, the original unicode object and the error position and must return a unicode object that will be encoded instead of the offending character (or the callback may of course raise an exception). U+FFFD characters in the replacement string will be replaced with a character that the encoder chooses ('?' in all cases). The implementation of the loop through the string is done in the following way. A stack with two strings is kept and the loop always encodes a character from the string at the top of the stack.
If an error is encountered and the stack has only one entry (during encoding of the original string) the callback is called and the unicode object returned is pushed on the stack, so the encoding continues with the replacement string. If the stack has two entries when an error is encountered, the replacement string itself has an unencodable character and a normal exception is raised. When the encoder has reached the end of its current string there are two possibilities: when the stack contains two entries, this was the replacement string, so the replacement string will be popped from the stack and encoding continues with the next character from the original string. If the stack had only one entry, encoding is finished. (I hope that's enough explanation of the API and implementation) I have renamed the static ...121 function to all lowercase names. BTW, I guess PyUnicode_EncodeUnicodeEscape could be reimplemented as PyUnicode_EncodeASCII with a \uxxxx replacement callback. PyCodec_RaiseEncodeErrors, PyCodec_IgnoreEncodeErrors, PyCodec_ReplaceEncodeErrors are globally visible because they have to be available in _codecsmodule.c to wrap them as Python function objects, but they can't be implemented in _codecsmodule, because they need to be available to the encoders in unicodeobject.c (through PyCodec_EncodeHandlerForObject), but importing the codecs module might result in an endless recursion, because importing a module requires unpickling of the bytecode, which might require decoding utf8, which ... (but this will only happen, if we implement the same mechanism for the decoding API) I have not touched PyUnicode_TranslateCharmap yet, should this function also support error callbacks? Why would one want to insert None into the mapping to call the callback? A remaining problem is how to implement decoding error callbacks. In Python 2.1 encoding and decoding errors are handled in the same way with a string value.
But with callbacks it doesn't make sense to use the same callback for encoding and decoding (like codecs.StreamReaderWriter and codecs.StreamRecoder do). Decoding callbacks have a different API. Which arguments should be passed to the decoding callback, and what is the decoding callback supposed to do?

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2001-06-12 11:00

Message:
Logged In: YES
user_id=38388

About the Py_UNICODE*data, int size APIs:
Ok, point taken.

In general, I think we ought to keep the callback feature as open as possible, so passing in pointers and sizes would not be very useful.

BTW, could you summarize how the callback works in a few lines ?

About _Encode121: I'd name this _EncodeUCS1 since that's what it is ;-)

About the new functions: I was referring to the new static functions which you gave PyUnicode_... names. If these are not supposed to turn into non-static functions, I'd rather have them use lower case names (since that's how the Python internals work too -- most of the time).

----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2001-06-12 09:56

Message:
Logged In: YES
user_id=89016

> One thing which I don't like about your API change is that
> you removed the Py_UNICODE*data, int size style arguments --
> this makes it impossible to use the new APIs on non-Python
> data or data which is not available as Unicode object.

Another problem is that the callback requires a Python object, so in the PyObject * version, the refcount is incref'd and the object is passed to the callback. The Py_UNICODE*/int version would have to create a new Unicode object from the data.
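
The two-string stack loop described in the 2001-06-12 11:59 comment above can be sketched in Python roughly as follows. This is only an illustration under stated assumptions, not the patch's C code: the name encode_with_callback is hypothetical, characters are encoded one at a time (so it is only meaningful for stateless codecs such as ASCII or Latin-1), and U+FFFD inside a replacement string is mapped to '?' as that comment describes.

```python
def encode_with_callback(text, encoding, callback):
    """Hypothetical sketch of the two-entry stack loop (not the patch's C code).

    callback(encoding, text, pos) returns the replacement string for the
    unencodable character at text[pos], or raises an exception.
    """
    out = bytearray()
    # Each entry is [string, position]. Entry 0 is the original string;
    # entry 1, if present, is the replacement string being encoded.
    stack = [[text, 0]]
    while stack:
        current, pos = stack[-1]
        if pos == len(current):
            # End of the current string: pop it. If this was the replacement,
            # encoding resumes in the original string; otherwise we are done.
            stack.pop()
            continue
        stack[-1][1] = pos + 1
        char = current[pos]
        if len(stack) == 2 and char == "\ufffd":
            # U+FFFD in a replacement asks the encoder to pick a character.
            out += b"?"
            continue
        try:
            out += char.encode(encoding)
        except UnicodeEncodeError:
            if len(stack) == 2:
                # The replacement itself is unencodable: raise normally.
                raise
            stack.append([callback(encoding, current, pos), 0])
    return bytes(out)
```

With the XML character reference lambda from the initial comment, encode_with_callback(u"aäoöuüß", "ascii", lambda enc, uni, pos: u"&#x%x;" % ord(uni[pos])) walks the original string, switches to each replacement when an error occurs, and returns to the original string afterwards.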
----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2001-06-12 09:32

Message:
Logged In: YES
user_id=89016

> * please don't place more than one C statement on one line
> like in:
> """
> + unicode = unicode2; unicodepos = unicode2pos;
> + unicode2 = NULL; unicode2pos = 0;
> """

OK, done!

> * Comments should start with a capital letter and be prepended
> to the section they apply to

Fixed!

> * There should be spaces between arguments in compares
> (a == b) not (a==b)

Fixed!

> * Where does the name "...Encode121" originate ?

encode one-to-one, it implements both ASCII and latin-1 encoding.

> * module internal APIs should use lower case names (you
> converted some of these to PyUnicode_...() -- this is
> normally reserved for APIs which are either marked as
> potential candidates for the public API or are very
> prominent in the code)

Which ones? I introduced a new function for every old one that had a "const char *errors" argument, and a few new ones in codecs.h. Of those, PyCodec_EncodeHandlerForObject is vital, because it is used to map the old string arguments to the new function objects. PyCodec_RaiseEncodeErrors can be used in the encoder implementation to raise an encode error, but it could be made static in unicodeobject.h so only those encoders implemented there have access to it.

> One thing which I don't like about your API change is that
> you removed the Py_UNICODE*data, int size style arguments --
> this makes it impossible to use the new APIs on non-Python
> data or data which is not available as Unicode object.

I looked through the code and found no situation where the Py_UNICODE*/int version is really used, and having two (PyObject *)s (the original and the replacement string) instead of UNICODE*/int and PyObject * made the implementation a little easier, but I can fix that.

> Please separate the errors.c patch from this patch -- it
> seems totally unrelated to Unicode.
PyCodec_RaiseEncodeErrors uses this to have a \Uxxxx with four hex digits. I removed it.

I'll upload a revised patch as soon as it's done.

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2001-06-12 07:29

Message:
Logged In: YES
user_id=38388

Thanks for the patch -- it looks very impressive ! I'll give it a try later this week.

Some first cosmetic tidbits:

* please don't place more than one C statement on one line like in:
"""
+ unicode = unicode2; unicodepos = unicode2pos;
+ unicode2 = NULL; unicode2pos = 0;
"""

* Comments should start with a capital letter and be prepended to the section they apply to

* There should be spaces between arguments in compares (a == b) not (a==b)

* Where does the name "...Encode121" originate ?

* module internal APIs should use lower case names (you converted some of these to PyUnicode_...() -- this is normally reserved for APIs which are either marked as potential candidates for the public API or are very prominent in the code)

One thing which I don't like about your API change is that you removed the Py_UNICODE*data, int size style arguments -- this makes it impossible to use the new APIs on non-Python data or data which is not available as Unicode object.

Please separate the errors.c patch from this patch -- it seems totally unrelated to Unicode.

Thanks.
----------------------------------------------------------------------

You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=432401&group_id=5470

From noreply@sourceforge.net Wed Jun 13 14:57:07 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Wed, 13 Jun 2001 06:57:07 -0700
Subject: [Patches] [ python-Patches-432401 ] unicode encoding error callbacks
Message-ID:

Patches item #432401, was updated on 2001-06-12 06:43
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=432401&group_id=5470

Category: core (C code)
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Walter Dörwald (doerwalter)
Assigned to: M.-A. Lemburg (lemburg)
Summary: unicode encoding error callbacks

Initial Comment:
This patch adds unicode error handling callbacks to the encode functionality. With this patch it's possible to not only pass 'strict', 'ignore' or 'replace' as the errors argument to encode, but also a callable function, that will be called with the encoding name, the original unicode object and the position of the unencodable character. The callback must return a replacement unicode object that will be encoded instead of the original character.

For example replacing unencodable characters with XML character references can be done in the following way.

u"aäoöuüß".encode(
    "ascii",
    lambda enc, uni, pos: u"&#x%x;" % ord(uni[pos])
)

----------------------------------------------------------------------

>Comment By: Walter Dörwald (doerwalter)
Date: 2001-06-13 06:57

Message:
Logged In: YES
user_id=89016

> > [...]
> > raise an exception). U+FFFD characters in the replacement
> > string will be replaced with a character that the encoder
> > chooses ('?' in all cases).
>
> Nice.

But the special casing of U+FFFD makes the interface somewhat less clean than it could be. It was only done to be 100% backwards compatible.
With the original "replace" error handling the codec chose the replacement character. But as far as I can tell none of the codecs uses anything other than '?', so I guess we could change the replace handler to always return u'?'. This would make the implementation a little bit simpler, but the explanation of the callback feature *a lot* simpler. And if you still want to handle an unencodable U+FFFD, you can write a special callback for that, e.g.

def FFFDreplace(enc, uni, pos):
    if uni[pos] == u"\ufffd":
        return u"?"
    else:
        raise UnicodeError(...)

> > The implementation of the loop through the string is done
> > in the following way. A stack with two strings is kept
> > and the loop always encodes a character from the string
> > at the stacktop. If an error is encountered and the stack
> > has only one entry (during encoding of the original string)
> > the callback is called and the unicode object returned is
> > pushed on the stack, so the encoding continues with the
> > replacement string. If the stack has two entries when an
> > error is encountered, the replacement string itself has
> > an unencodable character and a normal exception is raised.
> > When the encoder has reached the end of its current string
> > there are two possibilities: when the stack contains two
> > entries, this was the replacement string, so the replacement
> > string will be popped from the stack and encoding continues
> > with the next character from the original string. If the
> > stack had only one entry, encoding is finished.
>
> Very elegant solution !

I'll put it as a comment in the source.

> > (I hope that's enough explanation of the API and
> > implementation)
>
> Could you add these docs to the Misc/unicode.txt file ? I
> will eventually take that file and turn it into a PEP which
> will then serve as general documentation for these things.

I could, but first we should work out how the decoding callback API will work.

> > I have renamed the static ...121 function to all lowercase
> > names.
>
> Ok.
>
> > BTW, I guess PyUnicode_EncodeUnicodeEscape could be
> > reimplemented as PyUnicode_EncodeASCII with a \uxxxx
> > replacement callback.
>
> Hmm, wouldn't that result in a slowdown ? If so, I'd rather
> leave the special encoder in place, since it is being used a
> lot in Python and probably some applications too.

It would be a slowdown. But callbacks open many possibilities.

For example:

   Why can't I print u"gürk"?

is probably one of the most frequently asked questions in comp.lang.python. For printing Unicode stuff, print could be extended to use an error handling callback for Unicode strings (or objects where __str__ or tp_str returns a Unicode object) instead of using str() which always returns an 8bit string and uses strict encoding. There might even be a sys.setprintencodehandler()/sys.getprintencodehandler().

> [...]
> I think it would be worthwhile to rename the callbacks to
> include "Unicode" somewhere, e.g.
> PyCodec_UnicodeReplaceEncodeErrors(). It's a long name, but
> then it points out the application field of the callback
> rather well. Same for the callbacks exposed through the
> _codecsmodule.

OK, done (and PyCodec_XMLCharRefReplaceUnicodeEncodeErrors really is a long name ;))

> > I have not touched PyUnicode_TranslateCharmap yet,
> > should this function also support error callbacks? Why
> > would one want to insert None into the mapping to call
> > the callback?
>
> 1. Yes.
> 2. The user may want to e.g. restrict usage of certain
> character ranges. In this case the codec would be used to
> verify the input and an exception would indeed be useful
> (e.g. say you want to restrict input to Hangul + ASCII).

OK, do we want TranslateCharmap to work exactly like encoding, i.e. in case of an error should the returned replacement string again be mapped through the translation mapping or should it be copied to the output directly?
The former would be more in line with encoding, but IMHO the latter would be much more useful.

BTW, when I implement it I can implement patch #403100 ("Multicharacter replacements in PyUnicode_TranslateCharmap") along the way.

Should the old TranslateCharmap map to the new TranslateCharmapEx and inherit the "multicharacter replacement" feature, or should I leave it as it is?

> > A remaining problem is how to implement decoding error
> > callbacks. In Python 2.1 encoding and decoding errors are
> > handled in the same way with a string value. But with
> > callbacks it doesn't make sense to use the same callback
> > for encoding and decoding (like codecs.StreamReaderWriter
> > and codecs.StreamRecoder do). Decoding callbacks have a
> > different API. Which arguments should be passed to the
> > decoding callback, and what is the decoding callback
> > supposed to do?
>
> I'd suggest adding another set of PyCodec_UnicodeDecode...()
> APIs for this. We'd then have to augment the base classes of
> the StreamCodecs to provide two attributes for .errors with
> a fallback solution for the string case (i.e. "strict" can
> still be used for both directions).

Sounds good. Now what is the decoding callback supposed to do? I guess it will be called in the same way as the encoding callback, i.e. with encoding name, original string and position of the error. It might return a Unicode string (i.e. an object of the decoding target type) that will be emitted from the codec instead of the one offending byte. Or it might return a tuple with replacement Unicode object and a resynchronisation offset, i.e. returning (u"?", 1) means emit a '?' and skip the offending character. But to make the offset really useful the callback has to know something about the encoding, so perhaps the codec should be allowed to pass an additional state object to the callback?

Maybe the same should be added to the encoding callbacks too?
Maybe the encoding callback should be able to tell the encoder if the replacement returned should be reencoded (in which case it's a Unicode object), or directly emitted (in which case it's an 8bit string)?

> > One additional note: It is vital that errors is an
> > assignable attribute of the StreamWriter.
>
> It is already !

I know, but IMHO it should be documented that an assignable errors attribute must be supported as part of the official codec API.

Misc/unicode.txt is not clear on that:
"""
It is not required by the Unicode implementation to use these base classes, only the interfaces must match; this allows writing Codecs as extension types.
"""

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2001-06-13 01:05

Message:
Logged In: YES
user_id=38388

> How the callbacks work:
>
> A PyObject * named errors is passed in. This may be NULL,
> Py_None, 'strict', u'strict', 'ignore', u'ignore',
> 'replace', u'replace' or a callable object.
> PyCodec_EncodeHandlerForObject maps all of these objects to
> one of the three builtin error callbacks
> PyCodec_RaiseEncodeErrors (raises an exception),
> PyCodec_IgnoreEncodeErrors (returns an empty replacement
> string, in effect ignoring the error),
> PyCodec_ReplaceEncodeErrors (returns U+FFFD, the Unicode
> replacement character to signify to the encoder that it
> should choose a suitable replacement character) or directly
> returns errors if it is a callable object. When an
> unencodable character is encountered the error handling
> callback will be called with the encoding name, the original
> unicode object and the error position and must return a
> unicode object that will be encoded instead of the offending
> character (or the callback may of course raise an
> exception). U+FFFD characters in the replacement string will
> be replaced with a character that the encoder chooses ('?'
> in all cases).

Nice.
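
The mapping that the quoted summary attributes to PyCodec_EncodeHandlerForObject can be sketched in Python as below. All names here (encode_handler_for_object and the three handler functions) are hypothetical Python stand-ins for the C functions named in the thread, and the behaviour for an unknown string name is an assumption, since the thread does not say what the patch does in that case.

```python
def raise_encode_errors(encoding, text, pos):
    # Stand-in for PyCodec_RaiseEncodeErrors: always raises.
    raise UnicodeEncodeError(encoding, text, pos, pos + 1,
                             "unencodable character")


def ignore_encode_errors(encoding, text, pos):
    # Stand-in for PyCodec_IgnoreEncodeErrors: empty replacement.
    return ""


def replace_encode_errors(encoding, text, pos):
    # Stand-in for PyCodec_ReplaceEncodeErrors: U+FFFD tells the encoder
    # to choose a suitable replacement character itself.
    return "\ufffd"


_BUILTIN_HANDLERS = {
    "strict": raise_encode_errors,
    "ignore": ignore_encode_errors,
    "replace": replace_encode_errors,
}


def encode_handler_for_object(errors):
    """Map None, a handler name or a callable to an error callback."""
    if errors is None:
        # NULL and Py_None behave like 'strict'.
        return raise_encode_errors
    if isinstance(errors, str):
        if errors not in _BUILTIN_HANDLERS:
            # Assumption: unknown names are rejected.
            raise ValueError("unknown error handling name: %r" % (errors,))
        return _BUILTIN_HANDLERS[errors]
    if callable(errors):
        # A callable is returned unchanged and used directly.
        return errors
    raise TypeError("errors must be None, a string or a callable")
```

The point of the design is that the encoders only ever deal with a callable; the old string values become a thin compatibility layer on top.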

> The implementation of the loop through the string is done in
> the following way. A stack with two strings is kept and the
> loop always encodes a character from the string at the
> stacktop. If an error is encountered and the stack has only
> one entry (during encoding of the original string) the
> callback is called and the unicode object returned is pushed
> on the stack, so the encoding continues with the replacement
> string. If the stack has two entries when an error is
> encountered, the replacement string itself has an
> unencodable character and a normal exception is raised. When
> the encoder has reached the end of its current string there
> are two possibilities: when the stack contains two entries,
> this was the replacement string, so the replacement string
> will be popped from the stack and encoding continues with
> the next character from the original string. If the stack
> had only one entry, encoding is finished.

Very elegant solution !

> (I hope that's enough explanation of the API and implementation)

Could you add these docs to the Misc/unicode.txt file ? I will eventually take that file and turn it into a PEP which will then serve as general documentation for these things.

> I have renamed the static ...121 function to all lowercase
> names.

Ok.

> BTW, I guess PyUnicode_EncodeUnicodeEscape could be
> reimplemented as PyUnicode_EncodeASCII with a \uxxxx
> replacement callback.

Hmm, wouldn't that result in a slowdown ? If so, I'd rather leave the special encoder in place, since it is being used a lot in Python and probably some applications too.
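
The quoted idea of ASCII encoding plus a \uxxxx replacement callback can be illustrated with the codecs API of current Python 3, where a handler registered through codecs.register_error receives the exception object and returns a (replacement, resume position) tuple. That signature is not the (encoding, unicode, position) callback discussed in this thread, the handler name "uxxxx-escape" is hypothetical, and the result is not identical to the real unicode-escape codec, which also escapes backslashes and control characters. It is a sketch of the idea only.

```python
import codecs


def uxxxx_escape(exc):
    """Replace each unencodable character with a \\uXXXX escape."""
    if not isinstance(exc, UnicodeEncodeError):
        # This handler only knows how to deal with encoding errors.
        raise exc
    pieces = []
    # The codec may report a run of several unencodable characters at once.
    for char in exc.object[exc.start:exc.end]:
        code = ord(char)
        if code <= 0xFFFF:
            pieces.append("\\u%04x" % code)
        else:
            pieces.append("\\U%08x" % code)
    # Resume encoding right after the unencodable run.
    return "".join(pieces), exc.end


# "uxxxx-escape" is a made-up handler name for this example.
codecs.register_error("uxxxx-escape", uxxxx_escape)


def ascii_with_escapes(text):
    # Plain ASCII encoding; everything else goes through the callback.
    return text.encode("ascii", "uxxxx-escape")
```

Every unencodable character costs a Python-level call here, which is the kind of slowdown the reply above is concerned about.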

> PyCodec_RaiseEncodeErrors, PyCodec_IgnoreEncodeErrors,
> PyCodec_ReplaceEncodeErrors are globally visible because
> they have to be available in _codecsmodule.c to wrap them as
> Python function objects, but they can't be implemented in
> _codecsmodule, because they need to be available to the
> encoders in unicodeobject.c (through
> PyCodec_EncodeHandlerForObject), but importing the codecs
> module might result in an endless recursion, because
> importing a module requires unpickling of the bytecode,
> which might require decoding utf8, which ... (but this will
> only happen if we implement the same mechanism for the
> decoding API)

I think that codecs.c is the right place for these APIs. _codecsmodule.c is only meant as Python access wrapper for the internal codecs and nothing more.

One thing I noted about the callbacks: they assume that they will always get Unicode objects as input. This is certainly not true in the general case (it is for the codecs you touch in the patch).

I think it would be worthwhile to rename the callbacks to include "Unicode" somewhere, e.g. PyCodec_UnicodeReplaceEncodeErrors(). It's a long name, but then it points out the application field of the callback rather well. Same for the callbacks exposed through the _codecsmodule.

> I have not touched PyUnicode_TranslateCharmap yet,
> should this function also support error callbacks? Why would
> one want to insert None into the mapping to call the callback?

1. Yes.
2. The user may want to e.g. restrict usage of certain character ranges. In this case the codec would be used to verify the input and an exception would indeed be useful (e.g. say you want to restrict input to Hangul + ASCII).

> A remaining problem is how to implement decoding error
> callbacks. In Python 2.1 encoding and decoding errors are
> handled in the same way with a string value. But with
> callbacks it doesn't make sense to use the same callback for
> encoding and decoding (like codecs.StreamReaderWriter and
> codecs.StreamRecoder do). Decoding callbacks have a
> different API. Which arguments should be passed to the
> decoding callback, and what is the decoding callback
> supposed to do?

I'd suggest adding another set of PyCodec_UnicodeDecode...() APIs for this. We'd then have to augment the base classes of the StreamCodecs to provide two attributes for .errors with a fallback solution for the string case (i.e. "strict" can still be used for both directions).

> One additional note: It is vital that errors is an
> assignable attribute of the StreamWriter.

It is already !

> Consider the XML example: For writing an XML DOM tree one
> StreamWriter object is used. When a text node is written,
> the error handling has to be set to
> codecs.xmlreplace_encode_errors, but inside a comment or
> processing instruction replacing unencodable characters with
> charrefs is not possible, so here codecs.raise_encode_errors
> should be used (or better a custom error handler that raises
> an error that says "sorry, you can't have unencodable
> characters inside a comment")

Sure.

> BTW, should we continue the discussion in the i18n SIG
> mailing list? An email program is much more comfortable than
> an HTML textarea! ;)

I'd rather keep the discussions on this patch here -- forking it off to the i18n sig will make it very hard to follow up on it. (This HTML area is indeed damn small ;-)

----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2001-06-12 12:18

Message:
Logged In: YES
user_id=89016

One additional note: It is vital that errors is an assignable attribute of the StreamWriter.

Consider the XML example: For writing an XML DOM tree one StreamWriter object is used.
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=432401&group_id=5470 From noreply@sourceforge.net Wed Jun 13 16:49:05 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Wed, 13 Jun 2001 08:49:05 -0700 Subject: [Patches] [ python-Patches-432401 ] unicode encoding error callbacks Message-ID: Patches item #432401, was updated on 2001-06-12 06:43 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=432401&group_id=5470 Category: core (C code) Group: None Status: Open Resolution: None Priority: 5 Submitted By: Walter Dörwald (doerwalter) Assigned to: M.-A. Lemburg (lemburg) Summary: unicode encoding error callbacks Initial Comment: This patch adds unicode error handling callbacks to the encode functionality. With this patch it's possible to not only pass 'strict', 'ignore' or 'replace' as the errors argument to encode, but also a callable function, that will be called with the encoding name, the original unicode object and the position of the unencodable character. The callback must return a replacement unicode object that will be encoded instead of the original character. For example replacing unencodable characters with XML character references can be done in the following way. u"aäoöuüß".encode( "ascii", lambda enc, uni, pos: u"&#x%x;" % ord(uni[pos]) ) ---------------------------------------------------------------------- >Comment By: Walter Dörwald (doerwalter) Date: 2001-06-13 08:49 Message: Logged In: YES user_id=89016 Guido van Rossum wrote in python-dev: > True, the "codec" pattern can be used for other > encodings than Unicode. But it seems to me that the > entire codecs architecture is rather strongly geared > towards en/decoding Unicode, and it's not clear > how well other codecs fit in this pattern (e.g. 
I > noticed that all the non-Unicode codecs ignore the > error handling parameter or assert that > it is set to 'strict'). I noticed that too. asserting that errors=='strict' would mean that the encoder is not able to deal in any other way with unencodable stuff than by raising an error. But that is not the problem here, because for zlib, base64, quopri, hex and uu encoding there can be no unencodable characters. The encoders can simply ignore the errors parameter. Should I remove the asserts from those codecs and change the docstrings accordingly, or will this be done separately? ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2001-06-13 06:57 Message: Logged In: YES user_id=89016 > > [...] > > raise an exception). U+FFFD characters in the replacement > > string will be replaced with a character that the encoder > > chooses ('?' in all cases). > > Nice. But the special casing of U+FFFD makes the interface somewhat less clean than it could be. It was only done to be 100% backwards compatible. With the original "replace" error handling the codec chose the replacement character. But as far as I can tell none of the codecs uses anything other than '?', so I guess we could change the replace handler to always return u'?'. This would make the implementation a little bit simpler, but the explanation of the callback feature *a lot* simpler. And if you still want to handle an unencodable U+FFFD, you can write a special callback for that, e.g. def FFFDreplace(enc, uni, pos): if uni[pos] == "\ufffd": return u"?" else: raise UnicodeError(...) > > The implementation of the loop through the string is done > > in the following way. A stack with two strings is kept > > and the loop always encodes a character from the string > > at the stacktop. 
If an error is encountered and the stack > > has only one entry (during encoding of the original string) > > the callback is called and the unicode object returned is > > pushed on the stack, so the encoding continues with the > > replacement string. If the stack has two entries when an > > error is encountered, the replacement string itself has > > an unencodable character and a normal exception is raised. > > When the encoder has reached the end of its current string > > there are two possibilities: when the stack contains two > > entries, this was the replacement string, so the replacement > > string will be popped from the stack and encoding continues > > with the next character from the original string. If the > > stack had only one entry, encoding is finished. > > Very elegant solution ! I'll put it as a comment in the source. > > (I hope that's enough explanation of the API and > implementation) > > Could you add these docs to the Misc/unicode.txt file ? I > will eventually take that file and turn it into a PEP which > will then serve as general documentation for these things. I could, but first we should work out how the decoding callback API will work. > > I have renamed the static ...121 function to all lowercase > > names. > > Ok. > > > BTW, I guess PyUnicode_EncodeUnicodeEscape could be > > reimplemented as PyUnicode_EncodeASCII with a \uxxxx > > replacement callback. > > Hmm, wouldn't that result in a slowdown ? If so, I'd rather > leave the special encoder in place, since it is being used a > lot in Python and probably some applications too. It would be a slowdown. But callbacks open many possibilities. For example: Why can't I print u"gürk"? is probably one of the most frequently asked questions in comp.lang.python.
For printing Unicode stuff, print could be extended to use an error handling callback for Unicode strings (or objects where __str__ or tp_str returns a Unicode object) instead of using str() which always returns an 8bit string and uses strict encoding. There might even be a sys.setprintencodehandler()/sys.getprintencodehandler(). > [...] > I think it would be worthwhile to rename the callbacks to > include "Unicode" somewhere, e.g. > PyCodec_UnicodeReplaceEncodeErrors(). It's a long name, but > then it points out the application field of the callback > rather well. Same for the callbacks exposed through the > _codecsmodule. OK, done (and PyCodec_XMLCharRefReplaceUnicodeEncodeErrors really is a long name ;)) > > I have not touched PyUnicode_TranslateCharmap yet, > > should this function also support error callbacks? Why > > would one want to insert None into the mapping to call > > the callback? > > 1. Yes. > 2. The user may want to e.g. restrict usage of certain > character ranges. In this case the codec would be used to > verify the input and an exception would indeed be useful > (e.g. say you want to restrict input to Hangul + ASCII). OK, do we want TranslateCharmap to work exactly like encoding, i.e. in case of an error should the returned replacement string again be mapped through the translation mapping or should it be copied to the output directly? The former would be more in line with encoding, but IMHO the latter would be much more useful. BTW, when I implement it I can implement patch #403100 ("Multicharacter replacements in PyUnicode_TranslateCharmap") along the way. Should the old TranslateCharmap map to the new TranslateCharmapEx and inherit the "multicharacter replacement" feature, or should I leave it as it is? > > A remaining problem is how to implement decoding error > > callbacks. In Python 2.1 encoding and decoding errors are > > handled in the same way with a string value.
But with > > callbacks it doesn't make sense to use the same callback > > for encoding and decoding (like codecs.StreamReaderWriter > > and codecs.StreamRecoder do). Decoding callbacks have a > > different API. Which arguments should be passed to the > > decoding callback, and what is the decoding callback > > supposed to do? > > I'd suggest adding another set of PyCodec_UnicodeDecode...() > APIs for this. We'd then have to augment the base classes of > the StreamCodecs to provide two attributes for .errors with > a fallback solution for the string case (i.e. "strict" can > still be used for both directions). Sounds good. Now what is the decoding callback supposed to do? I guess it will be called in the same way as the encoding callback, i.e. with encoding name, original string and position of the error. It might return a Unicode string (i.e. an object of the decoding target type) that will be emitted from the codec instead of the one offending byte. Or it might return a tuple with a replacement Unicode object and a resynchronisation offset, e.g. returning (u"?", 1) means emit a '?' and skip the offending character. But to make the offset really useful the callback has to know something about the encoding; perhaps the codec should be allowed to pass an additional state object to the callback? Maybe the same should be added to the encoding callbacks too? Maybe the encoding callback should be able to tell the encoder if the replacement returned should be reencoded (in which case it's a Unicode object), or directly emitted (in which case it's an 8bit string)? > > One additional note: It is vital that errors is an > > assignable attribute of the StreamWriter. > > It is already ! I know, but IMHO it should be documented that an assignable errors attribute must be supported as part of the official codec API.
Misc/unicode.txt is not clear on that: """ It is not required by the Unicode implementation to use these base classes, only the interfaces must match; this allows writing Codecs as extension types. """ ---------------------------------------------------------------------- Comment By: M.-A. Lemburg (lemburg) Date: 2001-06-13 01:05 Message: Logged In: YES user_id=38388 > How the callbacks work: > > A PyObject * named errors is passed in. This may be NULL, > Py_None, 'strict', u'strict', 'ignore', u'ignore', > 'replace', u'replace' or a callable object. > PyCodec_EncodeHandlerForObject maps all of these objects to > one of the three builtin error callbacks > PyCodec_RaiseEncodeErrors (raises an exception), > PyCodec_IgnoreEncodeErrors (returns an empty replacement > string, in effect ignoring the error), > PyCodec_ReplaceEncodeErrors (returns U+FFFD, the Unicode > replacement character to signify to the encoder that it > should choose a suitable replacement character) or directly > returns errors if it is a callable object. When an > unencodable character is encountered the error handling > callback will be called with the encoding name, the original > unicode object and the error position and must return a > unicode object that will be encoded instead of the offending > character (or the callback may of course raise an > exception). U+FFFD characters in the replacement string will > be replaced with a character that the encoder chooses ('?' > in all cases). Nice. > The implementation of the loop through the string is done in > the following way. A stack with two strings is kept and the > loop always encodes a character from the string at the > stacktop. If an error is encountered and the stack has only > one entry (during encoding of the original string) the > callback is called and the unicode object returned is pushed > on the stack, so the encoding continues with the replacement > string.
If the stack has two entries when an error is > encountered, the replacement string itself has an > unencodable character and a normal exception is raised. When > the encoder has reached the end of its current string there > are two possibilities: when the stack contains two entries, > this was the replacement string, so the replacement string > will be popped from the stack and encoding continues with > the next character from the original string. If the stack > had only one entry, encoding is finished. Very elegant solution ! > (I hope that's enough explanation of the API and implementation) Could you add these docs to the Misc/unicode.txt file ? I will eventually take that file and turn it into a PEP which will then serve as general documentation for these things. > I have renamed the static ...121 function to all lowercase > names. Ok. > BTW, I guess PyUnicode_EncodeUnicodeEscape could be > reimplemented as PyUnicode_EncodeASCII with a \uxxxx > replacement callback. Hmm, wouldn't that result in a slowdown ? If so, I'd rather leave the special encoder in place, since it is being used a lot in Python and probably some applications too. > PyCodec_RaiseEncodeErrors, PyCodec_IgnoreEncodeErrors, > PyCodec_ReplaceEncodeErrors are globally visible because > they have to be available in _codecsmodule.c to wrap them as > Python function objects, but they can't be implemented in > _codecsmodule, because they need to be available to the > encoders in unicodeobject.c (through > PyCodec_EncodeHandlerForObject), but importing the codecs > module might result in an endless recursion, because > importing a module requires unpickling of the bytecode, > which might require decoding utf8, which ... (but this will > only happen if we implement the same mechanism for the > decoding API) I think that codecs.c is the right place for these APIs. _codecsmodule.c is only meant as a Python access wrapper for the internal codecs and nothing more.
One thing I noted about the callbacks: they assume that they will always get Unicode objects as input. This is certainly not true in the general case (it is for the codecs you touch in the patch). I think it would be worthwhile to rename the callbacks to include "Unicode" somewhere, e.g. PyCodec_UnicodeReplaceEncodeErrors(). It's a long name, but then it points out the application field of the callback rather well. Same for the callbacks exposed through the _codecsmodule. > I have not touched PyUnicode_TranslateCharmap yet, > should this function also support error callbacks? Why would > one want to insert None into the mapping to call the callback? 1. Yes. 2. The user may want to e.g. restrict usage of certain character ranges. In this case the codec would be used to verify the input and an exception would indeed be useful (e.g. say you want to restrict input to Hangul + ASCII). > A remaining problem is how to implement decoding error > callbacks. In Python 2.1 encoding and decoding errors are > handled in the same way with a string value. But with > callbacks it doesn't make sense to use the same callback for > encoding and decoding (like codecs.StreamReaderWriter and > codecs.StreamRecoder do). Decoding callbacks have a > different API. Which arguments should be passed to the > decoding callback, and what is the decoding callback > supposed to do? I'd suggest adding another set of PyCodec_UnicodeDecode...() APIs for this. We'd then have to augment the base classes of the StreamCodecs to provide two attributes for .errors with a fallback solution for the string case (i.e. "strict" can still be used for both directions). > One additional note: It is vital that errors is an > assignable attribute of the StreamWriter. It is already ! > Consider the XML example: For writing an XML DOM tree one > StreamWriter object is used.
When a text node is written, > the error handling has to be set to > codecs.xmlreplace_encode_errors, but inside a comment or > processing instruction replacing unencodable characters with > charrefs is not possible, so here codecs.raise_encode_errors > should be used (or better a custom error handler that raises > an error that says "sorry, you can't have unencodable > characters inside a comment") Sure. > BTW, should we continue the discussion in the i18n SIG > mailing list? An email program is much more comfortable than > a HTML textarea! ;) I'd rather keep the discussions on this patch here -- forking it off to the i18n sig will make it very hard to follow up on it. (This HTML area is indeed damn small ;-) ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2001-06-12 12:18 Message: Logged In: YES user_id=89016 One additional note: It is vital that errors is an assignable attribute of the StreamWriter. Consider the XML example: For writing an XML DOM tree one StreamWriter object is used. When a text node is written, the error handling has to be set to codecs.xmlreplace_encode_errors, but inside a comment or processing instruction replacing unencodable characters with charrefs is not possible, so here codecs.raise_encode_errors should be used (or better a custom error handler that raises an error that says "sorry, you can't have unencodable characters inside a comment"). BTW, should we continue the discussion in the i18n SIG mailing list? An email program is much more comfortable than an HTML textarea! ;) ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2001-06-12 11:59 Message: Logged In: YES user_id=89016 How the callbacks work: A PyObject * named errors is passed in. This may be NULL, Py_None, 'strict', u'strict', 'ignore', u'ignore', 'replace', u'replace' or a callable object.
PyCodec_EncodeHandlerForObject maps all of these objects to one of the three builtin error callbacks: PyCodec_RaiseEncodeErrors (raises an exception), PyCodec_IgnoreEncodeErrors (returns an empty replacement string, in effect ignoring the error), PyCodec_ReplaceEncodeErrors (returns U+FFFD, the Unicode replacement character, to signify to the encoder that it should choose a suitable replacement character), or directly returns errors if it is a callable object. When an unencodable character is encountered the error handling callback will be called with the encoding name, the original unicode object and the error position and must return a unicode object that will be encoded instead of the offending character (or the callback may of course raise an exception). U+FFFD characters in the replacement string will be replaced with a character that the encoder chooses ('?' in all cases). The implementation of the loop through the string is done in the following way. A stack with two strings is kept and the loop always encodes a character from the string at the top of the stack. If an error is encountered and the stack has only one entry (during encoding of the original string) the callback is called and the unicode object returned is pushed on the stack, so the encoding continues with the replacement string. If the stack has two entries when an error is encountered, the replacement string itself has an unencodable character and a normal exception is raised. When the encoder has reached the end of its current string there are two possibilities: when the stack contains two entries, this was the replacement string, so the replacement string will be popped from the stack and encoding continues with the next character from the original string. If the stack had only one entry, encoding is finished. (I hope that's enough explanation of the API and implementation) I have renamed the static ...121 function to all lowercase names.
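The two-string stack loop described earlier in this comment can be sketched in a few lines. This is only an illustration in present-day Python, not the patch's C code: the name encode_with_callback is hypothetical, the callback takes (encoding name, string, position) as described in this comment, and encoding one character at a time is a simplifying assumption that only suits stateless codecs such as ASCII or Latin-1.

```python
def encode_with_callback(text, encoding, callback):
    """Encode text character by character, asking callback for a
    replacement string whenever a character cannot be encoded.

    Hypothetical helper: callback(encoding, string, position) must
    return a replacement string or raise an exception.
    """
    out = bytearray()
    # Each stack entry is [string, next position]. Entry 0 is the
    # original string; entry 1, if present, is a replacement string.
    stack = [[text, 0]]
    while stack:
        current, pos = stack[-1]
        if pos == len(current):
            # End of the string on top of the stack: pop it and go on
            # with the string below, or finish if none is left.
            stack.pop()
            continue
        # Advance first, so the original string resumes after the
        # offending character once a replacement has been consumed.
        stack[-1][1] = pos + 1
        try:
            out += current[pos].encode(encoding)
        except UnicodeEncodeError:
            if len(stack) == 2:
                # The replacement itself is unencodable: give up.
                raise
            stack.append([callback(encoding, current, pos), 0])
    return bytes(out)
```

With the XML character reference callback from the initial comment, encoding "aäoöuüß" to ASCII this way gives b"a&#xe4;o&#xf6;u&#xfc;&#xdf;", and a callback that returns another unencodable character ends in a normal UnicodeEncodeError.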
BTW, I guess PyUnicode_EncodeUnicodeEscape could be reimplemented as PyUnicode_EncodeASCII with a \uxxxx replacement callback. PyCodec_RaiseEncodeErrors, PyCodec_IgnoreEncodeErrors, PyCodec_ReplaceEncodeErrors are globally visible because they have to be available in _codecsmodule.c to wrap them as Python function objects. They can't be implemented in _codecsmodule, because they need to be available to the encoders in unicodeobject.c (through PyCodec_EncodeHandlerForObject), and importing the codecs module might result in an endless recursion, because importing a module requires unpickling of the bytecode, which might require decoding utf8, which ... (but this will only happen if we implement the same mechanism for the decoding API). I have not touched PyUnicode_TranslateCharmap yet. Should this function also support error callbacks? Why would one want to insert None into the mapping to call the callback? A remaining problem is how to implement decoding error callbacks. In Python 2.1 encoding and decoding errors are handled in the same way with a string value. But with callbacks it doesn't make sense to use the same callback for encoding and decoding (like codecs.StreamReaderWriter and codecs.StreamRecoder do). Decoding callbacks have a different API. Which arguments should be passed to the decoding callback, and what is the decoding callback supposed to do? ---------------------------------------------------------------------- Comment By: M.-A. Lemburg (lemburg) Date: 2001-06-12 11:00 Message: Logged In: YES user_id=38388 About the Py_UNICODE*data, int size APIs: Ok, point taken. In general, I think we ought to keep the callback feature as open as possible, so passing in pointers and sizes would not be very useful. BTW, could you summarize how the callback works in a few lines ? About _Encode121: I'd name this _EncodeUCS1 since that's what it is ;-) About the new functions: I was referring to the new static functions which you gave PyUnicode_... names.
If these are not supposed to turn into non-static functions, I'd rather have them use lower case names (since that's how the Python internals work too -- most of the times). ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2001-06-12 09:56 Message: Logged In: YES user_id=89016 > One thing which I don't like about your API change is that > you removed the Py_UNICODE*data, int size style arguments > -- > this makes it impossible to use the new APIs on non-Python > data or data which is not available as Unicode object. Another problem is, that the callback requires a Python object, so in the PyObject *version, the refcount is incref'd and the object is passed to the callback. The Py_UNICODE*/int version would have to create a new Unicode object from the data. ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2001-06-12 09:32 Message: Logged In: YES user_id=89016 > * please don't place more than one C statement on one line > like in: > """ > + unicode = unicode2; unicodepos = > unicode2pos; > + unicode2 = NULL; unicode2pos = 0; > """ OK, done! > * Comments should start with a capital letter and be > prepended > to the section they apply to Fixed! > * There should be spaces between arguments in compares > (a == b) not (a==b) Fixed! > * Where does the name "...Encode121" originate ? encode one-to-one, it implements both ASCII and latin-1 encoding. > * module internal APIs should use lower case names (you > converted some of these to PyUnicode_...() -- this is > normally reserved for APIs which are either marked as > potential candidates for the public API or are very > prominent in the code) Which ones? 
I introduced a new function for every old one that had a "const char *errors" argument, and a few new ones in codecs.h. Of those, PyCodec_EncodeHandlerForObject is vital, because it is used to map the old string arguments to the new function objects. PyCodec_RaiseEncodeErrors can be used in the encoder implementation to raise an encode error, but it could be made static in unicodeobject.h so only those encoders implemented there have access to it. > One thing which I don't like about your API change is that > you removed the Py_UNICODE*data, int size style arguments > -- > this makes it impossible to use the new APIs on non-Python > data or data which is not available as Unicode object. I looked through the code and found no situation where the Py_UNICODE*/int version is really used, and having two (PyObject *)s (the original and the replacement string) instead of UNICODE*/int and PyObject * made the implementation a little easier, but I can fix that. > Please separate the errors.c patch from this patch -- it > seems totally unrelated to Unicode. PyCodec_RaiseEncodeErrors uses this to have a \Uxxxx with four hex digits. I removed it. I'll upload a revised patch as soon as it's done. ---------------------------------------------------------------------- Comment By: M.-A. Lemburg (lemburg) Date: 2001-06-12 07:29 Message: Logged In: YES user_id=38388 Thanks for the patch -- it looks very impressive ! I'll give it a try later this week. Some first cosmetic tidbits: * please don't place more than one C statement on one line like in: """ + unicode = unicode2; unicodepos = unicode2pos; + unicode2 = NULL; unicode2pos = 0; """ * Comments should start with a capital letter and be prepended to the section they apply to * There should be spaces between arguments in compares (a == b) not (a==b) * Where does the name "...Encode121" originate ?
* module internal APIs should use lower case names (you converted some of these to PyUnicode_...() -- this is normally reserved for APIs which are either marked as potential candidates for the public API or are very prominent in the code) One thing which I don't like about your API change is that you removed the Py_UNICODE*data, int size style arguments -- this makes it impossible to use the new APIs on non-Python data or data which is not available as Unicode object. Please separate the errors.c patch from this patch -- it seems totally unrelated to Unicode. Thanks. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=432401&group_id=5470 From noreply@sourceforge.net Wed Jun 13 17:25:51 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Wed, 13 Jun 2001 09:25:51 -0700 Subject: [Patches] [ python-Patches-432183 ] PEP-259: skip printing newline*2 Message-ID: Patches item #432183, was updated on 2001-06-11 13:00 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=432183&group_id=5470 Category: core (C code) Group: None Status: Open >Resolution: Rejected Priority: 5 Submitted By: Guido van Rossum (gvanrossum) Assigned to: Nobody/Anonymous (nobody) Summary: PEP-259: skip printing newline*2 Initial Comment: See PEP 259 (to be checked in soon). This suppresses the printing of an extra newline when the last item printed is a string ending in a newline. ---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2001-06-13 09:25 Message: Logged In: YES user_id=6380 This was unanimously rejected by the user community, so I'm dropping the idea. 
---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2001-06-10 02:12 Message: Logged In: YES user_id=6656 I think you also want: Index: code.py =================================================================== RCS file: /cvsroot/python/python/dist/src/Lib/code.py,v retrieving revision 1.16 diff -c -1 -r1.16 code.py *** code.py 2001/05/03 04:58:49 1.16 --- code.py 2001/06/11 22:11:29 *************** *** 106,108 **** else: ! if softspace(sys.stdout, 0): print --- 106,108 ---- else: ! if softspace(sys.stdout, 0) >= 0: print (not tested) ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=432183&group_id=5470 From noreply@sourceforge.net Wed Jun 13 17:26:09 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Wed, 13 Jun 2001 09:26:09 -0700 Subject: [Patches] [ python-Patches-432183 ] PEP-259: skip printing newline*2 Message-ID: Patches item #432183, was updated on 2001-06-11 13:00 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=432183&group_id=5470 Category: core (C code) Group: None >Status: Closed Resolution: Rejected Priority: 5 Submitted By: Guido van Rossum (gvanrossum) >Assigned to: Guido van Rossum (gvanrossum) Summary: PEP-259: skip printing newline*2 Initial Comment: See PEP 259 (to be checked in soon). This suppresses the printing of an extra newline when the last item printed is a string ending in a newline. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2001-06-13 09:25 Message: Logged In: YES user_id=6380 This was unanimously rejected by the user community, so I'm dropping the idea. 
---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2001-06-10 02:12 Message: Logged In: YES user_id=6656 I think you also want: Index: code.py =================================================================== RCS file: /cvsroot/python/python/dist/src/Lib/code.py,v retrieving revision 1.16 diff -c -1 -r1.16 code.py *** code.py 2001/05/03 04:58:49 1.16 --- code.py 2001/06/11 22:11:29 *************** *** 106,108 **** else: ! if softspace(sys.stdout, 0): print --- 106,108 ---- else: ! if softspace(sys.stdout, 0) >= 0: print (not tested) ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=432183&group_id=5470 From noreply@sourceforge.net Wed Jun 13 18:00:58 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Wed, 13 Jun 2001 10:00:58 -0700 Subject: [Patches] [ python-Patches-432401 ] unicode encoding error callbacks Message-ID: Patches item #432401, was updated on 2001-06-12 06:43 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=432401&group_id=5470 Category: core (C code) Group: None Status: Open Resolution: None Priority: 5 Submitted By: Walter Dörwald (doerwalter) Assigned to: M.-A. Lemburg (lemburg) Summary: unicode encoding error callbacks Initial Comment: This patch adds unicode error handling callbacks to the encode functionality. With this patch it's possible to not only pass 'strict', 'ignore' or 'replace' as the errors argument to encode, but also a callable function, that will be called with the encoding name, the original unicode object and the position of the unencodable character. The callback must return a replacement unicode object that will be encoded instead of the original character. For example replacing unencodable characters with XML character references can be done in the following way. 
u"aäoöuüß".encode( "ascii", lambda enc, uni, pos: u"&#x%x;" % ord(uni[pos]) ) ---------------------------------------------------------------------- >Comment By: M.-A. Lemburg (lemburg) Date: 2001-06-13 10:00 Message: Logged In: YES user_id=38388 On your comment about the non-Unicode codecs: let's keep this separated from the current patch. Don't have much time today. I'll comment on the other things tomorrow. ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2001-06-13 08:49 Message: Logged In: YES user_id=89016 Guido van Rossum wrote in python-dev: > True, the "codec" pattern can be used for other > encodings than Unicode. But it seems to me that the > entire codecs architecture is rather strongly geared > towards en/decoding Unicode, and it's not clear > how well other codecs fit in this pattern (e.g. I > noticed that all the non-Unicode codecs ignore the > error handling parameter or assert that > it is set to 'strict'). I noticed that too. asserting that errors=='strict' would mean that the encoder is not able to deal in any other way with unencodable stuff than by raising an error. But that is not the problem here, because for zlib, base64, quopri, hex and uu encoding there can be no unencodable characters. The encoders can simply ignore the errors parameter. Should I remove the asserts from those codecs and change the docstrings accordingly, or will this be done separately? ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2001-06-13 06:57 Message: Logged In: YES user_id=89016 > > [...] > > raise an exception). U+FFFD characters in the replacement > > string will be replaced with a character that the encoder > > chooses ('?' in all cases). > > Nice. But the special casing of U+FFFD makes the interface somewhat less clean than it could be. It was only done to be 100% backwards compatible. 
With the original "replace" error handling the codec chose the replacement character. But as far as I can tell none of the codecs uses anything other than '?', so I guess we could change the replace handler to always return u'?'. This would make the implementation a little bit simpler, but the explanation of the callback feature *a lot* simpler. And if you still want to handle an unencodable U+FFFD, you can write a special callback for that, e.g.

    def FFFDreplace(enc, uni, pos):
        # Handle only an unencodable U+FFFD. The u prefix matters:
        # without it, "\ufffd" is not a Unicode escape in Python 2.
        if uni[pos] == u"\ufffd":
            return u"?"
        else:
            raise UnicodeError(...)

> > The implementation of the loop through the string is done > > in the following way. A stack with two strings is kept > > and the loop always encodes a character from the string > > at the stacktop. If an error is encountered and the stack > > has only one entry (during encoding of the original string) > > the callback is called and the unicode object returned is > > pushed on the stack, so the encoding continues with the > > replacement string. If the stack has two entries when an > > error is encountered, the replacement string itself has > > an unencodable character and a normal exception is raised. > > When the encoder has reached the end of its current string > > there are two possibilities: when the stack contains two > > entries, this was the replacement string, so the replacement > > string will be popped from the stack and encoding continues > > with the next character from the original string. If the > > stack had only one entry, encoding is finished. > > Very elegant solution ! I'll put it as a comment in the source. > > (I hope that's enough explanation of the API and > implementation) > > Could you add these docs to the Misc/unicode.txt file ? I > will eventually take that file and turn it into a PEP which > will then serve as general documentation for these things. I could, but first we should work out how the decoding callback API will work.
> > I have renamed the static ...121 function to all lowercase > > names. > > Ok. > > > BTW, I guess PyUnicode_EncodeUnicodeEscape could be > > reimplemented as PyUnicode_EncodeASCII with a \uxxxx > > replacement callback. > > Hmm, wouldn't that result in a slowdown ? If so, I'd rather > leave the special encoder in place, since it is being used a > lot in Python and probably some applications too. It would be a slowdown. But callbacks open many possibilities. For example: Why can't I print u"gürk"? is probably one of the most frequently asked questions in comp.lang.python. For printing Unicode stuff, print could be extended to use an error handling callback for Unicode strings (or objects where __str__ or tp_str returns a Unicode object) instead of using str() which always returns an 8bit string and uses strict encoding. There might even be a sys.setprintencodehandler()/sys.getprintencodehandler(). > [...] > I think it would be worthwhile to rename the callbacks to > include "Unicode" somewhere, e.g. > PyCodec_UnicodeReplaceEncodeErrors(). It's a long name, but > then it points out the application field of the callback > rather well. Same for the callbacks exposed through the > _codecsmodule. OK, done (and PyCodec_XMLCharRefReplaceUnicodeEncodeErrors really is a long name ;)) > > I have not touched PyUnicode_TranslateCharmap yet, > > should this function also support error callbacks? Why > > would one want to insert None into the mapping to call > > the callback? > > 1. Yes. > 2. The user may want to e.g. restrict usage of certain > character ranges. In this case the codec would be used to > verify the input and an exception would indeed be useful > (e.g. say you want to restrict input to Hangul + ASCII). OK, do we want TranslateCharmap to work exactly like encoding, i.e. in case of an error should the returned replacement string again be mapped through the translation mapping or should it be copied to the output directly?
The former would be more in line with encoding, but IMHO the latter
would be much more useful.

BTW, when I implement it I can implement patch #403100
("Multicharacter replacements in PyUnicode_TranslateCharmap") along
the way.

Should the old TranslateCharmap map to the new TranslateCharmapEx
and inherit the "multicharacter replacement" feature, or should I
leave it as it is?

> > A remaining problem is how to implement decoding error
> > callbacks. In Python 2.1 encoding and decoding errors are
> > handled in the same way with a string value. But with
> > callbacks it doesn't make sense to use the same callback
> > for encoding and decoding (like codecs.StreamReaderWriter
> > and codecs.StreamRecoder do). Decoding callbacks have a
> > different API. Which arguments should be passed to the
> > decoding callback, and what is the decoding callback
> > supposed to do?
>
> I'd suggest adding another set of PyCodec_UnicodeDecode...()
> APIs for this. We'd then have to augment the base classes of
> the StreamCodecs to provide two attributes for .errors with
> a fallback solution for the string case (i.e. "strict" can
> still be used for both directions).

Sounds good. Now what is the decoding callback supposed to do?
I guess it will be called in the same way as the encoding callback,
i.e. with encoding name, original string and position of the error.
It might return a Unicode string (i.e. an object of the decoding
target type) that will be emitted from the codec instead of the one
offending byte. Or it might return a tuple with a replacement
Unicode object and a resynchronisation offset, i.e. returning
(u"?", 1) means emit a '?' and skip the offending character. But to
make the offset really useful the callback has to know something
about the encoding; perhaps the codec should be allowed to pass an
additional state object to the callback?

Maybe the same should be added to the encoding callbacks too?
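As an illustration of the decoding callback floated above, here is a small sketch in present-day Python of a toy ASCII decoder that accepts either a plain replacement string or a (replacement, offset) tuple from the callback. The name decode_ascii_with_callback and the exact return convention are assumptions made for this example; the thread had not settled on an API at this point.

```python
def decode_ascii_with_callback(data, callback):
    """Toy ASCII decoder with an error callback.

    callback(encoding, data, pos) returns either a replacement string
    (one offending byte is skipped) or a (replacement, offset) tuple,
    where offset is the number of input bytes to skip.
    Hypothetical API, following the proposal in this thread.
    """
    out = []
    pos = 0
    while pos < len(data):
        byte = data[pos]
        if byte < 0x80:
            out.append(chr(byte))
            pos += 1
            continue
        result = callback("ascii", data, pos)
        if isinstance(result, tuple):
            replacement, skip = result
        else:
            replacement, skip = result, 1
        if skip < 1:
            # Guard against a callback that would make the loop spin forever.
            raise ValueError("resynchronisation offset must be at least 1")
        out.append(replacement)
        pos += skip
    return "".join(out)


# Emit '?' and skip the single offending byte, as in the (u"?", 1) example.
print(decode_ascii_with_callback(b"g\xfcrk", lambda enc, d, p: ("?", 1)))
```

The offset only becomes interesting when the callback knows how many bytes belong to the broken sequence, which is the point about passing extra codec state made above.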
Maybe the encoding callback should be able to tell the encoder if
the replacement returned should be reencoded (in which case it's a
Unicode object), or directly emitted (in which case it's an 8bit
string)?

> > One additional note: It is vital that errors is an
> > assignable attribute of the StreamWriter.
>
> It is already !

I know, but IMHO it should be documented that an assignable errors
attribute must be supported as part of the official codec API.

Misc/unicode.txt is not clear on that:
"""
It is not required by the Unicode implementation to use these base
classes, only the interfaces must match; this allows writing Codecs
as extension types.
"""

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2001-06-13 01:05

Message:
Logged In: YES
user_id=38388

> How the callbacks work:
>
> A PyObject * named errors is passed in. This may be NULL,
> Py_None, 'strict', u'strict', 'ignore', u'ignore',
> 'replace', u'replace' or a callable object.
> PyCodec_EncodeHandlerForObject maps all of these objects to
> one of the three builtin error callbacks
> PyCodec_RaiseEncodeErrors (raises an exception),
> PyCodec_IgnoreEncodeErrors (returns an empty replacement
> string, in effect ignoring the error),
> PyCodec_ReplaceEncodeErrors (returns U+FFFD, the Unicode
> replacement character to signify to the encoder that it
> should choose a suitable replacement character) or directly
> returns errors if it is a callable object. When an
> unencodable character is encountered the error handling
> callback will be called with the encoding name, the original
> unicode object and the error position and must return a
> unicode object that will be encoded instead of the offending
> character (or the callback may of course raise an
> exception). U+FFFD characters in the replacement string will
> be replaced with a character that the encoder chooses ('?'
> in all cases).

Nice.
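The mapping from the errors argument to a handler, as described in the quoted summary, might look roughly like this when written in Python instead of C. All names below are invented stand-ins for the C functions mentioned in the thread (PyCodec_EncodeHandlerForObject and the three builtin callbacks); present-day Python has no separate str/unicode spellings, so 'strict' and u'strict' collapse into one key.

```python
def raise_encode_errors(encoding, text, pos):
    """Stand-in for PyCodec_RaiseEncodeErrors: always raise."""
    raise UnicodeError(
        "can't encode character at position %d with %s" % (pos, encoding))


def ignore_encode_errors(encoding, text, pos):
    """Stand-in for PyCodec_IgnoreEncodeErrors: empty replacement."""
    return ""


def replace_encode_errors(encoding, text, pos):
    """Stand-in for PyCodec_ReplaceEncodeErrors: U+FFFD asks the encoder
    to pick a suitable replacement character itself."""
    return "\ufffd"


_BUILTIN_HANDLERS = {
    "strict": raise_encode_errors,
    "ignore": ignore_encode_errors,
    "replace": replace_encode_errors,
}


def encode_handler_for_object(errors):
    """Map None, a handler name, or a callable to an error callback."""
    if errors is None:
        # NULL / Py_None mean strict error handling.
        return raise_encode_errors
    if callable(errors):
        # A callable is used as the callback directly.
        return errors
    try:
        return _BUILTIN_HANDLERS[errors]
    except (KeyError, TypeError):
        raise ValueError("unknown error handling: %r" % (errors,))


print(encode_handler_for_object("ignore") is ignore_encode_errors)
```

Keeping the lookup in one place is what lets the old string values and the new callables share every encoder entry point.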
> The implementation of the loop through the string is done in
> the following way. A stack with two strings is kept and the
> loop always encodes a character from the string at the
> stacktop. If an error is encountered and the stack has only
> one entry (during encoding of the original string) the
> callback is called and the unicode object returned is pushed
> on the stack, so the encoding continues with the replacement
> string. If the stack has two entries when an error is
> encountered, the replacement string itself has an
> unencodable character and a normal exception is raised. When
> the encoder has reached the end of its current string there
> are two possibilities: when the stack contains two entries,
> this was the replacement string, so the replacement string
> will be popped from the stack and encoding continues with
> the next character from the original string. If the stack
> had only one entry, encoding is finished.

Very elegant solution !

> (I hope that's enough explanation of the API and implementation)

Could you add these docs to the Misc/unicode.txt file ? I
will eventually take that file and turn it into a PEP which
will then serve as general documentation for these things.

> I have renamed the static ...121 function to all lowercase
> names.

Ok.

> BTW, I guess PyUnicode_EncodeUnicodeEscape could be
> reimplemented as PyUnicode_EncodeASCII with a \uxxxx
> replacement callback.

Hmm, wouldn't that result in a slowdown ? If so, I'd rather
leave the special encoder in place, since it is being used a
lot in Python and probably some applications too.
> PyCodec_RaiseEncodeErrors, PyCodec_IgnoreEncodeErrors,
> PyCodec_ReplaceEncodeErrors are globally visible because
> they have to be available in _codecsmodule.c to wrap them as
> Python function objects, but they can't be implemented in
> _codecsmodule, because they need to be available to the
> encoders in unicodeobject.c (through
> PyCodec_EncodeHandlerForObject), but importing the codecs
> module might result in an endless recursion, because
> importing a module requires unpickling of the bytecode,
> which might require decoding utf8, which ... (but this will
> only happen, if we implement the same mechanism for the
> decoding API)

I think that codecs.c is the right place for these APIs.
_codecsmodule.c is only meant as Python access wrapper for the
internal codecs and nothing more.

One thing I noted about the callbacks: they assume that they will
always get Unicode objects as input. This is certainly not true in
the general case (it is for the codecs you touch in the patch).

I think it would be worthwhile to rename the callbacks to include
"Unicode" somewhere, e.g. PyCodec_UnicodeReplaceEncodeErrors().
It's a long name, but then it points out the application field of
the callback rather well. Same for the callbacks exposed through
the _codecsmodule.

> I have not touched PyUnicode_TranslateCharmap yet,
> should this function also support error callbacks? Why would
> one want to insert None into the mapping to call the callback?

1. Yes.
2. The user may want to e.g. restrict usage of certain character
   ranges. In this case the codec would be used to verify the input
   and an exception would indeed be useful (e.g. say you want to
   restrict input to Hangul + ASCII).

> A remaining problem is how to implement decoding error
> callbacks. In Python 2.1 encoding and decoding errors are
> handled in the same way with a string value.
> But with callbacks it doesn't make sense to use the same
> callback for encoding and decoding (like
> codecs.StreamReaderWriter and codecs.StreamRecoder do).
> Decoding callbacks have a different API. Which arguments
> should be passed to the decoding callback, and what is the
> decoding callback supposed to do?

I'd suggest adding another set of PyCodec_UnicodeDecode...() APIs
for this. We'd then have to augment the base classes of the
StreamCodecs to provide two attributes for .errors with a fallback
solution for the string case (i.e. "strict" can still be used for
both directions).

> One additional note: It is vital that errors is an
> assignable attribute of the StreamWriter.

It is already !

> Consider the XML example: For writing an XML DOM tree one
> StreamWriter object is used. When a text node is written,
> the error handling has to be set to
> codecs.xmlreplace_encode_errors, but inside a comment or
> processing instruction replacing unencodable characters with
> charrefs is not possible, so here codecs.raise_encode_errors
> should be used (or better a custom error handler that raises
> an error that says "sorry, you can't have unencodable
> characters inside a comment")

Sure.

> BTW, should we continue the discussion in the i18n SIG
> mailing list? An email program is much more comfortable than
> an HTML textarea! ;)

I'd rather keep the discussions on this patch here -- forking it
off to the i18n sig will make it very hard to follow up on it.

(This HTML area is indeed damn small ;-)

----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2001-06-12 12:18

Message:
Logged In: YES
user_id=89016

One additional note: It is vital that errors is an assignable
attribute of the StreamWriter.

Consider the XML example: For writing an XML DOM tree one
StreamWriter object is used.
When a text node is written, the error handling has to be set to
codecs.xmlreplace_encode_errors, but inside a comment or processing
instruction replacing unencodable characters with charrefs is not
possible, so here codecs.raise_encode_errors should be used (or
better a custom error handler that raises an error that says "sorry,
you can't have unencodable characters inside a comment")

BTW, should we continue the discussion in the i18n SIG mailing list?
An email program is much more comfortable than an HTML textarea! ;)

----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2001-06-12 11:59

Message:
Logged In: YES
user_id=89016

How the callbacks work:

A PyObject * named errors is passed in. This may be NULL, Py_None,
'strict', u'strict', 'ignore', u'ignore', 'replace', u'replace' or a
callable object. PyCodec_EncodeHandlerForObject maps all of these
objects to one of the three builtin error callbacks
PyCodec_RaiseEncodeErrors (raises an exception),
PyCodec_IgnoreEncodeErrors (returns an empty replacement string, in
effect ignoring the error), PyCodec_ReplaceEncodeErrors (returns
U+FFFD, the Unicode replacement character to signify to the encoder
that it should choose a suitable replacement character) or directly
returns errors if it is a callable object. When an unencodable
character is encountered the error handling callback will be called
with the encoding name, the original unicode object and the error
position and must return a unicode object that will be encoded
instead of the offending character (or the callback may of course
raise an exception). U+FFFD characters in the replacement string
will be replaced with a character that the encoder chooses ('?' in
all cases).

The implementation of the loop through the string is done in the
following way. A stack with two strings is kept and the loop always
encodes a character from the string at the stacktop.
If an error is encountered and the stack has only one entry (during
encoding of the original string) the callback is called and the
unicode object returned is pushed on the stack, so the encoding
continues with the replacement string. If the stack has two entries
when an error is encountered, the replacement string itself has an
unencodable character and a normal exception is raised. When the
encoder has reached the end of its current string there are two
possibilities: when the stack contains two entries, this was the
replacement string, so the replacement string will be popped from
the stack and encoding continues with the next character from the
original string. If the stack had only one entry, encoding is
finished.

(I hope that's enough explanation of the API and implementation)

I have renamed the static ...121 function to all lowercase names.

BTW, I guess PyUnicode_EncodeUnicodeEscape could be reimplemented as
PyUnicode_EncodeASCII with a \uxxxx replacement callback.

PyCodec_RaiseEncodeErrors, PyCodec_IgnoreEncodeErrors,
PyCodec_ReplaceEncodeErrors are globally visible because they have
to be available in _codecsmodule.c to wrap them as Python function
objects, but they can't be implemented in _codecsmodule, because
they need to be available to the encoders in unicodeobject.c
(through PyCodec_EncodeHandlerForObject), but importing the codecs
module might result in an endless recursion, because importing a
module requires unpickling of the bytecode, which might require
decoding utf8, which ... (but this will only happen, if we implement
the same mechanism for the decoding API)

I have not touched PyUnicode_TranslateCharmap yet, should this
function also support error callbacks? Why would one want to insert
None into the mapping to call the callback?

A remaining problem is how to implement decoding error callbacks. In
Python 2.1 encoding and decoding errors are handled in the same way
with a string value.
But with callbacks it doesn't make sense to use the same callback
for encoding and decoding (like codecs.StreamReaderWriter and
codecs.StreamRecoder do). Decoding callbacks have a different API.
Which arguments should be passed to the decoding callback, and what
is the decoding callback supposed to do?

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2001-06-12 11:00

Message:
Logged In: YES
user_id=38388

About the Py_UNICODE*data, int size APIs:
Ok, point taken.

In general, I think we ought to keep the callback feature as open as
possible, so passing in pointers and sizes would not be very useful.

BTW, could you summarize how the callback works in a few lines ?

About _Encode121: I'd name this _EncodeUCS1 since that's what it
is ;-)

About the new functions: I was referring to the new static functions
which you gave PyUnicode_... names. If these are not supposed to
turn into non-static functions, I'd rather have them use lower case
names (since that's how the Python internals work too -- most of the
time).

----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2001-06-12 09:56

Message:
Logged In: YES
user_id=89016

> One thing which I don't like about your API change is that
> you removed the Py_UNICODE*data, int size style arguments --
> this makes it impossible to use the new APIs on non-Python
> data or data which is not available as Unicode object.

Another problem is that the callback requires a Python object, so in
the PyObject * version, the refcount is incref'd and the object is
passed to the callback. The Py_UNICODE*/int version would have to
create a new Unicode object from the data.
----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2001-06-12 09:32

Message:
Logged In: YES
user_id=89016

> * please don't place more than one C statement on one line
> like in:
> """
> + unicode = unicode2; unicodepos = unicode2pos;
> + unicode2 = NULL; unicode2pos = 0;
> """

OK, done!

> * Comments should start with a capital letter and be
> prepended to the section they apply to

Fixed!

> * There should be spaces between arguments in compares
> (a == b) not (a==b)

Fixed!

> * Where does the name "...Encode121" originate ?

encode one-to-one, it implements both ASCII and latin-1 encoding.

> * module internal APIs should use lower case names (you
> converted some of these to PyUnicode_...() -- this is
> normally reserved for APIs which are either marked as
> potential candidates for the public API or are very
> prominent in the code)

Which ones? I introduced a new function for every old one that had a
"const char *errors" argument, and a few new ones in codecs.h. Of
those, PyCodec_EncodeHandlerForObject is vital, because it is used
to map the old string arguments to the new function objects.
PyCodec_RaiseEncodeErrors can be used in the encoder implementation
to raise an encode error, but it could be made static in
unicodeobject.h so only those encoders implemented there have access
to it.

> One thing which I don't like about your API change is that
> you removed the Py_UNICODE*data, int size style arguments --
> this makes it impossible to use the new APIs on non-Python
> data or data which is not available as Unicode object.

I looked through the code and found no situation where the
Py_UNICODE*/int version is really used, and having two
(PyObject *)s (the original and the replacement string) instead of
UNICODE*/int and PyObject * made the implementation a little easier,
but I can fix that.

> Please separate the errors.c patch from this patch -- it
> seems totally unrelated to Unicode.
PyCodec_RaiseEncodeErrors uses this to have a \Uxxxx with four hex
digits. I removed it.

I'll upload a revised patch as soon as it's done.

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2001-06-12 07:29

Message:
Logged In: YES
user_id=38388

Thanks for the patch -- it looks very impressive ! I'll give it a
try later this week.

Some first cosmetic tidbits:

* please don't place more than one C statement on one line like in:
"""
+ unicode = unicode2; unicodepos = unicode2pos;
+ unicode2 = NULL; unicode2pos = 0;
"""

* Comments should start with a capital letter and be prepended to
  the section they apply to

* There should be spaces between arguments in compares
  (a == b) not (a==b)

* Where does the name "...Encode121" originate ?

* module internal APIs should use lower case names (you converted
  some of these to PyUnicode_...() -- this is normally reserved for
  APIs which are either marked as potential candidates for the
  public API or are very prominent in the code)

One thing which I don't like about your API change is that you
removed the Py_UNICODE*data, int size style arguments -- this makes
it impossible to use the new APIs on non-Python data or data which
is not available as Unicode object.

Please separate the errors.c patch from this patch -- it seems
totally unrelated to Unicode.

Thanks.
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=432401&group_id=5470 From noreply@sourceforge.net Thu Jun 14 21:26:13 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 14 Jun 2001 13:26:13 -0700 Subject: [Patches] [ python-Patches-400938 ] [Draft] libpython as shared library (.so) on Linux Message-ID: Patches item #400938, was updated on 2000-07-19 13:55 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=400938&group_id=5470 Category: None Group: None Status: Open Resolution: Out of Date Priority: 5 Submitted By: Gregor Hoffleit (flight) >Assigned to: Guido van Rossum (gvanrossum) Summary: [Draft] libpython as shared library (.so) on Linux Initial Comment: ---------------------------------------------------------------------- Comment By: Gregor Hoffleit (flight) Date: 2001-06-12 15:19 Message: Logged In: YES user_id=5293 > Now we're just waiting for someone to produce a working patch. > Or is there one already? I'm currently distributing experimental packages of Python 2.1 for Debian. The packages include a hack to build libpython2.1 as .so for Linux. The shared library patch currently is buried in a big diff file. You can get it as http://people.debian.org/~flight/python2/python2_2.1-0.diff.gz This is only a starting point for a real patch! ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2001-06-12 09:54 Message: Logged In: YES user_id=6380 Reopening -- this keeps being requested. Now we're just waiting for someone to produce a working patch. Or is there one already? ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2001-06-12 09:51 Message: Logged In: YES user_id=6380 Reopening -- this keeps being requested. 
Now we're just waiting for someone to produce a working patch. Or is there one already? ---------------------------------------------------------------------- Comment By: Neil Schemenauer (nascheme) Date: 2001-03-21 15:59 Message: Logged In: YES user_id=35752 We're going to have to create a new patch to do this. This one is way too out of date. Maybe for 2.2. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2001-01-19 14:46 Message: I'm reassigning this to Neil. Neil, can you see if you can integrate this into your flat Makefile? ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2001-01-17 15:09 Message: Andrew, I'm tentatively reassigning this to you, since you're taking charge of the build process at the moment (setup.py). I suspect that the patch no longer works as is -- would it make sense to mark it postponed and get the author to submit a new version before we release 2.1a1? ---------------------------------------------------------------------- Comment By: Nobody/Anonymous (nobody) Date: 2001-01-17 14:46 Message: Getting this patch into the next version of Python would be "A Good Thing"(tm) in my opinion. We use libpython as a .so at ILM and end up having to make changes like this by hand every time we get a new version... ---------------------------------------------------------------------- Comment By: Moshe Zadka (moshez) Date: 2000-11-01 03:32 Message: I've had a look at the patch, and it seems it has two orthogonal parts. One is adding the infrastructure for compiling another version for the Python library, which can be more or less integrated as-is, and one is hard-coding the particular way, in Linux, of building shared objects. 
Since we discover how to build shared objects in the configure script anyway (otherwise we could not have built modules as shared objects), we should embed that information there, not the Linux flags. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2000-10-26 14:13 Message: Let's give this to Jeremy instead, because he seems to know more about build issues. Jeremy, it would be good to look into getting this to work with your RPM suite. Flight's argument (has been used without complaints in Debian Python 1.5.2 since 1999) is good. ---------------------------------------------------------------------- Comment By: Jeremy Hylton (jhylton) Date: 2000-08-23 09:26 Message: In the absence of anyone arguing for inclusion of this patch and a one-week idle period, it is postponed. ---------------------------------------------------------------------- Comment By: Moshe Zadka (moshez) Date: 2000-08-16 00:40 Message: I suggest we postpone it. It isn't really complete (only works on real distributions ), and the complete solution should work on all unices. If Tcl/Perl can do it, there is no reason Python can't -- and a half hearted solution isn't that good. flight, you should use this for the Python in woody in the mean time -- I doubt woody will be stable before Python 2.1 comes out, so 2.1 sounds like a good timeframe to do it. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2000-08-15 10:52 Message: Assigned to Barry because he's a Linux weenie. Barry, if you think there's something here that should go into 2.0, please pursue it now, else change the status to Postponed. ---------------------------------------------------------------------- Comment By: Gregor Hoffleit (flight) Date: 2000-07-19 14:10 Message: This is what it used in product to build libpython as shared library(.so) for Debian. 
Note: This patch is not ready for inclusion in the upstream Python distribution. Anyway, I think this might be a start. The Python 1.5 executable in Debian GNU/Linux is built against a shared libpython1.5.so since April 1999, and I haven't yet heard about any problems. Using a shared library should have an advantage if you're running multiple instances of Python (be it standalone interpreter or embedded applications). ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=400938&group_id=5470 From noreply@sourceforge.net Thu Jun 14 21:30:41 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 14 Jun 2001 13:30:41 -0700 Subject: [Patches] [ python-Patches-433233 ] 2.0.1c1: statcache.py is broken (syntax) Message-ID: Patches item #433233, was updated on 2001-06-14 13:30 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=433233&group_id=5470 Category: library Group: 2.0.1 bugfix Status: Open Resolution: None Priority: 5 Submitted By: Gregor Hoffleit (flight) Assigned to: Nobody/Anonymous (nobody) Summary: 2.0.1c1: statcache.py is broken (syntax) Initial Comment: In 2.0.1c1, statcache.py is broken (bad indentation). The file won't load. This problem does only exist in the release20-maint/r201c1 branch (introduced in revision 1.7.4). 
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=433233&group_id=5470 From noreply@sourceforge.net Fri Jun 15 17:44:16 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Fri, 15 Jun 2001 09:44:16 -0700 Subject: [Patches] [ python-Patches-433233 ] 2.0.1c1: statcache.py is broken (syntax) Message-ID: Patches item #433233, was updated on 2001-06-14 13:30 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=433233&group_id=5470 Category: library Group: 2.0.1 bugfix >Status: Closed >Resolution: Fixed Priority: 5 Submitted By: Gregor Hoffleit (flight) >Assigned to: Guido van Rossum (gvanrossum) Summary: 2.0.1c1: statcache.py is broken (syntax) Initial Comment: In 2.0.1c1, statcache.py is broken (bad indentation). The file won't load. This problem does only exist in the release20-maint/r201c1 branch (introduced in revision 1.7.4). ---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2001-06-15 09:44 Message: Logged In: YES user_id=6380 Thanks -- fixed now! 
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=433233&group_id=5470 From noreply@sourceforge.net Fri Jun 15 20:48:39 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Fri, 15 Jun 2001 12:48:39 -0700 Subject: [Patches] [ python-Patches-433537 ] better cross-compilation support Message-ID: Patches item #433537, was updated on 2001-06-15 12:48 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=433537&group_id=5470 Category: Build Group: None Status: Open Resolution: None Priority: 5 Submitted By: michael shiplett (walrusmonkey) Assigned to: Nobody/Anonymous (nobody) Summary: better cross-compilation support Initial Comment: configure.in uses AC_TRY_RUN in several places without allowing for cached values to allow for cross-compilation. this patch uses the same approach as other parts of configure.in use. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=433537&group_id=5470 From noreply@sourceforge.net Sat Jun 16 02:12:02 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Fri, 15 Jun 2001 18:12:02 -0700 Subject: [Patches] [ python-Patches-433619 ] NAMESPACE support in imaplib.py Message-ID: Patches item #433619, was updated on 2001-06-15 18:12 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=433619&group_id=5470 Category: None Group: None Status: Open Resolution: None Priority: 5 Submitted By: Michel Pelletier (michel) Assigned to: Nobody/Anonymous (nobody) Summary: NAMESPACE support in imaplib.py Initial Comment: Support for the IMAP NAMESPACE extension defined in rfc 2342. This is almost a necessity for working with modern IMAP servers. 
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=433619&group_id=5470 From noreply@sourceforge.net Sat Jun 16 02:18:01 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Fri, 15 Jun 2001 18:18:01 -0700 Subject: [Patches] [ python-Patches-433619 ] NAMESPACE support in imaplib.py Message-ID: Patches item #433619, was updated on 2001-06-15 18:12 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=433619&group_id=5470 >Category: library Group: None Status: Open Resolution: None Priority: 5 Submitted By: Michel Pelletier (michel) Assigned to: Nobody/Anonymous (nobody) Summary: NAMESPACE support in imaplib.py Initial Comment: Support for the IMAP NAMESPACE extension defined in rfc 2342. This is almost a necessity for working with modern IMAP servers. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=433619&group_id=5470 From noreply@sourceforge.net Sat Jun 16 09:12:51 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sat, 16 Jun 2001 01:12:51 -0700 Subject: [Patches] [ python-Patches-432325 ] \versionadded{2.2} in libstruct.tex Message-ID: Patches item #432325, was updated on 2001-06-11 23:23 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=432325&group_id=5470 Category: documentation Group: None >Status: Closed >Resolution: Fixed Priority: 5 Submitted By: Peter Funk (pefu) Assigned to: Nobody/Anonymous (nobody) Summary: \versionadded{2.2} in libstruct.tex Initial Comment: Tim Peters: > Modified Files: > libstruct.tex > Log Message: > Added q/Q standard (x-platform 8-byte ints) mode in struct module. [...] Hmmmm.... You probably forgot the \versionadded{2.2} note? 
---------------------------------------------------------------------- >Comment By: Martin v. Löwis (loewis) Date: 2001-06-16 01:12 Message: Logged In: YES user_id=21627 This patch is already in libstruct.tex 1.29. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=432325&group_id=5470 From noreply@sourceforge.net Sat Jun 16 09:16:45 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sat, 16 Jun 2001 01:16:45 -0700 Subject: [Patches] [ python-Patches-407093 ] urllib2 correction of typos Message-ID: Patches item #407093, was updated on 2001-03-08 10:03 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=407093&group_id=5470 Category: library Group: None Status: Open Resolution: None Priority: 5 Submitted By: Eduardo Fernandez Corrales (ejfc) Assigned to: Jeremy Hylton (jhylton) Summary: urllib2 correction of typos Initial Comment: Bug #406683 "typos in urllib2" includes a patch. I have modified it so that Basic HTTP Authentication works now. (At least with my Squid proxy) ---------------------------------------------------------------------- >Comment By: Martin v. Löwis (loewis) Date: 2001-06-16 01:16 Message: Logged In: YES user_id=21627 Where is the patch? 
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=407093&group_id=5470 From noreply@sourceforge.net Sat Jun 16 09:24:11 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sat, 16 Jun 2001 01:24:11 -0700 Subject: [Patches] [ python-Patches-403513 ] Canvas.py fixes Message-ID: Patches item #403513, was updated on 2001-01-30 12:00 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=403513&group_id=5470 Category: Tkinter Group: None >Status: Closed >Resolution: Fixed Priority: 5 Submitted By: Markus F.X.J. Oberhumer (mfx) Assigned to: Fredrik Lundh (effbot) Summary: Canvas.py fixes Initial Comment: This fixes Group.lower and Group.tkraise Markus (author of PySol) ---------------------------------------------------------------------- >Comment By: Martin v. Löwis (loewis) Date: 2001-06-16 01:24 Message: Logged In: YES user_id=21627 This is fixed in Canvas.py 1.17. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=403513&group_id=5470 From noreply@sourceforge.net Sat Jun 16 17:08:06 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sat, 16 Jun 2001 09:08:06 -0700 Subject: [Patches] [ python-Patches-433619 ] NAMESPACE support in imaplib.py Message-ID: Patches item #433619, was updated on 2001-06-15 18:12 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=433619&group_id=5470 Category: library Group: None Status: Open Resolution: None Priority: 5 Submitted By: Michel Pelletier (michel) >Assigned to: Guido van Rossum (gvanrossum) Summary: NAMESPACE support in imaplib.py Initial Comment: Support for the IMAP NAMESPACE extension defined in rfc 2342. This is almost a necessity for working with modern IMAP servers. 
---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2001-06-16 09:08 Message: Logged In: YES user_id=6380 I'm pinging Piers Lauder about this. If he approves, I'll apply it. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=433619&group_id=5470 From noreply@sourceforge.net Sat Jun 16 17:25:19 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sat, 16 Jun 2001 09:25:19 -0700 Subject: [Patches] [ python-Patches-421893 ] Cleanup GC API Message-ID: Patches item #421893, was updated on 2001-05-06 14:42 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=421893&group_id=5470 Category: core (C code) Group: None Status: Open Resolution: None Priority: 5 Submitted By: Neil Schemenauer (nascheme) >Assigned to: Guido van Rossum (gvanrossum) Summary: Cleanup GC API Initial Comment: This patch adds three new APIs: PyObject_GC_New PyObject_GC_NewVar PyObject_GC_Resize PyObject_GC_Del and renames PyObject_GC_Init and PyObject_GC_Fini to: PyObject_GC_Track PyObject_GC_Ignore respectively. Objects that wish to be tracked by the collector must use these new APIs. Many more details about the GC implementation are hidden inside gcmodule.c. There seems to be no change in performance. Note that PyObject_GC_{New,NewVar} automatically adds the object to the GC lists. There is no need to call PyObject_GC_Track. PyObject_GC_Del automatically removes the object from the GC list but usually you want to call PyObject_GC_Ignore yourself (DECREFs can end up running arbitrary code). It should be more difficult to corrupt the GC linked lists now. Also, you can now call PyObject_GC_Ignore on objects that you know will not create RCs. The _weakref module does this. 
Previously, every object that had the GC type flag set and could be found by using tp_traverse had to be in a GC linked list. ---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2001-06-16 09:25 Message: Logged In: YES user_id=6380 I think I know a way to fix the incompatibility, by switching to a different flag bit. I'll try to work this into the descr-branch code. ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2001-06-04 10:13 Message: Logged In: YES user_id=21627 I have two problems with this patch: 1. It comes with no documentation. 2. It breaks existing third-party modules which use the GC API as defined in Python 2. Consequently, I recommend rejection of the patch in its current form. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=421893&group_id=5470 From noreply@sourceforge.net Sun Jun 17 05:29:01 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sat, 16 Jun 2001 21:29:01 -0700 Subject: [Patches] [ python-Patches-433619 ] NAMESPACE support in imaplib.py Message-ID: Patches item #433619, was updated on 2001-06-15 18:12 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=433619&group_id=5470 Category: library Group: None Status: Open Resolution: None Priority: 5 Submitted By: Michel Pelletier (michel) Assigned to: Guido van Rossum (gvanrossum) Summary: NAMESPACE support in imaplib.py Initial Comment: Support for the IMAP NAMESPACE extension defined in rfc 2342. This is almost a necessity for working with modern IMAP servers. ---------------------------------------------------------------------- Comment By: Piers Lauder (pierslauder) Date: 2001-06-16 21:29 Message: Logged In: YES user_id=196212 This looks good to me. It should be in there. 
---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2001-06-16 09:08 Message: Logged In: YES user_id=6380 I'm pinging Piers Lauder about this. If he approves, I'll apply it. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=433619&group_id=5470 From noreply@sourceforge.net Sun Jun 17 14:31:53 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sun, 17 Jun 2001 06:31:53 -0700 Subject: [Patches] [ python-Patches-433619 ] NAMESPACE support in imaplib.py Message-ID: Patches item #433619, was updated on 2001-06-15 18:12 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=433619&group_id=5470 Category: library Group: None >Status: Closed >Resolution: Fixed Priority: 5 Submitted By: Michel Pelletier (michel) Assigned to: Guido van Rossum (gvanrossum) Summary: NAMESPACE support in imaplib.py Initial Comment: Support for the IMAP NAMESPACE extension defined in rfc 2342. This is almost a necessity for working with modern IMAP servers. ---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2001-06-17 06:31 Message: Logged In: YES user_id=6380 Applied and closed. Thanks, Michel! ---------------------------------------------------------------------- Comment By: Piers Lauder (pierslauder) Date: 2001-06-16 21:29 Message: Logged In: YES user_id=196212 This looks good to me. It should be in there. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2001-06-16 09:08 Message: Logged In: YES user_id=6380 I'm pinging Piers Lauder about this. If he approves, I'll apply it. 
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=433619&group_id=5470 From noreply@sourceforge.net Sun Jun 17 23:01:09 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sun, 17 Jun 2001 15:01:09 -0700 Subject: [Patches] [ python-Patches-434008 ] sharedinstall must use $prefix Message-ID: Patches item #434008, was updated on 2001-06-17 15:01 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=434008&group_id=5470 Category: Build Group: None Status: Open Resolution: None Priority: 5 Submitted By: Gregor Hoffleit (flight) Assigned to: Nobody/Anonymous (nobody) Summary: sharedinstall must use $prefix Initial Comment: In the sharedinstall target of the Makefile, we have to provide setup.py with the $prefix variable. Currently, the $prefix is ignored in this call of setup.py, and this leads to strange results: When called with "make install prefix=/tmp/python/debian/tmp" (which is used in packaging Python, and works completely fine otherwise), we get this (running this as a non-root user): copying build/lib.linux-i686-2.1/linuxaudiodev.so->/data/src/debian/python2-2.1/debian/tmp/usr/lib/python2.1/lib-dynload running install_scripts copying build/scripts/pydoc -> /usr/bin error: /usr/bin/pydoc: Read-only file system make[1]: *** [sharedinstall] Error 1 make[1]: Leaving directory `/data/src/debian/python2-2.1' make: *** [install-stamp] Error 2 The same kind of problem occurs with all other things that are installed by the call of the setup.py script. The attached patch cures this problem by providing the $prefix to the setup.py script. I think this is the correct way to fix it.
Gregor ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=434008&group_id=5470 From noreply@sourceforge.net Mon Jun 18 02:10:09 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sun, 17 Jun 2001 18:10:09 -0700 Subject: [Patches] [ python-Patches-413171 ] fix UserDict.get, setdefault, update Message-ID: Patches item #413171, was updated on 2001-04-02 10:18 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=413171&group_id=5470 Category: library Group: None >Status: Closed Resolution: Accepted Priority: 4 Submitted By: Ka-Ping Yee (ping) Assigned to: Ka-Ping Yee (ping) Summary: fix UserDict.get, setdefault, update Initial Comment: The methods 'get', 'setdefault', and 'update' on a dictionary are usually implemented (and thought of) in terms of the lower-level methods has_key, __getitem__, and __setitem__. The current implementation of UserDict relays a call to e.g. x.get() to x.data.get(), which behaves inconsistently if __getitem__ has been implemented on x. One particular big place where this turns up is cgi. If you get a dict = cgi.SvFormContentDict(), then dict.get('key') will return a *list* even though dict['key'] returns a single item! To make UserDict behave consistently, this patch fixes get(), update(), and setdefault() to re-use the other methods. Then the only occurrence of self.data[k] = v is in __setitem__, the only occurrence of self.data[k] without assignment is in __getitem__, etc. ---------------------------------------------------------------------- >Comment By: Martin v. Löwis (loewis) Date: 2001-06-17 18:10 Message: Logged In: YES user_id=21627 Committed as UserDict 1.14. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2001-06-07 08:10 Message: Logged In: YES user_id=6380 Approved. Check it in already! 
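The idea in the initial comment above can be illustrated with a minimal sketch, written in modern Python rather than the 2.1-era code of the patch. It is not the committed UserDict 1.14 code: the class names `SketchUserDict` and `FirstItemDict` are hypothetical, and `__contains__` stands in for `has_key`. The derived methods `get`, `setdefault`, and `update` go through the primitive methods, so a subclass that overrides `__getitem__` stays consistent.

```python
class SketchUserDict:
    """Hypothetical dict wrapper; derived methods reuse the primitives."""

    def __init__(self, initial=None):
        self.data = {}
        if initial is not None:
            self.update(initial)

    # Primitive methods: the only places that index self.data directly.
    def __getitem__(self, key):
        return self.data[key]

    def __setitem__(self, key, value):
        self.data[key] = value

    def __contains__(self, key):  # modern stand-in for has_key
        return key in self.data

    # Derived methods: written only in terms of the primitives above.
    def get(self, key, default=None):
        if key not in self:
            return default
        return self[key]

    def setdefault(self, key, default=None):
        if key not in self:
            self[key] = default
        return self[key]

    def update(self, other):
        for key in other.keys():
            self[key] = other[key]


class FirstItemDict(SketchUserDict):
    """Hypothetical subclass: unwraps one-element lists on lookup."""

    def __getitem__(self, key):
        value = self.data[key]
        if isinstance(value, list) and len(value) == 1:
            return value[0]
        return value


form = FirstItemDict({"key": ["value"]})
print(form["key"], form.get("key"))  # both lookups take the same path
```

Because `get` calls `self[key]` instead of `self.data.get(key)`, the subclass override applies to both spellings of the lookup, which is the inconsistency the patch describes for the cgi case.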
---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2001-06-06 22:25 Message: Logged In: YES user_id=21627 I recommend to approve this patch. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2001-04-10 14:17 Message: Logged In: YES user_id=6380 Let's not fix this in 2.1. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=413171&group_id=5470 From noreply@sourceforge.net Mon Jun 18 02:41:20 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sun, 17 Jun 2001 18:41:20 -0700 Subject: [Patches] [ python-Patches-427190 ] Speed-up "O" calls Message-ID: Patches item #427190, was updated on 2001-05-24 22:30 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=427190&group_id=5470 Category: core (C code) Group: None Status: Open Resolution: None Priority: 5 Submitted By: Martin v. Löwis (loewis) Assigned to: Jeremy Hylton (jhylton) >Summary: Speed-up "O" calls Initial Comment: This patch improves the performance of a few functions which have an "O" signature (ord, len, and list_append). On selected test cases, this patch gives a speed-up of 40%. If accepted, the approach can be extended to more signatures. E.g. "l" is already provided in the patch, but currently not used. ---------------------------------------------------------------------- >Comment By: Martin v. Löwis (loewis) Date: 2001-06-17 18:41 Message: Logged In: YES user_id=21627 Uploaded new version which invokes string_join correctly from _PyString_Join. ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2001-06-02 03:12 Message: Logged In: YES user_id=21627 New version uploaded. 
This uses functions with only the self argument for METH_NOARGS, and introduces PyNoArgsFunction for them. It also adds a section for api.tex documenting the METH_ flags, and an entry in NEWS mentioning the new METH_ flags. ---------------------------------------------------------------------- Comment By: Jeremy Hylton (jhylton) Date: 2001-06-01 08:14 Message: Logged In: YES user_id=31392 Just took a quick look -- looks good. One question: Why does METH_NOARGS call the method with two arguments where the second is always NULL? Wouldn't it be clearer to have these functions take one argument? ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2001-06-01 07:34 Message: Logged In: YES user_id=21627 I rewrote the patch to only support METH_NOARGS and METH_O, and to not use bit masks for them. I also changed calling conventions for all Object operations and bltin and sys functions. In the course of these changes, two functions got a changed meaning: - file.writelines accepts only exactly one argument - iter.next does not accept any arguments anymore As you can see in the patch, there are still a lot of places that continue to use OLDARGS (plus all the Modules functions that have not been changed in this patch), so OLDARGS will be needed for quite some time. ---------------------------------------------------------------------- Comment By: Jeremy Hylton (jhylton) Date: 2001-05-29 13:59 Message: Logged In: YES user_id=31392 I like METH_O, but I'm not sure about METH_L. I'd rather see the call handling in ceval be type-neutral. It's easy enough for the callee to cast from an object to an int (or any other type). There should be no effect on performance and it reduces the amount of code in the core. I think the implementation could be simplified a lot if it defined METH_O -- or perhaps METH_NOARGS, METH_ONEARG, and maybe even METH_TWOARGS (but Tim has a pretty good argument against that one).
I don't think there's any need to define METH_O via METH_SPECIAL and reserve all of 0xFFF0 for flags on METH_SPECIAL. Instead, I'd just use the next N bits to implement the next N flags. The SPECIALSIZE and extra stack used in the implementation seem like unneeded generality, too. If the implementation is only going to support 0 and 1 (and possibly 2) arguments, there's no need for anything more general. Finally, I suggest appropriating fast_cfunction() for this purpose, rather than calling the new function do_call_special(), where "special" isn't a very specific meaning. If METH_NOARGS and METH_ONEARG are implemented, there is basically no reason to use METH_OLDARGS. So we can get rid of it in the code base and stop attempting to optimize it. Do you want to have a go at a smaller patch that just did METH_ONEARG and METH_NOARGS? ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=427190&group_id=5470 From noreply@sourceforge.net Mon Jun 18 15:06:08 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 18 Jun 2001 07:06:08 -0700 Subject: [Patches] [ python-Patches-426746 ] Infrastructure for getting MacPython modules working on OSX Message-ID: Patches item #426746, was updated on 2001-05-23 13:29 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=426746&group_id=5470 Category: Build Group: None Status: Open >Resolution: Accepted Priority: 5 Submitted By: Jack Jansen (jackjansen) Assigned to: Thomas Wouters (twouters) Summary: Infrastructure for getting MacPython modules working on OSX Initial Comment: Here are a couple of patches that optionally (on MacOSX) enable a bit of extra infrastructure in the Python core, to allow various (MacPython-originated) dynamic extension modules to be built. Here's what I patched: - Added a MACHDEP_OBJS variable to Makefile.pre.in and configure.in.
This allows platforms to include platform-specific source files in the core build. - Added (using MACHDEP_OBJS) a macglue.c file to the build, which contains glue code that allows Mac extension modules to refer to each other while being in separate dynamically loaded modules, plus a couple of utility routines. There are also a few changes to LDFLAGS to get the object file incorporated (as it is otherwise optimized away because the rest of Python doesn't refer to it). - Added a config.h.in define USE_TOOLBOX_OBJECT_GLUE which enables the glue code mentioned above (which isn't needed in MacPython, only in Mach-O Python). Possibly the latter two should be dependent on a configure switch (--with-mac-toolbox-modules?) but (a) I think the added memory footprint is minimal and (b) I never understood how to add configure switches:-) A setup.py patch will follow, but I'm still testing it. ---------------------------------------------------------------------- >Comment By: Thomas Wouters (twouters) Date: 2001-06-18 07:06 Message: Logged In: YES user_id=34209 Looks fine, except for one thing: it changes 'dnl' to 'setdnl' in one spot. 'setdnl' isn't a standard M4 directive, to my knowledge. Is that a typo ? I didn't actually test the patch on an OSX box, though, as I assume Jack already did that :) But, Jack, I do have two colleagues with OSX boxes, and I already have an account on one, so if you want, I can take the time to test it, or other stuff. I'll need some pointers first, though, because last time I tried to compile python on that box it took me four hours to figure out how to make it stop whining when running cofniguer, let alone make ;-) ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2001-06-07 13:09 Message: Logged In: YES user_id=31435 Assigned to Thomas because he's shown previous signs of knowing how to spell "configure" <0.9 wink>.
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=426746&group_id=5470 From noreply@sourceforge.net Mon Jun 18 15:13:28 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 18 Jun 2001 07:13:28 -0700 Subject: [Patches] [ python-Patches-426746 ] Infrastructure for getting MacPython modules working on OSX Message-ID: Patches item #426746, was updated on 2001-05-23 13:29 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=426746&group_id=5470 Category: Build Group: None Status: Open Resolution: Accepted Priority: 5 Submitted By: Jack Jansen (jackjansen) >Assigned to: Jack Jansen (jackjansen) Summary: Infrastructure for getting MacPython modules working on OSX Initial Comment: Here are a couple of patches that optionally (on MacOSX) enable a bit of extra infrastructure in the Python core, to allow various (MacPython-originated) dynamic extension modules to be built. Here's what I patched: - Added a MACHDEP_OBJS variable to Makefile.pre.in and configure.in. This allows platforms to include platform-specific source files in the core build. - Added (using MACHDEP_OBJS) a macglue.c file to the build, which contains glue code that allows Mac extension modules to refer to each other while being in separate dynamically loaded modules, plus a couple of utility routines. There are also a few changes to LDFLAGS to get the object file incorporated (as it is otherwise optimized away because the rest of Python doesn't refer to it). - Added a config.h.in define USE_TOOLBOX_OBJECT_GLUE which enables the glue code mentioned above (which isn't needed in MacPython, only in Mach-O Python). Possibly the latter two should be dependent on a configure switch (--with-mac-toolbox-modules?)
but (a) I think the added memory footprint is minimal and (b) I never understood how to add configure switches:-) A setup.py patch will follow, but I'm still testing it. ---------------------------------------------------------------------- Comment By: Thomas Wouters (twouters) Date: 2001-06-18 07:06 Message: Logged In: YES user_id=34209 Looks fine, except for one thing: it changes 'dnl' to 'setdnl' in one spot. 'setdnl' isn't a standard M4 directive, to my knowledge. Is that a typo ? I didn't actually test the patch on an OSX box, though, as I assume Jack already did that :) But, Jack, I do have two colleagues with OSX boxes, and I already have an account on one, so if you want, I can take the time to test it, or other stuff. I'll need some pointers first, though, because last time I tried to compile python on that box it took me four hours to figure out how to make it stop whining when running cofniguer, let alone make ;-) ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2001-06-07 13:09 Message: Logged In: YES user_id=31435 Assigned to Thomas because he's shown previous signs of knowing how to spell "configure" <0.9 wink>. 
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=426746&group_id=5470 From noreply@sourceforge.net Tue Jun 19 12:11:55 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Tue, 19 Jun 2001 04:11:55 -0700 Subject: [Patches] [ python-Patches-426746 ] Infrastructure for getting MacPython modules working on OSX Message-ID: Patches item #426746, was updated on 2001-05-23 13:29 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=426746&group_id=5470 Category: Build Group: None Status: Open Resolution: Accepted Priority: 5 Submitted By: Jack Jansen (jackjansen) Assigned to: Jack Jansen (jackjansen) Summary: Infrastructure for getting MacPython modules working on OSX Initial Comment: Here are a couple of patches that optionally (on MacOSX) enable a bit of extra infrastructure in the Python core, to allow various (MacPython-originated) dynamic extension modules to be built. Here's what I patched: - Added a MACHDEP_OBJS variable to Makefile.pre.in and configure.in. This allows platforms to include platform-specific source files in the core build. - Added (using MACHDEP_OBJS) a macglue.c file to the build, which contains glue code that allows Mac extension modules to refer to each other while being in separate dynamically loaded modules, plus a couple of utility routines. There are also a few changes to LDFLAGS to get the object file incorporated (as it is otherwise optimized away because the rest of Python doesn't refer to it). - Added a config.h.in define USE_TOOLBOX_OBJECT_GLUE which enables the glue code mentioned above (which isn't needed in MacPython, only in Mach-O Python). Possibly the latter two should be dependent on a configure switch (--with-mac-toolbox-modules?)
but (a) I think the added memory footprint is minimal and (b) I never understood how to add configure switches:-) A setup.py patch will follow, but I'm still testing it. ---------------------------------------------------------------------- >Comment By: Jack Jansen (jackjansen) Date: 2001-06-19 04:11 Message: Logged In: YES user_id=45365 I have absolutely no idea where the dnl/setdnl mod came from. Throw it out, please. Also, I'm a bit unsure about the next step: do you check the patch in or do I? ---------------------------------------------------------------------- Comment By: Thomas Wouters (twouters) Date: 2001-06-18 07:06 Message: Logged In: YES user_id=34209 Looks fine, except for one thing: it changes 'dnl' to 'setdnl' in one spot. 'setdnl' isn't a standard M4 directive, to my knowledge. Is that a typo ? I didn't actually test the patch on an OSX box, though, as I assume Jack already did that :) But, Jack, I do have two colleagues with OSX boxes, and I already have an account on one, so if you want, I can take the time to test it, or other stuff. I'll need some pointers first, though, because last time I tried to compile python on that box it took me four hours to figure out how to make it stop whining when running cofniguer, let alone make ;-) ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2001-06-07 13:09 Message: Logged In: YES user_id=31435 Assigned to Thomas because he's shown previous signs of knowing how to spell "configure" <0.9 wink>. 
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=426746&group_id=5470 From noreply@sourceforge.net Tue Jun 19 12:13:06 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Tue, 19 Jun 2001 04:13:06 -0700 Subject: [Patches] [ python-Patches-426746 ] Infrastructure for getting MacPython modules working on OSX Message-ID: Patches item #426746, was updated on 2001-05-23 13:29 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=426746&group_id=5470 Category: Build Group: None Status: Open Resolution: Accepted Priority: 5 Submitted By: Jack Jansen (jackjansen) Assigned to: Jack Jansen (jackjansen) Summary: Infrastructure for getting MacPython modules working on OSX Initial Comment: Here are a couple of patches that optionally (on MacOSX) enable a bit of extra infrastructure in the Python core, to allow various (MacPython-originated) dynamic extension modules to be built. Here's what I patched: - Added a MACHDEP_OBJS variable to Makefile.pre.in and configure.in. This allows platforms to include platform-specific source files in the core build. - Added (using MACHDEP_OBJS) a macglue.c file to the build, which contains glue code that allows Mac extension modules to refer to each other while being in separate dynamically loaded modules, plus a couple of utility routines. There are also a few changes to LDFLAGS to get the object file incorporated (as it is otherwise optimized away because the rest of Python doesn't refer to it). - Added a config.h.in define USE_TOOLBOX_OBJECT_GLUE which enables the glue code mentioned above (which isn't needed in MacPython, only in Mach-O Python). Possibly the latter two should be dependent on a configure switch (--with-mac-toolbox-modules?)
but (a) I think the added memory footprint is minimal and (b) I never understood how to add configure switches:-) A setup.py patch will follow, but I'm still testing it. ---------------------------------------------------------------------- >Comment By: Jack Jansen (jackjansen) Date: 2001-06-19 04:13 Message: Logged In: YES user_id=45365 Ah, silly me, it's assigned to me, so I'll check it in (after removing the dnl stuff). ---------------------------------------------------------------------- Comment By: Jack Jansen (jackjansen) Date: 2001-06-19 04:11 Message: Logged In: YES user_id=45365 I have absolutely no idea where the dnl/setdnl mod came from. Throw it out, please. Also, I'm a bit unsure about the next step: do you check the patch in or do I? ---------------------------------------------------------------------- Comment By: Thomas Wouters (twouters) Date: 2001-06-18 07:06 Message: Logged In: YES user_id=34209 Looks fine, except for one thing: it changes 'dnl' to 'setdnl' in one spot. 'setdnl' isn't a standard M4 directive, to my knowledge. Is that a typo ? I didn't actually test the patch on an OSX box, though, as I assume Jack already did that :) But, Jack, I do have two colleagues with OSX boxes, and I already have an account on one, so if you want, I can take the time to test it, or other stuff. I'll need some pointers first, though, because last time I tried to compile python on that box it took me four hours to figure out how to make it stop whining when running cofniguer, let alone make ;-) ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2001-06-07 13:09 Message: Logged In: YES user_id=31435 Assigned to Thomas because he's shown previous signs of knowing how to spell "configure" <0.9 wink>.
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=426746&group_id=5470 From noreply@sourceforge.net Tue Jun 19 21:10:01 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Tue, 19 Jun 2001 13:10:01 -0700 Subject: [Patches] [ python-Patches-430030 ] Avoid multiple BOMs in UTF-16 streams Message-ID: Patches item #430030, was updated on 2001-06-04 09:59 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=430030&group_id=5470 Category: library Group: None >Status: Closed >Resolution: Accepted Priority: 5 Submitted By: Martin v. Löwis (loewis) Assigned to: M.-A. Lemburg (lemburg) Summary: Avoid multiple BOMs in UTF-16 streams Initial Comment: This patch fixes the UTF-16 reader and writer to emit and expect the BOM only at the beginning of the stream. It is implemented by changing the encode/decode function of the stream object after the byte order is detected. In addition, it adds a new test case test_codecs. When committing the patch, the corresponding output file must be generated. ---------------------------------------------------------------------- >Comment By: M.-A. Lemburg (lemburg) Date: 2001-06-19 13:10 Message: Logged In: YES user_id=38388 Checked in a slightly modified version of the patch. Thanks. 
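The approach described in the initial comment above (swap the stream's encode/decode function once the byte order is known) can be sketched as follows. This is a minimal illustration in modern Python, not the code that was checked in: the class names `Utf16OnceWriter` and `Utf16OnceReader` are hypothetical, and the reader assumes each `read()` returns whole 16-bit code units.

```python
import codecs
import io
import sys


class Utf16OnceWriter:
    """Hypothetical writer: emits the BOM on the first write only."""

    def __init__(self, stream):
        self.stream = stream
        self._bom_written = False
        # The generic "utf-16" encoder prepends a BOM to its output.
        self._encode = codecs.getencoder("utf-16")

    def write(self, text):
        data, _ = self._encode(text)
        self.stream.write(data)
        if not self._bom_written:
            self._bom_written = True
            # Switch to the BOM-less encoder matching the native order
            # that the generic codec used for the first chunk.
            name = "utf-16-le" if sys.byteorder == "little" else "utf-16-be"
            self._encode = codecs.getencoder(name)


class Utf16OnceReader:
    """Hypothetical reader: expects the BOM at the start of the stream."""

    def __init__(self, stream):
        self.stream = stream
        self._decode = None  # chosen once the byte order is detected

    def read(self, size=-1):
        data = self.stream.read(size)
        if not data:
            return ""
        if self._decode is None:
            if data.startswith(codecs.BOM_UTF16_LE):
                name = "utf-16-le"
            elif data.startswith(codecs.BOM_UTF16_BE):
                name = "utf-16-be"
            else:
                raise UnicodeError("UTF-16 stream does not start with a BOM")
            data = data[2:]  # drop the BOM itself
            self._decode = codecs.getdecoder(name)
        text, _ = self._decode(data)
        return text


buffer = io.BytesIO()
writer = Utf16OnceWriter(buffer)
writer.write("ab")
writer.write("cd")
print(Utf16OnceReader(io.BytesIO(buffer.getvalue())).read())
```

A production codec would also have to buffer incomplete code units between reads; the sketch leaves that out to keep the function-swapping idea visible.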
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=430030&group_id=5470 From noreply@sourceforge.net Tue Jun 19 21:23:52 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Tue, 19 Jun 2001 13:23:52 -0700 Subject: [Patches] [ python-Patches-426746 ] Infrastructure for getting MacPython modules working on OSX Message-ID: Patches item #426746, was updated on 2001-05-23 13:29 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=426746&group_id=5470 Category: Build Group: None >Status: Closed Resolution: Accepted Priority: 5 Submitted By: Jack Jansen (jackjansen) Assigned to: Jack Jansen (jackjansen) Summary: Infrastructure for getting MacPython modules working on OSX Initial Comment: Here are a couple of patches that optionally (on MacOSX) enable a bit of extra infrastructure in the Python core, to allow various (MacPython-originated) dynamic extension modules to be built. Here's what I patched: - Added a MACHDEP_OBJS variable to Makefile.pre.in and configure.in. This allows platforms to include platform-specific source files in the core build. - Added (using MACHDEP_OBJS) a macglue.c file to the build, which contains glue code that allows Mac extension modules to refer to each other while being in separate dynamically loaded modules, plus a couple of utility routines. There are also a few changes to LDFLAGS to get the object file incorporated (as it is otherwise optimized away because the rest of Python doesn't refer to it). - Added a config.h.in define USE_TOOLBOX_OBJECT_GLUE which enables the glue code mentioned above (which isn't needed in MacPython, only in Mach-O Python). Possibly the latter two should be dependent on a configure switch (--with-mac-toolbox-modules?)
but (a) I think the added memory footprint is minimal and (b) I never understood how to add configure switches:-) A setup.py patch will follow, but I'm still testing it. ---------------------------------------------------------------------- >Comment By: Jack Jansen (jackjansen) Date: 2001-06-19 13:23 Message: Logged In: YES user_id=45365 ---------------------------------------------------------------------- Comment By: Jack Jansen (jackjansen) Date: 2001-06-19 04:13 Message: Logged In: YES user_id=45365 Ah, silly me, it's assigned to me, so I'll check it in (after removing the dnl stuff). ---------------------------------------------------------------------- Comment By: Jack Jansen (jackjansen) Date: 2001-06-19 04:11 Message: Logged In: YES user_id=45365 I have absolutely no idea where the dnl/setdnl mod came from. Throw it out, please. Also, I'm a bit unsure about the next step: do you check the patch in or do I? ---------------------------------------------------------------------- Comment By: Thomas Wouters (twouters) Date: 2001-06-18 07:06 Message: Logged In: YES user_id=34209 Looks fine, except for one thing: it changes 'dnl' to 'setdnl' in one spot. 'setdnl' isn't a standard M4 directive, to my knowledge. Is that a typo ? I didn't actually test the patch on an OSX box, though, as I assume Jack already did that :) But, Jack, I do have two colleagues with OSX boxes, and I already have an account on one, so if you want, I can take the time to test it, or other stuff. I'll need some pointers first, though, because last time I tried to compile python on that box it took me four hours to figure out how to make it stop whining when running cofniguer, let alone make ;-) ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2001-06-07 13:09 Message: Logged In: YES user_id=31435 Assigned to Thomas because he's shown previous signs of knowing how to spell "configure" <0.9 wink>.
----------------------------------------------------------------------

You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=426746&group_id=5470

From noreply@sourceforge.net Fri Jun 22 09:42:42 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Fri, 22 Jun 2001 01:42:42 -0700
Subject: [Patches] [ python-Patches-435381 ] Distutils patches for OS/2+EMX support
Message-ID:

Patches item #435381, was updated on 2001-06-22 01:42
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=435381&group_id=5470

Category: distutils
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Andrew I MacIntyre (aimacintyre)
Assigned to: Nobody/Anonymous (nobody)
Summary: Distutils patches for OS/2+EMX support

Initial Comment:
The attached patch file is against the code released with Python 2.1.

The changes are included in the binary installation package of the OS/2+EMX port of Python 2.1 I released on June 17, 2001.

With these changes, I have successfully built and installed the Numeric 20.0.0 extension, and created a bdist_dumb distribution package of it. The installed extension tests successfully using the supplied test routines.

Particular items of note:
- OS/2 limits DLLs to 8.3 filenames :-( :-(
- ld is not used to link; instead gcc is used with the -Zomf option, which invokes the LINK386 linker native to OS/2
- I haven't made any attempt to merge cloned code back into the parent code where it would make sense, which I think is in a few places.

Feedback appreciated.
----------------------------------------------------------------------

You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=435381&group_id=5470

From noreply@sourceforge.net Fri Jun 22 16:47:44 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Fri, 22 Jun 2001 08:47:44 -0700
Subject: [Patches] [ python-Patches-435492 ] tempnam(),tmpfile(),tmpnam() for Windows
Message-ID:

Patches item #435492, was updated on 2001-06-22 08:47
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=435492&group_id=5470

Category: Windows
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Fred L. Drake, Jr. (fdrake)
Assigned to: Tim Peters (tim_one)
Summary: tempnam(),tmpfile(),tmpnam() for Windows

Initial Comment:
This patch makes os.tempnam(), os.tmpfile(), and os.tmpnam() available on Windows. (And yes, I tested that the Windows version still compiles!)

A user noted that the documentation did not indicate constrained availability, but these functions were not available -- apparently he was running Windows or MacOS.
----------------------------------------------------------------------

You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=435492&group_id=5470

From noreply@sourceforge.net Fri Jun 22 19:41:46 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Fri, 22 Jun 2001 11:41:46 -0700
Subject: [Patches] [ python-Patches-431422 ] "print" not emitting POP_TOP
Message-ID:

Patches item #431422, was opened at 2001-06-08 08:54
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=431422&group_id=5470

Category: Parser/Compiler
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Shane Hathaway (hathawsh)
Assigned to: Nobody/Anonymous (nobody)
>Summary: "print" not emitting POP_TOP

Initial Comment:
The Python-based compiler module (in Tools) has a bug in the visitPrint() method of pycodegen.CodeGenerator. It does not emit a trailing POP_TOP instruction, which AFAICT it should emit only when outputting to a stream and there is a trailing comma (indicating no newline).

I've attached the patch applied to Zope's RestrictedPython module; if there is anything incorrect about it please tell me right away. Otherwise please apply the patch to Tools/compiler/pycodegen.py.

----------------------------------------------------------------------

>Comment By: Shane Hathaway (hathawsh)
Date: 2001-06-22 11:41

Message:
Logged In: YES
user_id=16701

Oops, it turns out the patch is incorrect! POP_TOP should only be added after the *last* print node. Here are the revised visitPrint() and visitPrintnl() methods. This is what is being used in Zope now.

    def visitPrint(self, node, newline=0):
        self.set_lineno(node)
        if node.dest:
            self.visit(node.dest)
        for child in node.nodes:
            if node.dest:
                self.emit('DUP_TOP')
            self.visit(child)
            if node.dest:
                self.emit('ROT_TWO')
                self.emit('PRINT_ITEM_TO')
            else:
                self.emit('PRINT_ITEM')
        if node.dest and not newline:
            self.emit('POP_TOP')

    def visitPrintnl(self, node):
        self.visitPrint(node, 1)
        if node.dest:
            self.emit('PRINT_NEWLINE_TO')
        else:
            self.emit('PRINT_NEWLINE')

----------------------------------------------------------------------

You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=431422&group_id=5470

From noreply@sourceforge.net Fri Jun 22 21:51:02 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Fri, 22 Jun 2001 13:51:02 -0700
Subject: [Patches] [ python-Patches-432401 ] unicode encoding error callbacks
Message-ID:

Patches item #432401, was opened at 2001-06-12 06:43
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=432401&group_id=5470

Category: core (C code)
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Walter Dörwald (doerwalter)
Assigned to: M.-A. Lemburg (lemburg)
Summary: unicode encoding error callbacks

Initial Comment:
This patch adds unicode error handling callbacks to the encode functionality. With this patch it's possible to not only pass 'strict', 'ignore' or 'replace' as the errors argument to encode, but also a callable function, that will be called with the encoding name, the original unicode object and the position of the unencodable character. The callback must return a replacement unicode object that will be encoded instead of the original character.

For example replacing unencodable characters with XML character references can be done in the following way.

    u"aäoöuüß".encode(
        "ascii",
        lambda enc, uni, pos: u"&#x%x;" % ord(uni[pos])
    )

----------------------------------------------------------------------

>Comment By: M.-A.
Lemburg (lemburg) Date: 2001-06-22 13:51 Message: Logged In: YES user_id=38388 Sorry to keep you waiting, Walter. I will look into this again next week -- this week was way too busy... ---------------------------------------------------------------------- Comment By: M.-A. Lemburg (lemburg) Date: 2001-06-13 10:00 Message: Logged In: YES user_id=38388 On your comment about the non-Unicode codecs: let's keep this separated from the current patch. Don't have much time today. I'll comment on the other things tomorrow. ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2001-06-13 08:49 Message: Logged In: YES user_id=89016 Guido van Rossum wrote in python-dev: > True, the "codec" pattern can be used for other > encodings than Unicode. But it seems to me that the > entire codecs architecture is rather strongly geared > towards en/decoding Unicode, and it's not clear > how well other codecs fit in this pattern (e.g. I > noticed that all the non-Unicode codecs ignore the > error handling parameter or assert that > it is set to 'strict'). I noticed that too. asserting that errors=='strict' would mean that the encoder is not able to deal in any other way with unencodable stuff than by raising an error. But that is not the problem here, because for zlib, base64, quopri, hex and uu encoding there can be no unencodable characters. The encoders can simply ignore the errors parameter. Should I remove the asserts from those codecs and change the docstrings accordingly, or will this be done separately? ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2001-06-13 06:57 Message: Logged In: YES user_id=89016 > > [...] > > raise an exception). U+FFFD characters in the replacement > > string will be replaced with a character that the encoder > > chooses ('?' in all cases). > > Nice. 
But the special casing of U+FFFD makes the interface somewhat less clean than it could be. It was only done to be 100% backwards compatible. With the original "replace" error handling the codec chose the replacement character. But as far as I can tell none of the codecs uses anything other than '?', so I guess we could change the replace handler to always return u'?'. This would make the implementation a little bit simpler, but the explanation of the callback feature *a lot* simpler. And if you still want to handle an unencodable U+FFFD, you can write a special callback for that, e.g. def FFFDreplace(enc, uni, pos): if uni[pos] == "\ufffd": return u"?" else: raise UnicodeError(...) > > The implementation of the loop through the string is done > > in the following way. A stack with two strings is kept > > and the loop always encodes a character from the string > > at the stacktop. If an error is encountered and the stack > > has only one entry (during encoding of the original string) > > the callback is called and the unicode object returned is > > pushed on the stack, so the encoding continues with the > > replacement string. If the stack has two entries when an > > error is encountered, the replacement string itself has > > an unencodable character and a normal exception raised. > > When the encoder has reached the end of it's current string > > there are two possibilities: when the stack contains two > > entries, this was the replacement string, so the replacement > > string will be poppep from the stack and encoding continues > > with the next character from the original string. If the > > stack had only one entry, encoding is finished. > > Very elegant solution ! I'll put it as a comment in the source. > > (I hope that's enough explanation of the API and > implementation) > > Could you add these docs to the Misc/unicode.txt file ? I > will eventually take that file and turn it into a PEP which > will then serve as general documentation for these things. 
I could, but first we should work out how the decoding callback API will work. > > I have renamed the static ...121 function to all lowercase > > names. > > Ok. > > > BTW, I guess PyUnicode_EncodeUnicodeEscape could be > > reimplemented as PyUnicode_EncodeASCII with a \uxxxx > > replacement callback. > > Hmm, wouldn't that result in a slowdown ? If so, I'd rather > leave the special encoder in place, since it is being used a > lot in Python and probably some applications too. It would be a slowdown. But callbacks open many possiblities. For example: Why can't I print u"gürk"? is probably one of the most frequently asked questions in comp.lang.python. For printing Unicode stuff, print could be extended the use an error handling callback for Unicode strings (or objects where __str__ or tp_str returns a Unicode object) instead of using str() which always returns an 8bit string and uses strict encoding. There might even be a sys.setprintencodehandler()/sys.getprintencodehandler() > [...] > I think it would be worthwhile to rename the callbacks to > include "Unicode" somewhere, e.g. > PyCodec_UnicodeReplaceEncodeErrors(). It's a long name, but > then it points out the application field of the callback > rather well. Same for the callbacks exposed through the > _codecsmodule. OK, done (and PyCodec_XMLCharRefReplaceUnicodeEncodeErrors really is a long name ;)) > > I have not touched PyUnicode_TranslateCharmap yet, > > should this function also support error callbacks? Why > > would one want the insert None into the mapping to call > > the callback? > > 1. Yes. > 2. The user may want to e.g. restrict usage of certain > character ranges. In this case the codec would be used to > verify the input and an exception would indeed be useful > (e.g. say you want to restrict input to Hangul + ASCII). OK, do we want TranslateCharmap to work exactly like encoding, i.e. 
in case of an error should the returned replacement string again be mapped through the translation mapping or should it be copied to the output directly? The former would be more in line with encoding, but IMHO the latter would be much more useful. BTW, when I implement it I can implement patch #403100 ("Multicharacter replacements in PyUnicode_TranslateCharmap") along the way. Should the old TranslateCharmap map to the new TranslateCharmapEx and inherit the "multicharacter replacement" feature, or should I leave it as it is? > > A remaining problem is how to implement decoding error > > callbacks. In Python 2.1 encoding and decoding errors are > > handled in the same way with a string value. But with > > callbacks it doesn't make sense to use the same callback > > for encoding and decoding (like codecs.StreamReaderWriter > > and codecs.StreamRecoder do). Decoding callbacks have a > > different API. Which arguments should be passed to the > > decoding callback, and what is the decoding callback > > supposed to do? > > I'd suggest adding another set of PyCodec_UnicodeDecode... () > APIs for this. We'd then have to augment the base classes of > the StreamCodecs to provide two attributes for .errors with > a fallback solution for the string case (i.s. "strict" can > still be used for both directions). Sounds good. Now what is the decoding callback supposed to do? I guess it will be called in the same way as the encoding callback, i.e. with encoding name, original string and position of the error. It might returns a Unicode string (i.e. an object of the decoding target type), that will be emitted from the codec instead of the one offending byte. Or it might return a tuple with replacement Unicode object and a resynchronisation offset, i.e. returning (u"?", 1) means emit a '?' and skip the offending character. 
But to make the offset really useful the callback has to know something about the encoding, perhaps the codec should be allowed to pass an additional state object to the callback? Maybe the same should be added to the encoding callbacks to? Maybe the encoding callback should be able to tell the encoder if the replacement returned should be reencoded (in which case it's a Unicode object), or directly emitted (in which case it's an 8bit string)? > > One additional note: It is vital that errors is an > > assignable attribute of the StreamWriter. > > It is already ! I know, but IMHO it should be documented that an assignable errors attribute must be supported as part of the official codec API. Misc/unicode.txt is not clear on that: """ It is not required by the Unicode implementation to use these base classes, only the interfaces must match; this allows writing Codecs as extension types. """ ---------------------------------------------------------------------- Comment By: M.-A. Lemburg (lemburg) Date: 2001-06-13 01:05 Message: Logged In: YES user_id=38388 > How the callbacks work: > > A PyObject * named errors is passed in. This may by NULL, > Py_None, 'strict', u'strict', 'ignore', u'ignore', > 'replace', u'replace' or a callable object. > PyCodec_EncodeHandlerForObject maps all of these objects to > one of the three builtin error callbacks > PyCodec_RaiseEncodeErrors (raises an exception), > PyCodec_IgnoreEncodeErrors (returns an empty replacement > string, in effect ignoring the error), > PyCodec_ReplaceEncodeErrors (returns U+FFFD, the Unicode > replacement character to signify to the encoder that it > should choose a suitable replacement character) or directly > returns errors if it is a callable object. 
When an > unencodable character is encounterd the error handling > callback will be called with the encoding name, the original > unicode object and the error position and must return a > unicode object that will be encoded instead of the offending > character (or the callback may of course raise an > exception). U+FFFD characters in the replacement string will > be replaced with a character that the encoder chooses ('?' > in all cases). Nice. > The implementation of the loop through the string is done in > the following way. A stack with two strings is kept and the > loop always encodes a character from the string at the > stacktop. If an error is encountered and the stack has only > one entry (during encoding of the original string) the > callback is called and the unicode object returned is pushed > on the stack, so the encoding continues with the replacement > string. If the stack has two entries when an error is > encountered, the replacement string itself has an > unencodable character and a normal exception raised. When > the encoder has reached the end of it's current string there > are two possibilities: when the stack contains two entries, > this was the replacement string, so the replacement string > will be poppep from the stack and encoding continues with > the next character from the original string. If the stack > had only one entry, encoding is finished. Very elegant solution ! > (I hope that's enough explanation of the API and implementation) Could you add these docs to the Misc/unicode.txt file ? I will eventually take that file and turn it into a PEP which will then serve as general documentation for these things. > I have renamed the static ...121 function to all lowercase > names. Ok. > BTW, I guess PyUnicode_EncodeUnicodeEscape could be > reimplemented as PyUnicode_EncodeASCII with a \uxxxx > replacement callback. Hmm, wouldn't that result in a slowdown ? 
If so, I'd rather leave the special encoder in place, since it is being used a lot in Python and probably some applications too. > PyCodec_RaiseEncodeErrors, PyCodec_IgnoreEncodeErrors, > PyCodec_ReplaceEncodeErrors are globally visible because > they have to be available in _codecsmodule.c to wrap them as > Python function objects, but they can't be implemented in > _codecsmodule, because they need to be available to the > encoders in unicodeobject.c (through > PyCodec_EncodeHandlerForObject), but importing the codecs > module might result in an endless recursion, because > importing a module requires unpickling of the bytecode, > which might require decoding utf8, which ... (but this will > only happen, if we implement the same mechanism for the > decoding API) I think that codecs.c is the right place for these APIs. _codecsmodule.c is only meant as Python access wrapper for the internal codecs and nothing more. One thing I noted about the callbacks: they assume that they will always get Unicode objects as input. This is certainly not true in the general case (it is for the codecs you touch in the patch). I think it would be worthwhile to rename the callbacks to include "Unicode" somewhere, e.g. PyCodec_UnicodeReplaceEncodeErrors(). It's a long name, but then it points out the application field of the callback rather well. Same for the callbacks exposed through the _codecsmodule. > I have not touched PyUnicode_TranslateCharmap yet, > should this function also support error callbacks? Why would > one want the insert None into the mapping to call the callback? 1. Yes. 2. The user may want to e.g. restrict usage of certain character ranges. In this case the codec would be used to verify the input and an exception would indeed be useful (e.g. say you want to restrict input to Hangul + ASCII). > A remaining problem is how to implement decoding error > callbacks. In Python 2.1 encoding and decoding errors are > handled in the same way with a string value. 
But with > callbacks it doesn't make sense to use the same callback for > encoding and decoding (like codecs.StreamReaderWriter and > codecs.StreamRecoder do). Decoding callbacks have a > different API. Which arguments should be passed to the > decoding callback, and what is the decoding callback > supposed to do? I'd suggest adding another set of PyCodec_UnicodeDecode...() APIs for this. We'd then have to augment the base classes of the StreamCodecs to provide two attributes for .errors with a fallback solution for the string case (i.s. "strict" can still be used for both directions). > One additional note: It is vital that errors is an > assignable attribute of the StreamWriter. It is already ! > Consider the XML example: For writing an XML DOM tree one > StreamWriter object is used. When a text node is written, > the error handling has to be set to > codecs.xmlreplace_encode_errors, but inside a comment or > processing instruction replacing unencodable characters with > charrefs is not possible, so here codecs.raise_encode_errors > should be used (or better a custom error handler that raises > an error that says "sorry, you can't have unencodable > characters inside a comment") Sure. > BTW, should we continue the discussion in the i18n SIG > mailing list? An email program is much more comfortable than > a HTML textarea! ;) I'd rather keep the discussions on this patch here -- forking it off to the i18n sig will make it very hard to follow up on it. (This HTML area is indeed damn small ;-) ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2001-06-12 12:18 Message: Logged In: YES user_id=89016 One additional note: It is vital that errors is an assignable attribute of the StreamWriter. Consider the XML example: For writing an XML DOM tree one StreamWriter object is used. 
When a text node is written, the error handling has to be set to codecs.xmlreplace_encode_errors, but inside a comment or processing instruction replacing unencodable characters with charrefs is not possible, so here codecs.raise_encode_errors should be used (or better a custom error handler that raises an error that says "sorry, you can't have unencodable characters inside a comment").

BTW, should we continue the discussion in the i18n SIG mailing list? An email program is much more comfortable than a HTML textarea! ;)

----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2001-06-12 11:59

Message:
Logged In: YES
user_id=89016

How the callbacks work:

A PyObject * named errors is passed in. This may be NULL, Py_None, 'strict', u'strict', 'ignore', u'ignore', 'replace', u'replace' or a callable object. PyCodec_EncodeHandlerForObject maps all of these objects to one of the three builtin error callbacks PyCodec_RaiseEncodeErrors (raises an exception), PyCodec_IgnoreEncodeErrors (returns an empty replacement string, in effect ignoring the error), PyCodec_ReplaceEncodeErrors (returns U+FFFD, the Unicode replacement character to signify to the encoder that it should choose a suitable replacement character) or directly returns errors if it is a callable object. When an unencodable character is encountered the error handling callback will be called with the encoding name, the original unicode object and the error position and must return a unicode object that will be encoded instead of the offending character (or the callback may of course raise an exception). U+FFFD characters in the replacement string will be replaced with a character that the encoder chooses ('?' in all cases).

The implementation of the loop through the string is done in the following way. A stack with two strings is kept and the loop always encodes a character from the string at the stacktop. If an error is encountered and the stack has only one entry (during encoding of the original string) the callback is called and the unicode object returned is pushed on the stack, so the encoding continues with the replacement string. If the stack has two entries when an error is encountered, the replacement string itself has an unencodable character and a normal exception is raised. When the encoder has reached the end of its current string there are two possibilities: when the stack contains two entries, this was the replacement string, so the replacement string will be popped from the stack and encoding continues with the next character from the original string. If the stack had only one entry, encoding is finished.

(I hope that's enough explanation of the API and implementation)

I have renamed the static ...121 function to all lowercase names.

BTW, I guess PyUnicode_EncodeUnicodeEscape could be reimplemented as PyUnicode_EncodeASCII with a \uxxxx replacement callback.

PyCodec_RaiseEncodeErrors, PyCodec_IgnoreEncodeErrors, PyCodec_ReplaceEncodeErrors are globally visible because they have to be available in _codecsmodule.c to wrap them as Python function objects, but they can't be implemented in _codecsmodule, because they need to be available to the encoders in unicodeobject.c (through PyCodec_EncodeHandlerForObject), but importing the codecs module might result in an endless recursion, because importing a module requires unpickling of the bytecode, which might require decoding utf8, which ... (but this will only happen, if we implement the same mechanism for the decoding API)

I have not touched PyUnicode_TranslateCharmap yet, should this function also support error callbacks? Why would one want to insert None into the mapping to call the callback?

A remaining problem is how to implement decoding error callbacks. In Python 2.1 encoding and decoding errors are handled in the same way with a string value.
But with callbacks it doesn't make sense to use the same callback for encoding and decoding (like codecs.StreamReaderWriter and codecs.StreamRecoder do). Decoding callbacks have a different API. Which arguments should be passed to the decoding callback, and what is the decoding callback supposed to do? ---------------------------------------------------------------------- Comment By: M.-A. Lemburg (lemburg) Date: 2001-06-12 11:00 Message: Logged In: YES user_id=38388 About the Py_UNICODE*data, int size APIs: Ok, point taken. In general, I think we ought to keep the callback feature as open as possible, so passing in pointers and sizes would not be very useful. BTW, could you summarize how the callback works in a few lines ? About _Encode121: I'd name this _EncodeUCS1 since that's what it is ;-) About the new functions: I was referring to the new static functions which you gave PyUnicode_... names. If these are not supposed to turn into non-static functions, I'd rather have them use lower case names (since that's how the Python internals work too -- most of the times). ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2001-06-12 09:56 Message: Logged In: YES user_id=89016 > One thing which I don't like about your API change is that > you removed the Py_UNICODE*data, int size style arguments > -- > this makes it impossible to use the new APIs on non-Python > data or data which is not available as Unicode object. Another problem is, that the callback requires a Python object, so in the PyObject *version, the refcount is incref'd and the object is passed to the callback. The Py_UNICODE*/int version would have to create a new Unicode object from the data. 
---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2001-06-12 09:32 Message: Logged In: YES user_id=89016 > * please don't place more than one C statement on one line > like in: > """ > + unicode = unicode2; unicodepos = > unicode2pos; > + unicode2 = NULL; unicode2pos = 0; > """ OK, done! > * Comments should start with a capital letter and be > prepended > to the section they apply to Fixed! > * There should be spaces between arguments in compares > (a == b) not (a==b) Fixed! > * Where does the name "...Encode121" originate ? encode one-to-one, it implements both ASCII and latin-1 encoding. > * module internal APIs should use lower case names (you > converted some of these to PyUnicode_...() -- this is > normally reserved for APIs which are either marked as > potential candidates for the public API or are very > prominent in the code) Which ones? I introduced a new function for every old one, that had a "const char *errors" argument, and a few new ones in codecs.h, of those PyCodec_EncodeHandlerForObject is vital, because it is used to map for old string arguments to the new function objects. PyCodec_RaiseEncodeErrors can be used in the encoder implementation to raise an encode error, but it could be made static in unicodeobject.h so only those encoders implemented there have access to it. > One thing which I don't like about your API change is that > you removed the Py_UNICODE*data, int size style arguments > -- > this makes it impossible to use the new APIs on non-Python > data or data which is not available as Unicode object. I look through the code and found no situation where the Py_UNICODE*/int version is really used and having two (PyObject *)s (the original and the replacement string), instead of UNICODE*/int and PyObject * made the implementation a little easier, but I can fix that. > Please separate the errors.c patch from this patch -- it > seems totally unrelated to Unicode. 
PyCodec_RaiseEncodeErrors uses this the have a \Uxxxx with four hex digits. I removed it. I'll upload a revised patch as soon as it's done. ---------------------------------------------------------------------- Comment By: M.-A. Lemburg (lemburg) Date: 2001-06-12 07:29 Message: Logged In: YES user_id=38388 Thanks for the patch -- it looks very impressive !. I'll give it a try later this week. Some first cosmetic tidbits: * please don't place more than one C statement on one line like in: """ + unicode = unicode2; unicodepos = unicode2pos; + unicode2 = NULL; unicode2pos = 0; """ * Comments should start with a capital letter and be prepended to the section they apply to * There should be spaces between arguments in compares (a == b) not (a==b) * Where does the name "...Encode121" originate ? * module internal APIs should use lower case names (you converted some of these to PyUnicode_...() -- this is normally reserved for APIs which are either marked as potential candidates for the public API or are very prominent in the code) One thing which I don't like about your API change is that you removed the Py_UNICODE*data, int size style arguments -- this makes it impossible to use the new APIs on non-Python data or data which is not available as Unicode object. Please separate the errors.c patch from this patch -- it seems totally unrelated to Unicode. Thanks. 
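
For illustration, the callback scheme discussed in this thread can be sketched in present-day Python. This is only a minimal sketch under stated assumptions, not the patch's C implementation: the names encode_with_callback and xml_charref are hypothetical, the callback takes (encoding name, original string, error position) as proposed in the initial comment, and encoding one character at a time is only reasonable for stateless encodings such as ASCII or Latin-1. As in the description of the two-entry stack, a replacement string that itself contains an unencodable character results in a normal exception.

```python
def encode_with_callback(text, encoding, callback):
    """Encode text, asking callback(encoding, text, pos) for a
    replacement string whenever a character cannot be encoded.

    Hypothetical helper: it mimics the callback signature proposed in
    the thread and is not an API of the codecs module.
    """
    out = bytearray()
    for pos, ch in enumerate(text):
        try:
            out += ch.encode(encoding)
        except UnicodeEncodeError:
            replacement = callback(encoding, text, pos)
            # The replacement is encoded strictly: if it contains an
            # unencodable character itself, the error propagates.
            out += replacement.encode(encoding)
    return bytes(out)


def xml_charref(enc, uni, pos):
    """Replace the offending character with an XML character reference."""
    return "&#x%x;" % ord(uni[pos])


result = encode_with_callback("aäoöuüß", "ascii", xml_charref)
print(result)  # b'a&#xe4;o&#xf6;u&#xfc;&#xdf;'
```

The sketch keeps the replacement step separate from the main loop on purpose, so that the "replacement must itself be encodable" rule is visible in one place.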
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=432401&group_id=5470 From noreply@sourceforge.net Sat Jun 23 21:00:06 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sat, 23 Jun 2001 13:00:06 -0700 Subject: [Patches] [ python-Patches-434992 ] Cleanup of warning messages Message-ID: Patches item #434992, was opened at 2001-06-20 18:51 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=434992&group_id=5470 >Category: None >Group: None Status: Open Resolution: None Priority: 5 Submitted By: Robert Minsk (rminsk) Assigned to: Nobody/Anonymous (nobody) Summary: Cleanup of warning messages Initial Comment: I just compiled Python-2.1 of the SGI using the latest compilers (7.3.1.2m) with all the warning flags turned on. The following patch will get rid of most of the warning messages. I would like to see this incorporated into the next release. It is easier to spot real problems when you do not have to sort thru other warning messages. The included patch does not include other optional modules and the ones setup.py finds by default. I may have found 2 bugs in the process. Please see bugs 434989 and 434988. ---------------------------------------------------------------------- >Comment By: Martin v. Löwis (loewis) Date: 2001-06-23 13:00 Message: Logged In: YES user_id=21627 Refiled as a patch. 
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=434992&group_id=5470 From noreply@sourceforge.net Sat Jun 23 21:25:20 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sat, 23 Jun 2001 13:25:20 -0700 Subject: [Patches] [ python-Patches-434992 ] Cleanup of warning messages Message-ID: Patches item #434992, was opened at 2001-06-20 18:51 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=434992&group_id=5470 Category: None Group: None Status: Open Resolution: None Priority: 5 Submitted By: Robert Minsk (rminsk) Assigned to: Nobody/Anonymous (nobody) Summary: Cleanup of warning messages Initial Comment: I just compiled Python-2.1 of the SGI using the latest compilers (7.3.1.2m) with all the warning flags turned on. The following patch will get rid of most of the warning messages. I would like to see this incorporated into the next release. It is easier to spot real problems when you do not have to sort thru other warning messages. The included patch does not include other optional modules and the ones setup.py finds by default. I may have found 2 bugs in the process. Please see bugs 434989 and 434988. ---------------------------------------------------------------------- >Comment By: Martin v. Löwis (loewis) Date: 2001-06-23 13:25 Message: Logged In: YES user_id=21627 I'm not sure these patches are all correct. For the patches introducing prototypes (e.g. tigetstr), isn't there some header file that offers these prototypes? For cPickle, looking up string_atol seems to be completely unneeded. In turn, looking up string is unneeded, as well. Likewise, don't just remove empty_str, remove the lookup as well. On the save_float changes, you mask a range error: the values will be in 0..255, but you cast this value to char, which is potentially signed. 
I think p should be unsigned char*, and the casts should then be adjusted to unsigned. Since cPickle changes will need careful review, I recommend to submit them as a separate patch. Why is it necessary to cast the result of umask? Please put a comment in the code, explaining that in detail (i.e. "required for SGI" is not sufficient). Likewise for alarm (which returns long on Linux), and all other casts that were introduced to convert system call results or arguments. ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2001-06-23 13:00 Message: Logged In: YES user_id=21627 Refiled as a patch. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=434992&group_id=5470 From noreply@sourceforge.net Sun Jun 24 22:59:38 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sun, 24 Jun 2001 14:59:38 -0700 Subject: [Patches] [ python-Patches-401196 ] IPv6 patch against 2.0 CVS tree, as of 20010624 Message-ID: Patches item #401196, was opened at 2000-08-16 05:53 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=401196&group_id=5470 Category: core (C code) Group: None Status: Open Resolution: None Priority: 5 Submitted By: Jun-ichiro itojun Hagino (itojun) Assigned to: Nobody/Anonymous (nobody) >Summary: IPv6 patch against 2.0 CVS tree, as of 20010624 Initial Comment: ---------------------------------------------------------------------- >Comment By: Martin v. Löwis (loewis) Date: 2001-06-24 14:59 Message: Logged In: YES user_id=21627 I have uploaded a new version sent by itojun. ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2001-06-09 12:15 Message: Logged In: YES user_id=21627 On the API, I have the following comments: - Why is it necessary to introduce gethostbyname2? 
I recommend to give gethostbyname an optional argument for the address family. - getaddrinfo, when raising a socket error, should include the EAI_ error number. Perhaps there should be a way to distinguish EAI_ errnos from other errnos, e.g. by subclassing socket error. Otherwise, the API of the C part looks good to me. I haven't looked at the Lib part, yet. On the implementation: - I still have problems building the code. Currently, I get the following rejects: ./Lib/BaseHTTPServer.py.rej ./Lib/ftplib.py.rej ./Lib/poplib.py.rej ./Lib/smtplib.py.rej ./Modules/socketmodule.c.rej ./Objects/fileobject.c.rej - The fileobject.c chunk seems to be unnecessary. - On the test problem: It occurs in + test -d -a -f /lib.a ./configure: test: too many arguments which comes from ipv6libdir and ipv6libdir being empty. - The WIDE files should be included in the Modules directory, as they are only used from socketmodule.c. In particular, addrinfo.h should not be installed. - If you can, please include a patch to Doc/lib/libsocket.tex. If not, I will try to draft one. ---------------------------------------------------------------------- Comment By: Nobody/Anonymous (nobody) Date: 2001-05-30 10:34 Message: Logged In: NO i looked at python-dev email. the proposal (split patches) looks fine, but the exact example given in python-dev email is not reasonable. i cannot just send out configure.in change separately from source code changes, period. i can split patches for *.py files separately though. there's more important issue, which is, API changes for Socket class. i really hoped to get some comment on that part. i really appreciate your comments. i would like to propose that once we nailed down API changes, integrate the patch into the tree. with all #ifdef INET6 in place there should be no impact on IPv4-only builds. i have trouble tracking python development (i'm not a sourceforge expert!), so forgive me for delays in patch submissions.
---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2001-05-18 08:29 Message: Logged In: YES user_id=6380 See http://mail.python.org/pipermail/python-dev/2001-May/014889.html for comments from MvL. I'm unassigning this from Fred, he has nothing to do with this. ---------------------------------------------------------------------- Comment By: Jun-ichiro itojun Hagino (itojun) Date: 2001-02-26 02:24 Message: Logged In: YES user_id=63767 about /usr/bin/test argument: does linux /usr/bin/test have -d support? if not, we may need to change configure.in slightly. you are correct that fallback getaddrinfo/getnameinfo.c was missing in the patch. sorry. a question i need to ask is, do we need to supply Python function Socket.getaddrinfo on platforms that do not have getaddrinfo(3)? HAVE_ADDRINFO is used in Include/addrinfo.h, which is also missing in the patch set i have submitted. i've put the missing files into http://www.itojun.org/diary/20001230/missing.shar. ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2001-02-23 23:58 Message: After a shallow review of this patch, I found the following issues: configure.in does not need to list both enable and disable options. When running configure, I got the following error message on Linux checking whether to enable ipv6... yes checking ipv6 stack type... linux-glibc ./configure: test: too many arguments using libc The call to /usr/bin/test should be corrected; I could not find out which specific invocation caused the problem. HAVE_ADDRINFO is not used. Perhaps getaddrinfo.c/getnameinfo.c is missing in the patch? ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2001-01-04 07:51 Message: A new patch is available. I've changed the subject accordingly. 
Due to upload size restrictions, the patch is now at http://www.itojun.org/diary/20001230/python-2.0-v6-20001230.diff.gz ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2000-12-30 07:25 Message: I got *many* rejects when trying to apply this patch to today's CVS tree. I recommend that patches for generated files (config.h.in, configure) are not included in the patch because they outdate too easily. A number of changes in this patch have already been done by somebody else; others just don't fit into the current code anymore (perhaps due to indentation changes?). Anyway, I'll mark the patch as out-of-date. Please let me know when you upload a new version. ---------------------------------------------------------------------- Comment By: Fred L. Drake, Jr. (fdrake) Date: 2000-08-16 07:00 Message: Postponed until Python 2.1 -- there's not enough time to review this and get it sufficiently tested on enough IPv6-connected platforms in time for 2.0, and we're already in feature freeze. This should go into the tree very quickly once Python 2.0 has been released. Assigned to myself to open it back up after Python 2.0. ---------------------------------------------------------------------- Comment By: Moshe Zadka (moshez) Date: 2000-08-16 06:07 Message: Assigned to Tim, since he's in charge of postponing new features. I'm too timid to postpone it myself. ---------------------------------------------------------------------- Comment By: Jun-ichiro itojun Hagino (itojun) Date: 2000-08-16 05:59 Message: this is revised version of patch #101186 (now with my SourceForge account... i'm not familiar with the system here, so forgive my possible mistake). 1.6b1 patch applied mostly clean to 2.0.
It is confirmed that: - 1.6b1 + IPv6 patch works fine on NetBSD 1.4.2 + KAME, and NetBSD 1.5 - 1.6b1 + IPv6 patch works fine on NetBSD 1.4.2 (NOT an IPv6 ready machine) - 2.0 CVS tree + IPv6 patch works fine on NetBSD + KAME forgot to attach the following into the diff - so i attach it (README.v6) here as comment. I have submitted the patch for 1.5.1, 1.5.2 and 1.6b1, all hit a bad timing - bad luck. contact: core@kame.net, or itojun@kame.net --- IPv6-ready python 1.6 KAME Project $KAME: README.v6,v 1.9 2000/08/15 02:40:38 itojun Exp $ This patchkit enables python 1.6 to perform AF_INET6 socket operations. The only affected module is Modules/socketmodule.c. Modules/socketmodule.c In most cases, IPv6 address can be placed where IPv4 address fits. sockaddr sockaddr tuple is formatted as follows: IPv4: (host, port) IPv6: socket class methods always generate (host, port, flowinfo, scopeid). socket class methods will accept 2, 3, or 4 tuple (for backward compatibility). Compatibility warning: Some of the scripts assume that the sockaddr structure is 2 tuple, like: host, port = sock.getpeername() this will fail if you are connected to IPv6 node. socket.getaddrinfo(host, port [, family, socktype, proto, flags]) host: String or None port: String, Int or None family, socktype, proto, flags: Int, can be omitted Perform getaddrinfo(3). Returns List of the following 5 tuple: (family, socktype, proto, canonname, sockaddr) family: Int socktype: Int proto: Int canonname: String sockaddr: sockaddr (see above) See Lib/httplib.py for typical usage on the client side. socket.getnameinfo(sockaddr, flags) sockaddr: sockaddr flags: Int Perform getnameinfo(3). Returns the following 2 tuple: host: String, numeric or hostname depending on flags port: String, numeric or portname depending on flags socket.gethostbyname2(host, af) host: String af: Int Performs gethostbyname2(3). Returns numeric address representation for "host".
socket.gethostbyaddr(addr) (behavior change if IPv6 support is compiled in) addr: String Performs gethostbyaddr(3). Returns string address representation for "addr". The function can take IPv6 numeric address as well. This behavior is not problematical, because - if you pass numeric "addr" parameter, we can always identify address family for it - return value is string address representation, where IPv6 and IPv4 are not distinguishable. socket.bind(sa), socket.connect(sa) and others. (No behavior change, but be careful) See above for sockaddr format change. With Python "addr" portion of sockaddr (first element) can be string hostname. When the string hostname resolved to numeric address, it will obey address family of the socket (which was specified when socket.socket() was called). If you give some string that does not give matching address family, you will get some error. s = socket.socket(socket.AF_INET, socket.SOCK_STREAM) # this is okay, if 'localhost' resolves to both IPv4/v6 s.connect('localhost', 80) # this is not okay, of course s.connect('::1', 80) # this is not okay, as v6only.kame.net will not resolve to IPv4 s.connect('v6only.kame.net', 80) Lib/httplib.py IPv6 ready. "host" in HTTP(host) will accept the following 3 forms: [host]:port host:port there must be only single colon host This is to allow IPv6 numeric URL (http://[host]:port/) in documents. IMHO "host:port" parsing should be implemented in urllib.py, not here. Lib/ftplib.py IPv6 ready. This uses EPSV/EPRT on IPv6 ftp. See RFC2428 for protocol details. Lib/SocketServer.py IPv6 ready. Wildcard bind on TCPServer, i.e. TCPServer(('', port)), will bind to wildcard socket on TCPServer.address_family. TCPServer.address_family is set to AF_INET by default, so ('', port) will usually bind AF_INET. Lib/smtplib.py, Lib/telnetlib.py, Lib/poplib.py IPv6 ready. Not much to say about protocol details - they just use TCP over IPv6. configure Configure has extra option, --enable-ipv6 and --disable-ipv6.
The option controls IPv6 support feature. dynamic link issues in Modules/socketmodule.c Modules/socketmodule.c can be dynamically loaded only in the following situations: - getaddrinfo(3) and getnameinfo(3) are supplied by OS vendor in libc, and libc is dynamic link library. - OS vendor is NOT supplying getaddrinfo(3) nor getnameinfo(3), and You are configuring this package with --disable-ipv6. In this case, you'll be using missing/get{addr,name}info.c and they will refer to gethostby{name,addr}. gethostnameby{name,addr} can usually be found in dynamic-linking libc. In other situations, such as the following, please link Modules/socketmodule.c into python itself. - getaddrinfo(3) and getnameinfo(3) are supplied by OS vendor, but they are in statically linked library like libinet6.a. (KAME falls into this category) python usually links Modules/socketmodule.c into python itself (due to its popularity) so there should be no problem. restrictions - The patched tree will not use gethostbyname_r and other thread-ready libraries. Instead, it will use getaddrinfo() and getnameinfo() throughout the operation. todo - Patch bunch of library files in Lib/*.py. compatibility issues with existing scripts If you disable IPv6 support (./configure --disable-ipv6), the patched code is mostly compatible with original Python (except files in "Lib" directory modified for dual stack support). User script may choke if: - IPv4/v6 dualstack libc is supplied, python is compiled for dual stack, and script assumes some of IPv4-only behavior (especially sockaddr) - IPv4/v6 dualstack libc is supplied, python is compiled for IPv4 only, and script assumes some of IPv4-only behavior. In this case, Python socket class itself does not support IPv6, however, name resolution functions can return IPv6 names since they use IPv6-ready libc functions! I do not recommend this configuration. - script assumes certain IPv4-only version behavior in Lib/*.py. 
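The sockaddr compatibility warning in this README can be illustrated with a small sketch. It assumes only what the README states: IPv4 addresses are (host, port) and IPv6 addresses are (host, port, flowinfo, scopeid). The helper name host_and_port is hypothetical and is not part of the patch.

```python
def host_and_port(sockaddr):
    """Return (host, port) from a 2-tuple IPv4 or 4-tuple IPv6 sockaddr.

    Hypothetical helper for illustration; not part of the IPv6 patch.
    """
    # Slicing keeps only the first two fields, so the extra flowinfo
    # and scopeid fields of an IPv6 sockaddr are simply ignored.
    host, port = sockaddr[:2]
    return host, port


# Example sockaddr tuples in the two shapes described by the README
# (documentation-only addresses, no network access needed).
ipv4_peer = ("192.0.2.10", 80)
ipv6_peer = ("2001:db8::1", 80, 0, 0)

print(host_and_port(ipv4_peer))
print(host_and_port(ipv6_peer))
```

A script written as host, port = sock.getpeername() could use sock.getpeername()[:2] in the same way, so that it keeps working when the peer is an IPv6 node.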
compilation If you use IPv6 features, it is assumed that you have working getaddrinfo() and getnameinfo() library functions. We have noticed that some of IPv6 stack is shipped with broken getaddrinfo(). In such cases, use missing/get{addr,name}info.c instead (but then, you need to have working getipnodeby{name,addr}). If you compile this on IPv4-only machine without get{addr,name}info, missing/get{addr,name}info.c will be used. They are from KAME IPv6 distribution and is #ifdef'ed for IPv4 only support. They are fairly complete implementation and you don't need to bother with bind 8.2 (bind 8.2 get{addr,name}info() has bugs). When compiling this kit on IPv6 node, you may need to specify some additional library paths or cpp defs. (like -linet6 or -DINET6) --enable-ipv6 will give you some warning, if the IPv6 stack is unknown to the "configure" script. Currently, the following IPv6 stacks are officially supported (i.e. we've checked that the package works well): - KAME IPv6 stack, http://www.kame.net/ References RFC2553, for getaddrinfo(3) and getnameinfo(3). Author contacts http://www.kame.net/ mailto:core@kame.net ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=401196&group_id=5470 From noreply@sourceforge.net Mon Jun 25 03:57:34 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sun, 24 Jun 2001 19:57:34 -0700 Subject: [Patches] [ python-Patches-435971 ] Adds a UTF-7 codec Message-ID: Patches item #435971, was opened at 2001-06-24 19:57 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=435971&group_id=5470 Category: core (C code) Group: None Status: Open Resolution: None Priority: 5 Submitted By: Brian Quinlan (bquinlan) Assigned to: Nobody/Anonymous (nobody) Summary: Adds a UTF-7 codec Initial Comment: This code adds UTF-7 (as described in RFC2152) support to Python. 
The encoder is hardwired in _codecsmodule.c to not encode allowable whitespace and set O characters (see RFC2152). If there is a standardized way (keyword arguments?) of passing optional arguments to encode methods, it would be trivial to make it possible to do so. Otherwise the patch is pretty straight-forward, I think. It touches: Objects/unicodeobject.c Modules/_codecsmodule.c Lib/test/test_unicode.py Include/unicodeobject.h and adds a new file: Lib/encodings/utf_7.py ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=435971&group_id=5470 From noreply@sourceforge.net Mon Jun 25 12:24:28 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 25 Jun 2001 04:24:28 -0700 Subject: [Patches] [ python-Patches-435971 ] Adds a UTF-7 codec Message-ID: Patches item #435971, was opened at 2001-06-24 19:57 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=435971&group_id=5470 Category: core (C code) Group: None Status: Open Resolution: None Priority: 5 Submitted By: Brian Quinlan (bquinlan) >Assigned to: M.-A. Lemburg (lemburg) Summary: Adds a UTF-7 codec Initial Comment: This code adds UTF-7 (as described in RFC2152) support to Python. The encoder is hardwired in _codecsmodule.c to not encode allowable whitespace and set O characters (see RFC2152). If there is a standardized way (keyword arguments?) of passing optional arguments to encode methods, it would be trivial to make it possible to do so. Otherwise the patch is pretty straight-forward, I think. It touches: Objects/unicodeobject.c Modules/_codecsmodule.c Lib/test/test_unicode.py Include/unicodeobject.h and adds a new file: Lib/encodings/utf_7.py ---------------------------------------------------------------------- >Comment By: M.-A. 
Lemburg (lemburg) Date: 2001-06-25 04:24 Message: Logged In: YES user_id=38388 encode functions can have optional arguments (see for example the utf-16 codec or the charmap codec). They don't need to be keyword arguments although this would make them easier to handle in case we should ever want to change the API. I'll look at the patch more closely later this week or today. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=435971&group_id=5470 From noreply@sourceforge.net Mon Jun 25 19:10:15 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 25 Jun 2001 11:10:15 -0700 Subject: [Patches] [ python-Patches-435971 ] Adds a UTF-7 codec Message-ID: Patches item #435971, was opened at 2001-06-24 19:57 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=435971&group_id=5470 Category: core (C code) Group: None Status: Open Resolution: None Priority: 5 Submitted By: Brian Quinlan (bquinlan) Assigned to: M.-A. Lemburg (lemburg) Summary: Adds a UTF-7 codec Initial Comment: This code adds UTF-7 (as described in RFC2152) support to Python. The encoder is hardwired in _codecsmodule.c to not encode allowable whitespace and set O characters (see RFC2152). If there is a standardized way (keyword arguments?) of passing optional arguments to encode methods, it would be trivial to make it possible to do so. Otherwise the patch is pretty straight-forward, I think. It touches: Objects/unicodeobject.c Modules/_codecsmodule.c Lib/test/test_unicode.py Include/unicodeobject.h and adds a new file: Lib/encodings/utf_7.py ---------------------------------------------------------------------- >Comment By: Brian Quinlan (bquinlan) Date: 2001-06-25 11:10 Message: Logged In: YES user_id=108973 OK, I see the utf-16 example. In a few weeks, when I have some time again, I might do the necessary changes and testing.
It might even be nice to be able to specify a list of characters to escape, if they are dangerous for your application (or maybe it's just morning featuritis :-)). ---------------------------------------------------------------------- Comment By: M.-A. Lemburg (lemburg) Date: 2001-06-25 04:24 Message: Logged In: YES user_id=38388 encode functions can have optional arguments (see for example the utf-16 codec or the charmap codec). They don't need to be keyword arguments although this would make them easier to handle in case we should ever want to change the API. I'll look at the patch more closely later this week or today. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=435971&group_id=5470 From noreply@sourceforge.net Mon Jun 25 20:20:04 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 25 Jun 2001 12:20:04 -0700 Subject: [Patches] [ python-Patches-436173 ] site.py shouldn't normcase() agressively Message-ID: Patches item #436173, was opened at 2001-06-25 12:20 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=436173&group_id=5470 Category: library Group: None Status: Open Resolution: None Priority: 5 Submitted By: Fred L. Drake, Jr. (fdrake) Assigned to: Jack Jansen (jackjansen) Summary: site.py shouldn't normcase() agressively Initial Comment: The site module should not be using the normcase() version of directory names as the final result in sys.path; this patch only uses the normcase() version for comparisons, but not sys.path contents. The intention is to allow Windows and MacOS users to see the paths as they would in their native filesystem tools.
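The idea in this patch description (use the normcase() form only as a comparison key, and keep the original spelling in the result) can be sketched as follows. This is an illustration, not the code from the patch: the function name unique_paths and its normcase parameter are hypothetical, and ntpath.normcase is passed explicitly only so the Windows-style behaviour can be shown on any platform.

```python
import ntpath
import os.path


def unique_paths(paths, normcase=os.path.normcase):
    """Drop duplicate directories, comparing by case-normalized name.

    Hypothetical helper: the normalized form is used only as a lookup
    key; the returned list keeps each path as the user spelled it.
    """
    seen = set()
    result = []
    for path in paths:
        key = normcase(path)  # comparison key only
        if key not in seen:
            seen.add(key)
            result.append(path)  # original spelling is preserved
    return result


# Windows-style normalization, usable for demonstration on any platform.
entries = ["C:\\Python21\\Lib", "c:\\python21\\lib", "C:\\Python21\\DLLs"]
print(unique_paths(entries, normcase=ntpath.normcase))
```

With this approach the second entry is recognized as a duplicate of the first, while the surviving entries still look the way they do in the native filesystem tools.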
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=436173&group_id=5470 From noreply@sourceforge.net Mon Jun 25 21:09:14 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 25 Jun 2001 13:09:14 -0700 Subject: [Patches] [ python-Patches-436193 ] SGI cores on 1.0 / 0 Message-ID: Patches item #436193, was opened at 2001-06-25 13:09 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=436193&group_id=5470 Category: None Group: None Status: Open Resolution: None Priority: 5 Submitted By: Robert Minsk (rminsk) Assigned to: Nobody/Anonymous (nobody) Summary: SGI cores on 1.0 / 0 Initial Comment: This fix is in reference to bug 435026. Please see bug for complete history. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=436193&group_id=5470 From noreply@sourceforge.net Tue Jun 26 01:03:51 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 25 Jun 2001 17:03:51 -0700 Subject: [Patches] [ python-Patches-434992 ] Cleanup of warning messages Message-ID: Patches item #434992, was opened at 2001-06-20 18:51 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=434992&group_id=5470 Category: None Group: None Status: Open Resolution: None Priority: 5 Submitted By: Robert Minsk (rminsk) Assigned to: Nobody/Anonymous (nobody) Summary: Cleanup of warning messages Initial Comment: I just compiled Python-2.1 of the SGI using the latest compilers (7.3.1.2m) with all the warning flags turned on. The following patch will get rid of most of the warning messages. I would like to see this incorporated into the next release. It is easier to spot real problems when you do not have to sort thru other warning messages. 
The included patch does not include other optional modules and the ones setup.py finds by default. I may have found 2 bugs in the process. Please see bugs 434989 and 434988. ---------------------------------------------------------------------- >Comment By: Robert Minsk (rminsk) Date: 2001-06-25 17:03 Message: Logged In: YES user_id=132786 > I'm not sure these patches are all correct. For the > patches introducing prototypes (e.g. tigetstr), isn't > there some header file that offers these prototypes? That is what my patch fixed. It was changing from #ifdef sgi extern char *tigetstr(char *); extern char *tparm(char *instring, ...); #endif to #ifdef __sgi #include #endif > For cPickle, looking up string_atol seems to be completely > unneeded. In turn, looking up string is unneeded, as well. > Likewise, don't just remove empty_str, remove the lookup > as well. Are you saying remove the UNLESS (PyString_FromString("")) return -1; also. I guess I missed that. > I think p should be unsigned > char*, and the casts should then be adjusted to unsigned. Should I fix that or are you looking into it? > Why is it necessary to cast the result of umask? Please > put a comment in the code, explaining that in detail (i.e. > "required for SGI" is not sufficient). Even on linux umask returns the type umask_t which may not be an int. I could change the code to int i; umask_t u; if (!PyArg_ParseTuple(args, "i:umask", &i)) return NULL; u = umask((mask_t)i); but is umask_t available on all machines? This is not a critical warning, in fact on the SGI it is only when you compile with -fullwarn and it's only an INFO message. The INFO messages are useful to identify potential errors. The casts should not add any overhead. This is one reason you should compile code on multiple compilers. Each compiler has its own strengths and weaknesses at identifying problems. This is not required for SGI but just to clean up messages from other compilers besides gcc.
Other vendors' compilers also give other warning messages. > Likewise for > alarm > (which returns long on Linux), and all other casts that > were introduced to convert system call results or > arguments. Linux (at least RedHat 6.2) does not return a long from alarm, it returns an unsigned int. Should I change the signal_alarm to PyInt_FromUnsignedLong(alarm(t))? Are there other platforms that return a signed long from alarm? I would rather cast to the type the function currently uses. The casts are just casting to the type the functions expect. This goes on if not explicitly cast anyway. So why not get rid of the implicit cast. ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2001-06-23 13:25 Message: Logged In: YES user_id=21627 I'm not sure these patches are all correct. For the patches introducing prototypes (e.g. tigetstr), isn't there some header file that offers these prototypes? For cPickle, looking up string_atol seems to be completely unneeded. In turn, looking up string is unneeded, as well. Likewise, don't just remove empty_str, remove the lookup as well. On the save_float changes, you mask a range error: the values will be in 0..255, but you cast this value to char, which is potentially signed. I think p should be unsigned char*, and the casts should then be adjusted to unsigned. Since cPickle changes will need careful review, I recommend to submit them as a separate patch. Why is it necessary to cast the result of umask? Please put a comment in the code, explaining that in detail (i.e. "required for SGI" is not sufficient). Likewise for alarm (which returns long on Linux), and all other casts that were introduced to convert system call results or arguments. ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2001-06-23 13:00 Message: Logged In: YES user_id=21627 Refiled as a patch.
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=434992&group_id=5470 From noreply@sourceforge.net Tue Jun 26 04:01:13 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 25 Jun 2001 20:01:13 -0700 Subject: [Patches] [ python-Patches-436258 ] Some cleanup of the cPickle module Message-ID: Patches item #436258, was opened at 2001-06-25 20:01 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=436258&group_id=5470 Category: Modules Group: None Status: Open Resolution: None Priority: 5 Submitted By: Robert Minsk (rminsk) Assigned to: Nobody/Anonymous (nobody) Summary: Some cleanup of the cPickle module Initial Comment: While getting rid of compiler warning messages for another non-gcc compiler I found some dead code in cPickle.c. The attached patch fixes some possible type casting problems and removed some dead code. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=436258&group_id=5470 From noreply@sourceforge.net Tue Jun 26 14:30:28 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Tue, 26 Jun 2001 06:30:28 -0700 Subject: [Patches] [ python-Patches-436376 ] C API Request Message-ID: Patches item #436376, was opened at 2001-06-26 06:30 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=436376&group_id=5470 Category: core (C code) Group: None Status: Open Resolution: None Priority: 5 Submitted By: Nobody/Anonymous (nobody) Assigned to: Nobody/Anonymous (nobody) Summary: C API Request Initial Comment: I would like to have the following 4 C API functions added to pystate.c so that advanced extension modules can more easily examine the internal state of the Python interpreter and its threads. 
The intent of these functions is to provide a mechanism for gaining portable read-only access to all of the current PyThreadState * structures. The primary use of this would be in advanced debugging applications. Cheers, Dave Beazley -------------------------------------------
/* included in pystate.c */

/* Return the first interpreter in the global list. */
PyInterpreterState *
PyInterpreterState_Head(void)
{
    return interp_head;
}

/* Return the interpreter following interp, or NULL at the end. */
PyInterpreterState *
PyInterpreterState_Next(PyInterpreterState *interp)
{
    return interp->next;
}

/* Return the first thread state belonging to interp. */
PyThreadState *
PyInterpreterState_ThreadHead(PyInterpreterState *interp)
{
    return interp->tstate_head;
}

/* Return the thread state following tstate, or NULL at the end. */
PyThreadState *
PyThreadState_Next(PyThreadState *tstate)
{
    return tstate->next;
}
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=436376&group_id=5470 From noreply@sourceforge.net Tue Jun 26 21:35:38 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Tue, 26 Jun 2001 13:35:38 -0700 Subject: [Patches] [ python-Patches-436496 ] Configuring UCS-4 Message-ID: Patches item #436496, was opened at 2001-06-26 13:35 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=436496&group_id=5470 Category: None Group: None Status: Open Resolution: None Priority: 5 Submitted By: Martin v. Löwis (loewis) Assigned to: Nobody/Anonymous (nobody) Summary: Configuring UCS-4 Initial Comment: This patch allows Py_UNICODE to be defined as both 2 and 4 byte type, using --enable-unicode={ucs2,ucs4}. If nothing is specified, Py_UNICODE defaults to wchar_t if available. The Unicode type itself, and the UTF-8 and UTF-16 codecs have been adjusted to deal with both representations.
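As background for why the codecs have to handle both representations: a code point above U+FFFF fits in a single 4-byte unit, but with 2-byte units it has to be stored as a UTF-16 surrogate pair. The sketch below shows only the standard UTF-16 arithmetic; it is not taken from the patch, and the function name to_surrogate_pair is hypothetical.

```python
def to_surrogate_pair(code_point):
    """Split a code point above U+FFFF into a UTF-16 surrogate pair.

    Illustration of the standard UTF-16 rule; not code from the patch.
    """
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("code point does not need a surrogate pair")
    offset = code_point - 0x10000      # 20-bit value
    high = 0xD800 + (offset >> 10)     # top 10 bits -> high surrogate
    low = 0xDC00 + (offset & 0x3FF)    # low 10 bits -> low surrogate
    return high, low


high, low = to_surrogate_pair(0x10000)
print(hex(high), hex(low))
```

With a 4-byte Py_UNICODE the same character would be a single unit holding the code point itself, which is why a codec has to know the unit width it is working with.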
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=436496&group_id=5470 From noreply@sourceforge.net Thu Jun 28 19:24:39 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 28 Jun 2001 11:24:39 -0700 Subject: [Patches] [ python-Patches-433537 ] better cross-compilation support Message-ID: Patches item #433537, was opened at 2001-06-15 12:48 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=433537&group_id=5470 Category: Build Group: None Status: Open Resolution: None Priority: 5 Submitted By: michael shiplett (walrusmonkey) Assigned to: Nobody/Anonymous (nobody) Summary: better cross-compilation support Initial Comment: configure.in uses AC_TRY_RUN in several places without allowing for cached values to allow for cross-compilation. this patch uses the same approach as other parts of configure.in use.
---------------------------------------------------------------------- Comment By: Nobody/Anonymous (nobody) Date: 2001-06-28 11:24 Message: Logged In: NO Useful patch. I used it to compile python 2.0.1 for the Agenda VR3 PDA. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=433537&group_id=5470 From noreply@sourceforge.net Sat Jun 30 06:21:58 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Fri, 29 Jun 2001 22:21:58 -0700 Subject: [Patches] [ python-Patches-436496 ] Configuring UCS-4 Message-ID: Patches item #436496, was opened at 2001-06-26 13:35 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=436496&group_id=5470 Category: None Group: None >Status: Closed >Resolution: Accepted Priority: 5 Submitted By: Martin v. Löwis (loewis) Assigned to: Nobody/Anonymous (nobody) Summary: Configuring UCS-4 Initial Comment: This patch allows Py_UNICODE to be defined as both 2 and 4 byte type, using --enable-unicode={ucs2,ucs4}. If nothing is specified, Py_UNICODE defaults to wchar_t if available. The Unicode type itself, and the UTF-8 and UTF-16 codecs have been adjusted to deal with both representations. ---------------------------------------------------------------------- >Comment By: Martin v. Löwis (loewis) Date: 2001-06-29 22:21 Message: Logged In: YES user_id=21627 This patch has been committed as configure.in 1.222 and unicodeobject.c 2.98. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=436496&group_id=5470