From noreply@sourceforge.net  Fri Jun  1 15:34:28 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Fri, 01 Jun 2001 07:34:28 -0700
Subject: [Patches] [ python-Patches-427190 ] Speed-up "O" calls
Message-ID: <E155q0S-0001dC-00@usw-sf-web2.sourceforge.net>

Patches item #427190, was updated on 2001-05-24 22:30
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=427190&group_id=5470

>Category: core (C code)
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Martin v. Löwis (loewis)
>Assigned to: Jeremy Hylton (jhylton)
>Summary: Speed-up "O" calls

Initial Comment:
This patch improves the performance of a few functions
which have an "O" signature (ord, len, and
list_append). On selected test cases, this patch gives
a speed-up of 40%. If accepted, the approach can be
extended to more signatures. E.g. "l" is already
provided in the patch, but currently not used.

----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2001-06-01 07:34

Message:
Logged In: YES 
user_id=21627

I rewrote the patch to only support METH_NOARGS and METH_O,
and to not use bit masks for them.

I also changed calling conventions for all Object operations
and bltin and sys functions. In the course of these changes,
two functions got a changed meaning:
- file.writelines accepts only exactly one argument
- iter.next does not accept any arguments anymore

As you can see in the patch, there are still a lot of places
that continue to use OLDARGS (plus all the Modules functions
that have not been changed in this patch), so OLDARGS will
be needed for quite some time.

----------------------------------------------------------------------

Comment By: Jeremy Hylton (jhylton)
Date: 2001-05-29 13:59

Message:
Logged In: YES 
user_id=31392

I like METH_O, but I'm not sure about METH_L.  I'd rather
see the call handling in ceval be type-neutral.  It's easy
enough for the callee to cast from an object to an int (or
any other type).  There should be no effect on performance
and it reduces the amount of code in the core.

I think the implementation could be simplified a lot if it
defined METH_O -- or perhaps METH_NOARGS,  METH_ONEARG, and
maybe even METH_TWOARGS (but Tim has a pretty good argument
against that one).  I don't think there's any need to define METH_O
via METH_SPECIAL and reserve all of 0xFFF0 for flags on
METH_SPECIAL.  Instead, I'd just use the next N bits to
implement the next N flags.

The SPECIALSIZE and extra stack used in the implementation
seem like unneeded generality, too.  If the implementation
is only going to support 0 and 1 (and possibly 2) arguments,
there's no need for anything more general.

Finally, I suggest appropriating fast_cfunction() for this
purpose, rather than calling the new function
do_call_special(), where "special" isn't a very specific
meaning.  If METH_NOARGS and METH_ONEARG are implemented,
there is basically no reason to use METH_OLDARGS.  So we can
get rid of it in the code base and stop attempting to
optimize it.

Do you want to have a go at a smaller patch that just did
METH_ONEARG and METH_NOARGS?


----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=427190&group_id=5470


From noreply@sourceforge.net  Fri Jun  1 16:14:27 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Fri, 01 Jun 2001 08:14:27 -0700
Subject: [Patches] [ python-Patches-427190 ] Speed-up "O" calls
Message-ID: <E155qd9-000281-00@usw-sf-web2.sourceforge.net>

Patches item #427190, was updated on 2001-05-24 22:30
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=427190&group_id=5470

Category: core (C code)
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Martin v. Löwis (loewis)
Assigned to: Jeremy Hylton (jhylton)
>Summary: Speed-up "O" calls

Initial Comment:
This patch improves the performance of a few functions
which have an "O" signature (ord, len, and
list_append). On selected test cases, this patch gives
a speed-up of 40%. If accepted, the approach can be
extended to more signatures. E.g. "l" is already
provided in the patch, but currently not used.

----------------------------------------------------------------------

>Comment By: Jeremy Hylton (jhylton)
Date: 2001-06-01 08:14

Message:
Logged In: YES 
user_id=31392

Just took a quick look -- looks good.  

One question: Why does METH_NOARGS call the method with two
arguments where the second is always NULL?  Wouldn't it be
clearer to have these functions take one argument?


----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=427190&group_id=5470


From noreply@sourceforge.net  Fri Jun  1 16:19:54 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Fri, 01 Jun 2001 08:19:54 -0700
Subject: [Patches] [ python-Patches-420565 ] makes setup.py search sys.prefix
Message-ID: <E155qiQ-0002BX-00@usw-sf-web2.sourceforge.net>

Patches item #420565, was updated on 2001-05-01 14:24
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=420565&group_id=5470

Category: Build
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Nobody/Anonymous (nobody)
Assigned to: A.M. Kuchling (akuchling)
Summary: makes setup.py search sys.prefix 

Initial Comment:
It's useful to have setup.py search the lib and include
directories in sys.prefix before it checks /usr/local.
That way, if you are building Python into a custom
location and want it to use the libraries installed
there rather than the system defaults, you can give the
--prefix option to configure and setup.py will search
that path first.
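
The search order described above can be sketched as follows. This is
only an illustration, not the code from the patch: the helper name
build_search_dirs and its fallback parameter are hypothetical, and the
real setup.py may assemble its directory lists differently.

```python
import os
import sys


def build_search_dirs(prefix=sys.prefix, fallback="/usr/local"):
    """Return (lib_dirs, include_dirs) with the prefix searched first."""
    lib_dirs, include_dirs = [], []
    for root in (prefix, fallback):
        for name, target in (("lib", lib_dirs), ("include", include_dirs)):
            path = os.path.join(root, name)
            if path not in target:  # avoid duplicates when prefix == fallback
                target.append(path)
    return lib_dirs, include_dirs
```

With a custom --prefix, the directories under that prefix come ahead of
the /usr/local ones, so libraries installed there are found first.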

----------------------------------------------------------------------

Comment By: Nobody/Anonymous (nobody)
Date: 2001-06-01 08:19

Message:
Logged In: NO 

I totally agree. I'm building for hard hat linux on a 
debian host, and the implicit search in /usr/lib is 
totally the wrong thing to do in this case.


----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=420565&group_id=5470


From noreply@sourceforge.net  Fri Jun  1 21:07:21 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Fri, 01 Jun 2001 13:07:21 -0700
Subject: [Patches] [ python-Patches-429442 ] Cygwin sys.platform/get_platform() patch
Message-ID: <E155vCb-0003NT-00@usw-sf-web3.sourceforge.net>

Patches item #429442, was updated on 2001-06-01 13:07
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=429442&group_id=5470

Category: distutils
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Jason Tishler (jlt63)
Assigned to: Nobody/Anonymous (nobody)
Summary: Cygwin sys.platform/get_platform() patch

Initial Comment:
This patch corrects sys.platform and distutils.util.get_platform()
problems caused by the cruft contained in Cygwin's uname -s.

Please see the following for the gory details:

http://www.cygwin.com/ml/cygwin-apps/2001-05/msg00106.html

Note that the above also solicited input from the community in an
attempt to prevent any potential heartache.  Since no one responded
it would appear that either the changes are acceptable or that no one
really cares... :,)
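
As a rough illustration of the kind of cleanup involved (not the patch
itself, whose details are in the linked message), the sketch below
collapses a raw uname -s value into a stable platform name. The function
name normalize_platform and the sample value "CYGWIN_NT-5.0" are
assumptions made for this example.

```python
def normalize_platform(uname_system):
    """Collapse a raw `uname -s` string into a stable platform name."""
    name = uname_system.lower()
    if name.startswith("cygwin"):
        # Drop the Windows version suffix that Cygwin appends.
        return "cygwin"
    return name
```

Code that tests for the platform can then compare against a single
value instead of matching every Windows version string.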

----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=429442&group_id=5470


From noreply@sourceforge.net  Sat Jun  2 01:34:43 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Fri, 01 Jun 2001 17:34:43 -0700
Subject: [Patches] [ python-Patches-414991 ] Separate CFLAGS and CPPFLAGS
Message-ID: <E155zNL-0006hc-00@usw-sf-web3.sourceforge.net>

Patches item #414991, was updated on 2001-04-09 13:33
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=414991&group_id=5470

Category: Build
Group: None
Status: Open
Resolution: Postponed
Priority: 5
Submitted By: Wilfredo Sanchez (wsanchez)
Assigned to: Neil Schemenauer (nascheme)
Summary: Separate CFLAGS and CPPFLAGS

Initial Comment:
CFLAGS should not contain preprocessor directives, which 
is the role of CPPFLAGS.  By combining the two, it is 
not possible to override CFLAGS (eg. make CFLAGS="-arch 
i386 -arch ppc -O3 -pipe") without breaking the build.

This patch is against Python 2.1b2a.


----------------------------------------------------------------------

Comment By: Nobody/Anonymous (nobody)
Date: 2001-06-01 17:34

Message:
Logged In: NO 

Neil,  Can we get this checked in for 2.2?

----------------------------------------------------------------------

Comment By: Neil Schemenauer (nascheme)
Date: 2001-04-10 14:44

Message:
Logged In: YES 
user_id=35752

I agree with the change but I'm not comfortable checking
it in for 2.1 (even though the patch is quite simple).  It
will have to wait.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2001-04-10 14:26

Message:
Logged In: YES 
user_id=6380

Neil, can you review this and maybe check this in?

----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=414991&group_id=5470


From noreply@sourceforge.net  Sat Jun  2 07:20:03 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Fri, 01 Jun 2001 23:20:03 -0700
Subject: [Patches] [ python-Patches-414991 ] Separate CFLAGS and CPPFLAGS
Message-ID: <E1564lX-00028g-00@usw-sf-web3.sourceforge.net>

Patches item #414991, was updated on 2001-04-09 13:33
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=414991&group_id=5470

Category: Build
Group: None
>Status: Closed
>Resolution: Accepted
Priority: 5
Submitted By: Wilfredo Sanchez (wsanchez)
Assigned to: Neil Schemenauer (nascheme)
Summary: Separate CFLAGS and CPPFLAGS

Initial Comment:
CFLAGS should not contain preprocessor directives, which 
is the role of CPPFLAGS.  By combining the two, it is 
not possible to override CFLAGS (eg. make CFLAGS="-arch 
i386 -arch ppc -O3 -pipe") without breaking the build.

This patch is against Python 2.1b2a.


----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=414991&group_id=5470


From noreply@sourceforge.net  Sat Jun  2 10:27:12 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Sat, 02 Jun 2001 02:27:12 -0700
Subject: [Patches] [ python-Patches-429542 ] Bugfix for libsmtp example
Message-ID: <E1567ge-0001hV-00@usw-sf-web1.sourceforge.net>

Patches item #429542, was updated on 2001-06-02 02:27
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=429542&group_id=5470

Category: documentation
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Sean Reifschneider (jafo)
Assigned to: Nobody/Anonymous (nobody)
Summary: Bugfix for libsmtp example

Initial Comment:
libsmtp includes an example which does:

   while 1:
      line = raw_input()
      if not line: break

which fails raising an EOFError exception.  This patch
changes the code to:

   while 1:
      try:
         line = raw_input()
      except EOFError:
         break

Sean
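
For readers on current Python, where raw_input() has become input(),
the same pattern might look like the sketch below. The read_line
parameter is a hypothetical hook added only so the loop can be
exercised without a terminal; it is not part of the libsmtp example.

```python
def read_lines(read_line=input):
    """Collect lines until end of input, using EOFError as the stop signal."""
    lines = []
    while True:
        try:
            line = read_line()
        except EOFError:
            break
        lines.append(line)
    return lines
```

Unlike the original loop, an empty line no longer ends the input; only
end-of-file does.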

----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=429542&group_id=5470


From noreply@sourceforge.net  Sat Jun  2 11:12:45 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Sat, 02 Jun 2001 03:12:45 -0700
Subject: [Patches] [ python-Patches-427190 ] Speed-up "O" calls
Message-ID: <E1568Oj-0004to-00@usw-sf-web2.sourceforge.net>

Patches item #427190, was updated on 2001-05-24 22:30
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=427190&group_id=5470

Category: core (C code)
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Martin v. Löwis (loewis)
Assigned to: Jeremy Hylton (jhylton)
>Summary: Speed-up "O" calls

Initial Comment:
This patch improves the performance of a few functions
which have an "O" signature (ord, len, and
list_append). On selected test cases, this patch gives
a speed-up of 40%. If accepted, the approach can be
extended to more signatures. E.g. "l" is already
provided in the patch, but currently not used.

----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2001-06-02 03:12

Message:
Logged In: YES 
user_id=21627

New version uploaded. This uses functions with only the 
self argument for METH_NOARGS, and introduces 
PyNoArgsFunction for them.

It also adds a section for api.tex documenting the METH_ 
flags, and an entry in NEWS mentioning the new METH_ flags.
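
To make the calling conventions concrete, here is a small Python model
of how a flag-aware dispatcher could treat the METH_ flags. It is a
sketch only: the real code is C inside the interpreter, and the names
call_builtin, builtin_len and iter_next are hypothetical. The flag
values are the ones found in later CPython releases' methodobject.h.

```python
# Flag values as published in later CPython headers; METH_OLDARGS is the
# historical zero value. Everything below is a hypothetical model.
METH_OLDARGS = 0x0000
METH_VARARGS = 0x0001
METH_NOARGS = 0x0004
METH_O = 0x0008


def call_builtin(func, flags, self_obj, args):
    """Dispatch a call according to the function's declared convention."""
    if flags == METH_NOARGS:
        if args:
            raise TypeError("function takes no arguments")
        return func(self_obj)  # only the self argument is passed
    if flags == METH_O:
        if len(args) != 1:
            raise TypeError("function takes exactly one argument")
        return func(self_obj, args[0])  # no argument tuple is built
    if flags == METH_VARARGS:
        return func(self_obj, tuple(args))
    raise SystemError("unsupported calling convention")


def builtin_len(self_obj, obj):
    """Model of a METH_O function such as len."""
    return len(obj)


def iter_next(self_obj):
    """Model of a METH_NOARGS method such as iter.next."""
    return next(self_obj)
```

The saving comes from the METH_O and METH_NOARGS branches: the caller
checks the argument count once and the callee never has to unpack a
tuple.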


----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=427190&group_id=5470


From noreply@sourceforge.net  Sat Jun  2 11:15:47 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Sat, 02 Jun 2001 03:15:47 -0700
Subject: [Patches] [ python-Patches-424335 ] richcompare for strings
Message-ID: <E1568Rf-0004ve-00@usw-sf-web2.sourceforge.net>

Patches item #424335, was updated on 2001-05-15 12:54
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=424335&group_id=5470

Category: core (C code)
Group: None
>Status: Closed
Resolution: Accepted
Priority: 5
Submitted By: Martin v. Löwis (loewis)
Assigned to: Martin v. Löwis (loewis)
Summary: richcompare for strings

Initial Comment:
This patch implements the tp_richcompare slot for
string objects. It shows an 8% speed-up on selected test
cases.

----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2001-06-02 03:15

Message:
Logged In: YES 
user_id=21627

Committed as  2.117 of stringobject.c, 2.95 of 
dictobject.c, and 2.27 of stringobject.h.


----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2001-05-23 21:48

Message:
Logged In: YES 
user_id=31435

Oops!  Looks like I forgot to assign this back to Martin.

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2001-05-22 16:04

Message:
Logged In: YES 
user_id=31435

Marked accepted.  Looks good!

Suggest

return a->ob_size == b->ob_size &&
       *a->ob_sval == *b->ob_sval &&
       memcmp(a->ob_sval, b->ob_sval, a->ob_size) == 0;

for the tail of the _PyString_Eq body as compilers should 
have an easier time of turning that into the best code for 
the platform (especially the weaker compilers do better 
optimizing expressions than across branches).  Plus it 
improves clarity, at least for me.
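
The logic of the suggested expression can be modelled in Python as
below. This is only a sketch of the short-circuit order (length, then
first byte, then full comparison); the name string_eq is hypothetical,
and the real function operates on C string objects.

```python
def string_eq(a: bytes, b: bytes) -> bool:
    """Model of the suggested tail: cheap checks first, full compare last."""
    return (
        len(a) == len(b)      # a->ob_size == b->ob_size
        and a[:1] == b[:1]    # *a->ob_sval == *b->ob_sval
        and a == b            # memcmp(...) == 0
    )
```

Most unequal strings differ in length or in their first byte, so the
full comparison is reached only rarely.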

Unsure why the

case Py_EQ: c = c == 0; break; /* not needed here */

case is there:  if it's truly unreachable (and I agree it 
isn't), better to assert-fail if it gets there.

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2001-05-22 14:31

Message:
Logged In: YES 
user_id=31435

Assigned to me.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2001-05-21 08:33

Message:
Logged In: YES 
user_id=21627

The new revision of the patch entirely removes tp_compare
for string, following discussions on python-dev. The only
direct user of string_compare has been changed to use the
new function _PyString_Eq instead.

----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=424335&group_id=5470


From noreply@sourceforge.net  Sat Jun  2 16:40:23 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Sat, 02 Jun 2001 08:40:23 -0700
Subject: [Patches] [ python-Patches-429611 ] doc build on win32 with MiKTeX et al.
Message-ID: <E156DVn-0006Pq-00@usw-sf-web1.sourceforge.net>

Patches item #429611, was updated on 2001-06-02 08:40
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=429611&group_id=5470

Category: documentation
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Frederic Giacometti (giacometti)
Assigned to: Nobody/Anonymous (nobody)
Summary: doc build on win32 with MiKTeX et al.

Initial Comment:

With this patch, everything builds fine on win32 except for the following problems:

  - html/api/labels.pl not generated -> html/api/*.html incorrect
  - lib/modindex.html not generated -> html/modindex.html incorrect

Problems worked out:

- fancyhdr.sty is not in the Miktex distribution ...

- Makefile content made compatible with the Windows command line (now runs fine with VC++'s 
nmake, or cygnus's make --win32)

- misc. problems regarding the path formats

- miktex 2.0's pdflatex would block on a mismatching macro level in python.sty -> fixed


Hints on installing latex2html:

- I had to work out some fixes in the config.pl script (2,000 lines of perl...)
- make sure the paths to the ghostscript and miktex installations have no spaces!!!!!! latex2html 
will silently screw up its configuration process
- looking at perl scripts gave me a serious trauma


----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=429611&group_id=5470


From noreply@sourceforge.net  Sat Jun  2 16:56:53 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Sat, 02 Jun 2001 08:56:53 -0700
Subject: [Patches] [ python-Patches-429614 ] pythonpath and optimize def. before init
Message-ID: <E156Dll-0008MM-00@usw-sf-web2.sourceforge.net>

Patches item #429614, was updated on 2001-06-02 08:56
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=429614&group_id=5470

Category: core (C code)
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Frederic Giacometti (giacometti)
Assigned to: Nobody/Anonymous (nobody)
Summary: pythonpath and optimize def. before init

Initial Comment:

A) Addition of four functions
=====================

Py_{Set, Get}{PythonPath, OptimizeLevel}()
with the same semantics as Py_{Set, Get}ProgramName()

(Note: the ANSI C type 'char const*' is used to describe non-modifiable strings)

These four functions are needed in the next JPE runtime (Python 2.1 patch included in the 
distribution); this allows setting the PYTHONPATH and optimize level from Java property values.


B) Option '-P pythonpath' on the Python command line:
========================================

This option defines 'pythonpath' from the command line (and overrides the PYTHONPATH 
environment variable if necessary).

Usefulness: Sometimes, one does not want to rely on the environment variables, or modify them.

Sample application: Running build and test scripts in full control of the environment, and with 
different PYTHONPATH values.

This option is needed by the build and test scripts of the next JPE source distribution (Python 2.1 
patch included in the distribution).

Frederic Giacometti
fred@arakne.com


----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=429614&group_id=5470


From noreply@sourceforge.net  Sun Jun  3 14:55:14 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Sun, 03 Jun 2001 06:55:14 -0700
Subject: [Patches] [ python-Patches-423394 ] Fix pulldom to preserve ns attributes
Message-ID: <E156YLa-0000BI-00@usw-sf-web3.sourceforge.net>

Patches item #423394, was updated on 2001-05-11 11:04
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=423394&group_id=5470

Category: XML
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Nobody/Anonymous (nobody)
Assigned to: Martin v. Löwis (loewis)
Summary: Fix pulldom to preserve ns attributes

Initial Comment:
Here is a fix for pulldom.py that preserves 
xmlns attributes that declare namespaces. 

The current pulldom / minidom captures xml namespace 
information in elements and attributes, but the 
actual namespace declaration attributes 
(xmlns:foo="...") are not preserved on the element 
where they appear. 

This breaks certain applications that do more 
complex name dereferencing (XMLSchema is an 
example) and require not only namespace URIs but 
also the prefixes used and the original scope 
information.

The current patch preserves xmlns="" and 
xmlns:foo="" as *non-namespace qualified* 
attributes, which appears to be the norm in other 
DOM implementations.

Pls let me know if you have any questions.

-Brian (brian@digicool.com)

----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2001-06-03 06:55

Message:
Logged In: YES 
user_id=21627

The patch is a good idea, but I think it does not conform 
to the DOM recommendation. In the DOM, the namespace URI
"http://www.w3.org/2000/xmlns/" is used for attributes 
whose namespace prefix or qualified name is xmlns.
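
The convention described above can be observed with the minidom that
ships in current Python releases, as in the sketch below. It does not
reproduce the 2001 pulldom code or the submitted patch; the helper name
namespace_declarations and the sample URNs in the usage note are
made up for illustration.

```python
from xml.dom import XMLNS_NAMESPACE, minidom


def namespace_declarations(xml_text):
    """Map each xmlns attribute name on the root element to its value."""
    root = minidom.parseString(xml_text).documentElement
    attrs = root.attributes
    found = {}
    for index in range(attrs.length):
        attr = attrs.item(index)
        # Declarations live in the reserved xmlns namespace.
        if attr.namespaceURI == XMLNS_NAMESPACE:
            found[attr.name] = attr.value
    return found
```

Called on a document whose root carries xmlns="urn:example:default" and
xmlns:foo="urn:example:foo", the function reports both declarations,
which is the information the submitter needs preserved.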

In addition, the patch contains a typo: it should not say 
atetr_items.


----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=423394&group_id=5470


From noreply@sourceforge.net  Mon Jun  4 03:31:52 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Sun, 03 Jun 2001 19:31:52 -0700
Subject: [Patches] [ python-Patches-423221 ] Add a few Windows encoding aliases
Message-ID: <E156k9o-0005Jn-00@usw-sf-web2.sourceforge.net>

Patches item #423221, was updated on 2001-05-10 21:14
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=423221&group_id=5470

Category: library
Group: None
>Status: Closed
>Resolution: Accepted
Priority: 5
Submitted By: Brian Quinlan (bquinlan)
>Assigned to: Mark Hammond (mhammond)
Summary: Add a few Windows encoding aliases

Initial Comment:
This patch adds aliases for some of the common Windows 
encodings. Windows-1252 is particularly useful because 
it is the default encoding used by Visual Studio .NET 
projects. 

Microsoft's complete encoding list can be found at:
http://msdn.microsoft.com/workshop/author/dhtml/reference/charsets/charset4.asp


----------------------------------------------------------------------

>Comment By: Mark Hammond (mhammond)
Date: 2001-06-03 19:31

Message:
Logged In: YES 
user_id=14198

Checked in:
/cvsroot/python/python/dist/src/Lib/encodings/aliases.py,v  <--  aliases.py
new revision: 1.7; previous revision: 1.6


----------------------------------------------------------------------

Comment By: Brian Quinlan (bquinlan)
Date: 2001-05-30 14:04

Message:
Logged In: YES 
user_id=108973

Yeah, there are tonnes. But after my initial mistake, I 
don't want to add any without carefully checking that the 
encodings that I am aliasing are exactly identical.

I'll probably add some more later but Windows-1252 is 
probably the most immediately useful.

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2001-05-30 12:59

Message:
Logged In: YES 
user_id=38388

The link you gave doesn't work for me, but the aliases look
reasonable... aren't there more (Windows does have far more
code pages than the few you added) ?

In any case, go ahead and check them in :-)

----------------------------------------------------------------------

Comment By: Mark Hammond (mhammond)
Date: 2001-05-29 23:30

Message:
Logged In: YES 
user_id=14198

This looks good to me!  Assigning to MAL simply for 
comment.  Marc - if you have no objections and this sounds 
reasonable, assign back to me and I will check it in.

----------------------------------------------------------------------

Comment By: Brian Quinlan (bquinlan)
Date: 2001-05-29 23:23

Message:
Logged In: YES 
user_id=108973

The first patch (alias.patch) is incorrect and I'm not sure 
if the replacement (aliases.patch) is visible to anyone but 
me, so here is aliases.patch inline:

*** d:\Dev\python\dist\src\Lib\encodings\aliases.py	Wed Jun  7 02:12:30 2000
--- d:\Dev\python-dev\dist\src\Lib\encodings\aliases.py	Tue May 29 19:16:58 2001
***************
*** 59,64 ****
--- 59,69 ----
      'macroman': 'mac_roman',
      'macturkish': 'mac_turkish',
  
+     # Windows
+     'windows_1252': 'cp1252',
+     'windows_1254': 'cp1254',
+     'windows_1255': 'cp1255',
+     
      # MBCS
      'dbcs': 'mbcs',

----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=423221&group_id=5470


From noreply@sourceforge.net  Mon Jun  4 04:53:16 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Sun, 03 Jun 2001 20:53:16 -0700
Subject: [Patches] [ python-Patches-429957 ] Add some more EBCDIC  encodings
Message-ID: <E156lQa-0002KG-00@usw-sf-web1.sourceforge.net>

Patches item #429957, was updated on 2001-06-03 20:53
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=429957&group_id=5470

Category: library
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Brian Quinlan (bquinlan)
Assigned to: Nobody/Anonymous (nobody)
Summary: Add some more EBCDIC  encodings

Initial Comment:
Add support for cp1140, which is identical to cp037, 
with the addition of the euro character.

Also added a few EBCDIC aliases.
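
One way to check the "identical except for the euro character" claim is to decode every byte value with both codecs and list the positions that differ. This sketch assumes an interpreter that ships both cp037 and cp1140; the variable names are illustrative.

```python
# Decode each of the 256 byte values with both codecs and keep only the
# positions where the resulting characters differ.
differences = {}
for byte in range(256):
    old = bytes([byte]).decode("cp037")
    new = bytes([byte]).decode("cp1140")
    if old != new:
        differences[byte] = (old, new)

for byte, (old, new) in differences.items():
    print(hex(byte), repr(old), repr(new))
```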

----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=429957&group_id=5470


From noreply@sourceforge.net  Mon Jun  4 05:24:46 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Sun, 03 Jun 2001 21:24:46 -0700
Subject: [Patches] [ python-Patches-429024 ] Deal with some unary ops at compile time
Message-ID: <E156lv4-0006Sj-00@usw-sf-web2.sourceforge.net>

Patches item #429024, was updated on 2001-05-31 07:27
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=429024&group_id=5470

Category: Parser/Compiler
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Fred L. Drake, Jr. (fdrake)
>Assigned to: Tim Peters (tim_one)
Summary: Deal with some unary ops at compile time

Initial Comment:
This patch makes unary + and - operations with numeric
literals compile to a single constant reference instead of
a constant reference followed by a UNARY_POSITIVE or
UNARY_NEGATIVE opcode.  This could be extended to support
UNARY_INVERT as well, but that would be a little more
complicated.

Folding unary + only affects one case in the regression
test, but folding the - affects 817 places (on a Linux
system with pretty much everything enabled).  I don't
know that this makes much difference at runtime, but it
certainly reduces the number of opcodes evaluated.
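
The folding described here can be observed from Python with the dis module. The sketch below assumes a reasonably recent CPython 3, where dis.get_instructions() exists and negated numeric literals are folded; opcode names and folding behaviour are implementation details that vary between versions. The opnames() helper is hypothetical.

```python
import dis


def opnames(source):
    """Compile an expression and return the opcode names it produces."""
    code = compile(source, "<example>", "eval")
    return [instr.opname for instr in dis.get_instructions(code)]


literal_ops = opnames("-5")    # negation of a numeric literal: foldable
variable_ops = opnames("-x")   # negation of a name: must happen at runtime

print(literal_ops)
print(variable_ops)
```

When the literal case is folded, no unary opcode shows up in its list, while the name case still carries one.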


----------------------------------------------------------------------

>Comment By: Fred L. Drake, Jr. (fdrake)
Date: 2001-06-03 21:24

Message:
Logged In: YES 
user_id=3066

Re-assigned to Tim since Jeremy's on a new assignment.

----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=429024&group_id=5470


From noreply@sourceforge.net  Mon Jun  4 08:02:33 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Mon, 04 Jun 2001 00:02:33 -0700
Subject: [Patches] [ python-Patches-401229 ] Optional memory profiler
Message-ID: <E156oNl-0004tN-00@usw-sf-web1.sourceforge.net>

Patches item #401229, was updated on 2000-08-18 23:49
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=401229&group_id=5470

Category: core (C code)
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Vladimir Marangozov (marangoz)
Assigned to: Jeremy Hylton (jhylton)
Summary: Optional memory profiler

Initial Comment:
 

----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2001-06-04 00:02

Message:
Logged In: YES 
user_id=21627

The patch, in its current form, fails to apply (4 hunks 
fail). Also, the URL of the discussion of the patch 
changed to

http://mail.python.org/pipermail/python-dev/2000-August/008527.html

I recommend rejecting this patch, since I cannot see what 
use the information it produces is to a Python developer.
If there is a desire to have the feature in Python, I'd 
volunteer to provide an updated patch.


----------------------------------------------------------------------

Comment By: Vladimir Marangozov (marangoz)
Date: 2000-08-19 00:18

Message:
An optional memory profiler, which goes in tandem with the optional
object memory allocator (SourceForge patch #101104). The profiler was
introduced briefly on python-dev:
http://www.python.org/pipermail/python-dev/2000-August/015239.html

Applying both patches gives for me (screen dump):

~> patch -p1 < ../obmalloc-patch
patching file `Include/objimpl.h'
patching file `Objects/object.c'
patching file `Objects/obmalloc.c'
patching file `acconfig.h'
patching file `configure.in'
~> patch -p1 < ../memprof-patch
patching file `Include/pydebug.h'
patching file `Modules/Setup.config.in'
patching file `Modules/main.c'
patching file `Modules/memprof.c'
patching file `Python/pythonrun.c'
patching file `acconfig.h'
patching file `configure.in'

- Don't forget that you need to autoheader; autoconf;

This patch:

1) introduces a new --with-memprof configure option. Off by default.
2) introduces a Py_ProfileFlag and a "-p" Python option which starts
    the profiler in Py_Initialize() before any initializations, and stops it
    in Py_Finalize() after all finalizations.
3) contains a new Modules/memprof.c module. The inclusion of this file
   in the core is similar to the thread and GC modules (Setup.config.in)

The patch *can* be applied without the object allocator and it *does*
compile on request. However, it issues a warning that it won't profile
anything, because it can't be called (the profiler can't install its hooks).
Besides, it will refuse to start(). The point is that both the profiler and
the allocator are really optional.

Needs docs & tests :( The interface can be improved (just like everything
else) but the core functionality is there. It *is* useful for getting snapshots
of the minimum allocated (object) memory, at least. Some worthy points to
consider, IMO, are listed in the TODO of memprof.c.

I am submitting this for testing, reviewing, comments and more ideas.
Overall, I think it is a BIG plus regarding Python's typical introspection.

Comments welcome. As usual, flames to /dev/null <wink>.

Status set straight to Postponed. Assigned to marangoz who's in charge of
opening it in due time, together with #101104.

----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=401229&group_id=5470


From noreply@sourceforge.net  Mon Jun  4 08:37:31 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Mon, 04 Jun 2001 00:37:31 -0700
Subject: [Patches] [ python-Patches-401335 ] Adds login to auth-type servers (smtplib.py)
Message-ID: <E156ovb-00058E-00@usw-sf-web3.sourceforge.net>

Patches item #401335, was updated on 2000-08-29 01:05
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=401335&group_id=5470

Category: library
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Ulf Engstrøm (alexisjuh)
Assigned to: Jeremy Hylton (jhylton)
Summary: Adds login to auth-type servers (smtplib.py)

Initial Comment:
 

----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2001-06-04 00:37

Message:
Logged In: YES 
user_id=21627

I've transformed this patch into a diff-style patch. 
I've left out the self.authenticated attribute, 
since it appears never to be set, and since its purpose is 
unclear.

Applying the patch seems harmless, since it just adds 
another method to the SMTP class. I still recommend 
rejecting it, since it has no apparent relationship to RFC 
2554. In that RFC, the AUTH line of the EHLO response will 
contain a list of SASL authentication mechanisms, as 
defined in RFC 2222, and listed in 
http://www.iana.org/assignments/sasl-mechanisms

So a valid AUTH request would be "AUTH CRAM-MD5", as the 
example in RFC 2554 shows. "AUTH login", as implemented in 
this patch, does not conform to the RFC. Therefore, I 
recommend rejecting this patch.
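
To illustrate the point about the AUTH line, the sketch below extracts the list of advertised mechanisms from an EHLO reply. It is a standalone parser written for this example, not smtplib's own code; the auth_mechanisms() name and the sample server reply are made up.

```python
def auth_mechanisms(ehlo_response):
    """Return the SASL mechanism names from the AUTH line of an EHLO reply.

    ehlo_response is the full multi-line reply text, with one "250-..." or
    "250 ..." line per advertised extension.
    """
    for line in ehlo_response.splitlines():
        parts = line[4:].split()           # drop the "250-" / "250 " prefix
        if parts and parts[0].upper() == "AUTH":
            return [mechanism.upper() for mechanism in parts[1:]]
    return []                              # server advertises no AUTH


# Hypothetical server reply, invented for this example.
reply = "250-mail.example.org\r\n250-AUTH CRAM-MD5 DIGEST-MD5\r\n250 8BITMIME"
mechanisms = auth_mechanisms(reply)
print(mechanisms)
```

A client would then pick one of the returned names and send it as the argument of its AUTH command.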


----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2001-01-04 07:44

Message:
Jeremy, can you look at this again?

----------------------------------------------------------------------

Comment By: Moshe Zadka (moshez)
Date: 2000-08-29 02:22

Message:
Postponed -- we're in feature freeze.
Assigned to Jeremy in case he disagrees.
Note also that it's preferable to submit patches in diff format, not as human-readable summaries. Try "diff -c".

----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=401335&group_id=5470


From noreply@sourceforge.net  Mon Jun  4 08:47:12 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Mon, 04 Jun 2001 00:47:12 -0700
Subject: [Patches] [ python-Patches-401606 ] threads and __del__
Message-ID: <E156p4y-0005Wj-00@usw-sf-web1.sourceforge.net>

Patches item #401606, was updated on 2000-09-22 07:47
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=401606&group_id=5470

Category: core (C code)
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Toby Dickenson (htrd)
Assigned to: Tim Peters (tim_one)
Summary: threads and __del__

Initial Comment:
 

----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2001-06-04 00:47

Message:
Logged In: YES 
user_id=21627

I recommend approving this patch.


----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2000-09-22 08:06

Message:
Works fine for me.
Assigned to Tim for review since he's the race condition czar.

----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=401606&group_id=5470


From noreply@sourceforge.net  Mon Jun  4 08:56:18 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Mon, 04 Jun 2001 00:56:18 -0700
Subject: [Patches] [ python-Patches-401713 ] Free extension DLLs' handles during the Py_Finalize()
Message-ID: <E156pDm-0005f9-00@usw-sf-web1.sourceforge.net>

Patches item #401713, was updated on 2000-09-29 12:02
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=401713&group_id=5470

Category: Windows
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Yakov Markovitch (markovitch)
Assigned to: Tim Peters (tim_one)
Summary: Free extension DLLs' handles during the Py_Finalize()

Initial Comment:
 

----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2001-06-04 00:56

Message:
Logged In: YES 
user_id=21627

I recommend rejecting this patch. If such a feature is 
implemented, it should be implemented uniformly across 
platforms - i.e. on Unix, appropriate dlclose calls should 
be issued.

Furthermore, I don't see the problem with the DLLs being 
loaded. AFAIK, each DLL will be loaded only once, so even 
if the interpreter is stopped and started again, you get 
only one copy of the DLL's state per process, right? So 
what is the problem?

Finally, it seems reasonable that people embedding the 
interpreter might need to customize its code. It is 
possible that the finalization procedure of user A won't 
work for user B, e.g. because they require state to 
survive different activations and deactivations.


----------------------------------------------------------------------

Comment By: Yakov Markovitch (markovitch)
Date: 2000-10-06 02:54

Message:
Yes, I agree with Mark, but there is another side to the problem. Suppose we have an application that uses the interpreter through dynamic loading (I mean through LoadLibrary). It is unlikely to do so directly, but the application can load/unload some other DLL which, in turn, uses an embedded interpreter. Now, after freeing this DLL, the application still has ALL the extensions that were used by this DLL loaded! (Even though it does not embed the interpreter itself at all!)

----------------------------------------------------------------------

Comment By: Mark Hammond (mhammond)
Date: 2000-10-05 18:19

Message:
I agree we should close handles that we can't use as extension modules.

I am quite skeptical of the unloading of modules, tho.  Python simply doesn't provide enough cleanup semantics to guarantee we are finished with the module at Py_Finalize() time.  Indeed, extension modules are one main reason why Python often can not handle multiple Py_Initialize()/Py_Finalize() calls in the same process.

I think that Python needs to grow module termination semantics.  Something like, at Py_Finalize time:

Try and find function "term_{module}"
If function exists:
  call function
  free handle
else:
  pass

Thus - only modules that have gone to the trouble of providing a finalize function can be trusted to be unloaded.
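
The proposed rule can be modelled in a few lines of Python. This is a toy simulation under stated assumptions, not interpreter code: LoadedModule stands in for an extension DLL handle, and its term attribute plays the role of the hypothetical "term_{module}" function.

```python
class LoadedModule:
    """Stand-in for a loaded extension module handle (hypothetical)."""

    def __init__(self, name, term=None):
        self.name = name
        self.term = term      # optional termination hook, None if absent
        self.loaded = True

    def free_handle(self):
        self.loaded = False


def finalize(modules):
    """Unload only the modules that provide a termination hook."""
    unloaded = []
    for module in modules:
        if module.term is not None:
            module.term()
            module.free_handle()
            unloaded.append(module.name)
        # Modules without a hook stay loaded: nothing guarantees that they
        # have released everything they own.
    return unloaded


log = []
modules = [
    LoadedModule("spam", term=lambda: log.append("term_spam")),
    LoadedModule("eggs"),
]
unloaded = finalize(modules)
print(unloaded, log)
```

Only "spam" is unloaded here; "eggs" keeps its handle because it never opted in.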

On one hand, the addition of the map means we _are_ in a better position for better finalization semantics on Windows.  On the larger hand, module finalization semantics must be cross-platform anyway.

So - while I acknowledge the problem, I don't believe this alone is a reasonable solution. 

Marking as postponed, and assigning back to Tim, so he can rule on the next step....  This came up a number of years ago, and Guido agreed "better" semantics were needed.  Sounds like PEP material.  I guess I _do_ care enough about this issue to own a PEP on it, as long as no-one needs the PEP finalized this year ;-)

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2000-10-05 18:02

Message:
Mark, you got anything to say about this?  Can't say I've ever noticed a problem here.  Note that "the patch" is actually a .zip archive, and it takes a little effort to sort out what's what.

----------------------------------------------------------------------

Comment By: Fred L. Drake, Jr. (fdrake)
Date: 2000-09-30 17:01

Message:
Assigned to one of our Windows guys for review.

----------------------------------------------------------------------

Comment By: Yakov Markovitch (markovitch)
Date: 2000-09-29 12:09

Message:
This patch is intended to fix the following problem:
Python on Windows never frees DLLs loaded as extensions. While this is not a big problem when the interpreter is used in the standard way, it becomes THE problem (or even a disaster) when the interpreter DLL is dynamically initialized/finalized many times from one process during a single run.
Moreover, even with a single initialization there is a trap: DLLs loaded by mistake are unloaded only when the process finishes (e.g. suppose there is a foo.dll in the current directory and foo.dll is NOT a Python extension;
"import foo" ends with an error, but foo.dll will be left hanging in the process's address space!)

This patch
    1) frees a DLL handle if it has no proper initialization function
    2) registers all handles of successfully loaded dynamic extensions in an internal array
    3) frees all registered handles during Py_Finalize()
    
Yakov Markovitch,
markovitch@iso.ru    

----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=401713&group_id=5470


From noreply@sourceforge.net  Mon Jun  4 08:59:07 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Mon, 04 Jun 2001 00:59:07 -0700
Subject: [Patches] [ python-Patches-402780 ] SET_LINENO for augassign
Message-ID: <E156pGV-0005hU-00@usw-sf-web1.sourceforge.net>

Patches item #402780, was updated on 2000-12-11 08:03
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=402780&group_id=5470

Category: demos and tools
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Neil Schemenauer (nascheme)
Assigned to: Jeremy Hylton (jhylton)
Summary: SET_LINENO for augassign

Initial Comment:
 

----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2001-06-04 00:59

Message:
Logged In: YES 
user_id=21627

I recommend approving this patch.


----------------------------------------------------------------------

Comment By: Neil Schemenauer (nascheme)
Date: 2000-12-11 08:05

Message:
Line numbers are currently not set for augmented assignment statements for code compiled by Tools/compiler.  Here is a one line fix.

----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=402780&group_id=5470


From noreply@sourceforge.net  Mon Jun  4 09:00:19 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Mon, 04 Jun 2001 01:00:19 -0700
Subject: [Patches] [ python-Patches-402891 ] Alternative readline module
Message-ID: <E156pHf-0005iT-00@usw-sf-web1.sourceforge.net>

Patches item #402891, was updated on 2000-12-17 14:22
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=402891&group_id=5470

Category: Modules
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Neil Schemenauer (nascheme)
Assigned to: Neil Schemenauer (nascheme)
Summary: Alternative readline module

Initial Comment:
 

----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2001-06-04 01:00

Message:
Logged In: YES 
user_id=21627

I see this patch is still not committed. Any reason why 
not?


----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2001-01-18 16:04

Message:
Neil, I'm assigning this back to you and reopening it (from Accepted).  

It seems patches with status Accepted frequently get lost -- probably because "My Patches" doesn't show them.

In any case, I think you should just check this in ASAP and close the patch!


----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2001-01-16 06:59

Message:
Ah, of course.  I saw that, even played with it a bit.

Looks cool, but I don't know about using it to replace readline.

But you might want to change the name given that pyrl is already taken. ;-)


----------------------------------------------------------------------

Comment By: Michael Hudson (mwh)
Date: 2001-01-16 06:34

Message:
pyrl is my line reader written in Python that I've been intermittently blathering about on python-dev:

http://www-jcsu.jesus.cam.ac.uk/~mwh21/hacks/pyrl-0.2.0.tar.gz

it's still very experimental, though.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2001-01-16 05:41

Message:
What's pyrl in this context?  A Google search turns up a bunch of references to a Perl preprocessor that takes Pythonic syntax and translates it into Perl. :-)

[ESR replied to Neil via email: "I'm on it.  Gotta ship my PC9 paper first, though."]


----------------------------------------------------------------------

Comment By: Michael Hudson (mwh)
Date: 2001-01-16 02:23

Message:
You could defer the decision between readline and edline until runtime, as in: (will sf mangle this? we'll see)

Index: Modules/main.c
===================================================================
RCS file: /cvsroot/python/python/dist/src/Modules/main.c,v
retrieving revision 1.47
diff -c -r1.47 main.c
*** Modules/main.c      2000/12/15 22:00:54     1.47
--- Modules/main.c      2001/01/16 10:19:45
***************
*** 267,274 ****
            isatty(fileno(stdin))) {
                PyObject *v;
                v = PyImport_ImportModule("readline");
!               if (v == NULL)
                        PyErr_Clear();
                else
                        Py_DECREF(v);
        }
--- 267,280 ----
            isatty(fileno(stdin))) {
                PyObject *v;
                v = PyImport_ImportModule("readline");
!               if (v == NULL) {
                        PyErr_Clear();
+                       v = PyImport_ImportModule("edline");
+                       if (v == NULL)
+                               PyErr_Clear();
+                       else
+                               Py_DECREF(v);
+               }
                else
                        Py_DECREF(v);
        }

(and pyrl's not going to be ready for 2.1, by a country mile...)
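
The same runtime fallback can be written at the Python level. This is a sketch of the idea only: import_first_available() is a hypothetical helper, and "edline" is the module name used in the diff above, which may not exist on a given installation.

```python
import importlib


def import_first_available(names):
    """Try each module name in order and return the first that imports."""
    for name in names:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue          # plays the role of PyErr_Clear() in the C diff
    return None


line_editor = import_first_available(["readline", "edline"])
print(line_editor.__name__ if line_editor else "no line editor available")
```

As in the C version, a failure to import either module is silently ignored and the interpreter simply runs without line editing.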


----------------------------------------------------------------------

Comment By: Neil Schemenauer (nascheme)
Date: 2001-01-15 21:30

Message:
Hmm, I still like pyrl better.  What to do about GNU
readline now that it's in Setup.conf?  You can't enable
them both and I don't feel comfortable enough with
autoconf to fix things.  ESR, if you could add the magic
to test for termios that would be cool.  configure should
use readline if it's there and fall back to edline if 
termios is available.

Feel free to bounce it back to me if you don't have time.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2001-01-03 06:10

Message:
Neil, this has now status Accepted.  Go ahead and check it in!


----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2000-12-18 14:17

Message:
Neil, I propose that you check this in.

The edline.c file would need a little work to compile without warnings, and you should add #HAVE_STRDUP to edline.h (Python makes sure strdup() is always present).

The comment for Setup.dist "Neil Schemenauer's edline library" sounds a little strange given that most of the code is by others. Maybe "Neil Schemenauer's edline wrapper module"?

----------------------------------------------------------------------

Comment By: Neil Schemenauer (nascheme)
Date: 2000-12-17 14:28

Message:
I like Michael Hudson's idea of writing a readline
replacement in Python using modules like _curses and termios
better but I had this patch 90% complete before I received
his email.  I stripped the editline library down and updated
it for modern Unix systems.  I have no idea if it compiles
on anything other than Linux however.

----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=402891&group_id=5470


From noreply@sourceforge.net  Mon Jun  4 17:59:20 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Mon, 04 Jun 2001 09:59:20 -0700
Subject: [Patches] [ python-Patches-430030 ] Avoid multiple BOMs in UTF-16 streams
Message-ID: <E156xhI-0003Yr-00@usw-sf-web3.sourceforge.net>

Patches item #430030, was updated on 2001-06-04 09:59
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=430030&group_id=5470

Category: library
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Martin v. Löwis (loewis)
Assigned to: M.-A. Lemburg (lemburg)
Summary: Avoid multiple BOMs in UTF-16 streams

Initial Comment:
This patch fixes the UTF-16 reader and writer to emit 
and expect the BOM only at the beginning of the 
stream. It is implemented by changing the 
encode/decode function of the stream object after the 
byte order is detected.

In addition, it adds a new test case test_codecs. 
When committing the patch, the corresponding output 
file must be generated.
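
The intended behaviour can be checked with a stream writer that receives two separate writes. This is a sketch assuming an interpreter whose UTF-16 stream writer already emits the BOM only once; the variable names are illustrative.

```python
import codecs
import io

buffer = io.BytesIO()
writer = codecs.getwriter("utf-16")(buffer)

# Two separate writes on the same stream: only the first should be
# preceded by a byte order mark.
writer.write("a")
writer.write("b")

data = buffer.getvalue()
bom_count = data.count(codecs.BOM_UTF16)   # BOM in native byte order
decoded = data.decode("utf-16")
print(data, bom_count, decoded)
```

With a BOM in front of every write, the second mark would come back from decoding as a stray U+FEFF character in the middle of the text.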


----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=430030&group_id=5470


From noreply@sourceforge.net  Mon Jun  4 18:13:25 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Mon, 04 Jun 2001 10:13:25 -0700
Subject: [Patches] [ python-Patches-421893 ] Cleanup GC API
Message-ID: <E156xuv-0003nr-00@usw-sf-web3.sourceforge.net>

Patches item #421893, was updated on 2001-05-06 14:42
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=421893&group_id=5470

Category: core (C code)
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Neil Schemenauer (nascheme)
Assigned to: Fred L. Drake, Jr. (fdrake)
Summary: Cleanup GC API

Initial Comment:
This patch adds four new APIs:

	PyObject_GC_New
	PyObject_GC_NewVar
	PyObject_GC_Resize
	PyObject_GC_Del

and renames PyObject_GC_Init and PyObject_GC_Fini to:

	PyObject_GC_Track
	PyObject_GC_Ignore

respectively.  Objects that wish to be tracked by the
collector must use these new APIs.  Many more details
about the GC implementation are hidden inside
gcmodule.c.  There seems to be no change in
performance.

Note that PyObject_GC_{New,NewVar} automatically adds
the object to the GC lists.  There is no need to
call PyObject_GC_Track.  PyObject_GC_Del automatically
removes the object from the GC list but usually you
want to call PyObject_GC_Ignore yourself (DECREFs can
end up running arbitrary code).

It should be more difficult to corrupt the GC linked
lists now.  Also, you can now call PyObject_GC_Ignore
on objects that you know will not create reference cycles. The
_weakref module does this.  Previously, every object
that had the GC type flag set and could be found by
using tp_traverse had to be in a GC linked list.
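
The C functions named above are not callable from Python, but the notion of an object being tracked by the collector can be seen through the gc module. This sketch assumes an interpreter that provides gc.is_tracked(), which was added well after this patch; the Node class is made up for the example.

```python
import gc


class Node:
    """Container object that can take part in a reference cycle."""

    def __init__(self):
        self.other = None


tracked_list = gc.is_tracked([])    # containers may form cycles
tracked_int = gc.is_tracked(42)     # atomic objects cannot

a, b = Node(), Node()
a.other, b.other = b, a             # build a two-object cycle
gc.collect()                        # start from a clean slate
del a, b                            # the cycle is now unreachable
found = gc.collect()                # number of unreachable objects found
print(tracked_list, tracked_int, found)
```

Only tracked objects are visited by the collector, which is why untracking an object that cannot participate in a cycle is safe.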


----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2001-06-04 10:13

Message:
Logged In: YES 
user_id=21627

I have two problems with this patch:
1. It comes with no documentation.
2. It breaks existing third-party modules which use the 
   GC API as defined in Python 2.
Consequently, I recommend rejection of the patch in its 
current form.


----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=421893&group_id=5470


From noreply@sourceforge.net  Mon Jun  4 18:28:59 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Mon, 04 Jun 2001 10:28:59 -0700
Subject: [Patches] [ python-Patches-421709 ] Access { thread id : frame } dict
Message-ID: <E156y9z-0004u1-00@usw-sf-web1.sourceforge.net>

Patches item #421709, was updated on 2001-05-05 13:30
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=421709&group_id=5470

Category: core (C code)
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: John D. Heintz (jheintz)
Assigned to: Barry Warsaw (bwarsaw)
Summary: Access { thread id : frame } dict

Initial Comment:
This patch adds a new function sys._getframes() that 
returns a dictionary mapping from thread id to 
current frame object.

This is very useful when diagnosing deadlock issues 
in Python code.

The new C code function is purely additive except for 
modifying the PyThreadState struct (adding a long 
thread_ident) and modifying PyThreadState_New() 
function to set this new long.
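
Later CPython releases provide sys._current_frames(), which returns the same kind of mapping from thread id to frame. The sketch below uses that function as a stand-in for the sys._getframes() proposed here, so it illustrates the use case rather than this patch's actual interface; thread_stacks() and blocked_worker() are made-up names.

```python
import sys
import threading
import traceback


def thread_stacks():
    """Map each thread id to the formatted stack of its current frame."""
    return {
        thread_id: traceback.format_stack(frame)
        for thread_id, frame in sys._current_frames().items()
    }


started = threading.Event()
release = threading.Event()


def blocked_worker():
    started.set()
    release.wait()          # stand-in for a thread stuck on a lock


worker = threading.Thread(target=blocked_worker)
worker.start()
started.wait()              # make sure the worker is running
stacks = thread_stacks()    # snapshot taken while the worker is blocked
release.set()
worker.join()

for thread_id, stack in stacks.items():
    print(thread_id, stack[-1].strip())
```

In a deadlocked program, the last entry of each stack shows where every thread is waiting.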


----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2001-06-04 10:28

Message:
Logged In: YES 
user_id=21627

I think the patch could use some more documentation, e.g. 
as a patch to Doc/lib/libsys.tex. E.g. what are the tuples 
that are put into the dictionaries?

Also, isn't there a problem with the tuple size? The patch 
allocates tuples of size 0, but then puts things into 
index 0. Is there any kind of test case for this code?

Finally, I don't think the docstring should say that the 
function is for internal and specialized purposes only 
(what specialized purposes, anyway), if you think its 
primary use is in diagnosing deadlocks. It should only 
document what the function does, not what you intend it to 
be used for.

For these reasons, I also think its name should not start 
with an underscore.


----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=421709&group_id=5470


From noreply@sourceforge.net  Mon Jun  4 18:52:19 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Mon, 04 Jun 2001 10:52:19 -0700
Subject: [Patches] [ python-Patches-421709 ] Access { thread id : frame } dict
Message-ID: <E156yWZ-0005Ma-00@usw-sf-web1.sourceforge.net>

Patches item #421709, was updated on 2001-05-05 13:30
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=421709&group_id=5470

Category: core (C code)
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: John D. Heintz (jheintz)
Assigned to: Barry Warsaw (bwarsaw)
Summary: Access { thread id : frame } dict

Initial Comment:
This patch adds a new function sys._getframes() that 
returns a dictionary mapping from thread id to 
current frame object.

This is very useful when diagnosing deadlock issues 
in Python code.

The new C code function is purely additive except for 
modifying the PyThreadState struct (adding a long 
thread_ident) and modifying PyThreadState_New() 
function to set this new long.


----------------------------------------------------------------------

>Comment By: John D. Heintz (jheintz)
Date: 2001-06-04 10:52

Message:
Logged In: YES 
user_id=20438

Martin:  I agree with you on the documentation issue and 
will look into the tuple size issue you raised.

The docstring is modeled on the sys._getframe() function 
so I figured it would be sufficient to follow the leader.

(I think that both sys._getframe() and sys._getframes() 
should be part of the public api for the sys module by the 
way.)


----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2001-06-04 10:28

Message:
Logged In: YES 
user_id=21627

I think the patch could use some more documentation, e.g. 
as a patch to Doc/lib/libsys.tex. E.g. what are the tuples 
that are put into the dictionaries?

Also, isn't there a problem with the tuple size? The patch 
allocates tuples of size 0, but then puts things into 
index 0. Is there any kind of test case for this code?

Finally, I don't think the docstring should say that the 
function is for internal and specialized purposes only 
(what specialized purposes, anyway), if you think its 
primary use is in diagnosing deadlocks. It should only 
document what the function does, not what you intend it to 
be used for.

For these reasons, I also think its name should not start 
with an underscore.


----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=421709&group_id=5470


From noreply@sourceforge.net  Mon Jun  4 19:32:29 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Mon, 04 Jun 2001 11:32:29 -0700
Subject: [Patches] [ python-Patches-407764 ] allow whitespace lines for doctest tests
Message-ID: <E156z9R-0005RE-00@usw-sf-web3.sourceforge.net>

Patches item #407764, was updated on 2001-03-11 13:37
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=407764&group_id=5470

Category: library
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Trent Mick (tmick)
>Assigned to: Trent Mick (tmick)
Summary: allow whitespace lines for doctest tests

Initial Comment:
Currently doctest.py does not allow individual tests 
to have all-whitespace output lines. This patch 
proposes a fix for this. With this patch a leading '.' 
on a doctest output line, if and only if the tests are 
indented, will signal that following whitespace *is* 
the expected output. 

For example, currently this cannot be doctest'ed

"""
    >>> print "\nhello\n"

    hello

    >>>
"""

But with this patch *this* can be:

# file test_doctest.py
"""
    >>> print "\nhello\n"
.
    hello
.
    >>>
"""
def _test():
    import doctest, test_doctest
    return doctest.testmod(test_doctest)
if __name__ == "__main__":
    _test()


----------------------------------------------------------------------

>Comment By: Tim Peters (tim_one)
Date: 2001-06-04 11:32

Message:
Logged In: YES 
user_id=31435

Should have assigned this back to Trent months ago.

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2001-03-16 21:04

Message:
Logged In: YES 
user_id=31435

Trent, yuck.  doctests are primarily documentation, and 
there's nothing about "." that suggests-- let alone 
screams --"ah, this line is really a blank line, not the 
period that it sure looks like".  Too confusing.

I'd be happy with this if it *screamed* "blank line", 
though!  For example, accept

<really a blank line>

as meaning it's really a blank line.  In that case, though, 
note that:

1. The restriction about blank lines is documented in both 
doctest's docstrings and in the Library Manual, so this 
would also need doc changes in both places.

and

2. doctest is self-testing, i.e. the standard test for 
doctest simply runs doctest on doctest.  So in the very 
same place you document your blank line convention in the 
doctest docstring, you should also include an executable  
doctest example in the docstring.  Then the standard 
test_doctest.py will verify that it works exactly as 
advertised forever more.
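
For reference, a minimal sketch of how an explicit marker behaves. It uses the <BLANKLINE> marker that present-day doctest recognizes, which is an assumption relative to this thread: it is neither the '.' syntax from the patch nor necessarily the exact spelling suggested above. The helper name run_doc and the two sample strings are made up for illustration.

```python
import doctest

# Expected output that really contains blank lines, spelled with an explicit marker.
MARKED = '''
>>> print("\\nhello\\n")
<BLANKLINE>
hello
<BLANKLINE>
'''

# The same example with literal blank lines: doctest treats the first blank
# line as the end of the expected output, so this example fails.
UNMARKED = '''
>>> print("\\nhello\\n")

hello

'''


def run_doc(text):
    """Run one doctest string and return (failed, attempted)."""
    test = doctest.DocTestParser().get_doctest(text, {}, "blank_line_demo", None, 0)
    runner = doctest.DocTestRunner(verbose=False)
    result = runner.run(test, out=lambda s: None)  # discard failure reports
    return result.failed, result.attempted
```

Running run_doc(MARKED) and run_doc(UNMARKED) side by side shows why some visible marker is needed: only the marked version can express blank output lines.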

----------------------------------------------------------------------

Comment By: Trent Mick (tmick)
Date: 2001-03-11 13:40

Message:
Logged In: YES 
user_id=34892

Grrr, the code I put in the comment is supposed to be 
indented of course. I will attach the test_doctest.py to 
clarify.

----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=407764&group_id=5470


From noreply@sourceforge.net  Mon Jun  4 21:04:53 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Mon, 04 Jun 2001 13:04:53 -0700
Subject: [Patches] [ python-Patches-403514 ] small speedup in Tkinter.Misc._bind
Message-ID: <E1570ar-0006ui-00@usw-sf-web3.sourceforge.net>

Patches item #403514, was updated on 2001-01-30 12:21
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=403514&group_id=5470

Category: Tkinter
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Markus F.X.J. Oberhumer (mfx)
Assigned to: Fredrik Lundh (effbot)
Summary: small speedup in Tkinter.Misc._bind

Initial Comment:
This patch precomputes _subst_format_str to avoid a call to _string.join() on each invocation of _bind. It gives a small but noticeable speed improvement when creating a lot of bindings, such as in the upcoming PySol Mahjongg games.
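
The idea can be sketched roughly as follows, in present-day Python. This is not the patch itself: the class Binder, its two methods, and the shortened format tuple are hypothetical stand-ins, and only the string handling is modeled.

```python
class Binder:
    """Toy stand-in for Tkinter.Misc; only the string handling is modeled."""

    # Shortened, illustrative list of substitution fields.
    _subst_format = ('%#', '%b', '%f', '%h', '%k', '%s', '%t', '%w', '%x', '%y')

    # Joined once, when the class body runs, instead of on every bind call.
    _subst_format_str = " ".join(_subst_format)

    def command_slow(self, funcid):
        # Rebuilds the same string on each call.
        return "%s %s" % (funcid, " ".join(self._subst_format))

    def command_fast(self, funcid):
        # Reuses the precomputed class attribute.
        return "%s %s" % (funcid, self._subst_format_str)
```

Both methods return the same string; the second simply skips the repeated join.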

----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2001-06-04 13:04

Message:
Logged In: YES 
user_id=21627

I recommend approving this patch.


----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=403514&group_id=5470


From noreply@sourceforge.net  Mon Jun  4 21:45:15 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Mon, 04 Jun 2001 13:45:15 -0700
Subject: [Patches] [ python-Patches-403743 ] [windows] Correction to bug #131273
Message-ID: <E1571Dv-0000Fu-00@usw-sf-web1.sourceforge.net>

Patches item #403743, was updated on 2001-02-12 01:03
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=403743&group_id=5470

Category: Windows
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Christophe Gouiran (cgouiran)
Assigned to: Mark Hammond (mhammond)
Summary: [windows] Correction to bug #131273

Initial Comment:
I found a bug in the posixmodule.c file: child processes are not killed when Python exits.
In posixmodule.c I wrote a win32_atexit() function that does the trick.
It is then registered in the INITFUNC function with the atexit() function.

Now, at exit, any child processes are automatically killed.

The patch must be applied in the module directory, not at the Python root.

----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2001-06-04 13:45

Message:
Logged In: YES 
user_id=21627

It appears that the patch is a nearly empty file 
containing only garbage. Christophe, you probably should 
try uploading it again.


----------------------------------------------------------------------

Comment By: Fred L. Drake, Jr. (fdrake)
Date: 2001-02-12 08:07

Message:
Assigned to Mark Hammond since the original bug is already assigned to him.

----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=403743&group_id=5470


From noreply@sourceforge.net  Mon Jun  4 21:55:33 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Mon, 04 Jun 2001 13:55:33 -0700
Subject: [Patches] [ python-Patches-403753 ] zlib decompress; uncontrollable memory usage
Message-ID: <E1571Nt-0000Tt-00@usw-sf-web1.sourceforge.net>

Patches item #403753, was updated on 2001-02-12 08:10
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=403753&group_id=5470

Category: Modules
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Toby Dickenson (htrd)
Assigned to: Jeremy Hylton (jhylton)
Summary: zlib decompress; uncontrollable memory usage

Initial Comment:
zlib's decompress method will allocate as much memory as is
needed to hold the decompressed output. The length of the output
buffer may be very much larger than the length of the input buffer,
and the python code calling the decompress method has no other way
to control how much memory is allocated.

In experimentation, I have seen decompress generate output that is
1000 times larger than its input.

These characteristics may make the decompress method unsuitable for
handling data obtained from untrusted sources (for example,
in a http proxy which implements gzip encoding) since it may be
vulnerable to a denial of service attack. A malicious user could
construct a moderately sized input which forces 'decompress' to
try to allocate too much memory.

This patch adds a new method, decompress_incremental, which allows
the caller to specify the maximum size of the output. This method
returns the excess input, in addition to the decompressed output.

It is possible to solve this problem without a patch:
If input is fed to the decompressor a few tens of bytes
at a time, memory usage will surge by (at most)
a few tens of kilobytes. Such a process is a kludge, and much
less efficient than the approach used in this patch.

(I've not been able to test the documentation patch; I hope it's OK.)

(This patch also includes the change from Patch #103748)


----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2001-06-04 13:55

Message:
Logged In: YES 
user_id=21627

The patch looks good to me; I recommend approving it.


----------------------------------------------------------------------

Comment By: A.M. Kuchling (akuchling)
Date: 2001-04-10 10:52

Message:
Logged In: YES 
user_id=11375

I've let this patch gather dust long enough.  Unassigning so
that someone else can review it.

----------------------------------------------------------------------

Comment By: Gregory P. Smith (greg)
Date: 2001-04-07 00:20

Message:
Logged In: YES 
user_id=413

as a side note.  I believe I implemented a python workaround
for this problem by just decompressing data in small chunks
(4k at a time) using a decompressor object.

see the mojonation project on sourceforge if you're
curious.  (specifically, in the mojonation evil module, look
at common/mojoutil.py for function named
safe_zlib_decompress).

Regardless, I like the idea of this patch.  It would be
good to have that in the main API and documentation for
simplicity. (and because there are too many programmers out
there who don't realize potential denial of service issues
on their own...)


----------------------------------------------------------------------

Comment By: Toby Dickenson (htrd)
Date: 2001-02-22 04:50

Message:
New patch implementing a new optional parameter to .decompress, and a new attribute .unconsumed_tail
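
A minimal sketch of how a caller might use such an interface to cap memory use, assuming the API as it exists in today's zlib module: decompressobj().decompress(data, max_length), the .unconsumed_tail attribute, and the .eof flag. The function name bounded_decompress and its limit and chunk parameters are made up for illustration.

```python
import zlib


def bounded_decompress(data, limit, chunk=4096):
    """Decompress a complete zlib stream, refusing to produce more than `limit` bytes."""
    d = zlib.decompressobj()
    pieces = []
    total = 0
    pending = data
    while not d.eof:
        # Ask for at most `chunk` bytes of output per call.
        piece = d.decompress(pending, chunk)
        if not piece and not d.unconsumed_tail and not d.eof:
            raise ValueError("truncated or stalled stream")
        total += len(piece)
        if total > limit:
            raise ValueError("decompressed output exceeds limit")
        pieces.append(piece)
        # Input that was not processed because of the output cap.
        pending = d.unconsumed_tail
    return b"".join(pieces)
```

The caller never holds more than roughly `limit` plus one chunk of output, however large the stream claims to be.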


----------------------------------------------------------------------

Comment By: Toby Dickenson (htrd)
Date: 2001-02-22 03:42

Message:
Waaah - that last comment should be 'cant' not 'can'

----------------------------------------------------------------------

Comment By: Toby Dickenson (htrd)
Date: 2001-02-22 03:40

Message:
We can reuse .unused_data without introducing an ambiguity. I will prepare a patch that uses a new attribute .unconsumed_tail


----------------------------------------------------------------------

Comment By: A.M. Kuchling (akuchling)
Date: 2001-02-21 11:32

Message:
Doesn't .unused_data serve much the same purpose, though?
So that even with a maximum size, .decompress() always returns a string, and .unused_data would contain the unprocessed data.

----------------------------------------------------------------------

Comment By: Toby Dickenson (htrd)
Date: 2001-02-21 06:00

Message:
I did consider that....

An extra change that you didn't mention is the need for a different return value. Currently .decompress() always returns a string. The new method in my patch returns a tuple containing the same string, and an integer specifying how many bytes were consumed from the input.

Overloading return values based on an optional parameter seems a little hairy to me, but I would be happy to change the patch if that is your preferred option.

I also considered (and rejected) the possibility of adding an optional max-size argument to .decompress() as you suggest, but raising an exception if this limit is exceeded. This avoids the need for an extra return value, but loses out on flexibility.


----------------------------------------------------------------------

Comment By: A.M. Kuchling (akuchling)
Date: 2001-02-20 18:48

Message:
Rather than introducing a new method, why not just add an optional maxlen argument to .decompress()?  I think the changes would be:

* add 'int maxlen=-1;'
* add "...|i" ... ,&maxlen to the argument parsing
* if maxlen != -1, length = maxlen else length = DEFAULTALLOC;
* Add '&& maxlen==-1' to the while loop.  (Use the current CVS; I just checked in a patch rearranging the zlib module a bit.)

Do you want to make those changes and resubmit the patch?


----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=403753&group_id=5470


From noreply@sourceforge.net  Mon Jun  4 21:59:12 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Mon, 04 Jun 2001 13:59:12 -0700
Subject: [Patches] [ python-Patches-403977 ] Rename config.h to pyac_config.h, per SF bug #131774
Message-ID: <E1571RQ-0000XQ-00@usw-sf-web1.sourceforge.net>

Patches item #403977, was updated on 2001-02-23 13:28
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=403977&group_id=5470

Category: Build
Group: None
Status: Open
Resolution: Postponed
Priority: 1
Submitted By: Thomas Wouters (twouters)
Assigned to: Thomas Wouters (twouters)
Summary: Rename config.h to pyac_config.h, per SF bug #131774

Initial Comment:
This patch fixes the UNIX and Windows builds to use 'pyac_config.h' instead of 'config.h', to avoid the problems summarized in SF bug #131774. It doesn't address the placing issue, however, because I believe it's intended to be like this.

Most changes were done using a fairly intelligent shell+sed oneliner, but they should be correct. The Windows build *seems* correct, though I can't be sure. Someone will have to check ;) It is probably a good idea to remove 'config.h' before testing, to be sure I got all references.

The UNIX build requires that autoconf is installed, and requires that 'autoheader ; autoconf' is run before running 'configure'. Removing config.h(.in) is also a good idea.

I excluded the OS2 build files, and will be uploading those as a separate patch to avoid making this one unreadable. Though only two files are involved, they both list all dependencies for *all* files in their entirety, so the patch is quite large. If those files are auto-generated, someone please tell me so :-)

I also didn't fix distutils, though it looks like it does need fixing. And I didn't do anything wrt. backwards compatibility. We should probably provide a config.h that just does

#warning Warning: Use of Python-specific config.h is deprecated. Use pyac_config.h instead.
#include <pyac_config.h>

The name is just my suggestion, changing it into something less acronymic would be no problem at all. I think 'pythonconfig.h' gives the wrong message though: the file isn't used to configure Python itself, after all ;)


----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2001-06-04 13:59

Message:
Logged In: YES 
user_id=21627

I think we should come to a conclusion on these patches 
and apply one of them. I still like pyconfig.h better 
than pyac_config.h, but apart from that, *something* 
should get installed.


----------------------------------------------------------------------

Comment By: Thomas Wouters (twouters)
Date: 2001-04-11 05:43

Message:
Logged In: YES 
user_id=34209

I'm not sure about the supersedence here. See my comment in
#411138.


----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2001-04-10 15:15

Message:
Logged In: YES 
user_id=6380

Is this superseded by patch #411138?


----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2001-03-18 16:09

Message:
Logged In: YES 
user_id=6380

Let's do this after 2.1 is released.  Status set to postponed and priority lowered.

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2001-03-01 17:29

Message:
Logged In: YES 
user_id=31435

Na, I don't mind the pyac name.  I had forgotten (or 
perhaps never knew) that this thing is a generated file (on 
Windows it's done by hand).  It's an internal 
implementation detail anyway, so it doesn't matter if the 
name "makes sense" to Windows geeks; at least pyac_config 
will make some sense to Linux dweebs.

----------------------------------------------------------------------

Comment By: Trent Mick (tmick)
Date: 2001-03-01 17:08

Message:
Logged In: YES 
user_id=34892

Tim said:
> BTW, I have no idea what "pyac" is supposed to bring
> to mind.  Is that some Unixism?

In answer to that. How about just calling it "pyconfig.h". 
The reference to autoconf is not very accurate for Windows. 

----------------------------------------------------------------------

Comment By: Thomas Wouters (twouters)
Date: 2001-02-28 01:07

Message:
Logged In: YES 
user_id=34209

I forgot to mention that I think this should be postponed
until 2.2 or 2.1.1 anyway. It's not that big a change, but
it's big enough to have weird and unsuspected side effects.

The bug is now numbered #231774, by the way. The problem is
that 'config.h' is an oft-used name, and if you include it
but have another directory with another project's config.h
earlier in your include path, you get the wrong one. Similarly,
if you intend to use the other one, you may get this one instead.
Leaving a fake config.h would only cause this patch to fix
half of those problems, but only the first problem was
reported in the bugreport :)

The 'pyac_config' name comes from 'python', 'autoconf',
'config', and is IMHO sufficiently vague that it implies it
is autogenerated :-)


----------------------------------------------------------------------

Comment By: Jeremy Hylton (jhylton)
Date: 2001-02-27 23:14

Message:
Logged In: YES 
user_id=31392

No time

----------------------------------------------------------------------

Comment By: Neil Schemenauer (nascheme)
Date: 2001-02-27 13:53

Message:
Logged In: YES 
user_id=35752

SF seems to have changed the bug ids!  I can't find bug 
#131774.  Unless there is a very good reason for the
change I'm against it for 2.1.

----------------------------------------------------------------------

Comment By: A.M. Kuchling (akuchling)
Date: 2001-02-27 13:05

Message:
Logged In: YES 
user_id=11375

Regarding Distutils: I think the only actual *code* 
that would change is in distutils/sysconfig.py, 
in the get_config_h_filename() method.  For backward
compat., this method would probably have to check the Python
version and use pyac_config.h if the version is 2.1 or
greater.
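
A rough sketch of that version check, in present-day Python. The function name config_header_name is hypothetical and this is not the Distutils code; the header name pyac_config.h is the one proposed in this patch.

```python
def config_header_name(python_version):
    """Pick the configuration header name for a (major, minor, ...) version tuple."""
    # Versions from 2.1 on would use the renamed header; older ones keep config.h.
    if tuple(python_version[:2]) >= (2, 1):
        return "pyac_config.h"
    return "config.h"
```

For example, config_header_name((2, 0)) keeps the old name, while config_header_name((2, 1)) selects the new one.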

There are also lots of references to config.h in comments;
we can change those or not, as desired.  (I probably *would*
change most of them.)


----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2001-02-27 12:48

Message:
Logged In: YES 
user_id=31435

Pushed onto Jeremy.  Jeremy, we want to do this much 
fiddling so late in the cycle?

Thomas, don't worry about Windows.  I only need a warning 
about that, and I'm aware of this now (thanks!).  Check in 
the new MS project files or don't, it's easy for me to 
fix 'em up regardless (indeed, it's not worth extra time to 
check it in advance).

Note that "#warning" is not std C.  I'm afraid you'll have 
to make it an #error.  OTOH, if you leave a file 
*named* "config.h" in the distribution, it doesn't really 
address the bug report, right?

BTW, I have no idea what "pyac" is supposed to bring to 
mind.  Is that some Unixism?


----------------------------------------------------------------------

Comment By: Thomas Wouters (twouters)
Date: 2001-02-23 13:32

Message:
Apologies for the large blurb in the 'details' section. I keep forgetting SF strips *all* whitespace from that block :(

Assigning to Tim "The Windows Bot" Peters to test (and fix) the Windows build changes. Let me know if your patch still doesn't work and you want me to send you patched files instead, Tim.


----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=403977&group_id=5470


From noreply@sourceforge.net  Tue Jun  5 03:40:45 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Mon, 04 Jun 2001 19:40:45 -0700
Subject: [Patches] [ python-Patches-430181 ] Make httplib work with picky servers
Message-ID: <E1576lx-000762-00@usw-sf-web1.sourceforge.net>

Patches item #430181, was updated on 2001-06-04 19:40
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=430181&group_id=5470

Category: library
Group: 2.0.1 bugfix
Status: Open
Resolution: None
Priority: 5
Submitted By: Leonard Samuelson (lenski)
Assigned to: Nobody/Anonymous (nobody)
Summary: Make httplib work with picky servers

Initial Comment:
Python2.0: httplib.py: httplib: HTTPConnection

Header processing: (putheader, putrequest, and
endheaders) methods transmit each HTTP header
line using a separate socket send invocation.

Before this change, my Linksys Etherfast Cable/DSL
router (Linksys BEFSR41, firmware v 1.22,
March 31 2000) rejected the request because the
entire HTTP header block was not contained in a
single TCP packet.

Clearly, the router is engaging in a noncompliant
optimization!  This patch is not required to allow
httplib to work with real servers, making it
completely optional.

The patch I am submitting with this note causes
httplib to work with the router.  It is intended
mostly as a model; a developer with greater
familiarity with the library might have a better
approach.
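
One possible shape for such a change, sketched in present-day Python. This is not the submitted patch: the classes BufferedHeaders and RecordingSocket are hypothetical, and the sketch only shows header lines being collected and handed to the socket in one call. A single send call makes a single packet likely for a small header block, but does not guarantee it.

```python
class BufferedHeaders:
    """Collect the request line and headers, then send them in one call."""

    def __init__(self, sock):
        self._sock = sock
        self._lines = []

    def putrequest(self, method, url):
        self._lines.append("%s %s HTTP/1.0" % (method, url))

    def putheader(self, name, value):
        self._lines.append("%s: %s" % (name, value))

    def endheaders(self):
        # Two trailing empty strings yield the blank line that ends the header block.
        block = "\r\n".join(self._lines + ["", ""]).encode("ascii")
        self._lines = []
        self._sock.sendall(block)


class RecordingSocket:
    """Hypothetical stand-in for a socket; records each sendall() payload."""

    def __init__(self):
        self.sent = []

    def sendall(self, data):
        self.sent.append(data)
```

With a real socket in place of RecordingSocket, the whole header block reaches the network layer as one buffer instead of one buffer per line.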

----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=430181&group_id=5470


From noreply@sourceforge.net  Tue Jun  5 09:00:33 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Tue, 05 Jun 2001 01:00:33 -0700
Subject: [Patches] [ python-Patches-421709 ] Access { thread id : frame } dict
Message-ID: <E157BlR-0000M7-00@usw-sf-web3.sourceforge.net>

Patches item #421709, was updated on 2001-05-05 13:30
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=421709&group_id=5470

Category: core (C code)
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: John D. Heintz (jheintz)
Assigned to: Barry Warsaw (bwarsaw)
Summary: Access { thread id : frame } dict

Initial Comment:
This patch adds a new function sys._getframes() that 
returns a dictionary mapping from thread id to 
current frame object.

This is very useful when diagnosing deadlock issues 
in Python code.

The new C code function is purely additive except for 
modifying the PyThreadState struct (adding a long 
thread_ident) and modifying PyThreadState_New() 
function to set this new long.
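
For comparison, a minimal sketch of the same kind of deadlock diagnosis in present-day CPython. It assumes sys._current_frames(), which exists in current CPython and returns a mapping from thread id to frame; that is not the sys._getframes() function proposed in this patch. The helper names thread_stacks and dump_stacks are made up for illustration.

```python
import sys
import threading
import traceback


def thread_stacks():
    """Map each thread id to its current stack, formatted as text lines."""
    return {
        thread_id: traceback.format_stack(frame)
        for thread_id, frame in sys._current_frames().items()
    }


def dump_stacks(out=sys.stderr):
    """Write the stack of every running thread to `out`."""
    names = {t.ident: t.name for t in threading.enumerate()}
    for thread_id, lines in thread_stacks().items():
        out.write("Thread %s (%s):\n" % (thread_id, names.get(thread_id, "?")))
        out.write("".join(lines))
```

Calling dump_stacks() from a watchdog thread or a signal handler shows where each thread is currently blocked.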


----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2001-06-05 01:00

Message:
Logged In: YES 
user_id=21627

There is a difference between these two functions. _getframe is
not an official API; inspect.currentframe is the official API. It
seems that your function is meant to be used via sys, so it would
be public there.

In any case, I also think that the sys._getframe doc string should
not talk about intended uses - if anything, it should mention what
function to call instead.


----------------------------------------------------------------------

Comment By: John D. Heintz (jheintz)
Date: 2001-06-04 10:52

Message:
Logged In: YES 
user_id=20438

Martin:  I agree with you on the documentation issue and 
will look into the tuple size issue you raised.

The docstring is modeled on the sys._getframe() function 
so I figured it would be sufficient to follow the leader.

(I think that both sys._getframe() and sys._getframes() 
should be part of the public api for the sys module by the 
way.)


----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2001-06-04 10:28

Message:
Logged In: YES 
user_id=21627

I think the patch could use some more documentation, e.g. 
as a patch to Doc/lib/libsys.tex. E.g. what are the tuples 
that are put into the dictionaries?

Also, isn't there a problem with the tuple size? The patch 
allocates tuples of size 0, but then puts things into 
index 0. Is there any kind of test case for this code?

Finally, I don't think the docstring should say that the 
function is for internal and specialized purposes only 
(what specialized purposes, anyway), if you think its 
primary use is in diagnosing deadlocks. It should only 
document what the function does, not what you intend it to 
be used for.

For these reasons, I also think its name should not start 
with an underscore.


----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=421709&group_id=5470


From noreply@sourceforge.net  Wed Jun  6 07:27:07 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Tue, 05 Jun 2001 23:27:07 -0700
Subject: [Patches] [ python-Patches-409973 ] glob.glob speedups
Message-ID: <E157WmZ-000220-00@usw-sf-web3.sourceforge.net>

Patches item #409973, was updated on 2001-03-20 01:57
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=409973&group_id=5470

Category: library
Group: None
>Status: Closed
>Resolution: Accepted
Priority: 5
Submitted By: Rob W.W. Hooft (hooft)
Assigned to: Nobody/Anonymous (nobody)
Summary: glob.glob speedups

Initial Comment:
A lot of the time spent by glob.glob on large
directories is spent doing os.path.normcase(). Half of
this can be saved by normcasing the pattern only once,
and on unix the whole normcase call can be left out.

This patch attempts to optimize globbing even a bit
more by delegating the fnmatching of a list of file
names to a new function fnmatch.filter, which allows us
to move a few more lookups outside of the file name
loop.
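
A small sketch of the difference, using fnmatch.filter as it exists in today's standard library. The file names and the two helper functions are made up for illustration.

```python
import fnmatch

NAMES = ["glob.py", "fnmatch.py", "README", "os.pyc", "string.pyo"]


def match_one_by_one(names, pattern):
    # One fnmatch() call per name; the pattern is re-processed each time.
    return [name for name in names if fnmatch.fnmatch(name, pattern)]


def match_in_bulk(names, pattern):
    # A single call; the pattern handling happens once, outside the loop.
    return fnmatch.filter(names, pattern)
```

Both helpers return the same list; the bulk version just moves the per-pattern work out of the loop over file names.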

Furthermore, an optimization is added to glob.glob
calls that do not contain any directory specifications,
saving a round of os.path.join calls.

Speedups of the pattern '*.py?' in the python lib
directory range
from a factor of 2 with directory specification to a
factor of 5 without directory specifications.

Unfortunately there is no test_glob regression test,
but I did my best to verify that nothing changed in my
calls.

----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2001-06-05 23:27

Message:
Logged In: YES 
user_id=21627

Committed as glob.py 1.10 and fnmatch.py 1.12.


----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=409973&group_id=5470


From noreply@sourceforge.net  Wed Jun  6 07:39:59 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Tue, 05 Jun 2001 23:39:59 -0700
Subject: [Patches] [ python-Patches-412229 ] runtime RTLD_NOW control via sys
Message-ID: <E157Wz1-0002EZ-00@usw-sf-web3.sourceforge.net>

Patches item #412229, was updated on 2001-03-29 08:55
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=412229&group_id=5470

Category: core (C code)
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Bram Stolk (bram)
Assigned to: Nobody/Anonymous (nobody)
Summary: runtime RTLD_NOW control via sys

Initial Comment:
This patch enables runtime control over the RTLD_NOW
flag, which can be used to do lazy symbol resolving
when loading a shared lib.

It's an extension to the sys module:
sys.setlazysymresolve(0|1)

The patch is against the latest CVS code, and
was generated by 'cvs diff'.

----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2001-06-05 23:39

Message:
Logged In: YES 
user_id=21627

The patch needs further work: The existing code compiles 
on systems which don't define RTLD_NOW (although I'm not 
sure what these systems are); your code doesn't.

Also, the code allows setting the flags, but has no 
interface to query them.

Finally, users often complain that Python should use 
RTLD_GLOBAL, so that they can share symbols across 
extension modules. Therefore, I propose that you allow 
setting arbitrary dlopen flags; users would have to write

sys.setdlopenflags(0)

to turn off RTLD_NOW, and use

sys.setdlopenflags(dl.RTLD_NOW|dl.RTLD_GLOBAL)

to add RTLD_GLOBAL.
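
A minimal sketch of how such an interface can be used, assuming the functions as they exist in present-day Python on Unix: sys.setdlopenflags(), sys.getdlopenflags(), and the RTLD_* constants in the os module rather than dl. The function name flags_during_global_load is made up for illustration, and the sketch returns None on platforms without these functions.

```python
import os
import sys


def flags_during_global_load():
    """Temporarily enable RTLD_GLOBAL and report the flags that were in effect."""
    if not (hasattr(sys, "setdlopenflags") and hasattr(os, "RTLD_NOW")):
        return None  # platform without dlopen flag control
    saved = sys.getdlopenflags()
    sys.setdlopenflags(os.RTLD_NOW | os.RTLD_GLOBAL)
    try:
        # An extension module imported here would be loaded with the new flags.
        return sys.getdlopenflags()
    finally:
        # Restore the previous setting for later imports.
        sys.setdlopenflags(saved)
```

Having a query function alongside the setter is what makes the save-and-restore pattern possible.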

When you revise this patch, please submit unified (-u) or 
context (-c) diffs.


----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2001-04-10 14:48

Message:
Logged In: YES 
user_id=6380

Sorry, no new features in 2.1.

I'll look at this after 2.1 is released though.

----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=412229&group_id=5470


From noreply@sourceforge.net  Wed Jun  6 07:45:02 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Tue, 05 Jun 2001 23:45:02 -0700
Subject: [Patches] [ python-Patches-414492 ] adds a gc.get_generation function
Message-ID: <E157X3u-0002IP-00@usw-sf-web3.sourceforge.net>

Patches item #414492, was updated on 2001-04-07 00:33
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=414492&group_id=5470

Category: core (C code)
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Gregory P. Smith (greg)
Assigned to: Neil Schemenauer (nascheme)
Summary: adds a gc.get_generation function

Initial Comment:
gc.get_generation(num), added by this patch, allows you to get a
list of all objects in a given garbage collector generation.

I wrote this while trying to debug a memory leak so
that I could peek at what types of objects were
remaining allocated but never freed.

Looking through the patches I see another similarish
patch that allows searching the collection lists for
references to a particular thing or set of things. 
Interesting.

Is it useful?  Yes and no.  I still haven't found the
memory leak.  But I know what objects are consuming
memory, so I can narrow my search through the code to find
how they are remaining referenced.

As a side note, there's not much point in the
generation number parameter to this method; 2 is the
only generation really worth examining.

This or something like it would be nice to see in a
future python gc module as a debugging aid.
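
A rough sketch of this kind of leak hunting in present-day Python. It assumes gc.get_objects(generation=...), available in recent Python 3 releases, rather than the gc.get_generation() function proposed here. The function name oldest_generation_summary is made up for illustration.

```python
import gc
from collections import Counter


def oldest_generation_summary(top=5):
    """Return the most common type names among objects in the oldest generation."""
    gc.collect()  # survivors of a full collection end up in the oldest generation
    objects = gc.get_objects(generation=2)
    counts = Counter(type(obj).__name__ for obj in objects)
    return counts.most_common(top)
```

Comparing two summaries taken some time apart gives a quick hint about which object types keep accumulating.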

----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2001-06-05 23:45

Message:
Logged In: YES 
user_id=21627

I still would like to see my gc.getreferents patch 
applied, which offers a similar debugging aid.

However, since this offers a somewhat orthogonal 
functionality, and is a quite short patch, I recommend 
approving it.


----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=414492&group_id=5470


From noreply@sourceforge.net  Wed Jun  6 16:33:54 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Wed, 06 Jun 2001 08:33:54 -0700
Subject: [Patches] [ python-Patches-430706 ] Persistent connections in BaseHTTPServer
Message-ID: <E157fJi-0004Pd-00@usw-sf-web1.sourceforge.net>

Patches item #430706, was updated on 2001-06-06 08:33
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=430706&group_id=5470

Category: library
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Chris Lawrence (lordsutch)
Assigned to: Nobody/Anonymous (nobody)
Summary: Persistent connections in BaseHTTPServer

Initial Comment:
This patch provides HTTP/1.1 persistent 
connection support in BaseHTTPServer.py.  It is 
not enabled by default (for backwards 
compatibility) because Content-Length headers 
must be supplied for persistent connections to 
work correctly.
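
For illustration, a minimal handler written against the present-day http.server module, the successor to BaseHTTPServer. This is not the patch itself, and the class name KeepAliveHandler is made up. It shows the two things mentioned above: opting in to HTTP/1.1 and always sending Content-Length.

```python
from http.server import BaseHTTPRequestHandler


class KeepAliveHandler(BaseHTTPRequestHandler):
    # Opt in to HTTP/1.1; the base class default remains HTTP/1.0.
    protocol_version = "HTTP/1.1"

    def do_GET(self):
        body = b"hello\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        # Without an accurate length the client cannot tell where the
        # response ends on a connection that stays open.
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)
```

Such a handler can be served with http.server.HTTPServer(("127.0.0.1", 0), KeepAliveHandler).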

----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=430706&group_id=5470


From noreply@sourceforge.net  Wed Jun  6 18:21:19 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Wed, 06 Jun 2001 10:21:19 -0700
Subject: [Patches] [ python-Patches-430754 ] Makes ftpmirror.py .netrc aware
Message-ID: <E157gzf-00038z-00@usw-sf-web3.sourceforge.net>

Patches item #430754, was updated on 2001-06-06 10:21
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=430754&group_id=5470

Category: demos and tools
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Mike Romberg (romberg)
Assigned to: Nobody/Anonymous (nobody)
Summary: Makes ftpmirror.py .netrc aware

Initial Comment:
  The following patch modifies the ftpmirror.py script
found in Tools/scripts to use the netrc module.  This
allows the ftpmirror script to act more like a standard
ftp client and take the login, password, and account from
a user's $HOME/.netrc file if it exists.

  This patch is against the ftpmirror.py found in
Python 2.1.
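
A minimal sketch of the lookup, using the netrc module as it exists today. This is not the code from the patch; the function name lookup_credentials is made up for illustration, and it takes an explicit file path so that it can be tried without touching a real $HOME/.netrc.

```python
import netrc


def lookup_credentials(netrc_path, host):
    """Return (login, password) for `host` from a netrc file, or None."""
    try:
        auth = netrc.netrc(netrc_path).authenticators(host)
    except (OSError, netrc.NetrcParseError):
        return None  # missing or malformed file: fall back to other sources
    if auth is None:
        return None  # no entry for this host and no default entry
    login, _account, password = auth
    return login, password
```

A script can fall back to command-line options or an anonymous login whenever this returns None.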


----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=430754&group_id=5470


From noreply@sourceforge.net  Wed Jun  6 22:14:08 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Wed, 06 Jun 2001 14:14:08 -0700
Subject: [Patches] [ python-Patches-430846 ] faster string-decoding in base64.py
Message-ID: <E157kcy-0003jH-00@usw-sf-web1.sourceforge.net>

Patches item #430846, was updated on 2001-06-06 14:14
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=430846&group_id=5470

Category: library
Group: None
Status: Open
Resolution: None
Priority: 3
Submitted By: Peter Schneider-Kamp (nowonder)
Assigned to: Nobody/Anonymous (nobody)
Summary: faster string-decoding in base64.py

Initial Comment:
This addresses bug #419390 by anthonybaxter.

Instead of wrapping a string-to-be-decoded into a
StringIO class and using base64.decode, use
binascii.a2b_base64 directly.

Speedup for big files is over 10 times (on Linux x86
anyway).

If uncontroversial I'll check it in.
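
A minimal sketch of the two decoding routes, assuming Python 3 (so io.BytesIO stands in for StringIO). The function names are illustrative and not taken from the patch.

```python
import base64
import binascii
import io


def decode_via_file_api(data):
    """Old route: wrap the bytes in file objects and call base64.decode()."""
    source = io.BytesIO(data)
    sink = io.BytesIO()
    base64.decode(source, sink)  # reads and decodes line by line
    return sink.getvalue()


def decode_direct(data):
    """Route described above: hand the whole buffer to binascii at once."""
    return binascii.a2b_base64(data)


# Both routes should agree on multi-line base64 input.
encoded = base64.encodebytes(b"spam and eggs" * 100)
assert decode_via_file_api(encoded) == decode_direct(encoded)
```

The direct route avoids creating file-like objects and looping over lines in Python, which is presumably where the reported speedup comes from.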

----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=430846&group_id=5470


From noreply@sourceforge.net  Wed Jun  6 22:25:56 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Wed, 06 Jun 2001 14:25:56 -0700
Subject: [Patches] [ python-Patches-430846 ] faster string-decoding in base64.py
Message-ID: <E157koO-0003xZ-00@usw-sf-web1.sourceforge.net>

Patches item #430846, was updated on 2001-06-06 14:14
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=430846&group_id=5470

Category: library
Group: None
Status: Open
Resolution: None
Priority: 3
Submitted By: Peter Schneider-Kamp (nowonder)
Assigned to: Nobody/Anonymous (nobody)
Summary: faster string-decoding in base64.py

Initial Comment:
This addresses bug #419390 by anthonybaxter.

Instead of wrapping a string-to-be-decoded into a
StringIO class and using base64.decode, use
binascii.a2b_base64 directly.

Speedup for big files is over 10 times (on Linux x86
anyway).

If uncontroversial I'll check it in.

----------------------------------------------------------------------

>Comment By: Tim Peters (tim_one)
Date: 2001-06-06 14:25

Message:
Logged In: YES 
user_id=31435

Umm -- there's no patch here.  If there were, I bet I would 
have changed this to Accepted, though <wink>.

----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=430846&group_id=5470


From noreply@sourceforge.net  Thu Jun  7 06:25:48 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Wed, 06 Jun 2001 22:25:48 -0700
Subject: [Patches] [ python-Patches-413171 ] fix UserDict.get, setdefault, update
Message-ID: <E157sIm-00063B-00@usw-sf-web3.sourceforge.net>

Patches item #413171, was updated on 2001-04-02 10:18
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=413171&group_id=5470

Category: library
Group: None
Status: Open
Resolution: Postponed
Priority: 4
Submitted By: Ka-Ping Yee (ping)
Assigned to: Ka-Ping Yee (ping)
Summary: fix UserDict.get, setdefault, update

Initial Comment:
The methods 'get', 'setdefault', and 'update'
on a dictionary are usually implemented (and
thought of) in terms of the lower-level methods
has_key, __getitem__, and __setitem__.  The
current implementation of UserDict relays a
call to e.g. x.get() to x.data.get(), which
behaves inconsistently if __getitem__ has been
implemented on x.

One particularly big place where this turns up is cgi.
If you get a dict = cgi.SvFormContentDict(), then
dict.get('key') will return a *list* even though
dict['key'] returns a single item!

To make UserDict behave consistently, this patch
fixes get(), update(), and setdefault() to re-use
the other methods.  Then the only occurrence of
self.data[k] = v is in __setitem__, the only
occurrence of self.data[k] without assignment is
in __getitem__, etc.
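
A minimal sketch of the idea in Python 3, where __contains__ takes the place of has_key. ConsistentDict and FirstValueDict are hypothetical stand-ins, not the patched UserDict or the cgi class.

```python
class ConsistentDict:
    """Hypothetical stand-in for UserDict; not the patched class itself."""

    def __init__(self, initial=None):
        self.data = {}
        if initial is not None:
            self.update(initial)

    def __contains__(self, key):
        return key in self.data

    def __getitem__(self, key):
        return self.data[key]       # the only plain read of self.data[key]

    def __setitem__(self, key, value):
        self.data[key] = value      # the only write to self.data[key]

    def get(self, key, default=None):
        # Goes through __getitem__, so subclasses stay consistent.
        return self[key] if key in self else default

    def setdefault(self, key, default=None):
        if key not in self:
            self[key] = default
        return self[key]

    def update(self, other):
        for key in other.keys():
            self[key] = other[key]


class FirstValueDict(ConsistentDict):
    """Stores lists but hands back only the first element, like a CGI form."""

    def __getitem__(self, key):
        return self.data[key][0]


form = FirstValueDict({"key": ["a", "b"]})
# Subscript access and get() now agree on the single-item view.
assert form["key"] == form.get("key") == "a"
```

Because get() and setdefault() are routed through __getitem__, overriding that one method in a subclass is enough to change all three access paths.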

----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2001-06-06 22:25

Message:
Logged In: YES 
user_id=21627

I recommend approving this patch.


----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2001-04-10 14:17

Message:
Logged In: YES 
user_id=6380

Let's not fix this in 2.1.

----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=413171&group_id=5470


From noreply@sourceforge.net  Thu Jun  7 06:28:36 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Wed, 06 Jun 2001 22:28:36 -0700
Subject: [Patches] [ python-Patches-414775 ] Add --skip-build option to bdist command
Message-ID: <E157sLU-000657-00@usw-sf-web3.sourceforge.net>

Patches item #414775, was updated on 2001-04-08 18:20
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=414775&group_id=5470

Category: distutils
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Robert Kern (kern)
Assigned to: A.M. Kuchling (akuchling)
Summary: Add --skip-build option to bdist command

Initial Comment:
Whenever one uses a non-default compiler to build an
extension, the bdist command will try to rebuild the
package with the default compiler and fail. 

The install command has a --skip-build option to
manually skip the re-building part of the install. I
adapted that code to add a similar --skip-build option
to the bdist, bdist_dumb, and bdist_wininst commands.
I'm not familiar enough with the bdist_rpm command's
code to see where it would work in there.

----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2001-06-06 22:28

Message:
Logged In: YES 
user_id=21627

I recommend approving this patch.


----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=414775&group_id=5470


From noreply@sourceforge.net  Thu Jun  7 06:29:01 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Wed, 06 Jun 2001 22:29:01 -0700
Subject: [Patches] [ python-Patches-430948 ] Performance improvement for profiler
Message-ID: <E157sLt-0002it-00@usw-sf-web2.sourceforge.net>

Patches item #430948, was updated on 2001-06-06 22:29
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=430948&group_id=5470

Category: library
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Fred L. Drake, Jr. (fdrake)
Assigned to: Tim Peters (tim_one)
Summary: Performance improvement for profiler

Initial Comment:
This patch adds a bit of complexity to
Profile.__init__() in an effort to reduce the overhead
of the profiler.  The essential piece of the puzzle is
that the general Profile.get_time() method is replaced
with a function which does only as much as is needed
for the underlying timer.  For example, if time.clock()
is available, it can become a PyCFunction instead of a
bound method, which requires only 1 dict lookup to execute
instead of the 11 it takes to execute get_time()
without this patch.

Also removes a couple of duplicate imports from the "if
__name__ == ..." section.


----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=430948&group_id=5470


From noreply@sourceforge.net  Thu Jun  7 06:30:01 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Wed, 06 Jun 2001 22:30:01 -0700
Subject: [Patches] [ python-Patches-430948 ] Performance improvement for profiler
Message-ID: <E157sMr-0002jw-00@usw-sf-web2.sourceforge.net>

Patches item #430948, was updated on 2001-06-06 22:29
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=430948&group_id=5470

Category: library
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Fred L. Drake, Jr. (fdrake)
Assigned to: Tim Peters (tim_one)
Summary: Performance improvement for profiler

Initial Comment:
This patch adds a bit of complexity to
Profile.__init__() in an effort to reduce the overhead
of the profiler.  The essential piece of the puzzle is
that the general Profile.get_time() method is replaced
with a function which does only as much as is needed
for the underlying timer.  For example, if time.clock()
is available, it can become a PyCFunction instead of a
bound method, which requires only 1 dict lookup to execute
instead of the 11 it takes to execute get_time()
without this patch.

Also removes a couple of duplicate imports from the "if
__name__ == ..." section.


----------------------------------------------------------------------

>Comment By: Fred L. Drake, Jr. (fdrake)
Date: 2001-06-06 22:30

Message:
Logged In: YES 
user_id=3066

I should note that this works with both 2.1.1 and 2.2,
though this is not a bugfix.

----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=430948&group_id=5470


From noreply@sourceforge.net  Thu Jun  7 06:33:24 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Wed, 06 Jun 2001 22:33:24 -0700
Subject: [Patches] [ python-Patches-415226 ] new base class for binary packaging
Message-ID: <E157sQ8-00068F-00@usw-sf-web3.sourceforge.net>

Patches item #415226, was updated on 2001-04-10 12:51
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=415226&group_id=5470

Category: distutils
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Mark Alexander (mwa)
Assigned to: A.M. Kuchling (akuchling)
Summary: new base class for binary packaging

Initial Comment:
bdist_packager.py provides an abstract base class for
bdist commands. It provides easy access to all the
PEP 241 metadata fields, plus "revision" for the package
revision and installation scripts for preinstall,
postinstall, preremove, and postremove. That covers the
base characteristics of all the package managers that I'm
familiar with. If anyone can think of any others, let me
know; otherwise additional extensions would be implemented
in the specific packager's commands. I would, however,
discourage _requiring_ any additional fields. It would be
nice if by simply supplying the PEP 241 metadata under the
[bdist_packager] section all subclassed packagers worked
with no further effort. It also has rudimentary relocation
support by including a --no-autorelocate option.

The bdist_packager is also where I see creating separate
binary packages for sub-packages supported. My need for
that is much less than my desire for it right now, so I
didn't give it much thought as I wrote it. I'd be
delighted to hear any comments and suggestions on how to
approach sub-packaging, though.

----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2001-06-06 22:33

Message:
Logged In: YES 
user_id=21627

Shouldn't the patch also modify the existing bdist 
commands to use this as a base class?


----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=415226&group_id=5470


From noreply@sourceforge.net  Thu Jun  7 06:39:09 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Wed, 06 Jun 2001 22:39:09 -0700
Subject: [Patches] [ python-Patches-415227 ] Solaris pkgtool bdist command
Message-ID: <E157sVh-0006Gk-00@usw-sf-web3.sourceforge.net>

Patches item #415227, was updated on 2001-04-10 12:54
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=415227&group_id=5470

Category: distutils
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Mark Alexander (mwa)
Assigned to: A.M. Kuchling (akuchling)
Summary: Solaris pkgtool bdist command

Initial Comment:
The bdist_pkgtool command is based on bdist_packager and
provides support for the Solaris pkgadd and pkgrm
commands. In most cases, no additional options beyond the
PEP 241 options are required. An exception is if the
package name is >9 characters: a --pkg-abrev option is
then required because that's all pkgtool will handle. It
makes listing the packages on the system a pain, but the
actual package files produced do match the
name-version-revision-pyvers.pkg format. By default,
bdist_pkgtool provides request, postinstall, preremove,
and postremove scripts that will properly relocate
modules to the site-packages directory and recompile
all .py modules on the target machine. An author
can provide a custom request script and either have
it auto-relocate by merging the scripts, or inhibit
auto-relocation with --no-autorelocate.

----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2001-06-06 22:39

Message:
Logged In: YES 
user_id=21627

Should there also be some Makefile machinery to create a 
Solaris package for python itself? There is a 1.6a2 
package on sunfreeware; it would surely help if building 
Solaris packages was supported by the Python core itself.


----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=415227&group_id=5470


From noreply@sourceforge.net  Thu Jun  7 06:45:43 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Wed, 06 Jun 2001 22:45:43 -0700
Subject: [Patches] [ python-Patches-415629 ] setup.py: readline req. ncurses (SuSE)
Message-ID: <E157sc3-0006LA-00@usw-sf-web3.sourceforge.net>

Patches item #415629, was updated on 2001-04-12 02:29
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=415629&group_id=5470

Category: Build
Group: None
>Status: Closed
>Resolution: Rejected
Priority: 5
Submitted By: Nobody/Anonymous (nobody)
Assigned to: Nobody/Anonymous (nobody)
Summary: setup.py: readline req. ncurses (SuSE)

Initial Comment:
Python 2.1b2 on SuSE Linux 7.0:

The readline extension module must be linked with
libncurses, else 'import readline' fails because of
unresolved symbols.
(libtermcap is only installed for libc5 compatibility
in SuSE 7.0)


----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2001-06-06 22:45

Message:
Logged In: YES 
user_id=21627

This patch is not necessary. If readline.so is a shared 
library that relies on libncurses, it should itself be 
linked with libncurses; this is indeed the case on SuSE 
7.2:

martin@mira:~ > ldd /usr/lib/libreadline.so               
        libncurses.so.5 => /lib/libncurses.so.5
        libc.so.6 => /lib/libc.so.6
        /lib/ld-linux.so.2 => /lib/ld-linux.so.2

Now, if the libreadline.so on SuSE 7.0 does not link 
itself with libncurses, that's a bug in the readline 
package.

OTOH, linking libncurses might be the *wrong* thing, since 
on some systems, libcurses might be needed even if 
libncurses is present (e.g. some Solaris installations).

If some system requires a special build procedure, the 
administrator must build the module using Modules/Setup, 
so that setup.py will not attempt to build it.


----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=415629&group_id=5470


From noreply@sourceforge.net  Thu Jun  7 06:53:16 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Wed, 06 Jun 2001 22:53:16 -0700
Subject: [Patches] [ python-Patches-416220 ] pstats.py interactive read function fix
Message-ID: <E157sjM-0006QY-00@usw-sf-web3.sourceforge.net>

Patches item #416220, was updated on 2001-04-14 19:25
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=416220&group_id=5470

Category: library
Group: None
>Status: Closed
>Resolution: Accepted
Priority: 5
Submitted By: Matthew Mueller (donut)
Assigned to: Eric S. Raymond (esr)
Summary: pstats.py interactive read function fix

Initial Comment:
In the new interactive mode of pstats.py, read with no
arguments dies because of a misplaced paren.  Simple
one-line fix.

----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2001-06-06 22:53

Message:
Logged In: YES 
user_id=21627

Committed as 1.18 of pstats.py.


----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=416220&group_id=5470


From noreply@sourceforge.net  Thu Jun  7 07:07:56 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Wed, 06 Jun 2001 23:07:56 -0700
Subject: [Patches] [ python-Patches-430846 ] faster string-decoding in base64.py
Message-ID: <E157sxY-0004WB-00@usw-sf-web1.sourceforge.net>

Patches item #430846, was updated on 2001-06-06 14:14
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=430846&group_id=5470

Category: library
Group: None
Status: Open
Resolution: None
Priority: 3
Submitted By: Peter Schneider-Kamp (nowonder)
Assigned to: Nobody/Anonymous (nobody)
Summary: faster string-decoding in base64.py

Initial Comment:
This addresses bug #419390 by anthonybaxter.

Instead of wrapping a string-to-be-decoded into a
StringIO class and using base64.decode, use
binascii.a2b_base64 directly.

Speedup for big files is over 10 times (on Linux x86
anyway).

If uncontroversial I'll check it in.

----------------------------------------------------------------------

>Comment By: Peter Schneider-Kamp (nowonder)
Date: 2001-06-06 23:07

Message:
Logged In: YES 
user_id=14463

Mhh, I did click that "Check to Upload & Attach File" thing.

No matter what, here is the new version (including your
speedup for encodestring).

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2001-06-06 14:25

Message:
Logged In: YES 
user_id=31435

Umm -- there's no patch here.  If there were, I bet I would 
have changed this to Accepted, though <wink>.

----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=430846&group_id=5470


From noreply@sourceforge.net  Thu Jun  7 07:08:47 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Wed, 06 Jun 2001 23:08:47 -0700
Subject: [Patches] [ python-Patches-430754 ] Makes ftpmirror.py .netrc aware
Message-ID: <E157syN-0006fT-00@usw-sf-web3.sourceforge.net>

Patches item #430754, was updated on 2001-06-06 10:21
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=430754&group_id=5470

Category: demos and tools
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Mike Romberg (romberg)
>Assigned to: Martin v. Löwis (loewis)
Summary: Makes ftpmirror.py .netrc aware

Initial Comment:
  The following patch modifies the ftpmirror.py script
found in Tools/scripts to use the netrc module.  This
allows the ftpmirror script to act more like a standard
ftp client and take the login, password, and account from
a user's $HOME/.netrc file if it exists.

  This patch is against the ftpmirror.py found in
Python 2.1.


----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2001-06-06 23:08

Message:
Logged In: YES 
user_id=21627

I recommend a number of improvements to the patch:

- When unpacking the tuple, it is more intuitive to put 
the variables in the order in which they are documented 
for the function, i.e.

            if auth:
                login, account, passwd = auth

- If the user does not have a .netrc, IOError will be 
raised and should be expected

- If a user is specified in the command line, it should 
probably take precedence over the .netrc setting

- The debug message (Logging in as) should probably display 
the user which is used for login.
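
The points above can be sketched as follows, assuming Python 3 (where IOError is an alias of OSError). The helper pick_credentials and its parameter names are hypothetical and do not appear in ftpmirror.py.

```python
import netrc


def pick_credentials(host, cli_login="", cli_passwd="", cli_account="",
                     netrc_path=None):
    """Return (login, passwd, account), preferring command-line values."""
    login = passwd = account = ""
    try:
        auth = netrc.netrc(netrc_path).authenticators(host)
    except (OSError, netrc.NetrcParseError):
        # No readable .netrc, or one that cannot be parsed: carry on
        # without stored credentials instead of crashing.
        auth = None
    if auth:
        # Same order as documented for authenticators()
        login, account, passwd = auth
    if cli_login:
        # An explicit user on the command line overrides the .netrc entry.
        login, passwd, account = cli_login, cli_passwd, cli_account
    return login or "anonymous", passwd or "", account or ""


def describe_login(login):
    """Debug message that names the user actually used for the login."""
    return "Logging in as %r..." % login
```

Passing netrc_path=None makes the netrc module look for the default file in the user's home directory; an explicit path is mainly useful for testing.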

Please indicate whether you can produce a revised patch.


----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=430754&group_id=5470


From noreply@sourceforge.net  Thu Jun  7 07:10:11 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Wed, 06 Jun 2001 23:10:11 -0700
Subject: [Patches] [ python-Patches-430846 ] faster string-decoding in base64.py
Message-ID: <E157szj-0004cH-00@usw-sf-web1.sourceforge.net>

Patches item #430846, was updated on 2001-06-06 14:14
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=430846&group_id=5470

Category: library
Group: None
Status: Open
Resolution: None
Priority: 3
Submitted By: Peter Schneider-Kamp (nowonder)
>Assigned to: Tim Peters (tim_one)
Summary: faster string-decoding in base64.py

Initial Comment:
This addresses bug #419390 by anthonybaxter.

Instead of wrapping a string-to-be-decoded into a
StringIO class and using base64.decode, use
binascii.a2b_base64 directly.

Speedup for big files is over 10 times (on Linux x86
anyway).

If uncontroversial I'll check it in.

----------------------------------------------------------------------

Comment By: Peter Schneider-Kamp (nowonder)
Date: 2001-06-06 23:07

Message:
Logged In: YES 
user_id=14463

Mhh, I did click that "Check to Upload & Attach File" thing.

No matter what, here is the new version (including your
speedup for encodestring).

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2001-06-06 14:25

Message:
Logged In: YES 
user_id=31435

Umm -- there's no patch here.  If there were, I bet I would 
have changed this to Accepted, though <wink>.

----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=430846&group_id=5470


From noreply@sourceforge.net  Thu Jun  7 11:09:13 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Thu, 07 Jun 2001 03:09:13 -0700
Subject: [Patches] [ python-Patches-403100 ] Multicharacter replacements in PyUnicode_TranslateCharmap
Message-ID: <E157wj3-0000Pf-00@usw-sf-web1.sourceforge.net>

Patches item #403100, was updated on 2001-01-04 09:50
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=403100&group_id=5470

Category: core (C code)
Group: None
Status: Closed
Resolution: None
Priority: 5
Submitted By: Walter Dörwald (doerwalter)
Assigned to: M.-A. Lemburg (lemburg)
Summary: Multicharacter replacements in PyUnicode_TranslateCharmap

Initial Comment:
This patch modifies Objects/unicodeobject.c/PyUnicode_TranslateCharmap,
so that the error

   PyErr_SetString(PyExc_NotImplementedError,
        "1-n mappings are currently not implemented");

no longer occurs. I.e.

   u"ab".translate({ord(u"a"): u"bbb", ord(u"b"): u"aaa"})

now works. It does this by exponentially
reallocating the string, when there is no more
available space.


----------------------------------------------------------------------

>Comment By: Walter Dörwald (doerwalter)
Date: 2001-06-07 03:09

Message:
Logged In: YES 
user_id=89016

The patch that was checked in changes 
PyUnicode_DecodeCharmap and PyUnicode_EncodeCharmap, but 
not PyUnicode_TranslateCharmap, where this functionality is 
also useful (e.g. for 
u"<foo>".translate({ord("<"): u"&lt;", ord(">"): u"&gt;"})).

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2001-01-06 07:03

Message:
Checked in a different patch providing the same functionality.
Please see the CVS checkin message for details.


----------------------------------------------------------------------

Comment By: Nobody/Anonymous (nobody)
Date: 2001-01-05 10:45

Message:
I'll check in a patch for this tomorrow which implements what I had 
in mind. The patch doesn't change the performance of the charmap 
codec.

Thanks,
-- Marc-Andre

----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2001-01-05 09:07

Message:
The problem is that you can't know beforehand how long
the result string will be, i.e. whether there really will be any 1-n
replacements happening.

It would be possible to do a loop through the replacement
strings and see if there are any that are longer than one character,
but even if there are, you don't know if they will really be used.

So you have three choices:
(1) You guess how much space you need and reallocate
when the space is not enough.
(2) You do a dry run of the algorithm once and count how much 
space you need, then run the algorithm a second time and this 
time use the strings.
(3) You keep the strings in a list and join the list into
one string in the end.

For the case of 1-1 mapping the following will happen:

(1) The first allocation has exactly the right amount of space, 
there won't be any reallocations, but a size check for every
character will be done (which should be only a few assembler instructions).
The mapping will have to be accessed for every character
in the source string once.

(2) There will only be one allocation, but for every character in
the source string, the mapping has to be accessed twice, which
means calls to Python functions, exception handling etc.

(3) You have to make as many memory allocations as there are parts
of the final string that you create, including error handling etc.

I think (1) is clearly the fastest method.
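
Strategy (1) can be sketched in Python as below. This only models the allocation pattern of the C code; the function name translate_with_growth is an assumption, and the sketch handles only string replacements and None (delete), not integer code points.

```python
def translate_with_growth(text, mapping):
    """Apply a code-point mapping, growing the output buffer on demand."""
    capacity = len(text)            # exact fit when every mapping is 1-1
    buffer = [""] * capacity
    used = 0
    for char in text:
        replacement = mapping.get(ord(char), char)   # one lookup per char
        if replacement is None:
            continue                # a None entry deletes the character
        needed = used + len(replacement)
        if needed > capacity:       # the cheap per-character size check
            while capacity < needed:
                capacity = max(1, capacity * 2)      # exponential growth
            buffer.extend([""] * (capacity - len(buffer)))
        for piece in replacement:
            buffer[used] = piece
            used += 1
    return "".join(buffer[:used])   # trim to the true size
```

For a pure 1-1 mapping the growth branch is never taken, so the only extra cost over a fixed-size buffer is the comparison on each character.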


----------------------------------------------------------------------

Comment By: Nobody/Anonymous (nobody)
Date: 2001-01-04 10:33

Message:
I like the idea, but the implementation needs some reworking:
the common case is 1-1 mapping so this should be as fast
as possible; extra size checks slow things down too much.

You can take a different approach, though:
leave things as they are and only add a special case for the 1-n
which does resizing depending on how many extra chars are inserted.
Then as final step, if resizing occurred, call _PyUnicode_Resize()
to cut down the allocated buffer to its true size.

-- Marc-Andre

----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=403100&group_id=5470


From noreply@sourceforge.net  Thu Jun  7 11:20:18 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Thu, 07 Jun 2001 03:20:18 -0700
Subject: [Patches] [ python-Patches-430986 ] Buglet in PyUnicode_FromUnicode
Message-ID: <E157wtm-0000aB-00@usw-sf-web1.sourceforge.net>

Patches item #430986, was updated on 2001-06-07 03:20
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=430986&group_id=5470

Category: core (C code)
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Walter Dörwald (doerwalter)
Assigned to: Nobody/Anonymous (nobody)
Summary: Buglet in PyUnicode_FromUnicode

Initial Comment:
PyUnicode_FromUnicode contains the following
code, which is clearly wrong:

   unicode = _PyUnicode_New(1);
   unicode->str[0] = *u;
   if (!unicode)
      return NULL;


----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=430986&group_id=5470


From noreply@sourceforge.net  Thu Jun  7 11:52:39 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Thu, 07 Jun 2001 03:52:39 -0700
Subject: [Patches] [ python-Patches-412229 ] runtime RTLD_NOW control via sys
Message-ID: <E157xP5-0001Cu-00@usw-sf-web1.sourceforge.net>

Patches item #412229, was updated on 2001-03-29 08:55
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=412229&group_id=5470

Category: core (C code)
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Bram Stolk (bram)
>Assigned to: Martin v. Löwis (loewis)
Summary: runtime RTLD_NOW control via sys

Initial Comment:
This patch enables runtime control over the RTLD_NOW
flag, which can be used to do lazy symbol resolving
when loading a shared lib.

It's an extension to the sys module:
sys.setlazysymresolve(0|1)

The patch is against the latest CVS code, and
was generated by 'cvs diff'.

----------------------------------------------------------------------

>Comment By: Bram Stolk (bram)
Date: 2001-06-07 03:52

Message:
Logged In: YES 
user_id=14028

Ok, I've revised the patch as you suggested.
Currently, you can get and set the flags just as you
specified.
It should also build on platforms without RTLD_NOW,
and even on platforms without LDOPEN altogether.

However, I see one problem with this:
After Python 1.5.2, the dl module seems to have been removed
from the default installation.

This means that dl.RTLD_NOW and dl.RTLD_LAZY are not
readily available on a standard Python install.
This is awkward.

The patch was generated with the command:
cvs diff -c against the cvs tree of Thu Jun  7 12:44:18 MDT
2001

   Bram


----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2001-06-05 23:39

Message:
Logged In: YES 
user_id=21627

The patch needs further work: The code currently compiles 
on systems which don't define RTLD_NOW (although I'm not 
sure what these systems are); your code doesn't.

Also, the code allows setting the flags, but has no 
interface to query them.

Finally, users often complain that Python should use 
RTLD_GLOBAL, so that they can share symbols across 
extension modules. Therefore, I propose that you allow 
setting arbitrary dlopen flags; users would have to write

sys.setdlopenflags(0)

to turn off RTLD_NOW, and use

sys.setdlopenflags(dl.RTLD_NOW|dl.RTLD_GLOBAL)

to add RTLD_GLOBAL.
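
A minimal sketch of how such an interface is used, assuming a current Python 3 on a Unix-like platform, where the functions are sys.setdlopenflags() and sys.getdlopenflags() and the RTLD_* constants live in os rather than dl. The helper name with_global_symbols is hypothetical.

```python
import os
import sys


def with_global_symbols(action):
    """Run action() with RTLD_GLOBAL added to the dlopen flags, then restore."""
    if not hasattr(sys, "setdlopenflags"):
        return action()             # platform without dlopen support
    saved = sys.getdlopenflags()    # query the current flags first
    # Fall back to 0 (no change) if the constant is missing on this platform.
    sys.setdlopenflags(saved | getattr(os, "RTLD_GLOBAL", 0))
    try:
        return action()
    finally:
        sys.setdlopenflags(saved)   # do not leak the change to later imports
```

A typical action would be a function that imports the extension modules which need to share symbols; the flags only affect modules loaded while they are set.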

When you revise this patch, please submit unified (-u) or 
context (-c) diffs.


----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2001-04-10 14:48

Message:
Logged In: YES 
user_id=6380

Sorry, no new features in 2.1.

I'll look at this after 2.1 is released though.

----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=412229&group_id=5470


From noreply@sourceforge.net  Thu Jun  7 13:26:45 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Thu, 07 Jun 2001 05:26:45 -0700
Subject: [Patches] [ python-Patches-430986 ] Buglet in PyUnicode_FromUnicode
Message-ID: <E157ys9-0003Vf-00@usw-sf-web3.sourceforge.net>

Patches item #430986, was updated on 2001-06-07 03:20
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=430986&group_id=5470

Category: core (C code)
Group: None
>Status: Closed
Resolution: None
Priority: 5
Submitted By: Walter Dörwald (doerwalter)
>Assigned to: M.-A. Lemburg (lemburg)
Summary: Buglet in PyUnicode_FromUnicode

Initial Comment:
PyUnicode_FromUnicode contains the following
code, which is clearly wrong:

   unicode = _PyUnicode_New(1);
   unicode->str[0] = *u;
   if (!unicode)
      return NULL;


----------------------------------------------------------------------

>Comment By: M.-A. Lemburg (lemburg)
Date: 2001-06-07 05:26

Message:
Logged In: YES 
user_id=38388

Thanks. I checked in a fix in CVS.

----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=430986&group_id=5470


From noreply@sourceforge.net  Thu Jun  7 13:30:38 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Thu, 07 Jun 2001 05:30:38 -0700
Subject: [Patches] [ python-Patches-403100 ] Multicharacter replacements in PyUnicode_TranslateCharmap
Message-ID: <E157yvu-0000EH-00@usw-sf-web2.sourceforge.net>

Patches item #403100, was updated on 2001-01-04 09:50
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=403100&group_id=5470

Category: core (C code)
Group: None
>Status: Open
Resolution: None
Priority: 5
Submitted By: Walter Dörwald (doerwalter)
Assigned to: M.-A. Lemburg (lemburg)
Summary: Multicharacter replacements in PyUnicode_TranslateCharmap

Initial Comment:
This patch modifies Objects/unicodeobject.c/PyUnicode_TranslateCharmap,
so that the error

   PyErr_SetString(PyExc_NotImplementedError,
        "1-n mappings are currently not implemented");

no longer occurs. I.e.

   u"ab".translate({ord(u"a"): u"bbb", ord(u"b"): u"aaa"})

now works. It does this by exponentially
reallocating the string, when there is no more
available space.


----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2001-06-07 03:09

Message:
Logged In: YES 
user_id=89016

The patch that was checked in changes 
PyUnicode_DecodeCharmap and PyUnicode_EncodeCharmap, but 
not PyUnicode_TranslateCharmap, where this functionality is 
also useful (e.g. for 
u"<foo>".translate({ord("<"): u"&lt;", ord(">"): u"&gt;"})
)

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2001-01-06 07:03

Message:
Checked in a different patch providing the same functionality.
Please see the CVS checkin message for details.


----------------------------------------------------------------------

Comment By: Nobody/Anonymous (nobody)
Date: 2001-01-05 10:45

Message:
I'll check in a patch for this tomorrow which implements what I had 
in mind. The patch doesn't change the performance of the charmap 
codec.

Thanks,
-- Marc-Andre

----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2001-01-05 09:07

Message:
The problem is that you can't know beforehand how long
the result string will be, i.e. whether any 1-n
replacements will really happen.

It would be possible to do a loop through the replacement
strings and see if there are any that are longer than one character,
but even if there are, you don't know if they will really be used.

So you have three choices:
(1) You guess how much space you need and reallocate
when the space is not enough.
(2) You do a dry run of the algorithm once to count how much 
space you need, then run the algorithm a second time and 
actually use the strings.
(3) You keep the strings in a list and join the list into
one string at the end.

For the case of 1-1 mapping the following will happen:

(1) The first allocation has exactly the right amount of space, 
there won't be any reallocations, but a size check will be done for every
character (which should be only a few assembler instructions).
The mapping will have to be accessed once for every character
in the source string.

(2) There will only be one allocation, but for every character in
the source string, the mapping has to be accessed twice, which
involves calls to Python functions, exception handling etc.

(3) You have to make as many memory allocations as there are parts
of the final string that you create, including error handling etc.

I think (1) is clearly the fastest method.
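
To make choice (1) concrete, here is a minimal Python sketch of the guess-and-reallocate strategy. It only simulates the C buffer with a preallocated list, it assumes every mapping value is a string, and the name translate_growing is made up for illustration; it is not the code from the patch.

```python
def translate_growing(text, mapping):
    """Translate `text` through `mapping` (ordinal -> replacement string).

    The output buffer starts at len(text), which is exact for a pure
    1-1 mapping, and doubles whenever a replacement does not fit.
    """
    capacity = max(len(text), 1)
    out = [""] * capacity
    used = 0
    for ch in text:
        repl = mapping.get(ord(ch), ch)
        needed = used + len(repl)
        if needed > capacity:
            # Exponential growth keeps the number of reallocations small.
            while capacity < needed:
                capacity *= 2
            out.extend([""] * (capacity - len(out)))
        for c in repl:
            out[used] = c
            used += 1
    # Final "resize" down to the space actually used.
    return "".join(out[:used])
```

For a 1-1 mapping the size check never fires, so the only extra cost per character is the comparison itself, which is the point argued above.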


----------------------------------------------------------------------

Comment By: Nobody/Anonymous (nobody)
Date: 2001-01-04 10:33

Message:
I like the idea, but the implementation needs some reworking:
the common case is 1-1 mapping so this should be as fast
as possible; extra size checks slow things down too much.

You can take a different approach, though:
leave things as they are and only add a special case for the 1-n
which does resizing depending on how many extra chars are inserted.
Then as final step, if resizing occurred, call _PyUnicode_Resize()
to cut down the allocated buffer to its true size.

-- Marc-Andre

----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=403100&group_id=5470


From noreply@sourceforge.net  Thu Jun  7 13:32:10 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Thu, 07 Jun 2001 05:32:10 -0700
Subject: [Patches] [ python-Patches-403100 ] Multicharacter replacements in PyUnicode_TranslateCharmap
Message-ID: <E157yxO-0000FL-00@usw-sf-web2.sourceforge.net>

Patches item #403100, was updated on 2001-01-04 09:50
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=403100&group_id=5470

Category: core (C code)
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Walter Dörwald (doerwalter)
Assigned to: M.-A. Lemburg (lemburg)
Summary: Multicharacter replacements in PyUnicode_TranslateCharmap

Initial Comment:
This patch modifies Objects/unicodeobject.c/PyUnicode_TranslateCharmap,
so that the error

   PyErr_SetString(PyExc_NotImplementedError,
        "1-n mappings are currently not implemented");

no longer occurs. I.e.

   u"ab".translate({ord(u"a"): u"bbb", ord(u"b"): u"aaa"})

now works. It does this by exponentially
reallocating the string, when there is no more
available space.


----------------------------------------------------------------------

>Comment By: M.-A. Lemburg (lemburg)
Date: 2001-06-07 05:32

Message:
Logged In: YES 
user_id=38388

Reopened. This should really be marked as feature request
but for some reason SF won't let me change the Data Type.


----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2001-06-07 03:09

Message:
Logged In: YES 
user_id=89016

The patch that was checked in changes 
PyUnicode_DecodeCharmap and PyUnicode_EncodeCharmap, but 
not PyUnicode_TranslateCharmap, where this functionality is 
also useful (e.g. for 
u"<foo>".translate({ord("<"): u"&lt;", ord(">"): u"&gt;"})
)

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2001-01-06 07:03

Message:
Checked in a different patch providing the same functionality.
Please see the CVS checkin message for details.


----------------------------------------------------------------------

Comment By: Nobody/Anonymous (nobody)
Date: 2001-01-05 10:45

Message:
I'll check in a patch for this tomorrow which implements what I had 
in mind. The patch doesn't change the performance of the charmap 
codec.

Thanks,
-- Marc-Andre

----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2001-01-05 09:07

Message:
The problem is that you can't know beforehand how long
the result string will be, i.e. whether any 1-n
replacements will really happen.

It would be possible to do a loop through the replacement
strings and see if there are any that are longer than one character,
but even if there are, you don't know if they will really be used.

So you have three choices:
(1) You guess how much space you need and reallocate
when the space is not enough.
(2) You do a dry run of the algorithm once to count how much 
space you need, then run the algorithm a second time and 
actually use the strings.
(3) You keep the strings in a list and join the list into
one string at the end.

For the case of 1-1 mapping the following will happen:

(1) The first allocation has exactly the right amount of space, 
there won't be any reallocations, but a size check will be done for every
character (which should be only a few assembler instructions).
The mapping will have to be accessed once for every character
in the source string.

(2) There will only be one allocation, but for every character in
the source string, the mapping has to be accessed twice, which
involves calls to Python functions, exception handling etc.

(3) You have to make as many memory allocations as there are parts
of the final string that you create, including error handling etc.

I think (1) is clearly the fastest method.


----------------------------------------------------------------------

Comment By: Nobody/Anonymous (nobody)
Date: 2001-01-04 10:33

Message:
I like the idea, but the implementation needs some reworking:
the common case is 1-1 mapping so this should be as fast
as possible; extra size checks slow things down too much.

You can take a different approach, though:
leave things as they are and only add a special case for the 1-n
which does resizing depending on how many extra chars are inserted.
Then as final step, if resizing occurred, call _PyUnicode_Resize()
to cut down the allocated buffer to its true size.

-- Marc-Andre

----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=403100&group_id=5470


From noreply@sourceforge.net  Thu Jun  7 15:34:44 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Thu, 07 Jun 2001 07:34:44 -0700
Subject: [Patches] [ python-Patches-412229 ] runtime RTLD_NOW control via sys
Message-ID: <E1580s0-00029R-00@usw-sf-web2.sourceforge.net>

Patches item #412229, was updated on 2001-03-29 08:55
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=412229&group_id=5470

Category: core (C code)
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Bram Stolk (bram)
Assigned to: Martin v. Löwis (loewis)
Summary: runtime RTLD_NOW control via sys

Initial Comment:
This patch enables runtime control over the RTLD_NOW
flag, which can be used to do lazy symbol resolving
when loading a shared lib.

It's an extension to the sys module:
sys.setlazysymresolve(0|1)

The patch is against the latest CVS code, and
was generated by 'cvs diff'.

----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2001-06-07 07:34

Message:
Logged In: YES 
user_id=21627

The patch looks good to me now, so I recommend accepting it
- except for the part that activates dlmodule by default.

As for getting at the RTLD flags, I see four options:
1. setup.py could be changed to build dl wherever possible.
2. Administrators should activate dlmodule if they trust it.
3. Application authors somehow need to find out the values
of RTLD_ on their system, e.g. by per-system hard-coded
values, or by running h2py on dlfcn.h; that could be part of
the Python distribution for systems known to support
dlfcn.h.
4. The RTLD_ flags are exported from some other module as
well; imp comes to mind.

Actually, putting setdlopenflags into imp instead of sys
might be worth considering.

----------------------------------------------------------------------

Comment By: Bram Stolk (bram)
Date: 2001-06-07 03:52

Message:
Logged In: YES 
user_id=14028

OK, I've revised the patch as you suggested.
Currently, you can get and set the flags just as you
specified.
It should also build on platforms without RTLD_NOW,
and even on platforms without LDOPEN altogether.

However, I see one problem with this:
After Python 1.5.2, the dl module seems to have been removed
from the default installation. 

This means that dl.RTLD_NOW and dl.RTLD_LAZY are not
readily available on a standard Python install.
This is awkward.

The patch was generated with the command:
cvs diff -c against the cvs tree of  Thu Jun  7 12:44:18 MDT 2001

   Bram


----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2001-06-05 23:39

Message:
Logged In: YES 
user_id=21627

The patch needs further work: The code currently compiles 
on systems which don't define RTLD_NOW (although I'm not 
sure what these systems are); your code doesn't.

Also, the code allows setting the flags, but has no 
interface to query them.

Finally, users often complain that Python should use 
RTLD_GLOBAL, so that they can share symbols across 
extension modules. Therefore, I propose that you allow 
setting arbitrary dlopen flags; users would have to write

sys.setdlopenflags(0)

to turn off RTLD_NOW, and use

sys.setdlopenflags(dl.RTLD_NOW|dl.RTLD_GLOBAL)

to add RTLD_GLOBAL.

When you revise this patch, please submit unified (-u) or 
context (-c) diffs.


----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2001-04-10 14:48

Message:
Logged In: YES 
user_id=6380

Sorry, no new features in 2.1.

I'll look at this after 2.1 is released though.

----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=412229&group_id=5470


From noreply@sourceforge.net  Thu Jun  7 16:10:39 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Thu, 07 Jun 2001 08:10:39 -0700
Subject: [Patches] [ python-Patches-413171 ] fix UserDict.get, setdefault, update
Message-ID: <E1581Ql-00066A-00@usw-sf-web1.sourceforge.net>

Patches item #413171, was updated on 2001-04-02 10:18
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=413171&group_id=5470

Category: library
Group: None
Status: Open
>Resolution: Accepted
Priority: 4
Submitted By: Ka-Ping Yee (ping)
Assigned to: Ka-Ping Yee (ping)
Summary: fix UserDict.get, setdefault, update

Initial Comment:
The methods 'get', 'setdefault', and 'update'
on a dictionary are usually implemented (and
thought of) in terms of the lower-level methods
has_key, __getitem__, and __setitem__.  The
current implementation of UserDict relays a
call to e.g. x.get() to x.data.get(), which
behaves inconsistently if __getitem__ has been
implemented on x.

One particular big place where this turns up is cgi.
If you get a dict = cgi.SvFormContentDict(), then
dict.get('key') will return a *list* even though
dict['key'] returns a single item!

To make UserDict behave consistently, this patch
fixes get(), update(), and setdefault() to re-use
the other methods.  Then the only occurrence of
self.data[k] = v is in __setitem__, the only
occurrence of self.data[k] without assignment is
in __getitem__, etc.
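
The idea can be sketched in a few lines of modern Python. This is only an illustration, not the patch itself: the class names ConsistentDict and FirstItemDict are invented, and __contains__ stands in for the has_key method mentioned above.

```python
class ConsistentDict:
    """Wrapper whose higher-level methods reuse the item-access methods."""

    def __init__(self, initial=None):
        self.data = {}
        if initial is not None:
            self.update(initial)

    def __getitem__(self, key):
        return self.data[key]      # only place that reads self.data[key]

    def __setitem__(self, key, value):
        self.data[key] = value     # only place that assigns self.data[key]

    def __contains__(self, key):
        return key in self.data

    def get(self, key, default=None):
        if key in self:
            return self[key]       # goes through __getitem__
        return default

    def setdefault(self, key, default=None):
        if key not in self:
            self[key] = default    # goes through __setitem__
        return self[key]

    def update(self, other):
        for key in other.keys():
            self[key] = other[key]


class FirstItemDict(ConsistentDict):
    """Stores lists but hands out only the first element of each list."""

    def __getitem__(self, key):
        return self.data[key][0]
```

With this layout a subclass that overrides __getitem__, as cgi.SvFormContentDict does, gets the same answer from d.get('key') and d['key'].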

----------------------------------------------------------------------

>Comment By: Guido van Rossum (gvanrossum)
Date: 2001-06-07 08:10

Message:
Logged In: YES 
user_id=6380

Approved.  Check it in already!

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2001-06-06 22:25

Message:
Logged In: YES 
user_id=21627

I recommend approving this patch.


----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2001-04-10 14:17

Message:
Logged In: YES 
user_id=6380

Let's not fix this in 2.1.

----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=413171&group_id=5470


From noreply@sourceforge.net  Thu Jun  7 17:37:40 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Thu, 07 Jun 2001 09:37:40 -0700
Subject: [Patches] [ python-Patches-430754 ] Makes ftpmirror.py .netrc aware
Message-ID: <E1582my-0007q9-00@usw-sf-web1.sourceforge.net>

Patches item #430754, was updated on 2001-06-06 10:21
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=430754&group_id=5470

Category: demos and tools
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Mike Romberg (romberg)
Assigned to: Martin v. Löwis (loewis)
Summary: Makes ftpmirror.py .netrc aware

Initial Comment:
  The following patch modifies the ftpmirror.py script
found in Tools/scripts to use the netrc module.  This allows
the ftpmirror script to act more like a standard ftp client
and take the login, password, and account from a user's
$HOME/.netrc file if it exists.

  This patch is against the ftpmirror.py found in
Python 2.1.


----------------------------------------------------------------------

>Comment By: Mike Romberg (romberg)
Date: 2001-06-07 09:37

Message:
Logged In: YES 
user_id=61373

Good ideas.  Here is a revised patch.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2001-06-06 23:08

Message:
Logged In: YES 
user_id=21627

I recommend a number of improvements to the patch:

- When unpacking the tuple, it is more intuitive to put 
the variables in the order in which they are documented 
for the function, i.e.

            if auth:
                login, account, passwd = auth

- If the user does not have a .netrc, IOError will be 
raised and should be expected

- If a user is specified in the command line, it should 
probably take precedence over the .netrc setting

- The debug message (Logging in as) should probably display 
the user which is used for login.

Please indicate whether you can produce a revised patch.
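
The points above could be combined roughly as follows. This is a sketch for present-day Python 3, not the revised patch: the function name resolve_credentials and its parameters are hypothetical, and a missing .netrc shows up as OSError (the successor of IOError).

```python
import netrc


def resolve_credentials(host, cli_login="", cli_passwd="", netrc_file=None):
    """Return (login, account, passwd) for `host`.

    Values from .netrc are used as defaults; a login given on the
    command line takes precedence over the .netrc entry.
    """
    login, account, passwd = "", "", ""
    try:
        auth = netrc.netrc(netrc_file).authenticators(host)
    except (OSError, netrc.NetrcParseError):
        # No readable .netrc: this is expected, fall back to defaults.
        auth = None
    if auth:
        # Same order as documented for authenticators().
        login, account, passwd = auth
    if cli_login:
        login, passwd = cli_login, cli_passwd
    return login, account or "", passwd or ""
```

A caller would then print the returned login in its "Logging in as" message, so the debug output always names the user actually used.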


----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=430754&group_id=5470


From noreply@sourceforge.net  Thu Jun  7 18:17:41 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Thu, 07 Jun 2001 10:17:41 -0700
Subject: [Patches] [ python-Patches-430754 ] Makes ftpmirror.py .netrc aware
Message-ID: <E1583Ph-0004hL-00@usw-sf-web2.sourceforge.net>

Patches item #430754, was updated on 2001-06-06 10:21
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=430754&group_id=5470

Category: demos and tools
Group: None
>Status: Closed
>Resolution: Accepted
Priority: 5
Submitted By: Mike Romberg (romberg)
Assigned to: Martin v. Löwis (loewis)
Summary: Makes ftpmirror.py .netrc aware

Initial Comment:
  The following patch modifies the ftpmirror.py script
found in Tools/scripts to use the netrc module.  This allows
the ftpmirror script to act more like a standard ftp client
and take the login, password, and account from a user's
$HOME/.netrc file if it exists.

  This patch is against the ftpmirror.py found in
Python 2.1.


----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2001-06-07 10:17

Message:
Logged In: YES 
user_id=21627

Committed as ftpmirror.py 1.14.

----------------------------------------------------------------------

Comment By: Mike Romberg (romberg)
Date: 2001-06-07 09:37

Message:
Logged In: YES 
user_id=61373

Good ideas.  Here is a revised patch.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2001-06-06 23:08

Message:
Logged In: YES 
user_id=21627

I recommend a number of improvements to the patch:

- When unpacking the tuple, it is more intuitive to put 
the variables in the order in which they are documented 
for the function, i.e.

            if auth:
                login, account, passwd = auth

- If the user does not have a .netrc, IOError will be 
raised and should be expected

- If a user is specified in the command line, it should 
probably take precedence over the .netrc setting

- The debug message (Logging in as) should probably display 
the user which is used for login.

Please indicate whether you can produce a revised patch.


----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=430754&group_id=5470


From noreply@sourceforge.net  Thu Jun  7 19:42:44 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Thu, 07 Jun 2001 11:42:44 -0700
Subject: [Patches] [ python-Patches-430846 ] faster string-decoding in base64.py
Message-ID: <E1584k0-00065T-00@usw-sf-web2.sourceforge.net>

Patches item #430846, was updated on 2001-06-06 14:14
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=430846&group_id=5470

Category: library
Group: None
Status: Open
>Resolution: Accepted
Priority: 3
Submitted By: Peter Schneider-Kamp (nowonder)
>Assigned to: Peter Schneider-Kamp (nowonder)
Summary: faster string-decoding in base64.py

Initial Comment:
This addresses bug #419390 by anthonybaxter.

Instead of wrapping a string-to-be-decoded into a
StringIO class and using base64.decode use
binascii.a2b_base64 directly.

Speedup for big files is over 10 times (on Linux x86
anyway).

If uncontroversial I'll check it in.
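
For comparison, the two approaches look roughly like this in present-day Python 3, with io.BytesIO standing in for StringIO. The function names are invented for the illustration and this is not the text of the patch.

```python
import base64
import binascii
import io


def decodestring_via_file(data):
    """Old approach: wrap the bytes in file objects and decode line by line."""
    out = io.BytesIO()
    base64.decode(io.BytesIO(data), out)
    return out.getvalue()


def decodestring_direct(data):
    """Proposed approach: hand the whole buffer to binascii in one call."""
    return binascii.a2b_base64(data)
```

Both return the same bytes; the direct version simply avoids the per-line file-object overhead, which is presumably where the reported speedup comes from.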

----------------------------------------------------------------------

>Comment By: Tim Peters (tim_one)
Date: 2001-06-07 11:42

Message:
Logged In: YES 
user_id=31435

Accepted and assigned back to Peter for checkin.  Don't see 
how this could be controversial -- it's simple and 
appropriate.

----------------------------------------------------------------------

Comment By: Peter Schneider-Kamp (nowonder)
Date: 2001-06-06 23:07

Message:
Logged In: YES 
user_id=14463

Mhh, I did click that "Check to Upload & Attach File" thing.

No matter what, here is the new version (including your
speedup for encodestring).

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2001-06-06 14:25

Message:
Logged In: YES 
user_id=31435

Umm -- there's no patch here.  If there were, I bet I would 
have changed this to Accepted, though <wink>.

----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=430846&group_id=5470


From noreply@sourceforge.net  Thu Jun  7 20:39:31 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Thu, 07 Jun 2001 12:39:31 -0700
Subject: [Patches] [ python-Patches-430948 ] Performance improvement for profiler
Message-ID: <E1585cx-0001wl-00@usw-sf-web3.sourceforge.net>

Patches item #430948, was updated on 2001-06-06 22:29
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=430948&group_id=5470

Category: library
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Fred L. Drake, Jr. (fdrake)
>Assigned to: Fred L. Drake, Jr. (fdrake)
Summary: Performance improvement for profiler

Initial Comment:
This patch adds a bit of complexity to
Profile.__init__() in an effort to reduce the overhead
of the profiler.  The essential piece of the puzzle is
that the general Profile.get_time() method is replaced
with a function which does only as much as is needed
for the underlying timer.  For example, if time.clock()
is available, get_time can become a PyCFunction instead of a
bound method, requiring only 1 dict lookup to execute
instead of the 11 it takes to execute get_time()
without this patch.

Also removes a couple of duplicate imports from the "if
__name__ == ..." section.
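
A stripped-down sketch of the technique, for present-day Python 3: the timer is inspected once in __init__ and get_time is bound to the cheapest callable that fits. The class name TinyProfile is invented, and time.process_time is used here in place of time.clock, which no longer exists in current Python.

```python
import time


class TinyProfile:
    """Picks a specialised get_time once, instead of branching per call."""

    def __init__(self, timer=None):
        if timer is None:
            # Built-in function: called directly, no bound-method overhead.
            self.get_time = time.process_time
            return
        probe = timer()
        try:
            len(probe)
        except TypeError:
            # The timer returns a single number: use it as is.
            self.get_time = timer
        else:
            # The timer returns a sequence (like os.times()): add it up.
            def get_time_list(timer=timer, sum=sum):
                return sum(timer())
            self.get_time = get_time_list
```

Every later call to self.get_time() then runs only the code needed for the chosen timer.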


----------------------------------------------------------------------

>Comment By: Tim Peters (tim_one)
Date: 2001-06-07 12:39

Message:
Logged In: YES 
user_id=31435

Fine by me (and good idea!).  I'd rather see get_time_mac 
be a module-level function _get_time_mac, get_time_timer a 
module-level _get_time_timer (or, better, _get_time_list), 
and get_time_times a module-level function _get_time_times; 
and in the last case without the needless expense of 
reduce():

.def _get_time_times(times=os.times):
.    t = times()
.    return t[0] + t[1]

----------------------------------------------------------------------

Comment By: Fred L. Drake, Jr. (fdrake)
Date: 2001-06-06 22:30

Message:
Logged In: YES 
user_id=3066

I should note that this works with both 2.1.1 and 2.2,
though this is not a bugfix.

----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=430948&group_id=5470


From noreply@sourceforge.net  Thu Jun  7 20:40:10 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Thu, 07 Jun 2001 12:40:10 -0700
Subject: [Patches] [ python-Patches-429957 ] Add some more EBCDIC  encodings
Message-ID: <E1585da-0001xU-00@usw-sf-web3.sourceforge.net>

Patches item #429957, was updated on 2001-06-03 20:53
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=429957&group_id=5470

Category: library
Group: None
>Status: Closed
>Resolution: Accepted
Priority: 5
Submitted By: Brian Quinlan (bquinlan)
Assigned to: Nobody/Anonymous (nobody)
Summary: Add some more EBCDIC  encodings

Initial Comment:
Add support for cp1140, which is identical to cp037, 
with the addition of the euro character.

Also added a few EBCDIC aliases.

----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2001-06-07 12:40

Message:
Logged In: YES 
user_id=21627

Committed as cp1140.py 1.1 and aliases.py 1.8.


----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=429957&group_id=5470


From noreply@sourceforge.net  Thu Jun  7 20:47:00 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Thu, 07 Jun 2001 12:47:00 -0700
Subject: [Patches] [ python-Patches-430181 ] Make httplib work with picky servers
Message-ID: <E1585kC-00022W-00@usw-sf-web3.sourceforge.net>

Patches item #430181, was updated on 2001-06-04 19:40
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=430181&group_id=5470

Category: library
Group: 2.0.1 bugfix
Status: Open
Resolution: None
Priority: 5
Submitted By: Leonard Samuelson (lenski)
Assigned to: Nobody/Anonymous (nobody)
Summary: Make httplib work with picky servers

Initial Comment:
Python2.0: httplib.py: httplib: HTTPconnection

Header processing: (putheader, putrequest, and
endheaders) methods transmit each HTTP header
line using a separate socket send invocation.

Before this change, my Linksys Etherfast Cable/DSL
router (Linksys BEFSR41, firmware v 1.22,
March 31 2000) rejected the request because the
entire HTTP header block was not contained in a
single TCP packet.

Clearly, the router is engaging in a noncompliant
optimization!  This patch is not required to allow
httplib to work with real servers, making it
completely optional.

The patch I am submitting with this note causes
httplib to work with the router.  It is intended
mostly as a model; a developer with greater
familiarity with the library might have a better
approach.

----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2001-06-07 12:47

Message:
Logged In: YES 
user_id=21627

I recommend rejecting this patch. Not only is the router 
broken, but it appears that the operating system is broken 
also; I think it legally could, and probably should, 
combine small write requests to a TCP socket that occur 
shortly after each other into a single IP packet.


----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=430181&group_id=5470


From noreply@sourceforge.net  Thu Jun  7 20:53:57 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Thu, 07 Jun 2001 12:53:57 -0700
Subject: [Patches] [ python-Patches-429542 ] Bugfix for libsmtp example
Message-ID: <E1585qv-00028c-00@usw-sf-web3.sourceforge.net>

Patches item #429542, was updated on 2001-06-02 02:27
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=429542&group_id=5470

Category: documentation
Group: None
>Status: Closed
>Resolution: Works For Me
Priority: 5
Submitted By: Sean Reifschneider (jafo)
Assigned to: Nobody/Anonymous (nobody)
Summary: Bugfix for libsmtp example

Initial Comment:
libsmtp includes an example which does:

   while 1:
      line = raw_input()
      if not line: break

which fails raising an EOFError exception.  This patch
changes the code to:

   while 1:
      try:
         line = raw_input()
      except EOFError:
         break

Sean

----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2001-06-07 12:53

Message:
Logged In: YES 
user_id=21627

This is already fixed in Doc/lib/libsmtplib.tex revisions 
1.17 and 1.16.6.1, as a response to bug report #424776.


----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=429542&group_id=5470


From noreply@sourceforge.net  Thu Jun  7 20:58:34 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Thu, 07 Jun 2001 12:58:34 -0700
Subject: [Patches] [ python-Patches-429614 ] pythonpath and optimize def. before init
Message-ID: <E1585vO-0002D9-00@usw-sf-web3.sourceforge.net>

Patches item #429614, was updated on 2001-06-02 08:56
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=429614&group_id=5470

Category: core (C code)
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Frederic Giacometti (giacometti)
Assigned to: Nobody/Anonymous (nobody)
Summary: pythonpath and optimize def. before init

Initial Comment:

A) Addition of four functions
=====================

Py_{Set, Get}{PythonPath, OptimizeLevel}()
with the same semantics as Py_{Set, Get}ProgramName()

(Note: the ANSI C type 'char const*' is used to describe non-modifiable strings)

These four functions are needed in the next JPE runtime (Python 2.1 patch included in the 
distribution); this allows setting the PYTHONPATH and optimize level from Java property values.


B) Option '-P pythonpath' on the Python command line:
========================================

This option defines 'pythonpath' from the command line (and overrides the PYTHONPATH 
environment variable if necessary).

Usefulness: Sometimes, one does not want to rely on the environment variables, or modify them.

Sample application: Running build and test scripts in full control of the environment, and with 
different PYTHONPATH values.

This option is needed by the build and test scripts of the next JPE source distribution (Python 2.1 
patch included in the distribution).

Frederic Giacometti
fred@arakne.com


----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2001-06-07 12:58

Message:
Logged In: YES 
user_id=21627

I think a PEP describing the exact rationale and nature of 
the change is required here. For example, why is it good 
that -P overrides PYTHONPATH, instead of combining both 
somehow?

Also, the documentation talks about Py_GetOptimizeLevel, 
whereas the header declares Py_GetOptimizeFlag.


----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=429614&group_id=5470


From noreply@sourceforge.net  Thu Jun  7 21:00:54 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Thu, 07 Jun 2001 13:00:54 -0700
Subject: [Patches] [ python-Patches-429442 ] Cygwin sys.platform/get_platform() patch
Message-ID: <E1585xe-0007Ju-00@usw-sf-web2.sourceforge.net>

Patches item #429442, was updated on 2001-06-01 13:07
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=429442&group_id=5470

Category: distutils
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Jason Tishler (jlt63)
>Assigned to: Greg Ward (gward)
Summary: Cygwin sys.platform/get_platform() patch

Initial Comment:
This patch corrects sys.platform and distutils.util.get_platform()
problems caused by the cruft contained in Cygwin's uname -s.
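
For readers unfamiliar with the problem: Cygwin's uname -s reports something like CYGWIN_NT-5.0, so a platform string derived from it carries the Windows version along. The sketch below (Python, with a made-up function name, and not the code from the patch) shows the kind of normalization involved:

```python
def normalize_sysname(uname_s):
    """Collapse Cygwin's decorated system name to a plain 'cygwin'.

    Hypothetical helper; other system names are only lower-cased.
    """
    name = uname_s.lower()
    if name.startswith("cygwin"):
        return "cygwin"
    return name
```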

Please see the following for the gory details:

http://www.cygwin.com/ml/cygwin-apps/2001-05/msg00106.html

Note that the above also solicited input from the community in an
attempt to prevent any potential heartache.  Since no one responded
it would appear that either the changes are acceptable or that no one
really cares... :,)

----------------------------------------------------------------------

>Comment By: Tim Peters (tim_one)
Date: 2001-06-07 13:00

Message:
Logged In: YES 
user_id=31435

Assigned to GregW.  Greg, note that since Cygwin is really 
a Unix derivative, your primary concern is probably just 
that this doesn't break other Unixoid systems.

----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=429442&group_id=5470


From noreply@sourceforge.net  Thu Jun  7 21:05:50 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Thu, 07 Jun 2001 13:05:50 -0700
Subject: [Patches] [ python-Patches-429171 ] sgmllib - leading spaces in declaration
Message-ID: <E15862Q-0002J9-00@usw-sf-web3.sourceforge.net>

Patches item #429171, was updated on 2001-05-31 15:26
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=429171&group_id=5470

Category: library
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Doug Fort (dougfort)
Assigned to: Nobody/Anonymous (nobody)
Summary: sgmllib - leading spaces in declaration

Initial Comment:
Some sites sloppily leave a space in their doctype
declaration:  i.e. <! doctype...>. The Python 2.1 sgml
parser raises an exception for this.  This patch
modifies sgmllib.py to allow leading whitespace in the
declaration.  It also adds a little information to the
exception message.


----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2001-06-07 13:05

Message:
Logged In: YES 
user_id=21627

I don't have an SGML spec, so I can only check the XML 
spec. In XML, such a DOCTYPE declaration is ill-formed; I 
expect the same to be true for SGML. Therefore, I 
recommend rejecting this patch.

If you need to process such ill-formed documents, I 
recommend deriving from SGMLParser and overriding 
parse_declaration appropriately. E.g. you could advance i 
until after the space, then call the base method.
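
A rough sketch of that idea in Python. StrictParser is a stand-in written for this example, not sgmllib.SGMLParser, and its parse_declaration takes the raw data as an explicit argument, which the real method may not:

```python
import re


class StrictParser:
    """Stand-in for a strict parser (hypothetical, not sgmllib itself)."""

    _decl = re.compile(r"<!([A-Za-z]+)[^>]*>")

    def parse_declaration(self, rawdata, i):
        # Accept only declarations whose name follows "<!" immediately.
        m = self._decl.match(rawdata, i)
        if m is None:
            raise ValueError("malformed declaration at %d" % i)
        return m.end()


class LenientParser(StrictParser):
    """Skips whitespace after "<!" and then defers to the strict method."""

    def parse_declaration(self, rawdata, i):
        start = i + 2  # first character after "<!"
        j = start
        while j < len(rawdata) and rawdata[j].isspace():
            j += 1
        skipped = j - start
        if skipped:
            # Drop the stray whitespace before calling the base method.
            rawdata = rawdata[:start] + rawdata[j:]
        # Report the end position in terms of the original, unmodified data.
        return StrictParser.parse_declaration(self, rawdata, i) + skipped
```

The strict class still rejects "<! doctype html>", while the subclass accepts it and returns the same end offset a caller would expect for the original text.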


----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=429171&group_id=5470


From noreply@sourceforge.net  Thu Jun  7 21:07:00 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Thu, 07 Jun 2001 13:07:00 -0700
Subject: [Patches] [ python-Patches-427749 ] Patch for bug #419390 (base64.py)
Message-ID: <E15863Y-0007P6-00@usw-sf-web2.sourceforge.net>

Patches item #427749, was updated on 2001-05-27 11:35
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=427749&group_id=5470

Category: library
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Kalle Svensson (krftkndl)
>Assigned to: Peter Schneider-Kamp (nowonder)
Summary: Patch for bug #419390 (base64.py)

Initial Comment:
Improves performance of base64.encodestring and
base64.decodestring by avoiding StringIO and using
binascii directly.

----------------------------------------------------------------------

>Comment By: Tim Peters (tim_one)
Date: 2001-06-07 13:07

Message:
Logged In: YES 
user_id=31435

Assigned to Peter since it appears to compete with his 
patch.  Peter, I expect your patch is quicker.  If you 
agree and check in your patch, close this as Duplicate (or 
something).

----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=427749&group_id=5470


From noreply@sourceforge.net  Thu Jun  7 21:09:50 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Thu, 07 Jun 2001 13:09:50 -0700
Subject: [Patches] [ python-Patches-426746 ] Infrastructure for getting MacPython modules working on OSX
Message-ID: <E15866I-0007U9-00@usw-sf-web2.sourceforge.net>

Patches item #426746, was updated on 2001-05-23 13:29
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=426746&group_id=5470

Category: Build
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Jack Jansen (jackjansen)
>Assigned to: Thomas Wouters (twouters)
Summary: Infrastructure for getting MacPython modules working on OSX

Initial Comment:
Here are a couple of patches that optionally (on MacOSX) enable a bit of extra infrastructure in the Python core, to allow various (MacPython-originated) dynamic extension modules to be built. Here's what I patched:

- Added a MACHDEP_OBJS variable to Makefile.pre.in and configure.in. This allows platforms to add platform-specific source files to the core build.

- Added (using MACHDEP_OBJS) a macglue.c file to the build, which contains glue code that allows Mac extension modules to refer to each other while being in separate dynamically loaded modules, plus a couple of utility routines. There are also a few changes to LDFLAGS to get the object file incorporated (as it is otherwise optimized away because the rest of Python doesn't refer to it).

- Added a config.h.in define USE_TOOLBOX_OBJECT_GLUE which enables the glue code mentioned above (which isn't needed in MacPython, only in Mach-O Python).

Possibly the latter two should be dependent on a configure switch (--with-mac-toolbox-modules?) but (a) I think the added memory footprint is minimal and (b) I never understood how to add configure switches:-)

A setup.py patch will follow, but I'm still testing it.


----------------------------------------------------------------------

>Comment By: Tim Peters (tim_one)
Date: 2001-06-07 13:09

Message:
Logged In: YES 
user_id=31435

Assigned to Thomas because he's shown previous signs of 
knowing how to spell "configure" <0.9 wink>.

----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=426746&group_id=5470


From noreply@sourceforge.net  Thu Jun  7 21:19:04 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Thu, 07 Jun 2001 13:19:04 -0700
Subject: [Patches] [ python-Patches-424475 ] Speed-up tp_compare usage
Message-ID: <E1586FE-0007bn-00@usw-sf-web2.sourceforge.net>

Patches item #424475, was updated on 2001-05-16 01:07
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=424475&group_id=5470

Category: core (C code)
Group: None
Status: Open
>Resolution: Accepted
Priority: 5
Submitted By: Martin v. Löwis (loewis)
>Assigned to: Martin v. Löwis (loewis)
Summary: Speed-up tp_compare usage

Initial Comment:
This patch tries to optimize PyObject_RichCompare for
the common case of objects with equal types which
support tp_compare. It gives a speed-up of roughly 7%
for comparing strings in a loop.

The patch also gives type objects a tp_compare
function, so that they can make use of the improvement.

----------------------------------------------------------------------

>Comment By: Tim Peters (tim_one)
Date: 2001-06-07 13:19

Message:
Logged In: YES 
user_id=31435

Accepted and assigned back to Martin.  This is too valuable 
to quibble over.  Note that when calling a tp_compare slot, 
this kind of thing:

.	c = (*f)(v, w);
.	if (PyErr_Occurred())

is better spelled:

.	c = (*f)(v, w);
.	if (c < 0 && PyErr_Occurred())


----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2001-05-21 09:57

Message:
Logged In: YES 
user_id=21627

The revised patch prefers tp_compare over tp_richcompare in
do_cmp if both are available. It also restores
UserList.__cmp__ from deprecation.

----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=424475&group_id=5470


From noreply@sourceforge.net  Thu Jun  7 21:41:58 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Thu, 07 Jun 2001 13:41:58 -0700
Subject: [Patches] [ python-Patches-429171 ] sgmllib - leading spaces in declaration
Message-ID: <E1586bO-00080Q-00@usw-sf-web2.sourceforge.net>

Patches item #429171, was updated on 2001-05-31 15:26
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=429171&group_id=5470

Category: library
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Doug Fort (dougfort)
Assigned to: Nobody/Anonymous (nobody)
Summary: sgmllib - leading spaces in declaration

Initial Comment:
Some sites sloppily leave a space in their doctype
declaration:  i.e. <! doctype...>. The Python 2.1 sgml
parser raises an exception for this.  This patch
modifies sgmllib.py to allow leading whitespace in the
declaration.  It also adds a little information to the
exception message.


----------------------------------------------------------------------

>Comment By: Doug Fort (dougfort)
Date: 2001-06-07 13:41

Message:
Logged In: YES 
user_id=6399

I have already overloaded parse_declaration. I will withdraw
the patch. However, I would like to make one final comment.

<rant>
A rigid interpretation of the RFCs is correct in servers,
but clients should be as flexible as possible, to handle
real servers.  Our system (http://www.stressmy.com) uses
heavily overloaded versions of sgmllib, httplib, and other
Python library modules because while they may adhere to
some notion of academic purity, they just don't work very
well against real websites.
</rant>

Whew, I feel better now.  

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2001-06-07 13:05

Message:
Logged In: YES 
user_id=21627

I don't have an SGML spec, so I can only check the XML 
spec. In XML, such a DOCTYPE declaration is ill-formed; I 
expect the same to be true for SGML. Therefore, I 
recommend rejecting this patch.

If you need to process such ill-formed documents, I 
recommend deriving from SGMLParser and overriding 
parse_declaration appropriately. E.g. you could advance i 
until after the space, then call the base method.


----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=429171&group_id=5470


From noreply@sourceforge.net  Thu Jun  7 21:43:00 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Thu, 07 Jun 2001 13:43:00 -0700
Subject: [Patches] [ python-Patches-429171 ] sgmllib - leading spaces in declaration
Message-ID: <E1586cO-00081S-00@usw-sf-web2.sourceforge.net>

Patches item #429171, was updated on 2001-05-31 15:26
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=429171&group_id=5470

Category: library
Group: None
>Status: Deleted
Resolution: None
Priority: 5
Submitted By: Doug Fort (dougfort)
Assigned to: Nobody/Anonymous (nobody)
Summary: sgmllib - leading spaces in declaration

Initial Comment:
Some sites sloppily leave a space in their doctype
declaration:  i.e. <! doctype...>. The Python 2.1 sgml
parser raises an exception for this.  This patch
modifies sgmllib.py to allow leading whitespace in the
declaration.  It also adds a little information to the
exception message.


----------------------------------------------------------------------

Comment By: Doug Fort (dougfort)
Date: 2001-06-07 13:41

Message:
Logged In: YES 
user_id=6399

I have already overloaded parse_declaration. I will withdraw
the patch. However, I would like to make one final comment.

<rant>
A rigid interpretation of the RFCs is correct in servers,
but clients should be as flexible as possible, to handle
real servers.  Our system (http://www.stressmy.com) uses
heavily overloaded versions of sgmllib, httplib, and other
Python library modules because while they may adhere to
some notion of academic purity, they just don't work very
well against real websites.
</rant>

Whew, I feel better now.  

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2001-06-07 13:05

Message:
Logged In: YES 
user_id=21627

I don't have an SGML spec, so I can only check the XML 
spec. In XML, such a DOCTYPE declaration is ill-formed; I 
expect the same to be true for SGML. Therefore, I 
recommend rejecting this patch.

If you need to process such ill-formed documents, I 
recommend deriving from SGMLParser and overriding 
parse_declaration appropriately. E.g. you could advance i 
until after the space, then call the base method.


----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=429171&group_id=5470


From noreply@sourceforge.net  Thu Jun  7 22:37:52 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Thu, 07 Jun 2001 14:37:52 -0700
Subject: [Patches] [ python-Patches-430948 ] Performance improvement for profiler
Message-ID: <E1587TU-0003mo-00@usw-sf-web3.sourceforge.net>

Patches item #430948, was updated on 2001-06-06 22:29
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=430948&group_id=5470

Category: library
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Fred L. Drake, Jr. (fdrake)
>Assigned to: Tim Peters (tim_one)
Summary: Performance improvement for profiler

Initial Comment:
This patch adds a bit of complexity to
Profile.__init__() in an effort to reduce the overhead
of the profiler.  The essential piece of the puzzle is
that the general Profile.get_time() method is replaced
with a function which does only as much as is needed
for the underlying timer.  For example, if time.clock()
is available, it can become a PyCFunction instead of a
bound method, and then requires only 1 dict lookup to execute
instead of the 11 it takes to execute get_time()
without this patch.

Also removes a couple of duplicate imports from the "if
__name__ == ..." section.
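
The idea, as a minimal sketch rather than the patch itself: inspect the timer once, when the profiler is set up, and keep a function specialized for what that timer returns. make_get_time is a hypothetical name; _get_time_times assumes os.times() reports user and system CPU time in its first two fields.

```python
import os


def _get_time_times(times=os.times):
    # The default argument binds os.times once, avoiding a global lookup per call.
    t = times()
    return t[0] + t[1]


def make_get_time(timer=None):
    """Choose a time-reading function once, based on what `timer` returns."""
    if timer is None:
        return _get_time_times
    sample = timer()
    try:
        length = len(sample)
    except TypeError:
        # Scalar timer: use it directly, with no wrapper at all.
        return timer
    if length == 2:
        def get_time(timer=timer):
            t = timer()
            return t[0] + t[1]
    else:
        def get_time(timer=timer, sum=sum):
            # General case: add up however many fields the timer reports.
            return sum(timer())
    return get_time
```

The per-call cost is then limited to whatever the chosen function does; the type inspection happens only once.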


----------------------------------------------------------------------

>Comment By: Fred L. Drake, Jr. (fdrake)
Date: 2001-06-07 14:37

Message:
Logged In: YES 
user_id=3066

I've attached a revised patch with the suggested changes,
plus a few more.  This is more aggressive about avoiding
dictionary lookups, and the dispatch table no longer
contains bound methods -- using plain functions with self
passed as an explicit argument is faster as it avoids more
of Python's call machinery, and avoids circular references.
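
A small illustration of the dispatch-table point, with invented class and handler names: the table lives on the class and holds plain functions, so no bound-method objects are stored on the instance (which is where a reference cycle would come from), and self is passed by hand.

```python
class MiniProfiler:
    """Toy example; not the real profile.Profile class."""

    def __init__(self):
        self.counts = {"call": 0, "return": 0}

    def _on_call(self, frame):
        self.counts["call"] += 1

    def _on_return(self, frame):
        self.counts["return"] += 1

    # Built inside the class body, so the values are plain functions,
    # not methods bound to any instance.
    dispatch = {"call": _on_call, "return": _on_return}

    def handle(self, event, frame=None):
        # Explicit self: look up the plain function and call it directly.
        self.dispatch[event](self, frame)
```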

This patch also attempts not to add any breakage to the
OldProfile and HotProfile classes.

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2001-06-07 12:39

Message:
Logged In: YES 
user_id=31435

Fine by me (and good idea!).  I'd rather see get_time_mac 
be a module-level function _get_time_mac, get_time_timer a 
module-level _get_time_timer (or, better, _get_time_list), 
and get_time_times a module-level function _get_time_times; 
and in the last case without the needless expense of
reduce():

.def _get_time_times(times=os.times):
.    t = times()
.    return t[0] + t[1]

----------------------------------------------------------------------

Comment By: Fred L. Drake, Jr. (fdrake)
Date: 2001-06-06 22:30

Message:
Logged In: YES 
user_id=3066

I should note that this works with both 2.1.1 and 2.2,
though this is not a bugfix.

----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=430948&group_id=5470


From noreply@sourceforge.net  Thu Jun  7 22:44:35 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Thu, 07 Jun 2001 14:44:35 -0700
Subject: [Patches] [ python-Patches-431257 ] profile/trace dispatch speed-up
Message-ID: <E1587Zz-0003ul-00@usw-sf-web3.sourceforge.net>

Patches item #431257, was updated on 2001-06-07 14:44
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=431257&group_id=5470

Category: core (C code)
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Fred L. Drake, Jr. (fdrake)
Assigned to: Tim Peters (tim_one)
Summary: profile/trace dispatch speed-up

Initial Comment:
The profile and trace functions take a string as one of
their parameters, where the value of the string is one
of exactly four values.  Unfortunately, a new string
object is created for each call to the profile/trace
functions, and is not interned.

This patch modifies ceval.c so the string object for
each of these values is created only once and is
interned, allowing faster dictionary lookups in the
profile/trace functions.  This avoids a lot of string
creation overhead for calling these functions, and can
help the standard profiler work faster by using
interned string objects.
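
At the Python level the effect of interning can be seen with sys.intern in current Python; the patch itself does this in C inside ceval.c. The event names below are an assumption about which strings are meant:

```python
import sys

# Assumed event names; each is interned once, up front.
EVENT_NAMES = tuple(
    sys.intern(name) for name in ("call", "exception", "line", "return")
)


def intern_event(text):
    """Return the shared interned object for an event name."""
    return sys.intern(text)


# A string assembled at run time is equal to "call"; interning it yields
# the very same object that was created above, so an identity check succeeds.
built = "".join(["ca", "ll"])
assert built == EVENT_NAMES[0]
assert intern_event(built) is EVENT_NAMES[0]
```

Dictionary lookups try an identity comparison before falling back to comparing contents, which is why sharing one object per name helps.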

----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=431257&group_id=5470


From noreply@sourceforge.net  Thu Jun  7 22:54:13 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Thu, 07 Jun 2001 14:54:13 -0700
Subject: [Patches] [ python-Patches-430948 ] Performance improvement for profiler
Message-ID: <E1587jJ-00042g-00@usw-sf-web3.sourceforge.net>

Patches item #430948, was updated on 2001-06-06 22:29
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=430948&group_id=5470

Category: library
Group: None
Status: Open
>Resolution: Accepted
Priority: 5
Submitted By: Fred L. Drake, Jr. (fdrake)
>Assigned to: Fred L. Drake, Jr. (fdrake)
Summary: Performance improvement for profiler

Initial Comment:
This patch adds a bit of complexity to
Profile.__init__() in an effort to reduce the overhead
of the profiler.  The essential piece of the puzzle is
that the general Profile.get_time() method is replaced
with a function which does only as much as is needed
for the underlying timer.  For example, if time.clock()
is available, it can become a PyCFunction instead of a
bound method, and then requires only 1 dict lookup to execute
instead of the 11 it takes to execute get_time()
without this patch.

Also removes a couple of duplicate imports from the "if
__name__ == ..." section.


----------------------------------------------------------------------

>Comment By: Tim Peters (tim_one)
Date: 2001-06-07 14:54

Message:
Logged In: YES 
user_id=31435

Accepted and back to Fred, with the caveat we talked about 
that __init__ should still do the right thing with a 
passed-in timer returning an arbitrary sequence-like object of 
number-like objects <wink -- i.e., the "reduce" business>.


----------------------------------------------------------------------

Comment By: Fred L. Drake, Jr. (fdrake)
Date: 2001-06-07 14:37

Message:
Logged In: YES 
user_id=3066

I've attached a revised patch with the suggested changes,
plus a few more.  This is more aggressive about avoiding
dictionary lookups, and the dispatch table no longer
contains bound methods -- using plain functions with self
passed as an explicit argument is faster as it avoids more
of Python's call machinery, and avoids circular references.

This patch also attempts not to add any breakage to the
OldProfile and HotProfile classes.

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2001-06-07 12:39

Message:
Logged In: YES 
user_id=31435

Fine by me (and good idea!).  I'd rather see get_time_mac 
be a module-level function _get_time_mac, get_time_timer a 
module-level _get_time_timer (or, better, _get_time_list), 
and get_time_times a module-level function _get_time_times; 
and in the last case without the needless expense of
reduce():

.def _get_time_times(times=os.times):
.    t = times()
.    return t[0] + t[1]

----------------------------------------------------------------------

Comment By: Fred L. Drake, Jr. (fdrake)
Date: 2001-06-06 22:30

Message:
Logged In: YES 
user_id=3066

I should note that this works with both 2.1.1 and 2.2,
though this is not a bugfix.

----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=430948&group_id=5470


From noreply@sourceforge.net  Thu Jun  7 23:01:57 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Thu, 07 Jun 2001 15:01:57 -0700
Subject: [Patches] [ python-Patches-431257 ] profile/trace dispatch speed-up
Message-ID: <E1587qn-00048h-00@usw-sf-web3.sourceforge.net>

Patches item #431257, was updated on 2001-06-07 14:44
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=431257&group_id=5470

Category: core (C code)
Group: None
Status: Open
>Resolution: Accepted
Priority: 5
Submitted By: Fred L. Drake, Jr. (fdrake)
>Assigned to: Fred L. Drake, Jr. (fdrake)
Summary: profile/trace dispatch speed-up

Initial Comment:
The profile and trace functions take a string as one of
their parameters, where the value of the string is one
of exactly four values.  Unfortunately, a new string
object is created for each call to the profile/trace
functions, and is not interned.

This patch modifies ceval.c so the string object for
each of these values is created only once and is
interned, allowing faster dictionary lookups in the
profile/trace functions.  This avoids a lot of string
creation overhead for calling these functions, and can
help the standard profiler work faster by using
interned string objects.

----------------------------------------------------------------------

>Comment By: Tim Peters (tim_one)
Date: 2001-06-07 15:01

Message:
Logged In: YES 
user_id=31435

Accepted, and back to Fred "The Interner" Drake, Jr.

----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=431257&group_id=5470


From noreply@sourceforge.net  Fri Jun  8 00:28:55 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Thu, 07 Jun 2001 16:28:55 -0700
Subject: [Patches] [ python-Patches-427749 ] Patch for bug #419390 (base64.py)
Message-ID: <E1589Cx-0002AA-00@usw-sf-web2.sourceforge.net>

Patches item #427749, was updated on 2001-05-27 11:35
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=427749&group_id=5470

Category: library
Group: None
>Status: Closed
>Resolution: Duplicate
Priority: 5
Submitted By: Kalle Svensson (krftkndl)
Assigned to: Peter Schneider-Kamp (nowonder)
Summary: Patch for bug #419390 (base64.py)

Initial Comment:
Improves performance of base64.encodestring and
base64.decodestring by avoiding StringIO and using
binascii directly.

----------------------------------------------------------------------

>Comment By: Peter Schneider-Kamp (nowonder)
Date: 2001-06-07 16:28

Message:
Logged In: YES 
user_id=14463

Tried something similar. Slower than Tim's version, though.

Already checked that one in. Closing as Duplicate.

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2001-06-07 13:07

Message:
Logged In: YES 
user_id=31435

Assigned to Peter since it appears to compete with his 
patch.  Peter, I expect your patch is quicker.  If you 
agree and check in your patch, close this as Duplicate (or 
something).

----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=427749&group_id=5470


From noreply@sourceforge.net  Fri Jun  8 00:30:11 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Thu, 07 Jun 2001 16:30:11 -0700
Subject: [Patches] [ python-Patches-430846 ] faster string-decoding in base64.py
Message-ID: <E1589EB-0002BU-00@usw-sf-web2.sourceforge.net>

Patches item #430846, was updated on 2001-06-06 14:14
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=430846&group_id=5470

Category: library
Group: None
>Status: Closed
Resolution: Accepted
Priority: 3
Submitted By: Peter Schneider-Kamp (nowonder)
Assigned to: Peter Schneider-Kamp (nowonder)
Summary: faster string-decoding in base64.py

Initial Comment:
This addresses bug #419390 by anthonybaxter.

Instead of wrapping a string-to-be-decoded into a
StringIO class and using base64.decode use
binascii.a2b_base64 directly.

Speedup for big files is over 10 times (on Linux x86
anyway).
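
In present-day Python the same approach looks roughly like this. It is a sketch, not the checked-in code; the function names are invented, and the 57-byte chunk size is chosen because it encodes to one 76-character line:

```python
import binascii

CHUNK = 57  # 57 input bytes encode to exactly 76 base64 characters


def encode_bytes(data):
    """Encode directly with binascii, one output line per 57-byte chunk."""
    pieces = []
    for i in range(0, len(data), CHUNK):
        pieces.append(binascii.b2a_base64(data[i:i + CHUNK]))
    return b"".join(pieces)


def decode_bytes(data):
    """Decode in a single call, with no file-like wrapper in between."""
    return binascii.a2b_base64(data)
```

Both directions work on the whole in-memory value, so the per-line read and write calls of a file-like object are avoided.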

If uncontroversial I'll check it in.

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2001-06-07 11:42

Message:
Logged In: YES 
user_id=31435

Accepted and assigned back to Peter for checkin.  Don't see 
how this could be controversial -- it's simple and 
appropriate.

----------------------------------------------------------------------

Comment By: Peter Schneider-Kamp (nowonder)
Date: 2001-06-06 23:07

Message:
Logged In: YES 
user_id=14463

Mhh, I did click that "Check to Upload & Attach File" thing.

No matter what, here is the new version (including your
speedup for encodestring).

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2001-06-06 14:25

Message:
Logged In: YES 
user_id=31435

Umm -- there's no patch here.  If there were, I bet I would 
have changed this to Accepted, though <wink>.

----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=430846&group_id=5470


From noreply@sourceforge.net  Fri Jun  8 05:26:33 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Thu, 07 Jun 2001 21:26:33 -0700
Subject: [Patches] [ python-Patches-430948 ] Performance improvement for profiler
Message-ID: <E158Dqz-0006SY-00@usw-sf-web2.sourceforge.net>

Patches item #430948, was updated on 2001-06-06 22:29
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=430948&group_id=5470

Category: library
Group: None
>Status: Closed
Resolution: Accepted
Priority: 5
Submitted By: Fred L. Drake, Jr. (fdrake)
Assigned to: Fred L. Drake, Jr. (fdrake)
Summary: Performance improvement for profiler

Initial Comment:
This patch adds a bit of complexity to
Profile.__init__() in an effort to reduce the overhead
of the profiler.  The essential piece of the puzzle is
that the general Profile.get_time() method is replaced
with a function which does only as much as is needed
for the underlying timer.  For example, if time.clock()
is available, it can become a PyCFunction instead of a
bound method, and then requires only 1 dict lookup to execute
instead of the 11 it takes to execute get_time()
without this patch.

Also removes a couple of duplicate imports from the "if
__name__ == ..." section.


----------------------------------------------------------------------

>Comment By: Fred L. Drake, Jr. (fdrake)
Date: 2001-06-07 21:26

Message:
Logged In: YES 
user_id=3066

Checked in with the suggested modification.  This is
Lib/profile.py revision 1.28.


----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2001-06-07 14:54

Message:
Logged In: YES 
user_id=31435

Accepted and back to Fred, with the caveat we talked about 
that __init__ should still do the right thing with a 
passed-in timer returning an arbitrary sequence-like object of 
number-like objects <wink -- i.e., the "reduce" business>.


----------------------------------------------------------------------

Comment By: Fred L. Drake, Jr. (fdrake)
Date: 2001-06-07 14:37

Message:
Logged In: YES 
user_id=3066

I've attached a revised patch with the suggested changes,
plus a few more.  This is more aggressive about avoiding
dictionary lookups, and the dispatch table no longer
contains bound methods -- using plain functions with self
passed as an explicit argument is faster as it avoids more
of Python's call machinery, and avoids circular references.

This patch also attempts not to add any breakage to the
OldProfile and HotProfile classes.

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2001-06-07 12:39

Message:
Logged In: YES 
user_id=31435

Fine by me (and good idea!).  I'd rather see get_time_mac 
be a module-level function _get_time_mac, get_time_timer a 
module-level _get_time_timer (or, better, _get_time_list), 
and get_time_times a module-level function _get_time_times; 
and in the last case without the needless expense of
reduce():

.def _get_time_times(times=os.times):
.    t = times()
.    return t[0] + t[1]

----------------------------------------------------------------------

Comment By: Fred L. Drake, Jr. (fdrake)
Date: 2001-06-06 22:30

Message:
Logged In: YES 
user_id=3066

I should note that this works with both 2.1.1 and 2.2,
though this is not a bugfix.

----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=430948&group_id=5470


From noreply@sourceforge.net  Fri Jun  8 05:33:38 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Thu, 07 Jun 2001 21:33:38 -0700
Subject: [Patches] [ python-Patches-431257 ] profile/trace dispatch speed-up
Message-ID: <E158Dxq-0006XG-00@usw-sf-web2.sourceforge.net>

Patches item #431257, was updated on 2001-06-07 14:44
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=431257&group_id=5470

Category: core (C code)
Group: None
>Status: Closed
Resolution: Accepted
Priority: 5
Submitted By: Fred L. Drake, Jr. (fdrake)
Assigned to: Fred L. Drake, Jr. (fdrake)
Summary: profile/trace dispatch speed-up

Initial Comment:
The profile and trace functions take a string as one of
their parameters, where the value of the string is one
of exactly four values.  Unfortunately, a new string
object is created for each call to the profile/trace
functions, and is not interned.

This patch modifies ceval.c so the string object for
each of these values is created only once and is
interned, allowing faster dictionary lookups in the
profile/trace functions.  This avoids a lot of string
creation overhead for calling these functions, and can
help the standard profiler work faster by using
interned string objects.
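
A rough Python-level illustration of why interning helps on the receiving
side: when both the dictionary key and the lookup key are the same interned
object, CPython's dictionary lookup can succeed on an identity check without
comparing characters. The four event names below are an assumption (the
comment above only says "exactly four values"), and make_dispatcher is a
hypothetical helper, not code from the patch.

```python
import sys

# Assumed event names; the tracker comment does not list them.
EVENT_NAMES = tuple(sys.intern(name)
                    for name in ("call", "exception", "line", "return"))


def make_dispatcher(handlers):
    """Build a profile-style callback that dispatches on the event name."""
    # Intern the keys once, so lookups with interned event strings can
    # match by identity.
    table = {sys.intern(name): fn for name, fn in handlers.items()}

    def dispatch(frame, event, arg):
        handler = table.get(event)
        if handler is not None:
            return handler(frame, arg)
        return None

    return dispatch
```

A callback built this way has the (frame, event, arg) shape that
sys.setprofile() and sys.settrace() expect.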

----------------------------------------------------------------------

>Comment By: Fred L. Drake, Jr. (fdrake)
Date: 2001-06-07 21:33

Message:
Logged In: YES 
user_id=3066

Checked in as Python/ceval.c revision 2.246.


----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2001-06-07 15:01

Message:
Logged In: YES 
user_id=31435

Accepted, and back to Fred "The Interner" Drake, Jr.

----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=431257&group_id=5470


From noreply@sourceforge.net  Fri Jun  8 16:54:36 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Fri, 08 Jun 2001 08:54:36 -0700
Subject: [Patches] [ python-Patches-431422 ] "print" not emitting POP_TOP
Message-ID: <E158Oaq-0003yC-00@usw-sf-web3.sourceforge.net>

Patches item #431422, was updated on 2001-06-08 08:54
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=431422&group_id=5470

Category: Parser/Compiler
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Shane Hathaway (hathawsh)
Assigned to: Nobody/Anonymous (nobody)
Summary: "print" not emitting POP_TOP

Initial Comment:
The Python-based compiler module (in Tools) has a bug 
in the visitPrint() method of 
pycodegen.CodeGenerator.  It does not emit a trailing 
POP_TOP instruction, which AFAICT it should emit only 
when outputting to a stream and there is a trailing 
comma (indicating no newline).  I've attached the 
patch applied to Zope's RestrictedPython module; if 
there is anything incorrect about it please tell me 
right away.  Otherwise please apply the patch to 
Tools/compiler/pycodegen.py.


----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=431422&group_id=5470


From noreply@sourceforge.net  Fri Jun  8 17:29:02 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Fri, 08 Jun 2001 09:29:02 -0700
Subject: [Patches] [ python-Patches-426208 ] Fun with Floating Point
Message-ID: <E158P8A-0001PS-00@usw-sf-web2.sourceforge.net>

Patches item #426208, was updated on 2001-05-22 01:06
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=426208&group_id=5470

Category: documentation
Group: None
>Status: Closed
>Resolution: Accepted
Priority: 5
Submitted By: Tim Peters (tim_one)
Assigned to: Fred L. Drake, Jr. (fdrake)
Summary: Fun with Floating Point

Initial Comment:
I suggest this as an Appendix.  For Michel Pelletier's 
benefit, it contains no equation <wink>.  Alas for 
you, for my benefit it contains no LaTeX markup 
either.  Season to taste!


----------------------------------------------------------------------

>Comment By: Fred L. Drake, Jr. (fdrake)
Date: 2001-06-08 09:29

Message:
Logged In: YES 
user_id=3066

Checked in as Doc/tut/tut.tex revision 1.137.


----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2001-05-23 15:02

Message:
Logged In: YES 
user_id=31435

New text, with improved wording, and an un-Wikized version 
of the RepresentationError page at the end as a new section.

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2001-05-23 15:00

Message:
Logged In: YES 
user_id=31435

Deleted the attachment in preparation for uploading a new 
one.

----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=426208&group_id=5470


From noreply@sourceforge.net  Fri Jun  8 22:37:50 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Fri, 08 Jun 2001 14:37:50 -0700
Subject: [Patches] [ python-Patches-429614 ] pythonpath and optimize def. before init
Message-ID: <E158Tx0-0001Yv-00@usw-sf-web3.sourceforge.net>

Patches item #429614, was updated on 2001-06-02 08:56
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=429614&group_id=5470

Category: core (C code)
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Frederic Giacometti (giacometti)
Assigned to: Nobody/Anonymous (nobody)
Summary: pythonpath and optimize def. before init

Initial Comment:

A) Addition of four functions
=====================

Py_{Set, Get}{PythonPath, OptimizeLevel}()
with the same semantics as Py_{Set, Get}ProgramName()

(Note: the C ANSI type 'char const*' is used to describe non-modifiable strings)

These four functions are needed in the next JPE runtime (Python 2.1 patch included in the 
distribution); this allows setting the PYTHONPATH and optimize level from Java property values.


B) Option '-P pythonpath' on the Python command line:
========================================

This option defines 'pythonpath' from the command line (and overrides the PYTHONPATH 
environment variable if necessary).

Usefulness: Sometimes, one does not want to rely on the environment variables, or modify them.

Sample application: Running build and test scripts in full control of the environment, and with 
different PYTHONPATH values.

This option is needed by the build and test scripts of the next JPE source distribution (Python 2.1 
patch included in the distribution).

Frederic Giacometti
fred@arakne.com


----------------------------------------------------------------------

>Comment By: Frederic Giacometti (giacometti)
Date: 2001-06-08 14:37

Message:
Logged In: YES 
user_id=93657


1) PEP: I am not in python-dev. What is the procedure for opening the PEP?

2) Override: I thought about the question. My response was:
If you want concatenation, use:
   python -P "something:$PYTHONPATH"
or
  python -P "$PYTHONPATH:something"
That's all for the better...

3) I renamed Py_{Set,Get}OptimizeFlag to Py_{Set,Get}OptimizeLevel after I wrote the documentation. Glad 
you caught the typo :)), sorry :((
I changed 'Flag' to 'Level' because 'Flag' normally designates a binary variable (2 states) whereas what we 
are doing is actually defining a debugging level (3 levels as of now, but who knows whether some more levels 
might be added).
'OptimizeLevel' is more accurate and less ambiguous than 'OptimizeFlag'.
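
The concatenation described in point 2 is plain path-list joining. As a
minimal sketch of the same idea from inside a script, assuming a hypothetical
helper named combine_pythonpath (it is not part of the patch) and the
platform's own path separator:

```python
import os


def combine_pythonpath(extra, existing, prepend=True):
    """Join `extra` with an existing PYTHONPATH-style string.

    Empty components are dropped so that an unset PYTHONPATH does not
    produce a leading or trailing separator.
    """
    parts = [extra, existing] if prepend else [existing, extra]
    return os.pathsep.join(p for p in parts if p)
```

Passing prepend=False gives the second form shown above, with the existing
value first.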

Frederic Giacometti


----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2001-06-07 12:58

Message:
Logged In: YES 
user_id=21627

I think a PEP describing the exact rationale and nature of 
the change is required here. For example, why is it good 
that -P overrides PYTHONPATH, instead of combining both 
somehow?

Also, the documentation talks about Py_GetOptimizeLevel, 
whereas the header declares Py_GetOptimizeFlag.


----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=429614&group_id=5470


From noreply@sourceforge.net  Sat Jun  9 08:40:08 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Sat, 09 Jun 2001 00:40:08 -0700
Subject: [Patches] [ python-Patches-424475 ] Speed-up tp_compare usage
Message-ID: <E158dLs-00084y-00@usw-sf-web2.sourceforge.net>

Patches item #424475, was updated on 2001-05-16 01:07
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=424475&group_id=5470

Category: core (C code)
Group: None
>Status: Closed
Resolution: Accepted
Priority: 5
Submitted By: Martin v. Löwis (loewis)
>Assigned to: Nobody/Anonymous (nobody)
Summary: Speed-up tp_compare usage

Initial Comment:
This patch tries to optimize PyObject_RichCompare for
the common case of objects with equal types which
support tp_compare. It gives a speed-up of roughly 7%
for comparing strings in a loop.

The patch also gives type objects a tp_compare
function, so that they can make use of the improvement.

----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2001-06-09 00:40

Message:
Logged In: YES 
user_id=21627

Committed as object.c 2.132, typeobject.c 2.17, 
UserList.py 1.17.


----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2001-06-07 13:19

Message:
Logged In: YES 
user_id=31435

Accepted and assigned back to Martin.  This is too valuable 
to quibble over.  Note that when calling a tp_compare slot, 
this kind of thing:

.	c = (*f)(v, w);
.	if (PyErr_Occurred())

is better spelled:

.	c = (*f)(v, w);
.	if (c < 0 && PyErr_Occurred())


----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2001-05-21 09:57

Message:
Logged In: YES 
user_id=21627

The revised patch prefers tp_compare over tp_richcompare in
do_cmp if both are available. It also restores
UserList.__cmp__ from deprecation.

----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=424475&group_id=5470


From noreply@sourceforge.net  Sat Jun  9 20:15:31 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Sat, 09 Jun 2001 12:15:31 -0700
Subject: [Patches] [ python-Patches-401196 ] IPv6 patch against 2.0 CVS tree, as of 20001230
Message-ID: <E158oCp-0002Wj-00@usw-sf-web2.sourceforge.net>

Patches item #401196, was updated on 2000-08-16 05:53
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=401196&group_id=5470

Category: core (C code)
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Jun-ichiro itojun Hagino (itojun)
Assigned to: Nobody/Anonymous (nobody)
Summary: IPv6 patch against 2.0 CVS tree, as of 20001230

Initial Comment:
 

----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2001-06-09 12:15

Message:
Logged In: YES 
user_id=21627

On the API, I have the following comments: 
- Why is it necessary to introduce gethostbyname2? I 
recommend to give gethostbyname an optional argument for 
the address family.

- getaddrinfo, when raising a socket error, should include 
the EAI_ error number. Perhaps there should be a way to 
distinguish EAI_ errnos from other errnos, e.g. by 
subclassing socket error.
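
For illustration only: later versions of the standard library do separate
these cases, with socket.gaierror as a subclass of OSError whose errno
attribute carries an EAI_* code. The sketch below shows how a caller can tell
the two kinds of error number apart; describe_resolution_error is a
hypothetical helper name, and none of this is part of the patch under review.

```python
import socket


def describe_resolution_error(exc):
    """Classify an exception by the kind of error number it carries."""
    if isinstance(exc, socket.gaierror):
        # exc.errno is an EAI_* code from getaddrinfo()/getnameinfo(),
        # not a C library errno value.
        return ("eai", exc.errno)
    if isinstance(exc, OSError):
        return ("errno", exc.errno)
    return ("other", None)
```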

Otherwise, the API of the C part looks good to me. I 
haven't looked at the Lib part, yet.

On the implementation:
- I still have problems building the code. Currently, I 
get the following rejects:
./Lib/BaseHTTPServer.py.rej
./Lib/ftplib.py.rej
./Lib/poplib.py.rej
./Lib/smtplib.py.rej
./Modules/socketmodule.c.rej
./Objects/fileobject.c.rej

- The fileobject.c chunk seems to be unnecessary.

- On the test problem: It occurs in
+ test -d -a -f /lib.a
./configure: test: too many arguments
which comes from ipv6libdir and ipv6lib being empty.

- The WIDE files should be included in the Modules 
directory, as they are only used from socketmodule.c. In 
particular, addrinfo.h should not be installed.

- If you can, please include a patch to 
Doc/lib/libsocket.tex. If not, I will try to draft one.


----------------------------------------------------------------------

Comment By: Nobody/Anonymous (nobody)
Date: 2001-05-30 10:34

Message:
Logged In: NO 

i looked at python-dev email.  the proposal (split patches)
looks fine, but the exact example given in python-dev email
is not reasonable.  i cannot just send out configure.in
change separately from source code changes, period.  i can
split patches for *.py files separately though.

there's more important issue, which is, API changes for
Socket class.  i really hoped to get some comment on that
part.  i really appreciate your comments.
i would like to propose that once we nailed down API
changes, integrate the patch into the tree.
with all #ifdef INET6 in place there should be no impact on
IPv4-only builds.

i have trouble tracking python development (i'm not a
sourceforge expert!), so forgive me for delays in patch
submissions.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2001-05-18 08:29

Message:
Logged In: YES 
user_id=6380

See
http://mail.python.org/pipermail/python-dev/2001-May/014889.html
for comments from MvL.

I'm unassigning this from Fred, he has nothing to do with
this.

----------------------------------------------------------------------

Comment By: Jun-ichiro itojun Hagino (itojun)
Date: 2001-02-26 02:24

Message:
Logged In: YES 
user_id=63767

about /usr/bin/test argument: does linux /usr/bin/test have
-d <dir> support?  if not, we may need to change
configure.in slightly.

you are correct that fallback getaddrinfo/getnameinfo.c was
missing in the patch.  sorry.  a question i need to ask is,
do we need to supply Python function Socket.getaddrinfo on
platforms that do not have getaddrinfo(3)?

HAVE_ADDRINFO is used in Include/addrinfo.h, which is also
missing in the patch set i have submitted.

i've put the missing files into
http://www.itojun.org/diary/20001230/missing.shar.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2001-02-23 23:58

Message:
After a shallow review of this patch, I found the following issues:

configure.in does not need to list both enable and disable options.
 
When running configure, I got the following error message on Linux
checking whether to enable ipv6... yes
checking ipv6 stack type... linux-glibc
./configure: test: too many arguments
using libc

The call to /usr/bin/test should be corrected; I could not find out which specific  invocation caused the problem.

HAVE_ADDRINFO is not used. Perhaps getaddrinfo.c/getnameinfo.c is missing in the patch?


----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2001-01-04 07:51

Message:
A new patch is available.  I've changed the subject accordingly.

Due to upload size restrictions, the patch is now at

http://www.itojun.org/diary/20001230/python-2.0-v6-20001230.diff.gz

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2000-12-30 07:25

Message:
I got *many* rejects when trying to apply this patch to today's CVS tree. I recommend that patches for generated files (config.h.in, configure) are not included in the patch because they outdate too easily.
A number of changes in this patch have already been done by somebody else; others just don't fit into the current code anymore (perhaps due to indentation changes?).
Anyway, I'll mark the patch as out-of-date. Please let me know when you upload a new version.

----------------------------------------------------------------------

Comment By: Fred L. Drake, Jr. (fdrake)
Date: 2000-08-16 07:00

Message:
Postponed until Python 2.1 -- there's not enough time to review this and get it sufficiently tested on enough IPv6-connected platforms in time for 2.0, and we're already in feature freeze.  This should go into the tree very quickly once Python 2.0 has been released.

Assigned to myself to open it back up after Python 2.0.

----------------------------------------------------------------------

Comment By: Moshe Zadka (moshez)
Date: 2000-08-16 06:07

Message:
Assigned to Tim, since he's in charge of postponing
new features. I'm too timid to postpone it myself.

----------------------------------------------------------------------

Comment By: Jun-ichiro itojun Hagino (itojun)
Date: 2000-08-16 05:59

Message:
this is revised version of patch #101186 (now with my SourceForge account...
i'm not familiar with the system here, so forgive my possible mistake).

1.6b1 patch applied mostly clean to 2.0.
It is confirmed that:
- 1.6b1 + IPv6 patch works fine on NetBSD 1.4.2 + KAME, and NetBSD 1.5
- 1.6b1 + IPv6 patch works fine on NetBSD 1.4.2 (NOT an IPv6 ready machine)
- 2.0 CVS tree + IPv6 patch works fine on NetBSD + KAME

forgot to attach the following into the diff - so i attach it (README.v6)
here as comment.  I have submitted the patch for 1.5.1, 1.5.2 and 1.6b1,
all hit a bad timing - bad luck.

contact: core@kame.net, or itojun@kame.net


---
IPv6-ready python 1.6
KAME Project
$KAME: README.v6,v 1.9 2000/08/15 02:40:38 itojun Exp $


This patchkit enables python 1.6 to perform AF_INET6 socket operations.
The only affected module is Modules/socketmodule.c.

Modules/socketmodule.c
	In most cases, IPv6 address can be placed where IPv4 address fits.

    sockaddr
	sockaddr tuple is formatted as follows:
	    IPv4: (host, port)
	    IPv6: socket class methods always generate
		    (host, port, flowinfo, scopeid).
		  socket class methods will accept 2, 3, or 4 tuple
		  (for backward compatibility).

	Compatibility warning: Some of the scripts assume that the sockaddr
	structure is 2 tuple, like:
	    host, port = sock.getpeername()
	this will fail if you are connected to IPv6 node.
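
One way a script can stay compatible with both sockaddr shapes is to slice
before unpacking. This is a minimal sketch; split_sockaddr is a hypothetical
helper name, not something the patch provides.

```python
def split_sockaddr(sockaddr):
    """Return (host, port) from an IPv4 2-tuple or an IPv6 4-tuple."""
    # Slicing discards flowinfo and scopeid when they are present.
    host, port = sockaddr[:2]
    return host, port
```

With this, `host, port = split_sockaddr(sock.getpeername())` works whichever
address family the peer uses.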

    socket.getaddrinfo(host, port [, family, socktype, proto, flags])
	host: String or None
	port: String, Int or None
	family, socktype, proto, flags: Int, can be omitted

	Perform getaddrinfo(3).  Returns List of the following 5 tuple:
	    (family, socktype, proto, canonname, sockaddr)
	    family: Int
	    socktype: Int
	    proto: Int
	    canonname: String
	    sockaddr: sockaddr (see above)

	See Lib/httplib.py for typical usage on the client side.
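
As a rough sketch of the client-side pattern (this is not the code from
Lib/httplib.py): resolve the name once, then try each returned address in
order until one connects. The helper names candidate_addresses and
connect_first are hypothetical, and the call uses the getaddrinfo signature
found in today's socket module.

```python
import socket


def candidate_addresses(host, port):
    """Return (family, socktype, proto, sockaddr) for each stream address."""
    results = socket.getaddrinfo(host, port,
                                 socket.AF_UNSPEC, socket.SOCK_STREAM)
    return [(family, socktype, proto, sockaddr)
            for family, socktype, proto, canonname, sockaddr in results]


def connect_first(host, port):
    """Try each candidate in order; return the first connected socket."""
    last_error = None
    for family, socktype, proto, sockaddr in candidate_addresses(host, port):
        try:
            sock = socket.socket(family, socktype, proto)
        except OSError as exc:
            # This address family may not be supported locally.
            last_error = exc
            continue
        try:
            sock.connect(sockaddr)
        except OSError as exc:
            sock.close()
            last_error = exc
            continue
        return sock
    raise last_error or OSError("no addresses returned for %r" % (host,))
```

Because the socket is created from the family in each result, the same loop
handles IPv4-only, IPv6-only and dual-stack hosts.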

    socket.getnameinfo(sockaddr, flags)
	sockaddr: sockaddr
	flags: Int

	Perform getnameinfo(3).  Returns the following 2 tuple:
	    host: String, numeric or hostname depending on flags
	    port: String, numeric or portname depending on flags
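
A short example of the numeric case, using the flag constants from today's
socket module; numeric_name is a hypothetical wrapper name.

```python
import socket


def numeric_name(sockaddr):
    """Return (host, port) as numeric strings, with no name lookups."""
    flags = socket.NI_NUMERICHOST | socket.NI_NUMERICSERV
    return socket.getnameinfo(sockaddr, flags)
```

Without those flags, the call tries to map the address to a hostname and the
port to a service name.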

    socket.gethostbyname2(host, af)
	host: String
	af: Int

	Performs gethostbyname2(3).  Returns numeric address representation
	for "host".

    socket.gethostbyaddr(addr) (behavior change if IPv6 support is compiled in)
	addr: String

	Performs gethostbyaddr(3).  Returns string address representation for
	"addr".

	The function can take IPv6 numeric address as well.  This behavior
	is not problematical, because
	- if you pass numeric "addr" parameter, we can always identify address
	  family for it
	- return value is string address representation, where IPv6 and IPv4
	  are not distinguishable.

     socket.bind(sa), socket.connect(sa) and others.
	(No behavior change, but be careful)

	See above for sockaddr format change.

	With Python "addr" portion of sockaddr (first element) can be string
	hostname.  When the string hostname resolved to numeric address, it
	will obey address family of the socket (which was specified when
	socket.socket() was called).
	If you give some string that does not give matching address family,
	you will get some error.
		s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
		# this is okay, if 'localhost' resolves to both IPv4/v6
		s.connect(('localhost', 80))
		# this is not okay, of course
		s.connect(('::1', 80))
		# this is not okay, as v6only.kame.net will not resolve to IPv4
		s.connect(('v6only.kame.net', 80))

Lib/httplib.py
	IPv6 ready.  "host" in HTTP(host) will accept the following 3 forms:
		[host]:port
		host:port	there must be only single colon
		host
	This is to allow IPv6 numeric URL (http://[host]:port/) in documents.

	IMHO "host:port" parsing should be implemented in urllib.py, not here.
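
The three accepted forms can be told apart with a few string checks. This is
a minimal sketch under assumptions: split_hostport and its default_port
parameter are hypothetical, and this is not the parsing code from the patched
httplib.py.

```python
def split_hostport(value, default_port=80):
    """Split '[host]:port', 'host:port' or 'host' into (host, port)."""
    if value.startswith("["):
        end = value.find("]")
        if end < 0:
            raise ValueError("missing ']' in %r" % (value,))
        host, rest = value[1:end], value[end + 1:]
        if not rest:
            return host, default_port
        if not rest.startswith(":"):
            raise ValueError("unexpected text after ']' in %r" % (value,))
        return host, int(rest[1:])
    if value.count(":") == 1:
        # Exactly one colon: the unbracketed host:port form.
        host, port = value.split(":")
        return host, int(port)
    if ":" in value:
        raise ValueError("IPv6 literal needs brackets: %r" % (value,))
    return value, default_port
```

The single-colon rule is what makes brackets necessary for numeric IPv6
addresses, which contain several colons themselves.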

Lib/ftplib.py
	IPv6 ready.  This uses EPSV/EPRT on IPv6 ftp.  See RFC2428 for
	protocol details.

Lib/SocketServer.py
	IPv6 ready.  Wildcard bind on TCPServer, i.e. TCPServer(('', port)),
	will bind to wildcard socket on TCPServer.address_family.
	TCPServer.address_family is set to AF_INET by default, so ('', port)
	will usually bind AF_INET.
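
Switching the wildcard bind to IPv6 is then a matter of overriding that class
attribute. A minimal sketch, written against today's module name socketserver
(SocketServer in the tree this README describes); TCPServer6 is a hypothetical
class name.

```python
import socket
import socketserver


class TCPServer6(socketserver.TCPServer):
    """TCPServer variant whose wildcard bind ('', port) uses IPv6."""

    address_family = socket.AF_INET6
```

An instance such as `TCPServer6(("", 8000), handler_class)` would then bind
the IPv6 wildcard address, provided the host has IPv6 support.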

Lib/smtplib.py, Lib/telnetlib.py, Lib/poplib.py
	IPv6 ready.  Not much to say about protocol details - they just use
	TCP over IPv6.

configure
	Configure has extra option, --enable-ipv6 and --disable-ipv6.
	The option controls IPv6 support feature.

dynamic link issues in Modules/socketmodule.c
	Modules/socketmodule.c can be dynamically loaded only in the following
	situations:
	- getaddrinfo(3) and getnameinfo(3) are supplied by OS vendor in
	  libc, and libc is dynamic link library.
	- OS vendor is NOT supplying getaddrinfo(3) nor getnameinfo(3), and
	  You are configuring this package with --disable-ipv6.  In this case,
	  you'll be using missing/get{addr,name}info.c and they will refer to
	  gethostby{name,addr}.  gethostby{name,addr} can usually be found
	  in dynamic-linking libc.

	In other situations, such as the following, please link
	Modules/socketmodule.c into python itself.
	- getaddrinfo(3) and getnameinfo(3) are supplied by OS vendor, but
	  they are in statically linked library like libinet6.a.
	  (KAME falls into this category)

	python usually links Modules/socketmodule.c into python itself
	(due to its popularity) so there should be no problem.

restrictions
	- The patched tree will not use gethostbyname_r and other
	  thread-ready libraries.  Instead, it will use getaddrinfo() and
	  getnameinfo() throughout the operation.

todo
	- Patch bunch of library files in Lib/*.py.

compatibility issues with existing scripts
	If you disable IPv6 support (./configure --disable-ipv6), the
	patched code is mostly compatible with original Python
	(except files in "Lib" directory modified for dual stack support).

	User script may choke if:
	- IPv4/v6 dualstack libc is supplied, python is compiled for dual
	  stack, and script assumes some of IPv4-only behavior (especially
	  sockaddr)
	- IPv4/v6 dualstack libc is supplied, python is compiled for IPv4 only,
	  and script assumes some of IPv4-only behavior.
	  In this case, Python socket class itself does not support IPv6,
	  however, name resolution functions can return IPv6 names since
	  they use IPv6-ready libc functions!  I do not recommend this
	  configuration.
	- script assumes certain IPv4-only version behavior in Lib/*.py.

compilation
	If you use IPv6 features, it is assumed that you have working
	getaddrinfo() and getnameinfo() library functions.  We have noticed
	that some of IPv6 stack is shipped with broken getaddrinfo().  In
	such cases, use missing/get{addr,name}info.c instead (but then, you
	need to have working getipnodeby{name,addr}).

	If you compile this on IPv4-only machine without get{addr,name}info,
	missing/get{addr,name}info.c will be used.  They are from KAME IPv6
	distribution and is #ifdef'ed for IPv4 only support.  They are
	fairly complete implementation and you don't need to bother with
	bind 8.2 (bind 8.2 get{addr,name}info() has bugs).

	When compiling this kit on IPv6 node, you may need to specify some
	additional library paths or cpp defs. (like -linet6 or -DINET6)
	--enable-ipv6 will give you some warning, if the IPv6 stack is unknown
	to the "configure" script.  Currently, the following IPv6 stacks
	are officially supported (i.e. we've checked that the package works
	well):
	- KAME IPv6 stack, http://www.kame.net/

References
	RFC2553, for getaddrinfo(3) and getnameinfo(3).

Author contacts
	http://www.kame.net/
	mailto:core@kame.net


----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=401196&group_id=5470


From noreply@sourceforge.net  Sat Jun  9 04:52:01 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Fri, 08 Jun 2001 20:52:01 -0700
Subject: [Patches] [ python-Patches-431848 ] mathmodule.c: doc strings & conversion
Message-ID: <E158Zn7-0001bi-00@usw-sf-web1.sourceforge.net>

Patches item #431848, was updated on 2001-06-08 20:52
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=431848&group_id=5470

Category: Modules
Group: None
Status: Open
Resolution: None
Priority: 3
Submitted By: Peter Schneider-Kamp (nowonder)
Assigned to: Tim Peters (tim_one)
Summary: mathmodule.c: doc strings & conversion

Initial Comment:
* more informative doc strings for mathmodule.c
* methods math.radians and math.degrees to convert
  between radians and degrees

This addresses feature request #426539.

Suggestions for better names (deg2rad instead of
radians?) or better doc strings will be met
enthusiastically.
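
For reference, the conversion itself is a single multiplication. The sketch
below writes it out by hand under the alternative names floated above
(deg2rad, plus rad2deg as its assumed counterpart); the math module in current
Python releases provides the same operations as math.radians and math.degrees.

```python
import math


def deg2rad(degrees):
    """Convert an angle from degrees to radians."""
    return degrees * math.pi / 180.0


def rad2deg(radians):
    """Convert an angle from radians to degrees."""
    return radians * 180.0 / math.pi
```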

----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=431848&group_id=5470


From noreply@sourceforge.net  Mon Jun 11 06:53:03 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Sun, 10 Jun 2001 22:53:03 -0700
Subject: [Patches] [ python-Patches-429614 ] pythonpath and optimize def. before init
Message-ID: <E159KdL-0001xc-00@usw-sf-web2.sourceforge.net>

Patches item #429614, was updated on 2001-06-02 08:56
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=429614&group_id=5470

Category: core (C code)
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Frederic Giacometti (giacometti)
Assigned to: Nobody/Anonymous (nobody)
Summary: pythonpath and optimize def. before init

Initial Comment:

A) Addition of four functions
=====================

Py_{Set, Get}{PythonPath, OptimizeLevel}()
with the same semantics as Py_{Set, Get}ProgramName()

(Note: the C ANSI type 'char const*' is used to describe non-modifiable strings)

These four functions are needed in the next JPE runtime (Python 2.1 patch included in the 
distribution); this allows setting the PYTHONPATH and optimize level from Java property values.


B) Option '-P pythonpath' on the Python command line:
========================================

This option defines 'pythonpath' from the command line (and overrides the PYTHONPATH 
environment variable if necessary).

Usefulness: Sometimes, one does not want to rely on the environment variables, or modify them.

Sample application: Running build and test scripts in full control of the environment, and with 
different PYTHONPATH values.

This option is needed by the build and test scripts of the next JPE source distribution (Python 2.1 
patch included in the distribution).

Frederic Giacometti
fred@arakne.com


----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2001-06-10 22:53

Message:
Logged In: YES 
user_id=21627

You can find the PEP guidelines in PEP 1:
http://python.sourceforge.net/peps/pep-0001.html


----------------------------------------------------------------------

Comment By: Frederic Giacometti (giacometti)
Date: 2001-06-08 14:37

Message:
Logged In: YES 
user_id=93657


1) PEP: I am not in python-dev. What is the procedure for opening the PEP?

2) Override: I thought about the question. My response was:
If you want concatenation, use:
   python -P "something:$PYTHONPATH"
or
   python -P "$PYTHONPATH:something"
That's all for the better...

3) I renamed Py_{Set,Get}OptimizeFlag to Py_{Set,Get}OptimizeLevel after I wrote the documentation. Glad 
you caught the typo :)), sorry :((
I changed 'Flag' to 'Level' because 'Flag' normally designates a binary variable (2 states) whereas what we 
are doing is actually defining a debugging level (3 levels as of now, but who knows whether some more levels 
might be added).
'OptimizeLevel' is more accurate and less ambiguous than 'OptimizeFlag'.

Frederic Giacometti


----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2001-06-07 12:58

Message:
Logged In: YES 
user_id=21627

I think a PEP describing the exact rationale and nature of 
the change is required here. For example, why is it good 
that -P overrides PYTHONPATH, instead of combining both 
somehow?

Also, the documentation talks about Py_GetOptimizeLevel, 
whereas the header declares Py_GetOptimizeFlag.


----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=429614&group_id=5470


From noreply@sourceforge.net  Mon Jun 11 16:24:57 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Mon, 11 Jun 2001 08:24:57 -0700
Subject: [Patches] [ python-Patches-432117 ] Updated PullDOM patch
Message-ID: <E159TYn-0005TH-00@usw-sf-web3.sourceforge.net>

Patches item #432117, was updated on 2001-06-11 08:24
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=432117&group_id=5470

Category: XML
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Nobody/Anonymous (nobody)
Assigned to: Nobody/Anonymous (nobody)
Summary: Updated PullDOM patch

Initial Comment:
Martin, here is an updated patch (see # 423394) for pulldom.py that 
follows the DOM REC regarding namespace declaration attribute 
handling.

In short: namespace declaration attributes are now preserved, and 
the namespaceURI of a namespace decl attribute is 
"http://www.w3.org/2000/xmlns/". The localName is the prefix to 
be mapped, unless it is a plain "xmlns" (default ns declaration), in 
which case the localName is just "xmlns".
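
The attribute handling described above can be checked with a short script. This is a sketch against the pulldom module shipped with a current Python, which is assumed here to behave as the patch describes; namespace_declarations and XMLNS_URI are names made up for the illustration.

```python
from xml.dom import pulldom

# Namespace URI that the DOM REC assigns to namespace declaration attributes.
XMLNS_URI = "http://www.w3.org/2000/xmlns/"


def namespace_declarations(xml_text):
    """Return {localName: value} for the namespace declaration
    attributes found on the first element of xml_text."""
    for event, node in pulldom.parseString(xml_text):
        if event == pulldom.START_ELEMENT:
            attrs = node.attributes
            found = {}
            for index in range(attrs.length):
                attr = attrs.item(index)
                if attr.namespaceURI == XMLNS_URI:
                    found[attr.localName] = attr.value
            return found
    return {}


if __name__ == "__main__":
    print(namespace_declarations('<doc xmlns="urn:default" xmlns:p="urn:prefixed"/>'))
```

For a prefixed declaration such as xmlns:p the key is the prefix ("p"); for the default declaration the key is "xmlns".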

I've tested this with a Python 2.1 (final release) install. Let me know if 
you need anything else.

Thanks!

B. Lloyd

----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=432117&group_id=5470


From noreply@sourceforge.net  Mon Jun 11 21:00:39 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Mon, 11 Jun 2001 13:00:39 -0700
Subject: [Patches] [ python-Patches-432183 ] PEP-259: skip printing newline*2
Message-ID: <E159Xrb-00086b-00@usw-sf-web2.sourceforge.net>

Patches item #432183, was updated on 2001-06-11 13:00
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=432183&group_id=5470

Category: core (C code)
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Guido van Rossum (gvanrossum)
Assigned to: Nobody/Anonymous (nobody)
Summary: PEP-259: skip printing newline*2

Initial Comment:
See PEP 259 (to be checked in soon).

This suppresses the printing of an extra newline when
the last item printed is a string ending in a newline.
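
The rule can be sketched as a plain function. This is not the C change in the patch: print_items is a hypothetical name, the items are written to an explicit stream, and the softspace bookkeeping of the real print statement is left out.

```python
import io


def print_items(items, out):
    """Write items separated by spaces, then a newline, except that the
    newline is skipped when the last item is a string that already
    ends in a newline."""
    for index, item in enumerate(items):
        if index:
            out.write(" ")
        out.write(str(item))
    last = items[-1] if items else None
    if not (isinstance(last, str) and last.endswith("\n")):
        out.write("\n")


if __name__ == "__main__":
    buffer = io.StringIO()
    print_items(["line read from a file\n"], buffer)
    print_items(["no trailing newline"], buffer)
    print(repr(buffer.getvalue()))
```

The typical beneficiary is a loop that prints lines read from a file, which otherwise come out double-spaced unless the caller strips each line.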


----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=432183&group_id=5470


From noreply@sourceforge.net  Sun Jun 10 10:12:49 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Sun, 10 Jun 2001 02:12:49 -0700
Subject: [Patches] [ python-Patches-432183 ] PEP-259: skip printing newline*2
Message-ID: <E1591H7-0007fL-00@usw-sf-web1.sourceforge.net>

Patches item #432183, was updated on 2001-06-11 13:00
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=432183&group_id=5470

Category: core (C code)
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Guido van Rossum (gvanrossum)
Assigned to: Nobody/Anonymous (nobody)
Summary: PEP-259: skip printing newline*2

Initial Comment:
See PEP 259 (to be checked in soon).

This suppresses the printing of an extra newline when
the last item printed is a string ending in a newline.


----------------------------------------------------------------------

Comment By: Michael Hudson (mwh)
Date: 2001-06-10 02:12

Message:
Logged In: YES 
user_id=6656

I think you also want:

Index: code.py
===================================================================
RCS file: /cvsroot/python/python/dist/src/Lib/code.py,v
retrieving revision 1.16
diff -c -1 -r1.16 code.py
*** code.py     2001/05/03 04:58:49     1.16
--- code.py     2001/06/11 22:11:29
***************
*** 106,108 ****
          else:
!             if softspace(sys.stdout, 0):
                  print
--- 106,108 ----
          else:
!             if softspace(sys.stdout, 0) >= 0:
                  print

(not tested)


----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=432183&group_id=5470


From noreply@sourceforge.net  Tue Jun 12 07:23:26 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Mon, 11 Jun 2001 23:23:26 -0700
Subject: [Patches] [ python-Patches-432325 ] \versionadded{2.2} in libstruct.tex
Message-ID: <E159haI-0001Y2-00@usw-sf-web2.sourceforge.net>

Patches item #432325, was updated on 2001-06-11 23:23
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=432325&group_id=5470

Category: documentation
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Peter Funk (pefu)
Assigned to: Nobody/Anonymous (nobody)
Summary: \versionadded{2.2} in libstruct.tex

Initial Comment:
Tim Peters:
> Modified Files:
> 	libstruct.tex 
> Log Message:
> Added q/Q standard (x-platform 8-byte ints) mode in
struct module.
[...]

Hmmmm.... You probably forgot the \versionadded{2.2}
note? 


----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=432325&group_id=5470


From noreply@sourceforge.net  Tue Jun 12 14:43:03 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Tue, 12 Jun 2001 06:43:03 -0700
Subject: [Patches] [ python-Patches-432401 ] unicode encoding error callbacks
Message-ID: <E159oRj-0000YL-00@usw-sf-web2.sourceforge.net>

Patches item #432401, was updated on 2001-06-12 06:43
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=432401&group_id=5470

Category: core (C code)
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Walter Dörwald (doerwalter)
Assigned to: Nobody/Anonymous (nobody)
Summary: unicode encoding error callbacks

Initial Comment:
This patch adds unicode error handling callbacks to the
encode functionality. With this patch it's possible to
not only pass 'strict', 'ignore' or 'replace' as the
errors argument to encode, but also a callable
function, that will be called with the encoding name,
the original unicode object and the position of the
unencodable character. The callback must return a
replacement unicode object that will be encoded instead
of the original character.

For example replacing unencodable characters with XML
character references can be done in the following way.

u"aäoöuüß".encode(
   "ascii",
   lambda enc, uni, pos: u"&#x%x;" % ord(uni[pos])
)
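
For readers without the patch applied, the callback protocol described above can be imitated in pure Python. The sketch below assumes Python 3 strings and encodes one character at a time, so it is only reasonable for stateless encodings such as ASCII or latin-1; encode_with_callback and xml_charref are hypothetical names, not functions from the patch.

```python
def encode_with_callback(text, encoding, callback):
    """Encode text; for each unencodable character, call
    callback(encoding, text, pos) and encode the returned
    replacement string instead of the original character."""
    pieces = []
    for pos, char in enumerate(text):
        try:
            pieces.append(char.encode(encoding))
        except UnicodeEncodeError:
            replacement = callback(encoding, text, pos)
            # The replacement itself must be encodable, otherwise this raises.
            pieces.append(replacement.encode(encoding))
    return b"".join(pieces)


def xml_charref(enc, uni, pos):
    """Same callback as the lambda above: a hexadecimal XML character reference."""
    return "&#x%x;" % ord(uni[pos])


if __name__ == "__main__":
    print(encode_with_callback("a\u00e4o\u00f6u\u00fc\u00df", "ascii", xml_charref))
```

The callback receives the same three arguments as in the description (encoding name, original string, position), so the lambda from the example can be passed unchanged.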


----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=432401&group_id=5470


From noreply@sourceforge.net  Tue Jun 12 15:29:55 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Tue, 12 Jun 2001 07:29:55 -0700
Subject: [Patches] [ python-Patches-432401 ] unicode encoding error callbacks
Message-ID: <E159pB5-0001P4-00@usw-sf-web2.sourceforge.net>

Patches item #432401, was updated on 2001-06-12 06:43
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=432401&group_id=5470

Category: core (C code)
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Walter Dörwald (doerwalter)
>Assigned to: M.-A. Lemburg (lemburg)
Summary: unicode encoding error callbacks

Initial Comment:
This patch adds unicode error handling callbacks to the
encode functionality. With this patch it's possible to
not only pass 'strict', 'ignore' or 'replace' as the
errors argument to encode, but also a callable
function, that will be called with the encoding name,
the original unicode object and the position of the
unencodable character. The callback must return a
replacement unicode object that will be encoded instead
of the original character.

For example replacing unencodable characters with XML
character references can be done in the following way.

u"aäoöuüß".encode(
   "ascii",
   lambda enc, uni, pos: u"&#x%x;" % ord(uni[pos])
)


----------------------------------------------------------------------

>Comment By: M.-A. Lemburg (lemburg)
Date: 2001-06-12 07:29

Message:
Logged In: YES 
user_id=38388

Thanks for the patch -- it looks very impressive!

I'll give it a try later this week. 

Some first cosmetic tidbits:
* please don't place more than one C statement on one line
like in:
"""
+               unicode = unicode2; unicodepos =
unicode2pos;
+               unicode2 = NULL; unicode2pos = 0;
"""

* Comments should start with a capital letter and be
prepended
to the section they apply to

* There should be spaces between arguments in compares
(a == b) not (a==b)

* Where does the name "...Encode121" originate ?

* module internal APIs should use lower case names (you
converted some of these to  PyUnicode_...() -- this is
normally reserved for APIs which are either marked as
potential candidates for the public API or are very
prominent in the code)

One thing which I don't like about your API change is that
you removed the Py_UNICODE*data, int size style arguments --
this makes it impossible to use the new APIs on non-Python
data or data which is not available as Unicode object.

Please separate the errors.c patch from this patch -- it
seems totally unrelated to Unicode.

Thanks.


----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=432401&group_id=5470


From noreply@sourceforge.net  Tue Jun 12 17:32:34 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Tue, 12 Jun 2001 09:32:34 -0700
Subject: [Patches] [ python-Patches-432401 ] unicode encoding error callbacks
Message-ID: <E159r5m-0003gH-00@usw-sf-web2.sourceforge.net>

Patches item #432401, was updated on 2001-06-12 06:43
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=432401&group_id=5470

Category: core (C code)
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Walter Dörwald (doerwalter)
Assigned to: M.-A. Lemburg (lemburg)
Summary: unicode encoding error callbacks

Initial Comment:
This patch adds unicode error handling callbacks to the
encode functionality. With this patch it's possible to
not only pass 'strict', 'ignore' or 'replace' as the
errors argument to encode, but also a callable
function, that will be called with the encoding name,
the original unicode object and the position of the
unencodable character. The callback must return a
replacement unicode object that will be encoded instead
of the original character.

For example replacing unencodable characters with XML
character references can be done in the following way.

u"aäoöuüß".encode(
   "ascii",
   lambda enc, uni, pos: u"&#x%x;" % ord(uni[pos])
)


----------------------------------------------------------------------

>Comment By: Walter Dörwald (doerwalter)
Date: 2001-06-12 09:32

Message:
Logged In: YES 
user_id=89016

> * please don't place more than one C statement on one line
> like in:
> """
> +               unicode = unicode2; unicodepos =
> unicode2pos;
> +               unicode2 = NULL; unicode2pos = 0;
> """

OK, done!

> * Comments should start with a capital letter and be
> prepended
> to the section they apply to

Fixed!

> * There should be spaces between arguments in compares
> (a == b) not (a==b)

Fixed!

> * Where does the name "...Encode121" originate ?

Encode one-to-one; it implements both ASCII and latin-1
encoding.

> * module internal APIs should use lower case names (you
> converted some of these to  PyUnicode_...() -- this is
> normally reserved for APIs which are either marked as
> potential candidates for the public API or are very
> prominent in the code)

Which ones? I introduced a new function for every old one
that had a "const char *errors" argument, and a few new ones
in codecs.h. Of those, PyCodec_EncodeHandlerForObject is
vital, because it is used to map old string arguments to
the new function objects. PyCodec_RaiseEncodeErrors can be
used in the encoder implementation to raise an encode error,
but it could be made static in unicodeobject.h so only those
encoders implemented there have access to it.

> One thing which I don't like about your API change is that
> you removed the Py_UNICODE*data, int size style arguments --
> this makes it impossible to use the new APIs on non-Python
> data or data which is not available as Unicode object.

I looked through the code and found no situation where the
Py_UNICODE*/int version is really used, and having two
(PyObject *)s (the original and the replacement string)
instead of Py_UNICODE*/int and PyObject * made the
implementation a little easier, but I can fix that.

> Please separate the errors.c patch from this patch -- it
> seems totally unrelated to Unicode.

PyCodec_RaiseEncodeErrors uses this to have a \Uxxxx with
four hex digits. I removed it.

I'll upload a revised patch as soon as it's done.


----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2001-06-12 07:29

Message:
Logged In: YES 
user_id=38388

Thanks for the patch -- it looks very impressive!

I'll give it a try later this week. 

Some first cosmetic tidbits:
* please don't place more than one C statement on one line
like in:
"""
+               unicode = unicode2; unicodepos =
unicode2pos;
+               unicode2 = NULL; unicode2pos = 0;
"""

* Comments should start with a capital letter and be
prepended
to the section they apply to

* There should be spaces between arguments in compares
(a == b) not (a==b)

* Where does the name "...Encode121" originate ?

* module internal APIs should use lower case names (you
converted some of these to  PyUnicode_...() -- this is
normally reserved for APIs which are either marked as
potential candidates for the public API or are very
prominent in the code)

One thing which I don't like about your API change is that
you removed the Py_UNICODE*data, int size style arguments --
this makes it impossible to use the new APIs on non-Python
data or data which is not available as Unicode object.

Please separate the errors.c patch from this patch -- it
seems totally unrelated to Unicode.

Thanks.


----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=432401&group_id=5470


From noreply@sourceforge.net  Tue Jun 12 17:39:10 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Tue, 12 Jun 2001 09:39:10 -0700
Subject: [Patches] [ python-Patches-432457 ] Readline 4.2 Patch
Message-ID: <E159rCA-0005zE-00@usw-sf-web3.sourceforge.net>

Patches item #432457, was updated on 2001-06-12 09:39
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=432457&group_id=5470

Category: Build
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Jason Tishler (jlt63)
Assigned to: Nobody/Anonymous (nobody)
Summary: Readline 4.2 Patch

Initial Comment:
This patch enables the Python readline module to
build cleanly against readline 4.2.  Specifically, it
configures Python to use either completion_matches()
or rl_completion_matches() as appropriate.  This is
necessary due to the deprecation of completion_matches()
(and the other functions defined in compat.c) in
readline 4.2.  In this case, deprecated means no longer
declared in readline.h but still defined in the readline
library (e.g. libreadline.so).

Although this patch is currently only necessary for Cygwin,
it eventually will be needed by the other platforms when
completion_matches() is finally removed from readline
(e.g., 4.3).

I tested this patch under the following environments:

	Linux with readline 2.2.1
	Linux with readline 4.2
	Cygwin with readline 4.1
	Cygwin with readline 4.2

and it functioned as expected.

----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=432457&group_id=5470


From noreply@sourceforge.net  Tue Jun 12 17:51:48 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Tue, 12 Jun 2001 09:51:48 -0700
Subject: [Patches] [ python-Patches-400938 ] [Draft] libpython as shared library (.so) on Linux
Message-ID: <E159rOO-0002Lm-00@usw-sf-web1.sourceforge.net>

Patches item #400938, was updated on 2000-07-19 13:55
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=400938&group_id=5470

Category: None
Group: None
>Status: Open
Resolution: Out of Date
Priority: 5
Submitted By: Gregor Hoffleit (flight)
Assigned to: Neil Schemenauer (nascheme)
Summary: [Draft] libpython as shared library (.so) on Linux

Initial Comment:
 

----------------------------------------------------------------------

>Comment By: Guido van Rossum (gvanrossum)
Date: 2001-06-12 09:51

Message:
Logged In: YES 
user_id=6380

Reopening -- this keeps being requested.  Now we're just
waiting for someone to produce a working patch.

Or is there one already?

----------------------------------------------------------------------

Comment By: Neil Schemenauer (nascheme)
Date: 2001-03-21 15:59

Message:
Logged In: YES 
user_id=35752

We're going to have to create a new patch to do this.  This
one is way too out of date.  Maybe for 2.2.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2001-01-19 14:46

Message:
I'm reassigning this to Neil.

Neil, can you see if you can integrate this into your flat Makefile?

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2001-01-17 15:09

Message:
Andrew, I'm tentatively reassigning this to you, since you're taking charge of the build process at the moment (setup.py).

I suspect that the patch no longer works as is -- would it make sense to mark it postponed and get the author to submit a new version before we release 2.1a1?


----------------------------------------------------------------------

Comment By: Nobody/Anonymous (nobody)
Date: 2001-01-17 14:46

Message:
Getting this patch into the next version of Python would be "A Good Thing"(tm) in my opinion.  We use libpython as a .so at ILM and end up having to make changes like this by hand every time we get a new version...

----------------------------------------------------------------------

Comment By: Moshe Zadka (moshez)
Date: 2000-11-01 03:32

Message:
I've had a look at the patch, and it seems it has
two orthogonal parts. One is adding the infrastructure
for compiling another version for the Python library, which can be more or less integrated as-is, and one is hard-coding the particular way, in Linux, of building shared objects. Since we discover how to build shared objects in the configure script anyway (otherwise we could not have built modules as shared objects), we should embed that information there, not the Linux flags.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2000-10-26 14:13

Message:
Let's give this to Jeremy instead, because he seems to know more about build issues. Jeremy, it would be good to look into getting this to work with your RPM suite. Flight's argument (has been used without complaints in Debian Python 1.5.2 since 1999) is good.

----------------------------------------------------------------------

Comment By: Jeremy Hylton (jhylton)
Date: 2000-08-23 09:26

Message:
In the absence of anyone arguing for inclusion of this patch and a one-week idle period, it is postponed.

----------------------------------------------------------------------

Comment By: Moshe Zadka (moshez)
Date: 2000-08-16 00:40

Message:
I suggest we postpone it. It isn't really complete (only works on real distributions <wink>), and the complete solution should work on all unices. If Tcl/Perl can do it, there is no reason Python can't -- and a half-hearted solution isn't that good. flight, you should use this
for the Python in woody in the meantime -- I doubt woody
will be stable before Python 2.1 comes out, so 2.1 sounds
like a good timeframe to do it.

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2000-08-15 10:52

Message:
Assigned to Barry because he's a Linux weenie.  Barry, if you think there's something here that should go into 2.0, please pursue it now, else change the status to Postponed.

----------------------------------------------------------------------

Comment By: Gregor Hoffleit (flight)
Date: 2000-07-19 14:10

Message:
This is what is used in production to build libpython as a shared library (.so) for Debian.

Note: This patch is not ready for inclusion in the upstream Python distribution. Anyway, I think this might be a start. The Python 1.5 executable in Debian GNU/Linux is built against a shared libpython1.5.so since April 1999, and I haven't yet heard about any problems.

Using a shared library should have an advantage if you're running multiple instances of Python (be it standalone interpreter or embedded applications).


----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=400938&group_id=5470


From noreply@sourceforge.net  Tue Jun 12 17:54:10 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Tue, 12 Jun 2001 09:54:10 -0700
Subject: [Patches] [ python-Patches-400938 ] [Draft] libpython as shared library (.so) on Linux
Message-ID: <E159rQg-0006Ap-00@usw-sf-web3.sourceforge.net>

Patches item #400938, was updated on 2000-07-19 13:55
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=400938&group_id=5470

Category: None
Group: None
Status: Open
Resolution: Out of Date
Priority: 5
Submitted By: Gregor Hoffleit (flight)
>Assigned to: Nobody/Anonymous (nobody)
Summary: [Draft] libpython as shared library (.so) on Linux

Initial Comment:
 

----------------------------------------------------------------------

>Comment By: Guido van Rossum (gvanrossum)
Date: 2001-06-12 09:54

Message:
Logged In: YES 
user_id=6380

Reopening -- this keeps being requested.  Now we're just
waiting for someone to produce a working patch.

Or is there one already?

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2001-06-12 09:51

Message:
Logged In: YES 
user_id=6380

Reopening -- this keeps being requested.  Now we're just
waiting for someone to produce a working patch.

Or is there one already?

----------------------------------------------------------------------

Comment By: Neil Schemenauer (nascheme)
Date: 2001-03-21 15:59

Message:
Logged In: YES 
user_id=35752

We're going to have to create a new patch to do this.  This
one is way too out of date.  Maybe for 2.2.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2001-01-19 14:46

Message:
I'm reassigning this to Neil.

Neil, can you see if you can integrate this into your flat Makefile?

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2001-01-17 15:09

Message:
Andrew, I'm tentatively reassigning this to you, since you're taking charge of the build process at the moment (setup.py).

I suspect that the patch no longer works as is -- would it make sense to mark it postponed and get the author to submit a new version before we release 2.1a1?


----------------------------------------------------------------------

Comment By: Nobody/Anonymous (nobody)
Date: 2001-01-17 14:46

Message:
Getting this patch into the next version of Python would be "A Good Thing"(tm) in my opinion.  We use libpython as a .so at ILM and end up having to make changes like this by hand every time we get a new version...

----------------------------------------------------------------------

Comment By: Moshe Zadka (moshez)
Date: 2000-11-01 03:32

Message:
I've had a look at the patch, and it seems it has
two orthogonal parts. One is adding the infrastructure
for compiling another version for the Python library, which can be more or less integrated as-is, and one is hard-coding the particular way, in Linux, of building shared objects. Since we discover how to build shared objects in the configure script anyway (otherwise we could not have built modules as shared objects), we should embed that information there, not the Linux flags.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2000-10-26 14:13

Message:
Let's give this to Jeremy instead, because he seems to know more about build issues. Jeremy, it would be good to look into getting this to work with your RPM suite. Flight's argument (has been used without complaints in Debian Python 1.5.2 since 1999) is good.

----------------------------------------------------------------------

Comment By: Jeremy Hylton (jhylton)
Date: 2000-08-23 09:26

Message:
In the absence of anyone arguing for inclusion of this patch and a one-week idle period, it is postponed.

----------------------------------------------------------------------

Comment By: Moshe Zadka (moshez)
Date: 2000-08-16 00:40

Message:
I suggest we postpone it. It isn't really complete (only works on real distributions <wink>), and the complete solution should work on all unices. If Tcl/Perl can do it, there is no reason Python can't -- and a half-hearted solution isn't that good. flight, you should use this
for the Python in woody in the meantime -- I doubt woody
will be stable before Python 2.1 comes out, so 2.1 sounds
like a good timeframe to do it.

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2000-08-15 10:52

Message:
Assigned to Barry because he's a Linux weenie.  Barry, if you think there's something here that should go into 2.0, please pursue it now, else change the status to Postponed.

----------------------------------------------------------------------

Comment By: Gregor Hoffleit (flight)
Date: 2000-07-19 14:10

Message:
This is what is used in production to build libpython as a shared library (.so) for Debian.

Note: This patch is not ready for inclusion in the upstream Python distribution. Anyway, I think this might be a start. The Python 1.5 executable in Debian GNU/Linux is built against a shared libpython1.5.so since April 1999, and I haven't yet heard about any problems.

Using a shared library should have an advantage if you're running multiple instances of Python (be it standalone interpreter or embedded applications).


----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=400938&group_id=5470


From noreply@sourceforge.net  Tue Jun 12 17:56:44 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Tue, 12 Jun 2001 09:56:44 -0700
Subject: [Patches] [ python-Patches-432401 ] unicode encoding error callbacks
Message-ID: <E159rTA-0002Pv-00@usw-sf-web1.sourceforge.net>

Patches item #432401, was updated on 2001-06-12 06:43
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=432401&group_id=5470

Category: core (C code)
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Walter Dörwald (doerwalter)
Assigned to: M.-A. Lemburg (lemburg)
Summary: unicode encoding error callbacks

Initial Comment:
This patch adds unicode error handling callbacks to the
encode functionality. With this patch it's possible to
not only pass 'strict', 'ignore' or 'replace' as the
errors argument to encode, but also a callable
function, that will be called with the encoding name,
the original unicode object and the position of the
unencodable character. The callback must return a
replacement unicode object that will be encoded instead
of the original character.

For example replacing unencodable characters with XML
character references can be done in the following way.

u"aäoöuüß".encode(
   "ascii",
   lambda enc, uni, pos: u"&#x%x;" % ord(uni[pos])
)


----------------------------------------------------------------------

>Comment By: Walter Dörwald (doerwalter)
Date: 2001-06-12 09:56

Message:
Logged In: YES 
user_id=89016

> One thing which I don't like about your API change is that
> you removed the Py_UNICODE*data, int size style arguments
> --
> this makes it impossible to use the new APIs on non-Python
> data or data which is not available as Unicode object.

Another problem is that the callback requires a Python
object, so in the PyObject * version, the refcount is
incref'd and the object is passed to the callback. The
Py_UNICODE*/int version would have to create a new Unicode
object from the data.


----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2001-06-12 09:32

Message:
Logged In: YES 
user_id=89016

> * please don't place more than one C statement on one line
> like in:
> """
> +               unicode = unicode2; unicodepos =
> unicode2pos;
> +               unicode2 = NULL; unicode2pos = 0;
> """

OK, done!

> * Comments should start with a capital letter and be
> prepended
> to the section they apply to

Fixed!

> * There should be spaces between arguments in compares
> (a == b) not (a==b)

Fixed!

> * Where does the name "...Encode121" originate ?

Encode one-to-one; it implements both ASCII and latin-1
encoding.

> * module internal APIs should use lower case names (you
> converted some of these to  PyUnicode_...() -- this is
> normally reserved for APIs which are either marked as
> potential candidates for the public API or are very
> prominent in the code)

Which ones? I introduced a new function for every old one
that had a "const char *errors" argument, and a few new ones
in codecs.h. Of those, PyCodec_EncodeHandlerForObject is
vital, because it is used to map old string arguments to
the new function objects. PyCodec_RaiseEncodeErrors can be
used in the encoder implementation to raise an encode error,
but it could be made static in unicodeobject.h so only those
encoders implemented there have access to it.

> One thing which I don't like about your API change is that
> you removed the Py_UNICODE*data, int size style arguments
> --
> this makes it impossible to use the new APIs on non-Python
> data or data which is not available as Unicode object.

I looked through the code and found no situation where the
Py_UNICODE*/int version is really used, and having two
(PyObject *)s (the original and the replacement string)
instead of Py_UNICODE*/int and PyObject * made the
implementation a little easier, but I can fix that.

> Please separate the errors.c patch from this patch -- it
> seems totally unrelated to Unicode.

PyCodec_RaiseEncodeErrors uses this to have a \Uxxxx with
four hex digits. I removed it.

I'll upload a revised patch as soon as it's done.


----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2001-06-12 07:29

Message:
Logged In: YES 
user_id=38388

Thanks for the patch -- it looks very impressive !

I'll give it a try later this week. 

Some first cosmetic tidbits:
* please don't place more than one C statement on one line
like in:
"""
+               unicode = unicode2; unicodepos =
unicode2pos;
+               unicode2 = NULL; unicode2pos = 0;
"""

* Comments should start with a capital letter and be
prepended
to the section they apply to

* There should be spaces between arguments in compares
(a == b) not (a==b)

* Where does the name "...Encode121" originate ?

* module internal APIs should use lower case names (you
converted some of these to  PyUnicode_...() -- this is
normally reserved for APIs which are either marked as
potential candidates for the public API or are very
prominent in the code)

One thing which I don't like about your API change is that
you removed the Py_UNICODE*data, int size style arguments --
this makes it impossible to use the new APIs on non-Python
data or data which is not available as Unicode object.

Please separate the errors.c patch from this patch -- it
seems totally unrelated to Unicode.

Thanks.


----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=432401&group_id=5470


From noreply@sourceforge.net  Tue Jun 12 18:08:58 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Tue, 12 Jun 2001 10:08:58 -0700
Subject: [Patches] [ python-Patches-423394 ] Fix pulldom to preserve ns attributes
Message-ID: <E159rf0-0002es-00@usw-sf-web1.sourceforge.net>

Patches item #423394, was updated on 2001-05-11 11:04
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=423394&group_id=5470

Category: XML
Group: None
>Status: Closed
>Resolution: Out of Date
Priority: 5
Submitted By: Nobody/Anonymous (nobody)
Assigned to: Martin v. Löwis (loewis)
Summary: Fix pulldom to preserve ns attributes

Initial Comment:
Here is a fix for pulldom.py that preserves 
xmlns attributes that declare namespaces. 

The current pulldom / minidom captures xml namespace 
information in elements and attributes, but the 
actual namespace declaration attributes 
(xmlns:foo="...") are not preserved on the element 
where they appear. 

This is a problem for
certain applications that do more complex name
dereferencing (XMLSchema is an example) and
require not only namespace URIs but
also the prefixes used and the original scope
information.

The current patch preserves xmlns="" and 
xmlns:foo="" as *non-namespace qualified* 
attributes, which appears to be the norm in other 
DOM implementations.

Pls let me know if you have any questions.

-Brian (brian@digicool.com)

----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2001-06-12 10:08

Message:
Logged In: YES 
user_id=21627

Superseded by #432117

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2001-06-03 06:55

Message:
Logged In: YES 
user_id=21627

The patch is a good idea, but I think it does not conform 
to the DOM recommendation. In the DOM, the namespace URI
"http://www.w3.org/2000/xmlns/" is used for attributes 
whose namespace prefix or qualified name is xmlns.

In addition, the patch contains a typo: it should not say
atetr_items.


----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=423394&group_id=5470


From noreply@sourceforge.net  Tue Jun 12 18:09:16 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Tue, 12 Jun 2001 10:09:16 -0700
Subject: [Patches] [ python-Patches-432117 ] Updated PullDOM patch
Message-ID: <E159rfI-0002fM-00@usw-sf-web1.sourceforge.net>

Patches item #432117, was updated on 2001-06-11 08:24
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=432117&group_id=5470

Category: XML
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Nobody/Anonymous (nobody)
>Assigned to: Martin v. Löwis (loewis)
Summary: Updated PullDOM patch

Initial Comment:
Martin, here is an updated patch (see # 423394) for pulldom.py that 
follows the DOM REC regarding namespace declaration attribute 
handling.

In short: namespace declaration attributes are now preserved, and 
the namespaceURI of a namespace decl attribute is 
"http://www.w3.org/2000/xmlns/". The localName is the prefix to 
be mapped, unless it is a plain "xmlns" (default ns declaration), in 
which case the localName is just "xmlns".
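
The convention described above can be checked with a short
sketch. It does not exercise the patched pulldom.py; it uses
xml.dom.minidom from a current standard library, which
appears to follow the same DOM rule, and the urn:example
URIs and the element name are made up for the illustration.

```python
from xml.dom import minidom

XMLNS_NAMESPACE = "http://www.w3.org/2000/xmlns/"

# Hypothetical document with a default and a prefixed namespace declaration.
doc = minidom.parseString(
    '<root xmlns="urn:example:default" xmlns:foo="urn:example:foo"/>'
)
root = doc.documentElement

prefixed = root.getAttributeNode("xmlns:foo")  # declaration of prefix "foo"
default = root.getAttributeNode("xmlns")       # default namespace declaration

for attr in (prefixed, default):
    print(attr.name, attr.namespaceURI, attr.localName, attr.value)
```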

I've tested this with a Python 2.1 (final release) install. Let me know if 
you need anything else.

Thanks!

B. Lloyd

----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=432117&group_id=5470


From noreply@sourceforge.net  Tue Jun 12 19:00:10 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Tue, 12 Jun 2001 11:00:10 -0700
Subject: [Patches] [ python-Patches-432401 ] unicode encoding error callbacks
Message-ID: <E159sSY-0007N4-00@usw-sf-web3.sourceforge.net>

Patches item #432401, was updated on 2001-06-12 06:43
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=432401&group_id=5470

Category: core (C code)
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Walter Dörwald (doerwalter)
Assigned to: M.-A. Lemburg (lemburg)
Summary: unicode encoding error callbacks

Initial Comment:
This patch adds unicode error handling callbacks to the
encode functionality. With this patch it's possible to
not only pass 'strict', 'ignore' or 'replace' as the
errors argument to encode, but also a callable
function, that will be called with the encoding name,
the original unicode object and the position of the
unencodable character. The callback must return a
replacement unicode object that will be encoded instead
of the original character.

For example replacing unencodable characters with XML
character references can be done in the following way.

u"aäoöuüß".encode(
   "ascii",
   lambda enc, uni, pos: u"&#x%x;" % ord(uni[pos])
)


----------------------------------------------------------------------

>Comment By: M.-A. Lemburg (lemburg)
Date: 2001-06-12 11:00

Message:
Logged In: YES 
user_id=38388

About the Py_UNICODE*data, int size APIs:
Ok, point taken.

In general, I think we ought to keep the callback feature as
open as possible, so passing in pointers and sizes would not
be very useful.

BTW, could you summarize how the callback works in a few
lines ?

About _Encode121: I'd name this _EncodeUCS1 since that's
what it is ;-)

About the new functions: I was referring to the new static
functions which you gave PyUnicode_... names. If these are
not supposed to turn into non-static functions, I'd rather
have them use lower case names (since that's how the Python
internals work too -- most of the time).


----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2001-06-12 09:56

Message:
Logged In: YES 
user_id=89016

> One thing which I don't like about your API change is that
> you removed the Py_UNICODE*data, int size style arguments
> --
> this makes it impossible to use the new APIs on non-Python
> data or data which is not available as Unicode object.

Another problem is that the callback requires a Python
object, so in the PyObject * version the object is
incref'd and passed to the callback. The
Py_UNICODE*/int version would have to create a new Unicode
object from the data.


----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2001-06-12 09:32

Message:
Logged In: YES 
user_id=89016

> * please don't place more than one C statement on one line
> like in:
> """
> +               unicode = unicode2; unicodepos =
> unicode2pos;
> +               unicode2 = NULL; unicode2pos = 0;
> """

OK, done!

> * Comments should start with a capital letter and be
> prepended
> to the section they apply to

Fixed!

> * There should be spaces between arguments in compares
> (a == b) not (a==b)

Fixed!

> * Where does the name "...Encode121" originate ?

encode one-to-one, it implements both ASCII and latin-1
encoding.

> * module internal APIs should use lower case names (you
> converted some of these to  PyUnicode_...() -- this is
> normally reserved for APIs which are either marked as
> potential candidates for the public API or are very
> prominent in the code)

Which ones? I introduced a new function for every old one
that had a "const char *errors" argument, and a few new ones
in codecs.h. Of those, PyCodec_EncodeHandlerForObject is
vital, because it is used to map old string arguments to
the new function objects. PyCodec_RaiseEncodeErrors can be
used in the encoder implementation to raise an encode error,
but it could be made static in unicodeobject.h so only those
encoders implemented there have access to it.

> One thing which I don't like about your API change is that
> you removed the Py_UNICODE*data, int size style arguments
> --
> this makes it impossible to use the new APIs on non-Python
> data or data which is not available as Unicode object.

I looked through the code and found no situation where the
Py_UNICODE*/int version is really used, and having two
(PyObject *)s (the original and the replacement string)
instead of Py_UNICODE*/int and PyObject * made the
implementation a little easier, but I can fix that.

> Please separate the errors.c patch from this patch -- it
> seems totally unrelated to Unicode.

PyCodec_RaiseEncodeErrors uses this to have a \Uxxxx with
four hex digits. I removed it.

I'll upload a revised patch as soon as it's done.


----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2001-06-12 07:29

Message:
Logged In: YES 
user_id=38388

Thanks for the patch -- it looks very impressive !

I'll give it a try later this week. 

Some first cosmetic tidbits:
* please don't place more than one C statement on one line
like in:
"""
+               unicode = unicode2; unicodepos =
unicode2pos;
+               unicode2 = NULL; unicode2pos = 0;
"""

* Comments should start with a capital letter and be
prepended
to the section they apply to

* There should be spaces between arguments in compares
(a == b) not (a==b)

* Where does the name "...Encode121" originate ?

* module internal APIs should use lower case names (you
converted some of these to  PyUnicode_...() -- this is
normally reserved for APIs which are either marked as
potential candidates for the public API or are very
prominent in the code)

One thing which I don't like about your API change is that
you removed the Py_UNICODE*data, int size style arguments --
this makes it impossible to use the new APIs on non-Python
data or data which is not available as Unicode object.

Please separate the errors.c patch from this patch -- it
seems totally unrelated to Unicode.

Thanks.


----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=432401&group_id=5470


From noreply@sourceforge.net  Tue Jun 12 19:59:24 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Tue, 12 Jun 2001 11:59:24 -0700
Subject: [Patches] [ python-Patches-432401 ] unicode encoding error callbacks
Message-ID: <E159tNs-0006K1-00@usw-sf-web2.sourceforge.net>

Patches item #432401, was updated on 2001-06-12 06:43
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=432401&group_id=5470

Category: core (C code)
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Walter Dörwald (doerwalter)
Assigned to: M.-A. Lemburg (lemburg)
Summary: unicode encoding error callbacks

Initial Comment:
This patch adds unicode error handling callbacks to the
encode functionality. With this patch it's possible to
not only pass 'strict', 'ignore' or 'replace' as the
errors argument to encode, but also a callable
function, that will be called with the encoding name,
the original unicode object and the position of the
unencodable character. The callback must return a
replacement unicode object that will be encoded instead
of the original character.

For example replacing unencodable characters with XML
character references can be done in the following way.

u"aäoöuüß".encode(
   "ascii",
   lambda enc, uni, pos: u"&#x%x;" % ord(uni[pos])
)


----------------------------------------------------------------------

>Comment By: Walter Dörwald (doerwalter)
Date: 2001-06-12 11:59

Message:
Logged In: YES 
user_id=89016

How the callbacks work:

A PyObject * named errors is passed in. This may be NULL,
Py_None, 'strict', u'strict', 'ignore', u'ignore',
'replace', u'replace' or a callable object.
PyCodec_EncodeHandlerForObject maps all of these objects to
one of the three builtin error callbacks:
PyCodec_RaiseEncodeErrors (raises an exception),
PyCodec_IgnoreEncodeErrors (returns an empty replacement
string, in effect ignoring the error) or
PyCodec_ReplaceEncodeErrors (returns U+FFFD, the Unicode
replacement character, to signal to the encoder that it
should choose a suitable replacement character); if errors
is a callable object, it is returned directly. When an
unencodable character is encountered, the error handling
callback is called with the encoding name, the original
unicode object and the error position and must return a
unicode object that will be encoded instead of the offending
character (or the callback may of course raise an
exception). U+FFFD characters in the replacement string will
be replaced with a character that the encoder chooses ('?'
in all cases).
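
The mapping step described above could be sketched in Python
roughly as follows. This is not the C code from the patch:
the lowercase function names are stand-ins for the
PyCodec_* functions, the exception type and message are
invented, and treating NULL/None as 'strict' is an
assumption, since the text does not say which builtin
callback those two map to.

```python
REPLACEMENT_CHARACTER = "\ufffd"


def raise_encode_errors(encoding, text, pos):
    # Stand-in for PyCodec_RaiseEncodeErrors.
    raise UnicodeError(
        "%s: cannot encode character at position %d" % (encoding, pos)
    )


def ignore_encode_errors(encoding, text, pos):
    # Stand-in for PyCodec_IgnoreEncodeErrors: empty replacement string.
    return ""


def replace_encode_errors(encoding, text, pos):
    # Stand-in for PyCodec_ReplaceEncodeErrors: the encoder picks the
    # final replacement character when it sees U+FFFD.
    return REPLACEMENT_CHARACTER


_BUILTIN_HANDLERS = {
    "strict": raise_encode_errors,
    "ignore": ignore_encode_errors,
    "replace": replace_encode_errors,
}


def encode_handler_for_object(errors):
    """Map an errors argument to a callback (sketch of the mapping step)."""
    if errors is None:  # assumption: NULL/None behaves like 'strict'
        return raise_encode_errors
    if callable(errors):
        return errors
    try:
        return _BUILTIN_HANDLERS[errors]
    except (KeyError, TypeError):
        raise ValueError("unknown error handling %r" % (errors,))
```

In Python 3 there is only one string type, so the separate
'strict' and u'strict' cases collapse into a single lookup.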

The implementation of the loop through the string is done in
the following way. A stack of at most two strings is kept and
the loop always encodes a character from the string at the
top of the stack. If an error is encountered and the stack has
only one entry (during encoding of the original string), the
callback is called and the unicode object returned is pushed
on the stack, so the encoding continues with the replacement
string. If the stack has two entries when an error is
encountered, the replacement string itself has an
unencodable character and a normal exception is raised. When
the encoder has reached the end of its current string there
are two possibilities: when the stack contains two entries,
this was the replacement string, so the replacement string
will be popped from the stack and encoding continues with
the next character from the original string. If the stack
had only one entry, encoding is finished.
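
The loop described above can be sketched in Python as
follows. It is an illustration only, not the C
implementation: the function name encode_with_callback is
made up, and encoding one character at a time is only
reasonable for stateless one-to-one codecs such as ASCII or
Latin-1.

```python
def encode_with_callback(text, encoding, handler):
    """Encode text, asking handler(encoding, text, pos) for replacements."""
    out = bytearray()
    stack = [[text, 0]]  # each entry: [string, next position to encode]
    while stack:
        current, pos = stack[-1]
        if pos == len(current):
            # End of the replacement string: resume the original string.
            # End of the original string: the stack empties and we are done.
            stack.pop()
            continue
        stack[-1][1] = pos + 1  # advance before encoding this character
        try:
            out += current[pos].encode(encoding)
        except UnicodeEncodeError:
            if len(stack) == 2:
                # The replacement string itself is unencodable.
                raise
            replacement = handler(encoding, text, pos)
            # The encoder chooses '?' for U+FFFD in the replacement.
            replacement = replacement.replace("\ufffd", "?")
            stack.append([replacement, 0])
    return bytes(out)


def charref(enc, uni, pos):
    # Same idea as the lambda in the initial comment.
    return "&#x%x;" % ord(uni[pos])


print(encode_with_callback("a\xe4b", "ascii", charref))
```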

(I hope that's enough explanation of the API and implementation)

I have renamed the static ...121 function to all lowercase
names.

BTW, I guess PyUnicode_EncodeUnicodeEscape could be
reimplemented as PyUnicode_EncodeASCII with a \uxxxx
replacement callback.

PyCodec_RaiseEncodeErrors, PyCodec_IgnoreEncodeErrors,
PyCodec_ReplaceEncodeErrors are globally visible because
they have to be available in _codecsmodule.c to wrap them as
Python function objects, but they can't be implemented in
_codecsmodule, because they need to be available to the
encoders in unicodeobject.c (through
PyCodec_EncodeHandlerForObject), but importing the codecs
module might result in an endless recursion, because
importing a module requires unpickling of the bytecode,
which might require decoding utf8, which ... (but this will
only happen, if we implement the same mechanism for the
decoding API)

I have not touched PyUnicode_TranslateCharmap yet.
Should this function also support error callbacks? Why would
one want to insert None into the mapping to call the callback?

A remaining problem is how to implement decoding error
callbacks. In Python 2.1 encoding and decoding errors are
handled in the same way with a string value. But with
callbacks it doesn't make sense to use the same callback for
encoding and decoding (like codecs.StreamReaderWriter and
codecs.StreamRecoder do). Decoding callbacks have a
different API. Which arguments should be passed to the
decoding callback, and what is the decoding callback
supposed to do?


----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2001-06-12 11:00

Message:
Logged In: YES 
user_id=38388

About the Py_UNICODE*data, int size APIs:
Ok, point taken.

In general, I think we ought to keep the callback feature as
open as possible, so passing in pointers and sizes would not
be very useful.

BTW, could you summarize how the callback works in a few
lines ?

About _Encode121: I'd name this _EncodeUCS1 since that's
what it is ;-)

About the new functions: I was referring to the new static
functions which you gave PyUnicode_... names. If these are
not supposed to turn into non-static functions, I'd rather
have them use lower case names (since that's how the Python
internals work too -- most of the time).


----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2001-06-12 09:56

Message:
Logged In: YES 
user_id=89016

> One thing which I don't like about your API change is that
> you removed the Py_UNICODE*data, int size style arguments
> --
> this makes it impossible to use the new APIs on non-Python
> data or data which is not available as Unicode object.

Another problem is that the callback requires a Python
object, so in the PyObject * version the object is
incref'd and passed to the callback. The
Py_UNICODE*/int version would have to create a new Unicode
object from the data.


----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2001-06-12 09:32

Message:
Logged In: YES 
user_id=89016

> * please don't place more than one C statement on one line
> like in:
> """
> +               unicode = unicode2; unicodepos =
> unicode2pos;
> +               unicode2 = NULL; unicode2pos = 0;
> """

OK, done!

> * Comments should start with a capital letter and be
> prepended
> to the section they apply to

Fixed!

> * There should be spaces between arguments in compares
> (a == b) not (a==b)

Fixed!

> * Where does the name "...Encode121" originate ?

encode one-to-one, it implements both ASCII and latin-1
encoding.

> * module internal APIs should use lower case names (you
> converted some of these to  PyUnicode_...() -- this is
> normally reserved for APIs which are either marked as
> potential candidates for the public API or are very
> prominent in the code)

Which ones? I introduced a new function for every old one
that had a "const char *errors" argument, and a few new ones
in codecs.h. Of those, PyCodec_EncodeHandlerForObject is
vital, because it is used to map old string arguments to
the new function objects. PyCodec_RaiseEncodeErrors can be
used in the encoder implementation to raise an encode error,
but it could be made static in unicodeobject.h so only those
encoders implemented there have access to it.

> One thing which I don't like about your API change is that
> you removed the Py_UNICODE*data, int size style arguments
> --
> this makes it impossible to use the new APIs on non-Python
> data or data which is not available as Unicode object.

I looked through the code and found no situation where the
Py_UNICODE*/int version is really used, and having two
(PyObject *)s (the original and the replacement string)
instead of Py_UNICODE*/int and PyObject * made the
implementation a little easier, but I can fix that.

> Please separate the errors.c patch from this patch -- it
> seems totally unrelated to Unicode.

PyCodec_RaiseEncodeErrors uses this to have a \Uxxxx with
four hex digits. I removed it.

I'll upload a revised patch as soon as it's done.


----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2001-06-12 07:29

Message:
Logged In: YES 
user_id=38388

Thanks for the patch -- it looks very impressive !

I'll give it a try later this week. 

Some first cosmetic tidbits:
* please don't place more than one C statement on one line
like in:
"""
+               unicode = unicode2; unicodepos =
unicode2pos;
+               unicode2 = NULL; unicode2pos = 0;
"""

* Comments should start with a capital letter and be
prepended
to the section they apply to

* There should be spaces between arguments in compares
(a == b) not (a==b)

* Where does the name "...Encode121" originate ?

* module internal APIs should use lower case names (you
converted some of these to  PyUnicode_...() -- this is
normally reserved for APIs which are either marked as
potential candidates for the public API or are very
prominent in the code)

One thing which I don't like about your API change is that
you removed the Py_UNICODE*data, int size style arguments --
this makes it impossible to use the new APIs on non-Python
data or data which is not available as Unicode object.

Please separate the errors.c patch from this patch -- it
seems totally unrelated to Unicode.

Thanks.


----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=432401&group_id=5470


From noreply@sourceforge.net  Tue Jun 12 20:18:57 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Tue, 12 Jun 2001 12:18:57 -0700
Subject: [Patches] [ python-Patches-432401 ] unicode encoding error callbacks
Message-ID: <E159tgn-0005Fl-00@usw-sf-web1.sourceforge.net>

Patches item #432401, was updated on 2001-06-12 06:43
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=432401&group_id=5470

Category: core (C code)
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Walter Dörwald (doerwalter)
Assigned to: M.-A. Lemburg (lemburg)
Summary: unicode encoding error callbacks

Initial Comment:
This patch adds unicode error handling callbacks to the
encode functionality. With this patch it's possible to
not only pass 'strict', 'ignore' or 'replace' as the
errors argument to encode, but also a callable
function, that will be called with the encoding name,
the original unicode object and the position of the
unencodable character. The callback must return a
replacement unicode object that will be encoded instead
of the original character.

For example replacing unencodable characters with XML
character references can be done in the following way.

u"aäoöuüß".encode(
   "ascii",
   lambda enc, uni, pos: u"&#x%x;" % ord(uni[pos])
)


----------------------------------------------------------------------

>Comment By: Walter Dörwald (doerwalter)
Date: 2001-06-12 12:18

Message:
Logged In: YES 
user_id=89016

One additional note: It is vital that errors is an
assignable attribute of the StreamWriter. 

Consider the XML example: For writing an XML DOM tree one
StreamWriter object is used. When a text node is written,
the error handling has to be set to
codecs.xmlreplace_encode_errors, but inside a comment or
processing instruction replacing unencodable characters with
charrefs is not possible, so here codecs.raise_encode_errors
should be used (or better a custom error handler that raises
an error that says "sorry, you can't have unencodable
characters inside a comment")
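
A minimal sketch of switching the error handling on one
StreamWriter between a text node and a comment. The handler
names codecs.xmlreplace_encode_errors and
codecs.raise_encode_errors are the ones proposed in this
thread; the sketch substitutes the string names
"xmlcharrefreplace" and "strict" that present-day Python
provides, and the sample markup is made up.

```python
import codecs
import io

buffer = io.BytesIO()
writer = codecs.getwriter("ascii")(buffer, errors="xmlcharrefreplace")

# Text node: unencodable characters become character references.
writer.write("<p>gr\xfc\xdf</p>")

# Comment: character references are not allowed, so switch to strict.
writer.errors = "strict"
try:
    writer.write("<!-- gr\xfc\xdf -->")
    comment_rejected = False
except UnicodeEncodeError:
    comment_rejected = True

result = buffer.getvalue()
print(result, comment_rejected)
```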

BTW, should we continue the discussion in the i18n SIG
mailing list? An email program is much more comfortable than
an HTML textarea! ;)


----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2001-06-12 11:59

Message:
Logged In: YES 
user_id=89016

How the callbacks work:

A PyObject * named errors is passed in. This may be NULL,
Py_None, 'strict', u'strict', 'ignore', u'ignore',
'replace', u'replace' or a callable object.
PyCodec_EncodeHandlerForObject maps all of these objects to
one of the three builtin error callbacks:
PyCodec_RaiseEncodeErrors (raises an exception),
PyCodec_IgnoreEncodeErrors (returns an empty replacement
string, in effect ignoring the error) or
PyCodec_ReplaceEncodeErrors (returns U+FFFD, the Unicode
replacement character, to signal to the encoder that it
should choose a suitable replacement character); if errors
is a callable object, it is returned directly. When an
unencodable character is encountered, the error handling
callback is called with the encoding name, the original
unicode object and the error position and must return a
unicode object that will be encoded instead of the offending
character (or the callback may of course raise an
exception). U+FFFD characters in the replacement string will
be replaced with a character that the encoder chooses ('?'
in all cases).

The implementation of the loop through the string is done in
the following way. A stack of at most two strings is kept and
the loop always encodes a character from the string at the
top of the stack. If an error is encountered and the stack has
only one entry (during encoding of the original string), the
callback is called and the unicode object returned is pushed
on the stack, so the encoding continues with the replacement
string. If the stack has two entries when an error is
encountered, the replacement string itself has an
unencodable character and a normal exception is raised. When
the encoder has reached the end of its current string there
are two possibilities: when the stack contains two entries,
this was the replacement string, so the replacement string
will be popped from the stack and encoding continues with
the next character from the original string. If the stack
had only one entry, encoding is finished.

(I hope that's enough explanation of the API and implementation)

I have renamed the static ...121 function to all lowercase
names.

BTW, I guess PyUnicode_EncodeUnicodeEscape could be
reimplemented as PyUnicode_EncodeASCII with a \uxxxx
replacement callback.

PyCodec_RaiseEncodeErrors, PyCodec_IgnoreEncodeErrors,
PyCodec_ReplaceEncodeErrors are globally visible because
they have to be available in _codecsmodule.c to wrap them as
Python function objects, but they can't be implemented in
_codecsmodule, because they need to be available to the
encoders in unicodeobject.c (through
PyCodec_EncodeHandlerForObject), but importing the codecs
module might result in an endless recursion, because
importing a module requires unpickling of the bytecode,
which might require decoding utf8, which ... (but this will
only happen, if we implement the same mechanism for the
decoding API)

I have not touched PyUnicode_TranslateCharmap yet.
Should this function also support error callbacks? Why would
one want to insert None into the mapping to call the callback?

A remaining problem is how to implement decoding error
callbacks. In Python 2.1 encoding and decoding errors are
handled in the same way with a string value. But with
callbacks it doesn't make sense to use the same callback for
encoding and decoding (like codecs.StreamReaderWriter and
codecs.StreamRecoder do). Decoding callbacks have a
different API. Which arguments should be passed to the
decoding callback, and what is the decoding callback
supposed to do?


----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2001-06-12 11:00

Message:
Logged In: YES 
user_id=38388

About the Py_UNICODE*data, int size APIs:
Ok, point taken.

In general, I think we ought to keep the callback feature as
open as possible, so passing in pointers and sizes would not
be very useful.

BTW, could you summarize how the callback works in a few
lines ?

About _Encode121: I'd name this _EncodeUCS1 since that's
what it is ;-)

About the new functions: I was referring to the new static
functions which you gave PyUnicode_... names. If these are
not supposed to turn into non-static functions, I'd rather
have them use lower case names (since that's how the Python
internals work too -- most of the time).


----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2001-06-12 09:56

Message:
Logged In: YES 
user_id=89016

> One thing which I don't like about your API change is that
> you removed the Py_UNICODE*data, int size style arguments
> --
> this makes it impossible to use the new APIs on non-Python
> data or data which is not available as Unicode object.

Another problem is that the callback requires a Python
object, so in the PyObject * version the object is
incref'd and passed to the callback. The
Py_UNICODE*/int version would have to create a new Unicode
object from the data.


----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2001-06-12 09:32

Message:
Logged In: YES 
user_id=89016

> * please don't place more than one C statement on one line
> like in:
> """
> +               unicode = unicode2; unicodepos =
> unicode2pos;
> +               unicode2 = NULL; unicode2pos = 0;
> """

OK, done!

> * Comments should start with a capital letter and be
> prepended
> to the section they apply to

Fixed!

> * There should be spaces between arguments in compares
> (a == b) not (a==b)

Fixed!

> * Where does the name "...Encode121" originate ?

encode one-to-one, it implements both ASCII and latin-1
encoding.

> * module internal APIs should use lower case names (you
> converted some of these to  PyUnicode_...() -- this is
> normally reserved for APIs which are either marked as
> potential candidates for the public API or are very
> prominent in the code)

Which ones? I introduced a new function for every old one
that had a "const char *errors" argument, and a few new ones
in codecs.h. Of those, PyCodec_EncodeHandlerForObject is
vital, because it is used to map old string arguments to
the new function objects. PyCodec_RaiseEncodeErrors can be
used in the encoder implementation to raise an encode error,
but it could be made static in unicodeobject.h so only those
encoders implemented there have access to it.

> One thing which I don't like about your API change is that
> you removed the Py_UNICODE*data, int size style arguments > --
> this makes it impossible to use the new APIs on non-Python
> data or data which is not available as Unicode object.

I looked through the code and found no situation where the
Py_UNICODE*/int version is really used, and having two
(PyObject *)s (the original and the replacement string)
instead of Py_UNICODE*/int and PyObject * made the
implementation a little easier, but I can fix that.

> Please separate the errors.c patch from this patch -- it
> seems totally unrelated to Unicode.

PyCodec_RaiseEncodeErrors uses this to have a \Uxxxx with
four hex digits. I removed it.

I'll upload a revised patch as soon as it's done.


----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2001-06-12 07:29

Message:
Logged In: YES 
user_id=38388

Thanks for the patch -- it looks very impressive !

I'll give it a try later this week. 

Some first cosmetic tidbits:
* please don't place more than one C statement on one line
like in:
"""
+               unicode = unicode2; unicodepos = unicode2pos;
+               unicode2 = NULL; unicode2pos = 0;
"""

* Comments should start with a capital letter and be
prepended
to the section they apply to

* There should be spaces between arguments in compares
(a == b) not (a==b)

* Where does the name "...Encode121" originate ?

* module internal APIs should use lower case names (you
converted some of these to  PyUnicode_...() -- this is
normally reserved for APIs which are either marked as
potential candidates for the public API or are very
prominent in the code)

One thing which I don't like about your API change is that
you removed the Py_UNICODE*data, int size style arguments --
this makes it impossible to use the new APIs on non-Python
data or data which is not available as Unicode object.

Please separate the errors.c patch from this patch -- it
seems totally unrelated to Unicode.

Thanks.


----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=432401&group_id=5470


From noreply@sourceforge.net  Tue Jun 12 23:19:07 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Tue, 12 Jun 2001 15:19:07 -0700
Subject: [Patches] [ python-Patches-400938 ] [Draft] libpython as shared library (.so) on Linux
Message-ID: <E159wV9-0001cx-00@usw-sf-web2.sourceforge.net>

Patches item #400938, was updated on 2000-07-19 13:55
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=400938&group_id=5470

Category: None
Group: None
Status: Open
Resolution: Out of Date
Priority: 5
Submitted By: Gregor Hoffleit (flight)
Assigned to: Nobody/Anonymous (nobody)
Summary: [Draft] libpython as shared library (.so) on Linux

Initial Comment:
 

----------------------------------------------------------------------

>Comment By: Gregor Hoffleit (flight)
Date: 2001-06-12 15:19

Message:
Logged In: YES 
user_id=5293

> Now we're just waiting for someone to produce a working
> patch.
> Or is there one already?

I'm currently distributing experimental packages of Python
2.1 for Debian. The packages include a hack to build
libpython2.1 as .so for Linux.

The shared library patch currently is buried in a big diff
file. You can get it as
http://people.debian.org/~flight/python2/python2_2.1-0.diff.gz

This is only a starting point for a real patch!

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2001-06-12 09:54

Message:
Logged In: YES 
user_id=6380

Reopening -- this keeps being requested.  Now we're just
waiting for someone to produce a working patch.

Or is there one already?

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2001-06-12 09:51

Message:
Logged In: YES 
user_id=6380

Reopening -- this keeps being requested.  Now we're just
waiting for someone to produce a working patch.

Or is there one already?

----------------------------------------------------------------------

Comment By: Neil Schemenauer (nascheme)
Date: 2001-03-21 15:59

Message:
Logged In: YES 
user_id=35752

We're going to have to create a new patch to do this.  This
one is way too out of date.  Maybe for 2.2.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2001-01-19 14:46

Message:
I'm reassigning this to Neil.

Neil, can you see if you can integrate this into your flat Makefile?

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2001-01-17 15:09

Message:
Andrew, I'm tentatively reassigning this to you, since you're taking charge of the build process at the moment (setup.py).

I suspect that the patch no longer works as is -- would it make sense to mark it postponed and get the author to submit a new version before we release 2.1a1?


----------------------------------------------------------------------

Comment By: Nobody/Anonymous (nobody)
Date: 2001-01-17 14:46

Message:
Getting this patch into the next version of Python would be "A Good Thing"(tm) in my opinion.  We use libpython as a .so at ILM and end up having to make changes like this by hand every time we get a new version...

----------------------------------------------------------------------

Comment By: Moshe Zadka (moshez)
Date: 2000-11-01 03:32

Message:
I've had a look at the patch, and it seems it has
two orthogonal parts. One is adding the infrastructure
for compiling another version for the Python library, which can be more or less integrated as-is, and one is hard-coding the particular way, in Linux, of building shared objects. Since we discover how to build shared objects in the configure script anyway (otherwise we could not have built modules as shared objects), we should embed that information there, not the Linux flags.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2000-10-26 14:13

Message:
Let's give this to Jeremy instead, because he seems to know more about build issues. Jeremy, it would be good to look into getting this to work with your RPM suite. Flight's argument (has been used without complaints in Debian Python 1.5.2 since 1999) is good.

----------------------------------------------------------------------

Comment By: Jeremy Hylton (jhylton)
Date: 2000-08-23 09:26

Message:
In the absence of anyone arguing for inclusion of this patch and a one-week idle period, it is postponed.

----------------------------------------------------------------------

Comment By: Moshe Zadka (moshez)
Date: 2000-08-16 00:40

Message:
I suggest we postpone it. It isn't really complete (only works on real distributions <wink>), and the complete solution should work on all unices. If Tcl/Perl can do it, there is no reason Python can't -- and a half-hearted solution isn't that good. flight, you should use this
for the Python in woody in the meantime -- I doubt woody
will be stable before Python 2.1 comes out, so 2.1 sounds
like a good timeframe to do it.

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2000-08-15 10:52

Message:
Assigned to Barry because he's a Linux weenie.  Barry, if you think there's something here that should go into 2.0, please pursue it now, else change the status to Postponed.

----------------------------------------------------------------------

Comment By: Gregor Hoffleit (flight)
Date: 2000-07-19 14:10

Message:
This is what is used in production to build libpython as a shared library (.so) for Debian.

Note: This patch is not ready for inclusion in the upstream Python distribution. Anyway, I think this might be a start. The Python 1.5 executable in Debian GNU/Linux has been built against a shared libpython1.5.so since April 1999, and I haven't yet heard of any problems.

Using a shared library should have an advantage if you're running multiple instances of Python (be it standalone interpreter or embedded applications).


----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=400938&group_id=5470


From noreply@sourceforge.net  Wed Jun 13 09:05:58 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Wed, 13 Jun 2001 01:05:58 -0700
Subject: [Patches] [ python-Patches-432401 ] unicode encoding error callbacks
Message-ID: <E15A5f4-0004qC-00@usw-sf-web3.sourceforge.net>

Patches item #432401, was updated on 2001-06-12 06:43
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=432401&group_id=5470

Category: core (C code)
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Walter Dörwald (doerwalter)
Assigned to: M.-A. Lemburg (lemburg)
Summary: unicode encoding error callbacks

Initial Comment:
This patch adds unicode error handling callbacks to the
encode functionality. With this patch it's possible to
not only pass 'strict', 'ignore' or 'replace' as the
errors argument to encode, but also a callable
function, that will be called with the encoding name,
the original unicode object and the position of the
unencodable character. The callback must return a
replacement unicode object that will be encoded instead
of the original character.

For example replacing unencodable characters with XML
character references can be done in the following way.

u"aäoöuüß".encode(
   "ascii",
   lambda enc, uni, pos: u"&#x%x;" % ord(uni[pos])
)


----------------------------------------------------------------------

>Comment By: M.-A. Lemburg (lemburg)
Date: 2001-06-13 01:05

Message:
Logged In: YES 
user_id=38388

> How the callbacks work:
> 
> A PyObject * named errors is passed in. This may be NULL,
> Py_None, 'strict', u'strict', 'ignore', u'ignore',
> 'replace', u'replace' or a callable object.
> PyCodec_EncodeHandlerForObject maps all of these objects
> to one of the three builtin error callbacks
> PyCodec_RaiseEncodeErrors (raises an exception),
> PyCodec_IgnoreEncodeErrors (returns an empty replacement
> string, in effect ignoring the error),
> PyCodec_ReplaceEncodeErrors (returns U+FFFD, the Unicode
> replacement character to signify to the encoder that it
> should choose a suitable replacement character) or
> directly returns errors if it is a callable object. When an
> unencodable character is encountered the error handling
> callback will be called with the encoding name, the
> original unicode object and the error position and must
> return a unicode object that will be encoded instead of the
> offending character (or the callback may of course raise an
> exception). U+FFFD characters in the replacement string
> will be replaced with a character that the encoder chooses
> ('?' in all cases).

Nice.
 
> The implementation of the loop through the string is done
> in the following way. A stack with two strings is kept and
> the loop always encodes a character from the string at the
> stacktop. If an error is encountered and the stack has
> only one entry (during encoding of the original string) the
> callback is called and the unicode object returned is
> pushed on the stack, so the encoding continues with the
> replacement string. If the stack has two entries when an
> error is encountered, the replacement string itself has an
> unencodable character and a normal exception is raised. When
> the encoder has reached the end of its current string
> there are two possibilities: when the stack contains two
> entries, this was the replacement string, so the
> replacement string will be popped from the stack and
> encoding continues with the next character from the
> original string. If the stack had only one entry, encoding
> is finished.

Very elegant solution !
 
> (I hope that's enough explanation of the API and
> implementation)

Could you add these docs to the Misc/unicode.txt file ? I
will eventually take that file and turn it into a PEP which
will then serve as general documentation for these things.
 
> I have renamed the static ...121 function to all lowercase
> names.

Ok.
 
> BTW, I guess PyUnicode_EncodeUnicodeEscape could be
> reimplemented as PyUnicode_EncodeASCII with a \uxxxx
> replacement callback.

Hmm, wouldn't that result in a slowdown ? If so, I'd rather
leave the special encoder in place, since it is being used a
lot in Python and probably some applications too.
 
> PyCodec_RaiseEncodeErrors, PyCodec_IgnoreEncodeErrors,
> PyCodec_ReplaceEncodeErrors are globally visible because
> they have to be available in _codecsmodule.c to wrap them
> as Python function objects, but they can't be implemented in
> _codecsmodule, because they need to be available to the
> encoders in unicodeobject.c (through
> PyCodec_EncodeHandlerForObject), but importing the codecs
> module might result in an endless recursion, because
> importing a module requires unpickling of the bytecode,
> which might require decoding utf8, which ... (but this
> will only happen, if we implement the same mechanism for the
> decoding API)

I think that codecs.c is the right place for these APIs.
_codecsmodule.c is only meant as Python access wrapper for
the internal codecs and nothing more. 

One thing I noted about the callbacks: they assume that they
will always get Unicode objects as input. This is certainly
not true in the general case (it is for the codecs you touch
in the patch). 

I think it would be worthwhile to rename the callbacks to
include "Unicode" somewhere, e.g.
PyCodec_UnicodeReplaceEncodeErrors(). It's a long name, but
then it points out the application field of the callback
rather well. Same for the callbacks exposed through the
_codecsmodule.

> I have not touched PyUnicode_TranslateCharmap yet,
> should this function also support error callbacks? Why
> would one want to insert None into the mapping to call the
> callback?

1. Yes.
2. The user may want to e.g. restrict usage of certain
character ranges. In this case the codec would be used to
verify the input and an exception would indeed be useful
(e.g. say you want to restrict input to Hangul + ASCII).
 
> A remaining problem is how to implement decoding error
> callbacks. In Python 2.1 encoding and decoding errors are
> handled in the same way with a string value. But with
> callbacks it doesn't make sense to use the same callback
> for encoding and decoding (like codecs.StreamReaderWriter and
> codecs.StreamRecoder do). Decoding callbacks have a
> different API. Which arguments should be passed to the
> decoding callback, and what is the decoding callback
> supposed to do?

I'd suggest adding another set of PyCodec_UnicodeDecode...()
APIs for this. We'd then have to augment the base classes of
the StreamCodecs to provide two attributes for .errors with
a fallback solution for the string case (i.e. "strict" can
still be used for both directions).

> One additional note: It is vital that errors is an
> assignable attribute of the StreamWriter.

It is already !
 
> Consider the XML example: For writing an XML DOM tree one
> StreamWriter object is used. When a text node is written,
> the error handling has to be set to
> codecs.xmlreplace_encode_errors, but inside a comment or
> processing instruction replacing unencodable characters
> with charrefs is not possible, so here
> codecs.raise_encode_errors should be used (or better a
> custom error handler that raises an error that says
> "sorry, you can't have unencodable characters inside a
> comment")

Sure.
 
> BTW, should we continue the discussion in the i18n SIG
> mailing list? An email program is much more comfortable
> than a HTML textarea! ;)

I'd rather keep the discussions on this patch here --
forking it off to the i18n sig will make it very hard to
follow up on it. (This HTML area is indeed damn small ;-)
 

----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2001-06-12 12:18

Message:
Logged In: YES 
user_id=89016

One additional note: It is vital that errors is an
assignable attribute of the StreamWriter. 

Consider the XML example: For writing an XML DOM tree one
StreamWriter object is used. When a text node is written,
the error handling has to be set to
codecs.xmlreplace_encode_errors, but inside a comment or
processing instruction replacing unencodable characters with
charrefs is not possible, so here codecs.raise_encode_errors
should be used (or better a custom error handler that raises
an error that says "sorry, you can't have unencodable
characters inside a comment")
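
A small illustration of why an assignable errors attribute matters,
written against the codecs module as it exists in current Python. The
handler names 'xmlcharrefreplace' and 'strict' are stand-ins for the
codecs.xmlreplace_encode_errors and codecs.raise_encode_errors names
used above, which are only proposals in this thread:

```python
import codecs
import io

# One StreamWriter is used for the whole document; only its error
# handling is switched depending on what is being written.
buf = io.BytesIO()
writer = codecs.getwriter("ascii")(buf, errors="xmlcharrefreplace")

# Text node: unencodable characters become character references.
writer.write("g\u00fcrk")

# Comment: character references are not allowed, so switch to strict.
writer.errors = "strict"
try:
    writer.write("<!-- g\u00fcrk -->")
    comment_failed = False
except UnicodeEncodeError:
    comment_failed = True

text_node_bytes = buf.getvalue()
```

The second write raises before anything reaches the stream, so the
buffer holds only the text node.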

BTW, should we continue the discussion in the i18n SIG
mailing list? An email program is much more comfortable than
a HTML textarea! ;)


----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2001-06-12 11:59

Message:
Logged In: YES 
user_id=89016

How the callbacks work:

A PyObject * named errors is passed in. This may be NULL,
Py_None, 'strict', u'strict', 'ignore', u'ignore',
'replace', u'replace' or a callable object.
PyCodec_EncodeHandlerForObject maps all of these objects to
one of the three builtin error callbacks
PyCodec_RaiseEncodeErrors (raises an exception),
PyCodec_IgnoreEncodeErrors (returns an empty replacement
string, in effect ignoring the error),
PyCodec_ReplaceEncodeErrors (returns U+FFFD, the Unicode
replacement character to signify to the encoder that it
should choose a suitable replacement character) or directly
returns errors if it is a callable object. When an
unencodable character is encountered the error handling
callback will be called with the encoding name, the original
unicode object and the error position and must return a
unicode object that will be encoded instead of the offending
character (or the callback may of course raise an
exception). U+FFFD characters in the replacement string will 
be replaced with a character that the encoder chooses ('?'
in all cases).
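
A rough sketch of that mapping in Python (not the C code from the
patch). The function names below are hypothetical lower-case
counterparts of the PyCodec_... functions, and treating NULL/None as
'strict' is an assumption:

```python
def raise_encode_errors(encoding, text, pos):
    """Hypothetical 'strict' handler: always raises."""
    raise UnicodeError(
        "%s: cannot encode character %r at position %d"
        % (encoding, text[pos], pos))


def ignore_encode_errors(encoding, text, pos):
    """Hypothetical 'ignore' handler: empty replacement string."""
    return ""


def replace_encode_errors(encoding, text, pos):
    """Hypothetical 'replace' handler: U+FFFD tells the encoder to
    pick its own replacement character."""
    return "\ufffd"


_BUILTIN_HANDLERS = {
    "strict": raise_encode_errors,
    "ignore": ignore_encode_errors,
    "replace": replace_encode_errors,
}


def encode_handler_for_object(errors):
    """Map None, a handler name or a callable to a handler callable."""
    if errors is None:
        # Assumption: a missing errors argument means 'strict'.
        return raise_encode_errors
    if isinstance(errors, str):
        try:
            return _BUILTIN_HANDLERS[errors]
        except KeyError:
            raise ValueError("unknown error handling name: %r" % (errors,))
    if callable(errors):
        # A callable is returned unchanged.
        return errors
    raise TypeError("errors must be None, a string or a callable")
```

With this, old-style string arguments and new-style callables go
through the same code path in the encoder.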

The implementation of the loop through the string is done in
the following way. A stack with two strings is kept and the
loop always encodes a character from the string at the
stacktop. If an error is encountered and the stack has only
one entry (during encoding of the original string) the
callback is called and the unicode object returned is pushed
on the stack, so the encoding continues with the replacement
string. If the stack has two entries when an error is
encountered, the replacement string itself has an
unencodable character and a normal exception is raised. When
the encoder has reached the end of its current string there
are two possibilities: when the stack contains two entries,
this was the replacement string, so the replacement string
will be popped from the stack and encoding continues with
the next character from the original string. If the stack
had only one entry, encoding is finished.
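
The loop can be sketched in Python roughly as follows. This is an
illustration for ASCII only, not the code from the patch; the name
encode_ascii is made up, and the callback takes (encoding, original
string, position) as described above:

```python
def encode_ascii(text, handler):
    """Encode text to ASCII, asking handler for replacement strings."""
    out = bytearray()
    # Stack of [string, position] entries: the original string at the
    # bottom and, while it is being encoded, a replacement string on top.
    stack = [[text, 0]]
    while stack:
        current, pos = stack[-1]
        if pos == len(current):
            # End of the current string: drop it. If it was the
            # replacement, encoding resumes in the original string;
            # if it was the original, the loop ends.
            stack.pop()
            continue
        ch = current[pos]
        stack[-1][1] = pos + 1
        if ord(ch) < 128:
            out.append(ord(ch))
        elif len(stack) == 1:
            # Error in the original string: push the replacement.
            stack.append([handler("ascii", text, pos), 0])
        elif ch == "\ufffd":
            # U+FFFD in a replacement: the encoder chooses '?'.
            out.append(ord("?"))
        else:
            # The replacement itself is unencodable: give up.
            raise UnicodeError(
                "unencodable character %r in replacement string" % ch)
    return bytes(out)


xml_bytes = encode_ascii(
    "a\u00e4b", lambda enc, uni, pos: "&#x%x;" % ord(uni[pos]))
```

Because the stack never grows beyond two entries, a callback cannot
trigger itself recursively.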

(I hope that's enough explanation of the API and implementation)

I have renamed the static ...121 function to all lowercase
names.

BTW, I guess PyUnicode_EncodeUnicodeEscape could be
reimplemented as PyUnicode_EncodeASCII with a \uxxxx
replacement callback.

PyCodec_RaiseEncodeErrors, PyCodec_IgnoreEncodeErrors,
PyCodec_ReplaceEncodeErrors are globally visible because
they have to be available in _codecsmodule.c to wrap them as
Python function objects, but they can't be implemented in
_codecsmodule, because they need to be available to the
encoders in unicodeobject.c (through
PyCodec_EncodeHandlerForObject), but importing the codecs
module might result in an endless recursion, because
importing a module requires unpickling of the bytecode,
which might require decoding utf8, which ... (but this will
only happen, if we implement the same mechanism for the
decoding API)

I have not touched PyUnicode_TranslateCharmap yet,
should this function also support error callbacks? Why would
one want to insert None into the mapping to call the callback?

A remaining problem is how to implement decoding error
callbacks. In Python 2.1 encoding and decoding errors are
handled in the same way with a string value. But with
callbacks it doesn't make sense to use the same callback for
encoding and decoding (like codecs.StreamReaderWriter and
codecs.StreamRecoder do). Decoding callbacks have a
different API. Which arguments should be passed to the
decoding callback, and what is the decoding callback
supposed to do?


----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2001-06-12 11:00

Message:
Logged In: YES 
user_id=38388

About the Py_UNICODE*data, int size APIs:
Ok, point taken.

In general, I think we ought to keep the callback feature as
open as possible, so passing in pointers and sizes would not
be very useful.

BTW, could you summarize how the callback works in a few
lines ?

About _Encode121: I'd name this _EncodeUCS1 since that's
what it is ;-)

About the new functions: I was referring to the new static
functions which you gave PyUnicode_... names. If these are
not supposed to turn into non-static functions, I'd rather
have them use lower case names (since that's how the Python
internals work too -- most of the times).


----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2001-06-12 09:56

Message:
Logged In: YES 
user_id=89016

> One thing which I don't like about your API change is that
> you removed the Py_UNICODE*data, int size style arguments
> --
> this makes it impossible to use the new APIs on non-Python
> data or data which is not available as Unicode object.

Another problem is that the callback requires a Python
object, so in the PyObject * version, the object is
incref'd and passed to the callback. The
Py_UNICODE*/int version would have to create a new Unicode
object from the data.


----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2001-06-12 09:32

Message:
Logged In: YES 
user_id=89016

> * please don't place more than one C statement on one line
> like in:
> """
> +               unicode = unicode2; unicodepos = unicode2pos;
> +               unicode2 = NULL; unicode2pos = 0;
> """

OK, done!

> * Comments should start with a capital letter and be
> prepended
> to the section they apply to

Fixed!

> * There should be spaces between arguments in compares
> (a == b) not (a==b)

Fixed!

> * Where does the name "...Encode121" originate ?

encode one-to-one, it implements both ASCII and latin-1
encoding.

> * module internal APIs should use lower case names (you
> converted some of these to  PyUnicode_...() -- this is
> normally reserved for APIs which are either marked as
> potential candidates for the public API or are very
> prominent in the code)

Which ones? I introduced a new function for every old one
that had a "const char *errors" argument, and a few new ones
in codecs.h. Of those, PyCodec_EncodeHandlerForObject is
vital, because it is used to map from the old string arguments
to the new function objects. PyCodec_RaiseEncodeErrors can be
used in the encoder implementation to raise an encode error,
but it could be made static in unicodeobject.h so only those
encoders implemented there have access to it.

> One thing which I don't like about your API change is that
> you removed the Py_UNICODE*data, int size style arguments --
> this makes it impossible to use the new APIs on non-Python
> data or data which is not available as Unicode object.

I looked through the code and found no situation where the
Py_UNICODE*/int version is really used, and having two
(PyObject *)s (the original and the replacement string)
instead of Py_UNICODE*/int and PyObject * made the
implementation a little easier, but I can fix that.

> Please separate the errors.c patch from this patch -- it
> seems totally unrelated to Unicode.

PyCodec_RaiseEncodeErrors uses this to have a \Uxxxx with
four hex digits. I removed it.

I'll upload a revised patch as soon as it's done.


----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2001-06-12 07:29

Message:
Logged In: YES 
user_id=38388

Thanks for the patch -- it looks very impressive !

I'll give it a try later this week. 

Some first cosmetic tidbits:
* please don't place more than one C statement on one line
like in:
"""
+               unicode = unicode2; unicodepos = unicode2pos;
+               unicode2 = NULL; unicode2pos = 0;
"""

* Comments should start with a capital letter and be
prepended
to the section they apply to

* There should be spaces between arguments in compares
(a == b) not (a==b)

* Where does the name "...Encode121" originate ?

* module internal APIs should use lower case names (you
converted some of these to  PyUnicode_...() -- this is
normally reserved for APIs which are either marked as
potential candidates for the public API or are very
prominent in the code)

One thing which I don't like about your API change is that
you removed the Py_UNICODE*data, int size style arguments --
this makes it impossible to use the new APIs on non-Python
data or data which is not available as Unicode object.

Please separate the errors.c patch from this patch -- it
seems totally unrelated to Unicode.

Thanks.


----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=432401&group_id=5470


From noreply@sourceforge.net  Wed Jun 13 14:57:07 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Wed, 13 Jun 2001 06:57:07 -0700
Subject: [Patches] [ python-Patches-432401 ] unicode encoding error callbacks
Message-ID: <E15AB8t-0002ju-00@usw-sf-web3.sourceforge.net>

Patches item #432401, was updated on 2001-06-12 06:43
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=432401&group_id=5470

Category: core (C code)
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Walter Dörwald (doerwalter)
Assigned to: M.-A. Lemburg (lemburg)
Summary: unicode encoding error callbacks

Initial Comment:
This patch adds unicode error handling callbacks to the
encode functionality. With this patch it's possible to
not only pass 'strict', 'ignore' or 'replace' as the
errors argument to encode, but also a callable
function, that will be called with the encoding name,
the original unicode object and the position of the
unencodable character. The callback must return a
replacement unicode object that will be encoded instead
of the original character.

For example replacing unencodable characters with XML
character references can be done in the following way.

u"aäoöuüß".encode(
   "ascii",
   lambda enc, uni, pos: u"&#x%x;" % ord(uni[pos])
)


----------------------------------------------------------------------

>Comment By: Walter Dörwald (doerwalter)
Date: 2001-06-13 06:57

Message:
Logged In: YES 
user_id=89016

> > [...]
> > raise an exception). U+FFFD characters in the replacement
> > string will be replaced with a character that the encoder
> > chooses ('?' in all cases).
>
> Nice.

But the special casing of U+FFFD makes the interface somewhat
less clean than it could be. It was only done to be 100%
backwards compatible. With the original "replace" error
handling the codec chose the replacement character. But as
far as I can tell none of the codecs uses anything other
than '?', so I guess we could change the replace handler
to always return u'?'. This would make the implementation a
little bit simpler, but the explanation of the callback
feature *a lot* simpler. And if you still want to handle
an unencodable U+FFFD, you can write a special callback for
that, e.g.

def FFFDreplace(enc, uni, pos):
    if uni[pos] == u"\ufffd":
        return u"?"
    else:
        raise UnicodeError(...)

> > The implementation of the loop through the string is done
> > in the following way. A stack with two strings is kept
> > and the loop always encodes a character from the string
> > at the stacktop. If an error is encountered and the stack
> > has only one entry (during encoding of the original string)
> > the callback is called and the unicode object returned is
> > pushed on the stack, so the encoding continues with the
> > replacement string. If the stack has two entries when an
> > error is encountered, the replacement string itself has
> > an unencodable character and a normal exception is raised.
> > When the encoder has reached the end of its current string
> > there are two possibilities: when the stack contains two
> > entries, this was the replacement string, so the replacement
> > string will be popped from the stack and encoding continues
> > with the next character from the original string. If the
> > stack had only one entry, encoding is finished.
>
> Very elegant solution !

I'll put it as a comment in the source.

> > (I hope that's enough explanation of the API and
> > implementation)
>
> Could you add these docs to the Misc/unicode.txt file ? I
> will eventually take that file and turn it into a PEP which
> will then serve as general documentation for these things.

I could, but first we should work out how the decoding
callback API will work.

> > I have renamed the static ...121 function to all lowercase
> > names.
>
> Ok.
>
> > BTW, I guess PyUnicode_EncodeUnicodeEscape could be
> > reimplemented as PyUnicode_EncodeASCII with a \uxxxx
> > replacement callback.
>
> Hmm, wouldn't that result in a slowdown ? If so, I'd rather
> leave the special encoder in place, since it is being used a
> lot in Python and probably some applications too.

It would be a slowdown. But callbacks open many possibilities.

For example:

   Why can't I print u"gürk"?

is probably one of the most frequently asked questions in
comp.lang.python. For printing Unicode stuff, print could be
extended to use an error handling callback for Unicode
strings (or objects where __str__ or tp_str returns a
Unicode object) instead of using str() which always returns
an 8bit string and uses strict encoding. There might even be a
sys.setprintencodehandler()/sys.getprintencodehandler()

> [...]
> I think it would be worthwhile to rename the callbacks to
> include "Unicode" somewhere, e.g.
> PyCodec_UnicodeReplaceEncodeErrors(). It's a long name, but
> then it points out the application field of the callback
> rather well. Same for the callbacks exposed through the
> _codecsmodule.

OK, done (and PyCodec_XMLCharRefReplaceUnicodeEncodeErrors
really is a long name ;))

> > I have not touched PyUnicode_TranslateCharmap yet,
> > should this function also support error callbacks? Why
> > would one want the insert None into the mapping to call
> > the callback?
>
> 1. Yes.
> 2. The user may want to e.g. restrict usage of certain
> character ranges. In this case the codec would be used to
> verify the input and an exception would indeed be useful
> (e.g. say you want to restrict input to Hangul + ASCII).

OK, do we want TranslateCharmap to work exactly like encoding,
i.e. in case of an error should the returned replacement
string again be mapped through the translation mapping or
should it be copied to the output directly? The former would
be more in line with encoding, but IMHO the latter would
be much more useful.
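
To make the two alternatives concrete, here is a rough Python sketch.
The function translate, its remap_replacement flag and the "charmap"
name passed to the callback are made up for illustration, and the
mapping is simplified to map ordinals to strings or None:

```python
def translate(text, mapping, handler, remap_replacement):
    """Translate text through mapping; None entries call handler.

    remap_replacement=True sends the replacement through the mapping
    again (as encoding does); False copies it to the output directly.
    """
    out = []
    for pos, ch in enumerate(text):
        # Characters missing from the mapping are copied unchanged.
        target = mapping.get(ord(ch), ch)
        if target is not None:
            out.append(target)
            continue
        replacement = handler("charmap", text, pos)
        if not remap_replacement:
            out.append(replacement)
            continue
        for rch in replacement:
            rtarget = mapping.get(ord(rch), rch)
            if rtarget is None:
                # Same rule as for encoding: an error inside the
                # replacement string is a normal exception.
                raise UnicodeError(
                    "untranslatable character %r in replacement" % rch)
            out.append(rtarget)
    return "".join(out)


mapping = {ord("x"): None, ord("a"): "A"}
handler = lambda name, uni, pos: "<a>"
copied = translate("axb", mapping, handler, remap_replacement=False)
remapped = translate("axb", mapping, handler, remap_replacement=True)
```

In the direct-copy variant the callback has full control over the
output; in the remapping variant its result is still subject to the
mapping.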

BTW, when I implement it I can implement patch #403100
("Multicharacter replacements in PyUnicode_TranslateCharmap")
along the way.

Should the old TranslateCharmap map to the new TranslateCharmapEx
and inherit the "multicharacter replacement" feature, or
should I leave it as it is?

> > A remaining problem is how to implement decoding error
> > callbacks. In Python 2.1 encoding and decoding errors
> > are handled in the same way with a string value. But
> > with callbacks it doesn't make sense to use the same
> > callback for encoding and decoding (like
> > codecs.StreamReaderWriter and codecs.StreamRecoder do).
> > Decoding callbacks have a different API. Which arguments
> > should be passed to the decoding callback, and what is
> > the decoding callback supposed to do?
>
> I'd suggest adding another set of
> PyCodec_UnicodeDecode...() APIs for this. We'd then have
> to augment the base classes of the StreamCodecs to provide
> two attributes for .errors with a fallback solution for
> the string case (i.e. "strict" can still be used for both
> directions).

Sounds good. Now what is the decoding callback supposed to
do? I guess it will be called in the same way as the
encoding callback, i.e. with encoding name, original string
and position of the error. It might return a Unicode string
(i.e. an object of the decoding target type) that will be
emitted from the codec instead of the one offending byte. Or
it might return a tuple with a replacement Unicode object
and a resynchronisation offset, e.g. returning (u"?", 1)
means emit a '?' and skip the offending character. But to
make the offset really useful the callback has to know
something about the encoding; perhaps the codec should be
allowed to pass an additional state object to the callback?

Maybe the same should be added to the encoding callbacks too?
Maybe the encoding callback should be able to tell the
encoder if the replacement returned should be reencoded
(in which case it's a Unicode object), or directly emitted
(in which case it's an 8-bit string)?

> > One additional note: It is vital that errors is an
> > assignable attribute of the StreamWriter.
>
> It is already !

I know, but IMHO it should be documented that an assignable
errors attribute must be supported as part of the official
codec API.

Misc/unicode.txt is not clear on that:
"""
It is not required by the Unicode implementation to use 
these base classes, only the interfaces must match; this 
allows writing Codecs as extension types.
"""

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2001-06-13 01:05

Message:
Logged In: YES 
user_id=38388

> How the callbacks work:
> 
> A PyObject * named errors is passed in. This may be NULL,
> Py_None, 'strict', u'strict', 'ignore', u'ignore',
> 'replace', u'replace' or a callable object.
> PyCodec_EncodeHandlerForObject maps all of these objects
> to one of the three builtin error callbacks
> PyCodec_RaiseEncodeErrors (raises an exception),
> PyCodec_IgnoreEncodeErrors (returns an empty replacement
> string, in effect ignoring the error),
> PyCodec_ReplaceEncodeErrors (returns U+FFFD, the Unicode
> replacement character to signify to the encoder that it
> should choose a suitable replacement character) or
> directly returns errors if it is a callable object. When
> an unencodable character is encountered the error handling
> callback will be called with the encoding name, the
> original unicode object and the error position and must
> return a unicode object that will be encoded instead of
> the offending character (or the callback may of course
> raise an exception). U+FFFD characters in the replacement
> string will be replaced with a character that the encoder
> chooses ('?' in all cases).

Nice.
 
> The implementation of the loop through the string is done
> in the following way. A stack with two strings is kept and
> the loop always encodes a character from the string at the
> stacktop. If an error is encountered and the stack has
> only one entry (during encoding of the original string)
> the callback is called and the unicode object returned is
> pushed on the stack, so the encoding continues with the
> replacement string. If the stack has two entries when an
> error is encountered, the replacement string itself has an
> unencodable character and a normal exception is raised.
> When the encoder has reached the end of its current string
> there are two possibilities: when the stack contains two
> entries, this was the replacement string, so the
> replacement string will be popped from the stack and
> encoding continues with the next character from the
> original string. If the stack had only one entry, encoding
> is finished.

Very elegant solution !
 
> (I hope that's enough explanation of the API and
> implementation)

Could you add these docs to the Misc/unicode.txt file ? I
will eventually take that file and turn it into a PEP which
will then serve as general documentation for these things.
 
> I have renamed the static ...121 function to all lowercase
> names.

Ok.
 
> BTW, I guess PyUnicode_EncodeUnicodeEscape could be
> reimplemented as PyUnicode_EncodeASCII with a \uxxxx
> replacement callback.

Hmm, wouldn't that result in a slowdown ? If so, I'd rather
leave the special encoder in place, since it is being used a
lot in Python and probably some applications too.
 
> PyCodec_RaiseEncodeErrors, PyCodec_IgnoreEncodeErrors,
> PyCodec_ReplaceEncodeErrors are globally visible because
> they have to be available in _codecsmodule.c to wrap them
> as Python function objects, but they can't be implemented
> in _codecsmodule, because they need to be available to the
> encoders in unicodeobject.c (through
> PyCodec_EncodeHandlerForObject), but importing the codecs
> module might result in an endless recursion, because
> importing a module requires unpickling of the bytecode,
> which might require decoding utf8, which ... (but this
> will only happen, if we implement the same mechanism for
> the decoding API)

I think that codecs.c is the right place for these APIs.
_codecsmodule.c is only meant as Python access wrapper for
the internal codecs and nothing more. 

One thing I noted about the callbacks: they assume that they
will always get Unicode objects as input. This is certainly
not true in the general case (it is for the codecs you touch
in the patch). 

I think it would be worthwhile to rename the callbacks to
include "Unicode" somewhere, e.g.
PyCodec_UnicodeReplaceEncodeErrors(). It's a long name, but
then it points out the application field of the callback
rather well. Same for the callbacks exposed through the
_codecsmodule.

> I have not touched PyUnicode_TranslateCharmap yet,
> should this function also support error callbacks? Why
> would one want to insert None into the mapping to call the
> callback?

1. Yes.
2. The user may want to e.g. restrict usage of certain
character ranges. In this case the codec would be used to
verify the input and an exception would indeed be useful
(e.g. say you want to restrict input to Hangul + ASCII).
 
> A remaining problem is how to implement decoding error
> callbacks. In Python 2.1 encoding and decoding errors are
> handled in the same way with a string value. But with
> callbacks it doesn't make sense to use the same callback
> for encoding and decoding (like codecs.StreamReaderWriter
> and codecs.StreamRecoder do). Decoding callbacks have a
> different API. Which arguments should be passed to the
> decoding callback, and what is the decoding callback
> supposed to do?

I'd suggest adding another set of PyCodec_UnicodeDecode...()
APIs for this. We'd then have to augment the base classes of
the StreamCodecs to provide two attributes for .errors with
a fallback solution for the string case (i.e. "strict" can
still be used for both directions).

> One additional note: It is vital that errors is an
> assignable attribute of the StreamWriter.

It is already !
 
> Consider the XML example: For writing an XML DOM tree one
> StreamWriter object is used. When a text node is written,
> the error handling has to be set to
> codecs.xmlreplace_encode_errors, but inside a comment or
> processing instruction replacing unencodable characters
> with charrefs is not possible, so here
> codecs.raise_encode_errors should be used (or better a
> custom error handler that raises an error that says
> "sorry, you can't have unencodable characters inside a
> comment")

Sure.
 
> BTW, should we continue the discussion in the i18n SIG
> mailing list? An email program is much more comfortable
> than a HTML textarea! ;)

I'd rather keep the discussions on this patch here --
forking it off to the i18n sig will make it very hard to
follow up on it. (This HTML area is indeed damn small ;-)
 

----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2001-06-12 12:18

Message:
Logged In: YES 
user_id=89016

One additional note: It is vital that errors is an
assignable attribute of the StreamWriter. 

Consider the XML example: For writing an XML DOM tree one
StreamWriter object is used. When a text node is written,
the error handling has to be set to
codecs.xmlreplace_encode_errors, but inside a comment or
processing instruction replacing unencodable characters with
charrefs is not possible, so here codecs.raise_encode_errors
should be used (or better a custom error handler that raises
an error that says "sorry, you can't have unencodable
characters inside a comment")
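
To illustrate why the assignable attribute matters, here is a
small sketch written against the codecs module of present-day
Python 3. It uses the string handler names "xmlcharrefreplace"
and "strict" from today's standard library as stand-ins for the
codecs.xmlreplace_encode_errors and codecs.raise_encode_errors
callables proposed in this patch; that substitution is an
assumption made for the example, not part of the patch.

```python
import codecs
import io

buf = io.BytesIO()
writer = codecs.getwriter("ascii")(buf)

# Text node: unencodable characters become character references.
writer.errors = "xmlcharrefreplace"
writer.write(u"<p>g\xfcrk</p>")

# Comment: character references are not allowed there, so switch
# the same writer to strict handling and let the error surface.
writer.errors = "strict"
try:
    writer.write(u"<!-- g\xfcrk -->")
    comment_ok = True
except UnicodeError:
    comment_ok = False

result = buf.getvalue()
```

The same writer object serves both cases; only its errors
attribute changes between the two write() calls.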

BTW, should we continue the discussion in the i18n SIG
mailing list? An email program is much more comfortable than
a HTML textarea! ;)


----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2001-06-12 11:59

Message:
Logged In: YES 
user_id=89016

How the callbacks work:

A PyObject * named errors is passed in. This may be NULL,
Py_None, 'strict', u'strict', 'ignore', u'ignore',
'replace', u'replace' or a callable object.
PyCodec_EncodeHandlerForObject maps all of these objects to
one of the three builtin error callbacks:
PyCodec_RaiseEncodeErrors (raises an exception),
PyCodec_IgnoreEncodeErrors (returns an empty replacement
string, in effect ignoring the error) or
PyCodec_ReplaceEncodeErrors (returns U+FFFD, the Unicode
replacement character, to signify to the encoder that it
should choose a suitable replacement character). If errors
is a callable object, it is returned directly. When an
unencodable character is encountered, the error handling
callback will be called with the encoding name, the original
unicode object and the error position, and must return a
unicode object that will be encoded instead of the offending
character (or the callback may of course raise an
exception). U+FFFD characters in the replacement string will
be replaced with a character that the encoder chooses ('?'
in all cases).
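
The mapping described above could be modelled in pure Python
roughly as follows. This is only a sketch: the function names
are hypothetical stand-ins for the C functions in the patch,
and treating None (NULL/Py_None on the C side) as strict
handling is an assumption based on the usual default.

```python
# Hypothetical pure-Python stand-ins for the C-level callbacks;
# names and details are assumptions, not the patch's code.

def raise_errors(enc, uni, pos):
    # Counterpart of PyCodec_RaiseEncodeErrors: always raises.
    raise UnicodeError(
        "%s: can't encode character at position %d" % (enc, pos))

def ignore_errors(enc, uni, pos):
    # Counterpart of PyCodec_IgnoreEncodeErrors: empty replacement.
    return u""

def replace_errors(enc, uni, pos):
    # Counterpart of PyCodec_ReplaceEncodeErrors: U+FFFD asks the
    # encoder to pick its own replacement character.
    return u"\ufffd"

_BUILTIN = {
    "strict": raise_errors,
    "ignore": ignore_errors,
    "replace": replace_errors,
}

def handler_for_object(errors):
    # Assumption: a missing errors argument means strict handling.
    if errors is None:
        return raise_errors
    # A callable is returned unchanged.
    if callable(errors):
        return errors
    try:
        return _BUILTIN[errors]
    except KeyError:
        raise ValueError("unknown error handling %r" % (errors,))
```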

The implementation of the loop through the string is done in
the following way. A stack with two strings is kept and the
loop always encodes a character from the string at the
stacktop. If an error is encountered and the stack has only
one entry (during encoding of the original string) the
callback is called and the unicode object returned is pushed
on the stack, so the encoding continues with the replacement
string. If the stack has two entries when an error is
encountered, the replacement string itself has an
unencodable character and a normal exception is raised. When
the encoder has reached the end of its current string there
are two possibilities: when the stack contains two entries,
this was the replacement string, so the replacement string
will be popped from the stack and encoding continues with
the next character from the original string. If the stack
had only one entry, encoding is finished.
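
A simplified pure-Python model of this loop might look like the
following. It is a sketch, not the C code from the patch: the
name encode_with_callback is hypothetical, the U+FFFD special
case is left out, and encoding one character at a time is only
reasonable for stateless codecs such as ASCII or Latin-1.

```python
def encode_with_callback(uni, encoding, callback):
    """Encode uni character by character, asking callback on errors.

    Hypothetical model of the two-entry stack described above.
    """
    out = []
    stack = [[uni, 0]]          # each entry: [string, position]
    while stack:
        text, pos = stack[-1]
        if pos == len(text):
            # Current string exhausted: drop it. If it was the
            # replacement, the original string is on top again.
            stack.pop()
            continue
        stack[-1][1] = pos + 1  # advance past this character
        try:
            out.append(text[pos].encode(encoding))
        except UnicodeError:
            if len(stack) == 2:
                # The replacement itself is unencodable: give up.
                raise
            stack.append([callback(encoding, text, pos), 0])
    return b"".join(out)

# The XML character reference callback from the initial comment.
xml = encode_with_callback(
    u"a\xe4o\xf6u\xfc\xdf", "ascii",
    lambda enc, uni, pos: u"&#x%x;" % ord(uni[pos]))
```

Because the stack never grows beyond two entries, a callback
that returns an unencodable replacement cannot cause endless
recursion.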

(I hope that's enough explanation of the API and implementation)

I have renamed the static ...121 function to all lowercase
names.

BTW, I guess PyUnicode_EncodeUnicodeEscape could be
reimplemented as PyUnicode_EncodeASCII with a \uxxxx
replacement callback.

PyCodec_RaiseEncodeErrors, PyCodec_IgnoreEncodeErrors,
PyCodec_ReplaceEncodeErrors are globally visible because
they have to be available in _codecsmodule.c to wrap them as
Python function objects, but they can't be implemented in
_codecsmodule, because they need to be available to the
encoders in unicodeobject.c (through
PyCodec_EncodeHandlerForObject), but importing the codecs
module might result in an endless recursion, because
importing a module requires unpickling of the bytecode,
which might require decoding utf8, which ... (but this will
only happen, if we implement the same mechanism for the
decoding API)

I have not touched PyUnicode_TranslateCharmap yet, 
should this function also support error callbacks? Why would
one want to insert None into the mapping to call the callback?

A remaining problem is how to implement decoding error
callbacks. In Python 2.1 encoding and decoding errors are
handled in the same way with a string value. But with
callbacks it doesn't make sense to use the same callback for
encoding and decoding (like codecs.StreamReaderWriter and
codecs.StreamRecoder do). Decoding callbacks have a
different API. Which arguments should be passed to the
decoding callback, and what is the decoding callback
supposed to do?


----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2001-06-12 11:00

Message:
Logged In: YES 
user_id=38388

About the Py_UNICODE*data, int size APIs:
Ok, point taken.

In general, I think we ought to keep the callback feature as
open as possible, so passing in pointers and sizes would not
be very useful.

BTW, could you summarize how the callback works in a few
lines ?

About _Encode121: I'd name this _EncodeUCS1 since that's
what it is ;-)

About the new functions: I was referring to the new static
functions which you gave PyUnicode_... names. If these are
not supposed to turn into non-static functions, I'd rather
have them use lower case names (since that's how the Python
internals work too -- most of the times).


----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2001-06-12 09:56

Message:
Logged In: YES 
user_id=89016

> One thing which I don't like about your API change is that
> you removed the Py_UNICODE*data, int size style arguments
> --
> this makes it impossible to use the new APIs on non-Python
> data or data which is not available as Unicode object.

Another problem is, that the callback requires a Python
object, so in the PyObject *version, the refcount is
incref'd and the object is passed to the callback. The
Py_UNICODE*/int version would have to create a new Unicode
object from the data.


----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2001-06-12 09:32

Message:
Logged In: YES 
user_id=89016

> * please don't place more than one C statement on one line
> like in:
> """
> +               unicode = unicode2; unicodepos = unicode2pos;
> +               unicode2 = NULL; unicode2pos = 0;
> """

OK, done!

> * Comments should start with a capital letter and be
> prepended to the section they apply to

Fixed!

> * There should be spaces between arguments in compares
> (a == b) not (a==b)

Fixed!

> * Where does the name "...Encode121" originate ?

encode one-to-one, it implements both ASCII and latin-1
encoding.

> * module internal APIs should use lower case names (you
> converted some of these to  PyUnicode_...() -- this is
> normally reserved for APIs which are either marked as
> potential candidates for the public API or are very
> prominent in the code)

Which ones? I introduced a new function for every old one,
that had a "const char *errors" argument, and a few new ones
in codecs.h, of those PyCodec_EncodeHandlerForObject is
vital, because it is used to map the old string arguments to
the new function objects. PyCodec_RaiseEncodeErrors can be
used in the encoder implementation to raise an encode error,
but it could be made static in unicodeobject.h so only those
encoders implemented there have access to it.

> One thing which I don't like about your API change is that
> you removed the Py_UNICODE*data, int size style arguments --
> this makes it impossible to use the new APIs on non-Python
> data or data which is not available as Unicode object.

I looked through the code and found no situation where the
Py_UNICODE*/int version is really used and having two
(PyObject *)s (the original and the replacement string),
instead of UNICODE*/int and PyObject * made the
implementation a little easier, but I can fix that.

> Please separate the errors.c patch from this patch -- it
> seems totally unrelated to Unicode.

PyCodec_RaiseEncodeErrors uses this to have a \Uxxxx with
four hex digits. I removed it.

I'll upload a revised patch as soon as it's done.


----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2001-06-12 07:29

Message:
Logged In: YES 
user_id=38388

Thanks for the patch -- it looks very impressive !.

I'll give it a try later this week. 

Some first cosmetic tidbits:
* please don't place more than one C statement on one line
like in:
"""
+               unicode = unicode2; unicodepos = unicode2pos;
+               unicode2 = NULL; unicode2pos = 0;
"""

* Comments should start with a capital letter and be
prepended to the section they apply to

* There should be spaces between arguments in compares
(a == b) not (a==b)

* Where does the name "...Encode121" originate ?

* module internal APIs should use lower case names (you
converted some of these to  PyUnicode_...() -- this is
normally reserved for APIs which are either marked as
potential candidates for the public API or are very
prominent in the code)

One thing which I don't like about your API change is that
you removed the Py_UNICODE*data, int size style arguments --
this makes it impossible to use the new APIs on non-Python
data or data which is not available as Unicode object.

Please separate the errors.c patch from this patch -- it
seems totally unrelated to Unicode.

Thanks.


----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=432401&group_id=5470


From noreply@sourceforge.net  Wed Jun 13 16:49:05 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Wed, 13 Jun 2001 08:49:05 -0700
Subject: [Patches] [ python-Patches-432401 ] unicode encoding error callbacks
Message-ID: <E15ACtF-0004pB-00@usw-sf-web3.sourceforge.net>

Patches item #432401, was updated on 2001-06-12 06:43
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=432401&group_id=5470

Category: core (C code)
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Walter Dörwald (doerwalter)
Assigned to: M.-A. Lemburg (lemburg)
Summary: unicode encoding error callbacks

Initial Comment:
This patch adds unicode error handling callbacks to the
encode functionality. With this patch it's possible to
not only pass 'strict', 'ignore' or 'replace' as the
errors argument to encode, but also a callable
function, that will be called with the encoding name,
the original unicode object and the position of the
unencodable character. The callback must return a
replacement unicode object that will be encoded instead
of the original character.

For example replacing unencodable characters with XML
character references can be done in the following way.

u"aäoöuüß".encode(
   "ascii",
   lambda enc, uni, pos: u"&#x%x;" % ord(uni[pos])
)


----------------------------------------------------------------------

>Comment By: Walter Dörwald (doerwalter)
Date: 2001-06-13 08:49

Message:
Logged In: YES 
user_id=89016

Guido van Rossum wrote in python-dev:

> True, the "codec" pattern can be used for other 
> encodings than Unicode.  But it seems to me that the
> entire codecs architecture is rather strongly geared
> towards en/decoding Unicode, and it's not clear
> how well other codecs fit in this pattern (e.g. I 
> noticed that all the non-Unicode codecs ignore the 
> error handling parameter or assert that
> it is set to 'strict').

I noticed that too. Asserting that errors=='strict' would 
mean that the encoder is not able to deal in any other way 
with unencodable stuff than by raising an error. But that 
is not the problem here, because for zlib, base64, quopri, 
hex and uu encoding there can be no unencodable characters. 
The encoders can simply ignore the errors parameter. Should 
I remove the asserts from those codecs and change the 
docstrings accordingly, or will this be done separately?


----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2001-06-13 06:57

Message:
Logged In: YES 
user_id=89016

> > [...]
> > raise an exception). U+FFFD characters in the
> > replacement string will be replaced with a character
> > that the encoder chooses ('?' in all cases).
>
> Nice.

But the special casing of U+FFFD makes the interface
somewhat less clean than it could be. It was only done to be
100% backwards compatible. With the original "replace" error
handling the codec chose the replacement character. But as
far as I can tell none of the codecs uses anything other
than '?', so I guess we could change the replace handler
to always return u'?'. This would make the implementation a
little bit simpler, but the explanation of the callback
feature *a lot* simpler. And if you still want to handle
an unencodable U+FFFD, you can write a special callback for
that, e.g.

def FFFDreplace(enc, uni, pos):
    # Replace an unencodable U+FFFD with '?'; reject anything else.
    if uni[pos] == u"\ufffd":
        return u"?"
    else:
        raise UnicodeError(...)

> > The implementation of the loop through the string is
> > done in the following way. A stack with two strings is
> > kept and the loop always encodes a character from the
> > string at the stacktop. If an error is encountered and
> > the stack has only one entry (during encoding of the
> > original string) the callback is called and the unicode
> > object returned is pushed on the stack, so the encoding
> > continues with the replacement string. If the stack has
> > two entries when an error is encountered, the
> > replacement string itself has an unencodable character
> > and a normal exception is raised. When the encoder has
> > reached the end of its current string there are two
> > possibilities: when the stack contains two entries, this
> > was the replacement string, so the replacement string
> > will be popped from the stack and encoding continues
> > with the next character from the original string. If the
> > stack had only one entry, encoding is finished.
>
> Very elegant solution !

I'll put it as a comment in the source.

> > (I hope that's enough explanation of the API and
> > implementation)
>
> Could you add these docs to the Misc/unicode.txt file ? I
> will eventually take that file and turn it into a PEP
> which will then serve as general documentation for these
> things.

I could, but first we should work out how the decoding
callback API will work.

> > I have renamed the static ...121 function to all
> > lowercase names.
>
> Ok.
>
> > BTW, I guess PyUnicode_EncodeUnicodeEscape could be
> > reimplemented as PyUnicode_EncodeASCII with a \uxxxx
> > replacement callback.
>
> Hmm, wouldn't that result in a slowdown ? If so, I'd
> rather leave the special encoder in place, since it is
> being used a lot in Python and probably some applications
> too.

It would be a slowdown. But callbacks open many
possibilities.

For example:

   Why can't I print u"gürk"?

is probably one of the most frequently asked questions in
comp.lang.python. For printing Unicode stuff, print could be
extended to use an error handling callback for Unicode
strings (or objects where __str__ or tp_str returns a
Unicode object) instead of using str(), which always returns
an 8-bit string and uses strict encoding. There might even
be a sys.setprintencodehandler()/sys.getprintencodehandler().

> [...]
> I think it would be worthwhile to rename the callbacks to
> include "Unicode" somewhere, e.g.
> PyCodec_UnicodeReplaceEncodeErrors(). It's a long name,
> but then it points out the application field of the
> callback rather well. Same for the callbacks exposed
> through the _codecsmodule.

OK, done (and PyCodec_XMLCharRefReplaceUnicodeEncodeErrors
really is a long name ;))

> > I have not touched PyUnicode_TranslateCharmap yet,
> > should this function also support error callbacks? Why
> > would one want to insert None into the mapping to call
> > the callback?
>
> 1. Yes.
> 2. The user may want to e.g. restrict usage of certain
> character ranges. In this case the codec would be used to
> verify the input and an exception would indeed be useful
> (e.g. say you want to restrict input to Hangul + ASCII).

OK, do we want TranslateCharmap to work exactly like
encoding, i.e. in case of an error, should the returned
replacement string again be mapped through the translation
mapping, or should it be copied to the output directly? The
former would be more in line with encoding, but IMHO the
latter would be much more useful.

BTW, when I implement it I can implement patch #403100
("Multicharacter replacements in PyUnicode_TranslateCharmap")
along the way.

Should the old TranslateCharmap map to the new
TranslateCharmapEx and inherit the "multicharacter
replacement" feature, or should I leave it as it is?

> > A remaining problem is how to implement decoding error
> > callbacks. In Python 2.1 encoding and decoding errors
> > are handled in the same way with a string value. But
> > with callbacks it doesn't make sense to use the same
> > callback for encoding and decoding (like
> > codecs.StreamReaderWriter and codecs.StreamRecoder do).
> > Decoding callbacks have a different API. Which arguments
> > should be passed to the decoding callback, and what is
> > the decoding callback supposed to do?
>
> I'd suggest adding another set of
> PyCodec_UnicodeDecode...() APIs for this. We'd then have
> to augment the base classes of the StreamCodecs to provide
> two attributes for .errors with a fallback solution for
> the string case (i.e. "strict" can still be used for both
> directions).

Sounds good. Now what is the decoding callback supposed to
do? I guess it will be called in the same way as the
encoding callback, i.e. with encoding name, original string
and position of the error. It might return a Unicode string
(i.e. an object of the decoding target type) that will be
emitted from the codec instead of the one offending byte. Or
it might return a tuple with a replacement Unicode object
and a resynchronisation offset, e.g. returning (u"?", 1)
means emit a '?' and skip the offending character. But to
make the offset really useful the callback has to know
something about the encoding; perhaps the codec should be
allowed to pass an additional state object to the callback?
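
To make the two return conventions concrete, here is a rough
pure-Python sketch. Everything in it is hypothetical: the name
decode_with_callback, the byte-at-a-time loop (which only suits
single-byte codecs such as ASCII) and the rule that the offset
must be at least 1 are assumptions for illustration, not a
proposal for the final API.

```python
def decode_with_callback(data, encoding, callback):
    """Decode bytes one byte at a time, asking callback on errors.

    callback(encoding, data, pos) returns either a replacement
    string or a (replacement, skip) tuple, where skip is the
    number of input bytes to step over.
    """
    out = []
    pos = 0
    while pos < len(data):
        try:
            out.append(data[pos:pos + 1].decode(encoding))
            pos += 1
        except UnicodeError:
            result = callback(encoding, data, pos)
            if isinstance(result, tuple):
                replacement, skip = result
            else:
                # Plain string: replace just the offending byte.
                replacement, skip = result, 1
            if skip < 1:
                raise ValueError("resynchronisation offset must be >= 1")
            out.append(replacement)
            pos += skip
    return u"".join(out)

# Skip one offending byte and emit '?'.
simple = decode_with_callback(
    b"g\xfcrk", "ascii", lambda enc, data, pos: (u"?", 1))

# A callback that knows the bytes are really a two-byte UTF-8
# sequence can consume both bytes at once.
resync = decode_with_callback(
    b"g\xc3\xbcrk", "ascii", lambda enc, data, pos: (u"\xfc", 2))
```

The second call shows why the offset is only useful when the
callback knows something about the encoding: choosing 2 there
relies on knowledge the callback does not get from its
arguments alone.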

Maybe the same should be added to the encoding callbacks too?
Maybe the encoding callback should be able to tell the
encoder if the replacement returned should be reencoded
(in which case it's a Unicode object), or directly emitted
(in which case it's an 8-bit string)?

> > One additional note: It is vital that errors is an
> > assignable attribute of the StreamWriter.
>
> It is already !

I know, but IMHO it should be documented that an assignable
errors attribute must be supported as part of the official
codec API.

Misc/unicode.txt is not clear on that:
"""
It is not required by the Unicode implementation to use 
these base classes, only the interfaces must match; this 
allows writing Codecs as extension types.
"""

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2001-06-13 01:05

Message:
Logged In: YES 
user_id=38388

> How the callbacks work:
> 
> A PyObject * named errors is passed in. This may be NULL,
> Py_None, 'strict', u'strict', 'ignore', u'ignore',
> 'replace', u'replace' or a callable object.
> PyCodec_EncodeHandlerForObject maps all of these objects
> to one of the three builtin error callbacks
> PyCodec_RaiseEncodeErrors (raises an exception),
> PyCodec_IgnoreEncodeErrors (returns an empty replacement
> string, in effect ignoring the error),
> PyCodec_ReplaceEncodeErrors (returns U+FFFD, the Unicode
> replacement character to signify to the encoder that it
> should choose a suitable replacement character) or
> directly returns errors if it is a callable object. When
> an unencodable character is encountered the error handling
> callback will be called with the encoding name, the
> original unicode object and the error position and must
> return a unicode object that will be encoded instead of
> the offending character (or the callback may of course
> raise an exception). U+FFFD characters in the replacement
> string will be replaced with a character that the encoder
> chooses ('?' in all cases).

Nice.
 
> The implementation of the loop through the string is done
> in the following way. A stack with two strings is kept and
> the loop always encodes a character from the string at the
> stacktop. If an error is encountered and the stack has
> only one entry (during encoding of the original string)
> the callback is called and the unicode object returned is
> pushed on the stack, so the encoding continues with the
> replacement string. If the stack has two entries when an
> error is encountered, the replacement string itself has an
> unencodable character and a normal exception is raised.
> When the encoder has reached the end of its current string
> there are two possibilities: when the stack contains two
> entries, this was the replacement string, so the
> replacement string will be popped from the stack and
> encoding continues with the next character from the
> original string. If the stack had only one entry, encoding
> is finished.

Very elegant solution !
 
> (I hope that's enough explanation of the API and
> implementation)

Could you add these docs to the Misc/unicode.txt file ? I
will eventually take that file and turn it into a PEP which
will then serve as general documentation for these things.
 
> I have renamed the static ...121 function to all lowercase
> names.

Ok.
 
> BTW, I guess PyUnicode_EncodeUnicodeEscape could be
> reimplemented as PyUnicode_EncodeASCII with a \uxxxx
> replacement callback.

Hmm, wouldn't that result in a slowdown ? If so, I'd rather
leave the special encoder in place, since it is being used a
lot in Python and probably some applications too.
 
> PyCodec_RaiseEncodeErrors, PyCodec_IgnoreEncodeErrors,
> PyCodec_ReplaceEncodeErrors are globally visible because
> they have to be available in _codecsmodule.c to wrap them
> as Python function objects, but they can't be implemented
> in _codecsmodule, because they need to be available to the
> encoders in unicodeobject.c (through
> PyCodec_EncodeHandlerForObject), but importing the codecs
> module might result in an endless recursion, because
> importing a module requires unpickling of the bytecode,
> which might require decoding utf8, which ... (but this
> will only happen, if we implement the same mechanism for
> the decoding API)

I think that codecs.c is the right place for these APIs.
_codecsmodule.c is only meant as Python access wrapper for
the internal codecs and nothing more. 

One thing I noted about the callbacks: they assume that they
will always get Unicode objects as input. This is certainly
not true in the general case (it is for the codecs you touch
in the patch). 

I think it would be worthwhile to rename the callbacks to
include "Unicode" somewhere, e.g.
PyCodec_UnicodeReplaceEncodeErrors(). It's a long name, but
then it points out the application field of the callback
rather well. Same for the callbacks exposed through the
_codecsmodule.

> I have not touched PyUnicode_TranslateCharmap yet,
> should this function also support error callbacks? Why would
> one want to insert None into the mapping to call the callback?

1. Yes.
2. The user may want to e.g. restrict usage of certain
character ranges. In this case the codec would be used to
verify the input and an exception would indeed be useful
(e.g. say you want to restrict input to Hangul + ASCII).
 
> A remaining problem is how to implement decoding error
> callbacks. In Python 2.1 encoding and decoding errors are
> handled in the same way with a string value. But with
> callbacks it doesn't make sense to use the same callback for
> encoding and decoding (like codecs.StreamReaderWriter and
> codecs.StreamRecoder do). Decoding callbacks have a
> different API. Which arguments should be passed to the
> decoding callback, and what is the decoding callback
> supposed to do?

I'd suggest adding another set of PyCodec_UnicodeDecode...()
APIs for this. We'd then have to augment the base classes of
the StreamCodecs to provide two attributes for .errors with
a fallback solution for the string case (i.e. "strict" can
still be used for both directions).

> One additional note: It is vital that errors is an
> assignable attribute of the StreamWriter.

It is already !
 
> Consider the XML example: For writing an XML DOM tree one
> StreamWriter object is used. When a text node is written,
> the error handling has to be set to
> codecs.xmlreplace_encode_errors, but inside a comment or
> processing instruction replacing unencodable characters with
> charrefs is not possible, so here codecs.raise_encode_errors
> should be used (or better a custom error handler that raises
> an error that says "sorry, you can't have unencodable
> characters inside a comment")

Sure.
 
> BTW, should we continue the discussion in the i18n SIG
> mailing list? An email program is much more comfortable than
> a HTML textarea! ;)

I'd rather keep the discussions on this patch here --
forking it off to the i18n sig will make it very hard to
follow up on it. (This HTML area is indeed damn small ;-)
 

----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2001-06-12 12:18

Message:
Logged In: YES 
user_id=89016

One additional note: It is vital that errors is an
assignable attribute of the StreamWriter. 

Consider the XML example: For writing an XML DOM tree one
StreamWriter object is used. When a text node is written,
the error handling has to be set to
codecs.xmlreplace_encode_errors, but inside a comment or
processing instruction replacing unencodable characters with
charrefs is not possible, so here codecs.raise_encode_errors
should be used (or better a custom error handler that raises
an error that says "sorry, you can't have unencodable
characters inside a comment")
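
A minimal sketch of this use case, written for present-day
Python 3 rather than the patch under discussion. It assumes the
registered handler names "xmlcharrefreplace" and "strict" in
place of the proposed codecs.xmlreplace_encode_errors and
codecs.raise_encode_errors, and it only shows that reassigning
the errors attribute changes how later writes are handled:

```python
import codecs
import io

buffer = io.BytesIO()
writer = codecs.getwriter("ascii")(buffer, errors="strict")

# Text node: unencodable characters become character references.
writer.errors = "xmlcharrefreplace"
writer.write("<p>g\u00fcrk</p>")

# Comment: character references are not allowed, so fail instead.
writer.errors = "strict"
try:
    writer.write("<!-- g\u00fcrk -->")
except UnicodeEncodeError:
    comment_rejected = True
else:
    comment_rejected = False

output = buffer.getvalue()
```

The failed write leaves the stream untouched, because the data
is encoded before anything is written to the underlying buffer.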

BTW, should we continue the discussion in the i18n SIG
mailing list? An email program is much more comfortable than
a HTML textarea! ;)


----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2001-06-12 11:59

Message:
Logged In: YES 
user_id=89016

How the callbacks work:

A PyObject * named errors is passed in. This may be NULL,
Py_None, 'strict', u'strict', 'ignore', u'ignore',
'replace', u'replace' or a callable object.
PyCodec_EncodeHandlerForObject maps all of these objects to
one of the three builtin error callbacks
PyCodec_RaiseEncodeErrors (raises an exception),
PyCodec_IgnoreEncodeErrors (returns an empty replacement
string, in effect ignoring the error),
PyCodec_ReplaceEncodeErrors (returns U+FFFD, the Unicode
replacement character to signify to the encoder that it
should choose a suitable replacement character) or directly
returns errors if it is a callable object. When an
unencodable character is encountered, the error handling
callback will be called with the encoding name, the original
unicode object and the error position and must return a
unicode object that will be encoded instead of the offending
character (or the callback may of course raise an
exception). U+FFFD characters in the replacement string will 
be replaced with a character that the encoder chooses ('?'
in all cases).
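
The mapping described above can be sketched in Python 3 as
follows. This is an illustration only, not the C code from the
patch: the function names are hypothetical stand-ins for the
PyCodec_* callbacks, and the handling of unknown values (a
ValueError) is an assumption.

```python
REPLACEMENT_CHARACTER = "\ufffd"


def raise_encode_errors(encoding, uni, pos):
    # Stand-in for PyCodec_RaiseEncodeErrors.
    raise UnicodeError(
        "%s: can't encode character %r at position %d"
        % (encoding, uni[pos], pos)
    )


def ignore_encode_errors(encoding, uni, pos):
    # Stand-in for PyCodec_IgnoreEncodeErrors: empty replacement.
    return ""


def replace_encode_errors(encoding, uni, pos):
    # Stand-in for PyCodec_ReplaceEncodeErrors: the encoder picks
    # the actual replacement when it sees U+FFFD.
    return REPLACEMENT_CHARACTER


_BUILTIN_HANDLERS = {
    "strict": raise_encode_errors,
    "ignore": ignore_encode_errors,
    "replace": replace_encode_errors,
}


def encode_handler_for_object(errors):
    # None (NULL or Py_None in C) means strict error handling.
    if errors is None:
        return raise_encode_errors
    # A callable is used directly as the callback.
    if callable(errors):
        return errors
    try:
        return _BUILTIN_HANDLERS[errors]
    except (KeyError, TypeError):
        raise ValueError("unknown error handling: %r" % (errors,))
```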

The implementation of the loop through the string is done in
the following way. A stack with two strings is kept and the
loop always encodes a character from the string at the
stacktop. If an error is encountered and the stack has only
one entry (during encoding of the original string) the
callback is called and the unicode object returned is pushed
on the stack, so the encoding continues with the replacement
string. If the stack has two entries when an error is
encountered, the replacement string itself has an
unencodable character and a normal exception is raised. When
the encoder has reached the end of its current string there
are two possibilities: when the stack contains two entries,
this was the replacement string, so the replacement string
will be popped from the stack and encoding continues with
the next character from the original string. If the stack
had only one entry, encoding is finished.
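
The loop described above can be sketched in Python 3 like this.
It is a simplified model for ASCII only, not the C code from
the patch; the function name and the error message are made up
for illustration.

```python
def encode_ascii_with_callback(uni, callback, encoding="ascii"):
    out = bytearray()
    # Each stack entry is [string, position]; the original string
    # is at the bottom, a replacement string (if any) on top.
    stack = [[uni, 0]]
    while stack:
        text, pos = stack[-1]
        if pos == len(text):
            # End of the current string: pop it. If this was the
            # replacement, encoding resumes in the original string.
            stack.pop()
            continue
        ch = text[pos]
        stack[-1][1] = pos + 1
        if ord(ch) < 128:
            out.append(ord(ch))
        elif len(stack) == 2 and ch == "\ufffd":
            # U+FFFD in a replacement: the encoder chooses '?'.
            out.append(ord("?"))
        elif len(stack) == 1:
            # Error in the original string: ask the callback and
            # continue with the replacement string.
            stack.append([callback(encoding, text, pos), 0])
        else:
            # Error inside the replacement string itself.
            raise UnicodeError(
                "replacement string contains an unencodable character"
            )
    return bytes(out)
```

With a callback such as
lambda enc, uni, pos: "&#x%x;" % ord(uni[pos])
every unencodable character is emitted as a hexadecimal
character reference.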

(I hope that's enough explanation of the API and implementation)

I have renamed the static ...121 function to all lowercase
names.

BTW, I guess PyUnicode_EncodeUnicodeEscape could be
reimplemented as PyUnicode_EncodeASCII with a \uxxxx
replacement callback.

PyCodec_RaiseEncodeErrors, PyCodec_IgnoreEncodeErrors,
PyCodec_ReplaceEncodeErrors are globally visible because
they have to be available in _codecsmodule.c to wrap them as
Python function objects, but they can't be implemented in
_codecsmodule, because they need to be available to the
encoders in unicodeobject.c (through
PyCodec_EncodeHandlerForObject), but importing the codecs
module might result in an endless recursion, because
importing a module requires unpickling of the bytecode,
which might require decoding utf8, which ... (but this will
only happen, if we implement the same mechanism for the
decoding API)

I have not touched PyUnicode_TranslateCharmap yet, 
should this function also support error callbacks? Why would
one want to insert None into the mapping to call the callback?

A remaining problem is how to implement decoding error
callbacks. In Python 2.1 encoding and decoding errors are
handled in the same way with a string value. But with
callbacks it doesn't make sense to use the same callback for
encoding and decoding (like codecs.StreamReaderWriter and
codecs.StreamRecoder do). Decoding callbacks have a
different API. Which arguments should be passed to the
decoding callback, and what is the decoding callback
supposed to do?


----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2001-06-12 11:00

Message:
Logged In: YES 
user_id=38388

About the Py_UNICODE*data, int size APIs:
Ok, point taken.

In general, I think we ought to keep the callback feature as
open as possible, so passing in pointers and sizes would not
be very useful.

BTW, could you summarize how the callback works in a few
lines ?

About _Encode121: I'd name this _EncodeUCS1 since that's
what it is ;-)

About the new functions: I was referring to the new static
functions which you gave PyUnicode_... names. If these are
not supposed to turn into non-static functions, I'd rather
have them use lower case names (since that's how the Python
internals work too -- most of the times).


----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2001-06-12 09:56

Message:
Logged In: YES 
user_id=89016

> One thing which I don't like about your API change is that
> you removed the Py_UNICODE*data, int size style arguments --
> this makes it impossible to use the new APIs on non-Python
> data or data which is not available as Unicode object.

Another problem is, that the callback requires a Python
object, so in the PyObject *version, the refcount is
incref'd and the object is passed to the callback. The
Py_UNICODE*/int version would have to create a new Unicode
object from the data.


----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2001-06-12 09:32

Message:
Logged In: YES 
user_id=89016

> * please don't place more than one C statement on one line
> like in:
> """
> +               unicode = unicode2; unicodepos = unicode2pos;
> +               unicode2 = NULL; unicode2pos = 0;
> """

OK, done!

> * Comments should start with a capital letter and be
> prepended to the section they apply to

Fixed!

> * There should be spaces between arguments in compares
> (a == b) not (a==b)

Fixed!

> * Where does the name "...Encode121" originate ?

encode one-to-one, it implements both ASCII and latin-1
encoding.

> * module internal APIs should use lower case names (you
> converted some of these to  PyUnicode_...() -- this is
> normally reserved for APIs which are either marked as
> potential candidates for the public API or are very
> prominent in the code)

Which ones? I introduced a new function for every old one,
that had a "const char *errors" argument, and a few new ones
in codecs.h, of those PyCodec_EncodeHandlerForObject is
vital, because it is used to map the old string arguments to
the new function objects. PyCodec_RaiseEncodeErrors can be
used in the encoder implementation to raise an encode error,
but it could be made static in unicodeobject.h so only those
encoders implemented there have access to it.

> One thing which I don't like about your API change is that
> you removed the Py_UNICODE*data, int size style arguments --
> this makes it impossible to use the new APIs on non-Python
> data or data which is not available as Unicode object.

I looked through the code and found no situation where the
Py_UNICODE*/int version is really used and having two
(PyObject *)s (the original and the replacement string),
instead of UNICODE*/int and PyObject * made the
implementation a little easier, but I can fix that.

> Please separate the errors.c patch from this patch -- it
> seems totally unrelated to Unicode.

PyCodec_RaiseEncodeErrors uses this to have a \Uxxxx with
four hex digits. I removed it.

I'll upload a revised patch as soon as it's done.


----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2001-06-12 07:29

Message:
Logged In: YES 
user_id=38388

Thanks for the patch -- it looks very impressive !

I'll give it a try later this week. 

Some first cosmetic tidbits:
* please don't place more than one C statement on one line
like in:
"""
+               unicode = unicode2; unicodepos = unicode2pos;
+               unicode2 = NULL; unicode2pos = 0;
"""

* Comments should start with a capital letter and be
prepended to the section they apply to

* There should be spaces between arguments in compares
(a == b) not (a==b)

* Where does the name "...Encode121" originate ?

* module internal APIs should use lower case names (you
converted some of these to  PyUnicode_...() -- this is
normally reserved for APIs which are either marked as
potential candidates for the public API or are very
prominent in the code)

One thing which I don't like about your API change is that
you removed the Py_UNICODE*data, int size style arguments --
this makes it impossible to use the new APIs on non-Python
data or data which is not available as Unicode object.

Please separate the errors.c patch from this patch -- it
seems totally unrelated to Unicode.

Thanks.


----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=432401&group_id=5470


From noreply@sourceforge.net  Wed Jun 13 17:25:51 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Wed, 13 Jun 2001 09:25:51 -0700
Subject: [Patches] [ python-Patches-432183 ] PEP-259: skip printing newline*2
Message-ID: <E15ADSp-0005WH-00@usw-sf-web3.sourceforge.net>

Patches item #432183, was updated on 2001-06-11 13:00
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=432183&group_id=5470

Category: core (C code)
Group: None
Status: Open
>Resolution: Rejected
Priority: 5
Submitted By: Guido van Rossum (gvanrossum)
Assigned to: Nobody/Anonymous (nobody)
Summary: PEP-259: skip printing newline*2

Initial Comment:
See PEP 259 (to be checked in soon).

This suppresses the printing of an extra newline when
the last item printed is a string ending in a newline.


----------------------------------------------------------------------

>Comment By: Guido van Rossum (gvanrossum)
Date: 2001-06-13 09:25

Message:
Logged In: YES 
user_id=6380

This was unanimously rejected by the user community, so I'm
dropping the idea.

----------------------------------------------------------------------

Comment By: Michael Hudson (mwh)
Date: 2001-06-10 02:12

Message:
Logged In: YES 
user_id=6656

I think you also want:

Index: code.py
===================================================================
RCS file: /cvsroot/python/python/dist/src/Lib/code.py,v
retrieving revision 1.16
diff -c -1 -r1.16 code.py
*** code.py     2001/05/03 04:58:49     1.16
--- code.py     2001/06/11 22:11:29
***************
*** 106,108 ****
          else:
!             if softspace(sys.stdout, 0):
                  print
--- 106,108 ----
          else:
!             if softspace(sys.stdout, 0) >= 0:
                  print

(not tested)


----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=432183&group_id=5470


From noreply@sourceforge.net  Wed Jun 13 17:26:09 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Wed, 13 Jun 2001 09:26:09 -0700
Subject: [Patches] [ python-Patches-432183 ] PEP-259: skip printing newline*2
Message-ID: <E15ADT7-00039y-00@usw-sf-web1.sourceforge.net>

Patches item #432183, was updated on 2001-06-11 13:00
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=432183&group_id=5470

Category: core (C code)
Group: None
>Status: Closed
Resolution: Rejected
Priority: 5
Submitted By: Guido van Rossum (gvanrossum)
>Assigned to: Guido van Rossum (gvanrossum)
Summary: PEP-259: skip printing newline*2

Initial Comment:
See PEP 259 (to be checked in soon).

This suppresses the printing of an extra newline when
the last item printed is a string ending in a newline.


----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2001-06-13 09:25

Message:
Logged In: YES 
user_id=6380

This was unanimously rejected by the user community, so I'm
dropping the idea.

----------------------------------------------------------------------

Comment By: Michael Hudson (mwh)
Date: 2001-06-10 02:12

Message:
Logged In: YES 
user_id=6656

I think you also want:

Index: code.py
===================================================================
RCS file: /cvsroot/python/python/dist/src/Lib/code.py,v
retrieving revision 1.16
diff -c -1 -r1.16 code.py
*** code.py     2001/05/03 04:58:49     1.16
--- code.py     2001/06/11 22:11:29
***************
*** 106,108 ****
          else:
!             if softspace(sys.stdout, 0):
                  print
--- 106,108 ----
          else:
!             if softspace(sys.stdout, 0) >= 0:
                  print

(not tested)


----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=432183&group_id=5470


From noreply@sourceforge.net  Wed Jun 13 18:00:58 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Wed, 13 Jun 2001 10:00:58 -0700
Subject: [Patches] [ python-Patches-432401 ] unicode encoding error callbacks
Message-ID: <E15AE0o-000685-00@usw-sf-web3.sourceforge.net>

Patches item #432401, was updated on 2001-06-12 06:43
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=432401&group_id=5470

Category: core (C code)
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Walter Dörwald (doerwalter)
Assigned to: M.-A. Lemburg (lemburg)
Summary: unicode encoding error callbacks

Initial Comment:
This patch adds unicode error handling callbacks to the
encode functionality. With this patch it's possible to
not only pass 'strict', 'ignore' or 'replace' as the
errors argument to encode, but also a callable
function, that will be called with the encoding name,
the original unicode object and the position of the
unencodable character. The callback must return a
replacement unicode object that will be encoded instead
of the original character.

For example replacing unencodable characters with XML
character references can be done in the following way.

u"aäoöuüß".encode(
   "ascii",
   lambda enc, uni, pos: u"&#x%x;" % ord(uni[pos])
)
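
For comparison, a sketch of the same replacement in Python
releases that include PEP 293, where handlers are registered by
name with codecs.register_error and receive the exception
object instead of (encoding, string, position). The handler
name "sketch.xmlhexref" is made up for this example; the
built-in "xmlcharrefreplace" handler emits decimal references.

```python
import codecs


def xml_hex_charref(exc):
    # A registered handler gets the exception instance and returns
    # (replacement, position at which encoding should resume).
    if not isinstance(exc, UnicodeEncodeError):
        raise exc
    bad = exc.object[exc.start:exc.end]
    return "".join("&#x%x;" % ord(ch) for ch in bad), exc.end


codecs.register_error("sketch.xmlhexref", xml_hex_charref)

text = "a\u00e4o\u00f6u\u00fc\u00df"  # the string from the example
decimal_refs = text.encode("ascii", "xmlcharrefreplace")
hex_refs = text.encode("ascii", "sketch.xmlhexref")
```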


----------------------------------------------------------------------

>Comment By: M.-A. Lemburg (lemburg)
Date: 2001-06-13 10:00

Message:
Logged In: YES 
user_id=38388

On your comment about the non-Unicode codecs: let's keep
this separated from the current patch.

Don't have much time today. I'll comment on the other things
tomorrow.

----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2001-06-13 08:49

Message:
Logged In: YES 
user_id=89016

Guido van Rossum wrote in python-dev:

> True, the "codec" pattern can be used for other 
> encodings than Unicode.  But it seems to me that the
> entire codecs architecture is rather strongly geared
> towards en/decoding Unicode, and it's not clear
> how well other codecs fit in this pattern (e.g. I 
> noticed that all the non-Unicode codecs ignore the 
> error handling parameter or assert that
> it is set to 'strict').

I noticed that too. Asserting that errors=='strict' would 
mean that the encoder is not able to deal in any other way 
with unencodable stuff than by raising an error. But that 
is not the problem here, because for zlib, base64, quopri, 
hex and uu encoding there can be no unencodable characters. 
The encoders can simply ignore the errors parameter. Should 
I remove the asserts from those codecs and change the 
docstrings accordingly, or will this be done separately?


----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2001-06-13 06:57

Message:
Logged In: YES 
user_id=89016

> > [...]
> > raise an exception). U+FFFD characters in the replacement
> > string will be replaced with a character that the encoder
> > chooses ('?' in all cases).
>
> Nice.

But the special casing of U+FFFD makes the interface somewhat
less clean than it could be. It was only done to be 100%
backwards compatible. With the original "replace" error
handling the codec chose the replacement character. But as
far as I can tell none of the codecs uses anything other
than '?', so I guess we could change the replace handler
to always return u'?'. This would make the implementation a
little bit simpler, but the explanation of the callback
feature *a lot* simpler. And if you still want to handle
an unencodable U+FFFD, you can write a special callback for
that, e.g.

def FFFDreplace(enc, uni, pos):
    if uni[pos] == u"\ufffd":
        return u"?"
    else:
        raise UnicodeError(...)

> > The implementation of the loop through the string is done
> > in the following way. A stack with two strings is kept
> > and the loop always encodes a character from the string
> > at the stacktop. If an error is encountered and the stack
> > has only one entry (during encoding of the original string)
> > the callback is called and the unicode object returned is
> > pushed on the stack, so the encoding continues with the
> > replacement string. If the stack has two entries when an
> > error is encountered, the replacement string itself has
> > an unencodable character and a normal exception is raised.
> > When the encoder has reached the end of its current string
> > there are two possibilities: when the stack contains two
> > entries, this was the replacement string, so the replacement
> > string will be popped from the stack and encoding continues
> > with the next character from the original string. If the
> > stack had only one entry, encoding is finished.
>
> Very elegant solution !

I'll put it as a comment in the source.

> > (I hope that's enough explanation of the API and
> > implementation)
>
> Could you add these docs to the Misc/unicode.txt file ? I
> will eventually take that file and turn it into a PEP which
> will then serve as general documentation for these things.

I could, but first we should work out how the decoding
callback API will work.

> > I have renamed the static ...121 function to all lowercase
> > names.
>
> Ok.
>
> > BTW, I guess PyUnicode_EncodeUnicodeEscape could be
> > reimplemented as PyUnicode_EncodeASCII with a \uxxxx
> > replacement callback.
>
> Hmm, wouldn't that result in a slowdown ? If so, I'd rather
> leave the special encoder in place, since it is being used a
> lot in Python and probably some applications too.

It would be a slowdown. But callbacks open many possibilities.

For example:

   Why can't I print u"gürk"?

is probably one of the most frequently asked questions in
comp.lang.python. For printing Unicode stuff, print could be
extended to use an error handling callback for Unicode
strings (or objects where __str__ or tp_str returns a
Unicode object) instead of using str() which always returns
an 8bit string and uses strict encoding. There might even be a
sys.setprintencodehandler()/sys.getprintencodehandler()

> [...]
> I think it would be worthwhile to rename the callbacks to
> include "Unicode" somewhere, e.g.
> PyCodec_UnicodeReplaceEncodeErrors(). It's a long name, but
> then it points out the application field of the callback
> rather well. Same for the callbacks exposed through the
> _codecsmodule.

OK, done (and PyCodec_XMLCharRefReplaceUnicodeEncodeErrors
really is a long name ;))

> > I have not touched PyUnicode_TranslateCharmap yet,
> > should this function also support error callbacks? Why
> > would one want to insert None into the mapping to call
> > the callback?
>
> 1. Yes.
> 2. The user may want to e.g. restrict usage of certain
> character ranges. In this case the codec would be used to
> verify the input and an exception would indeed be useful
> (e.g. say you want to restrict input to Hangul + ASCII).

OK, do we want TranslateCharmap to work exactly like encoding,
i.e. in case of an error should the returned replacement
string again be mapped through the translation mapping or
should it be copied to the output directly? The former would
be more in line with encoding, but IMHO the latter would
be much more useful.

BTW, when I implement it I can implement patch #403100
("Multicharacter replacements in PyUnicode_TranslateCharmap")
along the way.

Should the old TranslateCharmap map to the new
TranslateCharmapEx and inherit the "multicharacter
replacement" feature, or should I leave it as it is?

> > A remaining problem is how to implement decoding error
> > callbacks. In Python 2.1 encoding and decoding errors are
> > handled in the same way with a string value. But with
> > callbacks it doesn't make sense to use the same callback
> > for encoding and decoding (like codecs.StreamReaderWriter
> > and codecs.StreamRecoder do). Decoding callbacks have a
> > different API. Which arguments should be passed to the
> > decoding callback, and what is the decoding callback
> > supposed to do?
>
> I'd suggest adding another set of PyCodec_UnicodeDecode...()
> APIs for this. We'd then have to augment the base classes of
> the StreamCodecs to provide two attributes for .errors with
> a fallback solution for the string case (i.e. "strict" can
> still be used for both directions).

Sounds good. Now what is the decoding callback supposed to do?
I guess it will be called in the same way as the encoding
callback, i.e. with encoding name, original string and
position of the error. It might return a Unicode string
(i.e. an object of the decoding target type), that will be
emitted from the codec instead of the one offending byte. Or
it might return a tuple with replacement Unicode object and
a resynchronisation offset, i.e. returning (u"?", 1) means
emit a '?' and skip the offending character. But to make
the offset really useful the callback has to know something
about the encoding, perhaps the codec should be allowed to
pass an additional state object to the callback?

Maybe the same should be added to the encoding callbacks too?
Maybe the encoding callback should be able to tell the
encoder if the replacement returned should be reencoded
(in which case it's a Unicode object), or directly emitted
(in which case it's an 8bit string)?
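
To make the tuple variant concrete, here is a rough Python 3
sketch of a decoder that calls such a callback. It is only a
model of the idea floated above for plain ASCII: the function
name is hypothetical, no state object is passed, and rejecting
offsets below 1 is an assumption made to guarantee progress.

```python
def decode_ascii_with_callback(data, callback, encoding="ascii"):
    out = []
    pos = 0
    while pos < len(data):
        byte = data[pos]
        if byte < 128:
            out.append(chr(byte))
            pos += 1
        else:
            # The callback returns the replacement text and how
            # many input bytes to skip before resuming.
            replacement, skip = callback(encoding, data, pos)
            if skip < 1:
                raise ValueError(
                    "resynchronisation offset must be at least 1"
                )
            out.append(replacement)
            pos += skip
    return "".join(out)
```

A callback returning ("?", 1) emits one '?' per offending byte,
while ("?", 2) would swallow a whole two-byte UTF-8 sequence,
which is why the callback needs to know something about the
encoding to choose a useful offset.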

> > One additional note: It is vital that errors is an
> > assignable attribute of the StreamWriter.
>
> It is already !

I know, but IMHO it should be documented that an assignable
errors attribute must be supported as part of the official
codec API.

Misc/unicode.txt is not clear on that:
"""
It is not required by the Unicode implementation to use 
these base classes, only the interfaces must match; this 
allows writing Codecs as extension types.
"""

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2001-06-13 01:05

Message:
Logged In: YES 
user_id=38388

> How the callbacks work:
> 
> A PyObject * named errors is passed in. This may be NULL,
> Py_None, 'strict', u'strict', 'ignore', u'ignore',
> 'replace', u'replace' or a callable object.
> PyCodec_EncodeHandlerForObject maps all of these objects to
> one of the three builtin error callbacks
> PyCodec_RaiseEncodeErrors (raises an exception),
> PyCodec_IgnoreEncodeErrors (returns an empty replacement
> string, in effect ignoring the error),
> PyCodec_ReplaceEncodeErrors (returns U+FFFD, the Unicode
> replacement character to signify to the encoder that it
> should choose a suitable replacement character) or directly
> returns errors if it is a callable object. When an
> unencodable character is encountered, the error handling
> callback will be called with the encoding name, the original
> unicode object and the error position and must return a
> unicode object that will be encoded instead of the offending
> character (or the callback may of course raise an
> exception). U+FFFD characters in the replacement string will
> be replaced with a character that the encoder chooses ('?'
> in all cases).

Nice.
 
> The implementation of the loop through the string is done in
> the following way. A stack with two strings is kept and the
> loop always encodes a character from the string at the
> stacktop. If an error is encountered and the stack has only
> one entry (during encoding of the original string) the
> callback is called and the unicode object returned is pushed
> on the stack, so the encoding continues with the replacement
> string. If the stack has two entries when an error is
> encountered, the replacement string itself has an
> unencodable character and a normal exception is raised. When
> the encoder has reached the end of its current string there
> are two possibilities: when the stack contains two entries,
> this was the replacement string, so the replacement string
> will be popped from the stack and encoding continues with
> the next character from the original string. If the stack
> had only one entry, encoding is finished.

Very elegant solution !
 
> (I hope that's enough explanation of the API and
> implementation)

Could you add these docs to the Misc/unicode.txt file ? I
will eventually take that file and turn it into a PEP which
will then serve as general documentation for these things.
 
> I have renamed the static ...121 function to all lowercase
> names.

Ok.
 
> BTW, I guess PyUnicode_EncodeUnicodeEscape could be
> reimplemented as PyUnicode_EncodeASCII with a \uxxxx
> replacement callback.

Hmm, wouldn't that result in a slowdown ? If so, I'd rather
leave the special encoder in place, since it is being used a
lot in Python and probably some applications too.
 
> PyCodec_RaiseEncodeErrors, PyCodec_IgnoreEncodeErrors,
> PyCodec_ReplaceEncodeErrors are globally visible because
> they have to be available in _codecsmodule.c to wrap them as
> Python function objects, but they can't be implemented in
> _codecsmodule, because they need to be available to the
> encoders in unicodeobject.c (through
> PyCodec_EncodeHandlerForObject), but importing the codecs
> module might result in an endless recursion, because
> importing a module requires unpickling of the bytecode,
> which might require decoding utf8, which ... (but this will
> only happen, if we implement the same mechanism for the
> decoding API)

I think that codecs.c is the right place for these APIs.
_codecsmodule.c is only meant as Python access wrapper for
the internal codecs and nothing more. 

One thing I noted about the callbacks: they assume that they
will always get Unicode objects as input. This is certainly
not true in the general case (it is for the codecs you touch
in the patch). 

I think it would be worthwhile to rename the callbacks to
include "Unicode" somewhere, e.g.
PyCodec_UnicodeReplaceEncodeErrors(). It's a long name, but
then it points out the application field of the callback
rather well. Same for the callbacks exposed through the
_codecsmodule.

> I have not touched PyUnicode_TranslateCharmap yet,
> should this function also support error callbacks? Why would
> one want to insert None into the mapping to call the callback?

1. Yes.
2. The user may want to e.g. restrict usage of certain
character ranges. In this case the codec would be used to
verify the input and an exception would indeed be useful
(e.g. say you want to restrict input to Hangul + ASCII).
 
> A remaining problem is how to implement decoding error
> callbacks. In Python 2.1 encoding and decoding errors are
> handled in the same way with a string value. But with
> callbacks it doesn't make sense to use the same callback for
> encoding and decoding (like codecs.StreamReaderWriter and
> codecs.StreamRecoder do). Decoding callbacks have a
> different API. Which arguments should be passed to the
> decoding callback, and what is the decoding callback
> supposed to do?

I'd suggest adding another set of PyCodec_UnicodeDecode...()
APIs for this. We'd then have to augment the base classes of
the StreamCodecs to provide two attributes for .errors with
a fallback solution for the string case (i.e. "strict" can
still be used for both directions).

> One additional note: It is vital that errors is an
> assignable attribute of the StreamWriter.

It is already !
 
> Consider the XML example: For writing an XML DOM tree one
> StreamWriter object is used. When a text node is written,
> the error handling has to be set to
> codecs.xmlreplace_encode_errors, but inside a comment or
> processing instruction replacing unencodable characters with
> charrefs is not possible, so here codecs.raise_encode_errors
> should be used (or better a custom error handler that raises
> an error that says "sorry, you can't have unencodable
> characters inside a comment")

Sure.
 
> BTW, should we continue the discussion in the i18n SIG
> mailing list? An email program is much more comfortable than
> an HTML textarea! ;)

I'd rather keep the discussions on this patch here --
forking it off to the i18n sig will make it very hard to
follow up on it. (This HTML area is indeed damn small ;-)
 

----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2001-06-12 12:18

Message:
Logged In: YES 
user_id=89016

One additional note: It is vital that errors is an
assignable attribute of the StreamWriter. 

Consider the XML example: For writing an XML DOM tree one
StreamWriter object is used. When a text node is written,
the error handling has to be set to
codecs.xmlreplace_encode_errors, but inside a comment or
processing instruction replacing unencodable characters with
charrefs is not possible, so here codecs.raise_encode_errors
should be used (or better a custom error handler that raises
an error that says "sorry, you can't have unencodable
characters inside a comment")
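
For illustration, the same pattern can be written against the codecs API of a
current Python 3. This is an assumption about a later interpreter, not the API
proposed in this patch: there the error handling is selected by the strings
"xmlcharrefreplace" and "strict" rather than by callbacks named
codecs.xmlreplace_encode_errors and codecs.raise_encode_errors.

```python
import codecs
import io

buffer = io.BytesIO()
# One StreamWriter for the whole document; errors is switched per node type.
writer = codecs.getwriter("ascii")(buffer, errors="xmlcharrefreplace")

# Text node: unencodable characters become character references.
writer.write("<p>gr\xfc\xdfe</p>")

# Comment: character references are not possible, so be strict.
writer.errors = "strict"
try:
    writer.write("<!-- gr\xfc\xdfe -->")
    comment_failed = False
except UnicodeEncodeError:
    comment_failed = True

result = buffer.getvalue()
```

The sketch only works because errors is a plain assignable attribute of the
StreamWriter, which is the requirement stated above.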

BTW, should we continue the discussion in the i18n SIG
mailing list? An email program is much more comfortable than
an HTML textarea! ;)


----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2001-06-12 11:59

Message:
Logged In: YES 
user_id=89016

How the callbacks work:

A PyObject * named errors is passed in. This may be NULL,
Py_None, 'strict', u'strict', 'ignore', u'ignore',
'replace', u'replace' or a callable object.
PyCodec_EncodeHandlerForObject maps all of these objects to
one of the three builtin error callbacks
PyCodec_RaiseEncodeErrors (raises an exception),
PyCodec_IgnoreEncodeErrors (returns an empty replacement
string, in effect ignoring the error),
PyCodec_ReplaceEncodeErrors (returns U+FFFD, the Unicode
replacement character to signify to the encoder that it
should choose a suitable replacement character) or directly
returns errors if it is a callable object. When an
unencodable character is encountered the error handling
callback will be called with the encoding name, the original
unicode object and the error position and must return a
unicode object that will be encoded instead of the offending
character (or the callback may of course raise an
exception). U+FFFD characters in the replacement string will 
be replaced with a character that the encoder chooses ('?'
in all cases).
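
As a reading aid, here is a minimal Python sketch of the mapping described
above. It is not the C code from the patch: the lowercase function names are
stand-ins for the PyCodec_* functions, and treating NULL/None as "strict" is an
assumption (the text only says that all of these values map to one of the three
builtin callbacks).

```python
def raise_encode_errors(encoding, text, pos):
    """Stand-in for PyCodec_RaiseEncodeErrors: raise an exception."""
    raise UnicodeError("%s can't encode character %r at position %d"
                       % (encoding, text[pos], pos))


def ignore_encode_errors(encoding, text, pos):
    """Stand-in for PyCodec_IgnoreEncodeErrors: empty replacement."""
    return ""


def replace_encode_errors(encoding, text, pos):
    """Stand-in for PyCodec_ReplaceEncodeErrors: return U+FFFD and let
    the encoder choose the actual replacement character."""
    return "\ufffd"


_BUILTIN_HANDLERS = {
    "strict": raise_encode_errors,
    "ignore": ignore_encode_errors,
    "replace": replace_encode_errors,
}


def encode_handler_for_object(errors):
    """Stand-in for PyCodec_EncodeHandlerForObject."""
    if errors is None:
        # Assumption: NULL and Py_None select the strict handler.
        return raise_encode_errors
    if callable(errors):
        # A callable is returned directly.
        return errors
    try:
        return _BUILTIN_HANDLERS[errors]
    except (KeyError, TypeError):
        raise ValueError("unknown error handling %r" % (errors,))
```

Every handler takes the encoding name, the original string and the error
position, and either returns a replacement string or raises.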

The implementation of the loop through the string is done in
the following way. A stack with two strings is kept and the
loop always encodes a character from the string at the
stacktop. If an error is encountered and the stack has only
one entry (during encoding of the original string) the
callback is called and the unicode object returned is pushed
on the stack, so the encoding continues with the replacement
string. If the stack has two entries when an error is
encountered, the replacement string itself has an
unencodable character and a normal exception is raised. When
the encoder has reached the end of its current string there
are two possibilities: when the stack contains two entries,
this was the replacement string, so the replacement string
will be popped from the stack and encoding continues with
the next character from the original string. If the stack
had only one entry, encoding is finished.
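
The loop described above might look roughly like this in Python. This is a
sketch under assumptions, not the C implementation from the patch: the function
name encode_with_callback is made up, characters are encoded one at a time via
str.encode, and the handler takes (encoding, text, position) as described
earlier.

```python
def encode_with_callback(text, encoding, handler):
    """Encode text, calling handler for each unencodable character."""
    out = bytearray()
    # Stack of [string, position]; at most two entries:
    # the original string and, temporarily, a replacement string.
    stack = [[text, 0]]
    while stack:
        entry = stack[-1]
        current, pos = entry
        if pos == len(current):
            # End of the current string: pop the replacement and go on
            # with the original, or finish if this was the original.
            stack.pop()
            continue
        entry[1] = pos + 1
        ch = current[pos]
        in_replacement = len(stack) == 2
        if in_replacement and ch == "\ufffd":
            # U+FFFD in a replacement means "encoder, pick something".
            out += b"?"
            continue
        try:
            out += ch.encode(encoding)
        except UnicodeEncodeError:
            if in_replacement:
                # The replacement itself is unencodable: normal exception.
                raise
            stack.append([handler(encoding, text, pos), 0])
    return bytes(out)
```

For example, a handler that returns "&#%d;" % ord(text[pos]) turns "a\xe4b"
into b"a&#228;b" when encoding to ASCII, and a handler that returns "\ufffd"
yields b"a?b".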

(I hope that's enough explanation of the API and implementation)

I have renamed the static ...121 function to all lowercase
names.

BTW, I guess PyUnicode_EncodeUnicodeEscape could be
reimplemented as PyUnicode_EncodeASCII with a \uxxxx
replacement callback.

PyCodec_RaiseEncodeErrors, PyCodec_IgnoreEncodeErrors,
PyCodec_ReplaceEncodeErrors are globally visible because
they have to be available in _codecsmodule.c to wrap them as
Python function objects, but they can't be implemented in
_codecsmodule, because they need to be available to the
encoders in unicodeobject.c (through
PyCodec_EncodeHandlerForObject), but importing the codecs
module might result in an endless recursion, because
importing a module requires unpickling of the bytecode,
which might require decoding utf8, which ... (but this will
only happen, if we implement the same mechanism for the
decoding API)

I have not touched PyUnicode_TranslateCharmap yet, 
should this function also support error callbacks? Why would
one want to insert None into the mapping to call the callback?

A remaining problem is how to implement decoding error
callbacks. In Python 2.1 encoding and decoding errors are
handled in the same way with a string value. But with
callbacks it doesn't make sense to use the same callback for
encoding and decoding (like codecs.StreamReaderWriter and
codecs.StreamRecoder do). Decoding callbacks have a
different API. Which arguments should be passed to the
decoding callback, and what is the decoding callback
supposed to do?


----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2001-06-12 11:00

Message:
Logged In: YES 
user_id=38388

About the Py_UNICODE*data, int size APIs:
Ok, point taken.

In general, I think we ought to keep the callback feature as
open as possible, so passing in pointers and sizes would not
be very useful.

BTW, could you summarize how the callback works in a few
lines ?

About _Encode121: I'd name this _EncodeUCS1 since that's
what it is ;-)

About the new functions: I was referring to the new static
functions which you gave PyUnicode_... names. If these are
not supposed to turn into non-static functions, I'd rather
have them use lower case names (since that's how the Python
internals work too -- most of the times).


----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2001-06-12 09:56

Message:
Logged In: YES 
user_id=89016

> One thing which I don't like about your API change is that
> you removed the Py_UNICODE*data, int size style arguments --
> this makes it impossible to use the new APIs on non-Python
> data or data which is not available as Unicode object.

Another problem is, that the callback requires a Python
object, so in the PyObject *version, the refcount is
incref'd and the object is passed to the callback. The
Py_UNICODE*/int version would have to create a new Unicode
object from the data.


----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2001-06-12 09:32

Message:
Logged In: YES 
user_id=89016

> * please don't place more than one C statement on one line
> like in:
> """
> +               unicode = unicode2; unicodepos = unicode2pos;
> +               unicode2 = NULL; unicode2pos = 0;
> """

OK, done!

> * Comments should start with a capital letter and be
> prepended to the section they apply to

Fixed!

> * There should be spaces between arguments in compares
> (a == b) not (a==b)

Fixed!

> * Where does the name "...Encode121" originate ?

encode one-to-one, it implements both ASCII and latin-1
encoding.

> * module internal APIs should use lower case names (you
> converted some of these to  PyUnicode_...() -- this is
> normally reserved for APIs which are either marked as
> potential candidates for the public API or are very
> prominent in the code)

Which ones? I introduced a new function for every old one,
that had a "const char *errors" argument, and a few new ones
in codecs.h, of those PyCodec_EncodeHandlerForObject is
vital, because it is used to map old string arguments to
the new function objects. PyCodec_RaiseEncodeErrors can be
used in the encoder implementation to raise an encode error,
but it could be made static in unicodeobject.h so only those
encoders implemented there have access to it.

> One thing which I don't like about your API change is that
> you removed the Py_UNICODE*data, int size style arguments --
> this makes it impossible to use the new APIs on non-Python
> data or data which is not available as Unicode object.

I looked through the code and found no situation where the
Py_UNICODE*/int version is really used and having two
(PyObject *)s (the original and the replacement string),
instead of UNICODE*/int and PyObject * made the
implementation a little easier, but I can fix that.

> Please separate the errors.c patch from this patch -- it
> seems totally unrelated to Unicode.

PyCodec_RaiseEncodeErrors uses this to have a \Uxxxx with
four hex digits. I removed it.

I'll upload a revised patch as soon as it's done.


----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2001-06-12 07:29

Message:
Logged In: YES 
user_id=38388

Thanks for the patch -- it looks very impressive!

I'll give it a try later this week. 

Some first cosmetic tidbits:
* please don't place more than one C statement on one line
like in:
"""
+               unicode = unicode2; unicodepos = unicode2pos;
+               unicode2 = NULL; unicode2pos = 0;
"""

* Comments should start with a capital letter and be
prepended to the section they apply to

* There should be spaces between arguments in compares
(a == b) not (a==b)

* Where does the name "...Encode121" originate ?

* module internal APIs should use lower case names (you
converted some of these to  PyUnicode_...() -- this is
normally reserved for APIs which are either marked as
potential candidates for the public API or are very
prominent in the code)

One thing which I don't like about your API change is that
you removed the Py_UNICODE*data, int size style arguments --
this makes it impossible to use the new APIs on non-Python
data or data which is not available as Unicode object.

Please separate the errors.c patch from this patch -- it
seems totally unrelated to Unicode.

Thanks.


----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=432401&group_id=5470


From noreply@sourceforge.net  Thu Jun 14 21:26:13 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Thu, 14 Jun 2001 13:26:13 -0700
Subject: [Patches] [ python-Patches-400938 ] [Draft] libpython as shared library (.so) on Linux
Message-ID: <E15Adgz-0000nY-00@usw-sf-web1.sourceforge.net>

Patches item #400938, was updated on 2000-07-19 13:55
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=400938&group_id=5470

Category: None
Group: None
Status: Open
Resolution: Out of Date
Priority: 5
Submitted By: Gregor Hoffleit (flight)
>Assigned to: Guido van Rossum (gvanrossum)
Summary: [Draft] libpython as shared library (.so) on Linux

Initial Comment:
 

----------------------------------------------------------------------

Comment By: Gregor Hoffleit (flight)
Date: 2001-06-12 15:19

Message:
Logged In: YES 
user_id=5293

> Now we're just waiting for someone to produce a working patch.
> Or is there one already?

I'm currently distributing experimental packages of Python
2.1 for Debian. The packages include a hack to build
libpython2.1 as .so for Linux.

The shared library patch currently is buried in a big diff
file. You can get it as
http://people.debian.org/~flight/python2/python2_2.1-0.diff.gz

This is only a starting point for a real patch!

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2001-06-12 09:54

Message:
Logged In: YES 
user_id=6380

Reopening -- this keeps being requested.  Now we're just
waiting for someone to produce a working patch.

Or is there one already?

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2001-06-12 09:51

Message:
Logged In: YES 
user_id=6380

Reopening -- this keeps being requested.  Now we're just
waiting for someone to produce a working patch.

Or is there one already?

----------------------------------------------------------------------

Comment By: Neil Schemenauer (nascheme)
Date: 2001-03-21 15:59

Message:
Logged In: YES 
user_id=35752

We're going to have to create a new patch to do this.  This
one is way too out of date.  Maybe for 2.2.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2001-01-19 14:46

Message:
I'm reassigning this to Neil.

Neil, can you see if you can integrate this into your flat Makefile?

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2001-01-17 15:09

Message:
Andrew, I'm tentatively reassigning this to you, since you're taking charge of the build process at the moment (setup.py).

I suspect that the patch no longer works as is -- would it make sense to mark it postponed and get the author to submit a new version before we release 2.1a1?


----------------------------------------------------------------------

Comment By: Nobody/Anonymous (nobody)
Date: 2001-01-17 14:46

Message:
Getting this patch into the next version of Python would be "A Good Thing"(tm) in my opinion.  We use libpython as a .so at ILM and end up having to make changes like this by hand every time we get a new version...

----------------------------------------------------------------------

Comment By: Moshe Zadka (moshez)
Date: 2000-11-01 03:32

Message:
I've had a look at the patch, and it seems it has
two orthogonal parts. One is adding the infrastructure
for compiling another version for the Python library, which can be more or less integrated as-is, and one is hard-coding the particular way, in Linux, of building shared objects. Since we discover how to build shared objects in the configure script anyway (otherwise we could not have built modules as shared objects), we should embed that information there, not the Linux flags.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2000-10-26 14:13

Message:
Let's give this to Jeremy instead, because he seems to know more about build issues. Jeremy, it would be good to look into getting this to work with your RPM suite. Flight's argument (has been used without complaints in Debian Python 1.5.2 since 1999) is good.

----------------------------------------------------------------------

Comment By: Jeremy Hylton (jhylton)
Date: 2000-08-23 09:26

Message:
In the absence of anyone arguing for inclusion of this patch and a one-week idle period, it is postponed.

----------------------------------------------------------------------

Comment By: Moshe Zadka (moshez)
Date: 2000-08-16 00:40

Message:
I suggest we postpone it. It isn't really complete (only works on real distributions <wink>), and the complete solution should work on all unices. If Tcl/Perl can do it, there is no reason Python can't -- and a half hearted solution isn't that good. flight, you should use this
for the Python in woody in the mean time -- I doubt woody
will be stable before Python 2.1 comes out, so 2.1 sounds
like a good timeframe to do it.

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2000-08-15 10:52

Message:
Assigned to Barry because he's a Linux weenie.  Barry, if you think there's something here that should go into 2.0, please pursue it now, else change the status to Postponed.

----------------------------------------------------------------------

Comment By: Gregor Hoffleit (flight)
Date: 2000-07-19 14:10

Message:
This is what is used in production to build libpython as a shared library (.so) for Debian.

Note: This patch is not ready for inclusion in the upstream Python distribution. Anyway, I think this might be a start. The Python 1.5 executable in Debian GNU/Linux has been built against a shared libpython1.5.so since April 1999, and I haven't yet heard about any problems.

Using a shared library should have an advantage if you're running multiple instances of Python (be it standalone interpreter or embedded applications).


----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=400938&group_id=5470


From noreply@sourceforge.net  Thu Jun 14 21:30:41 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Thu, 14 Jun 2001 13:30:41 -0700
Subject: [Patches] [ python-Patches-433233 ] 2.0.1c1: statcache.py is broken (syntax)
Message-ID: <E15AdlJ-0000rJ-00@usw-sf-web1.sourceforge.net>

Patches item #433233, was updated on 2001-06-14 13:30
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=433233&group_id=5470

Category: library
Group: 2.0.1 bugfix
Status: Open
Resolution: None
Priority: 5
Submitted By: Gregor Hoffleit (flight)
Assigned to: Nobody/Anonymous (nobody)
Summary: 2.0.1c1: statcache.py is broken (syntax)

Initial Comment:
In 2.0.1c1, statcache.py is broken (bad indentation).
The file won't load.

This problem exists only in the
release20-maint/r201c1 branch (introduced in revision
1.7.4).

----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=433233&group_id=5470


From noreply@sourceforge.net  Fri Jun 15 17:44:16 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Fri, 15 Jun 2001 09:44:16 -0700
Subject: [Patches] [ python-Patches-433233 ] 2.0.1c1: statcache.py is broken (syntax)
Message-ID: <E15Awhk-0006n0-00@usw-sf-web2.sourceforge.net>

Patches item #433233, was updated on 2001-06-14 13:30
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=433233&group_id=5470

Category: library
Group: 2.0.1 bugfix
>Status: Closed
>Resolution: Fixed
Priority: 5
Submitted By: Gregor Hoffleit (flight)
>Assigned to: Guido van Rossum (gvanrossum)
Summary: 2.0.1c1: statcache.py is broken (syntax)

Initial Comment:
In 2.0.1c1, statcache.py is broken (bad indentation).
The file won't load.

This problem exists only in the
release20-maint/r201c1 branch (introduced in revision
1.7.4).

----------------------------------------------------------------------

>Comment By: Guido van Rossum (gvanrossum)
Date: 2001-06-15 09:44

Message:
Logged In: YES 
user_id=6380

Thanks -- fixed now!

----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=433233&group_id=5470


From noreply@sourceforge.net  Fri Jun 15 20:48:39 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Fri, 15 Jun 2001 12:48:39 -0700
Subject: [Patches] [ python-Patches-433537 ] better cross-compilation support
Message-ID: <E15AzaB-000236-00@usw-sf-web2.sourceforge.net>

Patches item #433537, was updated on 2001-06-15 12:48
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=433537&group_id=5470

Category: Build
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: michael shiplett (walrusmonkey)
Assigned to: Nobody/Anonymous (nobody)
Summary: better cross-compilation support

Initial Comment:
configure.in uses AC_TRY_RUN in several places without allowing for cached values, which are needed for cross-compilation. This patch uses the same approach that other parts of configure.in already use.


----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=433537&group_id=5470


From noreply@sourceforge.net  Sat Jun 16 02:12:02 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Fri, 15 Jun 2001 18:12:02 -0700
Subject: [Patches] [ python-Patches-433619 ] NAMESPACE support in imaplib.py
Message-ID: <E15B4d8-0007pM-00@usw-sf-web2.sourceforge.net>

Patches item #433619, was updated on 2001-06-15 18:12
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=433619&group_id=5470

Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Michel Pelletier (michel)
Assigned to: Nobody/Anonymous (nobody)
Summary: NAMESPACE support in imaplib.py

Initial Comment:
Support for the IMAP NAMESPACE extension defined in rfc
2342.  This is almost a necessity for working with
modern IMAP servers.


----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=433619&group_id=5470


From noreply@sourceforge.net  Sat Jun 16 02:18:01 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Fri, 15 Jun 2001 18:18:01 -0700
Subject: [Patches] [ python-Patches-433619 ] NAMESPACE support in imaplib.py
Message-ID: <E15B4iv-0000RV-00@usw-sf-web1.sourceforge.net>

Patches item #433619, was updated on 2001-06-15 18:12
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=433619&group_id=5470

>Category: library
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Michel Pelletier (michel)
Assigned to: Nobody/Anonymous (nobody)
Summary: NAMESPACE support in imaplib.py

Initial Comment:
Support for the IMAP NAMESPACE extension defined in rfc
2342.  This is almost a necessity for working with
modern IMAP servers.


----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=433619&group_id=5470


From noreply@sourceforge.net  Sat Jun 16 09:12:51 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Sat, 16 Jun 2001 01:12:51 -0700
Subject: [Patches] [ python-Patches-432325 ] \versionadded{2.2} in libstruct.tex
Message-ID: <E15BBCN-0000O8-00@usw-sf-web1.sourceforge.net>

Patches item #432325, was updated on 2001-06-11 23:23
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=432325&group_id=5470

Category: documentation
Group: None
>Status: Closed
>Resolution: Fixed
Priority: 5
Submitted By: Peter Funk (pefu)
Assigned to: Nobody/Anonymous (nobody)
Summary: \versionadded{2.2} in libstruct.tex

Initial Comment:
Tim Peters:
> Modified Files:
> 	libstruct.tex 
> Log Message:
> Added q/Q standard (x-platform 8-byte ints) mode in
struct module.
[...]

Hmmmm.... You probably forgot the \versionadded{2.2}
note? 


----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2001-06-16 01:12

Message:
Logged In: YES 
user_id=21627

This patch is already in libstruct.tex 1.29.


----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=432325&group_id=5470


From noreply@sourceforge.net  Sat Jun 16 09:16:45 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Sat, 16 Jun 2001 01:16:45 -0700
Subject: [Patches] [ python-Patches-407093 ] urllib2 correction of typos
Message-ID: <E15BBG9-0000RE-00@usw-sf-web1.sourceforge.net>

Patches item #407093, was updated on 2001-03-08 10:03
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=407093&group_id=5470

Category: library
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Eduardo Fernandez Corrales (ejfc)
Assigned to: Jeremy Hylton (jhylton)
Summary: urllib2 correction of typos

Initial Comment:
Bug #406683 "typos in urllib2" includes a patch.

I have modified it so that Basic HTTP Authentication 
works now. (At least with my Squid proxy)


----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2001-06-16 01:16

Message:
Logged In: YES 
user_id=21627

Where is the patch?


----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=407093&group_id=5470


From noreply@sourceforge.net  Sat Jun 16 09:24:11 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Sat, 16 Jun 2001 01:24:11 -0700
Subject: [Patches] [ python-Patches-403513 ] Canvas.py fixes
Message-ID: <E15BBNL-0000XH-00@usw-sf-web1.sourceforge.net>

Patches item #403513, was updated on 2001-01-30 12:00
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=403513&group_id=5470

Category: Tkinter
Group: None
>Status: Closed
>Resolution: Fixed
Priority: 5
Submitted By: Markus F.X.J. Oberhumer (mfx)
Assigned to: Fredrik Lundh (effbot)
Summary: Canvas.py fixes

Initial Comment:
This fixes Group.lower and Group.tkraise

Markus 
(author of PySol)

----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2001-06-16 01:24

Message:
Logged In: YES 
user_id=21627

This is fixed in Canvas.py 1.17.


----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=403513&group_id=5470


From noreply@sourceforge.net  Sat Jun 16 17:08:06 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Sat, 16 Jun 2001 09:08:06 -0700
Subject: [Patches] [ python-Patches-433619 ] NAMESPACE support in imaplib.py
Message-ID: <E15BIcI-0007Y5-00@usw-sf-web2.sourceforge.net>

Patches item #433619, was updated on 2001-06-15 18:12
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=433619&group_id=5470

Category: library
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Michel Pelletier (michel)
>Assigned to: Guido van Rossum (gvanrossum)
Summary: NAMESPACE support in imaplib.py

Initial Comment:
Support for the IMAP NAMESPACE extension defined in rfc
2342.  This is almost a necessity for working with
modern IMAP servers.


----------------------------------------------------------------------

>Comment By: Guido van Rossum (gvanrossum)
Date: 2001-06-16 09:08

Message:
Logged In: YES 
user_id=6380

I'm pinging Piers Lauder about this.  If he approves, I'll
apply it.

----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=433619&group_id=5470


From noreply@sourceforge.net  Sat Jun 16 17:25:19 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Sat, 16 Jun 2001 09:25:19 -0700
Subject: [Patches] [ python-Patches-421893 ] Cleanup GC API
Message-ID: <E15BIsx-0002UI-00@usw-sf-web1.sourceforge.net>

Patches item #421893, was updated on 2001-05-06 14:42
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=421893&group_id=5470

Category: core (C code)
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Neil Schemenauer (nascheme)
>Assigned to: Guido van Rossum (gvanrossum)
Summary: Cleanup GC API

Initial Comment:
This patch adds four new APIs:

	PyObject_GC_New
	PyObject_GC_NewVar
	PyObject_GC_Resize
	PyObject_GC_Del

and renames PyObject_GC_Init and PyObject_GC_Fini to:

	PyObject_GC_Track
	PyObject_GC_Ignore

respectively.  Objects that wish to be tracked by the
collector must use these new APIs.  Many more details
about the GC implementation are hidden inside
gcmodule.c.  There seems to be no change in
performance.

Note that PyObject_GC_{New,NewVar} automatically adds
the object to the GC lists.  There is no need to
call PyObject_GC_Track.  PyObject_GC_Del automatically
removes the object from the GC list but usually you
want to call PyObject_GC_Ignore yourself (DECREFs can
end up running arbitrary code).

It should be more difficult to corrupt the GC linked
lists now.  Also, you can now call PyObject_GC_Ignore
on objects that you know will not create RCs. The
_weakref module does this.  Previously, every object
that had the GC type flag set and could be found by
using tp_traverse had to be in a GC linked list.


----------------------------------------------------------------------

>Comment By: Guido van Rossum (gvanrossum)
Date: 2001-06-16 09:25

Message:
Logged In: YES 
user_id=6380

I think I know a way to fix the incompatibility, by
switching to a different flag bit.  I'll try to work this
into the descr-branch code.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2001-06-04 10:13

Message:
Logged In: YES 
user_id=21627

I have two problems with this patch:
1. It comes with no documentation.
2. It breaks existing third-party modules which use the 
   GC API as defined in Python 2.
Consequently, I recommend rejection of the patch in its 
current form.


----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=421893&group_id=5470


From noreply@sourceforge.net  Sun Jun 17 05:29:01 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Sat, 16 Jun 2001 21:29:01 -0700
Subject: [Patches] [ python-Patches-433619 ] NAMESPACE support in imaplib.py
Message-ID: <E15BUBJ-0000wV-00@usw-sf-web1.sourceforge.net>

Patches item #433619, was updated on 2001-06-15 18:12
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=433619&group_id=5470

Category: library
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Michel Pelletier (michel)
Assigned to: Guido van Rossum (gvanrossum)
Summary: NAMESPACE support in imaplib.py

Initial Comment:
Support for the IMAP NAMESPACE extension defined in rfc
2342.  This is almost a necessity for working with
modern IMAP servers.


----------------------------------------------------------------------

Comment By: Piers Lauder (pierslauder)
Date: 2001-06-16 21:29

Message:
Logged In: YES 
user_id=196212

This looks good to me. It should be in there.


----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2001-06-16 09:08

Message:
Logged In: YES 
user_id=6380

I'm pinging Piers Lauder about this.  If he approves, I'll
apply it.

----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=433619&group_id=5470


From noreply@sourceforge.net  Sun Jun 17 14:31:53 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Sun, 17 Jun 2001 06:31:53 -0700
Subject: [Patches] [ python-Patches-433619 ] NAMESPACE support in imaplib.py
Message-ID: <E15Bcef-0003zL-00@usw-sf-web1.sourceforge.net>

Patches item #433619, was updated on 2001-06-15 18:12
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=433619&group_id=5470

Category: library
Group: None
>Status: Closed
>Resolution: Fixed
Priority: 5
Submitted By: Michel Pelletier (michel)
Assigned to: Guido van Rossum (gvanrossum)
Summary: NAMESPACE support in imaplib.py

Initial Comment:
Support for the IMAP NAMESPACE extension defined in rfc
2342.  This is almost a necessity for working with
modern IMAP servers.


----------------------------------------------------------------------

>Comment By: Guido van Rossum (gvanrossum)
Date: 2001-06-17 06:31

Message:
Logged In: YES 
user_id=6380

Applied and closed.

Thanks, Michel!

----------------------------------------------------------------------

Comment By: Piers Lauder (pierslauder)
Date: 2001-06-16 21:29

Message:
Logged In: YES 
user_id=196212

This looks good to me. It should be in there.


----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2001-06-16 09:08

Message:
Logged In: YES 
user_id=6380

I'm pinging Piers Lauder about this.  If he approves, I'll
apply it.

----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=433619&group_id=5470


From noreply@sourceforge.net  Sun Jun 17 23:01:09 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Sun, 17 Jun 2001 15:01:09 -0700
Subject: [Patches] [ python-Patches-434008 ] sharedinstall must use $prefix
Message-ID: <E15BkbV-0007yh-00@usw-sf-web1.sourceforge.net>

Patches item #434008, was updated on 2001-06-17 15:01
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=434008&group_id=5470

Category: Build
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Gregor Hoffleit (flight)
Assigned to: Nobody/Anonymous (nobody)
Summary: sharedinstall must use $prefix

Initial Comment:
In the sharedinstall target of the Makefile, we have to
provide setup.py with the $prefix variable. Currently,
the $prefix is ignored in this call of setup.py, and
this leads to strange results:

When called with "make install
prefix=/tmp/python/debian/tmp" (which is used in
packaging Python, and works completely fine otherwise),
we get this (running this as a non-root user):

  copying build/lib.linux-i686-2.1/linuxaudiodev.so -> /data/src/debian/python2-2.1/debian/tmp/usr/lib/python2.1/lib-dynload
  running install_scripts
  copying build/scripts/pydoc -> /usr/bin
  error: /usr/bin/pydoc: Read-only file system
  make[1]: *** [sharedinstall] Error 1
  make[1]: Leaving directory `/data/src/debian/python2-2.1'
  make: *** [install-stamp] Error 2

The same kind of problem occurs with all other things
that are installed by the call of the setup.py script.

The attached patch cures this problem by providing the
$prefix to the setup.py script. I think this is the
correct way to fix it.

    Gregor


----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=434008&group_id=5470


From noreply@sourceforge.net  Mon Jun 18 02:10:09 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Sun, 17 Jun 2001 18:10:09 -0700
Subject: [Patches] [ python-Patches-413171 ] fix UserDict.get, setdefault, update
Message-ID: <E15BnYP-0003pm-00@usw-sf-web1.sourceforge.net>

Patches item #413171, was updated on 2001-04-02 10:18
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=413171&group_id=5470

Category: library
Group: None
>Status: Closed
Resolution: Accepted
Priority: 4
Submitted By: Ka-Ping Yee (ping)
Assigned to: Ka-Ping Yee (ping)
Summary: fix UserDict.get, setdefault, update

Initial Comment:
The methods 'get', 'setdefault', and 'update'
on a dictionary are usually implemented (and
thought of) in terms of the lower-level methods
has_key, __getitem__, and __setitem__.  The
current implementation of UserDict relays a
call to e.g. x.get() to x.data.get(), which
behaves inconsistently if __getitem__ has been
implemented on x.

One particularly big place where this turns up is cgi.
If you get a dict = cgi.SvFormContentDict(), then
dict.get('key') will return a *list* even though
dict['key'] returns a single item!

To make UserDict behave consistently, this patch
fixes get(), update(), and setdefault() to re-use
the other methods.  Then the only occurrence of
self.data[k] = v is in __setitem__, the only
occurrence of self.data[k] without assignment is
in __getitem__, etc.
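
A minimal sketch of the idea, not the patch itself: it is written for
present-day Python 3, so `in` stands in for has_key, and the class names
ConsistentDict and FirstItemDict are hypothetical. It shows that once
get(), setdefault(), and update() go through __getitem__ and __setitem__,
a subclass that overrides __getitem__ (as the cgi form dictionaries do)
gets consistent answers from all of them.

```python
class ConsistentDict:
    """Toy mapping wrapper; hypothetical name, not the real UserDict."""

    def __init__(self, initial=None):
        self.data = {}
        if initial is not None:
            self.update(initial)

    def __contains__(self, key):
        # Stands in for has_key() in this Python 3 sketch.
        return key in self.data

    def __getitem__(self, key):
        # The only place that reads self.data[key].
        return self.data[key]

    def __setitem__(self, key, value):
        # The only place that writes self.data[key].
        self.data[key] = value

    def keys(self):
        return self.data.keys()

    def get(self, key, default=None):
        # Re-uses __contains__ and __getitem__ instead of self.data.get().
        if key in self:
            return self[key]
        return default

    def setdefault(self, key, default=None):
        if key not in self:
            self[key] = default
        return self[key]

    def update(self, other):
        for key in other.keys():
            self[key] = other[key]


class FirstItemDict(ConsistentDict):
    """Stores lists but hands back only the first element of each."""

    def __getitem__(self, key):
        return self.data[key][0]


form = FirstItemDict({"key": ["a", "b"]})
single = form["key"]
via_get = form.get("key")
```

With this layout, form.get("key") and form["key"] agree, because get()
never reaches around the overridden __getitem__.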

----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2001-06-17 18:10

Message:
Logged In: YES 
user_id=21627

Committed as UserDict 1.14.


----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2001-06-07 08:10

Message:
Logged In: YES 
user_id=6380

Approved.  Check it in already!

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2001-06-06 22:25

Message:
Logged In: YES 
user_id=21627

I recommend approving this patch.


----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2001-04-10 14:17

Message:
Logged In: YES 
user_id=6380

Let's not fix this in 2.1.

----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=413171&group_id=5470


From noreply@sourceforge.net  Mon Jun 18 02:41:20 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Sun, 17 Jun 2001 18:41:20 -0700
Subject: [Patches] [ python-Patches-427190 ] Speed-up "O" calls
Message-ID: <E15Bo2a-0004W8-00@usw-sf-web1.sourceforge.net>

Patches item #427190, was updated on 2001-05-24 22:30
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=427190&group_id=5470

Category: core (C code)
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Martin v. Löwis (loewis)
Assigned to: Jeremy Hylton (jhylton)
>Summary: Speed-up "O" calls

Initial Comment:
This patch improves the performance of a few functions
which have an "O" signature (ord, len, and
list_append). On selected test cases, this patch gives
a speed-up of 40%. If accepted, the approach can be
extended to more signatures. E.g. "l" is already
provided in the patch, but currently not used.

----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2001-06-17 18:41

Message:
Logged In: YES 
user_id=21627

Uploaded new version which invokes string_join correctly 
from _PyString_Join.


----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2001-06-02 03:12

Message:
Logged In: YES 
user_id=21627

New version uploaded. This uses functions with only the 
self argument for METH_NOARGS, and introduces 
PyNoArgsFunction for them.

It also adds a section for api.tex documenting the METH_ 
flags, and an entry in NEWS mentioning the new METH_ flags.
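
As a rough illustration only, the calling conventions discussed in this
thread can be modelled in Python. This is a toy sketch, not the C code
in the patch: CallStyle, call_builtin, and the three sample functions are
hypothetical names, and the error messages are made up. It assumes the
behaviour described above, namely that a no-argument function receives
only self, a one-object function receives self plus the single argument,
and anything else receives an argument tuple.

```python
from enum import Enum


class CallStyle(Enum):
    """Hypothetical stand-ins for the METH_ flags discussed in the thread."""
    VARARGS = "varargs"
    NOARGS = "noargs"
    ONE_OBJECT = "o"


def call_builtin(style, func, self_obj, args):
    """Dispatch a call according to the declared calling convention."""
    if style is CallStyle.NOARGS:
        if args:
            raise TypeError(
                f"{func.__name__}() takes no arguments ({len(args)} given)")
        # Only self is passed; there is no dummy second argument.
        return func(self_obj)
    if style is CallStyle.ONE_OBJECT:
        if len(args) != 1:
            raise TypeError(
                f"{func.__name__}() takes exactly one argument "
                f"({len(args)} given)")
        # The single argument is passed directly, with no tuple to unpack.
        return func(self_obj, args[0])
    # Fallback: the callee receives the whole tuple and parses it itself.
    return func(self_obj, args)


def sample_len(self_obj, obj):
    return len(obj)


def sample_next(self_obj):
    return next(self_obj)


def sample_max(self_obj, args):
    return max(args)


length = call_builtin(CallStyle.ONE_OBJECT, sample_len, None, ("abc",))
first = call_builtin(CallStyle.NOARGS, sample_next, iter([10, 20]), ())
largest = call_builtin(CallStyle.VARARGS, sample_max, None, (3, 7, 5))
```

The point of the two special cases is that the caller checks the
argument count once, so the callee does not have to parse a tuple.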


----------------------------------------------------------------------

Comment By: Jeremy Hylton (jhylton)
Date: 2001-06-01 08:14

Message:
Logged In: YES 
user_id=31392

Just took a quick look -- looks good.  

One question: Why does METH_NOARGS call the method with two
arguments where the second is always NULL?  Wouldn't it be
clearer to have these functions take one argument?


----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2001-06-01 07:34

Message:
Logged In: YES 
user_id=21627

I rewrote the patch to only support METH_NOARGS and METH_O,
and to not use bit masks for them.

I also changed calling conventions for all Object operations
and bltin and sys functions. In the course of these changes,
two functions got a changed meaning:
- file.writelines accepts only exactly one argument
- iter.next does not accept any arguments anymore

As you can see in the patch, there are still a lot of places
that continue to use OLDARGS (plus all the Modules functions
that have not been changed in this patch), so OLDARGS will
be needed for quite some time.

----------------------------------------------------------------------

Comment By: Jeremy Hylton (jhylton)
Date: 2001-05-29 13:59

Message:
Logged In: YES 
user_id=31392

I like METH_O, but I'm not sure about METH_L.  I'd rather
see the call handling in ceval be type-neutral.  It's easy
enough for the callee to cast from an object to an int (or
any other type).  There should be no effect on performance
and it reduces the amount of code in the core.

I think the implementation could be simplified a lot if it
defined METH_O -- or perhaps METH_NOARGS,  METH_ONEARG, and
maybe even METH_TWOARGS (but Tim has a pretty good argument
against that one).  I don't think there's any need to define METH_O
via METH_SPECIAL and reserve all of 0xFFF0 for flags on
METH_SPECIAL.  Instead, I'd just use the next N bits to
implement the next N flags.

The SPECIALSIZE and extra stack used in the implementation
seem like unneeded generality, too.  If the implementation
is only going to support 0 and 1 (and possibly 2) arguments,
there's no need for anything more general.

Finally, I suggest appropriating fast_cfunction() for this
purpose, rather than calling the new function
do_call_special(), where "special" doesn't carry a very specific
meaning.  If METH_NOARGS and METH_ONEARG are implemented,
there is basically no reason to use METH_OLDARGS.  So we can
get rid of it in the code base and stop attempting to
optimize it.

Do you want to have a go at a smaller patch that just did
METH_ONEARG and METH_NOARGS?


----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=427190&group_id=5470


From noreply@sourceforge.net  Mon Jun 18 15:06:08 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Mon, 18 Jun 2001 07:06:08 -0700
Subject: [Patches] [ python-Patches-426746 ] Infrastructure for getting MacPython modules working on OSX
Message-ID: <E15BzfM-0004Zo-00@usw-sf-web1.sourceforge.net>

Patches item #426746, was updated on 2001-05-23 13:29
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=426746&group_id=5470

Category: Build
Group: None
Status: Open
>Resolution: Accepted
Priority: 5
Submitted By: Jack Jansen (jackjansen)
Assigned to: Thomas Wouters (twouters)
Summary: Infrastructure for getting MacPython modules working on OSX

Initial Comment:
Here are a couple of patches that optionally (on MacOSX) enable a bit of extra infrastructure in the Python core, to allow various (MacPython-originated) dynamic extension modules to be built. Here's what I patched:

- Added a MACHDEP_OBJS variable to Makefile.pre.in and configure.in. This allows platforms to add platform-specific source files to the core build.

- Added (using MACHDEP_OBJS) a macglue.c file to the build, which contains glue code that allows Mac extension modules to refer to each other while being in separate dynamically loaded modules, plus a couple of utility routines. There are also a few changes to LDFLAGS to get the object file incorporated (as it is otherwise optimized away because the rest of Python doesn't refer to it).

- Added a config.h.in define USE_TOOLBOX_OBJECT_GLUE which enables the glue code mentioned above (which isn't needed in MacPython, only in Mach-O Python).

Possibly the latter two should be dependent on a configure switch (--with-mac-toolbox-modules?) but (a) I think the added memory footprint is minimal and (b) I never understood how to add configure switches:-)

A setup.py patch will follow, but I'm still testing it.


----------------------------------------------------------------------

>Comment By: Thomas Wouters (twouters)
Date: 2001-06-18 07:06

Message:
Logged In: YES 
user_id=34209

Looks fine, except for one thing: it changes 'dnl' to
'setdnl' in one spot. 'setdnl' isn't a standard M4
directive, to my knowledge. Is that a typo?

I didn't actually test the patch on an OSX box, though, as I
assume Jack already did that :) But, Jack, I do have two
colleagues with OSX boxes, and I already have an account on
one, so if you want, I can take the time to test it, or
other stuff. I'll need some pointers first, though, because
last time I tried to compile python on that box it took me
four hours to figure out how to make it stop whining when
running cofniguer, let alone make ;-)


----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2001-06-07 13:09

Message:
Logged In: YES 
user_id=31435

Assigned to Thomas because he's shown previous signs of 
knowing how to spell "configure" <0.9 wink>.

----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=426746&group_id=5470


From noreply@sourceforge.net  Mon Jun 18 15:13:28 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Mon, 18 Jun 2001 07:13:28 -0700
Subject: [Patches] [ python-Patches-426746 ] Infrastructure for getting MacPython modules working on OSX
Message-ID: <E15BzmS-0004vR-00@usw-sf-web1.sourceforge.net>

Patches item #426746, was updated on 2001-05-23 13:29
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=426746&group_id=5470

Category: Build
Group: None
Status: Open
Resolution: Accepted
Priority: 5
Submitted By: Jack Jansen (jackjansen)
>Assigned to: Jack Jansen (jackjansen)
Summary: Infrastructure for getting MacPython modules working on OSX

Initial Comment:
Here are a couple of patches that optionally (on MacOSX) enable a bit of extra infrastructure in the Python core, to allow various (MacPython-originated) dynamic extension modules to be built. Here's what I patched:

- Added a MACHDEP_OBJS variable to Makefile.pre.in and configure.in. This allows platforms to add platform-specific source files to the core build.

- Added (using MACHDEP_OBJS) a macglue.c file to the build, which contains glue code that allows Mac extension modules to refer to each other while being in separate dynamically loaded modules, plus a couple of utility routines. There are also a few changes to LDFLAGS to get the object file incorporated (as it is otherwise optimized away because the rest of Python doesn't refer to it).

- Added a config.h.in define USE_TOOLBOX_OBJECT_GLUE which enables the glue code mentioned above (which isn't needed in MacPython, only in Mach-O Python).

Possibly the latter two should be dependent on a configure switch (--with-mac-toolbox-modules?) but (a) I think the added memory footprint is minimal and (b) I never understood how to add configure switches:-)

A setup.py patch will follow, but I'm still testing it.


----------------------------------------------------------------------

Comment By: Thomas Wouters (twouters)
Date: 2001-06-18 07:06

Message:
Logged In: YES 
user_id=34209

Looks fine, except for one thing: it changes 'dnl' to
'setdnl' in one spot. 'setdnl' isn't a standard M4
directive, to my knowledge. Is that a typo?

I didn't actually test the patch on an OSX box, though, as I
assume Jack already did that :) But, Jack, I do have two
colleagues with OSX boxes, and I already have an account on
one, so if you want, I can take the time to test it, or
other stuff. I'll need some pointers first, though, because
last time I tried to compile python on that box it took me
four hours to figure out how to make it stop whining when
running cofniguer, let alone make ;-)


----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2001-06-07 13:09

Message:
Logged In: YES 
user_id=31435

Assigned to Thomas because he's shown previous signs of 
knowing how to spell "configure" <0.9 wink>.

----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=426746&group_id=5470


From noreply@sourceforge.net  Tue Jun 19 12:11:55 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Tue, 19 Jun 2001 04:11:55 -0700
Subject: [Patches] [ python-Patches-426746 ] Infrastructure for getting MacPython modules working on OSX
Message-ID: <E15CJQJ-00088p-00@usw-sf-web1.sourceforge.net>

Patches item #426746, was updated on 2001-05-23 13:29
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=426746&group_id=5470

Category: Build
Group: None
Status: Open
Resolution: Accepted
Priority: 5
Submitted By: Jack Jansen (jackjansen)
Assigned to: Jack Jansen (jackjansen)
Summary: Infrastructure for getting MacPython modules working on OSX

Initial Comment:
Here are a couple of patches that optionally (on MacOSX) enable a bit of extra infrastructure in the Python core, to allow various (MacPython-originated) dynamic extension modules to be built. Here's what I patched:

- Added a MACHDEP_OBJS variable to Makefile.pre.in and configure.in. This allows platforms to add platform-specific source files to the core build.

- Added (using MACHDEP_OBJS) a macglue.c file to the build, which contains glue code that allows Mac extension modules to refer to each other while being in separate dynamically loaded modules, plus a couple of utility routines. There are also a few changes to LDFLAGS to get the object file incorporated (as it is otherwise optimized away because the rest of Python doesn't refer to it).

- Added a config.h.in define USE_TOOLBOX_OBJECT_GLUE which enables the glue code mentioned above (which isn't needed in MacPython, only in Mach-O Python).

Possibly the latter two should be dependent on a configure switch (--with-mac-toolbox-modules?) but (a) I think the added memory footprint is minimal and (b) I never understood how to add configure switches:-)

A setup.py patch will follow, but I'm still testing it.


----------------------------------------------------------------------

>Comment By: Jack Jansen (jackjansen)
Date: 2001-06-19 04:11

Message:
Logged In: YES 
user_id=45365

I have absolutely no idea where the dnl/setdnl mod came
from. Throw it out, please.

Also, I'm a bit unsure about the next step: do you check the
patch in or do I?

----------------------------------------------------------------------

Comment By: Thomas Wouters (twouters)
Date: 2001-06-18 07:06

Message:
Logged In: YES 
user_id=34209

Looks fine, except for one thing: it changes 'dnl' to
'setdnl' in one spot. 'setdnl' isn't a standard M4
directive, to my knowledge. Is that a typo?

I didn't actually test the patch on an OSX box, though, as I
assume Jack already did that :) But, Jack, I do have two
colleagues with OSX boxes, and I already have an account on
one, so if you want, I can take the time to test it, or
other stuff. I'll need some pointers first, though, because
last time I tried to compile python on that box it took me
four hours to figure out how to make it stop whining when
running cofniguer, let alone make ;-)


----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2001-06-07 13:09

Message:
Logged In: YES 
user_id=31435

Assigned to Thomas because he's shown previous signs of 
knowing how to spell "configure" <0.9 wink>.

----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=426746&group_id=5470


From noreply@sourceforge.net  Tue Jun 19 12:13:06 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Tue, 19 Jun 2001 04:13:06 -0700
Subject: [Patches] [ python-Patches-426746 ] Infrastructure for getting MacPython modules working on OSX
Message-ID: <E15CJRS-0008A1-00@usw-sf-web1.sourceforge.net>

Patches item #426746, was updated on 2001-05-23 13:29
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=426746&group_id=5470

Category: Build
Group: None
Status: Open
Resolution: Accepted
Priority: 5
Submitted By: Jack Jansen (jackjansen)
Assigned to: Jack Jansen (jackjansen)
Summary: Infrastructure for getting MacPython modules working on OSX

Initial Comment:
Here are a couple of patches that optionally (on MacOSX) enable a bit of extra infrastructure in the Python core, to allow various (MacPython-originated) dynamic extension modules to be built. Here's what I patched:

- Added a MACHDEP_OBJS variable to Makefile.pre.in and configure.in. This allows platforms to add platform-specific source files to the core build.

- Added (using MACHDEP_OBJS) a macglue.c file to the build, which contains glue code that allows Mac extension modules to refer to each other while being in separate dynamically loaded modules, plus a couple of utility routines. There are also a few changes to LDFLAGS to get the object file incorporated (as it is otherwise optimized away because the rest of Python doesn't refer to it).

- Added a config.h.in define USE_TOOLBOX_OBJECT_GLUE which enables the glue code mentioned above (which isn't needed in MacPython, only in Mach-O Python).

Possibly the latter two should be dependent on a configure switch (--with-mac-toolbox-modules?) but (a) I think the added memory footprint is minimal and (b) I never understood how to add configure switches:-)

A setup.py patch will follow, but I'm still testing it.


----------------------------------------------------------------------

>Comment By: Jack Jansen (jackjansen)
Date: 2001-06-19 04:13

Message:
Logged In: YES 
user_id=45365

Ah, silly me, it's assigned to me, so I'll check it in (after
removing the dnl stuff).

----------------------------------------------------------------------

Comment By: Jack Jansen (jackjansen)
Date: 2001-06-19 04:11

Message:
Logged In: YES 
user_id=45365

I have absolutely no idea where the dnl/setdnl mod came
from. Throw it out, please.

Also, I'm a bit unsure about the next step: do you check the
patch in or do I?

----------------------------------------------------------------------

Comment By: Thomas Wouters (twouters)
Date: 2001-06-18 07:06

Message:
Logged In: YES 
user_id=34209

Looks fine, except for one thing: it changes 'dnl' to
'setdnl' in one spot. 'setdnl' isn't a standard M4
directive, to my knowledge. Is that a typo?

I didn't actually test the patch on an OSX box, though, as I
assume Jack already did that :) But, Jack, I do have two
colleagues with OSX boxes, and I already have an account on
one, so if you want, I can take the time to test it, or
other stuff. I'll need some pointers first, though, because
last time I tried to compile python on that box it took me
four hours to figure out how to make it stop whining when
running cofniguer, let alone make ;-)


----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2001-06-07 13:09

Message:
Logged In: YES 
user_id=31435

Assigned to Thomas because he's shown previous signs of 
knowing how to spell "configure" <0.9 wink>.

----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=426746&group_id=5470


From noreply@sourceforge.net  Tue Jun 19 21:10:01 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Tue, 19 Jun 2001 13:10:01 -0700
Subject: [Patches] [ python-Patches-430030 ] Avoid multiple BOMs in UTF-16 streams
Message-ID: <E15CRp3-0001ZU-00@usw-sf-web3.sourceforge.net>

Patches item #430030, was updated on 2001-06-04 09:59
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=430030&group_id=5470

Category: library
Group: None
>Status: Closed
>Resolution: Accepted
Priority: 5
Submitted By: Martin v. Löwis (loewis)
Assigned to: M.-A. Lemburg (lemburg)
Summary: Avoid multiple BOMs in UTF-16 streams

Initial Comment:
This patch fixes the UTF-16 reader and writer to emit 
and expect the BOM only at the beginning of the 
stream. It is implemented by changing the 
encode/decode function of the stream object after the 
byte order is detected.

In addition, it adds a new test case test_codecs. 
When committing the patch, the corresponding output 
file must be generated.
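
For illustration, the intended behaviour can be observed with the codecs
module in present-day Python 3, which may differ in detail from the patch
as committed in 2001. This is a small sketch, assuming the standard
"utf-16" stream writer and reader: two writes to the same stream should
produce a single BOM at the start, and reading the bytes back should
yield the text without any BOM characters.

```python
import codecs
import io

raw = io.BytesIO()
writer = codecs.getwriter("utf-16")(raw)
writer.write("spam")   # first write: BOM followed by the encoded text
writer.write("eggs")   # later writes: encoded text only
data = raw.getvalue()

# Count BOMs by looking at each aligned 16-bit code unit in the output.
bom_count = sum(
    1 for i in range(0, len(data), 2) if data[i:i + 2] == codecs.BOM_UTF16
)

# The reader detects the byte order from the leading BOM and strips it.
reader = codecs.getreader("utf-16")(io.BytesIO(data))
text = reader.read()
```

Encoding each chunk separately with the plain "utf-16" codec would
instead put a BOM in front of every chunk, which is the situation the
patch is meant to avoid.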


----------------------------------------------------------------------

>Comment By: M.-A. Lemburg (lemburg)
Date: 2001-06-19 13:10

Message:
Logged In: YES 
user_id=38388

Checked in a slightly modified version of the patch.

Thanks.

----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=430030&group_id=5470


From noreply@sourceforge.net  Tue Jun 19 21:23:52 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Tue, 19 Jun 2001 13:23:52 -0700
Subject: [Patches] [ python-Patches-426746 ] Infrastructure for getting MacPython modules working on OSX
Message-ID: <E15CS2S-0003pc-00@usw-sf-web1.sourceforge.net>

Patches item #426746, was updated on 2001-05-23 13:29
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=426746&group_id=5470

Category: Build
Group: None
>Status: Closed
Resolution: Accepted
Priority: 5
Submitted By: Jack Jansen (jackjansen)
Assigned to: Jack Jansen (jackjansen)
Summary: Infrastructure for getting MacPython modules working on OSX

Initial Comment:
Here are a couple of patches that optionally (on MacOSX) enable a bit of extra infrastructure in the Python core, to allow various (MacPython-originated) dynamic extension modules to be built. Here's what I patched:

- Added a MACHDEP_OBJS variable to Makefile.pre.in and configure.in. This allows platforms to add platform-specific source files to the core build.

- Added (using MACHDEP_OBJS) a macglue.c file to the build, which contains glue code that allows Mac extension modules to refer to each other while being in separate dynamically loaded modules, plus a couple of utility routines. There are also a few changes to LDFLAGS to get the object file incorporated (as it is otherwise optimized away because the rest of Python doesn't refer to it).

- Added a config.h.in define USE_TOOLBOX_OBJECT_GLUE which enables the glue code mentioned above (which isn't needed in MacPython, only in Mach-O Python).

Possibly the latter two should be dependent on a configure switch (--with-mac-toolbox-modules?) but (a) I think the added memory footprint is minimal and (b) I never understood how to add configure switches:-)

A setup.py patch will follow, but I'm still testing it.


----------------------------------------------------------------------

>Comment By: Jack Jansen (jackjansen)
Date: 2001-06-19 13:23

Message:
Logged In: YES 
user_id=45365


----------------------------------------------------------------------

Comment By: Jack Jansen (jackjansen)
Date: 2001-06-19 04:13

Message:
Logged In: YES 
user_id=45365

Ah, silly me, it's assigned to me, so I'll check it in (after
removing the dnl stuff).

----------------------------------------------------------------------

Comment By: Jack Jansen (jackjansen)
Date: 2001-06-19 04:11

Message:
Logged In: YES 
user_id=45365

I have absolutely no idea where the dnl/setdnl mod came
from. Throw it out, please.

Also, I'm a bit unsure about the next step: do you check the
patch in or do I?

----------------------------------------------------------------------

Comment By: Thomas Wouters (twouters)
Date: 2001-06-18 07:06

Message:
Logged In: YES 
user_id=34209

Looks fine, except for one thing: it changes 'dnl' to
'setdnl' in one spot. 'setdnl' isn't a standard M4
directive, to my knowledge. Is that a typo?

I didn't actually test the patch on an OSX box, though, as I
assume Jack already did that :) But, Jack, I do have two
colleagues with OSX boxes, and I already have an account on
one, so if you want, I can take the time to test it, or
other stuff. I'll need some pointers first, though, because
last time I tried to compile python on that box it took me
four hours to figure out how to make it stop whining when
running cofniguer, let alone make ;-)


----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2001-06-07 13:09

Message:
Logged In: YES 
user_id=31435

Assigned to Thomas because he's shown previous signs of 
knowing how to spell "configure" <0.9 wink>.

----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=426746&group_id=5470


From noreply@sourceforge.net  Fri Jun 22 09:42:42 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Fri, 22 Jun 2001 01:42:42 -0700
Subject: [Patches] [ python-Patches-435381 ] Distutils patches for OS/2+EMX support
Message-ID: <E15DMWY-0001iH-00@usw-sf-web2.sourceforge.net>

Patches item #435381, was updated on 2001-06-22 01:42
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=435381&group_id=5470

Category: distutils
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Andrew I MacIntyre (aimacintyre)
Assigned to: Nobody/Anonymous (nobody)
Summary: Distutils patches for OS/2+EMX support

Initial Comment:
The attached patch file is against the code released
with Python 2.1.

The changes are included in the binary installation
package of the OS/2+EMX port of Python 2.1 I released
on June 17, 2001.

With these changes, I have successfully built and
installed the Numeric 20.0.0 extension, and created a
bdist_dumb distribution package of it. The installed
extension tests successfully using the supplied test
routines.

Particular items of note:-
- OS/2 limits DLLs to 8.3 filenames :-( :-(
- ld is not used to link, instead gcc is used with the -Zomf option
  which invokes the LINK386 linker native to OS/2
- I haven't made any attempt to merge cloned code back into the
  parent code where it would make sense, which I think is in a few
  places.

Feedback appreciated.

----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=435381&group_id=5470


From noreply@sourceforge.net  Fri Jun 22 16:47:44 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Fri, 22 Jun 2001 08:47:44 -0700
Subject: [Patches] [ python-Patches-435492 ] tempnam(),tmpfile(),tmpnam() for Windows
Message-ID: <E15DT9s-0001dq-00@usw-sf-web3.sourceforge.net>

Patches item #435492, was updated on 2001-06-22 08:47
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=435492&group_id=5470

Category: Windows
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Fred L. Drake, Jr. (fdrake)
Assigned to: Tim Peters (tim_one)
Summary: tempnam(),tmpfile(),tmpnam() for Windows

Initial Comment:
This patch makes os.tempnam(), os.tmpfile(), and
os.tmpnam() available on Windows.  (And yes, I tested
that the Windows version still compiles!)

A user noted that the documentation did not indicate
constrained availability, but these functions were not
available -- apparently he was running Windows or
MacOS.

----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=435492&group_id=5470


From noreply@sourceforge.net  Fri Jun 22 19:41:46 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Fri, 22 Jun 2001 11:41:46 -0700
Subject: [Patches] [ python-Patches-431422 ] "print" not emitting POP_TOP
Message-ID: <E15DVsI-00086U-00@usw-sf-web1.sourceforge.net>

Patches item #431422, was opened at 2001-06-08 08:54
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=431422&group_id=5470

Category: Parser/Compiler
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Shane Hathaway (hathawsh)
Assigned to: Nobody/Anonymous (nobody)
>Summary: "print" not emitting POP_TOP

Initial Comment:
The Python-based compiler module (in Tools) has a bug 
in the visitPrint() method of 
pycodegen.CodeGenerator.  It does not emit a trailing 
POP_TOP instruction, which AFAICT it should emit only 
when outputting to a stream and there is a trailing 
comma (indicating no newline).  I've attached the 
patch applied to Zope's RestrictedPython module; if 
there is anything incorrect about it please tell me 
right away.  Otherwise please apply the patch to 
Tools/compiler/pycodegen.py.


----------------------------------------------------------------------

>Comment By: Shane Hathaway (hathawsh)
Date: 2001-06-22 11:41

Message:
Logged In: YES 
user_id=16701

Oops, it turns out the patch is incorrect!  POP_TOP should 
only be added to the *last* print node.  Here are the 
revised visitPrint() and visitPrintnl() methods.  This is 
what is being used in Zope now.

    def visitPrint(self, node, newline=0):
        self.set_lineno(node)
        if node.dest:
            self.visit(node.dest)
        for child in node.nodes:
            if node.dest:
                self.emit('DUP_TOP')
            self.visit(child)
            if node.dest:
                self.emit('ROT_TWO')
                self.emit('PRINT_ITEM_TO')
            else:
                self.emit('PRINT_ITEM')
        if node.dest and not newline:
            self.emit('POP_TOP')

    def visitPrintnl(self, node):
        self.visitPrint(node, 1)
        if node.dest:
            self.emit('PRINT_NEWLINE_TO')
        else:
            self.emit('PRINT_NEWLINE')
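
A rough way to sanity-check the revised methods is to track the stack
depth over the opcodes they emit. The sketch below is not part of the
patch. VISIT is a hypothetical stand-in for whatever self.visit() emits
(assumed to leave exactly one value on the stack), and the stack effects
in the table are assumptions based on the Python 2.x print opcodes.

```python
# Assumed net stack effect of each opcode (Python 2.x-era print bytecode).
# 'VISIT' is a hypothetical placeholder for the code emitted by self.visit().
STACK_EFFECT = {
    'VISIT': +1,
    'DUP_TOP': +1,
    'ROT_TWO': 0,
    'PRINT_ITEM': -1,
    'PRINT_ITEM_TO': -2,      # pops the stream and the item
    'PRINT_NEWLINE': 0,
    'PRINT_NEWLINE_TO': -1,   # pops the stream
    'POP_TOP': -1,
}


def print_ops(n_items, dest, newline):
    """Return the opcode sequence the revised visitPrint/visitPrintnl emit."""
    ops = []
    if dest:
        ops.append('VISIT')                 # push the stream once
    for _ in range(n_items):
        if dest:
            ops.append('DUP_TOP')           # keep a copy of the stream
        ops.append('VISIT')                 # push the item
        if dest:
            ops += ['ROT_TWO', 'PRINT_ITEM_TO']
        else:
            ops.append('PRINT_ITEM')
    if dest and not newline:
        ops.append('POP_TOP')               # discard the leftover stream
    if newline:
        ops.append('PRINT_NEWLINE_TO' if dest else 'PRINT_NEWLINE')
    return ops


def final_depth(ops):
    """Sum the stack effects, checking that the stack never underflows."""
    depth = 0
    for op in ops:
        depth += STACK_EFFECT[op]
        assert depth >= 0, "stack underflow at %s" % op
    return depth
```

Under these assumptions every combination of dest and newline ends with
an empty stack, and dropping the trailing POP_TOP leaves the stream
behind.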


----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=431422&group_id=5470


From noreply@sourceforge.net  Fri Jun 22 21:51:02 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Fri, 22 Jun 2001 13:51:02 -0700
Subject: [Patches] [ python-Patches-432401 ] unicode encoding error callbacks
Message-ID: <E15DXtO-0007bB-00@usw-sf-web3.sourceforge.net>

Patches item #432401, was opened at 2001-06-12 06:43
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=432401&group_id=5470

Category: core (C code)
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Walter Dörwald (doerwalter)
Assigned to: M.-A. Lemburg (lemburg)
Summary: unicode encoding error callbacks

Initial Comment:
This patch adds unicode error handling callbacks to the
encode functionality. With this patch it's possible to
not only pass 'strict', 'ignore' or 'replace' as the
errors argument to encode, but also a callable
function, that will be called with the encoding name,
the original unicode object and the position of the
unencodable character. The callback must return a
replacement unicode object that will be encoded instead
of the original character.

For example replacing unencodable characters with XML
character references can be done in the following way.

u"aäoöuüß".encode(
   "ascii",
   lambda enc, uni, pos: u"&#x%x;" % ord(uni[pos])
)
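
The calling convention described above can be emulated in pure Python.
This is only a sketch of the proposed behaviour, not the patch itself:
encode_with_callback is a hypothetical helper, it is written for
current Python (where str is Unicode), and it encodes one character at
a time, which is only reasonable for stateless encodings such as ASCII
or Latin-1.

```python
def encode_with_callback(uni, encoding, errors):
    """Encode uni, calling errors(encoding, uni, pos) for each
    unencodable character and encoding the returned replacement."""
    out = []
    for pos, ch in enumerate(uni):
        try:
            out.append(ch.encode(encoding))
        except UnicodeEncodeError:
            replacement = errors(encoding, uni, pos)
            # The replacement is encoded strictly, so an unencodable
            # replacement raises a normal exception.
            out.append(replacement.encode(encoding))
    return b"".join(out)


def xmlcharref(enc, uni, pos):
    # Same callback as the lambda in the example above.
    return "&#x%x;" % ord(uni[pos])


result = encode_with_callback("a\u00e4o\u00f6u\u00fc\u00df", "ascii", xmlcharref)
```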


----------------------------------------------------------------------

>Comment By: M.-A. Lemburg (lemburg)
Date: 2001-06-22 13:51

Message:
Logged In: YES 
user_id=38388

Sorry to keep you waiting, Walter. I will look into this
again next week -- this week was way too busy...

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2001-06-13 10:00

Message:
Logged In: YES 
user_id=38388

On your comment about the non-Unicode codecs: let's keep
this separated from the current patch.

Don't have much time today. I'll comment on the other things
tomorrow.

----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2001-06-13 08:49

Message:
Logged In: YES 
user_id=89016

Guido van Rossum wrote in python-dev:

> True, the "codec" pattern can be used for other 
> encodings than Unicode.  But it seems to me that the
> entire codecs architecture is rather strongly geared
> towards en/decoding Unicode, and it's not clear
> how well other codecs fit in this pattern (e.g. I 
> noticed that all the non-Unicode codecs ignore the 
> error handling parameter or assert that
> it is set to 'strict').

I noticed that too. asserting that errors=='strict' would 
mean that the encoder is not able to deal in any other way 
with unencodable stuff than by raising an error. But that 
is not the problem here, because for zlib, base64, quopri, 
hex and uu encoding there can be no unencodable characters. 
The encoders can simply ignore the errors parameter. Should 
I remove the asserts from those codecs and change the 
docstrings accordingly, or will this be done separately?


----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2001-06-13 06:57

Message:
Logged In: YES 
user_id=89016

> > [...]
> > raise an exception). U+FFFD characters in the replacement
> > string will be replaced with a character that the encoder
> > chooses ('?' in all cases).
>
> Nice.

But the special casing of U+FFFD makes the interface somewhat
less clean than it could be. It was only done to be 100%
backwards compatible. With the original "replace" error
handling the codec chose the replacement character. But as
far as I can tell none of the codecs uses anything other
than '?', so I guess we could change the replace handler
to always return u'?'. This would make the implementation a
little bit simpler, but the explanation of the callback
feature *a lot* simpler. And if you still want to handle
an unencodable U+FFFD, you can write a special callback for
that, e.g.

def FFFDreplace(enc, uni, pos):
    if uni[pos] == u"\ufffd":
        return u"?"
    else:
        raise UnicodeError(...)

> > The implementation of the loop through the string is done
> > in the following way. A stack with two strings is kept
> > and the loop always encodes a character from the string
> > at the stacktop. If an error is encountered and the stack
> > has only one entry (during encoding of the original string)
> > the callback is called and the unicode object returned is
> > pushed on the stack, so the encoding continues with the
> > replacement string. If the stack has two entries when an
> > error is encountered, the replacement string itself has
> > an unencodable character and a normal exception is raised.
> > When the encoder has reached the end of its current string
> > there are two possibilities: when the stack contains two
> > entries, this was the replacement string, so the replacement
> > string will be popped from the stack and encoding continues
> > with the next character from the original string. If the
> > stack had only one entry, encoding is finished.
>
> Very elegant solution !

I'll put it as a comment in the source.

> > (I hope that's enough explanation of the API and
> > implementation)
>
> Could you add these docs to the Misc/unicode.txt file ? I
> will eventually take that file and turn it into a PEP which
> will then serve as general documentation for these things.

I could, but first we should work out how the decoding
callback API will work.

> > I have renamed the static ...121 function to all lowercase
> > names.
>
> Ok.
>
> > BTW, I guess PyUnicode_EncodeUnicodeEscape could be
> > reimplemented as PyUnicode_EncodeASCII with a \uxxxx
> > replacement callback.
>
> Hmm, wouldn't that result in a slowdown ? If so, I'd rather
> leave the special encoder in place, since it is being used a
> lot in Python and probably some applications too.

It would be a slowdown. But callbacks open many possibilities.

For example:

   Why can't I print u"gürk"?

is probably one of the most frequently asked questions in
comp.lang.python. For printing Unicode stuff, print could be
extended to use an error handling callback for Unicode
strings (or objects where __str__ or tp_str returns a
Unicode object) instead of using str() which always returns
an 8bit string and uses strict encoding. There might even be a
sys.setprintencodehandler()/sys.getprintencodehandler()

> [...]
> I think it would be worthwhile to rename the callbacks to
> include "Unicode" somewhere, e.g.
> PyCodec_UnicodeReplaceEncodeErrors(). It's a long name, but
> then it points out the application field of the callback
> rather well. Same for the callbacks exposed through the
> _codecsmodule.

OK, done (and PyCodec_XMLCharRefReplaceUnicodeEncodeErrors
really is a long name ;))

> > I have not touched PyUnicode_TranslateCharmap yet,
> > should this function also support error callbacks? Why
> > would one want to insert None into the mapping to call
> > the callback?
>
> 1. Yes.
> 2. The user may want to e.g. restrict usage of certain
> character ranges. In this case the codec would be used to
> verify the input and an exception would indeed be useful
> (e.g. say you want to restrict input to Hangul + ASCII).

OK, do we want TranslateCharmap to work exactly like encoding,
i.e. in case of an error should the returned replacement
string again be mapped through the translation mapping or
should it be copied to the output directly? The former would
be more in line with encoding, but IMHO the latter would
be much more useful.
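
To make the two alternatives concrete, here is a minimal sketch. It is
not the TranslateCharmap implementation: translate_with_callback and its
remap_replacement flag are hypothetical, and it assumes (following this
thread) that a None entry in the mapping is what triggers the callback.

```python
def translate_with_callback(uni, mapping, callback, remap_replacement):
    """Translate uni through mapping (code point -> string or None).

    A None entry is treated as an error and triggers
    callback("charmap", uni, pos). The replacement is either mapped
    again (as encoding would do) or copied to the output directly.
    """
    out = []
    for pos, ch in enumerate(uni):
        target = mapping.get(ord(ch), ch)   # unmapped characters pass through
        if target is not None:
            out.append(target)
            continue
        replacement = callback("charmap", uni, pos)
        if not remap_replacement:
            out.append(replacement)         # alternative 2: copy directly
            continue
        for rch in replacement:             # alternative 1: map it again
            mapped = mapping.get(ord(rch), rch)
            if mapped is None:
                raise UnicodeError("replacement is itself untranslatable")
            out.append(mapped)
    return "".join(out)


mapping = {ord("a"): "A", ord("x"): None}
remapped = translate_with_callback("ax", mapping, lambda e, u, p: "a", True)
copied = translate_with_callback("ax", mapping, lambda e, u, p: "a", False)
```

With remapping the replacement "a" comes out as "A"; copied directly it
stays "a".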

BTW, when I implement it I can implement patch #403100
("Multicharacter replacements in PyUnicode_TranslateCharmap")
along the way.

Should the old TranslateCharmap map to the new TranslateCharmapEx
and inherit the "multicharacter replacement" feature, or
should I leave it as it is?

> > A remaining problem is how to implement decoding error
> > callbacks. In Python 2.1 encoding and decoding errors are
> > handled in the same way with a string value. But with
> > callbacks it doesn't make sense to use the same callback
> > for encoding and decoding (like codecs.StreamReaderWriter
> > and codecs.StreamRecoder do). Decoding callbacks have a
> > different API. Which arguments should be passed to the
> > decoding callback, and what is the decoding callback
> > supposed to do?
>
> I'd suggest adding another set of PyCodec_UnicodeDecode...()
> APIs for this. We'd then have to augment the base classes of
> the StreamCodecs to provide two attributes for .errors with
> a fallback solution for the string case (i.e. "strict" can
> still be used for both directions).

Sounds good. Now what is the decoding callback supposed to do?
I guess it will be called in the same way as the encoding
callback, i.e. with encoding name, original string and
position of the error. It might return a Unicode string
(i.e. an object of the decoding target type), that will be
emitted from the codec instead of the one offending byte. Or
it might return a tuple with replacement Unicode object and
a resynchronisation offset, i.e. returning (u"?", 1) means
emit a '?' and skip the offending character. But to make
the offset really useful the callback has to know something
about the encoding, perhaps the codec should be allowed to
pass an additional state object to the callback?
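
The tuple variant could look roughly like the following. This is a
sketch of the idea only, written for current Python:
decode_ascii_with_callback is a hypothetical helper, and the callback
signature (encoding name, original bytes, error position) returning
(replacement, offset) is the proposal from the paragraph above, not an
existing API.

```python
def decode_ascii_with_callback(data, callback):
    """Decode bytes as ASCII; on an undecodable byte call
    callback("ascii", data, pos), which returns a tuple of
    (replacement string, resynchronisation offset)."""
    out = []
    pos = 0
    while pos < len(data):
        byte = data[pos]
        if byte < 0x80:
            out.append(chr(byte))
            pos += 1
            continue
        replacement, skip = callback("ascii", data, pos)
        if skip < 1:
            # Without this check a callback could stall the loop.
            raise ValueError("resynchronisation offset must be at least 1")
        out.append(replacement)
        pos += skip
    return "".join(out)


# (u"?", 1) means: emit a '?' and skip the one offending byte.
replaced = decode_ascii_with_callback(b"g\xfcrk", lambda enc, s, pos: ("?", 1))
# A callback that falls back to interpreting the byte as Latin-1.
latin1 = decode_ascii_with_callback(b"g\xfcrk", lambda enc, s, pos: (chr(s[pos]), 1))
```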

Maybe the same should be added to the encoding callbacks too?
Maybe the encoding callback should be able to tell the
encoder if the replacement returned should be reencoded
(in which case it's a Unicode object), or directly emitted
(in which case it's an 8bit string)?

> > One additional note: It is vital that errors is an
> > assignable attribute of the StreamWriter.
>
> It is already !

I know, but IMHO it should be documented that an assignable
errors attribute must be supported as part of the official
codec API.

Misc/unicode.txt is not clear on that:
"""
It is not required by the Unicode implementation to use 
these base classes, only the interfaces must match; this 
allows writing Codecs as extension types.
"""

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2001-06-13 01:05

Message:
Logged In: YES 
user_id=38388

> How the callbacks work:
> 
> A PyObject * named errors is passed in. This may be NULL,
> Py_None, 'strict', u'strict', 'ignore', u'ignore',
> 'replace', u'replace' or a callable object.
> PyCodec_EncodeHandlerForObject maps all of these objects to
> one of the three builtin error callbacks
> PyCodec_RaiseEncodeErrors (raises an exception),
> PyCodec_IgnoreEncodeErrors (returns an empty replacement
> string, in effect ignoring the error),
> PyCodec_ReplaceEncodeErrors (returns U+FFFD, the Unicode
> replacement character to signify to the encoder that it
> should choose a suitable replacement character) or directly
> returns errors if it is a callable object. When an
> unencodable character is encountered the error handling
> callback will be called with the encoding name, the original
> unicode object and the error position and must return a
> unicode object that will be encoded instead of the offending
> character (or the callback may of course raise an
> exception). U+FFFD characters in the replacement string will
> be replaced with a character that the encoder chooses ('?'
> in all cases).

Nice.
 
> The implementation of the loop through the string is done in
> the following way. A stack with two strings is kept and the
> loop always encodes a character from the string at the
> stacktop. If an error is encountered and the stack has only
> one entry (during encoding of the original string) the
> callback is called and the unicode object returned is pushed
> on the stack, so the encoding continues with the replacement
> string. If the stack has two entries when an error is
> encountered, the replacement string itself has an
> unencodable character and a normal exception is raised. When
> the encoder has reached the end of its current string there
> are two possibilities: when the stack contains two entries,
> this was the replacement string, so the replacement string
> will be popped from the stack and encoding continues with
> the next character from the original string. If the stack
> had only one entry, encoding is finished.

Very elegant solution !
 
> (I hope that's enough explanation of the API and implementation)

Could you add these docs to the Misc/unicode.txt file ? I
will eventually take that file and turn it into a PEP which
will then serve as general documentation for these things.
 
> I have renamed the static ...121 function to all lowercase
> names.

Ok.
 
> BTW, I guess PyUnicode_EncodeUnicodeEscape could be
> reimplemented as PyUnicode_EncodeASCII with a \uxxxx
> replacement callback.

Hmm, wouldn't that result in a slowdown ? If so, I'd rather
leave the special encoder in place, since it is being used a
lot in Python and probably some applications too.
 
> PyCodec_RaiseEncodeErrors, PyCodec_IgnoreEncodeErrors,
> PyCodec_ReplaceEncodeErrors are globally visible because
> they have to be available in _codecsmodule.c to wrap them as
> Python function objects, but they can't be implemented in
> _codecsmodule, because they need to be available to the
> encoders in unicodeobject.c (through
> PyCodec_EncodeHandlerForObject), but importing the codecs
> module might result in an endless recursion, because
> importing a module requires unpickling of the bytecode,
> which might require decoding utf8, which ... (but this will
> only happen, if we implement the same mechanism for the
> decoding API)

I think that codecs.c is the right place for these APIs.
_codecsmodule.c is only meant as Python access wrapper for
the internal codecs and nothing more. 

One thing I noted about the callbacks: they assume that they
will always get Unicode objects as input. This is certainly
not true in the general case (it is for the codecs you touch
in the patch). 

I think it would be worthwhile to rename the callbacks to
include "Unicode" somewhere, e.g.
PyCodec_UnicodeReplaceEncodeErrors(). It's a long name, but
then it points out the application field of the callback
rather well. Same for the callbacks exposed through the
_codecsmodule.

> I have not touched PyUnicode_TranslateCharmap yet,
> should this function also support error callbacks? Why would
> one want to insert None into the mapping to call the callback?

1. Yes.
2. The user may want to e.g. restrict usage of certain
character ranges. In this case the codec would be used to
verify the input and an exception would indeed be useful
(e.g. say you want to restrict input to Hangul + ASCII).
 
> A remaining problem is how to implement decoding error
> callbacks. In Python 2.1 encoding and decoding errors are
> handled in the same way with a string value. But with
> callbacks it doesn't make sense to use the same callback for
> encoding and decoding (like codecs.StreamReaderWriter and
> codecs.StreamRecoder do). Decoding callbacks have a
> different API. Which arguments should be passed to the
> decoding callback, and what is the decoding callback
> supposed to do?

I'd suggest adding another set of PyCodec_UnicodeDecode...()
APIs for this. We'd then have to augment the base classes of
the StreamCodecs to provide two attributes for .errors with
a fallback solution for the string case (i.e. "strict" can
still be used for both directions).

> One additional note: It is vital that errors is an
> assignable attribute of the StreamWriter.

It is already !
 
> Consider the XML example: For writing an XML DOM tree one
> StreamWriter object is used. When a text node is written,
> the error handling has to be set to
> codecs.xmlreplace_encode_errors, but inside a comment or
> processing instruction replacing unencodable characters with
> charrefs is not possible, so here codecs.raise_encode_errors
> should be used (or better a custom error handler that raises
> an error that says "sorry, you can't have unencodable
> characters inside a comment")

Sure.
 
> BTW, should we continue the discussion in the i18n SIG
> mailing list? An email program is much more comfortable than
> a HTML textarea! ;)

I'd rather keep the discussions on this patch here --
forking it off to the i18n sig will make it very hard to
follow up on it. (This HTML area is indeed damn small ;-)
 

----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2001-06-12 12:18

Message:
Logged In: YES 
user_id=89016

One additional note: It is vital that errors is an
assignable attribute of the StreamWriter. 

Consider the XML example: For writing an XML DOM tree one
StreamWriter object is used. When a text node is written,
the error handling has to be set to
codecs.xmlreplace_encode_errors, but inside a comment or
processing instruction replacing unencodable characters with
charrefs is not possible, so here codecs.raise_encode_errors
should be used (or better a custom error handler that raises
an error that says "sorry, you can't have unencodable
characters inside a comment")
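
The XML case can be sketched with the codecs module as it exists in
current Python. Note the assumptions: the string handler names
"xmlcharrefreplace" and "strict" stand in for the
codecs.xmlreplace_encode_errors and codecs.raise_encode_errors callbacks
named above, which are specific to this patch.

```python
import codecs
import io

buffer = io.BytesIO()
writer = codecs.getwriter("ascii")(buffer, errors="xmlcharrefreplace")

# Text node: unencodable characters become character references.
writer.write("<p>g\u00fcrk</p>")

# Comment: character references are not possible here, so switch the
# same writer to strict error handling by assigning to .errors.
writer.errors = "strict"
try:
    writer.write("<!-- g\u00fcrk -->")
    comment_rejected = False
except UnicodeEncodeError:
    comment_rejected = True

output = buffer.getvalue()
```

The point of the example is only that one writer object changes its
error handling between writes, which requires errors to be assignable.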

BTW, should we continue the discussion in the i18n SIG
mailing list? An email program is much more comfortable than
a HTML textarea! ;)


----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2001-06-12 11:59

Message:
Logged In: YES 
user_id=89016

How the callbacks work:

A PyObject * named errors is passed in. This may be NULL,
Py_None, 'strict', u'strict', 'ignore', u'ignore',
'replace', u'replace' or a callable object.
PyCodec_EncodeHandlerForObject maps all of these objects to
one of the three builtin error callbacks
PyCodec_RaiseEncodeErrors (raises an exception),
PyCodec_IgnoreEncodeErrors (returns an empty replacement
string, in effect ignoring the error),
PyCodec_ReplaceEncodeErrors (returns U+FFFD, the Unicode
replacement character to signify to the encoder that it
should choose a suitable replacement character) or directly
returns errors if it is a callable object. When an
unencodable character is encountered the error handling
callback will be called with the encoding name, the original
unicode object and the error position and must return a
unicode object that will be encoded instead of the offending
character (or the callback may of course raise an
exception). U+FFFD characters in the replacement string will 
be replaced with a character that the encoder chooses ('?'
in all cases).
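
In Python terms the mapping might be sketched like this. It is an
illustration rather than the C code from the patch: the function names
are lowercase stand-ins for the PyCodec_... functions, and treating
NULL/None the same as 'strict' is an assumption.

```python
REPLACEMENT_CHAR = "\ufffd"


def raise_encode_errors(enc, uni, pos):
    """Stand-in for PyCodec_RaiseEncodeErrors."""
    raise UnicodeError("%s: can't encode character at position %d" % (enc, pos))


def ignore_encode_errors(enc, uni, pos):
    """Stand-in for PyCodec_IgnoreEncodeErrors: empty replacement."""
    return ""


def replace_encode_errors(enc, uni, pos):
    """Stand-in for PyCodec_ReplaceEncodeErrors: U+FFFD asks the
    encoder to choose a suitable replacement character."""
    return REPLACEMENT_CHAR


# Assumption: a missing errors argument (NULL or None) means 'strict'.
_BUILTIN_HANDLERS = {
    None: raise_encode_errors,
    "strict": raise_encode_errors,
    "ignore": ignore_encode_errors,
    "replace": replace_encode_errors,
}


def encode_handler_for_object(errors):
    """Map an errors argument to a callback; callables are returned as is."""
    if callable(errors):
        return errors
    try:
        return _BUILTIN_HANDLERS[errors]
    except (KeyError, TypeError):
        raise ValueError("unknown error handling: %r" % (errors,))
```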

The implementation of the loop through the string is done in
the following way. A stack with two strings is kept and the
loop always encodes a character from the string at the
stacktop. If an error is encountered and the stack has only
one entry (during encoding of the original string) the
callback is called and the unicode object returned is pushed
on the stack, so the encoding continues with the replacement
string. If the stack has two entries when an error is
encountered, the replacement string itself has an
unencodable character and a normal exception is raised. When
the encoder has reached the end of its current string there
are two possibilities: when the stack contains two entries,
this was the replacement string, so the replacement string
will be popped from the stack and encoding continues with
the next character from the original string. If the stack
had only one entry, encoding is finished.
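
The loop described above can be sketched in Python as follows. This is
an illustration of the two-entry stack, not the C implementation:
encode_with_stack and its fallback parameter are hypothetical, and
handling U+FFFD only when it turns out to be unencodable is an
assumption.

```python
def encode_with_stack(uni, encoding, callback, fallback=b"?"):
    """Encode uni character by character using a stack of at most two
    [string, position] entries: the original and a replacement."""
    stack = [[uni, 0]]
    out = []
    while stack:
        top = stack[-1]
        text, pos = top
        if pos == len(text):
            # End of the replacement: resume the original string.
            # End of the original: the stack empties and we are done.
            stack.pop()
            continue
        top[1] = pos + 1          # advance before possibly pushing
        ch = text[pos]
        try:
            out.append(ch.encode(encoding))
        except UnicodeEncodeError:
            if len(stack) == 1:
                # Error in the original string: push the replacement.
                stack.append([callback(encoding, text, pos), 0])
            elif ch == "\ufffd":
                out.append(fallback)   # encoder-chosen replacement
            else:
                raise             # the replacement itself is unencodable
    return b"".join(out)


charref = lambda enc, uni, pos: "&#%d;" % ord(uni[pos])
encoded = encode_with_stack("a\u00e4b", "ascii", charref)
```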

(I hope that's enough explanation of the API and implementation)

I have renamed the static ...121 function to all lowercase
names.

BTW, I guess PyUnicode_EncodeUnicodeEscape could be
reimplemented as PyUnicode_EncodeASCII with a \uxxxx
replacement callback.

PyCodec_RaiseEncodeErrors, PyCodec_IgnoreEncodeErrors,
PyCodec_ReplaceEncodeErrors are globally visible because
they have to be available in _codecsmodule.c to wrap them as
Python function objects, but they can't be implemented in
_codecsmodule, because they need to be available to the
encoders in unicodeobject.c (through
PyCodec_EncodeHandlerForObject), but importing the codecs
module might result in an endless recursion, because
importing a module requires unpickling of the bytecode,
which might require decoding utf8, which ... (but this will
only happen, if we implement the same mechanism for the
decoding API)

I have not touched PyUnicode_TranslateCharmap yet, 
should this function also support error callbacks? Why would
one want to insert None into the mapping to call the callback?

A remaining problem is how to implement decoding error
callbacks. In Python 2.1 encoding and decoding errors are
handled in the same way with a string value. But with
callbacks it doesn't make sense to use the same callback for
encoding and decoding (like codecs.StreamReaderWriter and
codecs.StreamRecoder do). Decoding callbacks have a
different API. Which arguments should be passed to the
decoding callback, and what is the decoding callback
supposed to do?


----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2001-06-12 11:00

Message:
Logged In: YES 
user_id=38388

About the Py_UNICODE*data, int size APIs:
Ok, point taken.

In general, I think we ought to keep the callback feature as
open as possible, so passing in pointers and sizes would not
be very useful.

BTW, could you summarize how the callback works in a few
lines ?

About _Encode121: I'd name this _EncodeUCS1 since that's
what it is ;-)

About the new functions: I was referring to the new static
functions which you gave PyUnicode_... names. If these are
not supposed to turn into non-static functions, I'd rather
have them use lower case names (since that's how the Python
internals work too -- most of the times).


----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2001-06-12 09:56

Message:
Logged In: YES 
user_id=89016

> One thing which I don't like about your API change is that
> you removed the Py_UNICODE*data, int size style arguments --
> this makes it impossible to use the new APIs on non-Python
> data or data which is not available as Unicode object.

Another problem is, that the callback requires a Python
object, so in the PyObject *version, the refcount is
incref'd and the object is passed to the callback. The
Py_UNICODE*/int version would have to create a new Unicode
object from the data.


----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2001-06-12 09:32

Message:
Logged In: YES 
user_id=89016

> * please don't place more than one C statement on one line
> like in:
> """
> +               unicode = unicode2; unicodepos = unicode2pos;
> +               unicode2 = NULL; unicode2pos = 0;
> """

OK, done!

> * Comments should start with a capital letter and be prepended
> to the section they apply to

Fixed!

> * There should be spaces between arguments in compares
> (a == b) not (a==b)

Fixed!

> * Where does the name "...Encode121" originate ?

encode one-to-one, it implements both ASCII and latin-1
encoding.

> * module internal APIs should use lower case names (you
> converted some of these to  PyUnicode_...() -- this is
> normally reserved for APIs which are either marked as
> potential candidates for the public API or are very
> prominent in the code)

Which ones? I introduced a new function for every old one,
that had a "const char *errors" argument, and a few new ones
in codecs.h, of those PyCodec_EncodeHandlerForObject is
vital, because it is used to map the old string arguments to
the new function objects. PyCodec_RaiseEncodeErrors can be
used in the encoder implementation to raise an encode error,
but it could be made static in unicodeobject.h so only those
encoders implemented there have access to it.

> One thing which I don't like about your API change is that
> you removed the Py_UNICODE*data, int size style arguments --
> this makes it impossible to use the new APIs on non-Python
> data or data which is not available as Unicode object.

I looked through the code and found no situation where the
Py_UNICODE*/int version is really used and having two
(PyObject *)s (the original and the replacement string),
instead of UNICODE*/int and PyObject * made the
implementation a little easier, but I can fix that.

> Please separate the errors.c patch from this patch -- it
> seems totally unrelated to Unicode.

PyCodec_RaiseEncodeErrors uses this to have a \Uxxxx with
four hex digits. I removed it.

I'll upload a revised patch as soon as it's done.


----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2001-06-12 07:29

Message:
Logged In: YES 
user_id=38388

Thanks for the patch -- it looks very impressive !.

I'll give it a try later this week. 

Some first cosmetic tidbits:
* please don't place more than one C statement on one line
like in:
"""
+               unicode = unicode2; unicodepos = unicode2pos;
+               unicode2 = NULL; unicode2pos = 0;
"""

* Comments should start with a capital letter and be prepended
to the section they apply to

* There should be spaces between arguments in compares
(a == b) not (a==b)

* Where does the name "...Encode121" originate ?

* module internal APIs should use lower case names (you
converted some of these to  PyUnicode_...() -- this is
normally reserved for APIs which are either marked as
potential candidates for the public API or are very
prominent in the code)

One thing which I don't like about your API change is that
you removed the Py_UNICODE*data, int size style arguments --
this makes it impossible to use the new APIs on non-Python
data or data which is not available as Unicode object.

Please separate the errors.c patch from this patch -- it
seems totally unrelated to Unicode.

Thanks.


----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=432401&group_id=5470


From noreply@sourceforge.net  Sat Jun 23 21:00:06 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Sat, 23 Jun 2001 13:00:06 -0700
Subject: [Patches] [ python-Patches-434992 ] Cleanup of warning messages
Message-ID: <E15DtZe-0001ZR-00@usw-sf-web2.sourceforge.net>

Patches item #434992, was opened at 2001-06-20 18:51
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=434992&group_id=5470

>Category: None
>Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Robert Minsk (rminsk)
Assigned to: Nobody/Anonymous (nobody)
Summary: Cleanup of warning messages

Initial Comment:
I just compiled Python-2.1 on the SGI using the latest
compilers (7.3.1.2m) with all the warning flags turned
on.  The following patch will get rid of most of the
warning messages.  I would like to see this
incorporated into the next release.  It is easier to
spot real problems when you do not have to sort thru
other warning messages.

The included patch does not include other optional
modules and the ones
setup.py finds by default.

I may have found 2 bugs in the process.  Please see
bugs 434989 and
434988.


----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2001-06-23 13:00

Message:
Logged In: YES 
user_id=21627

Refiled as a patch.


----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=434992&group_id=5470


From noreply@sourceforge.net  Sat Jun 23 21:25:20 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Sat, 23 Jun 2001 13:25:20 -0700
Subject: [Patches] [ python-Patches-434992 ] Cleanup of warning messages
Message-ID: <E15Dty4-00071u-00@usw-sf-web3.sourceforge.net>

Patches item #434992, was opened at 2001-06-20 18:51
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=434992&group_id=5470

Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Robert Minsk (rminsk)
Assigned to: Nobody/Anonymous (nobody)
Summary: Cleanup of warning messages

Initial Comment:
I just compiled Python-2.1 on the SGI using the latest
compilers (7.3.1.2m) with all the warning flags turned
on.  The following patch will get rid of most of the
warning messages.  I would like to see this
incorporated into the next release.  It is easier to
spot real problems when you do not have to sort thru
other warning messages.

The included patch does not include other optional
modules and the ones
setup.py finds by default.

I may have found 2 bugs in the process.  Please see
bugs 434989 and
434988.


----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2001-06-23 13:25

Message:
Logged In: YES 
user_id=21627

I'm not sure these patches are all correct. For the 
patches introducing prototypes (e.g. tigetstr), isn't 
there some header file that offers these prototypes?

For cPickle, looking up string_atol seems to be completely 
unneeded. In turn, looking up string is unneeded, as well.
Likewise, don't just remove empty_str, remove the lookup 
as well.

On the save_float changes, you mask a range error: the 
values will be in 0..255, but you cast this value to char, 
which is potentially signed. I think p should be unsigned 
char*, and the casts should then be adjusted to unsigned.
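
The range problem can be illustrated from Python with ctypes. This is
only a demonstration of the signed/unsigned difference, not the cPickle
code, and the helper names are hypothetical.

```python
import ctypes


def as_signed_char(value):
    """Reinterpret the low 8 bits of value as a C signed char."""
    return ctypes.c_byte(value).value


def as_unsigned_char(value):
    """Reinterpret the low 8 bits of value as a C unsigned char."""
    return ctypes.c_ubyte(value).value


byte_value = 200                        # a value in 0..255
signed = as_signed_char(byte_value)     # wraps to a negative number
unsigned = as_unsigned_char(byte_value)
```

Where plain char is signed, the cast behaves like as_signed_char, so
values above 127 no longer compare equal to the original value.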

Since cPickle changes will need careful review, I 
recommend to submit them as a separate patch.

Why is it necessary to cast the result of umask? Please 
put a comment in the code, explaining that in detail (i.e. 
"required for SGI" is not sufficient). Likewise for alarm 
(which returns long on Linux), and all other casts that 
were introduced to convert system call results or 
arguments.


----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2001-06-23 13:00

Message:
Logged In: YES 
user_id=21627

Refiled as a patch.


----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=434992&group_id=5470


From noreply@sourceforge.net  Sun Jun 24 22:59:38 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Sun, 24 Jun 2001 14:59:38 -0700
Subject: [Patches] [ python-Patches-401196 ] IPv6 patch against 2.0 CVS tree, as of 20010624
Message-ID: <E15EHus-00029G-00@usw-sf-web3.sourceforge.net>

Patches item #401196, was opened at 2000-08-16 05:53
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=401196&group_id=5470

Category: core (C code)
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Jun-ichiro itojun Hagino (itojun)
Assigned to: Nobody/Anonymous (nobody)
>Summary: IPv6 patch against 2.0 CVS tree, as of 20010624

Initial Comment:
 

----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2001-06-24 14:59

Message:
Logged In: YES 
user_id=21627

I have uploaded a new version sent by itojun.


----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2001-06-09 12:15

Message:
Logged In: YES 
user_id=21627

On the API, I have the following comments: 
- Why is it necessary to introduce gethostbyname2? I 
recommend to give gethostbyname an optional argument for 
the address family.

- getaddrinfo, when raising a socket error, should include 
the EAI_ error number. Perhaps there should be a way to 
distinguish EAI_ errnos from other errnos, e.g. by 
subclassing socket error.

Otherwise, the API of the C part looks good to me. I 
haven't looked at the Lib part, yet.

On the implementation:
- I still have problems building the code. Currently, I 
get the following rejects:
./Lib/BaseHTTPServer.py.rej
./Lib/ftplib.py.rej
./Lib/poplib.py.rej
./Lib/smtplib.py.rej
./Modules/socketmodule.c.rej
./Objects/fileobject.c.rej

- The fileobject.c chunk seems to be unnecessary.

- On the test problem: It occurs in
+ test -d -a -f /lib.a
./configure: test: too many arguments
which comes from ipv6lib and ipv6libdir being empty.

- The WIDE files should be included in the Modules 
directory, as they are only used from socketmodule.c. In 
particular, addrinfo.h should not be installed.

- If you can, please include a patch to 
Doc/lib/libsocket.tex. If not, I will try to draft one.


----------------------------------------------------------------------

Comment By: Nobody/Anonymous (nobody)
Date: 2001-05-30 10:34

Message:
Logged In: NO 

i looked at python-dev email.  the proposal (split patches)
looks fine, but the exact example given in python-dev email
is not reasonable.  i cannot just send out configure.in
change separately from source code changes, period.  i can
split patches for *.py files separately though.

there's more important issue, which is, API changes for
Socket class.  i really hoped to get some comment on that
part.  i really appreciate your comments.
i would like to propose that once we nailed down API
changes, integrate the patch into the tree.
with all #ifdef INET6 in place there should be no impact on
IPv4-only builds.

i have trouble tracking python development (i'm not a
sourceforge expert!), so forgive me for delays in patch
submissions.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2001-05-18 08:29

Message:
Logged In: YES 
user_id=6380

See
http://mail.python.org/pipermail/python-dev/2001-May/014889.html
for comments from MvL.

I'm unassigning this from Fred, he has nothing to do with
this.

----------------------------------------------------------------------

Comment By: Jun-ichiro itojun Hagino (itojun)
Date: 2001-02-26 02:24

Message:
Logged In: YES 
user_id=63767

about /usr/bin/test argument: does linux /usr/bin/test have
-d <dir> support?  if not, we may need to change
configure.in slightly.

you are correct that fallback getaddrinfo/getnameinfo.c was
missing in the patch.  sorry.  a question i need to ask is,
do we need to supply Python function Socket.getaddrinfo on
platforms that do not have getaddrinfo(3)?

HAVE_ADDRINFO is used in Include/addrinfo.h, which is also
missing in the patch set i have submitted.

i've put the missing files into
http://www.itojun.org/diary/20001230/missing.shar.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2001-02-23 23:58

Message:
After a shallow review of this patch, I found the following issues:

configure.in does not need to list both enable and disable options.
 
When running configure, I got the following error message on Linux
checking whether to enable ipv6... yes
checking ipv6 stack type... linux-glibc
./configure: test: too many arguments
using libc

The call to /usr/bin/test should be corrected; I could not find out which specific  invocation caused the problem.

HAVE_ADDRINFO is not used. Perhaps getaddrinfo.c/getnameinfo.c is missing in the patch?


----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2001-01-04 07:51

Message:
A new patch is available.  I've changed the subject accordingly.

Due to upload size restrictions, the patch is now at

http://www.itojun.org/diary/20001230/python-2.0-v6-20001230.diff.gz

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2000-12-30 07:25

Message:
I got *many* rejects when trying to apply this patch to today's CVS tree. I recommend that patches for generated files (config.h.in, configure) are not included in the patch because they outdate too easily.
A number of changes in this patch have already been done by somebody else; others just don't fit into the current code anymore (perhaps due to indentation changes?).
Anyway, I'll mark the patch as out-of-date. Please let me know when you upload a new version.

----------------------------------------------------------------------

Comment By: Fred L. Drake, Jr. (fdrake)
Date: 2000-08-16 07:00

Message:
Postponed until Python 2.1 -- there's not enough time to review this and get it sufficiently tested on enough IPv6-connected platforms in time for 2.0, and we're already in feature freeze.  This should go into the tree very quickly once Python 2.0 has been released.

Assigned to myself to open it back up after Python 2.0.

----------------------------------------------------------------------

Comment By: Moshe Zadka (moshez)
Date: 2000-08-16 06:07

Message:
Assigned to Tim, since he's in charge of postponing
new features. I'm too timid to postpone it myself.

----------------------------------------------------------------------

Comment By: Jun-ichiro itojun Hagino (itojun)
Date: 2000-08-16 05:59

Message:
this is revised version of patch #101186 (now with my SourceForge account...
i'm not familiar with the system here, so forgive my possible mistake).

1.6b1 patch applied mostly clean to 2.0.
It is confirmed that:
- 1.6b1 + IPv6 patch works fine on NetBSD 1.4.2 + KAME, and NetBSD 1.5
- 1.6b1 + IPv6 patch works fine on NetBSD 1.4.2 (NOT an IPv6 ready machine)
- 2.0 CVS tree + IPv6 patch works fine on NetBSD + KAME

forgot to attach the following into the diff - so i attach it (README.v6)
here as comment.  I have submitted the patch for 1.5.1, 1.5.2 and 1.6b1,
all hit a bad timing - bad luck.

contact: core@kame.net, or itojun@kame.net


---
IPv6-ready python 1.6
KAME Project
$KAME: README.v6,v 1.9 2000/08/15 02:40:38 itojun Exp $


This patchkit enables python 1.6 to perform AF_INET6 socket operations.
The only affected module is Modules/socketmodule.c.

Modules/socketmodule.c
	In most cases, IPv6 address can be placed where IPv4 address fits.

    sockaddr
	sockaddr tuple is formatted as follows:
	    IPv4: (host, port)
	    IPv6: socket class methods always generate
		    (host, port, flowinfo, scopeid).
		  socket class methods will accept 2, 3, or 4 tuple
		  (for backward compatibility).

	Compatibility warning: Some of the scripts assume that the sockaddr
	structure is 2 tuple, like:
	    host, port = sock.getpeername()
	this will fail if you are connected to IPv6 node.
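
A script can be made tolerant of both tuple shapes by slicing off the first two elements before unpacking. This is only a sketch; the helper name split_sockaddr is hypothetical and not part of the patch.

```python
def split_sockaddr(sockaddr):
    """Return (host, port) from either an IPv4 or an IPv6 sockaddr tuple.

    IPv4 sockaddrs are (host, port); IPv6 sockaddrs, as described above,
    are (host, port, flowinfo, scopeid).  Slicing ignores the extras.
    """
    host, port = sockaddr[:2]
    return host, port


if __name__ == "__main__":
    print(split_sockaddr(("127.0.0.1", 80)))
    print(split_sockaddr(("::1", 80, 0, 0)))
```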

    socket.getaddrinfo(host, port [, family, socktype, proto, flags])
	host: String or None
	port: String, Int or None
	family, socktype, proto, flags: Int, can be omitted

	Perform getaddrinfo(3).  Returns List of the following 5 tuple:
	    (family, socktype, proto, canonname, sockaddr)
	    family: Int
	    socktype: Int
	    proto: Int
	    canonname: String
	    sockaddr: sockaddr (see above)

	See Lib/httplib.py for typical usage on the client side.
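
As a rough illustration of client-side usage, the sketch below walks the list returned by getaddrinfo and connects to the first address that works. It is written against the socket.getaddrinfo found in current Python releases, which I am assuming behaves as this README describes; the helper names resolve and connect_first are hypothetical.

```python
import socket


def resolve(host, port):
    """Return getaddrinfo's 5-tuples for a TCP connection to host:port."""
    return socket.getaddrinfo(host, port, socket.AF_UNSPEC, socket.SOCK_STREAM)


def connect_first(host, port):
    """Try each resolved address in turn; return the first connected socket."""
    last_error = None
    for family, socktype, proto, canonname, sockaddr in resolve(host, port):
        try:
            s = socket.socket(family, socktype, proto)
        except OSError as exc:
            last_error = exc
            continue
        try:
            s.connect(sockaddr)
        except OSError as exc:
            s.close()
            last_error = exc
            continue
        return s
    raise last_error or OSError("no addresses found")
```

Because the address family comes from each result rather than being hard-coded, the same loop serves IPv4 and IPv6 peers.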

    socket.getnameinfo(sockaddr, flags)
	sockaddr: sockaddr
	flags: Int

	Perform getnameinfo(3).  Returns the following 2 tuple:
	    host: String, numeric or hostname depending on flags
	    port: String, numeric or portname depending on flags
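
A minimal sketch of the numeric case, using socket.getnameinfo as it exists in current Python (assumed here to match the function described above). The helper name numeric_name is hypothetical.

```python
import socket


def numeric_name(sockaddr):
    """Return (host, port) as numeric strings, without any name lookup."""
    flags = socket.NI_NUMERICHOST | socket.NI_NUMERICSERV
    return socket.getnameinfo(sockaddr, flags)


if __name__ == "__main__":
    print(numeric_name(("127.0.0.1", 80)))
```

Note that both elements come back as strings, including the port.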

    socket.gethostbyname2(host, af)
	host: String
	af: Int

	Performs gethostbyname2(3).  Returns numeric address representation
	for "host".

    socket.gethostbyaddr(addr) (behavior change if IPv6 support is compiled in)
	addr: String

	Performs gethostbyaddr(3).  Returns string address representation for
	"addr".

	The function can take IPv6 numeric address as well.  This behavior
	is not problematical, because
	- if you pass numeric "addr" parameter, we can always identify address
	  family for it
	- return value is string address representation, where IPv6 and IPv4
	  are not distinguishable.

     socket.bind(sa), socket.connect(sa) and others.
	(No behavior change, but be careful)

	See above for sockaddr format change.

	With Python "addr" portion of sockaddr (first element) can be string
	hostname.  When the string hostname resolved to numeric address, it
	will obey address family of the socket (which was specified when
	socket.socket() was called).
	If you give some string that does not give matching address family,
	you will get some error.
		s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
		# this is okay, if 'localhost' resolves to both IPv4/v6
		s.connect(('localhost', 80))
		# this is not okay, of course
		s.connect(('::1', 80))
		# this is not okay, as v6only.kame.net will not resolve to IPv4
		s.connect(('v6only.kame.net', 80))

Lib/httplib.py
	IPv6 ready.  "host" in HTTP(host) will accept the following 3 forms:
		[host]:port
		host:port	there must be only single colon
		host
	This is to allow IPv6 numeric URL (http://[host]:port/) in documents.

	IMHO "host:port" parsing should be implemented in urllib.py, not here.
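
The three accepted forms could be told apart roughly as follows. This is a sketch under assumptions, not the code in the patch: the function name split_hostport and the default port are made up, and treating an unbracketed string with several colons as a bare IPv6 host is my reading of the "only single colon" rule.

```python
def split_hostport(hostport, default_port=80):
    """Split '[host]:port', 'host:port' or 'host' into (host, port)."""
    if hostport.startswith("["):
        end = hostport.find("]")
        if end < 0:
            raise ValueError("missing ']' in %r" % hostport)
        host = hostport[1:end]
        rest = hostport[end + 1:]
        if rest == "":
            # Bracketed host without a port (an assumed extra case).
            return host, default_port
        if not rest.startswith(":"):
            raise ValueError("unexpected text after ']' in %r" % hostport)
        return host, int(rest[1:])
    if hostport.count(":") == 1:
        host, port = hostport.split(":")
        return host, int(port)
    # No colon, or several colons without brackets: treat as a bare host.
    return hostport, default_port
```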

Lib/ftplib.py
	IPv6 ready.  This uses EPSV/EPRT on IPv6 ftp.  See RFC2428 for
	protocol details.

Lib/SocketServer.py
	IPv6 ready.  Wildcard bind on TCPServer, i.e. TCPServer(('', port)),
	will bind to wildcard socket on TCPServer.address_family.
	TCPServer.address_family is set to AF_INET by default, so ('', port)
	will usually bind AF_INET.
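
Since the address family is a class attribute, an IPv6 server can be had by overriding it in a subclass. The sketch below uses the Python 3 module name socketserver (SocketServer in the Python this README targets); the class name TCPServer6 is hypothetical.

```python
import socket
import socketserver


class TCPServer6(socketserver.TCPServer):
    """TCPServer variant whose wildcard bind uses the IPv6 address family."""

    address_family = socket.AF_INET6


if __name__ == "__main__":
    # The default class still binds AF_INET; only the subclass changes.
    print(socketserver.TCPServer.address_family, TCPServer6.address_family)
```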

Lib/smtplib.py, Lib/telnetlib.py, Lib/poplib.py
	IPv6 ready.  Not much to say about protocol details - they just use
	TCP over IPv6.

configure
	Configure has extra option, --enable-ipv6 and --disable-ipv6.
	The option controls IPv6 support feature.

dynamic link issues in Modules/socketmodule.c
	Modules/socketmodule.c can be dynamically loaded only in the following
	situations:
	- getaddrinfo(3) and getnameinfo(3) are supplied by OS vendor in
	  libc, and libc is dynamic link library.
	- OS vendor is NOT supplying getaddrinfo(3) nor getnameinfo(3), and
	  you are configuring this package with --disable-ipv6.  In this case,
	  you'll be using missing/get{addr,name}info.c and they will refer to
	  gethostby{name,addr}.  gethostby{name,addr} can usually be found
	  in dynamic-linking libc.

	In other situations, such as the following, please link
	Modules/socketmodule.c into python itself.
	- getaddrinfo(3) and getnameinfo(3) are supplied by OS vendor, but
	  they are in statically linked library like libinet6.a.
	  (KAME falls into this category)

	python usually links Modules/socketmodule.c into python itself
	(due to its popularity) so there should be no problem.

restrictions
	- The patched tree will not use gethostbyname_r and other
	  thread-ready libraries.  Instead, it will use getaddrinfo() and
	  getnameinfo() throughout the operation.

todo
	- Patch bunch of library files in Lib/*.py.

compatibility issues with existing scripts
	If you disable IPv6 support (./configure --disable-ipv6), the
	patched code is mostly compatible with original Python
	(except files in "Lib" directory modified for dual stack support).

	User script may choke if:
	- IPv4/v6 dualstack libc is supplied, python is compiled for dual
	  stack, and script assumes some of IPv4-only behavior (especially
	  sockaddr)
	- IPv4/v6 dualstack libc is supplied, python is compiled for IPv4 only,
	  and script assumes some of IPv4-only behavior.
	  In this case, Python socket class itself does not support IPv6,
	  however, name resolution functions can return IPv6 names since
	  they use IPv6-ready libc functions!  I do not recommend this
	  configuration.
	- script assumes certain IPv4-only version behavior in Lib/*.py.

compilation
	If you use IPv6 features, it is assumed that you have working
	getaddrinfo() and getnameinfo() library functions.  We have noticed
	that some IPv6 stacks are shipped with a broken getaddrinfo().  In
	such cases, use missing/get{addr,name}info.c instead (but then, you
	need to have working getipnodeby{name,addr}).

	If you compile this on an IPv4-only machine without get{addr,name}info,
	missing/get{addr,name}info.c will be used.  They are from the KAME IPv6
	distribution and are #ifdef'ed for IPv4-only support.  They are a
	fairly complete implementation and you don't need to bother with
	bind 8.2 (bind 8.2 get{addr,name}info() has bugs).

	When compiling this kit on IPv6 node, you may need to specify some
	additional library paths or cpp defs. (like -linet6 or -DINET6)
	--enable-ipv6 will give you some warning, if the IPv6 stack is unknown
	to the "configure" script.  Currently, the following IPv6 stacks
	are officially supported (i.e. we've checked that the package works
	well):
	- KAME IPv6 stack, http://www.kame.net/

References
	RFC2553, for getaddrinfo(3) and getnameinfo(3).

Author contacts
	http://www.kame.net/
	mailto:core@kame.net


----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=401196&group_id=5470


From noreply@sourceforge.net  Mon Jun 25 03:57:34 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Sun, 24 Jun 2001 19:57:34 -0700
Subject: [Patches] [ python-Patches-435971 ] Adds a UTF-7 codec
Message-ID: <E15EMZC-0006Jv-00@usw-sf-web2.sourceforge.net>

Patches item #435971, was opened at 2001-06-24 19:57
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=435971&group_id=5470

Category: core (C code)
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Brian Quinlan (bquinlan)
Assigned to: Nobody/Anonymous (nobody)
Summary: Adds a UTF-7 codec

Initial Comment:
This code adds UTF-7 (as described in RFC2152) support 
to Python.

The encoder is hardwired in _codecsmodule.c to not 
encode allowable whitespace and set O characters (see 
RFC2152). If there is a standardized way (keyword 
arguments?) of passing optional arguments to encode 
methods, it would be trivial to make it possible to do 
so. 
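
For readers unfamiliar with UTF-7, here is a short round-trip sketch. It assumes a Python whose standard library registers a codec under the name 'utf-7' (current releases do) and says nothing about how this particular patch is implemented; the helper name utf7_roundtrip is hypothetical.

```python
def utf7_roundtrip(text):
    """Encode text as UTF-7 and decode it again; return (encoded, decoded)."""
    encoded = text.encode("utf-7")
    decoded = encoded.decode("utf-7")
    return encoded, decoded


if __name__ == "__main__":
    # U+263A is outside ASCII, so it is carried in a '+...-' base64 run.
    encoded, decoded = utf7_roundtrip("Hi Mom \u263a!")
    print(encoded, decoded == "Hi Mom \u263a!")
```

The encoded form is pure 7-bit ASCII, which is the point of the encoding.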

Otherwise the patch is pretty straight-forward, I 
think. It touches:

Objects/unicodeobject.c
Modules/_codecsmodule.c
Lib/test/test_unicode.py
Include/unicodeobject.h

and adds a new file:
Lib/encodings/utf_7.py


----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=435971&group_id=5470


From noreply@sourceforge.net  Mon Jun 25 12:24:28 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Mon, 25 Jun 2001 04:24:28 -0700
Subject: [Patches] [ python-Patches-435971 ] Adds a UTF-7 codec
Message-ID: <E15EUTk-0003Hv-00@usw-sf-web1.sourceforge.net>

Patches item #435971, was opened at 2001-06-24 19:57
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=435971&group_id=5470

Category: core (C code)
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Brian Quinlan (bquinlan)
>Assigned to: M.-A. Lemburg (lemburg)
Summary: Adds a UTF-7 codec

Initial Comment:
This code adds UTF-7 (as described in RFC2152) support 
to Python.

The encoder is hardwired in _codecsmodule.c to not 
encode allowable whitespace and set O characters (see 
RFC2152). If there is a standardized way (keyword 
arguments?) of passing optional arguments to encode 
methods, it would be trivial to make it possible to do 
so. 

Otherwise the patch is pretty straight-forward, I 
think. It touches:

Objects/unicodeobject.c
Modules/_codecsmodule.c
Lib/test/test_unicode.py
Include/unicodeobject.h

and adds a new file:
Lib/encodings/utf_7.py


----------------------------------------------------------------------

>Comment By: M.-A. Lemburg (lemburg)
Date: 2001-06-25 04:24

Message:
Logged In: YES 
user_id=38388

encode functions can have optional arguments (see for example
the utf-16 codec or the charmap codec). They don't need to
be keyword arguments although this would make them easier to
handle in case we should ever want to change the API.
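
As an illustration of the utf-16 case, the low-level encode function takes an optional third positional argument selecting the byte order. The sketch uses codecs.utf_16_encode as exposed by current Python; the wrapper name encode_utf16 is hypothetical.

```python
import codecs


def encode_utf16(text, byteorder=0):
    """Encode text as UTF-16.

    byteorder: -1 little endian, 1 big endian, 0 native order with a BOM.
    """
    data, consumed = codecs.utf_16_encode(text, "strict", byteorder)
    return data


if __name__ == "__main__":
    print(encode_utf16("a", -1), encode_utf16("a", 1), encode_utf16("a"))
```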

I'll look at the patch more closely later this week or
today.

----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=435971&group_id=5470


From noreply@sourceforge.net  Mon Jun 25 19:10:15 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Mon, 25 Jun 2001 11:10:15 -0700
Subject: [Patches] [ python-Patches-435971 ] Adds a UTF-7 codec
Message-ID: <E15EaoR-0004w8-00@usw-sf-web1.sourceforge.net>

Patches item #435971, was opened at 2001-06-24 19:57
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=435971&group_id=5470

Category: core (C code)
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Brian Quinlan (bquinlan)
Assigned to: M.-A. Lemburg (lemburg)
Summary: Adds a UTF-7 codec

Initial Comment:
This code adds UTF-7 (as described in RFC2152) support 
to Python.

The encoder is hardwired in _codecsmodule.c to not 
encode allowable whitespace and set O characters (see 
RFC2152). If there is a standardized way (keyword 
arguments?) of passing optional arguments to encode 
methods, it would be trivial to make it possible to do 
so. 

Otherwise the patch is pretty straight-forward, I 
think. It touches:

Objects/unicodeobject.c
Modules/_codecsmodule.c
Lib/test/test_unicode.py
Include/unicodeobject.h

and adds a new file:
Lib/encodings/utf_7.py


----------------------------------------------------------------------

>Comment By: Brian Quinlan (bquinlan)
Date: 2001-06-25 11:10

Message:
Logged In: YES 
user_id=108973

OK, I see the utf-16 example. In a few weeks, when I have 
some time again, I might do the necessary changes and 
testing. It might even be nice to be able to specify a list 
of characters to escape, if they are dangerous for your 
application (or maybe it's just morning featuritis :-)).

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2001-06-25 04:24

Message:
Logged In: YES 
user_id=38388

encode functions can have optional arguments (see for example
the utf-16 codec or the charmap codec). They don't need to
be keyword arguments although this would make them easier to
handle in case we should ever want to change the API.

I'll look at the patch more closely later this week or
today.

----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=435971&group_id=5470


From noreply@sourceforge.net  Mon Jun 25 20:20:04 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Mon, 25 Jun 2001 12:20:04 -0700
Subject: [Patches] [ python-Patches-436173 ] site.py shouldn't normcase() agressively
Message-ID: <E15Ebu0-000454-00@usw-sf-web3.sourceforge.net>

Patches item #436173, was opened at 2001-06-25 12:20
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=436173&group_id=5470

Category: library
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Fred L. Drake, Jr. (fdrake)
Assigned to: Jack Jansen (jackjansen)
Summary: site.py shouldn't normcase() agressively

Initial Comment:
The site module should not be using the normcase()
version of directory names as the final result in
sys.path; this patch only uses the normcase() version
for comparisons, but not sys.path contents.  The
intention is to allow Windows and MacOS users to see
the paths as they would in their native filesystem tools.
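
The idea can be sketched as follows: compare on the normcase() form, but keep the original spelling in the result. This is not the code from the patch; the function name unique_paths is hypothetical, and ntpath is used so the example behaves the same on any platform.

```python
import ntpath


def unique_paths(paths, normcase=ntpath.normcase):
    """Drop duplicate directories, comparing case-insensitively.

    The normcase() result is used only as the comparison key; the first
    spelling seen for each directory is what ends up in the result.
    """
    seen = set()
    result = []
    for path in paths:
        key = normcase(path)
        if key not in seen:
            seen.add(key)
            result.append(path)
    return result


if __name__ == "__main__":
    print(unique_paths(["C:\\Python21\\Lib", "c:\\python21\\lib"]))
```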


----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=436173&group_id=5470


From noreply@sourceforge.net  Mon Jun 25 21:09:14 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Mon, 25 Jun 2001 13:09:14 -0700
Subject: [Patches] [ python-Patches-436193 ] SGI cores on 1.0 / 0
Message-ID: <E15Ecfa-0005NJ-00@usw-sf-web2.sourceforge.net>

Patches item #436193, was opened at 2001-06-25 13:09
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=436193&group_id=5470

Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Robert Minsk (rminsk)
Assigned to: Nobody/Anonymous (nobody)
Summary: SGI cores on 1.0 / 0

Initial Comment:
This fix is in reference to bug 435026.  Please see bug
for complete history.

----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=436193&group_id=5470


From noreply@sourceforge.net  Tue Jun 26 01:03:51 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Mon, 25 Jun 2001 17:03:51 -0700
Subject: [Patches] [ python-Patches-434992 ] Cleanup of warning messages
Message-ID: <E15EgKd-0002Bi-00@usw-sf-web2.sourceforge.net>

Patches item #434992, was opened at 2001-06-20 18:51
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=434992&group_id=5470

Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Robert Minsk (rminsk)
Assigned to: Nobody/Anonymous (nobody)
Summary: Cleanup of warning messages

Initial Comment:
I just compiled Python-2.1 on the SGI using the latest
compilers (7.3.1.2m) with all the warning flags turned
on.  The following patch will get rid of most of the
warning messages.  I would like to see this
incorporated into the next release.  It is easier to
spot real problems when you do not have to sort thru
other warning messages.

The included patch does not include other optional
modules and the ones
setup.py finds by default.

I may have found 2 bugs in the process.  Please see
bugs 434989 and
434988.


----------------------------------------------------------------------

>Comment By: Robert Minsk (rminsk)
Date: 2001-06-25 17:03

Message:
Logged In: YES 
user_id=132786

> I'm not sure these patches are all correct. For the 
> patches introducing prototypes (e.g. tigetstr), isn't 
> there some header file that offers these prototypes?

That is what my patch fixed.  It was changing from
#ifdef sgi
extern char *tigetstr(char *);
extern char *tparm(char *instring, ...);
#endif

to
#ifdef __sgi
#include <term.h>
#endif

> For cPickle, looking up string_atol seems to be completely 
> unneeded. In turn, looking up string is unneeded, as well.
> Likewise, don't just remove empty_str, remove the lookup 
> as well.

Are you saying to remove the
UNLESS (PyString_FromString("")) return -1;
as well?  I guess I missed that.

>  I think p should be unsigned 
> char*, and the casts should then be adjusted to unsigned.

Should I fix that or are you looking into it?

> Why is it necessary to cast the result of umask? Please 
> put a comment in the code, explaining that in detail (i.e. 
> "required for SGI" is not sufficient).

Even on Linux umask returns the type mode_t, which may not
be an int.  I could change the code to

int i;
mode_t u;

if (!PyArg_ParseTuple(args, "i:umask", &i))
	return NULL;
u = umask((mode_t)i);

but is mode_t available on all machines?

This is not a critical warning; in fact, on the SGI it only
appears when you compile with -fullwarn, and it's only an INFO
message.  The INFO messages are useful to
identify potential errors.  The casts should not add any
overhead.  This is one reason you should compile code on
multiple compilers.  Each compiler has its own strengths and
weaknesses at identifying problems.  This is not required for
SGI but just to clean up messages from other compilers
besides gcc.  Other vendors' compilers also give other
warning messages.

> Likewise for alarm
> (which returns long on Linux), and all other casts that 
> were introduced to convert system call results or 
> arguments.

Linux (at least RedHat 6.2) does not return a long from
alarm, it returns an unsigned int.  Should I change the
signal_alarm to PyInt_FromUnsignedLong(alarm(t))?
Are there other platforms that return a signed long from
alarm?  I would rather cast
to the type the function currently uses.

The casts just convert to the type the functions expect. 
This conversion happens anyway if it is not made explicit.  So why
not get rid of the implicit cast?


----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2001-06-23 13:25

Message:
Logged In: YES 
user_id=21627

I'm not sure these patches are all correct. For the 
patches introducing prototypes (e.g. tigetstr), isn't 
there some header file that offers these prototypes?

For cPickle, looking up string_atol seems to be completely 
unneeded. In turn, looking up string is unneeded, as well.
Likewise, don't just remove empty_str, remove the lookup 
as well.

On the save_float changes, you mask a range error: the 
values will be in 0..255, but you cast this value to char, 
which is potentially signed. I think p should be unsigned 
char*, and the casts should then be adjusted to unsigned.

Since cPickle changes will need careful review, I 
recommend to submit them as a separate patch.

Why is it necessary to cast the result of umask? Please 
put a comment in the code, explaining that in detail (i.e. 
"required for SGI" is not sufficient). Likewise for alarm 
(which returns long on Linux), and all other casts that 
were introduced to convert system call results or 
arguments.


----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2001-06-23 13:00

Message:
Logged In: YES 
user_id=21627

Refiled as a patch.


----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=434992&group_id=5470


From noreply@sourceforge.net  Tue Jun 26 04:01:13 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Mon, 25 Jun 2001 20:01:13 -0700
Subject: [Patches] [ python-Patches-436258 ] Some cleanup of the cPickle module
Message-ID: <E15Ej6H-0000eh-00@usw-sf-web1.sourceforge.net>

Patches item #436258, was opened at 2001-06-25 20:01
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=436258&group_id=5470

Category: Modules
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Robert Minsk (rminsk)
Assigned to: Nobody/Anonymous (nobody)
Summary: Some cleanup of the cPickle module

Initial Comment:
While getting rid of compiler warning messages for
another non-gcc compiler I found some dead code in
cPickle.c.  The attached patch fixes some possible
type casting problems and removed some dead code.

----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=436258&group_id=5470


From noreply@sourceforge.net  Tue Jun 26 14:30:28 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Tue, 26 Jun 2001 06:30:28 -0700
Subject: [Patches] [ python-Patches-436376 ] C API Request
Message-ID: <E15EsvE-0008T1-00@usw-sf-web3.sourceforge.net>

Patches item #436376, was opened at 2001-06-26 06:30
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=436376&group_id=5470

Category: core (C code)
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Nobody/Anonymous (nobody)
Assigned to: Nobody/Anonymous (nobody)
Summary: C API Request

Initial Comment:
I would like to have the following 4 C API functions
added to
pystate.c so that advanced extension modules can more
easily examine the internal state of the Python
interpreter and its threads.

The intent of these functions is to provide a mechanism
for
gaining portable read-only access to all of the current
PyThreadState * structures.  The primary use of this
would be in advanced debugging applications.

Cheers,

Dave Beazley

-------------------------------------------

/* included in pystate.c */

PyInterpreterState *
PyInterpreterState_Head(void)
{
  return interp_head;
}

PyInterpreterState *
PyInterpreterState_Next(PyInterpreterState *interp) {
  return interp->next;
}

PyThreadState *
PyInterpreterState_ThreadHead(PyInterpreterState
*interp) {
  return interp->tstate_head;
}

PyThreadState *
PyThreadState_Next(PyThreadState *tstate) {
  return tstate->next;
}


----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=436376&group_id=5470


From noreply@sourceforge.net  Tue Jun 26 21:35:38 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Tue, 26 Jun 2001 13:35:38 -0700
Subject: [Patches] [ python-Patches-436496 ] Configuring UCS-4
Message-ID: <E15EzYg-0003DJ-00@usw-sf-web2.sourceforge.net>

Patches item #436496, was opened at 2001-06-26 13:35
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=436496&group_id=5470

Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Martin v. Löwis (loewis)
Assigned to: Nobody/Anonymous (nobody)
Summary: Configuring UCS-4

Initial Comment:
This patch allows Py_UNICODE to be defined as either a 2-byte 
or a 4-byte type, using --enable-unicode={ucs2,ucs4}. 
If nothing is specified, Py_UNICODE defaults to 
wchar_t if available.
The Unicode type itself, and the UTF-8 and UTF-16 
codecs have been adjusted to deal with both 
representations.
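
One place where the two representations differ is characters above U+FFFF: a 2-byte unit cannot hold them directly, and UTF-16 spells them as a surrogate pair. The sketch below only illustrates that encoding rule in Python; it is not taken from the patch, and the helper name utf16_code_units is hypothetical.

```python
def utf16_code_units(ch):
    """Return the 16-bit code units UTF-16 uses for the string ch."""
    data = ch.encode("utf-16-be")
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


if __name__ == "__main__":
    # A BMP character needs one unit; U+10000 needs a surrogate pair.
    print([hex(u) for u in utf16_code_units("A")])
    print([hex(u) for u in utf16_code_units("\U00010000")])
```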


----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=436496&group_id=5470


From noreply@sourceforge.net  Thu Jun 28 19:24:39 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Thu, 28 Jun 2001 11:24:39 -0700
Subject: [Patches] [ python-Patches-433537 ] better cross-compilation support
Message-ID: <E15FgT1-0002M4-00@usw-sf-web1.sourceforge.net>

Patches item #433537, was opened at 2001-06-15 12:48
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=433537&group_id=5470

Category: Build
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: michael shiplett (walrusmonkey)
Assigned to: Nobody/Anonymous (nobody)
Summary: better cross-compilation support

Initial Comment:
configure.in uses AC_TRY_RUN in several places without allowing for
cached values, which gets in the way of cross-compilation. This patch
applies the same approach that other parts of configure.in already use.


----------------------------------------------------------------------

Comment By: Nobody/Anonymous (nobody)
Date: 2001-06-28 11:24

Message:
Logged In: NO 

Useful patch. I used it to compile Python 2.0.1 for the
Agenda VR3 PDA.


----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=433537&group_id=5470


From noreply@sourceforge.net  Sat Jun 30 06:21:58 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Fri, 29 Jun 2001 22:21:58 -0700
Subject: [Patches] [ python-Patches-436496 ] Configuring UCS-4
Message-ID: <E15GDCg-0008J7-00@usw-sf-web2.sourceforge.net>

Patches item #436496, was opened at 2001-06-26 13:35
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=436496&group_id=5470

Category: None
Group: None
>Status: Closed
>Resolution: Accepted
Priority: 5
Submitted By: Martin v. Löwis (loewis)
Assigned to: Nobody/Anonymous (nobody)
Summary: Configuring UCS-4

Initial Comment:
This patch allows Py_UNICODE to be defined as either a 2-byte 
or a 4-byte type, using --enable-unicode={ucs2,ucs4}. 
If nothing is specified, Py_UNICODE defaults to 
wchar_t if available.
The Unicode type itself, and the UTF-8 and UTF-16 
codecs have been adjusted to deal with both 
representations.


----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2001-06-29 22:21

Message:
Logged In: YES 
user_id=21627

This patch has been committed as configure.in 1.222 and 
unicodeobject.c 2.98.


----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=436496&group_id=5470